Difficult to match multiline log file with multiple patterns

Hi –
New Graylog user here. I’ve run into a problem sending a Resin (Java JSP server similar to Tomcat) log file into Graylog. The log file has multiple “types” of multi-line log messages, which makes using a single Filebeat multiline rule (even with multiple OR alternatives in the regexp) difficult, or impossible as far as I can tell. I’ve included a sample here showing some single-line and multi-line entries.

    [00:54:04.866] {http--8000-15$268023904} Serious error occurrred: java.lang.NullPointerException
    [00:54:04.866] {http--8000-15$268023904} java.lang.NullPointerException
    [00:54:04.866] {http--8000-15$268023904} Error: java.lang.NullPointerException
    [00:54:04.866] {http--8000-15$268023904}     
    [00:55:47.533] {Timer-6} 8/11/17 12:55 AM | SessionCache.Perge - 0ms (75/171) n=30
    [00:55:56.359] {DefaultQuartzScheduler_QuartzSchedulerThread} 00:55:56.359 [DefaultQuartzScheduler_QuartzSchedulerThread] DEBUG org.quartz.core.QuartzSchedulerThread - batch acquisition of 0 triggers
    [01:11:33.155] {http--8000-10$1894935270} boards.exceptions.RedirectException
    [01:11:33.155] {http--8000-10$1894935270}       at boards.request.Request.redirect(Request.java:703)
    [01:11:33.156] {http--8000-10$1894935270}       at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:743)
    [01:11:33.156] {http--8000-10$1894935270}       at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:662)
    [01:11:33.156] {http--8000-10$1894935270}       at java.lang.Thread.run(Thread.java:619)
    [01:28:02.403] {DefaultQuartzScheduler_QuartzSchedulerThread} 01:28:02.403 [DefaultQuartzScheduler_QuartzSchedulerThread] DEBUG org.quartz.core.QuartzSchedulerThread - batch acquisition of 0 triggers
    [01:28:07.357] {http--8000-4$1868584300} Error: com.caucho.java.JavaCompileException: /boards/test/realcategorystats.jsp:42: cannot find symbol
    [01:28:07.357] {http--8000-4$1868584300} symbol  : method getRealStatsURL(java.lang.String)
    [01:28:07.357] {http--8000-4$1868584300} location: class boards.util.URL
    [01:28:07.357] {http--8000-4$1868584300}       out.print(( URL.getRealStatsURL(mr.getParameter(Schema.TEST_ID))));
    [01:28:07.357] {http--8000-4$1868584300}                      ^
    [01:28:07.357] {http--8000-4$1868584300} 1 error
    [01:28:07.357] {http--8000-4$1868584300}        at com.caucho.java.AbstractJavaCompiler.run(AbstractJavaCompiler.java:102)
    [01:28:07.357] {http--8000-4$1868584300}        at java.lang.Thread.run(Thread.java:619)
    [01:28:07.357] {http--8000-4$1868584300} 
    [01:28:28.923] {DefaultQuartzScheduler_QuartzSchedulerThread} 01:28:28.923 [DefaultQuartzScheduler_QuartzSchedulerThread] DEBUG org.quartz.core.QuartzSchedulerThread - batch acquisition of 0 triggers
    [01:38:02.776] {http--8000-20$2105617913} Error: java.lang.NumberFormatException: For input string: "5 and 1=1"
    [01:38:02.776] {http--8000-20$2105617913}       at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    [01:38:02.776] {http--8000-20$2105617913}       at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:273)
    [01:38:02.776] {http--8000-20$2105617913}       at java.lang.Thread.run(Thread.java:619)
    [01:38:02.776] {http--8000-20$2105617913} 
    [01:38:13.803] {http--8000-18$1403411429} Error: java.lang.NumberFormatException: For input string: "5 or (1,2)=(select*from(select name_const(CHAR(111,108,111,108,111,115,104,101,114),1),name_const(CHAR(111,108,111,108,111,115,104,101,114),1))a) -- and 1=1"
    [02:43:46.351] {http--8000-19$302001047} No random.  Size: 5

Because there are several types of multi-line log entries, I’m trying to figure out the best way to get each of them into Graylog as a single message. Every line of a given multi-line message shares a common string, but that string differs from one message to the next (http--8000-4$1868584300 in the sample above, for example). Is it possible to group together any lines that share a common string that changes per message grouping?

Alternatively, I’m open to any other ideas that folks may have, since I’m not very well versed in this world. Thanks in advance.

I’d recommend using Logstash and its multiline codec to merge the multiline log messages again before sending them to Graylog:
https://www.elastic.co/guide/en/logstash/5.5/plugins-codecs-multiline.html

Thanks for the response. I’m hesitant to add Logstash to the stack, at least partially because it seems daunting to integrate Logstash into a world where I’m still struggling to figure out the basics. I’m also not excited about having to add another server link to the chain.

My preference would be to use the existing stack, or alternatively something simple on the client side (like Beats, NXLog, etc.) to match the changing but common string that is present at the beginning of each line of a multi-line log entry (the ones starting with http--8000).
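To make the idea concrete, here is a minimal Python sketch of that kind of client-side grouping. It is not a Filebeat or NXLog feature, just a hypothetical preprocessor: the function name `group_by_thread` and the regular expression are assumptions based only on the sample log at the top of this thread ("[HH:MM:SS.mmm] {thread} message"). Consecutive lines with the same `{thread}` token are merged into one event, and a line with no prefix at all is appended to the current event.

```python
import re

# Assumed line layout, inferred from the sample log in this thread:
#   [HH:MM:SS.mmm] {thread-name} message text
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2}\.\d{3})\] \{([^}]*)\} ?(.*)$")


def group_by_thread(lines):
    """Merge consecutive lines that carry the same {thread} token."""
    events = []
    for raw in lines:
        line = raw.strip()
        match = LINE_RE.match(line)
        if match is None:
            # No "[time] {thread}" prefix: treat it as a continuation line.
            if events:
                events[-1]["message"] += "\n" + line
            continue
        timestamp, thread, text = match.groups()
        if events and events[-1]["thread"] == thread:
            # Same thread as the previous line: extend the current event.
            events[-1]["message"] += "\n" + text
        else:
            # Thread changed: start a new event, keep its first timestamp.
            events.append(
                {"timestamp": timestamp, "thread": thread, "message": text}
            )
    return events


if __name__ == "__main__":
    sample = [
        "[01:11:33.155] {http--8000-10$1894935270} boards.exceptions.RedirectException",
        "[01:11:33.155] {http--8000-10$1894935270}       at boards.request.Request.redirect(Request.java:703)",
        "[01:11:33.156] {http--8000-10$1894935270}       at java.lang.Thread.run(Thread.java:619)",
        "[02:43:46.351] {http--8000-19$302001047} No random.  Size: 5",
    ]
    for event in group_by_thread(sample):
        print(event["timestamp"], event["thread"])
        print(event["message"])
        print("---")
```

Grouping on the thread token rather than on the timestamp is deliberate: in the sample, lines of one stack trace are stamped 01:11:33.155 and 01:11:33.156, so exact timestamp equality would split them. The trade-off is that two unrelated messages written back to back by the same thread would be merged into one event.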

If there isn’t a way to combine log entries based on a changing common string between them, then I’d consider looking at integrating Logstash. Is there any sort of guide on integrating Logstash into Graylog? The documents I’ve found seem to be old, and rely on adding queueing software into the stack, in addition to Logstash.

Thanks!

Filebeat supports merging multiple lines into a single event:

As far as I can tell, Filebeat doesn’t support multiple multiline.pattern, multiline.negate, and multiline.match entries for the same log. Am I missing something? I know I can do an “OR” pattern in the multiline.pattern entry, but that doesn’t help me because I can only have a single negate and/or match entry.

Is there any method that supports grouping multiline entries based on a common unique ID (like a Thread ID/PID/etc) that gets printed on all the lines in the multiline entry?

For example, I’d like to group these four lines into two multi-line entries based on the “ThreadID” value:

    [01:28:07.357] ThreadID:111111 Multi-line entry 1
    [01:28:07.358] ThreadID:111111 Multi-line entry 2
    [01:28:07.367] ThreadID:222222 Multi-line entry 1
    [01:28:07.368] ThreadID:222222 Multi-line entry 2

In my case, the lines of each “ThreadID” multi-line entry would all be written one right after another, not interleaved with lines from other thread IDs, which hopefully makes this easier.
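Because the lines of one entry are adjacent, the grouping reduces to "split whenever the ID changes". As a rough illustration only (plain Python, not something Filebeat or Graylog provides; the helper names `thread_id` and `merge_consecutive` are made up for this sketch), `itertools.groupby` does exactly that for runs of adjacent items with the same key:

```python
import re
from itertools import groupby

# Assumed marker format, taken from the four example lines above.
THREAD_ID_RE = re.compile(r"ThreadID:(\d+)")


def thread_id(line):
    """Return the ThreadID value of a line, or None if it has none."""
    match = THREAD_ID_RE.search(line)
    return match.group(1) if match else None


def merge_consecutive(lines):
    """Join each run of adjacent lines sharing a ThreadID into one entry."""
    return ["\n".join(run) for _, run in groupby(lines, key=thread_id)]


if __name__ == "__main__":
    lines = [
        "[01:28:07.357] ThreadID:111111 Multi-line entry 1",
        "[01:28:07.358] ThreadID:111111 Multi-line entry 2",
        "[01:28:07.367] ThreadID:222222 Multi-line entry 1",
        "[01:28:07.368] ThreadID:222222 Multi-line entry 2",
    ]
    for entry in merge_consecutive(lines):
        print(entry)
        print("---")
```

Note that `groupby` only merges adjacent lines, so if lines from two threads were ever interleaved, each interruption would start a new entry.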

I suggest posting a question in the Elastic discussion forums:
