Help with parsing a text logfile into Graylog2


(Victor Hooi) #1

I have an application that generates text logfiles, which I want to parse into Graylog for analysis.

What’s the best way to parse such logfiles into Graylog?

This earlier post mentions something about extractors vs pipelines, but I don’t quite follow the arguments.

What is the “best practice” way currently in 2018?

Secondly - the loglines look something like this:

2018-09-06T00:37:06.470ZI [10752] sync_engine.cc:1197:cello::sync::`anonymous-namespace'::SyncEngineImpl::OnInitializeComplete Starting repeating push task timer with interval 30000ms

So there’s a timestamp, a process ID, a code reference, and the actual payload.

Do I use a pipeline with a rule here, then set_fields to pull out the various fields?

Some lines will have more fields - e.g.:

2018-09-06T00:57:06.168ZI [10752] sync_engine.cc:1205:cello::sync::`anonymous-namespace'::SyncEngineImpl::OnInitializeComplete::<lambda_3fe069024adb1854b3541ddbc3601c19>::operator () Sync engine activity state:
operation_queue_size: 0
all_downloads: 0
operation_queue_unique_size: 0
change_ids_up_to_date: true

Do I just use a rule to detect strings in those lines, followed by a more complex set_fields?

The other issue is that the last example spans multiple lines =(. (Yeah, I know, that makes parsing hard.) Is there a good way of handling that, assuming the next message must start with a timestamp?


(Jan Doberstein) #2

If you can identify the beginning of a new logline, using Filebeat as the shipper for multiline logs should not be a problem.

Then you would write some pipeline rules that extract all the information you need out of the messages. That isn’t anything special and should be doable.
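
As an illustration of the kind of extraction such rules would perform, here is a minimal Python sketch (not Graylog pipeline rule syntax) that splits the sample loglines from this thread into fields. The field names (`timestamp`, `level`, `pid`, `source_file`, `source_line`, `function`, `message`) are assumptions, as is reading the single letter after the timestamp as a severity level.

```python
import re

# Assumed layout, inferred only from the two sample lines in this thread:
#   <timestamp><level letter> [<pid>] <file>:<line>:<function> <message>
HEADER = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)"
    r"(?P<level>[A-Z]) "
    r"\[(?P<pid>\d+)\] "
    r"(?P<source_file>[^:\s]+):(?P<source_line>\d+):"
    # The function name has no spaces except an optional trailing " ()",
    # as seen in "...::operator ()" in the second sample.
    r"(?P<function>\S+(?: \(\))?) "
    r"(?P<message>.*)$"
)


def parse_event(event: str) -> dict:
    """Parse one (possibly multiline) log event into a flat dict of strings."""
    first, _, rest = event.partition("\n")
    match = HEADER.match(first)
    if match is None:
        # Unknown layout: keep the raw text rather than guessing.
        return {"message": event}
    fields = match.groupdict()
    # Continuation lines in the sample look like "key: value".
    for line in rest.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    sample = (
        "2018-09-06T00:37:06.470ZI [10752] sync_engine.cc:1197:"
        "cello::sync::`anonymous-namespace'::SyncEngineImpl::OnInitializeComplete "
        "Starting repeating push task timer with interval 30000ms"
    )
    for name, value in parse_event(sample).items():
        print(f"{name} = {value}")
```

The same regular expression could probably be carried over to a regex or grok step in a pipeline rule, with the named groups becoming message fields.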


(Victor Hooi) #3

I can probably use the timestamp to identify the start of the line. I assume I’d have to be fairly unlucky to split a line that also contained a timestamp in exactly the same format. However, I’m not sure of a way to be more robust/precise than that.
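
One possible way to be stricter, sketched in Python under the assumption that every real event begins with the timestamp, a single uppercase letter, and a bracketed process ID (as both samples above do): require that whole prefix, not just the timestamp. The name `is_event_start` is hypothetical.

```python
import re

# Assumption: a genuine event start is "<timestamp><letter> [<pid>] ".
# A payload line that merely begins with a timestamp will not match.
STRICT_START = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z[A-Z] \[\d+\] "
)


def is_event_start(line: str) -> bool:
    """Return True if the line looks like the first line of a new event."""
    return STRICT_START.match(line) is not None
```

A payload line would then have to reproduce the timestamp, the letter, and the `[pid]` block in exactly this shape to be mistaken for a new event.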

Filebeat, as in this one, right?

https://www.elastic.co/products/beats/filebeat

And I just use it with this plugin?

Is there any official documentation or tutorials around this? Or recommended starting material?


(Jan Doberstein) #4

Filebeat is the one from Elastic, yes. For Graylog you do not need additional software. The Beats protocol is supported natively by Graylog.

In the definition of the start pattern of a multiline message (the timestamp), you can add a simple ^ so that the timestamp has to be the first thing on a line.
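
To make the anchored start pattern concrete, here is a small Python sketch of the grouping logic: a line that begins with the timestamp opens a new event, and every other line is appended to the current one. This only mimics the behaviour; in Filebeat itself the corresponding settings are, to my knowledge, `multiline.pattern` together with `multiline.negate: true` and `multiline.match: after`. The function name `group_events` is hypothetical.

```python
import re
from typing import Iterable, Iterator, List

# "^" anchors the timestamp to the start of the line. Python's re.match
# already anchors there, but the "^" is kept to mirror the shipper pattern.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")


def group_events(lines: Iterable[str]) -> Iterator[str]:
    """Join continuation lines onto the preceding timestamped line."""
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        # A new timestamped line closes the event collected so far.
        if EVENT_START.match(line) and current:
            yield "\n".join(current)
            current = []
        current.append(line)
    # Flush the final event. Lines seen before the first timestamp
    # end up together in an initial event without a header.
    if current:
        yield "\n".join(current)


if __name__ == "__main__":
    log = [
        "2018-09-06T00:37:06.470ZI [10752] sync_engine.cc:1197:f Starting timer\n",
        "2018-09-06T00:57:06.168ZI [10752] sync_engine.cc:1205:g Sync engine activity state:\n",
        "operation_queue_size: 0\n",
        "all_downloads: 0\n",
    ]
    for event in group_events(log):
        print(repr(event))
```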

Maybe the getting started guide is something that helps you: http://docs.graylog.org/en/2.4/pages/getting_started.html

