I have an application that generates text logfiles, which I want to parse into Graylog for analysis.
What’s the best way of parsing such logfiles into Graylog?
This earlier post mentions something about extractors vs pipelines, but I don’t quite follow the arguments.
What is the current “best practice” (as of 2018)?
Secondly, the log lines look something like this:
2018-09-06T00:37:06.470ZI  sync_engine.cc:1197:cello::sync::`anonymous-namespace'::SyncEngineImpl::OnInitializeComplete Starting repeating push task timer with interval 30000ms
So there’s a timestamp, a process ID, a code reference, and the actual payload.
Do I use a pipeline with a rule here, then set_fields to extract the various fields?
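To make the layout concrete, here is a rough Python sketch (outside Graylog, purely for illustration) of how I'd expect such a line to split. The field names (timestamp, marker, source_file, source_line, rest) are my own invention, and the pattern assumes every line follows the sample above; I leave the function name and payload together in rest, since I'm not sure where one reliably ends and the other begins.

```python
import re

# Assumed layout: ISO timestamp ending in "Z", one marker character,
# whitespace, source file, line number, then everything else.
# All group names here are hypothetical, chosen for illustration only.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)"
    r"(?P<marker>\S)\s+"
    r"(?P<source_file>[^:\s]+):(?P<source_line>\d+):"
    r"(?P<rest>.*)$"
)


def split_line(line):
    """Return a dict of the assumed fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None
```

A line that does not start with a timestamp (for example a continuation line) simply returns None here.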
Some lines will have more fields, e.g.:
2018-09-06T00:57:06.168ZI  sync_engine.cc:1205:cello::sync::`anonymous-namespace'::SyncEngineImpl::OnInitializeComplete::<lambda_3fe069024adb1854b3541ddbc3601c19>::operator () Sync engine activity state: operation_queue_size: 0 all_downloads: 0 operation_queue_unique_size: 0 change_ids_up_to_date: true
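For the extra fields, what I have in mind is roughly the following, again sketched in Python rather than as a Graylog rule. It assumes the extra fields always appear as `name: value` pairs separated by spaces, with values that contain no spaces or colons; the function name is hypothetical.

```python
import re

# A key is a word followed by ": "; a value is a run of characters
# containing neither whitespace nor ":" and followed by whitespace or
# end of line. This skips "state: operation_queue_size:" because the
# would-be value is itself followed by a colon.
PAIR_RE = re.compile(r"\b(\w+):\s+([^\s:]+)(?=\s|$)")


def extract_pairs(text):
    """Collect 'name: value' pairs from a log line into a dict of strings."""
    return dict(PAIR_RE.findall(text))
```

Values come back as strings, so something like 0 or true would still need converting afterwards.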
Do I just use a rule to detect strings in those lines, then more complex
The other issue is that that last line is multiline =( (yeah, I know, that makes parsing hard). Is there a good way of handling that, assuming the next message must start with a timestamp?
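The grouping logic I'm imagining looks something like this Python sketch, which would run before anything reaches Graylog. It relies entirely on the assumption above (a new message always starts with a timestamp in the format shown), and the function name is made up for illustration.

```python
import re

# A new record starts with a timestamp like 2018-09-06T00:57:06.168Z;
# any line that does not is treated as a continuation of the previous one.
TS_START = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")


def group_records(lines):
    """Yield complete records, joining continuation lines with newlines."""
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        # A timestamped line closes whatever record was being collected.
        if TS_START.match(line) and current:
            yield "\n".join(current)
            current = []
        current.append(line)
    # Flush the final record once the input is exhausted.
    if current:
        yield "\n".join(current)
```

One catch is that the last record is only emitted once the input ends, so on a live logfile it would need a timeout or similar to flush.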