Graylog extractor for changing key value pairs

We currently have a single grok extractor for processing our F5 BigIP ASM log traffic that is being sent in the Key-Value pairs (Splunk) format.

It has been working great, though there is likely a better way to do it.

Here is a sample of the existing grok pattern that has been in place for years:
unit_hostname="%{HOSTNAME:unit_hostname;string}",management_ip_address="%{IPV4:management_ip_address;string}",http_class_name="%{DATA:http_class_name;string}",web_application_name="%{DATA:web_application_name;string}",policy_name="%{DATA:policy_name;string}",policy_apply_date="%{DATA:policy_apply_date;date;yyyy-MM-dd HH:mm:ss}"

We have now upgraded our BigIP, and plan to keep it more up to date going forward. One result, and something we expect to recur every so often, is that the delivered log format changed.
No existing fields were removed or altered, but new fields were injected. This broke processing for new log entries, since the expected pattern could no longer match with the inserted fields present.

I wrote a new grok pattern that allows the existing fields to be parsed regardless of whether new fields are added anywhere in the set of data.

Here is a sample of the new pattern:
unit_hostname="%{HOSTNAME:unit_hostname;string}",.*?management_ip_address="%{IPV4:management_ip_address;string}",.*?http_class_name="%{DATA:http_class_name;string}",.*?web_application_name="%{DATA:web_application_name;string}",.*?policy_name="%{DATA:policy_name;string}",.*?policy_apply_date="%{DATA:policy_apply_date;date;yyyy-MM-dd HH:mm:ss}"

All I really did was add `.*?` between each field so that if any new fields were inserted there, the pattern would still match.

The problem is that this performs horribly; at our traffic level it is bad enough to overload a single server.

I’m looking for a much less expensive way to get the same effect while avoiding switching to another extraction method or changing the grok pattern to explicitly match the new pattern.

The idea is to put something in place that can adapt to change.

I will likely be adding some of the new fields, but in the meantime, when this happens, I don't want logging to be broken!


You could move your field extraction into a pipeline and use the key_value() function there. Assuming the messages stay true key-value, it would automatically adjust to added fields. You would also gain some pipeline flexibility you don't get with extractors.
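As a rough sketch (the rule name is made up, and the delimiters and quote-trimming are assumptions based on the sample log line above), a pipeline rule using key_value() might look something like:

```
rule "parse F5 ASM key-value pairs"
when
  has_field("message")
then
  // Split on commas between pairs and "=" within each pair;
  // trim the surrounding double quotes from the values.
  set_fields(
    key_value(
      value: to_string($message.message),
      delimiters: ",",
      kv_delimiters: "=",
      trim_value_chars: "\""
    )
  );
end
```

Because key_value() just splits whatever pairs it finds, fields the BigIP adds later would simply show up as new message fields instead of breaking the match.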

On a side note, grok can be really slow and can even lock up processing if you are bad at it (like me). One of the (many) ways to make it more efficient is to add the regex start-of-line anchor "^" at the beginning, which keeps the engine from attempting matches at every non-start-of-line position (assuming your grok pattern starts at the first thing in the desired message).
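Applied to the sample above, that would just mean prefixing the existing pattern with the anchor, e.g.:

```
^unit_hostname="%{HOSTNAME:unit_hostname;string}",.*?management_ip_address="%{IPV4:management_ip_address;string}",...
```

With the anchor in place, a non-matching line fails fast instead of the engine retrying the whole lazy-quantifier pattern from every character position in the message.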

This is an option I’m looking at for the future, but for now, I am currently just looking at improving things where they are.

Going with a new method would require renaming many fields and introducing new things.

Right now I need to have something working in the next few hours.

