GROK pattern match fails

I have to parse some McAfee Web Gateway logs and have defined a set of GROK patterns, but it seems that I’m missing something:

<30>%{DATE_MWG:mwg_date} %{DATA:mwg_hostname} mwg: |auth_user=%{DATA:mwg_user}|src_ip=%{IPV4:mwg_srcip}|server_ip=%{IPV4:mwg_serverip}|host=%{DATA:mwg_host}|url_port=%{NUMBER:mwg_urlport}|status_code=%{NUMBER:mwg_statuscode}|bytes_from_client=%{NUMBER:mwg_bytesFROMclient;int}|bytes_to_client=%{NUMBER:mwg_bytesTOclient;int}|categories=%{DATA:mwg_categories}|rep_level=%{DATA:mwg_replevel}|method=%{WORD:mwg_method}|url=%{DATA:mwg_url}|media_type=%{DATA:mwg_mediatype}|application_name=%{DATA:mwg_appname}|user_agent=%{DATA:mwg_useragent}|block_res=%{NUMBER:mwg_blockcode}|block_reason=%{DATA:mwg_blockreason}|virus_name=%{DATA:mwg_virusname}|hash=%{DATA:mwg_hash}|filename=%{DATA:mwg_filename}|filesize=%{NUMBER:mwg_filesize;int}|

Example Log message:
<30>Nov 6 09:10:09 x2il0001 mwg: |auth_user=anonymous|src_ip=172.21.118.3|server_ip=193.99.144.85|host=www.heise.de|url_port=80|status_code=301|bytes_from_client=57|bytes_to_client=622|categories=|rep_level=|method=GET|url=http://www.heise.de/|media_type=text/html|application_name=|user_agent=|block_res=0|block_reason=|virus_name=|hash=|filename=|filesize=162|

[Screenshot: extractor preview]

Many thanks in advance!

  1. I don’t know what exactly your problem is. Are only the first two fields extracted, or is it something else?
  2. It is a valid RFC 3164 syslog message, so why are you trying to extract from the full message? I’d rather use message as the extractor field. That way, the facility and level (severity) fields are extracted automatically by Graylog.
  3. Create a normal Syslog input, not a Raw one, because this is ordinary syslog.
  4. <30> is the syslog priority value in the header, and it changes with facility and severity, so don’t include it (<30>) in your grok pattern.
  5. So your grok pattern can start like this:
    %{SYSLOGTIMESTAMP:timestamp} %{DATA:source} mwg
  6. Also have a look at this content pack:
    https://marketplace.graylog.org/addons/254d7e97-fda8-4522-81f9-16874770d655
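
To illustrate point 4, here is a minimal Python sketch of how a syslog priority value such as <30> splits into facility and severity (PRI = facility * 8 + severity, as defined in RFC 3164). The function name decode_pri is made up for this example; it is not part of Graylog.

```python
def decode_pri(pri: int) -> tuple[int, int]:
    """Split a syslog PRI value into (facility, severity).

    Hypothetical helper for illustration only.
    PRI is defined as facility * 8 + severity, with a valid range of 0..191.
    """
    if not 0 <= pri <= 191:
        raise ValueError(f"PRI out of range: {pri}")
    return divmod(pri, 8)


facility, severity = decode_pri(30)
print(facility, severity)  # 3 6 -> facility 3 (daemon), severity 6 (informational)
```

A different facility or severity on the sending side gives a different number in the angle brackets, which is why hard-coding <30> in a pattern is fragile.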

Thanks for your answer. Indeed, only the first two fields are extracted, but it is not clear to me why.

Originally, I created the input as generic syslog, but the messages weren’t imported. After I changed the input to Raw/Plaintext, it worked.

The extractor field is message, and I already noticed that I can replace the first fields with: <30>%{SYSLOGBASE}

The extractor is working better now:
[Screenshot: extractor preview]

but the rest of the message still cannot be parsed. The severity won’t change on this input, so it is okay to keep <30> in the pattern. If I remove it from the pattern, the parser tells me that there is nothing to extract…

I also tried Syslog TCP as the input again, but the messages are still not imported, although they are visible in the statistics.

Regarding the content pack: the patterns are mostly taken from there, but the pack itself couldn’t be imported. I’m not sure whether there is a mismatch in the Graylog version or…

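Sorry, I didn’t notice it at first, it’s so simple. You have to escape the pipe character | with a backslash. Unescaped, | is the regex alternation operator, so the pattern is read as a list of alternatives, and the match ends after the first one (everything up to "mwg: "). That is why only the first two fields were extracted.
So the result will be: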

<30>%{SYSLOGBASE} \|auth_user=%{DATA:mwg_user}\|src_ip=%{IPV4:mwg_srcip}\|server_ip=%{IPV4:mwg_serverip}\|host=%{DATA:mwg_host}\|url_port=%{NUMBER:mwg_urlport}\|status_code=%{NUMBER:mwg_statuscode}\|bytes_from_client=%{NUMBER:mwg_bytesFROMclient;int}\|bytes_to_client=%{NUMBER:mwg_bytesTOclient;int}\|categories=%{DATA:mwg_categories}\|rep_level=%{DATA:mwg_replevel}\|method=%{WORD:mwg_method}\|url=%{DATA:mwg_url}\|media_type=%{DATA:mwg_mediatype}\|application_name=%{DATA:mwg_appname}\|user_agent=%{DATA:mwg_useragent}\|block_res=%{NUMBER:mwg_blockcode}\|block_reason=%{DATA:mwg_blockreason}\|virus_name=%{DATA:mwg_virusname}\|hash=%{DATA:mwg_hash}\|filename=%{DATA:mwg_filename}\|filesize=%{NUMBER:mwg_filesize;int}\|
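
The effect of the unescaped pipe can be reproduced outside Graylog. The following is only a sketch in Python, whose re module handles alternation the same way for this case; it is not how Graylog evaluates grok internally. The grok patterns are replaced by simplified hand-written regex groups (an assumption made for this example), and the log line is a shortened version of the sample above.

```python
import re

# Shortened version of the sample log line from the question.
line = "<30>Nov 6 09:10:09 x2il0001 mwg: |auth_user=anonymous|src_ip=172.21.118.3|"

# Simplified stand-ins for the grok patterns (illustrative, not the real definitions).
PREFIX = r"<30>(?P<date>\w{3} +\d+ [\d:]+) (?P<hostname>\S+) mwg: "

# Unescaped: every | starts a new alternative, so the regex is satisfied
# as soon as the first alternative (the prefix) has matched.
unescaped = re.compile(PREFIX + r"|auth_user=(?P<user>.*?)|src_ip=(?P<srcip>[\d.]+)|")

# Escaped: \| matches a literal pipe, so the whole line has to match in sequence.
escaped = re.compile(PREFIX + r"\|auth_user=(?P<user>.*?)\|src_ip=(?P<srcip>[\d.]+)\|")

broken = unescaped.match(line)
fixed = escaped.match(line)

print(repr(broken.group(0)))  # the match stops right after "mwg: "
print(broken.groupdict())     # user and srcip are None
print(fixed.groupdict())      # all four groups are filled
```

With the unescaped version only date and hostname are captured, which matches the "only the first two fields" symptom from the extractor preview.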

Many, many thanks! That’s it. Too simple :wink: