JSON extractor not working

Hello everyone,

I have spent hours reading every forum post and search result I could find.

What I am trying to do is run a JSON extractor on the message field.

Whatever I do, I keep getting “nothing will be extracted” despite having valid data.

We are looking at rolling out a Graylog Enterprise license for many customers and really need to get this JSON extractor working in order to do so.

Here is the data (I changed some of it for privacy):

{"event_type":"FirewallAggregated_Event","ipv4":"192.168.0.13","hostname":"server","source_uuid":"6bad89ef-0a8f-4d13-9df8-d4b7d84b7dba","occured":"05-Aug-2020 02:27:59","severity":"Warning","event":"Security vulnerability exploitation","source_address":"1.2.2.1","source_address_type":"IPv4","source_port":34927,"target_address":"192.168.0.13","target_address_type":"IPv4","target_port":7071,"protocol":"TCP","account":"NT AUTHORITY\NETWORK SERVICE","process_name":"C:\Windows\System32\svchost.exe","inbound":true,"threat_name":"Incoming.Attack.Generic","aggregate_count":1}

Otherwise, if it can be done with Grok or regex, any guidance would be great.

I'm trying to get this last bit working so that my boss will approve the license for several customer sites.

I just need to pull out a few of the fields, such as:

ipv4
hostname
account

I have tried doing this on my own, but no matter what I read, and even with Grok pattern generators, I can't get this to work.

How do you ingest messages? What type of input do you use: GELF, Syslog, Beats?
Please paste your message inside a code block (```message```) so we can see the raw data.

Thank you for the reply.
Messages are coming from syslog.

Here is the raw data:

{"event_type":"FirewallAggregated_Event","ipv4":"192.168.0.13","hostname":"server","source_uuid":"6bad89ef-0a8f-4d13-9df8-d4b7d84b7dba","occured":"05-Aug-2020 02:27:59","severity":"Warning","event":"Security vulnerability exploitation","source_address":"1.2.2.1","source_address_type":"IPv4","source_port":34927,"target_address":"192.168.0.13","target_address_type":"IPv4","target_port":7071,"protocol":"TCP","account":"NT AUTHORITY\NETWORK SERVICE","process_name":"C:\Windows\System32\svchost.exe","inbound":true,"threat_name":"Incoming.Attack.Generic","aggregate_count":1}

Hi, your problem is the \ in the account and process_name fields. The message is not valid JSON unless every backslash is escaped as \\, so the JSON extractor cannot parse it as sent.
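
To see the failure outside Graylog, here is a minimal Python sketch. Python's `json` module only stands in for the JSON extractor here (an assumption for illustration, not Graylog's actual parser), the sample is a shortened copy of the message, and `is_valid_json` is a hypothetical helper name.

```python
import json

# Shortened sample of the message; the raw string keeps the single backslash.
raw = r'{"hostname":"server","account":"NT AUTHORITY\NETWORK SERVICE"}'


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# \N is not a legal JSON escape sequence, so the parser rejects the message.
print(is_valid_json(raw))

# Doubling each backslash turns it into the legal escape \\.
print(is_valid_json(raw.replace("\\", "\\\\")))
```

The first check fails and the second succeeds, which matches the "nothing will be extracted" symptom for the unescaped message.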

If you can’t change the incoming message, you can use one of these pipeline rules to fix it:

  1. The first pipeline rule fixes the backslashes and extracts all JSON fields:
rule "extract-json-syslog1"
when
    starts_with(to_string($message.message), "{") && ends_with(to_string($message.message), "}")
then
    let fix_backslash = replace(to_string($message.message), "\\", "\\\\");
    let json = parse_json(to_string(fix_backslash));
    let map = to_map(json);
    set_fields(map);
    //set_fields(map, "prefix_"); // use if you want to prefix fields with prefix_ (uncomment this and comment previous line)
end
  2. The second pipeline rule fixes the backslashes and extracts only selected fields from the JSON (via JSONPath):
rule "extract-json-syslog2"
when
    starts_with(to_string($message.message), "{") && ends_with(to_string($message.message), "}")
then
    let fix_backslash = replace(to_string($message.message), "\\", "\\\\");
    let json = parse_json(to_string(fix_backslash));
    let json_fields = select_jsonpath(json, { ipv4: "$.ipv4", hostname: "$.hostname", account: "$.account"});
    set_fields(json_fields);
end
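
If you want to check the logic of the second rule offline before touching a pipeline, the same two steps (escape the backslashes, then keep only selected keys) can be sketched in Python. This is only an approximation of what the rule does, not Graylog code; `extract_fields` and the shortened sample are made up for the illustration.

```python
import json


def extract_fields(message: str, wanted: list) -> dict:
    """Escape backslashes, parse the message as JSON, and keep only the wanted keys."""
    fixed = message.replace("\\", "\\\\")  # same idea as replace() in the rule
    data = json.loads(fixed)
    return {key: data[key] for key in wanted if key in data}


# Shortened sample; the raw string keeps the single backslash from the log.
sample = r'{"ipv4":"192.168.0.13","hostname":"server","account":"NT AUTHORITY\NETWORK SERVICE","inbound":true}'

fields = extract_fields(sample, ["ipv4", "hostname", "account"])
print(fields)
```

One caveat with doubling every backslash: if a message ever contains an escape that is already valid, such as \" inside a value, the doubling breaks it, so this approach fits best when the sender never escapes anything.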

Hope this helps

Alternatively, you could use key_value() on the message (I have an unnatural fear of JSON). JSON would be better if you wanted to pick or rename specific fields, though, and since @shoothub posted working JSON code, I fear it a little less. :slight_smile:

rule "extract-syslog-kv"
when
   starts_with(to_string($message.message), "{") && ends_with(to_string($message.message), "}")
then
   let mess  = substring(to_string($message.message),1,-1);             // strips the outer { and }
   let keysv = key_value(mess,",",":",true,true,"take_last","\"","\""); // trims quotes from keys and values
   set_fields(keysv);
end
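
The idea behind this rule can be tried offline with a rough Python sketch. It is not how Graylog implements key_value(); it only mirrors the steps as I understand them (drop the braces, split pairs on commas, split each pair on the first colon, trim the quotes), and `parse_kv` is a hypothetical helper.

```python
def parse_kv(message: str) -> dict:
    """Naive key/value split of a flat JSON-like message, without a JSON parser."""
    body = message[1:-1]  # drop the outer { and }
    result = {}
    for pair in body.split(","):
        # Split on the first ':' only, so values such as timestamps stay intact.
        key, sep, value = pair.partition(":")
        if not sep:
            continue  # skip fragments that are not key:value pairs
        result[key.strip('"')] = value.strip('"')
    return result


sample = r'{"hostname":"server","occured":"05-Aug-2020 02:27:59","account":"NT AUTHORITY\NETWORK SERVICE","source_port":34927}'
fields = parse_kv(sample)
print(fields)
```

Because nothing is parsed as JSON, the backslashes do not matter here. The trade-offs in this sketch are that every value comes back as a string and that a comma inside a value would split it in the wrong place.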

Thank you all for the help and suggestions.
I ended up getting it to work by using a regex.

I will post it soon for anyone else who has the same issue.

Many thanks.

This is the regex that I used:

"hostname":([^,]+)
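
For anyone testing this pattern before putting it into an extractor, here is a small Python check. Python's `re` module is only a stand-in for the engine Graylog uses, so treat it as an approximation. The second pattern is a hypothetical variant, not something from this thread, that keeps the surrounding quotes out of the captured value.

```python
import re

# Shortened sample of the message; the raw string keeps the single backslash.
sample = r'{"ipv4":"192.168.0.13","hostname":"server","account":"NT AUTHORITY\NETWORK SERVICE"}'

# The pattern from the post: capture everything after "hostname": up to the next comma.
with_quotes = re.search(r'"hostname":([^,]+)', sample).group(1)

# Hypothetical variant: match the quotes outside the group so only the bare value is captured.
without_quotes = re.search(r'"hostname":"([^"]+)"', sample).group(1)

print(with_quotes)     # "server"
print(without_quotes)  # server
```

The original pattern works because the backslashes never have to be parsed, but the captured value still includes the double quotes. The variant avoids that for string fields.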
