Zenarmor Pipeline Rule JSON Parsing Problems

1. Describe your incident:

I am trying to create a pipeline for Sunny Valley Zenarmor logs that I am sending from my OPNsense firewall. The logs arrive via syslog, but Zenarmor embeds the data as JSON inside the syslog message, so Graylog does not parse it into fields. I have searched the community but have not been able to find anything comparable to use as an example.

2. Describe your environment:

  • OS Information:

Graylog 5.1.0+14ba491
OS: Ubuntu 22.04.2 LTS x86_64
Kernel: 5.15.0-72-generic
Shell: bash 5.1.16
CPU: AMD Ryzen 9 6900HX with Radeon Graphics (16) @ 3.300GHz
Memory: 4978MiB / 28839MiB

OPNsense 23.1.7_3-amd64
OS: FreeBSD 13.1-RELEASE-p7
CPU: Intel(R) Core™ i7-1165G7 @ 2.80GHz (4 cores, 8 threads)
Memory: 9970/32463 MB

  • Package Version:

Open Threat Exchange - Threat Intel Plugin
Whois Threat Intel Plugin
OPNSense Content Pack

  • Service logs, configurations, and environment variables:

Sample Zenarmor log:

{
  "gl2_accounted_message_size": 2119,
  "level": 6,
  "gl2_remote_ip": "192.168.2.1",
  "gl2_remote_port": 46934,
  "streams": [
    "646cc1b7f75e3a714c8a7c4a"
  ],
  "gl2_message_id": "01H17GYT7G000007RW83DETGP1",
  "source": "sample.secdoc.local",
  "message": "sample.secdoc.local zenarmor[8765]: index=http, data={"start_time":1684946740000,"transport_proto":"TCP","policyid":"0","cloud_policyid":"rs3br75On9","interface":"vlan02","vlanid":"0","conn_uuid":"33688efe-1eb0-444d-81ff-92a3d671ccbc","src_hwaddr":"80615f1084af","src_username":"","ip_src_saddr":"192.168.30.101","ip_src_port":49086,"src_hostname":"maul.secdoc.local","src_dir":"EGRESS","dst_hwaddr":"009027e83320","dst_username":"","ip_dst_saddr":"xx.xx.xx.109","ip_dst_port":9000,"dst_hostname":"sample.secdoc.local","dst_dir":"INGRESS","is_blocked":0,"is_local":0,"encryption":"CLEAR","src_geoip":{"timezone":"","continent_code":"","city_name":"","country_name":"","country_code2":"","country_code3":"","dma_code":"0","region_name":"","region_code":"","postal_code":"","area":"0","metro":"0","asn":"0","latitude":0.0,"longitude":0.0,"location":{"lat":0.0,"lon":0.0}},"dst_geoip":{"timezone":"","continent_code":"","city_name":"Allen","country_name":"US","country_code2":"","country_code3":"","dma_code":"0","region_name":"","region_code":"","postal_code":"","area":"0","metro":"0","asn":"0","latitude":00.117698669433597,"longitude":-00.6791000366211,"location":{"lat":00.117698669433597,"lon":-00.6791000366211}},"device":{"id":"80615f1084af","name":"","category":"10","vendor":"Chrome OS","os":"Chrome OS","osver":""},"method":"GET","version":"HTTP/1.1","host":"gambit.secdoc.tech","category":"","user_agent":"Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 
Firefox/113.0","referrer":"http://sample.secdoc.local:9000/system/overview","uri":"/api/cluster/jobs","status_msg":"200","cli_hdr_names":"Host,Accept,Accept-Language,Accept-Encoding,X-Requested-With,X-Requested-By","srv_hdr_names":"X-Graylog-Node-ID,Cache-Control,X-Content-Type-Options,X-Frame-Options,Content-Security-Policy,X-Runtime-Microseconds","cookie_vars":"authentication,ph_phc_fmJsCXBb0sqPpUCAJ51C0sT933i8LHUT6Zqm4oCGuK7_posthog","uri_vars":"","proxied":"","browser":"","req_body_len":0,"rsp_body_len":52}",
  "gl2_source_input": "646cc0edf75e3a714c8a7526",
  "facility_num": 0,
  "gl2_source_node": "143077a1-1f14-4cc9-8794-4b1a211f4aad",
  "_id": "5d9b6f1f-fa62-11ed-8b62-0242b474b08e",
  "facility": "kernel",
  "timestamp": "2023-05-24T18:39:50.000Z"

The specific portion I am trying to parse is the following:

sample.secdoc.local zenarmor[8765]: index=http, data={"start_time":1684946740000,"transport_proto":"TCP","policyid":"0","cloud_policyid":"rs3br75On9","interface":"vlan02","vlanid":"0","conn_uuid":"33688efe-1eb0-444d-81ff-92a3d671ccbc","src_hwaddr":"80615f1084af","src_username":"","ip_src_saddr":"192.168.30.101","ip_src_port":49086,"src_hostname":"maul.secdoc.local","src_dir":"EGRESS","dst_hwaddr":"009027e83320","dst_username":"","ip_dst_saddr":"xx.xx.xx.109","ip_dst_port":9000,"dst_hostname":"sample.secdoc.local","dst_dir":"INGRESS","is_blocked":0,"is_local":0,"encryption":"CLEAR","src_geoip":{"timezone":"","continent_code":"","city_name":"","country_name":"","country_code2":"","country_code3":"","dma_code":"0","region_name":"","region_code":"","postal_code":"","area":"0","metro":"0","asn":"0","latitude":0.0,"longitude":0.0,"location":{"lat":0.0,"lon":0.0}},"dst_geoip":{"timezone":"","continent_code":"","city_name":"Allen","country_name":"US","country_code2":"","country_code3":"","dma_code":"0","region_name":"","region_code":"","postal_code":"","area":"0","metro":"0","asn":"0","latitude":00.117698669433597,"longitude":-00.6791000366211,"location":{"lat":00.117698669433597,"lon":-00.6791000366211}},"device":{"id":"80615f1084af","name":"","category":"10","vendor":"Chrome OS","os":"Chrome OS","osver":""},"method":"GET","version":"HTTP/1.1","host":"gambit.secdoc.tech","category":"","user_agent":"Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0","referrer":"http://sample.secdoc.local:9000/system/overview","uri":"/api/cluster/jobs","status_msg":"200","cli_hdr_names":"Host,Accept,Accept-Language,Accept-Encoding,X-Requested-With,X-Requested-By","srv_hdr_names":"X-Graylog-Node-ID,Cache-Control,X-Content-Type-Options,X-Frame-Options,Content-Security-Policy,X-Runtime-Microseconds","cookie_vars":"authentication,ph_phc_fmJsCXBb0sqPpUCAJ51C0sT933i8LHUT6Zqm4oCGuK7_posthog","uri_vars":"","proxied":"","browser":"","req_body_len":0,"rsp_body_len":52}

3. What steps have you already taken to try and solve the problem?

I first tried a more complex regex, but I think its complexity was causing problems. This was the original regex:

\{"start_time":(\d+),"transport_proto":"(\w+)","policyid":"(\d+)","cloud_policyid":"([\w\d]+)","interface":"(\w+)","vlanid":"(\d+)","conn_uuid":"([\w-]+)","src_hwaddr":"([\w\d]+)","src_username":"([^"]*)","ip_src_saddr":"([\d.]+)","ip_src_port":(\d+),"src_hostname":"([^"]*)","src_dir":"(\w+)","dst_hwaddr":"([\w\d]+)","dst_username":"([^"]*)","ip_dst_saddr":"([\d.]+)","ip_dst_port":(\d+),"dst_hostname":"([^"]*)","dst_dir":"(\w+)","is_blocked":(\d+),"is_local":(\d+),"encryption":"(\w+)","src_geoip":\{"timezone":"([^"]*)","continent_code":"([^"]*)","city_name":"([^"]*)","country_name":"([^"]*)","country_code2":"([^"]*)","country_code3":"([^"]*)","dma_code":"(\d+)","region_name":"([^"]*)","region_code":"([^"]*)","postal_code":"([^"]*)","area":"(\d+)","metro":"(\d+)","asn":"(\d+)","latitude":([\d.]+),"longitude":([\d.]+),"location":\{"lat":([\d.]+),"lon":([\d.]+)\}\},"dst_geoip":\{"timezone":"([^"]*)","continent_code":"([^"]*)","city_name":"([^"]*)","country_name":"([^"]*)","country_code2":"([^"]*)","country_code3":"([^"]*)","dma_code":"(\d+)","region_name":"([^"]*)","region_code":"([^"]*)","postal_code":"([^"]*)","area":"(\d+)","metro":"(\d+)","asn":"(\d+)","latitude":([\d.]+),"longitude":([\d.]+),"location":\{"lat":([\d.]+),"lon":([\d.]+)\}\},"device":\{"id":"([\w\d]+)","name":"([^"]*)","category":"(\d+)","vendor":"([^"]*)","os":"([^"]*)","osver":"([^"]*)"\},"method":"(\w+)","version":"([^"]+)","host":"([^"]+)","category":"([^"]*)","user_agent":"([^"]+)","referrer":"([^"]+)","uri":"([^"]+)","status_msg":"(\d+)","cli_hdr_names":"([^"]*)","srv_hdr_names":"([^"]*)","cookie_vars":"([^"]*)","uri_vars":"([^"]*)","proxied":"([^"]*)","browser":"([^"]*)","req_body_len":(\d+),"rsp_body_len":(\d+)\}
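One detail that may have contributed to the field-by-field regex failing: the `longitude` groups use `([\d.]+)`, which cannot match the negative value in the sample (`-00.6791000366211`). The Python sketch below checks only that one fragment. Python's `re` module is used purely for illustration (an assumption, since Graylog uses Java regular expressions, although this character class behaves the same way in both), and the `-?` variant is a hypothetical fix that does not appear in the thread.

```python
import re

# Fragment copied from the dst_geoip object in the sample log.
fragment = '"longitude":-00.6791000366211,'

# Pattern piece as written in the original regex.
original = re.compile(r'"longitude":([\d.]+),')
# Hypothetical variant that also accepts a leading minus sign.
signed = re.compile(r'"longitude":(-?[\d.]+),')

original_match = original.search(fragment)
signed_match = signed.search(fragment)

print(original_match)         # None, because '-' is not in [\d.]
print(signed_match.group(1))  # -00.6791000366211
```

A single non-matching piece like this makes the whole pattern fail, which is one reason capturing the entire `data={...}` payload and parsing it as JSON is less fragile than matching every field.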

I am trying to write a pipeline rule that is the following:

rule "Parse Zenarmor Log"
when
  contains(to_string($message.message), "gambit.secdoc.tech zenarmor")
then
  let parsedData = regex("^.*data=(\\{.*\\})$", to_string($message.message));
  let parsedFields = parse_json(parsedData[0]);
  set_fields(parsedFields);
end

I get two errors as seen in the image below:

4. How can the community help?

I have probably been looking at this far too long, but cannot seem to find a solution or alternative, so any insight or assistance would be greatly appreciated…

You need to index with a string:

let parsedFields = parse_json(parsedData["0"]);

@patrickmann I tried the recommendation and get the same error.

rule "Parse Zenarmor Log"
when
  contains(to_string($message.message), "sample.secdoc.local zenarmor")
then
  let parsedData = regex("^.*data=(\\{.*\\})$", to_string($message.message));
  let parsedFields = parse_json(parsedData["0"]);
  set_fields(parsedFields);
end

The example/function explanations also seem to fit with the syntax, so that is what is rather frustrating…

Do you have another recommendation or approach to take?

I think there is a to_string missing as well:

let parsedFields = parse_json(to_string(parsedData["0"]));

@patrickmann Thank you. I get the following error after making the change:

But I started making some changes based on that variation and came up with the following:

rule "Parse Zenarmor Log"
when
  contains(to_string($message.message), "gambit.secdoc.tech zenarmor")
then
  let parsedData = regex("^.*data=(\\{.*\\})$", to_string($message.message));
  let parsedFields = parse_json(to_string(parsedData,'0'));
  set_fields(parsedFields);
end

which then leaves a single error:

Based on the error and the available functions, I think the select_jsonpath function would be correct:

Do you have experience with this function or thoughts on syntax?

Hey @secdoc

Have you tried creating two rules in one pipeline?

  • rule #1 regex into named field
  • rule #2 parse_json

When debugging pipeline rules it is best to break it down and check all the intermediate results. Here is my test rule:

rule "Test"
when
  true
then
  let parsedData = regex("^.*data=(\\{.*\\})$", to_string($message.message));
  set_field("a", to_string(parsedData));
  set_field("b", is_json(to_string(parsedData["0"])));
  set_field("c", to_string(parsedData["0"]));
  let parsedFields = parse_json("{\"start_time\":1684946740000,\"transport_proto\":\"TCP\"}");
  set_fields(to_map(parsedFields));
end

Result:

a: {0={"start_time":1684946740000,"transport_proto":"TCP"}}
b: false
c: {"start_time":1684946740000,"transport_proto":"TCP"}
start_time: 1684946740000
transport_proto: TCP
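
The same intermediate checks can also be reproduced outside Graylog. The sketch below is a rough Python analog of the test rule, not Graylog code: `re` and `json` stand in for `regex` and `parse_json`, the sample message is shortened to two fields taken from the log above, and Python numbers capture groups from 1 where the rule uses the key "0".

```python
import json
import re

# Shortened sample: syslog prefix plus a two-field payload from the log above.
message = (
    'sample.secdoc.local zenarmor[8765]: index=http, '
    'data={"start_time":1684946740000,"transport_proto":"TCP"}'
)

# Same pattern as the rule; the greedy .* inside the group runs to the last '}'.
match = re.search(r'^.*data=(\{.*\})$', message)

# Step 1: the raw capture (the rule's parsedData["0"]).
payload = match.group(1) if match else None

# Step 2: parse the captured text as JSON; json.loads raises on invalid input.
fields = json.loads(payload) if payload is not None else {}

print(payload)
print(fields)
```

Inspecting `payload` before parsing separates a regex problem from a JSON problem. Note also that the sample as posted contains numbers such as `00.117698669433597`; leading zeros are not valid JSON, so a strict parser would reject that payload if those values are real rather than a side effect of redaction.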

First off, you are still not indexing the match groups correctly. Unless you specified group names, they are indexed with string values "0", "1", etc.

Secondly, parse_json fails because it expects an escaped JSON. Yours is not escaped. Unfortunately there’s no pipeline function to escape a JSON string.

Thirdly, you need to apply to_map to the output of parse_json so you can use it in set_fields.

Graylog 5.1 now has a handy dandy instant rule simulator where you can test these things very easily.

@patrickmann thank you for the feedback. I will keep trying to work through this…