Pipeline rule -> when -> "grok().matches == true" vs. "grok_exists()"

Dear community,

I am using Graylog 6.3.2 (docker) to gather all sorts of log data in my homelab.

I have recently switched my Unifi USG-3P for a UCG-Ultra. My old pipeline rules need some love to work again.

Unifi sends all sorts of different logs to a stream. I would like to set up several pipeline rules to parse the messages into fields and then forward the messages to different streams with different retention times.

example 1

UCG-Ultra [DMZ_LOCAL-D-2147483647] DESCR="[DMZ_LOCAL]Block All Traffic" IN=br40 OUT= MAC=45:00:00:20:ff:8a:40:00:40:11:52:99:c0:a8 SRC=192.168.40.1 DST=255.255.255.255 LEN=32 TOS=00 PREC=0x00 TTL=64 ID=65418 DF PROTO=UDP SPT=44828 DPT=10001 LEN=12 MARK=1a0000

example 2

UCG-Ultra [VPN_CUSTOM2-A-10000] DESCR="VPN to unsecure (Allow All)" IN=tlprt1 OUT=br30 MAC= SRC=192.168.2.2 DST=192.168.30.20 LEN=60 TOS=00 PREC=0x00 TTL=63 ID=35249 DF PROTO=TCP SPT=48712 DPT=8123 SEQ=885508839 ACK=0 WINDOW=65535 SYN URGP=0 MARK=1a0000

example 3

Aug 30 16:15:02 UCG-Ultra CEF:0|Ubiquiti|UniFi Network|9.3.45|401|WiFi Client Disconnected|2|UNIFIcategory=Monitoring UNIFIsubCategory=WiFi UNIFIhost=UCG Ultra UNIFIlastConnectedToDeviceName=UAP-AC-LR UNIFIlastConnectedToDeviceIp=192.168.1.3 UNIFIlastConnectedToDeviceMac=f0:9f:c2:fc:55:e7 UNIFIlastConnectedToDeviceModel=UAP-AC-LR UNIFIlastConnectedToDeviceVersion=6.6.77 UNIFIclientAlias=Android Device 88:a3 UNIFIclientIp=192.168.20.249 UNIFIclientMac=3a:fb:b9:5d:88:a3 UNIFIwifiChannel=6 UNIFIwifiChannelWidth=20 UNIFIwifiName=csesnsas_mobile UNIFIwifiBand=ng UNIFIwifiAirtimeUtilization=14 UNIFIwifiInterference=1 UNIFIlastConnectedToWiFiRssi=-76 UNIFIduration=39m 34s UNIFIusageDown=520.73 MB UNIFIusageUp=19.90 MB UNIFInetworkName=MobileDevices UNIFInetworkSubnet=192.168.20.0/24 UNIFInetworkVlan=20 msg=Android Device 88:a3 disconnected from csesnsas_mobile. Time Connected: 39m 34s. Data Used: 19.90 MB (up) / 520.73 MB (down). Last Connected To: UAP-AC-LR at -76 dBm.

I was planning to set up several pipeline rules in multiple stages to handle the different types of log messages. Each rule should only be applied to messages that have not yet been parsed by an earlier stage. (I set an additional field "parsed" = "true" for that.)

Working pipeline rule:

rule "Unifi UCG-Ultra: parsing firewall messages"

when
    grok(
        pattern: "^%{DATA:device_name} \\[%{DATA:firewall_rule}\\] DESCR=\"%{DATA:rule_description}\" IN=%{DATA:interface_in} OUT=%{DATA:interface_out} MAC=%{GREEDYDATA:mac} SRC=%{IP:source_ip} DST=%{IP:destination_ip} %{GREEDYDATA:type_of_service} PROTO=%{WORD:network_transport}(?: (?:SPT=%{NUMBER:source_port}|DPT=%{NUMBER:destination_port}|%{GREEDYDATA:other}))*",
        value: to_string( $message.message )
        ).matches == true

then
  set_fields(
    grok(
      pattern: "^%{DATA:device_name} \\[%{DATA:firewall_rule}\\] DESCR=\"%{DATA:rule_description}\" IN=%{DATA:interface_in} OUT=%{DATA:interface_out} MAC=%{GREEDYDATA:mac} SRC=%{IP:source_ip} DST=%{IP:destination_ip} %{GREEDYDATA:type_of_service} PROTO=%{WORD:network_transport}(?: (?:SPT=%{NUMBER:source_port}|DPT=%{NUMBER:destination_port}|%{GREEDYDATA:other}))*",
      value: to_string($message.message),
      only_named_captures: true
    )
  );
  set_field("parsed", "true");
end

Why does this rule not work with when grok_exists()?

rule "Unifi UCG-Ultra: parsing firewall messages"

when
    grok_exists(
       "^%{DATA:device_name} \\[%{DATA:firewall_rule}\\] DESCR=\"%{DATA:rule_description}\" IN=%{DATA:interface_in} OUT=%{DATA:interface_out} MAC=%{GREEDYDATA:mac} SRC=%{IP:source_ip} DST=%{IP:destination_ip} %{GREEDYDATA:type_of_service} PROTO=%{WORD:network_transport}(?: (?:SPT=%{NUMBER:source_port}|DPT=%{NUMBER:destination_port}|%{GREEDYDATA:other}))*"
)

then
  set_fields(
    grok(
      pattern: "^%{DATA:device_name} \\[%{DATA:firewall_rule}\\] DESCR=\"%{DATA:rule_description}\" IN=%{DATA:interface_in} OUT=%{DATA:interface_out} MAC=%{GREEDYDATA:mac} SRC=%{IP:source_ip} DST=%{IP:destination_ip} %{GREEDYDATA:type_of_service} PROTO=%{WORD:network_transport}(?: (?:SPT=%{NUMBER:source_port}|DPT=%{NUMBER:destination_port}|%{GREEDYDATA:other}))*",
      value: to_string($message.message),
      only_named_captures: true
    )
  );
  set_field("parsed", "true");
end

(I have not yet included the not has_field("parsed") condition…)

The Grok pattern is the same and, as far as I understand the Rules quick reference, this should work as well.

Any recommendations to make these rules more efficient are very welcome.

This one is super confusing. grok_exists() doesn’t do that, so if it appears to be working then something else is actually broken lol.

grok_exists() checks whether a named Grok pattern, e.g. DATA or USERNAME, has been set up in Graylog; it doesn’t check a value in a field.

It’s confusing because “pattern” has multiple meanings.
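To illustrate the point above, here is a minimal sketch of what grok_exists() is meant for. The rule name and the field grok_ip_available are made up for illustration, and it assumes that IP is among the Grok patterns stored in your Graylog installation.

```
rule "sketch: is the stored Grok pattern IP defined?"
when
    // grok_exists() takes the NAME of a stored Grok pattern, not a full expression,
    // and it never looks at the message content.
    grok_exists(pattern: "IP")
then
    // Hypothetical marker field, only here to make the result visible.
    set_field("grok_ip_available", true);
end
```

Passing a whole expression such as "^%{DATA:device_name} ..." asks for a stored pattern with that literal name, which presumably does not exist, so the condition would be false and the rule would never fire.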

Hi @Joel_Duffield,

thank you for your answer. I am not entirely sure I understand you correctly. Are you saying that my WHEN condition with grok() does not work the way I expect? The messages get parsed as I want them to, so it must return true. For the wrong reasons, you are saying?

when
    grok(
        pattern: "^%{DATA:device_name} \\[%{DATA:firewall_rule}\\] DESCR=\"%{DATA:rule_description}\" IN=%{DATA:interface_in} OUT=%{DATA:interface_out} MAC=%{GREEDYDATA:mac} SRC=%{IP:source_ip} DST=%{IP:destination_ip} %{GREEDYDATA:type_of_service} PROTO=%{WORD:network_transport}(?: (?:SPT=%{NUMBER:source_port}|DPT=%{NUMBER:destination_port}|%{GREEDYDATA:other}))*",
        value: to_string( $message.message )
        ).matches == true
...

I got this from here: Reddit: How can I use IF/ELSE in the THEN section of a pipeline rule?

Did I understand you correctly that grok_exists() is not a function to be used in WHEN-conditions?

Thank you for the clarification.

Chris

EDIT:

I have seen your recommendation on key-value here: Grok Pattern in Pipelines - #2 by Joel_Duffield

Would you use a Grok-pattern to get the first two objects and then use a key-value-function to get the rest? Then I would have to drop or rename certain fields. Would this be more efficient?

Sorry, I thought you had used grok_exists in both. Yes, the grok(...).matches == true kind of thing is the right way to do it in the when section.
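A minimal sketch of how the "parsed" guard mentioned earlier in the thread could be combined with the grok() check. The pattern is deliberately shortened and the capture name rest is hypothetical; in practice you would put the full pattern from the working rule in both places.

```
rule "sketch: parse firewall messages only once"
when
    // Skip messages that an earlier stage has already handled.
    NOT has_field("parsed") AND
    grok(
        pattern: "^%{DATA:device_name} \\[%{DATA:firewall_rule}\\] DESCR=\"%{DATA:rule_description}\" %{GREEDYDATA:rest}",
        value: to_string($message.message)
    ).matches == true
then
    // Same (shortened) pattern again, this time to write the captures to fields.
    set_fields(
        grok(
            pattern: "^%{DATA:device_name} \\[%{DATA:firewall_rule}\\] DESCR=\"%{DATA:rule_description}\" %{GREEDYDATA:rest}",
            value: to_string($message.message),
            only_named_captures: true
        )
    );
    set_field("parsed", "true");
end
```

Putting the has_field() check first keeps the intent readable: the rule only considers messages that no earlier stage has claimed.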

I’m a big fan of key-value pairs, they are so easy and flexible. However, I think the folks at Unifi may have closed that door for you, at least with some of these examples. Several of them are missing quotes and contain spaces both between the pairs and inside the values, so for those I don’t think KV is going to work.

Unless you are running at huge scale, the efficiency of one function vs. another shouldn’t matter.

Hi @Joel_Duffield ,

thank you for the clarification. My Unifi pipeline now has 7 stages to cover all the different message types and to add the geo-location. In my homelab, I get about 1.4 GB of data per month from 4 inputs. I assume this is considered a small installation. :)
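For later readers, the routing step from the original question could look roughly like the sketch below in a later stage. It assumes an earlier stage has already set firewall_rule (as the Grok pattern in this thread does) and that a stream with the hypothetical name "Unifi Firewall" already exists.

```
rule "sketch: route parsed firewall messages"
when
    // firewall_rule is only present if the firewall Grok pattern matched earlier.
    has_field("firewall_rule")
then
    // "Unifi Firewall" is a hypothetical stream name; adjust it to your setup.
    route_to_stream(name: "Unifi Firewall", remove_from_default: true);
end
```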

Yes, I love Unifi-products and I hate them for some aspects. The inconsistency of their log messages is one of them.

Thank you for your support,

Chris