Saving only part of a log

Graylog 4.3.3

Hi, I’m receiving large logs from a Sophos firewall, and I’m only interested in 20% of their content, so saving the whole log takes up a lot of storage. I’m already using extractors to get the fields I’m interested in. Is there a way to strip the logs and save only specific fields to storage?

You can use the remove_field() function in a pipeline to drop the fields you don’t want. I have also used the regex_replace() function to strip standardized wording out of some of the larger $message.message results: I want the meat of the message but not the disclaimer that comes with each one…
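To make those two operations concrete, here is a minimal Python sketch of the logic only. It is not Graylog pipeline syntax, and the disclaimer text, the field names, and the `strip_message` helper are all made up for illustration; in Graylog the same steps would be done with remove_field() and regex_replace() inside a pipeline rule.

```python
import re

# Hypothetical disclaimer wording; substitute whatever boilerplate your device appends.
DISCLAIMER = re.compile(r"\s*This message is confidential.*$", re.DOTALL)


def strip_message(msg: dict, unwanted: set) -> dict:
    """Return a copy of msg without the unwanted fields and without the disclaimer."""
    # Drop every field named in `unwanted` (the remove_field() idea).
    cleaned = {key: value for key, value in msg.items() if key not in unwanted}
    # Cut the standard wording out of the message text (the regex_replace() idea).
    if "message" in cleaned:
        cleaned["message"] = DISCLAIMER.sub("", cleaned["message"])
    return cleaned
```

The function returns a new dictionary rather than editing the input, which makes it easy to compare the size before and after on a few sample messages.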

That should get you started. If you post more specific information (obfuscated, of course), I might be able to give you a more specific answer! :smiley:

Thank you very much, that’s an eye-opener. The logs I’m getting look like this:

<30>device="SFW" date=2022-08-09 time=07:58:34 timezone="+03" device_name="XG450" device_id=C43101J88JXXXXX log_id=010101600001 log_type="Firewall" log_component="Firewall Rule" log_subtype="Allowed" status="Allow" priority=Information duration=0 fw_rule_id=59 nat_rule_id=43 policy_type=1 user_name="" user_gp="" iap=18 ips_policy_id=0 appfilter_policy_id=11 application="" application_risk=0 application_technology="" application_category="" vlan_id="" ether_type=Unknown (0x0000) bridge_name="" bridge_display_name="" in_interface="XXXXX" in_display_interface="XXXXXX" out_interface="Port4" out_display_interface="Port4" src_mac=E4:XX:4C:00:XX:XX dst_mac=00:XX:CD:5D:XX:XX src_ip=XXX.XXX.XXX.XXX src_country_code=R1 dst_ip=XXX.XXX.XXX.XXX dst_country_code=R1 protocol="TCP" src_port=XXXXX dst_port=XXXXX sent_pkts=1 recv_pkts=1 sent_bytes=52 recv_bytes=40 tran_src_ip=XXX.XXX.XXX.XXX tran_src_port=0 tran_dst_ip= tran_dst_port=0 srczonetype="LAN" srczone="LAN" dstzonetype="WAN" dstzone="WAN" dir_disp="" connevent="Stop" connid="91244XXXX" vconnid="" hb_health="No Heartbeat" message="" appresolvedby="Signature" app_is_cloud=0

I’m only interested in these fields:

log_component="Firewall Rule"
fw_rule_id=59
user_name=""
user_gp=""
application=""
in_display_interface="XXX-XXX"
src_mac=E4:XX:4C:00:XX:XX
dst_mac=00:XX:CD:5D:XX:XX
src_ip=XXX.XXX.XXX.XXX
dst_ip=XXX.XXX.XXX.XXX
src_port=XXXXX
dst_port=XXXXX
srczone="LAN"
dstzone="WAN"

Since I’m getting millions of these every day, any reduction will help greatly. I also have extractors for these values, if that is any help…

If you can build extractors that pull only the values you want out of the message, that would work: perhaps a well-built GROK pattern… either in extractors OR in the pipeline

You could leave the extractors in place and then drop the extra fields in the pipeline with the aforementioned functions… or you could drop the extractors altogether and separate out your fields in the pipeline, where you would only create the fields you want… Someone created a pipeline rule to do just that here. It’s not necessarily efficient… but it works…
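As a rough illustration of the second option (create only the fields you want), here is a Python sketch of the logic, not a Graylog rule. It assumes the firewall emits plain ASCII double quotes and space-separated key=value pairs as in the sample above; the `KEEP` tuple mirrors the field list from the question, and `keep_fields` and `rebuild` are hypothetical helper names. Graylog's pipeline language has its own functions for this kind of parsing.

```python
import re

# Fields to keep, taken from the list in the question.
KEEP = ("log_component", "fw_rule_id", "user_name", "user_gp", "application",
        "in_display_interface", "src_mac", "dst_mac", "src_ip", "dst_ip",
        "src_port", "dst_port", "srczone", "dstzone")

# A key, an equals sign, then either a quoted value or a run of non-space characters.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S*)')


def keep_fields(raw: str, keep=KEEP) -> dict:
    """Parse key=value pairs from a raw log line and keep only the wanted keys."""
    fields = {}
    for key, value in PAIR.findall(raw):
        if key in keep:
            fields[key] = value.strip('"')
    return fields


def rebuild(fields: dict) -> str:
    """Build a shortened message string from the kept fields."""
    return " ".join(f'{key}="{value}"' for key, value in fields.items())
```

Unquoted values that contain spaces, such as `ether_type=Unknown (0x0000)`, are only captured up to the first space by this pattern. That does not matter here because none of the wanted fields have that shape, but it is worth checking against your own data.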

Hello @Alper

Adding on to @tmacgbay’s suggestion.

To be honest, a pipeline with regex would probably be your best way to create all those fields. I use a lot of regex extractors for those fields and have noticed that their resource consumption is greater than just using one pipeline.
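One plausible reason for that difference, sketched in Python rather than measured in Graylog: with one regex per field, every pattern scans the line separately, while a single rule can collect all pairs in one pass. The field names come from the sample log; the function names are hypothetical, and this says nothing about how Graylog implements extractors internally.

```python
import re

FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port")

# Extractor-style: one pattern per field, each searching the line on its own.
PER_FIELD = {name: re.compile(rf'\b{name}=("[^"]*"|\S*)') for name in FIELDS}


def extract_per_field(raw: str) -> dict:
    out = {}
    for name, pattern in PER_FIELD.items():
        match = pattern.search(raw)
        if match:
            out[name] = match.group(1).strip('"')
    return out


# Pipeline-style: one pass that reads every key=value pair once.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S*)')


def extract_single_pass(raw: str) -> dict:
    wanted = set(FIELDS)
    return {key: value.strip('"') for key, value in PAIR.findall(raw) if key in wanted}
```

Both functions return the same fields for the same line; the difference is the number of scans, which grows with the number of per-field patterns in the first version and stays at one in the second.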
