What options do I have when Graylog parses message fields incorrectly?
For example, here’s a message from a FortiGate firewall:
date=2023-10-05 time=10:32:59 devname=“hostname” logid=“0317013312” type=“utm” subtype=“webfilter” reqtype=“referral” url=“https://yt3.ggpht.com/mFSCqiot6mjEbniR-uqMGnRcekR4BbU3gg5O5_qb9KUlZlVlXXnSwM5ngs3dzuWpEt65lvGJzA=s88-c-k-c0x00ffffff-no-rj” sentbyte=1852 rcvdbyte=1706
Graylog incorrectly parses fields for this type of message, the ones that include URLs. For example, in this case I get:
The URL field value is correct, but there’s an extra second field that is just part of the URL data; it’s not an actual field of the log message.
How should I deal with these extra fields? Is there some pipeline rule I can build to delete them? Any option to get rid of these extra fields is welcome.
What input are you using? And are you using any extractors or pipelines to parse fields?
On my Fortinet/FortiGate firewalls I only extract what I need, which saves room on my drives.
For example, what I do is use regex.
You can either use an extractor (what I did) or do this in a pipeline.
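To illustrate the regex idea outside Graylog, here is a minimal Python sketch that pulls only a few wanted fields from the sample line in the question. The choice of fields, the `PATTERNS` table and the `extract` helper are hypothetical, and the sample is written with straight double quotes on the assumption that the curly quotes in the post come from forum formatting. Each pattern has a single capture group; whether a pattern can be pasted unchanged into a Graylog regex extractor is an assumption you would need to check against your own messages.

```python
import re

# Sample line from the question, with straight quotes (assumed to be what
# the firewall actually sends).
LINE = (
    'date=2023-10-05 time=10:32:59 devname="hostname" logid="0317013312" '
    'type="utm" subtype="webfilter" reqtype="referral" '
    'url="https://yt3.ggpht.com/mFSCqiot6mjEbniR-uqMGnRcekR4BbU3gg5O5_qb9KUlZlVlXXnSwM5ngs3dzuWpEt65lvGJzA=s88-c-k-c0x00ffffff-no-rj" '
    'sentbyte=1852 rcvdbyte=1706'
)

# One pattern per wanted field (hypothetical selection). The url pattern
# reads everything up to the closing quote, so an "=" inside the URL
# stays part of the value.
PATTERNS = {
    "url": re.compile(r'\burl="([^"]*)"'),
    "sentbyte": re.compile(r'\bsentbyte=(\d+)'),
    "rcvdbyte": re.compile(r'\brcvdbyte=(\d+)'),
}


def extract(line):
    """Return only the fields listed in PATTERNS that occur in the line."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(line)
        if match:
            fields[name] = match.group(1)
    return fields


if __name__ == "__main__":
    for key, value in extract(LINE).items():
        print(key, "=", value)
```

Because nothing outside the listed patterns is extracted, no field is created from the part of the URL after its `=`.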
If you go the pipeline route, you can set it up with key-value parsing and it should sort all your fields out.
Something like this, but you may need to adjust this example for your firewall.
Note: the more fields you create, the more storage volume they will need.
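As a rough illustration of key-value parsing that does not trip over URLs, the Python sketch below treats a double-quoted value as one unit and optionally keeps only a whitelist of keys. It is not a Graylog pipeline rule: `PAIR`, `parse_kv` and the `keep` parameter are hypothetical names, and the sample line is a shortened, made-up variant of the message in the question.

```python
import re

# key=value, where the value is either a double-quoted string or a run of
# non-space characters. The quoted alternative is tried first, so an "="
# inside a quoted URL never starts a new pair.
PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_kv(line, keep=None):
    """Parse key=value pairs; if keep is given, return only those keys."""
    fields = {}
    for match in PAIR.finditer(line):
        key = match.group(1)
        quoted, bare = match.group(2), match.group(3)
        value = quoted if quoted is not None else bare
        if keep is None or key in keep:
            fields[key] = value
    return fields


if __name__ == "__main__":
    # Shortened, hypothetical sample with an "=" inside the quoted URL.
    sample = 'type="utm" url="https://example.com/a/lvGJzA=s88-c-k" sentbyte=1852'
    print(parse_kv(sample))
    print(parse_kv(sample, keep={"url", "sentbyte"}))
```

Passing a small `keep` set mirrors the advice to extract only what you need, since every extra field adds to the stored volume.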
Hope that helps.
EDIT: As for deleting fields that have already been created, well… I’m not sure. The only thing I can think of is to send the new logs to a different index set and delete the old index set. If that is not what you want, those fields should go away once the index retention period has passed. To sum it up: if you set your index retention to 30 days, then that’s how long it will take.