RAW Input with “Length-prefixed framing”

Hello Graylog community,

I’m trying to collect logs from Fortinet firewall boxes. Here is a shortened example of the on-the-wire log format:
204 <190>logver=604061879 devname="FG646" cfgattr="source[factory->factory\]certificate[->\]private-key[->\]password[ENC LoXX9uSFSfKK+NVMmg==->ENC MT2VySK/Lu2JDxqbAA==\]" msg="Edit certificate.local CertTest"

This “syslog” format uses length-prefixed framing, which Graylog’s Syslog input handles mostly OK.
Unfortunately, when the received values contain embedded equal signs, the input’s parser creates bogus fields.
The example above would create a bogus field “NVMmg” with value “=->ENC” and another, “Lu2JDxqbAA”, with value “=]”.
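
For reference, the framing itself is simple to decode outside Graylog. Below is a minimal Python sketch, assuming the prefix is the decimal byte count of the message followed by a single space (the octet-counting style described in RFC 6587). The function name `split_octet_counted` is made up for illustration and is not part of Graylog.

```python
def split_octet_counted(buffer: bytes):
    """Split a byte buffer into complete length-prefixed frames.

    Returns (frames, remainder). The remainder holds a trailing
    incomplete frame that still needs more data from the socket.
    """
    frames = []
    pos = 0
    while True:
        space = buffer.find(b" ", pos)
        if space == -1:
            break  # length prefix is not complete yet
        prefix = buffer[pos:space]
        if not prefix.isdigit():
            raise ValueError(f"bad length prefix: {prefix!r}")
        start = space + 1
        end = start + int(prefix)
        if end > len(buffer):
            break  # frame body is not complete yet
        frames.append(buffer[start:end])
        pos = end
    return frames, buffer[pos:]
```

Because each frame is cut by byte count, the message body can contain newlines, null bytes or equal signs without confusing the framing. Only the later field parsing is affected by the embedded equal signs.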

I know that the key=value tokenizer extractor would parse the message fields flawlessly.
I verified this on FortiManager, which uses newline framing with the same key=value format. That works.

But now the question is: How do I ingest this length-prefixed framing?
Is there a way to receive “length-prefixed” messages with no further parsing?
Alternatively, is there a way to mass-delete all fields created by the syslog parser?

I found only this question by u/Sylvain, who did not know about length-prefixed framing:

Hello && Welcome

I need to ask a couple of questions. What version of Graylog are you using, and basically what have you tried so far?
We have multiple FortiGate firewalls and a separate input for those devices, with extractors configured to create the fields we want. In your case, have you looked into pipeline configuration?

I believe the Graylog servers are fully up to date, so 4.1.
I’ve tried RAW TCP input with newline delimiter framing, which concatenates messages into long walls of text.
I’ve tried RAW TCP input with null delimiter framing, which concatenates messages into long walls of text.
I’ve tried Syslog TCP input, which mostly works, but does non-optional parsing that creates bogus fields with unpredictable names when certificates get updated.

I’ve tried the CEF TCP input, but that (presumably) does not like the CEF dialect spoken by FortiGates (and FortiManager, for that matter) and just blackholes all messages.

The site is connected via WiFi, making UDP a non-option.

I am aware of both Extractors and Pipeline processors, but I don’t know of any feature that would allow me to either fix framing errors caused by Raw input or to mass-remove bogus fields caused by Syslog input.

What do you do differently? What do your messages look like on the wire? Thank you for trying to help.


There is no button in Graylog that I know of that will do this, which is why I suggested a pipeline.

Our logging servers are internal to each environment we have.
Our Fortinet 60, 80, 100 and 200 series devices feed Graylog within their environment. For our WiFi, this is an example of the flow: Fortinet AP → Fortinet FW → Graylog Server

Input Raw/Plaintext UDP, ports above 1024.
We have a couple grok/regex extractors for that input. Example:

Messages look like this, sorry I had to cut out personal information.

I’m not very good at pipelines yet, but I know some community members here are. I’m not 100% sure it can be done, but I have seen some amazing things from pipelines. I’m quite sure that if you run a global search here in the forum you will come across pipeline examples that extract what you need and drop the rest of the message, and/or convert it into something else.

You can get some ideas for pipelines here. I use these examples for learning.

Hope that helps

I think UDP uses “one message per UDP packet” framing and that is why it works for you.
My setup is stuck trying to find an Input that will work.

Thank you for the details of your setup. Both the GROK example and the testsuite link were very useful.

I think I can let the Syslog TCP input do its non-ideal parsing and create its bogus fields. Then, in a pipeline processor, I create a new message, parse it from scratch and delete the original. It will be ugly because Graylog will parse the message twice, but it will work. Thanks!


Welp, that kinda worked and did what I asked.

This rule successfully kills the original message and its bogus fields…

rule "TEST FGT-Direct re-parser"
when
  from_input(name: "FGT special TCP Syslog 10514")
then
  // Copy what we need from the original (mis-parsed) message.
  let msgtext = to_string($message.message);
  let msgfull = to_string($message.full_message);
  let msgsource = to_string($message.source);
  // FIXME: Next line is Fortinet-specific.
  // Generic solution: let eventtime = to_date($message.timestamp);
  let eventtime = parse_unix_milliseconds(to_long($message.eventtime) / 1000000);
  // Build a fresh message that carries none of the bogus fields.
  let nmsg = create_message(msgtext, msgsource, eventtime);
  set_field(field: "full_message", value: msgfull, message: nmsg);
  // FIXME: Parsing for your logs goes here... or into the next pipeline stage
  let k_v = key_value(value: msgtext, delimiters: " ", kv_delimiters: "=", trim_value_chars: "\"");
  set_fields(fields: k_v, message: nmsg);
  let level_int = lookup_value("SyslogLevels", nmsg.level, 0);
  set_field(field: "level", value: to_long(level_int), message: nmsg);

  route_to_stream(name: "ZZ Scratchpad", message: nmsg, remove_from_default: true);
  // Drop the original message together with its bogus fields.
  drop_message();
end

… but it ultimately failed to do what I need,
because the key_value function in Pipelines, while more configurable than its key=value tokenizer extractor counterpart, seems unable to handle spaces in quoted fields 🙁

I guess I’ll have to ask about that in a new topic: Running key=value tokenizer extractor from Pipelines?
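
To make the desired behaviour concrete, here is a minimal Python sketch of a quote-aware key=value tokenizer, run against the shortened sample from the first post. It is only an illustration of what the parsing should produce, not how Graylog's extractor is implemented. The names `PAIR` and `parse_kv` are made up, and treating a backslash inside quotes as an escape character is an assumption about the Fortinet format.

```python
import re

# key, '=', then either a double-quoted value (which may contain spaces,
# '=' signs and backslash escapes) or a bare value up to the next whitespace.
PAIR = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S*))')


def parse_kv(text: str) -> dict:
    """Parse key=value pairs, keeping each quoted value in one piece."""
    fields = {}
    for match in PAIR.finditer(text):
        key, quoted, bare = match.groups()
        fields[key] = quoted if quoted is not None else bare
    return fields


sample = (
    r'logver=604061879 devname="FG646" '
    r'cfgattr="source[factory->factory\]certificate[->\]private-key[->\]'
    r'password[ENC LoXX9uSFSfKK+NVMmg==->ENC MT2VySK/Lu2JDxqbAA==\]" '
    r'msg="Edit certificate.local CertTest"'
)

for key, value in parse_kv(sample).items():
    print(f"{key} = {value}")
```

Since a quoted value is consumed as a single token, the spaces and the embedded "==" inside cfgattr never start a new field, so no "NVMmg" or "Lu2JDxqbAA" fields appear.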

