Syslog message not being parsed when using Graylog Sidecar with filebeat input

Log messages are not automatically separated into all of their fields. Depending on the log shipper (nxlog, beats…) or the Graylog input you are using, some of the fields are broken out. For instance, winlogbeat separates out a lot of fields, but not all the ones I want.

That’s where extractors and/or pipeline rules come in - you use those to pull out the particular data that you want. You can use either or both, they pretty much do the same thing in the end.

If you had a complicated series of things you wanted to do with a message, the pipeline may be a better fit…

I do have a syslog input running but it is receiving logs from appliances - linux machines have beats installed.

:laughing: @tmacgbay I’ll be honest, I have been struggling over here with pipelines, but I kept practicing with your help. And thank you for the compliment. :+1:

Ok, so I’m now trying to set up a rule to grab all auditd messages, extract all key-value fields (I think that is the term) and prefix them with “auditd_”.

So maybe I’m a bit slow, but even after reading through the documentation (Rules - Processing Pipelines) I cannot see exactly how to get this simple part working and processing my audit log messages:
(the rule is mapped to the debian audit log stream via pipeline)

rule "auditd_identify_and_tag"

when
has_field("what do I put here")
then

So for “has_field” I’ve tried id “id-auditlog” as used in my auditd filebeat collector config as well as filebeat fields “audit_log: true” - also used in filebeat collector config. None of these work, however, so at this point it’s not really clear to me how to make the rule work…

On the stream which I’ve connected the pipeline to, I have only audit logs coming in, so I guess another option would be to do a sort of catch all rule, if that is possible?

EDIT:
I realised that all the debian audit log messages seem to have the string “type=” at the start of each log line, so I’ve tried to use that value for has_field, but to no avail :confused:

As this is a new question, I would pose it as a new thread in Graylog Central - that way it is available for others to view and answer. Also as said before, please use the </> tool to post code to make it more easily readable.

Assuming you have extracted key_value fields, either with an extractor or in a pipeline rule in a previous stage (all rules within a stage generally run at the same time), the field you are looking for in the when part of your rule would be auditid_id-auditlog ? It’s not clear.

If you had a field called my_phone_number and it contained the data 555-123-4556 it would look something like this:

rule "sample rule"
when
    has_field("my_phone_number")
then
    // do something with the message or broken out fields from the message here
    //
    // use $ tail -f /var/log/graylog-server/server.log to watch for the results of the below debug message
    //
    debug(concat("============ my_phone_number: ", to_string($message.my_phone_number)));
end
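
Note that has_field() only tests whether the field exists. If you also wanted to test what the field contains, a condition along these lines might work. This is only a sketch using the same made-up my_phone_number field and value from above, and the rule name is hypothetical:

```
rule "sample rule with value check"
when
    // true only when the field exists AND holds exactly this value
    has_field("my_phone_number") &&
    to_string($message.my_phone_number) == "555-123-4556"
then
    debug(concat("============ matched my_phone_number: ", to_string($message.my_phone_number)));
end
```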

In your new question thread, please pose the question and include an example message, and how you are breaking the message out into fields.

Not a new question as I see it as it is tightly related to the previous posts in this thread.

And my bad for being lazy and not formatting the couple of lines of code :wink:

Assuming you have extracted key_value fields, either with an extractor or in a pipeline rule in a previous stage (all rules within a stage generally run at the same time), the field you are looking for in the when part of your rule would be auditid_id-auditlog ? It’s not clear.

I have not extracted any fields yet - that’s what I’m trying to achieve with the rule+pipeline combo :slight_smile:

So, in essence I have auditd log lines coming in on my beats input / filebeat collector with this config:

# Needed for Graylog
fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}

filebeat.inputs:
- type: filestream
  id: id-auditlog
  paths:
    - /var/log/audit/audit.log
  fields:
    audit_log: true
output.logstash:
   hosts: ["graylog.<redacted>.com:5142"]
   ssl.verification_mode: full
path:
  data: /var/lib/graylog-sidecar/collectors/filebeat/data2
  logs: /var/lib/graylog-sidecar/collectors/filebeat/log

Here’s an example log entry from the stream that is attached to this Beats input:
First, in GUI:

Here are the contents of the “message” field from the above screenshot:

type=SYSCALL msg=audit(1659700716.423:14983737): arch=c000003e syscall=41 success=yes exit=14 a0=2 a1=2 a2=0 a3=561ffb9438b0 items=0 ppid=1 pid=593 auid=4294967295 uid=107 gid=112 euid=107 suid=107 fsuid=107 egid=112 sgid=112 fsgid=112 tty=(none) ses=4294967295 comm="snmpd" exe="/usr/sbin/snmpd" subj==unconfined key="mdatp"ARCH=x86_64 SYSCALL=socket AUID="unset" UID="Debian-snmp" GID="Debian-snmp" EUID="Debian-snmp" SUID="Debian-snmp" FSUID="Debian-snmp" EGID="Debian-snmp" SGID="Debian-snmp" FSGID="Debian-snmp"

On my mission to break off this entire message into fields I have then created a pipeline, connected it to the stream that receives audit log data from my Beats input, and created just a single rule which is supposed to break up key values into fields and prefix these with “auditd_”.

The rule looks like this:

rule "auditd_keys_to_fields"
when
    has_field("type")
then
    // extract all key-value pairs from "message" and prefix them with auditd_
    set_fields(
        fields: key_value(
            value: to_string($message.message),
            trim_value_chars: "\""
        ),
        prefix: "auditd_"
    );
end

As you can see I’ve used the keyword “type” in the has_field condition, as it seems that “type=” is present in the beginning of every auditd log message, but this configuration somehow does not match any messages (throughput=0).
I’ve then done a bit of trial and error and tried to use fields set in the filebeat collector for the has_field condition (have tried id-auditlog, audit_log, audit_log: true) but to no avail.

Any help with getting this thing working would be much appreciated…

:expressionless:

When continuing with this thread, you are pretty much locked in that only @gsmith and I will be reading it and we both volunteer our time for support. Tightly related for you, loosely coupled for us while we are also doing our day jobs.

However - I do have enough information now to give you some direction… :smiley:

===

The has_field() function checks whether a broken-out field with that name exists on the message; it does not search the message text. So, based on the log entry example you have, one candidate is the first broken-out field, as below.

has_field("beats_type")

Though that isn’t very specific. In your case, if you want to check whether the message starts with type=SYSCALL, you could have something like this:

rule "auditd_keys_to_fields"
when
    starts_with(to_string($message.message), "type=SYSCALL", true)
then
    // extract all key-value pairs from "message" and prefix them with auditd_
    set_fields(
        fields: key_value(
            value: to_string($message.message),
            trim_value_chars: "\""
        ),
        prefix: "auditd_"
    );
end
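
If a rule like this shows no throughput, it can help to first confirm that the pipeline sees any messages at all. A rough sketch, assuming a throwaway catch-all rule (the rule name is invented for illustration) placed in the same pipeline: `when true` matches every message that reaches the pipeline, and the debug() output lands in server.log, as with the earlier sample rule.

```
rule "auditd_debug_catch_all"
when
    // catch-all: matches every message that reaches this pipeline
    true
then
    // watch with: tail -f /var/log/graylog-server/server.log
    debug(concat("============ auditd raw message: ", to_string($message.message)));
end
```

If nothing shows up in server.log even with this rule, the condition is probably not the problem; the messages are more likely not reaching the pipeline in the first place.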

I tried implementing the rule exactly as you’ve suggested and unfortunately I still get a message throughput of zero on the rule :confused:

So I’ve gone back and played around with input extractors again. Using those, I can easily have Graylog split the message string into fields, but I’m lacking the essential option to prefix these individual fields with e.g. “auditd_”

Is there some way to achieve this using extractors solely, or is that sort of transformation only possible using rules/pipeline (which I have had no luck with whatsoever)?

Does anyone know how I might split auditd messages into one field per key=value segment and prefix these fields with “auditd_”?
It seems this would have to be done with pipelines/rules, but I cannot for the life of me get that working (no messages are processed by the rule, no matter what rule config I have tried).