Message format issue

Hello,

I’m setting up new log management for an application with Graylog.
Logs are sent to Graylog using a UdpAppender configured in the application’s log4net.config file.
A dedicated input was created to receive these messages, and they are received correctly.

Now I would like to add an extractor to pull information out of a message, but even a simple regex doesn’t match.
Message example:
2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text

Simple regex that doesn’t match (I know my regex is stupid but it is just to show you the issue):
.*(INFO).*

But the following regex does match (which suggests there are invisible characters between the letters of the message):
.*(I\SN\SF\SO).*

And if I add the message to a query using the “Add to query” feature, unwanted characters appear between the letters of the message.

Do you think this behaviour comes from the application’s log4net.config file or from Graylog itself?
Have you already had this kind of issue, and if so, how did you solve it?

Thanks for your help.

Hello @MazeOfFate

I’ve seen something similar to this before.

Graylog only interprets the logs it is sent according to the type of input being used. Have you tried different inputs?

Correct me if I’m wrong, but are you trying to separate the message into different fields?

2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text

Or are you trying to just search for messages?

EDIT: Here is a demo regex that grabs the text between the first and second pipe.

^.*?\|(.*?)\|.*$
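
To illustrate what this pattern captures, here is a minimal Python sketch run against the sample message from the first post. Python’s `re` module stands in for Graylog’s regex engine here, which is an assumption made purely for illustration, and the `strip()` call is an addition that is not part of the pattern itself.

```python
import re

# Sample line taken from the original post.
sample = "2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text"

# Same pattern as above: skip lazily to the first pipe,
# then capture everything up to the next pipe.
pattern = re.compile(r"^.*?\|(.*?)\|.*$")

match = pattern.match(sample)

# The raw capture keeps the spaces around the value, so trim it.
level = match.group(1).strip() if match else None
print(level)
```

The capture group includes the spaces on either side of the value, so some trimming is likely needed before the field is used in searches.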

If this is handled in the pipeline you can use the split() function as I did in the referenced post and use the | as the marker for splitting…

Hey,
Are you referring to something like this?

let m = split("\\|", to_string($message.message));
set_field("info", m[0]);
set_field("hosts", m[1]);

I’m working in my lab and haven’t done this yet, so I figured I’d give it a go.

@tmacgbay

Think I got it :smiley:

rule "fields"
when
  true
then
  let m = split("\\|", to_string($message.message));
  set_field("datetime", m[0]);
  set_field("info", m[1]);
  set_field("ops", m[2]);
  set_field("hosts", m[3]);
  set_field("mgms", m[4]);
end
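
As a rough illustration of what the rule does with the sample message, the same split can be sketched in Python. This is not Graylog code: the field names are copied from the rule, while the whitespace trimming is an addition of this sketch, since splitting on a bare `|` leaves the surrounding spaces in each piece.

```python
# Sample line taken from the original post.
sample = "2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text"

# Field names in the same order as the set_field() calls in the rule.
field_names = ["datetime", "info", "ops", "hosts", "mgms"]

# Split on the pipe delimiter; each piece still carries surrounding spaces.
parts = sample.split("|")

# Pair each name with its piece and trim the spaces (trimming is an
# addition of this sketch, the rule stores the pieces as they are).
fields = {name: value.strip() for name, value in zip(field_names, parts)}

for name, value in fields.items():
    print(f"{name}: {value!r}")
```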

Hello,

Thanks for your answers.

But my issue is not really how to retrieve values by splitting them, but rather why there are unexpected characters between the letters of the message.

I’m using a ‘Raw/Plaintext UDP’ input. I also tried a ‘Syslog UDP’ input, with the same result.

The only workaround I found, based on @gsmith’s proposal, is to add a regex that removes these extra characters, but it is a little bit dirty:

rule "NewRule"
when
  from_input(name:"My Input")
then
 let m = split("\\|", regex_replace( "[^a-zA-Z0-9\\s-:;,\\-=\\._|]" , to_string($message.message), "" ));
 set_field("datetime", m[0]);
 set_field("lvl", m[1]);
 set_field("ops", m[2]);
 set_field("host", m[3]);
 set_field("msg", m[4]);
end
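
The idea behind this workaround can be sketched in Python. The damaged message is simulated here by placing a NUL character after every character, which is an assumption about what the unwanted characters are. Whatever allow-list is used, it has to keep the `|` delimiter, otherwise the following split has nothing to split on. Removing only the NUL characters is shown as a narrower alternative.

```python
import re

clean_text = "2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text"

# Hypothetical reproduction of the damaged message:
# a NUL character after every original character.
damaged = "".join(ch + "\x00" for ch in clean_text)

# Allow-list approach: delete every character outside the listed set.
# The pipe is part of the set so the later split still finds its delimiter.
allow_list = re.compile(r"[^a-zA-Z0-9\s:;,\-=._|]")
repaired = allow_list.sub("", damaged)

# Narrower alternative: remove only the NUL characters.
repaired_narrow = damaged.replace("\x00", "")

# Split the repaired text into its pipe-delimited pieces.
parts = [piece.strip() for piece in repaired.split("|")]
print(parts)
```

The allow-list also silently drops any legitimate character that is not listed, such as accented letters, which is one reason fixing the encoding at the source is preferable.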

Could the encoding used by the log4net appender be the cause?
I ask because only the message field is affected. Other fields, like the timestamp, are not.

I found the root cause, and it was my log4net configuration…
The encoding has to be set explicitly to avoid these extra characters.

Previous config:

    <appender name="MyUDPAppender" type="log4net.Appender.UdpAppender">
        <remoteAddress value="000.000.000.000" />
        <remotePort value="12345" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%date | %level | Module | %message%newline" />
        </layout>
    </appender>

New config:

    <appender name="MyUDPAppender" type="log4net.Appender.UdpAppender">
        <remoteAddress value="000.000.000.000" />
        <remotePort value="12345" />
        <encoding value="utf-8" />
        <layout type="log4net.Layout.PatternLayout, log4net">
            <conversionPattern value="%date | %level | Module | %message%newline" />
        </layout>
    </appender>
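
The symptom can be reproduced with a minimal Python sketch, assuming the appender was sending UTF-16LE bytes before the encoding was set. That assumption is not confirmed anywhere in this thread, but it matches what was observed: every ASCII character is followed by a NUL byte, which `\S` matches because NUL is not whitespace.

```python
import re

message = "2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text"

# Assumption: the sender encodes as UTF-16LE and the receiver decodes as UTF-8.
wide = message.encode("utf-16-le").decode("utf-8")

# With the encoding set to utf-8 on both sides the text survives unchanged.
narrow = message.encode("utf-8").decode("utf-8")

simple = re.compile(r".*(INFO).*")          # regex from the first post
spaced = re.compile(r".*(I\SN\SF\SO).*")    # regex that did match

print("simple matches wide text:  ", simple.search(wide) is not None)
print("spaced matches wide text:  ", spaced.search(wide) is not None)
print("simple matches narrow text:", simple.search(narrow) is not None)
print("wide text starts with:     ", repr(wide[:8]))
```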

Thanks for your help on this topic.
Have a good day.

Ha! That was going to be my suggestion just as I was reading through this completely - I had an issue a while back with UTF8 encoding…

Glad you found it!

Good work by @gsmith coming up with an excellent and unasked-for pipeline rule solution too!! :smiley:

Mark your post as a solution for future UTF8 searchers!!
