Message format issue

MazeOfFate · August 24, 2022, 8:23am

Hello,

I’m setting up new log management for an application with Graylog.
Logs are sent to Graylog using a UdpAppender configured in the application’s log4net.config file.
A specific input was created to receive these messages, and they are received correctly.

Now, I would like to add an extractor to extract information from a message but a simple regex doesn’t match.
Message example:
2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text

Simple regex that doesn’t match (I know my regex is stupid but it is just to show you the issue):
.*(INFO).*

But the following regex does match (which suggests there are invisible characters between each letter of the message):
.*(I\SN\SF\SO).*

And if I try to add the message to a query using the “Add to query” feature, I get the following behaviour:

There are unwanted characters between the letters of the message.

Do you think that this behaviour is coming from the application log4net.config file or from Graylog itself?
Have you already had this kind of issue, and how did you solve it?

Thanks for your help.

gsmith · August 24, 2022, 10:11pm

Hello @MazeOfFate

I’ve seen something similar to this before.

Graylog only interprets the logs it is sent according to the type of input being used. Have you tried different inputs?

Correct me if I’m wrong, but are you trying to separate the message into different fields?

2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text

Or are you trying to just search for messages?

EDIT: Here is an example regex that grabs the field between the first two pipes. This is just a demo.

^.*?\|(.*?)\|.*$

Results:
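
A small Python sketch, outside Graylog, of what this pattern captures from the sample line. It assumes the message arrives as clean text; the variable names are only for illustration.

```python
import re

sample = "2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text"

# The lazy quantifiers stop at the first and second pipe, so the group
# captures the level field, including the spaces around it.
match = re.match(r"^.*?\|(.*?)\|.*$", sample)

# Trim the padding left over from the " | " separators.
level = match.group(1).strip() if match else None
print(level)  # INFO
```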

tmacgbay · August 24, 2022, 11:30pm

If this is handled in the pipeline you can use the split() function as I did in the referenced post and use the | as the marker for splitting…

gsmith · August 24, 2022, 11:34pm

Hey,
Are you referring to something like this?

let m = split("\\|", to_string($message.message));
set_field("info", m[0]);
set_field("hosts", m[1]);

I’m working in my lab and haven’t done this yet, so I figured I’d give it a go.

gsmith · August 25, 2022, 12:07am

@tmacgbay

Think I got it

rule "fields"
when
  true
then
  let m = split("\\|", to_string($message.message));
  set_field("datetime", m[0]);
  set_field("info", m[1]);
  set_field("ops", m[2]);
  set_field("hosts", m[3]);
  set_field("mgms", m[4]);
end

Results:
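
For comparison outside Graylog, here is a Python sketch of the same split, reusing the field names from the rule. Plain splitting on the pipe keeps the spaces around each value. Whether the Graylog fields need the same trimming is an assumption worth checking against your own messages.

```python
sample = "2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text"

# Field names copied from the pipeline rule.
names = ["datetime", "info", "ops", "hosts", "mgms"]

# Split on the literal pipe; each part still carries its padding spaces.
parts = sample.split("|")
raw = dict(zip(names, parts))

# Strip the padding so values compare cleanly in searches.
fields = {name: part.strip() for name, part in zip(names, parts)}
print(fields["info"], fields["hosts"])
```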

MazeOfFate · August 25, 2022, 10:35am

Hello,

Thanks for your answers.

But my issue is not really how to retrieve values by splitting them, but rather why there are unexpected characters between the letters of the message.

I’m using a ‘Raw/Plaintext UDP’ input. I also tried a ‘Syslog UDP’ input, with the same result.

The only workaround I found, based on @gsmith’s proposal, is to add a regex that removes these extra characters, but it is a little bit dirty:

rule "NewRule"
when
  from_input(name:"My Input")
then
 let m = split("\\|", regex_replace( "[^a-zA-Z0-9\\s-:;,\\-=\\._]" , to_string($message.message), "" ));
 set_field("datetime", m[0]);
 set_field("lvl", m[1]);
 set_field("ops", m[2]);
 set_field("host", m[3]);
 set_field("msg", m[4]);
end

Could the encoding used by the log4net appender be the cause?
I ask because only the message field is affected. Other fields, like timestamp, are not.

MazeOfFate · August 25, 2022, 1:17pm

I found the root cause and it was my log4net configuration…
Encoding has to be set to avoid these extra chars.

Previous config:

    <appender name="MyUDPAppender" type="log4net.Appender.UdpAppender">
        <remoteAddress value="000.000.000.000" />
        <remotePort value="12345" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%date | %level | Module | %message%newline" />
        </layout>
    </appender>

New config:

    <appender name="MyUDPAppender" type="log4net.Appender.UdpAppender">
        <remoteAddress value="000.000.000.000" />
        <remotePort value="12345" />
        <encoding value="utf-8" />
        <layout type="log4net.Layout.PatternLayout, log4net">
            <conversionPattern value="%date | %level | Module | %message%newline" />
        </layout>
    </appender>
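
The symptom fits a sender and a receiver that disagree on encoding. Below is a minimal Python sketch of that mismatch. It assumes the appender was emitting UTF-16LE while the input decoded the bytes as UTF-8, which is only a guess, since the encoding actually in use before the change was not checked.

```python
import re

line = "2022-08-23 11:57:17,925 | INFO | Operation | MY-HOST-NAME-001 | Message Text"

# Assumption: the sender writes UTF-16LE and the receiver reads the same bytes as UTF-8.
wire_bytes = line.encode("utf-16-le")
as_seen_by_receiver = wire_bytes.decode("utf-8")

# Every ASCII character is now followed by a NUL character (U+0000).
print(repr(as_seen_by_receiver[:8]))

# The plain pattern fails because "I", "N", "F", "O" are no longer adjacent.
plain_match = re.search(r".*(INFO).*", as_seen_by_receiver)

# \S matches NUL (it is not whitespace), so the workaround pattern matches.
padded_match = re.search(r".*(I\SN\SF\SO).*", as_seen_by_receiver)

# With the same encoding on both sides the text arrives intact.
round_trip = line.encode("utf-8").decode("utf-8")
print(plain_match is None, padded_match is not None, round_trip == line)
```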

Thanks for your help on this topic.
Have a good day.

tmacgbay · August 25, 2022, 1:24pm

Ha! That was going to be my suggestion just as I was reading through this completely - I had an issue a while back with UTF8 encoding…

Glad you found it!

Good work by @gsmith coming up with an excellent and unasked-for pipeline rule solution too!!

Mark your post as a solution for future UTF8 searchers!!

system · September 8, 2022, 1:24pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
