Pipeline regex rule

I’m trying to create a custom field. In a previous version I used an extractor, but I understand extractors have gone away in favor of pipelines and rules. (We are using ver 4.29.)

Basically there is a log message with a client name in the middle. For example…
beginning of log message CLIENTA end of log message
beginning of log message CLIENTB end of log message
beginning of log message CLIENTA end of log message

I’m trying to count how many times each client appears per 12-hour period.

So I’m trying to create this rule to use in a pipeline to create the field.

rule "Extract ClientName from Log"
when
regex("beginning of log message(.+)end of log message", to_string($message.message)).matches == true;
then
let clientname = regex("beginning of log message(.+)end of log message");
set_field("clientname",clientname);
end

But I’m getting an error when I try to save the rule. I’ve been looking through forums and documentation but can’t seem to sort it out.

Is this a good approach? And if so, can someone point me in the right direction on the rule creation?

THANK YOU in advance for any assistance.

I think it’s your variable assignment that’s causing the problem. The regex() call in the then block has no second argument, so you didn’t give it a data type or a field to look in for the string: (,to_string($message.message))

Try something like this, but replace it with your own regex. Note the quotes around the expression itself. Those are mandatory.

let result = regex(("<\d+>.*%(\w+-\d+-\d+):.*INTERNET:(\d+.\d+.\d+.\d+)\/(\d+).to (\S+):(\d+.\d+.\d+.\d+)\/(\d+).$"),to_string($message.message));

You will also need to convert yours to Java regex, rather than Perl. Ironically, my own is in Java regex, but Discourse automatically converts it back into Perl when I post it.

I use RegexPlanet (online regular expression testing for Java) to convert my “regular” regex into Java-compliant regex. As you should have been able to see from my example above, essentially it just double-escapes everything.

Then you just take the java-ized regex and paste it into your rule.
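The double-escaping can be checked outside Graylog, since the pattern itself is ordinary Java regex. The following is only a minimal sketch: the class name `EscapeDemo` and the sample text are made up for illustration and are not part of Graylog. It shows that a regex token such as \d or \. needs a doubled backslash once it sits inside a quoted string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Graylog: shows how a regex looks once
// it is written as a quoted Java string.
class EscapeDemo {
    // The regex (\d+\.\d+\.\d+\.\d+) written as a string literal:
    // every backslash is doubled.
    static final Pattern IP = Pattern.compile("(\\d+\\.\\d+\\.\\d+\\.\\d+)");

    // Returns the first IPv4-looking token in the text, or null if none.
    static String firstIp(String text) {
        Matcher m = IP.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up sample text, used only to exercise the pattern.
        System.out.println(firstIp("INTERNET:10.0.0.1/443 to host"));
    }
}
```

If the single-backslash form were pasted into the string instead, Java would reject \d as an invalid escape, which is likely the same kind of error the rule editor reports.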

Tip: Be sure to choose field names that conform to the Graylog Information Model whenever possible. There will be an increasing amount of community content and learning material that presumes the GIM is already applied.

https://schema.graylog.org/en/stable/

Chris, thank you!! That did help, and I was finally able to save my rule. I think it is running but I’m not totally sure. I can’t see that the field was created. Maybe my regex is still wrong? Or maybe I need to set the variable somehow? Is there a way to tell if the pipeline is even being called? The stream is definitely logging messages, yet the pipeline still says 0 msgs throughput.

So here’s an example from the message field in the stream that is associated with the pipeline:

Apr 7, 2023 12:20:14 PM Command - Error : A subdirectory or file X:\CLIENTA\incoming\ already exists.

And to recap, I’m trying to extract CLIENTA.

So I’ve written this rule. I got errors until I converted it to Java regex using RegexPlanet.

rule "Extract ClientName from Log"
when
regex(("Command - Error : A subdirectory or file X:\\(.+)\\incoming\\ already exists\."), to_string($message.message)).matches == true
then
let result = regex(("Command - Error : A subdirectory or file X:\\(.+)\\incoming\\ already exists\."),to_string($message.message));
set_field("clientname_incoming",result["0"]);
end

(Only when I do the double-escaping will it allow me to save.)

But still I do not see clientname_incoming, nor name.clientname_incoming as a field in the search, and I still see 0 msgs as the pipeline throughput. How do I know if clientname_incoming is being extracted and saved as a field for this stream?

Shouldn’t I be able to now create an alert based on clientname_incoming as a usable field?

(BTW, as you said, my regex actually does have 4 slashes and 2 slashes before the last period.)
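The four backslashes can be tried in plain Java against the sample message above. This is a sketch under assumptions: the class name `ClientNameDemo` is hypothetical, and it uses find() on the assumption that the rule's regex() looks for the pattern anywhere in the message, which the rule relies on because the message starts with a timestamp.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Graylog.
class ClientNameDemo {
    // Regex as the engine sees it:  X:\\(.+)\\incoming\\ already exists\.
    // Inside a quoted string each of those backslashes is doubled again,
    // giving four backslashes per literal backslash and two before the dot.
    static final Pattern CLIENT = Pattern.compile(
        "Command - Error : A subdirectory or file X:\\\\(.+)\\\\incoming\\\\ already exists\\.");

    // Returns the client name from the message, or null if it does not match.
    static String clientName(String message) {
        Matcher m = CLIENT.matcher(message);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample message from the thread, with its backslashes escaped for Java.
        String sample = "Apr 7, 2023 12:20:14 PM Command - Error : "
            + "A subdirectory or file X:\\CLIENTA\\incoming\\ already exists.";
        System.out.println(clientName(sample)); // prints CLIENTA
    }
}
```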

UPDATE: OK, it looks like it is working! I needed to re-order the Message Processors Configuration (System/Configurations > Configurations > Message Processors Configuration > Update) so that the Message Filter Chain runs before the Pipeline Processor.

After I did that I then saw some throughput in the Pipelines. And I do see some values being written to the variable when I view the incoming messages!

Here’s what I see when browsing the messages…

clientname_incoming
{"0":"CLIENTA"}

Now I just need to find the proper syntax to utilize them in a search and alert > notification scenario…

If you post the rule and the sample message, we may be able to make a suggestion.

I did post the rule and sample message. But that may not be necessary, since it seems to be working now that I’ve rearranged the Message Processors Configuration order.

Now, I do see the new field in the message view. But how do I use it in a search or in an alert/notification?

Here’s what I see when browsing the messages…

clientname_incoming
{"0":"CLIENTA"}

How do I use the new field clientname_incoming in a search or alert trigger?

FYI, here is the rule and sample message again.

So here’s an example from the message field in the stream that is associated with the pipeline:

Apr 7, 2023 12:20:14 PM Command - Error : A subdirectory or file X:\CLIENTA\incoming\ already exists.

rule "Extract ClientName from Log"
when
regex(("Command - Error : A subdirectory or file X:\(.+)\incoming\ already exists."), to_string($message.message)).matches == true
then
let result = regex(("Command - Error : A subdirectory or file X:\(.+)\incoming\ already exists."),to_string($message.message));
set_field("clientname_incoming",result);
end

The fact that your field contains {"0":"CLIENTA"} suggests to me that the regex is returning an array, but you are somehow setting the whole array with to_string rather than just the value you want. It looks like you may have been trying to do the right thing with variable[0], but I think something is off in your syntax, maybe the quotes around the zero.

Thanks, guys! You were both really helpful, and now it is working well. Yes, it must be returning an array, so I set the field to only the first item in the array and now it looks great. And it is displayed/prompted as a defined field in search, alerts, etc. Thanks again!

This did the trick:
set_field("clientname_incoming",result["0"]);

Now the new field displays properly.
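Why the whole result showed up as {"0":"CLIENTA"} can be illustrated with a small Java model. This is a sketch only: `GroupMapDemo` is a hypothetical class, and the keying of capture groups as "0", "1", ... is inferred from the value seen in the thread rather than taken from Graylog's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of a regex result whose capture groups are keyed
// "0", "1", ... like the value shown in the thread. Not Graylog's code.
class GroupMapDemo {
    // Returns the capture groups of the first match, keyed by zero-based index.
    static Map<String, String> groups(String regex, String text) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        if (m.find()) {
            // Capture group 1 is stored under key "0", group 2 under "1", ...
            for (int i = 1; i <= m.groupCount(); i++) {
                result.put(String.valueOf(i - 1), m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> result = groups(
            "X:\\\\(.+)\\\\incoming\\\\",
            "A subdirectory or file X:\\CLIENTA\\incoming\\ already exists.");
        System.out.println(result);          // the whole map of groups
        System.out.println(result.get("0")); // only the first capture group
    }
}
```

Storing the whole map corresponds to set_field with result, while looking up key "0" corresponds to result["0"], which yields just the client name.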
