Graylog pipeline regex multi match

Hello,

From what I understand from several places (e.g. the thread "Can a pipeline rule to match the same pattern multiple times?"), if a group in my regex occurs several times in the original string, I should be able to get several matches. However, I don’t, so there must be something I misunderstand. Here’s a very basic example of a rule, an input, and what I get in the pipeline simulator:

rule:

when 
  has_field("test")
then
  let result = regex("([0-9])", to_string($message.message));
  set_field("result_0", result["0"]);
  set_field("result_1", result["1"]);
  set_field("full_result", to_string(result));
end

input:

{
  "test": 1,
  "message": "hel 1 flrj 4 l3j4 9",
  "timestamp": 1
}

So I would expect the field result_0 to be 1, and the field result_1 to be 4, and the field full_result to be a string representation of the map containing everything. Instead, this is what I get from the pipeline simulator:

Added fields

full_result
    {0=1}
test_result_0
    1

Hence, it seems only the first match is considered. What am I misunderstanding?

Thanks!

  • In the example you have given, you let result hold your regex results, yet you are using test_result to set the field full_result to the map.

  • You don’t want quotes on your index number, so it should be: result[0]
    EDIT: you DO want quotes for regex (I was looking at split())… result["0"]

  • Your regex would still pick up the numbers in " l3j4 "

Your first point is correct, it’s a typo on my side, I’ll fix the OP, thanks.

Your second point is wrong, see the doc: https://docs.graylog.org/en/3.0/pages/pipelines/functions.html#regex: If not named, the groups names are strings starting with "0".

Your third point is my point: it’s how it’s supposed to work, but not how it works in practice.
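To make the difference between "first match" and "every match" concrete, here is a minimal sketch in Python rather than in Graylog's rule language. It uses Python's re module as a stand-in, which is an assumption: Graylog's regex engine is not guaranteed to behave identically. The string keys "0", "1", … only mimic the naming described in the Graylog docs.

```python
import re

# Same sample message as in the original post.
message = "hel 1 flrj 4 l3j4 9"
pattern = re.compile(r"([0-9])")

# A single search stops at the first match, so the one capture group
# yields exactly one value. This mirrors the {0=1} seen in the simulator.
first = pattern.search(message)
first_only = {"0": first.group(1)}

# Collecting every occurrence requires iterating over all matches.
all_matches = {
    str(i): m.group(1) for i, m in enumerate(pattern.finditer(message))
}

print(first_only)   # {'0': '1'}
print(all_matches)  # {'0': '1', '1': '4', '2': '3', '3': '4', '4': '9'}
```

With one capture group and one match, the result map can only ever have one entry; the extra entries appear only when something loops over all matches.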

You are right, it’s not possible: Graylog only returns the first match. But if you know in advance how many numbers there are, you can still use this:

let repa = regex(".*([0-9]).*([0-9]).*([0-9]).*", to_string($message.message));
set_field("repa1", repa["0"]);
set_field("repa2", repa["1"]);
set_field("repa3", repa["2"]);
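One thing to watch with this fixed-group workaround is that a greedy ".*" decides which digits end up in the groups. The sketch below checks this in Python's re module (an assumption: it is used here only as a stand-in, not as Graylog's actual engine), and the lazy variant is a hypothetical alternative, not something taken from this thread.

```python
import re

# Same sample message as in the original post.
message = "hel 1 flrj 4 l3j4 9"

# Greedy ".*" swallows as much as possible, so the three groups
# end up holding the LAST three digits of the message.
greedy = re.search(r".*([0-9]).*([0-9]).*([0-9]).*", message)
greedy_groups = greedy.groups()

# Lazy ".*?" swallows as little as possible, so the three groups
# hold the FIRST three digits instead.
lazy = re.search(r".*?([0-9]).*?([0-9]).*?([0-9])", message)
lazy_groups = lazy.groups()

print(greedy_groups)  # ('3', '4', '9')
print(lazy_groups)    # ('1', '4', '3')
```

Either way, the pattern only matches when the message contains at least three digits, so the number of groups has to fit the data.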

My second point was wrong… the rule I pulled up to check myself used regex and then split(), which uses unquoted index numbers. I will correct that.