Pipeline rule: regex function returns an empty table for multiple values


#1

Hi everyone,
here is a pipeline rule that should catch multiple quoted values like ‘xxxxx’ :slight_smile:

rule "quote extractor"
when
true
then
let result = regex("'.*?'",to_string($message.message));
set_field("quotedvalue_1",to_string(result));
end

Here is a log message example:

hostname hlspullpusher.hlspusher/41497 zf103: ‘QUOTED VALUE 1’ - unable to push HLS fragment ‘QUOTED VALUE 2’ : Server error ‘QUOTED VALUE 3’|#warning,hlspusher,local4,support

!!! Warning: the forum’s blockquote formatting changes the straight quote ' into ‘ and ’, so be careful if you copy and paste the log message !!!

Tested with FreeFormatter (https://www.freeformatter.com/java-regex-tester.html#ad-output), the regex works well.

In the pipeline simulator, I get an empty table:

I convert the table to a string so that all values are listed in the field. I also tried with

set_field("quotedvalue_1",result["0"]);

but I get an empty value too.

My Graylog processors configuration:

I have throughput on my pipeline and rule:

Any ideas?


#2

The linked page doesn’t give me a match.
So you have to work on your regex pattern.
Have you tried escaping the ' character?


#3

Hi @macko003, the blockquote formatting changes the quotes :slight_smile:

The regex:

‘.*?’

Log entry for testing:

hostname hlspullpusher.hlspusher/41497 zf103: ‘QUOTED VALUE 1’ - unable to push HLS fragment ‘QUOTED VALUE 2’ : Server error ‘QUOTED VALUE 3’|#warning,hlspusher,local4,support


#4

Yes, but when I tested it, it worked with an escaped '


#5

OK, I will try my rule with escaped quotes… good idea :smile:


#6

After some tests using the debug() function in the rule, it looks strange: the regex function catches nothing. I tried with another, "easy" regex.

The value "warning" is present in the log line:

Rule on my pipeline:

rule "quote extractor"
when
true
then
let result = regex("war[acn]ing",to_string($message.message));
debug (result);
set_field("quotedvalue_1",to_string(result));
end

debug output :

2019-01-16 12:36:20,027 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}
2019-01-16 12:36:20,027 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}
2019-01-16 12:36:20,028 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}
2019-01-16 12:36:20,028 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}
2019-01-16 12:36:20,028 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}
2019-01-16 12:36:21,056 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}

Did I forget something in my rule? :exploding_head:

I’m on

graylog 2.5.0

Using

UDP syslog input -> a stream -> 1 pipeline connected to it, with 1 rule

And my pipeline works well:


#7

It seems to be OK.
Try debugging more to localize the problem.

E.g.
use debug for $message.message.
try the regex in the when part: regex(".*", to_string($message.name)).matches == true (I use something like that in my pipeline)
etc.

Maybe it is a bug.
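
A minimal sketch of the kind of check described above, assuming the text of interest is in the message field (the rule name and the pattern are only placeholders):

```
rule "debug regex match"
when
    // .matches is true when the pattern is found somewhere in the value
    regex("war[acn]ing", to_string($message.message)).matches == true
then
    // Log the exact string the regex was applied to
    debug(to_string($message.message));
end
```

If the debug line shows up in the server log, the pattern itself matches and the problem is more likely in how the result is read afterwards.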


#8

Hi @macko003, thanks for your help…

It has the taste of a bug :smile:

Last test (I also tested with "matches" in the when condition, but without result):

rule "Generic REGEX quote extractor"
when
    true
then
debug ($message.syslog_message);
let result = regex("\'.*?\'",to_string($message.syslog_message));
debug (result);
set_fields(result);
end

Debug() output:

2019-01-17 16:18:21,607 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: start archive cleaner on channel '2025279_multi'|#phanes,ott,support,local4
2019-01-17 16:18:21,607 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}
2019-01-17 16:18:21,611 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: start purge channel '2025279_multi' from '1' (1970-01-01 00:00:01) to '1547741586' (2019-01-17 16:13:06)|#phanes,ott,support,local4
2019-01-17 16:18:21,611 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}  

Regex() is still catching nothing …


#9

Try the regex ".*" only.
As far as I know, in Graylog you need to use a double escape (\\') (I think I forgot it before, sorry)


#10

Hi @macko003, don’t worry, I appreciate your help :wink:

I tried with the double escape, still nothing:

rule "Generic REGEX quote extractor"
when
    true
then
debug ($message.syslog_message);
let result = regex("\\'.*?\\'",to_string($message.syslog_message));
debug (result);
set_fields(result);
end

I tried with ".*" as my regex, still nothing… strange :roll_eyes:

rule "Generic REGEX quote extractor"
when
    true
then
debug ($message.syslog_message);
let result = regex(".*",to_string($message.syslog_message));
debug (result);
set_fields(result);
end

Debug output:

2019-01-22 11:49:56,494 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: '2025279_multi-239.255.80.1:10004:lan6' - no data received (timeout is '3')|#error,mezzanine,local4,support
2019-01-22 11:49:56,494 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}
2019-01-22 11:49:56,494 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: '2025279_multi-239.255.80.1:10007:lan6' - no data received (timeout is '3')|#error,mezzanine,local4,support
2019-01-22 11:49:56,494 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {}

#11

I have no more ideas :frowning:


#12

Me neither… I will create a ticket on GitHub. It’s strange.


(Ben van Staveren) #13

You may want to add grouping to your regex :wink:

And maybe change the regex to this: regex("('.*?')+", to_string($message.message)) - this will grab all quoted values at once, and you can then use result[0], result[1], etc. to grab them. They do include the ’ quote in the value, so you could then strip that off.
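
A variant that might save the stripping step, as a sketch only: keep the quotes outside the parentheses so the group holds just the text between them. This assumes the text is in the message field; the rule name and field name are placeholders.

```
rule "quoted value without quotes"
when
    true
then
    // The quotes sit outside the group, so the captured value excludes them
    let result = regex("'(.*?)'", to_string($message.message));
    // Groups are keyed by position as strings, starting at "0"
    set_field("quotedvalue_1", result["0"]);
end
```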


#14

That’s crazy, thanks @benvanstaveren… the regex now catches the first group but not the second.

rule "Generic REGEX quote extractor"
when
true
then
debug ($message.syslog_message);
let result = regex("('.*?')+",to_string($message.syslog_message));
debug (result);
set_fields(result);
end

2019-01-22 14:47:29,483 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: '2025279_multi-239.255.80.1:10004:lan6' - no data received (timeout is '3')|#error,mezzanine,local4,support

2019-01-22 14:47:29,483 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {0='2025279_multi-239.255.80.1:10004:lan6'}

2019-01-22 14:47:29,483 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: '2025279_multi-239.255.80.1:10002:lan6' - no data received (timeout is '3')|#error,mezzanine,local4,support

2019-01-22 14:47:29,483 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {0='2025279_multi-239.255.80.1:10002:lan6'}

2019-01-22 14:47:29,483 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: '2025279_multi-239.255.80.1:10003:lan6' - no data received (timeout is '3')|#error,mezzanine,local4,support

2019-01-22 14:47:29,484 INFO : org.graylog.plugins.pipelineprocessor.ast.functions.Function - PIPELINE DEBUG: {0='2025279_multi-239.255.80.1:10003:lan6'}

Crazy… the regex function catches only the first group :exploding_head:


(Ben van Staveren) #15

Heuh… okay, umm that’s weird because I could swear it should grab all - unless it requires the /g modifier to do it on the entire message but I’m sure java’s regex already does that…


#16

Weird… for sure… I have no more ideas :sweat:


(Ben van Staveren) #17

Yeah, I’m kind of out of ideas here too :frowning:


#18

It could be helpful to have a push from the Graylog team :hugs:


#19

No one? :relieved:

Any help will be appreciated.


(Jan Doberstein) #20

When you use regex and groups, you need to include what group you want to use.

rule "extract {content}"
when
    true
then
    let message=to_string($message.message);
   
    let ex = regex(pattern: "\\{(.*?)\\}", value: message);
    
    set_field("first", ex["0"]);
    set_field("second", ex["1"]);
end

In this long read I wasn’t able to find a log entry that I could copy&paste or a pipeline rule I can copy&paste - so sorry no direct help.
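
Applied to the quoted values from the first post, the same approach could look like the sketch below. It assumes that every message of interest contains three single-quoted values and that the text is in the message field; the rule name and the quotedvalue_N field names are placeholders. As far as I understand, regex() returns only the capture groups of the first match, which is why each value gets its own group here instead of relying on one repeated group.

```
rule "extract three quoted values"
when
    // Only run when all three quoted values are present
    regex("'(.*?)'.*?'(.*?)'.*?'(.*?)'", to_string($message.message)).matches == true
then
    let msg = to_string($message.message);
    // One capture group per quoted value, keyed "0", "1", "2" in pattern order
    let quoted = regex(pattern: "'(.*?)'.*?'(.*?)'.*?'(.*?)'", value: msg);
    set_field("quotedvalue_1", quoted["0"]);
    set_field("quotedvalue_2", quoted["1"]);
    set_field("quotedvalue_3", quoted["2"]);
end
```

Messages with fewer than three quoted values do not satisfy the when condition and are left untouched; a separate rule with fewer groups would be needed for those.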