Grok Parse in Pipeline (bug or invalid escapes?)

Hey all,

Just found this. Trying to parse a few URIPATHs (field is URI_path) that look like this:

section=bottom&templateUrl=https://www.domain1.ca/parser.asp?app=tickets
section=doctype&templateUrl=https://www.domain1.ca/parser.asp?app=tickets
section=doctype&templateUrl=https://www.domain2.ca/en/Parser.asp
section=bottom&templateUrl=https://www.domain2.ca/en/Parser.asp

All I care about is domain1/domain2 and the app being used:

section=%{WORD:parse_section}&templateUrl=https://www.%{WORD:parse_organization}.ca/(en/)?(p|P)arser.asp(\?app=%{WORD:parse_application})?

works in the grok debugger… but fails in the pipeline rule. I’m sure there is a character throwing it off - but I seem to keep finding new and creative ways to break out of my grok/regex strings. Can anyone spot where I’m messing it up?

Full rule

rule "Simplify IIS Parse URI" 
when (
    has_field("log_type") AND 
    contains(to_string($message.log_type),"IIS",true)
    ) AND (
    has_field("URI_path") AND 
    contains(to_string($message.URI_path),"/parser/parser.ashx",true)
    )
then
    
let unparsed = to_string($message.URI_path);
let parsed = grok(pattern:"section=%{WORD:parse_section}&templateUrl=https://www.%{WORD:parse_organization}.ca/(en/)?(p|P)arser.asp(\?app=%{WORD:parse_application})?",value: unparsed,only_named_captures: true);
set_fields(parsed);
end

UPDATE: This exact pattern works in the GROK extractor for any test message - but still won’t save in a Pipeline.

UPDATE 2: The escaped ? at the end of the URI_path pattern was the culprit. I had to double-escape it in the pipeline rule. Let’s see if it still processes. I’ll post back.

UPDATE 3: Editor allows me to save the rule, but it won’t extract.

UPDATE 4: This is the GL processing error:

For rule 'Simplify IIS Parse URI': In call to function 'grok' at 13:13 an exception was thrown: Unknown inline modifier near index 77 section=(\b\w+\b)&templateUrl=https://www.(\b\w+\b).ca/(en/)?(p|P)arser.asp(?app=(\b\w+\b))?
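
To illustrate what that error is complaining about, here is a minimal Python sketch. It assumes Python's re module behaves closely enough to the Java regex engine behind Graylog for this one construct; the exact error text differs. The regex text is copied from the error message above, where the backslash before the ? is missing, so the engine reads "(?" as the start of an inline modifier instead of a group containing a literal question mark. The helper names compiles and extract are made up for this example.

```python
import re

# Regex text exactly as it appears in the error message: "(?app=" has lost its backslash.
BROKEN = r"section=(\b\w+\b)&templateUrl=https://www.(\b\w+\b).ca/(en/)?(p|P)arser.asp(?app=(\b\w+\b))?"

# Same text with the literal question mark escaped again.
FIXED = r"section=(\b\w+\b)&templateUrl=https://www.(\b\w+\b).ca/(en/)?(p|P)arser.asp(\?app=(\b\w+\b))?"


def compiles(pattern: str) -> bool:
    """Return True if the regex engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def extract(uri: str):
    """Apply FIXED and return the three fields the rule is after, or None."""
    m = re.search(FIXED, uri)
    if m is None:
        return None
    return {
        "parse_section": m.group(1),
        "parse_organization": m.group(2),
        "parse_application": m.group(6),  # None when "?app=..." is absent
    }


if __name__ == "__main__":
    print(compiles(BROKEN), compiles(FIXED))
    print(extract("section=bottom&templateUrl=https://www.domain1.ca/parser.asp?app=tickets"))
    print(extract("section=doctype&templateUrl=https://www.domain2.ca/en/Parser.asp"))
```

So the pattern itself is fine once the regex engine actually receives \? for the literal question mark; the problem is that the backslash gets lost before the pattern reaches the engine.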

My regex sucks, so I can’t figure out where it’s wrong.

Personally, I would save the working grok pattern as a new named pattern in your system (System > GROK-Patterns) and reference it by that single name in the processing pipeline. That spares you the double escaping you need in the rule editor.

This exact pattern works in the GROK extractor for any test message - but still won’t save in a Pipeline.

That’s a common occurrence. Inside a pipeline rule you have to double every backslash that escapes a special character, so instead of

\(en\)

You’ll have to put

\\(en\\)

in the grok pattern string of the Graylog pipeline rule.
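
The reason for the doubling can be sketched with a toy model in Python. This is not Graylog's actual parser: the function unescape_once is hypothetical and only assumes that the rule's string literal strips one level of backslashes before the pattern is handed to grok, which is what the error message earlier in the thread suggests.

```python
import re


def unescape_once(literal: str) -> str:
    """Toy model of a string literal parser: each backslash-escaped
    character is replaced by the character itself (one level removed)."""
    return re.sub(r"\\(.)", lambda m: m.group(1), literal)


# Tail of the pattern as typed in the rule with a single backslash.
single_escaped = r"(\?app=%{WORD:parse_application})?"
# The same tail with the backslash doubled.
double_escaped = r"(\\?app=%{WORD:parse_application})?"

if __name__ == "__main__":
    # What the grok function would receive under this assumption.
    print(unescape_once(single_escaped))
    print(unescape_once(double_escaped))
    print(unescape_once(r"\\(en\\)"))
```

Under that assumption, the single-escaped version arrives at the regex engine as (?app=...), while the doubled version arrives as (\?app=...), which is the form that works in the grok debugger and the extractor.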