Pipelines and regex (and some extractor talk)

The TL;DR is that I’m trying to build a pipeline with a rule that parses my bind9/named query log messages.

I’m new to Graylog pipelines but not new to regex :slight_smile:
But a tried-and-true regex that works in regex testers with my message text gives a bunch of errors in the Graylog pipeline rule editor. It’s a valid JavaScript regex as far as I can see (tested in regex101).

And I don’t understand why.

Example string to be parsed:
ns1 named[47638]: client @0x7f61fd6a0b68 10.10.10.10#49172 (www.domain.tld): query: www.domain.tld IN A + (10.10.10.2)

My Regex:
let result = regex("^(ns\d) .+ client (@\S+) ([0-9\.:]+)#(\d+) \((\S+)\): query: (\S+) (\w+) (\w+) ([0-9\+A-Z]+) \((.+)\)$",to_string($message.message));

As somebody will surely ask for the whole rule, here it is:

rule "NS Bind9 Query message parser"

when
    has_field("message") && contains(to_string($message.message),"ns1 named") && contains(to_string($message.message),"query: ")
    OR
    has_field("message") && contains(to_string($message.message),"ns2 named") && contains(to_string($message.message),"query: ")
then
    let result = regex("^(ns[0-9]) .+ client (@\S+) ([0-9\.:]+)#(\d+) \((\S+)\): query: (\S+) (\w+) (\w+) ([0-9\+A-Z]+) \((.+)\)$",to_string($message.message));
    set_field("nameserver", result["0"]);
    set_field("clientobject", result["1"]);
    set_field("src_ip", result["2"]);
    set_field("src_port", result["3"]);
    set_field("clientquery", result["4"]);
    set_field("query_domain", result["5"]);
    set_field("query_class", result["6"]);
    set_field("query_type", result["7"]);
    set_field("query_flags", result["8"]);
    set_field("query_nameserver", result["9"]);
end

What am I doing wrong in the regex?
What syntax/lint problems am I overlooking? I must be blind.

PS. The extractor feature seems to be well integrated and performs well, so why is there no talk about extractors, and why is the link to the extractor docs inside Graylog itself not working? Are extractors just an old “feature” that has not been removed yet and should not be used? Are they deprecated or not?

Hi @kawaiipantsu,

You need to convert it into a Java regex string. Try this:

“^(ns\d) .+ client (@\S+) ([0-9\.:]+)#(\d+) \((\S+)\): query: (\S+) (\w+) (\w+) ([0-9\+A-Z]+) \((.+)\)$”

I use this site to convert my Perl regex into Java.
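For readers without such a converter at hand, here is a minimal sketch of the kind of transformation involved: doubling every backslash (and escaping double quotes) so a raw regex survives being written as a double-quoted string literal. The class and method names are hypothetical, and this is not the implementation of the site mentioned above. It assumes the target language uses Java-style backslash escapes in string literals.

```java
public class RegexLiteral {
    // Wrap a raw regex (as typed into a regex tester) in a double-quoted
    // string literal: double each backslash first, then escape double quotes.
    static String toStringLiteral(String rawRegex) {
        String escaped = rawRegex
            .replace("\\", "\\\\")   // one backslash -> two backslashes
            .replace("\"", "\\\"");  // " -> \"
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // Pass the raw regex as a command-line argument so that no
        // source-level escaping gets in the way.
        for (String raw : args) {
            System.out.println(toStringLiteral(raw));
        }
    }
}
```

Backslashes are doubled before quotes are escaped; doing it the other way round would also double the backslashes that were just added in front of the quotes.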

Hi Chris,

Thanks for the site. But isn’t that site for Java rather than JavaScript?
I tried your regex, but it just yields the same type of error output as mine.

It’s as if the whole regex is being interpreted as code.
I’m sure you are right that it’s simply not converted correctly, but regex101 accepts it whether I choose JavaScript or Java.

(Since I’m a new user, I can’t post multiple images.)
So here is a link to regex101 with the test, set to JavaScript.

Sorry, I have to stop myself there and proclaim that Chris’s answer was correct: the website he linked did convert the regex properly. But the example Chris posted was missing the double escaping of the backslashes.

So the correct regex seems to be the following:
"^(ns\\d) .+ client (@\\S+) ([0-9\\.:]+)#(\\d+) \\((\\S+)\\): query: (\\S+) (\\w+) (\\w+) ([0-9\\+A-Z]+) \\((.+)\\)$"

So, to summarize: the problem was that the backslashes were not escaped correctly. Inside the rule’s quoted string, every backslash in the regex has to be doubled.
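As a quick sanity check outside Graylog, the double-escaped pattern can be tried against the example line in plain Java. This is only a sketch: the class name and the `parse` helper are made up for illustration, the field names are copied from the rule in the question, and it assumes the rule editor handles backslashes in string literals the same way Java source does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BindQueryRegexDemo {
    // Double-escaped pattern from the thread: "\\d" in source is \d in the regex.
    static final Pattern QUERY_LINE = Pattern.compile(
        "^(ns\\d) .+ client (@\\S+) ([0-9\\.:]+)#(\\d+) \\((\\S+)\\): "
        + "query: (\\S+) (\\w+) (\\w+) ([0-9\\+A-Z]+) \\((.+)\\)$");

    // Field names from the rule in the question, in capture-group order.
    static final String[] FIELDS = {
        "nameserver", "clientobject", "src_ip", "src_port", "clientquery",
        "query_domain", "query_class", "query_type", "query_flags",
        "query_nameserver"
    };

    // Returns field name -> captured value, or an empty map if the line does not match.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = QUERY_LINE.matcher(line);
        if (!m.matches()) {
            return fields;
        }
        for (int i = 0; i < FIELDS.length; i++) {
            fields.put(FIELDS[i], m.group(i + 1)); // Java capture groups start at 1
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "ns1 named[47638]: client @0x7f61fd6a0b68 10.10.10.10#49172 "
            + "(www.domain.tld): query: www.domain.tld IN A + (10.10.10.2)";
        parse(sample).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Writing the same pattern with single backslashes (`"^(ns\d) ..."`) does not even compile in Java, because `\d` is not a valid string escape. That would explain why the editor seemed to interpret the regex as code.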

Just to show the result, as thanks and as proof that all is beautiful :slight_smile:


I must have copied the wrong line. The Perl and Java lines are right on top of each other.

My apologies. Glad you figured it out anyway.
