Field Extraction to two Fields


(Pwe123345) #1

Hello,

I have a DNS query log line like this:

named[17627]: 15-May-2018 15:57:33.135 queries: info: client <IP-ADRESS>#65281: query: calender.google.com IN A +

The GROK Pattern is like this:

%{DATA:UNWANTED} client %{IPV4:Client_Address}\#%{DATA:UNWANTED} query\: %{URIHOST:URI_Host} %{WORD:DNS_Direction} %{WORD:DNS_type}

Now I want to convert the part

... calender.google.com ....

to two fields:

URIHOST: calender.google.com
URITLD: google.com

Is this even possible?

kind regards
P


(Jochen) #2

Sure. You can either create a custom Grok pattern or process the result (“URI_Host”) accordingly.
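
As a rough illustration of the second option (processing the already extracted value), the sketch below splits the host name on dots and keeps the last two labels. It is plain Python rather than anything specific to your log pipeline; the function name split_host is hypothetical, and the field names URIHOST and URITLD are taken from the question. Note that "last two labels" is only an approximation: for names under suffixes such as co.uk it would return co.uk rather than the registered domain.

```python
def split_host(uri_host):
    """Return the full host plus its last two labels as separate fields.

    Hypothetical helper: the field names follow the question's example.
    """
    # Drop a trailing root dot (e.g. "google.com.") before splitting.
    labels = uri_host.rstrip(".").split(".")
    return {
        "URIHOST": uri_host,
        # Join the last two labels; a single-label host is returned as-is.
        "URITLD": ".".join(labels[-2:]),
    }


print(split_host("calender.google.com"))
```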


(Pwe123345) #3

That sounds great.

Can you point me in the right direction? I'm not sure how to turn one string into two substrings with a Grok pattern.

Building the correct pattern should be no problem.

kind regards
Philipp


(Jochen) #4

The following regular expression would work:

\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}\.)*(?<domain>[0-9A-Za-z][0-9A-Za-z-]{0,62}\.[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.?\b)

For example, the result in the “domain” group when matched against calender.google.com would contain google.com and when matched against foo.b-a-r.example.org it would contain example.org.
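
To try the expression outside the log pipeline, here is a minimal sketch in Python. One adaptation is an assumption on my part: Python's re module does not accept the (?<domain>...) spelling, so the named group is written as (?P<domain>...); the rest of the expression is unchanged. The function name extract_domain is hypothetical.

```python
import re

# Same expression as above, with the named group written as (?P<domain>...)
# because Python's re module requires that spelling.
DOMAIN_RE = re.compile(
    r"\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}\.)*"
    r"(?P<domain>[0-9A-Za-z][0-9A-Za-z-]{0,62}\.[0-9A-Za-z][0-9A-Za-z-]{0,62})"
    r"(?:\.?\b)"
)


def extract_domain(host):
    """Return the content of the "domain" group, or None if nothing matches."""
    match = DOMAIN_RE.search(host)
    return match.group("domain") if match else None


print(extract_domain("calender.google.com"))    # google.com
print(extract_domain("foo.b-a-r.example.org"))  # example.org
```

The expression is meant to be applied to the already extracted host name, not to the whole log line: run against the full message it would match the first dotted token it finds, such as part of the timestamp.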


(Pwe123345) #5

Hi Jochen,

that looks great, thanks.

In the GROK Patterns, I just adjusted

\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}\.)*(?<domain>[0-9A-Za-z][0-9A-Za-z-]{0,62}\.[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.?\b)

to

\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}\.)*(?<URI_TLD>[0-9A-Za-z][0-9A-Za-z-]{0,62}\.[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.?\b)

But the data still ends up in the “domain” field.

kind regards
Philipp


(Jochen) #6

That’s why I provided a regular expression and a link to the regex() function.

