Order of GROK patterns changes result


#1

Hi there,

Given two stored patterns joined with “|”, where one fully matches a certain string and the other matches only partly, the result depends on the order of the stored patterns. The result is only right if the fully matching pattern is listed last, which also means it’s wrong in every other case. And there is no “right order”, because any fixed order breaks for other inputs. Maybe I’m missing something here, but if not, this would mean Graylog is no longer an option for my team, as we depend on working with syslogs in this manner.

The main question is: does chaining nested GROK patterns with “|” not work with the current Graylog GROK engine, or am I misunderstanding something?

Example :

imap-login: proxy(testuser): disconnecting 8.8.8.8 (Disconnected by client): user=<testuser>, method=PLAIN, rip=8.8.8.8, lip=127.0.0.1, TLS, session=<zlgOH3xdHQBScWJe>

Pattern = Result:
%{DOVECOT_PROXY1} = Nothing, no match (correct)
%{DOVECOT_PROXY2} = Full match. All fields filled (correct)
(%{DOVECOT_PROXY1}|%{DOVECOT_PROXY2}) = Full match. All fields filled (correct)
(%{DOVECOT_PROXY2}|%{DOVECOT_PROXY1}) = Broken / Quirky result

It is fully reproducible in the extractor setup/test with the following data and patterns:

Full syslog message:
<22>1 2017-11-08T18:42:22+01:00 dovecot-proxy dovecot - - - imap-login: proxy(testuser): disconnecting 8.8.8.8 (Disconnected by client): user=<testuser>, method=PLAIN, rip=8.8.8.8, lip=127.0.0.1, TLS, session=<zlgOH3xdHQBScWJe>

GROK Patterns:
(Taken from GitHub)
DOVECOT_PROXY1 %{WORD:proto}-login: %{WORD:proxy}\(%{USERNAME}\): started %{WORD:proxy_start} to %{IPORHOST:proxyto_host}:%{POSINT:proxyto_port}: user=<(%{USERNAME}(@%{HOSTNAME})?)?>, method=%{WORD:method}, rip=%{IP:rip}, lip=%{IP:lip}(, %{WORD:crypto})?, session=<%{DATA:session}>

DOVECOT_PROXY2 %{WORD:proto}-login: %{WORD:proxy}\(%{USERNAME}\): %{WORD:conn_status} %{IPORHOST} \(%{DATA:status_message}\): user=<(%{USERNAME}(@%{HOSTNAME})?)?>, method=%{WORD:method}, rip=%{IP:rip}, lip=%{IP:lip}(, %{WORD:crypto})?, session=<%{DATA:session}>
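One way an order dependence like the one reported above could arise is sketched below in Python. This is only an assumption about the mechanism, not Graylog's actual code: both stored patterns define the same field names (`proto`, `proxy`, ...), and if the captures of every alternative are merged into one result map, the unset captures of a non-matching alternative listed later can overwrite the values of the matching one. The sample line is shortened, the regexes are heavily reduced stand-ins for DOVECOT_PROXY1 and DOVECOT_PROXY2, and `alternate`, `naive_merge`, and `safe_merge` are hypothetical helpers.

```python
import re

# Shortened sample line and heavily reduced stand-ins for the two stored
# patterns (assumptions for illustration, not the real Dovecot patterns).
LINE = "imap-login: proxy(testuser): disconnecting 8.8.8.8 (Disconnected by client)"
PROXY1 = r"(?P<proto{n}>\w+)-login: (?P<proxy{n}>\w+)\(\w+\): started (?P<proxy_start{n}>\w+)"
PROXY2 = (r"(?P<proto{n}>\w+)-login: (?P<proxy{n}>\w+)\(\w+\): "
          r"(?P<conn_status{n}>\w+) \S+ \((?P<status_message{n}>.*?)\)")


def alternate(*branches):
    """Join branches with "|"; suffix group names, since re rejects duplicates."""
    return "(?:" + "|".join(b.format(n=i) for i, b in enumerate(branches)) + ")"


def naive_merge(match):
    """Hypothetical merge: later groups overwrite earlier ones, even with None."""
    return {name.rstrip("0123456789"): value
            for name, value in match.groupdict().items()}


def safe_merge(match):
    """Keep only groups that actually captured something."""
    return {name.rstrip("0123456789"): value
            for name, value in match.groupdict().items() if value is not None}


matching_last = naive_merge(re.search(alternate(PROXY1, PROXY2), LINE))
matching_first = naive_merge(re.search(alternate(PROXY2, PROXY1), LINE))
safe_first = safe_merge(re.search(alternate(PROXY2, PROXY1), LINE))

print(matching_last["proto"])   # value from the matching branch survives
print(matching_first["proto"])  # overwritten by the non-matching branch
print(safe_first["proto"])      # unset captures are skipped
```

With the naive merge, the shared fields are only filled when the matching pattern comes last, which resembles the behaviour described in this post. Skipping unset captures removes the order dependence in this sketch.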


(Jan Doberstein) #2

@aehdings

What is your exact question now?


#3

Sorry @jan. Edited and clarified the question.


(Jan Doberstein) #4

You may have already noticed that chained GROK patterns need to be wrapped in brackets, as can be seen in the original GROK source on GitHub.


#5

@jan
I tried that already, as I was coming from the “largest” pattern in the file you mentioned, %{DOVECOT}.
If this is supposed to be the right syntax, it is not working, as I described.


(Jan Doberstein) #6

See line 67 of the document.

It only works this way, and that is the same in every implementation of GROK.


#7

@jan
Seems we’re not on the same page. I was referring to line 84 of the document, which includes line 67 (once again nested). The problem is the same.


(Jan Doberstein) #8

To make things easy to debug:

This is actually from http://grokdebug.herokuapp.com, but it shows the issue. How should grok know which pattern is the correct one if one already matches?

That type of pattern cannot work in current GROK implementations. It might have worked in the past, but not in current ones.
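The point about one alternative already matching can be illustrated with plain regular expressions. The sketch below uses Python's `re` module, not a GROK engine, and assumes the underlying engine behaves like a typical backtracking regex engine: alternatives are tried left to right, and the first one that lets the overall match succeed wins, even if a later alternative would consume more of the input.

```python
import re

text = "foobar"

# The first alternative that lets the overall match succeed wins.
short_first = re.match(r"(?:foo|foobar)", text).group(0)
long_first = re.match(r"(?:foobar|foo)", text).group(0)
# Anchoring the end forces the engine to backtrack into the second alternative.
anchored = re.match(r"(?:foo|foobar)$", text).group(0)

print(short_first, long_first, anchored)  # foo foobar foobar
```

In this sketch, anchoring each alternative to the end of the input makes a partial match fail, so the engine falls through to the next alternative regardless of order. Whether that carries over to a given GROK implementation would need to be tested there.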


(system) #9

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.