I need to filter out some logs in a Stream based on their timestamp.
Logs I want to filter out are always generated at midnight (00:0X).
I’m in UTC+2, so for Graylog, which works in UTC, that is 22:0X.
I tried a regular expression, but it doesn’t seem to work: the logs are still routed into the Stream.
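To make the offset concrete, here is a small Python sketch of the conversion. The date and the fixed UTC+2 offset are made-up values for illustration; this is not something Graylog runs.

```python
from datetime import datetime, timedelta, timezone

# Assumed example: a log emitted at 00:05 local time in a fixed UTC+2 zone.
local_zone = timezone(timedelta(hours=2))
local_ts = datetime(2021, 6, 15, 0, 5, 0, tzinfo=local_zone)

# The same instant expressed in UTC.
utc_ts = local_ts.astimezone(timezone.utc)

print(local_ts.isoformat())  # 2021-06-15T00:05:00+02:00
print(utc_ts.isoformat())    # 2021-06-14T22:05:00+00:00
```

Note that the UTC value also falls on the previous calendar day, so only the time part is safe to match on.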
At first I tried a regular expression with “00:0\d” but it doesn’t work.
Then, using the Stream rule tester, I remembered that Graylog uses UTC with the format yyyy-mm-dd’T’HH:MM:SS.XXXZ, so I tried “T22:0\d”, but that doesn’t work either. I then checked in Elastic, where the timestamp is stored in UTC with the format "yyyy-mm-dd HH:MM:SS", so I tried " 22:0\d", but that doesn’t work either.
Is it a bug?
I know I could use an extractor or a pipeline to extract the hour and minute, but a regular expression should work.
That’s strange, because if I set “timestamp must match” instead of “timestamp must not match”, Graylog doesn’t complain and says the log matches the regex.
But thanks, I’ll try.
Personally I think it is better to handle all that in the pipeline or, if possible, in the log shipper… Beats (and NXLog, which I use less) can do some interesting gyrations… it depends on where the logs are coming from.
I have found the solution.
Stream rules are applied before logs are indexed. At that point the timestamp has not yet been converted to UTC, so you need to work with the original timezone.
So in my example I need to use “^\d{4}-\d{2}-\d{2}T00:0\d:” instead of “^\d{4}-\d{2}-\d{2}T22:0\d:”.
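For anyone who wants to check the pattern outside Graylog, here is a minimal Python sketch. The two sample timestamps and the helper name `is_midnight_log` are invented for illustration, and Graylog evaluates its rules with Java regular expressions, not Python; for a pattern this simple the two should behave the same, but treat that as an assumption.

```python
import re

# Pattern from the post: date, then "T00:0" plus one digit, i.e. 00:00 to 00:09.
MIDNIGHT = re.compile(r"^\d{4}-\d{2}-\d{2}T00:0\d:")

def is_midnight_log(timestamp: str) -> bool:
    """Hypothetical helper: True if the timestamp string starts with a 00:0X time."""
    return MIDNIGHT.search(timestamp) is not None

# Made-up sample values for the same event.
received = "2021-06-15T00:05:12.000+02:00"  # as received, local time (UTC+2)
indexed = "2021-06-14T22:05:12.000Z"        # as stored after conversion to UTC

print(is_midnight_log(received))  # True
print(is_midnight_log(indexed))   # False
```

The same pattern that matches the received value misses the indexed one, which is the mismatch I ran into between the rule and what I saw stored.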
In my previous tests I went too fast and got confused, because my previous regexp did work with some specific logs (Graylog metrics sent to a GELF input, and some other logs whose timestamps are modified in a pipeline).
However, I think there is an issue with the Stream rule tester, because it checks against the indexed timestamp, whereas the rules actually check the received timestamp.