GROK Pattern with HTTPDATE fails

Hello,

I have a strange problem with a GROK pattern.
The GROK pattern

HTTPLOG:
%{IP:client_ip} - - [%{HTTPDATE:timestamp}] "%{WORD:http_method} %{URIPATHPARAM:request_path} HTTP/%{NUMBER:http_version}" %{NUMBER:http_code} %{NUMBER:response_size}?

is applied in a rule to incoming log messages.
I get a lot of processing failures (about 300,000 in 30 minutes) with this message:

failure_details: Value <26/Feb/2025:09:26:42 +0100> caused exception: Invalid format: "26/Feb/2025:09:26:42 +0100" is malformed at "/Feb/2025:09:26:42 +0100".
failure_type: processing
message: Failed to process message with id '01JN0NWQBP04RJH43RNNS4CFD6': Replaced invalid timestamp value in message <89a3ea50-f41b-11ef-9e5f-0050568773de> with current time

Here is an example of a log message which fails:

10.1.1.68 - - [26/Feb/2025:10:08:45 +0100] “GET /webviewer/rest/documents/999PO1999_RB_7a6d31aaaa1a4ce2b540fad8d0b76ac6/content?sapdocid=7a6d31aaaa1a4ce2b540fad8d0b76ac6&contrep=RB&alurl=https%3A%2F%2FVHSAPP0150.SAP-P.D001.LOC%3A44395%2Fsap%2Fopu%2Fodata%2FGKV%2FCA01_COMLINE_SRV%2FImageSet%28ContRep%3D%27CB%27%2CArcDocId%3D%277a6d31aaaa1a4ce2b540fad8d0b76ac6%27%2CComponent%3D%27%27%2CApp%3D%27%252AGKV%252ACM00A%27%29%2F%24value&token=eyJhbGciOiJIUzUxMiJ9.eyJzdWIiOiJ7XCJzYmlkXCI6XCJCQTE5NTU2XCIsXCJrZXJuc3lzdGVtXCI6XCJQTzFcIixcImpzZXNzaW9uaWRcIjpcIm1vY2tzZXNzaW9uXCIsXCJvcmlnaW5cIjpcImh0dHBzOi8vdmhzYXBwbzE1OS5zYXAtcC5kMDAxLmxvYzo0NDM5NVwiLFwiY29va2llc1wiOlwiQkExOTU1Nl9EZWZhdWx0TGlzdERvY0xpc3Q9bGlzdDsgU0FQX1NFU1NJT05JRF9QTzFfMTAwPVRTNDBnVGpWWGhUSTlJTFNuN1NtdHNjNF8tajBGUkh2anVxSm5ZeHJ5MDg9OyBNWVNBUFNTTzI9QWpFeE1EQWdBQTV3YjNKMFlXdzZRa0V4T1RVMU5vZ0FFMkpoYzJsallYVjBhR1Z1ZEdsallYUnBiMjRCQUFkQ1FURTVOVFUyQWdBRE1EQXdBd0FEVURJd0JBQU1NakF5TlRBeU1qWXdPREUyQlFBRUFBQUFEQW9BQjBKQk1UazFOVGIvQVJnd2dnRVVCZ2txaGtpRzl3MEJCd0tnZ2dFRk1JSUJBUUlCQVRFTE1Ba0dCU3NPQXdJYUJRQXdDd1lKS29aSWh2Y05BUWNCTVlIaE1JSGVBZ0VCTURJd0tqRUxNQWtHQTFVRUJoTUNSRVV4RFRBTEJnTlZCQXNUQkVveVJVVXhEREFLQmdOVkJBTVRBMUF5TUFJRVdXb0tvekFKQmdVckRnTUNHZ1VBb0Ywd0dBWUpLb1pJaHZjTkFRa0RNUXNHQ1NxR1NJYjNEUUVIQVRBY0Jna3Foa2lHOXcwQkNRVXhEeGNOTWpVd01qSTJNRGd4TmpBeldqQWpCZ2txaGtpRzl3MEJDUVF4RmdRVTlrVGJ1ZnNBeU5PeFJhYThRc3VUa1p4MjdSZ3dDUVlIS29aSXpqZ0VBd1F3TUM0Q0ZRQ0w4SmIvMElYVHdHamxTeHlUWWhINXNSOE44Z0lWQUxuZG9FZWRVV3dhek5FU2k4M0pnUWVVUy9KcDsgc2FwLXVzZXJjb250ZXh0PXNhcC1sYW5ndWFnZT1ERSZzYXAtY2xpZW50PTEwMFwiLFwiYWRtaW5cIjpmYWxzZX0iLCJleHAiOjE3NDE0MjQ5MjR9.CAUOt66t3HhslpiZ8pMvuCYyF2a1P3EmVRENZWrGUYOVZoCK148zxTfeOtgWaix480QtO_Z6fs9fVMOVqqixHw&nocache=true&page_range=-1&ot=OSCARE_DOC HTTP/1.1” 200 127572

If I go to "Test grok patterns" and test the GROK pattern above against this log message, it works! I get no error!

Does anybody have an idea what’s wrong here?

OS Information: RedHat Linux
Graylog 6.1.5

Hey @schurd!

Just a quick guess, maybe it is because the square brackets ([]) are not quoted? What I would try is to use the System -> Grok Patterns -> Create Pattern modal in Graylog with your sample message to test if it is extracting fields properly. Also, I like using this Grok Pattern Generator (click on “Grok Patterns” in the upper left) to bootstrap the initial pattern and iterate over it until it does what I want.

I hope this helps?

Have a great day,
Dennis

Hello Dennis,

thank you for the answer. I did not know that Graylog itself can test and even create GROK patterns.
Yes, the GROK pattern in my first post is missing the brackets.
Somehow they got lost while copying the pattern into the post; in the original in Graylog the brackets are in the right place.
But I don’t understand the “failure message”. Something seems wrong with the date format HTTPDATE.

In tests (in Graylog or online) everything works fine, the GROK pattern matches. But in Graylog I get those processing errors.
This is what I don’t understand.

Regards,
Dietmar Schurr

Hello Dennis,

I found a workaround/solution.
If I replace in the GROK pattern

%{HTTPDATE:timestamp}

with

%{HTTPDATE:timestamp;date;dd/MMM/yyyy:HH:mm:ss Z}

it works, which means I don’t get the processing failures any more.
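
A minimal sketch of why an explicit format can make the difference, written in Python purely as an analogy (it does not run inside Graylog and says nothing definitive about Graylog's internals). The failure message above reads as if the value was given to a date parser that did not expect the HTTPDATE layout. The format string below is an assumed translation of the Joda/Java-style pattern dd/MMM/yyyy:HH:mm:ss Z into Python's strptime codes, and it assumes an English locale for the month abbreviation.

```python
from datetime import datetime

# Timestamp exactly as it appears in the failure message.
raw = "26/Feb/2025:09:26:42 +0100"

# A parser that expects an ISO 8601 layout rejects the value.
try:
    datetime.fromisoformat(raw)
    iso_ok = True
except ValueError:
    iso_ok = False

# Assumed Python equivalent of dd/MMM/yyyy:HH:mm:ss Z
HTTPDATE_FORMAT = "%d/%b/%Y:%H:%M:%S %z"
parsed = datetime.strptime(raw, HTTPDATE_FORMAT)

print(iso_ok)              # False
print(parsed.isoformat())  # 2025-02-26T09:26:42+01:00
```

The same string fails with a default layout and parses cleanly once the layout is spelled out, which matches the behaviour seen after the change to the pattern.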

Regards,

Dietmar Schurr

Hello everybody,

now the processing errors are gone, but the datetime field is not interpreted as it should be. For example, the field [28/Feb/2025:09:04:17 +0100]
results in these Graylog fields:
INT: +0100
MINUTE: 04
MONTH: Feb
MONTHDAY: 28
SECOND: 17
TIME: 09:04:17
YEAR: 2025
This is correct. But the resulting timestamp has wrong timezone information:
the timezone should be +0100 but in fact is 0000.
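
One way to check whether the offset was really lost, sketched in Python as an analogy and assuming the stored value is normalized to UTC (an assumption, since the screenshot is not reproduced here): if the displayed time is 08:04:17 with offset 0000, it is the same instant as 09:04:17 +0100 and nothing is wrong; if it shows 09:04:17 with offset 0000, the offset was dropped.

```python
from datetime import datetime, timezone

# Example value from this post, parsed with its own offset.
local = datetime.strptime("28/Feb/2025:09:04:17 +0100", "%d/%b/%Y:%H:%M:%S %z")

# The same instant expressed in UTC.
as_utc = local.astimezone(timezone.utc)

print(local.isoformat())   # 2025-02-28T09:04:17+01:00
print(as_utc.isoformat())  # 2025-02-28T08:04:17+00:00
print(local == as_utc)     # True: same instant, different representation
```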

What can be done to correct this?

Thanks in advance for your help,

Dietmar Schurr

You could use something like the rule below, or set the timezone of your user.

rule "replace timestamp"
when
    has_field("timestamp")
then
    // Interpret the timestamp as Central European Time (CET)
    let new_date = parse_date(to_string($message.timestamp), "yyyy-MM-dd'T'HH:mm:ss.SSS", "CET");
    set_field("timestamp1", new_date);
end

Hello Wine_Merchant,

thank you for the suggestion. But this would mean adjusting the rule very soon, when the time zone switches from CET to CEST for daylight saving time (summer time).
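
A small sketch of the alternative, again in Python as an analogy only: because each log line carries its own UTC offset, a parser that honours that offset needs no hard-coded zone name and is unaffected by the CET/CEST switch. The summer line below is hypothetical, made up for illustration, and whether Graylog's date conversion treats the offset the same way is not confirmed in this thread.

```python
from datetime import datetime, timezone

FMT = "%d/%b/%Y:%H:%M:%S %z"

# Winter value from this thread (CET, +0100).
winter = datetime.strptime("28/Feb/2025:09:04:17 +0100", FMT)
# Hypothetical summer value (CEST, +0200), not taken from the logs.
summer = datetime.strptime("28/Jul/2025:09:04:17 +0200", FMT)

for ts in (winter, summer):
    # Each value is converted using the offset it was logged with.
    print(ts.astimezone(timezone.utc).isoformat())
# 2025-02-28T08:04:17+00:00
# 2025-07-28T07:04:17+00:00
```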
Right now the problem is not urgent enough to investigate further.
Thanks.

Dietmar Schurr