In all my logs I've got this error:
gl2_processing_error
Replaced invalid timestamp value in message <44233511-cce1-11ee-9876-b42e9948155b> with current time - Value <2024-02-16T15:37:17.777191831Z> caused exception: Invalid format: “2024-02-16T15:37:17.777191831Z” is malformed at “T15:37:17.777191831Z”.
2. Describe your environment:
OS Information:
Linux 6.1.0-13-amd64
Package Version:
5.2.3+9aee303
Service logs, configurations, and environment variables:
Elasticsearch is version 7+, and we are using Raw/Plaintext TCP inputs.
docker → vector → graylog
3. What steps have you already taken to try and solve the problem?
As I understand it, this is a Docker timestamp. All I need is to parse the message from the Docker container. At first I used the standard JSON extractor. Then I got tired of this gl2_processing_error and started using a pipeline, thinking that I could configure the rule in more detail. I tried all the options (format_date, convert_to_date, parse_date), but perhaps I configured them incorrectly and inserted them before parsing the message itself.
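The value in the error looks like Docker's RFC 3339 timestamp with nanosecond (9-digit) fractional seconds, which is what trips up the default parsing. A quick Python sketch (outside Graylog, using the value from the error above) shows the timestamp itself is fine once trimmed to the 6 fractional digits that %f accepts:

```python
# Quick check outside Graylog: the Docker timestamp has nanosecond
# precision (9 fractional digits); Python's %f accepts at most 6,
# so trim to microseconds before parsing.
import re
from datetime import datetime

raw = "2024-02-16T15:37:17.777191831Z"  # value from the error above
trimmed = re.sub(r"(\.\d{6})\d*Z$", r"\1Z", raw)  # drop digits 7-9
dt = datetime.strptime(trimmed, "%Y-%m-%dT%H:%M:%S.%fZ")
print(dt.isoformat())
```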
4. How can the community help?
I hope that I just need a well-configured pipeline rule.
in theory you just need to edit the format to “yyyy-MM-dd'T'HH:mm:ss.SSSSSSSSS'Z'” (in Joda-Time patterns the literal “T” and “Z” must be quoted, and fractional seconds use uppercase “S”) and set the output timezone
( parse_date(value, pattern, [locale], [timezone]) : DateTime)
The only thing I'm unsure of is the “T” within the timestamp - maybe replace that T while setting the timestamp variable with a regex (something like let my_timestamp = […].replace(“T”, “”))
Hi,
since I don't know the exact message which is loaded into the pipeline function, there is no generic version of a matching regex.
I suggest using something like regex101.com to work out your matching regex.
e.g. if the message is:
“<2024-02-16T15:37:17.777191831Z> my message log - some data”
then your regex would be:
“^\<([0-9\.\-T\:Z]*)\>\ .*”
so the line to get the raw timestamp:
let my_timestamp = regex("^\<([0-9\.\-T\:Z]*)\>\ .*", to_string($message.message));
explanation:
“^” = lines starting with
“\” = escaping the next character (sometimes needed, sometimes not - try it)
“([0-9\.\-T\:Z]*)” = matching one of 0-9, literal “.”, literal “-”, literal “T”, literal “:”, literal “Z”; the multiplier (*) = zero to unlimited occurrences; saved to regex group 1 (by enclosing it in parentheses)
"\>\ " = followed by literal “>” and a single space
“.*” = followed by any character, any number of times
this should give you a nice hint on how to get the raw timestamp string
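The pattern can also be sanity-checked with Python's re module. Note that a literal “.” belongs in the character class, otherwise the match stops at the fractional seconds (a quick sketch against the example line from above):

```python
# Check the suggested regex against the example line; the character
# class includes a literal "." so the fractional seconds are covered.
import re

line = "<2024-02-16T15:37:17.777191831Z> my message log - some data"
m = re.match(r"^\<([0-9\.\-T\:Z]*)\>\ .*", line)
raw_timestamp = m.group(1)  # the captured raw timestamp string
print(raw_timestamp)
```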
This is a small victory. I did it =)
But there is the next... mmm... mistake, I think )
I've got one rule that converts the timestamp and parses the message JSON:
rule "replace timestamp"
when
true
then
let result = regex("([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\\.[0-9]{9}Z)", to_string($message.message));
// parse_date's third positional parameter is the locale, so pass the timezone by name
let new_date = parse_date(value: to_string(result["0"]), pattern: "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSSS'Z'", timezone: "UTC");
set_field("timestamp", new_date);
let json_parsed = parse_json(to_string($message.message));
set_fields(to_map(json_parsed));
end
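For reference, a rough Python analogue of what the rule does - extract the raw timestamp with the same regex, then parse the whole message as JSON. The sample message below is a hypothetical Docker-style log line, not the actual payload:

```python
# Rough analogue of the pipeline rule: grab the raw timestamp via regex,
# then parse the full message as JSON and read its fields.
# The sample message is an assumption (Docker-style JSON log line).
import json
import re

message = '{"log": "hello world", "time": "2024-02-16T15:37:17.777191831Z"}'

m = re.search(
    r"([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{9}Z)",
    message,
)
raw_ts = m.group(1)          # corresponds to result["0"] in the rule

fields = json.loads(message) # corresponds to parse_json / set_fields
print(raw_ts, fields["log"])
```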
And when I start a rule simulation in the rule config, it's all OK! But when I go to my stream to check the messages, I get a non-parsed message, though without the gl2_processing_error... (which is great =)
And here is my message processor config:
|1|Pipeline Processor|active|
|2|Stream Rule Processor|active|
|3|AWS Instance Name Lookup|active|
|4|GeoIP Resolver|active|
|5|Message Filter Chain|
(maybe it even works with values parsed by parse_json - I don't use it, so I can't say for sure)
I've got a different processor config:
|1|Message Filter Chain|active|
|2|Pipeline Processor|active|
|3|GeoIP Resolver|active|
|4|AWS Instance Name Lookup|disabled|
|5|Illuminate Processor|active|
So the messages are first routed to the stream, and then I can filter messages in pipeline rules by stream.
I don't know if this is messing with your current chain, but I have used it this way since the beginning.