ihe
January 14, 2022, 4:36pm
1
I am trying to use the grok function in a pipeline, but I'm failing. I do not mean the Grok extractors on the Inputs; my plan is to use pipeline rules to extract data with Grok patterns, as described here.
A little more detail on the problem:
To debug, I have messages running into my stream. The messages are ingested as Syslog on port 6666. To keep it as simple as possible, I use this little bash one-liner:
echo "testword1 testword2" | nc my-graylog.internal.network 6666
To extract both words I'm using this PoC rule:
rule "howto Grok in Pipeline"
when
has_field("message")
then
let val = grok(
pattern:"%{WORD:part_1) %{WORD:part_2)",
value:to_string($message.message));
set_field(
field:"part_1",
value:to_string(val["part_1"])
);
set_field(
field:"part_2",
value:to_string(val["part_2"])
);
end
In the message I can see this error:
gl2_processing_error
In call to function 'grok' at 5:12 an exception was thrown: Illegal repetition near index 0
%{WORD:part_1) %{WORD:part_2)
^
I expected to get two new fields, but I don't. Where is my error?
gsmith
(GSmith)
January 15, 2022, 12:53am
2
Hello,
I might be able to help, but I'm not that great with pipelines.
From what I'm seeing in your error…
I think it may be something with this let val =
From glancing at other posts, perhaps it's supposed to be just
grok( pattern:"%{WORD:part_1) %{WORD:part_2)", value:to_string($message.message));
Or, after looking at this again, maybe a comma between…
pattern:"%{WORD:part_1), %{WORD:part_2)",
I also looked here and was unable to find this let val = function:
https://docs.graylog.org/docs/functions
Once you have created this pipeline/rule, you can test it out in the Simulator. Have you tried that? Then you can adjust the rule until you get the results you want.
I did a quick search for this situation; maybe some of the posts below have hints that may help.
Hello,
Can you please help me? I want to parse the secure log file on Linux, like the filebeat module does.
For example, filebeat has this rule for the secure file:
"%{SYSLOGTIMESTAMP:system.auth.timestamp} %{SYSLOGHOST:system.auth.hostname} sshd(?:\\[%{POSINT:system.auth.pid}\\])?: %{DATA:system.auth.ssh.event} %{DATA:system.auth.ssh.method} for (invalid user )?%{DATA:system.auth.user} from %{IPORHOST:system.auth.ssh.ip} port %{NUMBER:system.auth.ssh.port} ssh2(: %{GREEDYDATA:system.auth.ssh.signature})?",
"%…
(GitHub issue, opened 12 Apr 2019; labels: processing, feature, triaged)
In a pipeline, you may want to do something when grok doesn't match. For example, in one of my pipelines, I'd like to set `gl2_processing_error` when the specified pattern exists but the specified field doesn't match it.
## Expected Behavior
There are two reasonable ways this could work:
1. Add an optional `match_required` boolean argument to the `grok()` function; when set to true, if the string does not match the pattern, treat it as an error, much as when the pattern doesn't exist:
```
grok('%{FOO}','bar',false,true) #=> %{FOO} doesn't match 'bar', so an error is raised.
```
2. Add an optional `default_result` argument; when supplied, if the pattern and string do not match, instead of returning `{}` the function will return the value of the `default_result` argument:
```
grok('%{FOO}','bar',false,{ bing: 'bang' }) #=> %{FOO} doesn't match 'bar', so grok() returns { bing: 'bang' }
```
In this second case, I could pass `{ gl2_processing_error: "Pattern did not match" }` or similar as a default value; if the pattern did not match, then in the subsequent call to `set_fields` the relevant error would be set.
## Current Behavior
If the pattern exists, but the string does not match it, then the `grok()` function returns `{}`. Since the pipeline rule actions have no conditional support, and variables set with let don't persist between stages, the only way to react to a failed match is to store the result of the grok in a field (using `set_field`, not `set_fields`), and then in another stage have a rule like
```
rule "example"
when
has_field('__grok_result') && $message.__grok_result == {}
then
set_field("gl2_processing_error","Didn't match");
remove_field('__grok_result')
end
```
And then you still need another rule to get rid of the `__grok_result` field when the match was successful.
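For completeness, the first stage implied by that workaround might look like this (a sketch based on the description above; the `log_format` condition and the `%{FOO}` pattern are placeholders, not real definitions):
```
rule "grok and store result"
when
  has_field("log_format")
then
  // run grok once and keep the raw result so a later stage
  // can check whether the match produced anything
  let result = grok(pattern: "%{FOO}", value: to_string($message.message));
  set_field("__grok_result", result);
  set_fields(result);
end
```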
## Context
Most of our incoming messages are sent as GELF and never need any grok processing; however, some messages from legacy/third-party apps are sent via filebeat and need to be parsed. Filebeat is configured to add a `log_format` field for each file it's ingesting, which contains the name of the grok pattern to use to parse it.
If a filebeat configuration specifies a grok pattern that doesn't exist, it results in an error, making it easy to find mistakes by having a dashboard widget showing messages with a gl2_processing_error field; but if the pattern specified by filebeat does exist but doesn't match the message data, it's much harder for us to detect that we need to update our pattern definition.
## Your Environment
* Graylog Version: 2.5 (planning to move to 3.0, but documented behavior of `grok()` is the same either way)
* Elasticsearch Version: 6.7.1
* MongoDB Version: 2.6.12
* Operating System: CentOS 7
* Browser version: N/A
Hello everybody,
I am Dirk from Germany. I am a newbie with Graylog. Most of my problems I have solved with good internet documentation and YouTube :-). But now I have a problem with pipeline rules and a grok pattern.
The following grok pattern works fine within the extractor.
action=%{QUOTEDSTRING:action}\s*.*\s*srcip=\"%{IPV4:SourceIP}\"\s*dstip=\"%{IPV4:DestinationIP}\"\s*.*\s*srcport=\"%{POSINT:SourcePORT}\"\s*dstport=\"%{POSINT:DestinationPORT}\"
But the following code for the pipeline…
Got it, and the answer to both my questions: in Filebeat, add a field ‘document_type’ and set it to the type of log file you want parsed. In the example below I’m parsing the type ‘log4j’. In order to get the timestamp recognized as the “valid” one, you have to parse it as a date. In addition, I’m saving the full original log line as “original_message”.
Create the rule and add it to the ‘all messages’ stream.
rule "parse log4j"
when
has_field("document_type") && to_string($message.document_typ…
You could also have used:
let nexus = grok(pattern: "%{GROKPATTERNS}", value: message_field, only_named_captures: true);
which is more human readable.
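A complete rule built around that call might look like this (a sketch; %{GROKPATTERNS} stands in for whatever pattern actually applies to your messages):
rule "grok with only named captures"
when
    has_field("message")
then
    // only_named_captures: true keeps just the captures you named,
    // instead of also emitting every unnamed sub-pattern capture
    set_fields(
        grok(
            pattern: "%{GROKPATTERNS}",
            value: to_string($message.message),
            only_named_captures: true
        )
    );
end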
Hello, I am trying to do a simple replace of the source from where I get syslogs, as it's currently not correct. I have been playing with this for quite some time but I can't get it to work. If I use this same grok command in an extractor I get exactly what I want, but not in the rule.
Sorry, very new…
rule "FQDN to Source"
when
has_field("message")
then
let extract = grok(pattern: "\\s%{HOSTNAME}", value: to_string($message.source));
set_fields(extract);
end
ihe
January 19, 2022, 2:07pm
3
After reading back and forth I found my mistake:
If I close the patterns with a “}” instead of a “)”, it works like a charm.
I’ll close this thread.
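For reference, here is the original rule with both closing braces fixed:
rule "howto Grok in Pipeline"
when
    has_field("message")
then
    let val = grok(
        pattern: "%{WORD:part_1} %{WORD:part_2}",
        value: to_string($message.message));
    set_field(
        field: "part_1",
        value: to_string(val["part_1"])
    );
    set_field(
        field: "part_2",
        value: to_string(val["part_2"])
    );
end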
1 Like
ihe
January 19, 2022, 3:07pm
4
Here is an even nicer pipeline rule to get it working:
rule "howto Grok in Pipeline"
when
has_field("message")
then
set_fields(
grok(
pattern:"%{WORD:part_1} %{WORD:part_2}",
value:to_string($message.message)
)
);
end
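With the test message from the first post, this should produce the fields part_1: testword1 and part_2: testword2; the pipeline simulator is a quick way to verify that before connecting the rule to a live stream.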
2 Likes
system
(system)
Closed
February 2, 2022, 3:07pm
5
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.