I am missing functionality in the input queue. How do I deal with that?!

Hello. I installed Graylog (4.30) in a TrueNAS jail and connected my firewall.

To make that work, I defined an input and a stream, and wrote and copied about 25 extractors. That works OK, however three things surprised me very much — but perhaps that is due to my limited knowledge. Hence this post :blush:

My extractor queue is like this:

  • some generic extractors e.g. to extract the task name
  • one or more extractors for messages related to program-A
  • one or more extractors for messages related to program-B
  • etc.

However I am missing three things:

  1. If the extractor(s) related to program-A did their job, and the incoming message was related to program-A, I would expect an “extractor queue ready / exit” option. There is no reason to walk over the rest of the extractors!
  2. If a certain extractor matches, it could imply that I am not interested in that given message, so I would expect a “drop message” option.
  3. If a certain extractor matches, I might want to assign a certain value to a field which is not part of the message at all, so I would expect an option to fill a field with an extractor-defined value.

I assume others have run into this as well, so I am curious how others dealt with these issues / ideas.

Hi @louis
You should use pipelines for this feature-set.

In pipelines you can check for the existence of fields and their values. This should do the job.

drop_message will do that for you.

Just add another set_field to set those fields :wink:
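A minimal sketch of pipeline rules covering all three wishes (the field names `program`, `category` and the values `program-A` / `program-B` are invented examples, not anything from your setup; each rule goes into its own entry in the rule editor):

```
rule "tag messages from program-A"
when
  // fire only when earlier processing has set this field
  has_field("program") AND to_string($message.program) == "program-A"
then
  // fill a field with a rule-defined value that is not part of the message
  set_field("category", "application");
end

rule "drop uninteresting messages"
when
  has_field("program") AND to_string($message.program) == "program-B"
then
  // discard the message entirely
  drop_message();
end
```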

From my experience the message extractors are not the best way of extracting; the pipelines are much more powerful. As far as I know extractors will be deprecated some time in the future, but there is no date or version yet.


I am learning. You are promoting pipelines, however … I have ‘some’ problems:

  • I am missing a picture showing the Graylog architecture: how function blocks like inputs, streams, extractors etc. are interconnected. If you know such a picture … please share it

  • pipelines seem to be more powerful than extractors, but also not as easy to use

  • I am not aware of a function which can convert extractors into a pipeline :frowning:

  • In point 1) you suggest a check on a field as the solution; that is what I do within the extractors. I was looking for something like “break” (I am done, leave the pipeline / return from the pipeline function)

  • I did read somewhere about a filter function, but I cannot find such a function or “processing block” in the architecture

  • I did read somewhere that you can define the order in which function blocks are processed, but I have no idea how to do that (or is that an enterprise feature perhaps?)

Hey @louis

I agree pipelines are not easy. There are tons of examples in this forum: how-tos, configurations, routing, renaming, filtering messages, also using regex/Grok within the pipeline, etc.

Also in this forum. BUT if I understand this correctly, you want to know — given the function of an extractor — how to make a pipeline that does the same thing, OR magically turn the extractor into a pipeline?
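To sketch how the pipeline equivalent of a regex extractor can look (the pattern `task=(\S+)` and the field name `task_name` are invented here purely for illustration — substitute your extractor’s own regex and target field):

```
rule "extract task name"
when
  has_field("message") AND
  regex("task=(\\S+)", to_string($message.message)).matches == true
then
  let m = regex("task=(\\S+)", to_string($message.message));
  // the first capture group of the match is available as m["0"]
  set_field("task_name", m["0"]);
end
```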

Example #1

let’s say you have an Input for Linux devices, and you need to route a message that has a unique field “node_02” into another stream and send out an alert.

    rule "alert"
    when
      has_field("node_02")
    then
      // hypothetical stream ID — copy the real ID from the stream's edit page
      route_to_stream(id: "stream-id-goes-here");
    end

I tend to use stream ID’s instead of stream names.

Example #2

let’s say you have an Input for Linux devices, and you need to route a message that has a unique field “node_02” with specific data under that field call “Louis” into another stream and send out an alert.

    rule "Route Node_02/Louis"
    when
      has_field("node_02") AND contains(to_string($message.node_02), "Louis")
    then
      // hypothetical stream ID for the target stream
      route_to_stream(id: "stream-id-goes-here");
    end

Not only will you find examples here in the forum but also on GitHub, and there are tags here in the forum you can use for a better search. Pipelines are so versatile I can’t post all configurations here.

Dropping a message:

    rule "discard Message with Louis"
    when
      has_field("node_02") AND contains(to_string($message.node_02), "Louis", true)
    then
      drop_message();
    end

@louis don’t take this the wrong way, but these statements tell me you’re not really interested in pipelines, probably because you don’t have enough knowledge of them yet.

To sum those up as follows:

One of our Members took some time to demo this out for others, hope this helps.

If you get stuck and need assistance, I’m sure someone here can help.

I forgot to add, about extractors: depending on what you’re trying to achieve, you can create a REGEX extractor and attach a lookup table to it.
Here is one of mine, extractor type: regular expression


I will probably look into pipelines later, but what really strikes me is that I do not see any drawing that clearly shows the Graylog process flow and how the different objects (inputs, extractors, streams, pipelines, etc.) are interconnected! That is the first thing you need to know if you want to use Graylog in the best possible way.

input is clear (the data arrives here)
output is clear (data towards other systems)
index is clear (the message storage DB)
extractors are also clear (identifying messages and extracting fields from messages arriving from one particular input)
dashboard (representing data as available in the DB)

However, the rest, and the interaction between the blocks …


The link I posted above does show a logical diagram of the flow. It was built with help from multiple community members who collaborated, because the question being asked here has been asked before.

The problem is that, IMHO, it is not really showing what happens. For me it is just not good enough.

A line with some function blocks connected really does not show what happens and in which order. It is not correct either (IMHO).

I will try to do some reverse engineering by looking at the GUI and the options available there. But that should not be necessary …

I think the Graylog data-processing model is like this:

On the left, the external systems
On the right, the presentation layer


That’s unfortunate. It’s pretty close to how Graylog functions. But if you can come up with something else, that would be great.

hey @louis

Going back over your topics again to make sure I understand what you’re trying to achieve.

Extractors and how they work from the beginning.
Extractors, and how they work, from the beginning:
Select a message input on the System → Inputs page and hit Manage extractors in the actions menu. You can also choose to apply so-called converters on the extracted value, to convert a string consisting of numbers to an integer or double value. So let’s say you have 25 extractors using default configurations, meaning something like this.

Imagine 25 people trying to cut/copy data from the field “message” or “full_message”.
Then you have this section.

Drag and drop the extractors on the list to change the order in which they will be applied.


This is stated above, with a screenshot.

This would depend on the configuration you made. Under Condition:

Extracting only from messages that match a certain condition helps you avoid wrong or unnecessary extractions and can also save CPU resources.


Type a regular expression that the field should contain in order to attempt the extraction.
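For example, a condition regex like the one below (an invented sshd pattern, just to illustrate) makes the extractor attempt extraction only on messages that contain an sshd process tag:

```
sshd\[\d+\]
```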

As @ihe and I stated, this would be done in a pipeline with stages, attached to a stream, to drop a message.
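A pipeline ties such rules to a stream in numbered stages; a sketch (the pipeline and rule names here are invented):

```
pipeline "Firewall processing"
stage 0 match all
  rule "tag program messages";
stage 1 match either
  rule "discard Message with Louis";
end
```

Messages only advance from stage 0 to stage 1 if the stage’s match condition is met, which is the closest thing to the “break” / early-exit behavior asked about earlier in this thread.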

As you know already: a pipeline — or, depending on what you want to do, a converter — might be an option for you. And yes, I made that rhyme.

Set a plan for what you want; it’s a pretty simple flow:

INPUT --> modifying data here --> STREAM (filters or routing) --> INDEX.

That’s all I have for you. Also, in those links I posted above, if you scroll down the page you will notice another member has posted a video link on this subject; and/or check out the Graylog YouTube channel. Just a thought.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.