Extractors Check

GitsBdr · August 30, 2018, 8:30am

Hi everybody,
I’m currently trying to customise my extractors on Graylog.
The same extractors are needed by all the inputs on that node.
Because the log formats differ, my regular expression is the following: logid=("[^"]+"|[^\s]+)
As you can see, it contains an “OR” (alternation), so it takes a “long” time to process my messages.

Issue: I always have some unprocessed messages in my journal. When there are a lot of logs, they are written to /var/lib/graylog-server/journal/* and my /var fills up, which then causes a memory issue.
I don’t think I can optimise my regex because of the different log formats: I can’t predict whether an attribute will be wrapped in “…” or not.
So I would like to know how I can check whether all the extractors are useful or not. I can search with “NOT _exists_:xxx” in Graylog, but that is going to take a long time, as I have more than 200 extractors.
Thanks
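For readers who want to try the pattern from the post outside Graylog, here is a minimal Python sketch. Graylog evaluates extractors with Java regular expressions, but a pattern this simple should behave the same way in Python. The sample log lines and the helper name `extract_logid` are made up for illustration.

```python
import re

# The pattern from the post: either a quoted value or a run of non-whitespace.
LOGID = re.compile(r'logid=("[^"]+"|[^\s]+)')

def extract_logid(message):
    """Return the logid value without surrounding quotes, or None if absent."""
    match = LOGID.search(message)
    if match is None:
        return None
    return match.group(1).strip('"')

# Hypothetical sample lines, one per log format described in the post.
quoted = 'date=2018-08-30 logid="0000000013" type="traffic"'
bare = 'date=2018-08-30 logid=0000000013 type=traffic'

print(extract_logid(quoted))        # value with the quotes stripped
print(extract_logid(bare))          # same value from the unquoted format
print(extract_logid('no id here'))  # None, the pattern does not match
```

Running it against a handful of real lines is a quick way to confirm that both formats are covered before touching the extractor itself.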

jan · August 30, 2018, 9:58am

Your use case looks like something that can be done better with processing pipelines, because you can decide at a very granular level which regex should inspect which message. In addition, multiple stages can easily be used for information extraction.

In your case it looks like all of your 200 extractors run on all incoming messages, or at least on all messages that arrive through that input.

You can always check the metrics and see which extractor takes the longest, but as you have already seen, this will take a lot of time.

GitsBdr · August 30, 2018, 12:22pm

Thank you for the answer,

Yeah, my regex is already like that: it only tries to extract if a specific field is detected in the message. But I guess this check itself also takes a bit of time?

I checked Inputs / Manage extractors / Details and found something like “321 hits, 0 misses” or “0 hits, 321 misses” for particular extractors. So I guess the ones with “0 hits” never matched a field and can be deleted, right?

Thanks again
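The hits/misses check described above can be sketched in a few lines of Python once the counters have been collected. This is only an illustration: the dictionary keys `title`, `hits` and `misses` are hypothetical names for values copied from the extractor details page, not a confirmed Graylog API format.

```python
def never_matched(extractors):
    """Return titles of extractors with zero hits and at least one miss."""
    unused = []
    for ex in extractors:
        hits = ex.get("hits", 0)
        misses = ex.get("misses", 0)
        # An extractor with 0 hits and 0 misses has seen no traffic yet,
        # so it is skipped rather than reported as unused.
        if hits == 0 and misses > 0:
            unused.append(ex["title"])
    return unused

# Hypothetical counters, shaped like the "321 hits, 0 misses" lines in the post.
sample = [
    {"title": "logid", "hits": 321, "misses": 0},
    {"title": "old_field", "hits": 0, "misses": 321},
    {"title": "brand_new", "hits": 0, "misses": 0},
]

print(never_matched(sample))
```

A zero count only covers the period during which the counters were collected, so an extractor for a rarely seen log type may still be needed.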

jan · August 30, 2018, 1:42pm

So I guess the ones with “0 hits” never matched a field and can be deleted, right?

Maybe. As I don’t know your setup, your extractors, or the intention behind that processing configuration, I can only say that it might be so.

Yeah, my regex is already like that: it only tries to extract if a specific field is detected in the message. But I guess this check itself also takes a bit of time?

The processing pipeline has even more features: you can decide whether to run an extraction based on the content of another field, not only on the field being extracted from, as is the case with extractors.
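The idea of gating an extraction on another field can be modelled in plain Python. This is a conceptual sketch only, not Graylog pipeline rule syntax: the `RULES` table, the `source` values and the `status` pattern are all invented for illustration.

```python
import re

# Each hypothetical rule: (field to test, required value,
#                          field to parse, pattern, target field)
RULES = [
    ("source", "firewall-01", "message",
     re.compile(r'logid=("[^"]+"|[^\s]+)'), "logid"),
    ("source", "proxy-01", "message",
     re.compile(r'status=(\d{3})'), "status"),
]

def process(event):
    """Run only the rules whose condition on another field holds."""
    out = dict(event)
    for cond_field, cond_value, src_field, pattern, target in RULES:
        if event.get(cond_field) != cond_value:
            continue  # cheap equality check, the regex is never evaluated
        match = pattern.search(event.get(src_field, ""))
        if match:
            out[target] = match.group(1).strip('"')
    return out

event = {"source": "firewall-01", "message": 'logid="0000000013" status=200'}
result = process(event)
print(result)
```

Here the `status` rule never runs for the firewall event, which is the saving that per-message conditions are meant to provide when most of the 200 patterns are irrelevant to a given message.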

system · September 13, 2018, 1:42pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
