Pipeline Processor vs Extractor


(Dan Ravenstone) #1

Hello
I am trying to determine which method would use fewer resources (CPU, load, memory, I/O) and put less load on the Graylog cluster when extracting key values from a message: a pipeline processor or an extractor.

To provide an example, approximately 20% of the messages coming into an Apache log input contain a URL string with URL-encoded characters that we would like to parse:

GET /uri/rr.php?url=https%3A%2F%2Fwww.exampledomain.com/ar/%2Far%2Fconverter%2Fconvert%2F%3Ffield%3D1%26From%3DFIELD%26To%3DSAR&pageId=blank_convert&fromThisy=fieldExample&toThis=anotherFieldExample HTTP/1.1

With a pipeline processor, this would require at least two rules: the first would use a regex to identify the messages to be processed, and the second would grab certain key values and place them in new fields.

With an extractor, this would be performed on the input following similar logic: only process messages containing a certain string.

What I would like to understand is which method would require fewer resources on Graylog.

Thanks


(Jochen) #2
rule "message-with-url"
when
  contains(to_string($message.message), "%2F")
then
  // process message
end
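
To illustrate the second step described in the original post (grabbing key values and placing them in new fields), here is a minimal sketch of how the `then` block could be filled in. It uses the built-in pipeline functions `regex`, `key_value` and `set_fields`. The rule name, the regex pattern and the `url_` field prefix are assumptions chosen for illustration, not something from this thread. Note that the extracted values stay URL-encoded in this sketch.

```
rule "extract-url-params"
when
  contains(to_string($message.message), "%2F")
then
  // capture the query string: everything after "?" up to the next whitespace
  // (pattern is an assumption, adjust it to your log format)
  let m = regex("\\?(\\S+)", to_string($message.message));
  // split the captured string into pairs on "&" and into key/value on "="
  let params = key_value(value: to_string(m["0"]), delimiters: "&", kv_delimiters: "=");
  // store each pair as a new field; "url_" is a hypothetical prefix
  set_fields(fields: params, prefix: "url_");
end
```

For the example request above, this should produce fields such as `url_url`, `url_pageId`, `url_fromThisy` and `url_toThis`.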

(Dan Ravenstone) #3

Thanks @jochen.

However, I am more interested in understanding which method would be the most useful.
More specifically, which method would use fewer system resources (CPU, memory, I/O, load), or is that unknown?


(Jan Doberstein) #4

@nadx969 that isn’t measured by Graylog - but the processing pipelines are where we will be moving in the future.

So the future-safe solution is to use and understand the processing pipelines.

