How to calculate the number of inputbuffer_processors and processbuffer_processors

Even after a couple of years with Graylog, it’s still full of magic to me.
For example, after fighting with all those pipelines/extractors/grok/whatever, I ended up with a purely client-side processing solution.
So now I don’t have any pipelines or extractors.
And since the documentation doesn’t clearly explain what exactly each parameter means, I’m confused: how many “processors” do I need to set for such a configuration?
0? 1? 16? X * ES nodes?
Any ideas, guys?

This isn’t something anyone can give you much information on, other than perhaps a starting point. The number of processors required depends a lot on the number of messages you’re receiving, the pipelines you have configured, any extractors you are using, and how heavily you are using/searching the data.

Some guidelines for configuring them are:

  • Never allocate more processors than you have available CPUs. (If you are running on a system that has 4 cores, you can’t assign 6 to processing.)
  • The process buffer is your heavy hitter, so the majority should be allocated there.
  • By default, the output buffer doesn’t require a lot, so start with 1 CPU and go from there. If you’re configuring custom outputs, your needs will vary and you’ll need to adjust accordingly. I would still start with 1 unless you have CPUs to spare; in that case, go with 2.
  • Leave at least one CPU for system overhead (you’re still running web services and Mongo in most cases).
  • If you’re running on a hypervisor and can’t reserve the resources, you may run into issues where another system’s utilization affects Graylog.
  • Not processor related, but don’t discount the Java heap; size it accordingly too.
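As a rough illustration of the guidelines above, here is a minimal Python sketch that turns a core count into a starting allocation. The function name `suggest_processors` is hypothetical, and starting the input buffer at 1 is my own assumption (the guidelines only give a starting value for the output buffer). Treat the result as a first guess to tune from, not a recommendation from Graylog.

```python
def suggest_processors(total_cpus: int, reserved: int = 1) -> dict[str, int]:
    """Split CPUs into a starting allocation for the three Graylog buffers.

    Assumptions (not from the Graylog docs): `reserved` CPUs are kept back
    for system overhead, input and output buffers start at 1 each, and
    everything left over goes to the process buffer.
    """
    usable = total_cpus - reserved
    if usable < 3:
        raise ValueError("need at least 3 usable CPUs for this simple split")

    inputbuffer = 1   # assumed starting point
    outputbuffer = 1  # "start with 1 CPU and go from there"
    processbuffer = usable - inputbuffer - outputbuffer  # the heavy hitter

    return {
        "inputbuffer_processors": inputbuffer,
        "processbuffer_processors": processbuffer,
        "outputbuffer_processors": outputbuffer,
    }


if __name__ == "__main__":
    # Print the suggestion in server.conf style for a 16-core node.
    for key, value in suggest_processors(16).items():
        print(f"{key} = {value}")
```

With 16 cores and one reserved, this leaves 13 for the process buffer. From there you would watch buffer utilization and shift a CPU or two around as needed.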

To give you some insight, I’m processing 500+ million messages a day and have only a single GL node with 16 CPUs, allocated as:

processbuffer_processors = 12
outputbuffer_processors = 2

(some pipelines, but mostly GROK extractors)

Hope that helps.


And you set up 1 CPU for inputbuffer_processors?

Are GL and ES installed on the same host? If so, do you limit the number of CPUs allocated to ES?

Actually, I have 2, so I’m breaking my own best practices above a bit… 😁 😇

🤫

My ES is a completely separate 3-node cluster, so there is no resource contention.
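To make the arithmetic explicit, here is a small Python sketch that adds up the numbers quoted in this thread (16 CPUs, 2 input, 12 process, 2 output) and reports how many CPUs are left for system overhead. The helper name `cpu_headroom` is hypothetical and only there for illustration.

```python
def cpu_headroom(total_cpus: int, inputbuffer: int,
                 processbuffer: int, outputbuffer: int) -> int:
    """Return the CPUs left over after all three buffer allocations.

    A negative result means more processors are assigned than CPUs exist;
    zero means nothing is reserved for system overhead.
    """
    return total_cpus - (inputbuffer + processbuffer + outputbuffer)


if __name__ == "__main__":
    # Values quoted in the thread for the single 16-CPU Graylog node.
    print(cpu_headroom(16, inputbuffer=2, processbuffer=12, outputbuffer=2))
```

The three settings add up to exactly 16, so there is no CPU left over for overhead, which is the "leave at least one CPU" guideline being bent.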

Just for the record: I got a performance improvement by increasing outputbuffer_processors from 4 to 10.
I have 12 ES nodes.
But the final result is far from my expectations: performance is not much better compared to the previous configuration, when I had just 2 ES nodes.

Curious how many GL nodes you have. Do you have one that you bumped from 4 to 10, or multiple that you did this on?

I have 2 GL nodes (56 cores), and initially ES was also running on the same nodes.
Later I added 10 separate ES nodes, but I didn’t get a 500% performance improvement. Maybe just 20%.
That’s why I love and hate GL and ES at the same time.
