We have around 600 event definitions by now, and the event queue keeps building up. Is there any way to improve event processing, other than setting up a second Graylog server (which would probably help)? Elasticsearch is its own 3-node cluster on SAS SSDs, and nothing looks to be bottlenecking there.
You provide nearly no information about your system other than the event count (per second?). You also did not share what processing you do or what kind of server you run it on.
So there is no easy answer, because you have so many bits and pieces …
Hey, thanks for the quick answer,
I was originally asking in a general sense, in case there was some obvious trick or flag I had to set.
Graylog has 12 cores and 24GB of RAM, with a 12GB heap for the Graylog JVM. The 3 ES nodes have 4 cores and 32GB of RAM each, with a 16GB ES JVM heap.
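For reference, the heaps are set in the usual places; this sketch assumes a DEB/RPM package install (paths differ for other install methods):

```
# Graylog: /etc/default/graylog-server (Debian) or /etc/sysconfig/graylog-server (RHEL)
GRAYLOG_SERVER_JAVA_OPTS="-Xms12g -Xmx12g"

# Elasticsearch: /etc/elasticsearch/jvm.options
-Xms16g
-Xmx16g
```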
The ~600 event definitions each run every 10 seconds, searching the last 10 seconds of one stream that has its own index set: ~360GB across 12 indices with 3 shards + replicas each.
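(600 definitions at a 10-second interval works out to roughly 60 aggregation searches per second against ES just for events.) To see which definitions fire most often, you can list them via the REST API. A minimal sketch in Python, where the server URL and API token are placeholders; the field names (`execute_every_ms`, `search_within_ms`) are what recent versions use for aggregation event processors, so adjust if your version differs:

```python
import requests

GRAYLOG = "https://graylog.example.com:9000/api"  # placeholder URL
AUTH = ("your-api-token", "token")                # Graylog token auth: token as user, literal "token" as password

def list_event_definitions():
    """Fetch all event definitions and print their execution interval and search window."""
    resp = requests.get(f"{GRAYLOG}/events/definitions",
                        auth=AUTH,
                        headers={"Accept": "application/json"},
                        params={"per_page": 1000})  # assumes standard page/per_page pagination
    resp.raise_for_status()
    for d in resp.json().get("event_definitions", []):
        cfg = d.get("config", {})
        print(d["title"],
              "- every", cfg.get("execute_every_ms"), "ms,",
              "window", cfg.get("search_within_ms"), "ms")

if __name__ == "__main__":
    list_event_definitions()
```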
Overall it's about 1TB of logs, separated into a few index sets.
We have a few hundred extractors across all the different inputs, and a few processing pipelines doing small things; looking at the time all of those take, it doesn't look too bad.
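For anyone wanting to do the same check: per-extractor and per-pipeline timings show up as timers in the metrics API. A rough sketch under the same placeholder URL/token assumptions as above; the response shape here follows the Dropwizard-style JSON Graylog returns, and the exact metric names vary by version, so this just filters on a substring:

```python
import requests

GRAYLOG = "https://graylog.example.com:9000/api"  # placeholder URL
AUTH = ("your-api-token", "token")                # Graylog token auth

def slowest_timers(substring: str, top: int = 10) -> None:
    """Print the slowest timer metrics whose name contains `substring`."""
    resp = requests.get(f"{GRAYLOG}/system/metrics",
                        auth=AUTH,
                        headers={"Accept": "application/json"})
    resp.raise_for_status()
    timers = resp.json().get("timers", {})
    matching = [(name, t) for name, t in timers.items() if substring in name]
    # Sort by mean execution time, slowest first
    matching.sort(key=lambda kv: kv[1]["time"]["mean"], reverse=True)
    for name, t in matching[:top]:
        print(f'{t["time"]["mean"]:10.2f} {t.get("duration_unit", "")}  {name}')

slowest_timers("extractor")   # extractor execution timers
slowest_timers("pipeline")    # pipeline rule timers
```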
Overall it's performing absolutely perfectly! During heavy times it will write 10k-20k messages to ES without issues; it's just the events that are somehow processed slowly.