I have set up Graylog with Kafka, which has 4 queues: two with 120 partitions each and two with 240 partitions each. I have 4 GELF Kafka inputs and 3 Graylog servers.
Inputs for the queues with 120 partitions have 40 threads each.
Inputs for the queues with 240 partitions have 80 threads each.
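As a rough sanity check on these numbers, the sketch below works out how many partitions each consumer thread would own. It assumes every input runs on all 3 Graylog servers (the post does not say whether the inputs are global), and the function name is only illustrative.

```python
def partitions_per_thread(partitions, threads_per_input, servers):
    """Average number of partitions each consumer thread has to read.

    Assumes the input is started on every server, so the total number
    of consumer threads is threads_per_input * servers.
    """
    total_threads = threads_per_input * servers
    return partitions / total_threads


# Figures from the post: 3 Graylog servers in both cases.
small_queue = partitions_per_thread(partitions=120, threads_per_input=40, servers=3)
large_queue = partitions_per_thread(partitions=240, threads_per_input=80, servers=3)
print(small_queue, large_queue)
```

Under that assumption both setups come out at one partition per thread, so adding more partitions without adding threads or servers would not add parallelism on the consumer side.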
However, a lot of log messages are being pushed to Kafka, and the consumers (the Graylog servers) are not processing them quickly enough. Each Graylog process has 40GB set for both Xmx and Xms. Processing is good at the beginning, but then the lag in Kafka starts increasing. I have tried splitting the load across more queues and increasing the partitions, but nothing helped. Graylog is processing only around 2k messages per second (Kafka lag is around 1.2 million).
With a UDP input, Graylog itself was processing around 50k messages per second.
If the Graylog servers consumed the messages faster, the lag would go down.
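To put the lag in perspective, here is a minimal sketch of the arithmetic: lag only shrinks when the consume rate exceeds the produce rate. The produce rate is not given in the post, so it is a parameter here, and the function name is hypothetical.

```python
def seconds_to_drain(lag, consume_rate, produce_rate=0.0):
    """Time in seconds until the backlog is gone.

    Returns None when consumers are not faster than producers,
    because the lag then never shrinks.
    """
    net_rate = consume_rate - produce_rate
    if net_rate <= 0:
        return None
    return lag / net_rate


# 1.2 million messages of lag at 2k msg/s, with producers paused (assumption).
print(seconds_to_drain(1_200_000, 2_000))
# Same lag if consumers reached the 50k msg/s seen with UDP,
# with producers still sending 2k msg/s (assumed rate).
print(seconds_to_drain(1_200_000, 50_000, 2_000))
```

Even with producers stopped entirely, 2k messages per second needs 10 minutes to clear the current backlog, so any sustained produce rate above that will keep the lag growing.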
Can someone help with this?
I am using Graylog version 2.1.2.
How does having a Java heap larger than the recommended setting lead to slower message processing?
Also, before using Kafka I used to send log messages via UDP and it was working fine. Graylog was able to push those messages to Elasticsearch.
What is the available memory of your Graylog server? Is Elasticsearch on other systems or on the same one? What is the RAM and the configured heap for Elasticsearch?
Elasticsearch:
There are 3 master nodes, each with 6GB RAM and 4GB heap, and 6 data nodes, each with 60GB RAM and 45GB heap. Each data node has 4TB for data storage, and replication is not enabled.
Graylog:
3 nodes, each with 60GB RAM and 20GB heap.
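For reference, a small sketch comparing these heap sizes with commonly cited Elasticsearch guidance: keep the heap at or below roughly 50% of RAM, so the rest is left for the filesystem cache, and below roughly 32GB, so the JVM can keep using compressed object pointers. Those two thresholds are assumptions brought in for illustration, not figures from this thread, and whether they explain the slowdown here is not established.

```python
# (heap_gb, ram_gb) per node type, taken from the post above.
NODES = {
    "Elasticsearch master": (4, 6),
    "Elasticsearch data": (45, 60),
    "Graylog": (20, 60),
}


def heap_report(nodes, max_fraction=0.5, max_heap_gb=32):
    """Return (name, heap/RAM fraction, warnings) for each node type.

    max_fraction and max_heap_gb are rule-of-thumb thresholds (assumed).
    """
    rows = []
    for name, (heap_gb, ram_gb) in nodes.items():
        fraction = heap_gb / ram_gb
        warnings = []
        if fraction > max_fraction:
            warnings.append("heap above 50% of RAM")
        if heap_gb > max_heap_gb:
            warnings.append("heap above ~32GB")
        rows.append((name, round(fraction, 2), warnings))
    return rows


for row in heap_report(NODES):
    print(row)
```

By these rules of thumb the data nodes stand out on both counts: 45GB of heap is 75% of the 60GB of RAM and is also above the compressed-pointer range.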