Graylog Kafka Input Processing is slow

Hi,

I have set up Graylog with Kafka, which has 4 queues: two with 120 partitions each and two with 240 partitions each. I have 4 GELF Kafka inputs and 3 Graylog servers.
Inputs on the queues with 120 partitions have 40 threads each.
Inputs on the queues with 240 partitions have 80 threads each.

However, a lot of log messages are being pushed to Kafka, and the consumers (the Graylog servers) are not processing them quickly enough. Each Graylog process has 40 GB of Xmx and Xms. Processing is good at the beginning, but then the lag in Kafka starts increasing. I have tried splitting the Kafka queues, adding more queues, and increasing the partitions; nothing helped. The Graylog servers are processing only around 2k messages per second (the Kafka lag is around 1.2 million).
With the UDP input, Graylog was processing around 50k messages per second.

If the Graylog servers consumed the messages faster, the lag would go down.
Can someone help with this?
The Graylog version I use is 2.1.2.

Can someone help me? The Kafka input is not able to keep up with the rate at which log messages are generated.
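
As a rough sanity check of the thread counts in the original post: within a Kafka consumer group, each partition is read by at most one consumer thread at a time, so threads beyond the partition count sit idle. The sketch below assumes that each input runs on all 3 Graylog nodes with the stated thread count per node; that is an assumption, not something confirmed in this thread, and the function name is made up for illustration.

```python
import math


def partition_load(partitions: int, threads_per_node: int, nodes: int) -> dict:
    """Estimate how partitions spread over consumer threads in one group."""
    total_threads = threads_per_node * nodes
    # A partition is owned by at most one thread, so surplus threads stay idle.
    active_threads = min(total_threads, partitions)
    return {
        "total_threads": total_threads,
        "idle_threads": total_threads - active_threads,
        "max_partitions_per_thread": math.ceil(partitions / active_threads),
    }


# Thread and partition counts taken from the original post; nodes=3 is assumed
# to mean that every input runs on every Graylog server.
for partitions, threads in [(120, 40), (240, 80)]:
    print(partitions, partition_load(partitions, threads, nodes=3))
```

Under that assumption the thread counts already match the partition counts one to one, which would suggest looking for the bottleneck elsewhere, for example in message processing or in Elasticsearch indexing.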

  • Have you done any tuning of Graylog? If yes, what?
  • Have you checked whether your Elasticsearch cluster can handle the volume of messages you ingest?
  • The Java heap is way above the recommended setting for Graylog, and you should reduce it.

How does having more Java heap than the recommended setting lead to slower processing of messages?
Before using Kafka I used to send log messages via UDP, and it was working fine. Graylog was able to push those messages to Elasticsearch.

@rakesh

Having more heap means longer GC runs.

Hi Jan, thanks for the reply.

What is the recommended heap size? Is it the default that comes with graylog.conf?
I have set it to 20 GB for now.

What is the available memory of your Graylog servers? Is Elasticsearch running on separate systems or on the same ones? What are the RAM and the configured heap for Elasticsearch?

Elasticsearch:
There are 3 master nodes, each with 6 GB RAM and 4 GB heap, and 6 data nodes, each with 60 GB RAM and 45 GB heap. Each data node has 4 TB of data storage, and replication is not enabled.

Graylog:
3 nodes, each with 60 GB RAM and 20 GB heap.

Using more than 32 GB (actually, 30.5 GB) of heap with the JVM is strongly discouraged, see the following blog post for some background information:

20 GB of heap memory for Graylog also seems like overkill. Most setups, including large ones, run pretty smoothly with 4 GB of heap memory.
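
To make the heap advice concrete, here is a minimal sketch that checks the node sizes mentioned in this thread against the 30.5 GB threshold quoted above. The optional "heap should not exceed half of RAM" check is a common Elasticsearch rule of thumb and is included here as an assumption; the function and parameter names are hypothetical.

```python
from typing import List, Optional

HEAP_LIMIT_GB = 30.5  # threshold quoted in the reply above


def heap_warnings(name: str, ram_gb: float, heap_gb: float,
                  max_ram_fraction: Optional[float] = None) -> List[str]:
    """Return human-readable warnings for one node's heap configuration."""
    warnings = []
    if heap_gb > HEAP_LIMIT_GB:
        warnings.append(f"{name}: heap {heap_gb} GB is above {HEAP_LIMIT_GB} GB")
    # Assumed rule of thumb: leave the remaining RAM to the OS file cache.
    if max_ram_fraction is not None and heap_gb > ram_gb * max_ram_fraction:
        warnings.append(
            f"{name}: heap {heap_gb} GB is more than "
            f"{max_ram_fraction:.0%} of {ram_gb} GB RAM"
        )
    return warnings


# Node sizes as reported in this thread.
nodes = [
    ("Elasticsearch data node", 60, 45, 0.5),
    ("Graylog node (original)", 60, 40, None),
    ("Graylog node (current)", 60, 20, None),
]
for name, ram, heap, fraction in nodes:
    for warning in heap_warnings(name, ram, heap, fraction):
        print(warning)
```

With these numbers, the Elasticsearch data nodes and the original 40 GB Graylog heap are flagged, while the current 20 GB Graylog heap passes the threshold check even though it is still far above the 4 GB suggested here.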

OK, let me try that.
I get around 60 million log messages every hour, so would 4 GB of heap still be sufficient?
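
For a sense of scale, the figures given in this thread can be turned into per-second rates. This is only back-of-the-envelope arithmetic: it assumes that the 2k messages per second is the total consumption rate across all Graylog servers and that the incoming rate is spread evenly over the hour. The helper names are made up for illustration.

```python
from typing import Optional


def per_second(messages_per_hour: float) -> float:
    """Convert an hourly message count into an average per-second rate."""
    return messages_per_hour / 3600


def seconds_to_drain(lag: float, incoming_per_sec: float,
                     consumed_per_sec: float) -> Optional[float]:
    """Time needed to clear the backlog, or None if the lag can only grow."""
    surplus = consumed_per_sec - incoming_per_sec
    if surplus <= 0:
        return None
    return lag / surplus


incoming = per_second(60_000_000)  # 60 million messages per hour
lag = 1_200_000                    # Kafka lag reported in the original post

print(f"incoming rate: {incoming:,.0f} msg/s")
print(f"lag growth at 2k msg/s: {incoming - 2_000:,.0f} msg/s")
print("drain time at 2k msg/s:", seconds_to_drain(lag, incoming, 2_000))
print("drain time at 50k msg/s:", seconds_to_drain(lag, incoming, 50_000))
```

The incoming rate works out to roughly 16,700 messages per second, so at 2k messages per second the lag keeps growing by about 14,700 messages every second. At the 50k messages per second reported for the UDP input, the same backlog would clear in about 36 seconds.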

Without details about your setup, that’s hard to say.

You’ll have to try it.

This would probably also be a good time to think about Graylog Enterprise support: https://www.graylog.org/enterprise

Cool, let me try that.
Sure, I will think about Graylog Enterprise support.

Thanks, Jochen.
