GELF Kafka input load balancing


(M D) #1

Hi,
I have a Graylog cluster with 3 nodes. I have set up an input using GELF Kafka, and I send all my logs to Kafka.
I sent about 100 thousand logs to test my setup. Only one of the nodes (Node 1) processes all 100 thousand logs, while the other 2 nodes do no processing. If I stop the Graylog server on Node 1 and then send 100k logs, Node 2 processes them while Node 3 stays idle. So I can see that all 3 nodes are able to read from the Kafka topic, but only 1 node processes the logs at a time. How can I balance the load between all 3 Graylog nodes?

Thanks


(Jochen) #2

(M D) #3

Tried creating a topic with 4 partitions. Still not working.


(M D) #4

Solved the issue by increasing the number of partitions. I added the following property to the server.properties file in the Kafka config:

num.partitions=4

If topics were created before adding this configuration, add partitions to them manually using the Kafka replication tools:

bin/kafka-topics --alter --zookeeper localhost:2181 --topic topicname --partitions 6

Note: the number of partitions should be the number of Graylog server nodes multiplied by 2 (here, 3 nodes, so 6 partitions).
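
For anyone wondering why the partition count matters: within a Kafka consumer group, each partition is read by only one consumer, so a topic with a single partition keeps a single node busy no matter how many nodes subscribe. Below is a minimal sketch of that rule, assuming a simple round-robin spread. The helper `assign_partitions` and the node names are hypothetical and only for illustration; Kafka's real assignors differ in detail.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over consumers round-robin (simplified, hypothetical helper).

    Mirrors the Kafka rule that each partition is read by exactly one
    consumer within a consumer group.
    """
    assignment: Dict[str, List[int]] = {name: [] for name in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


nodes = ["node1", "node2", "node3"]  # hypothetical names for the 3 Graylog nodes

single = assign_partitions(1, nodes)  # topic with one partition
six = assign_partitions(6, nodes)     # topic after --partitions 6

# Nodes that own no partition receive no messages at all.
idle_before = [name for name, parts in single.items() if not parts]
idle_after = [name for name, parts in six.items() if not parts]

print(single)
print(six)
```

With one partition, two of the three nodes own nothing, which matches the behaviour described in the first post. With 6 partitions, every node owns two.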

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools

Restart the producer (the source sending the messages to Kafka) just to be sure.


(Jan Doberstein) #5

as reference:

