I have a 6-node Graylog 2.4.3 cluster pulling logs out of a 3-node Kafka 2.0.0 cluster. There are six topics in the Kafka cluster, each mirroring log messages from a remote Kafka cluster. Graylog has one GELF Kafka input configured per topic (i.e. six GELF Kafka inputs in total). As long as both Graylog and Kafka are running, log messages flow smoothly.
Yesterday, I shut down the Graylog cluster for 10 minutes and then started it up again, to test whether Graylog would pick up where it left off when consuming logs from the Kafka cluster. It did not fully catch up: for the outage period, Graylog has indexed only about 40% of the log volume it indexed over comparable periods on either side of the outage.
While investigating this, I started wondering how the Graylog Kafka input plugins keep track of message offsets in the topics. I used the ‘kafka-consumer-groups.sh’ tool to list all consumer groups known to Kafka, but ‘graylog2’ wasn’t listed. Can anyone tell me whether Graylog uses the Kafka consumer group offset mechanism to track the last processed message in a topic, or whether it records the offset internally? Also, can anyone confirm whether Graylog should, by design, resume log ingestion from Kafka after a short outage like the one I triggered, assuming the logs have not been deleted from Kafka?
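
To make clear what I was expecting to find: if Graylog committed offsets through the consumer group mechanism, I would expect to see a committed offset per partition and a lag equal to the log-end offset minus that committed offset, which is what ‘kafka-consumer-groups.sh --describe’ reports for groups it knows about. Below is a minimal Python sketch of that arithmetic only. The function name and all of the numbers are made up for illustration; they are not taken from my cluster or from Graylog.

```python
from typing import Dict, Optional


def partition_lag(
    committed: Dict[int, Optional[int]],
    log_end: Dict[int, int],
) -> Dict[int, Optional[int]]:
    """Return lag per partition: log-end offset minus committed offset.

    A value of None means no committed offset is known for that partition,
    which is what I would expect if the consumer never committed under
    this group.
    """
    lag: Dict[int, Optional[int]] = {}
    for partition, end in log_end.items():
        offset = committed.get(partition)
        # Clamp at zero so a stale log-end reading never yields negative lag.
        lag[partition] = None if offset is None else max(end - offset, 0)
    return lag


# Hypothetical offsets for one topic with three partitions.
committed = {0: 1200, 1: 950, 2: None}
log_end = {0: 1500, 1: 950, 2: 400}
print(partition_lag(committed, log_end))  # {0: 300, 1: 0, 2: None}
```

If offsets were tracked this way, I would expect the lag to grow during the 10-minute outage and then drain back towards zero after the restart, rather than the backlog being partly skipped.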