Cluster Graylog and embedded Kafka

Hello everyone,

We are going to run a Graylog cluster on AWS, and I have a question about the embedded Kafka service: how does it handle messages between the nodes? Do I have to plan a configuration, or is it managed automatically by Graylog?

Thanks for your help

Best regards

Which embedded Kafka service do you mean?

Graylog uses the disk journal implementation of Kafka internally (the “Log”), and it provides inputs which can be used to consume messages from Kafka brokers, but it doesn’t provide an embedded Kafka broker.

Thank you very much, Jochen, for your answer.

Yes, I mean the internal Kafka implementation in Graylog. Can you confirm that it runs on each node of the cluster, and that the journal is written locally on each node, in the directory specified in the “graylog.conf” configuration file by the “message_journal_dir” setting?

I use Logstash to send my server logs to Graylog (GELF/UDP), and I wanted to know how messages are distributed across the different nodes of the cluster. Does Graylog distribute the messages automatically?

Graylog writes incoming messages into the disk journal (which can be configured with the message_journal_dir setting) immediately after they’ve been received and before they are further processed (extractors, pipeline rules, etc.).
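
For reference, the relevant journal settings in graylog.conf look roughly like this on each node (the path and size below are just example values):

```
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_size = 5gb
```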

Graylog doesn’t automatically distribute messages (or message fragments) to different nodes in the Graylog cluster. Messages are always processed on the node which received them.

In case of GELF UDP, you have to make sure that all message chunks are received by the same Graylog node. This is important if you’re using a UDP load-balancer which is not aware of the GELF UDP protocol.
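
To illustrate why this matters, here is a rough Python sketch of how a chunked GELF UDP message is built (host, port and chunk size are placeholders): every chunk carries the same 8-byte message ID, and Graylog can only reassemble the message if all of its chunks arrive at the same node.

```python
import gzip
import json
import os
import socket

GELF_CHUNK_MAGIC = b"\x1e\x0f"   # marks a chunked GELF datagram
CHUNK_PAYLOAD_SIZE = 1420        # keep each datagram below a typical MTU

def send_gelf_udp(message, host="graylog.example.com", port=12201):
    """Send one GELF message over UDP, chunking it if it is too large."""
    payload = gzip.compress(json.dumps(message).encode("utf-8"))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if len(payload) <= CHUNK_PAYLOAD_SIZE:
            sock.sendto(payload, (host, port))
            return
        # GELF allows at most 128 chunks per message; every chunk carries
        # the same random 8-byte message ID so the receiver can reassemble.
        chunks = [payload[i:i + CHUNK_PAYLOAD_SIZE]
                  for i in range(0, len(payload), CHUNK_PAYLOAD_SIZE)]
        message_id = os.urandom(8)
        for seq, chunk in enumerate(chunks):
            header = GELF_CHUNK_MAGIC + message_id + bytes([seq, len(chunks)])
            sock.sendto(header + chunk, (host, port))

send_gelf_udp({"version": "1.1", "host": "web01",
               "short_message": "example message", "full_message": "x" * 5000})
```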

Thank you, Jochen.

So to distribute the messages, I have to put a load balancer in front of the Graylog cluster that accepts the GELF/UDP traffic sent by Logstash and forwards it to the different nodes of the cluster?

I’d recommend using GELF TCP when you want to deploy a load balancer in front of Graylog.
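
For comparison, a minimal sketch of GELF over TCP (host and port are placeholders): each message is a plain, uncompressed JSON frame terminated by a null byte, so a TCP load balancer can simply spread connections across the Graylog nodes without any chunk-reassembly concerns.

```python
import json
import socket

def send_gelf_tcp(message, host="graylog-lb.example.com", port=12201):
    """Send one GELF message over TCP: uncompressed JSON ended by a null byte."""
    frame = json.dumps(message).encode("utf-8") + b"\x00"
    with socket.create_connection((host, port)) as sock:
        sock.sendall(frame)

send_gelf_tcp({"version": "1.1", "host": "web01",
               "short_message": "example message over GELF TCP"})
```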

@jochen: Hi Jochen, since it is a Kafka implementation that runs and writes directly to the local disk, if we lose the node/pod (Docker), we lose the in-flight messages.
Is there a solution? For example, is there a way to share the journal directory between the different nodes?

It’s a disk journal, not a full-fledged messaging broker.

You could write your logs into an external message broker such as RabbitMQ or Kafka and let Graylog pull messages from there.

No, the disk journal cannot be shared between nodes.

So if I go through Kafka or RabbitMQ, Graylog doesn’t write to disk, is that it?

Hi,

So if I understand your solution with an external Kafka/RabbitMQ correctly, Graylog will trigger a job when it sees a message in the queue, process it, and only then clear the message from the queue.
This is what would be optimal for me/us.

However, I am wondering where I can tell Graylog that I want to use an external message broker; I can’t find it in the docs.
It may also help to have a view of the overall project: we want to run Graylog on top of Kubernetes, with HA and no loss of data/work. That is why we are looking for a way to make Graylog stateless.

Graylog comes with Kafka and RabbitMQ inputs which you can use to read messages from a message broker.
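
As a sketch of that setup (assuming the kafka-python client, and an illustrative broker address and topic name), a producer could publish GELF-style JSON messages to a topic, and a Graylog Kafka input subscribed to the same topic would then pull and process them:

```python
import json
from kafka import KafkaProducer  # pip install kafka-python

# Publish GELF-style JSON messages to a Kafka topic. The broker address and
# topic name below are placeholders; a Graylog Kafka input subscribed to the
# same topic would consume and process these messages.
producer = KafkaProducer(
    bootstrap_servers="kafka.example.com:9092",
    value_serializer=lambda m: json.dumps(m).encode("utf-8"),
)

producer.send("graylog-gelf", {
    "version": "1.1",
    "host": "web01",
    "short_message": "example message shipped via Kafka",
})
producer.flush()
```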
