Configure Graylog with Kafka

Hello,

In a lab environment I have set up 2 separate hosts, both running Kafka/Zookeeper, and 1 host as the Graylog server.
My goal is to send all /var/log/messages logs from the clients to Kafka and ultimately to Graylog.
Could you please help with the following points, which are not clear to me?

  1. How do I configure rsyslog on the clients to send /var/log/messages to Kafka?
    I tried something like the following, without success:

    $ModLoad omkafka
    action(type="omkafka" topic="logs" broker=["<ip of kafka server>:9092"] template="json")

Also, what should I enter as the Kafka IP? I have 2 Kafka servers, not just 1.

  2. What is the correct listener type within Graylog?
    Syslog Kafka, Raw/Plaintext Kafka, or something else?

Also, what should I enter in the “Zookeeper Address” field of the Graylog Kafka listener? Please note that there are 2 Zookeeper/Kafka servers available. Does Zookeeper have something like a cluster name?

Any help would be appreciated,

Out of all the things you mention, only Graylog is a product supported by the folks of this website. With some luck somebody may come by who has experience with Zookeeper and/or Kafka, but that’s not guaranteed.

You could make it a DNS alias and point it at the hostnames. That’s how I tackled things with my rsyslog clients: the target was a DNS alias for three hosts running syslog receivers.
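
As an alternative to a DNS alias, the omkafka `broker` parameter takes a list, so both Kafka hosts can be named directly. Below is a minimal sketch, not a tested setup: the hostnames `kafka1.example.org` and `kafka2.example.org` are placeholders, and the template named `json` with its field names is purely illustrative. It also assumes the omkafka module is installed and that rsyslog on the client already handles the messages that end up in /var/log/messages, so the action sees the same stream.

```
# Load the Kafka output module (it must be installed on the client).
module(load="omkafka")

# Illustrative JSON template; the field names are an assumption, not a Graylog requirement.
# option.json="on" makes rsyslog JSON-escape the property values.
template(name="json" type="list" option.json="on") {
    constant(value="{\"timestamp\":\"")
    property(name="timereported" dateFormat="rfc3339")
    constant(value="\",\"host\":\"")
    property(name="hostname")
    constant(value="\",\"program\":\"")
    property(name="programname")
    constant(value="\",\"message\":\"")
    property(name="msg")
    constant(value="\"}")
}

# List every broker; the hostnames below are placeholders for the two Kafka servers.
action(type="omkafka"
       topic="logs"
       broker=["kafka1.example.org:9092", "kafka2.example.org:9092"]
       template="json")
```

Note that the action refers to a template by name, so a template called `json` has to be defined somewhere in the configuration, as in the sketch.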

I have written a small guide on sending syslog via Kafka:

but that was written at a time when rsyslog did not support Kafka, and I do not have the time to revisit the topic and write a how-to. So if you figure it out, it would be nice if you wrote down all the needed steps for the next person.

Great article, very helpful indeed.
However, could you please shed some light on the JSON extractor part? Could you explain the following further:

" Additionally create a second extractor on the field host and the type Copy input , and store it in the field source . You might want a third Copy input to store Logstash’s @timestamp field into the timestamp message field used by Graylog. "

It’s telling you that it’s a good idea to create a new extractor (System > Inputs, then choose the input, and Manage Extractors). Apparently the article has already told you to create an extractor, it’s now telling you to repeat the process.

But instead of working against the “messages” field, it’s telling you to extract from the “host” field, copying it into “source”.

The problem is that nobody really explains how to manually create an extractor for the fields you need (for example, extracting fields from /var/log/messages). There is just a lot of ambiguous information around, aimed only at people who already have some experience.
Moreover, the official Graylog documentation is indeed poor on that issue.

Thanks for your reply.

Moreover, the official Graylog documentation is indeed poor on that issue.

Two possible options:

  1. Help improve the documentation (it is an open-source product!)
  2. Create an issue on GitHub that explains what is missing, so that someone else might pick up the work.

I’ll gladly help, by the way; I’ve written too bloody much the past few weeks not to share a little bit… And awesome, I see that the docs site is editable on GitHub! I’ll try and poke around in the contents RSN™.
