Filebeat -> Graylog -> kafka -> kafka -> Graylog -- fields not being mapped at end

Hi there,

I have a setup whereby log messages are pushed by filebeat 5.5.1 running on CentOS 6/7 servers to a Beats input on a 3-node Graylog 2.4.3 cluster. An output (org.graylog.plugins.kafka.KafkaOutput) on the ‘All messages’ stream of the 3-node cluster sends the messages to a local kafka cluster. They are then consumed by a remote kafka cluster and, from there, by a 6-node Graylog 2.4.3 cluster using a Syslog Kafka input.

On the 3-node cluster end, the fields are extracted successfully into ‘source’, ‘timestamp’, ‘message’, ‘facility’, ‘file’, ‘type’ etc. The messages are received successfully at the 6-node end, but the ‘source’ is displayed by Graylog as ‘unknown’, the ‘facility’ is ‘Unknown’ and other beats-related fields don’t exist. I can see that the messages are sent into kafka by the aforementioned output on the 3-node Graylog cluster side in plain text, so this makes sense.

I want the messages in the 6-node Graylog cluster to be in the same format as in the 3-node cluster, i.e. with ‘source’ containing the hostname of the machine generating the message, together with the beats-related fields etc. Would it work if the kafka output plugin on the 3-node cluster side had the option of outputting the messages to kafka in JSON format, with all fields intact? Any other potential solutions?


The Kafka output you’ve linked to only sends the “message” field of a message to Kafka and ignores all other message fields.
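To illustrate the difference, here is a rough Python sketch (not the plugin's actual code, which is Java) comparing a payload that carries only the “message” field with one that serializes every field as JSON. The function names and the sample field values are made up for the example. The receiving side would also need an input or extractor that parses JSON rather than syslog for the fields to come back.

```python
import json


def encode_message_only(fields):
    """Mimic an output that forwards only the 'message' field as plain text."""
    return str(fields.get("message", "")).encode("utf-8")


def encode_all_fields(fields):
    """Hypothetical alternative: serialize every field as one JSON document."""
    # default=str turns non-JSON values (e.g. datetime objects) into strings.
    return json.dumps(fields, sort_keys=True, default=str).encode("utf-8")


def decode_all_fields(payload):
    """What a JSON-aware consumer could do with the Kafka record value."""
    return json.loads(payload.decode("utf-8"))


# Sample message; all values here are invented for illustration.
example = {
    "source": "web01.example.org",
    "message": "Accepted publickey for deploy",
    "facility": "filebeat",
    "file": "/var/log/secure",
    "type": "log",
    "timestamp": "2018-05-01T12:00:00.000Z",
}

if __name__ == "__main__":
    # Plain-text payload: 'source', 'facility', 'file' etc. are lost.
    print(encode_message_only(example))
    # JSON payload: all fields survive the trip through Kafka.
    print(decode_all_fields(encode_all_fields(example)))
```

With the plain-text payload the 6-node cluster has nothing to derive ‘source’ or ‘facility’ from, which matches the ‘unknown’ values described above.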

