Currently I am using Logstash with the GELF output plugin to send log files to Graylog.
So: Logstash -> GELF -> Graylog
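For reference, the output stage of the current setup is essentially just the GELF output plugin (hostname is a placeholder):

```
output {
  gelf {
    host => "graylog.example.com"  # Graylog GELF UDP input
    port => 12201
  }
}
```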
Basically it works, but sometimes Logstash goes crazy and consumes almost all CPU resources, which is particularly bad on the production systems. One reason for sure is the very complex grok filters.
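To give an idea of what I mean, a (made-up) filter like the following, with an unanchored pattern ending in GREEDYDATA, makes the regex engine backtrack heavily on lines that don't match, which is a well-known cause of grok CPU spikes:

```
filter {
  grok {
    # Hypothetical pattern: unanchored and ending in GREEDYDATA, so every
    # non-matching line triggers expensive backtracking before grok gives up.
    match => { "message" => "%{TIMESTAMP_ISO8601:ts} \[%{DATA:thread}\] %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
}
```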
To take some load off the production machines and make them immune to Logstash hiccups, I'm thinking about setting up a dedicated Logstash cluster.
So something like this:
Filebeat -> Logstash Cluster -> GELF -> Graylog
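On the Filebeat side that would roughly be (host names are made up):

```yaml
# filebeat.yml (sketch): load-balance across the Logstash cluster nodes
output.logstash:
  hosts: ["logstash1.example.com:5044", "logstash2.example.com:5044"]
  loadbalance: true
```

Each Logstash node would then run a plain beats input on port 5044 in front of the existing filters and the GELF output.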
Question is: Should I put a message broker, e.g. Kafka or RabbitMQ, in between? (There's a sketch of that variant at the end of this post.)
Background: I have lots of Logstash configuration files with complex filter definitions (grok, etc.), so the heavy filtering has to happen in Logstash either way.
I don't want to use Graylog extractors to do the (grok) filtering because
(a) there are lots of existing Logstash config files and it would be a huge effort to rewrite them as Graylog extractors, and
(b) I'd have to scale my Graylog cluster (currently 3 nodes) up to n nodes so that it can handle the additional grok filtering load. IMHO it's cleaner from an architectural point of view to have a separate Logstash cluster do this.
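For completeness, here is the broker variant I have in mind, with made-up Kafka host and topic names. Filebeat ships raw events to Kafka, and the Logstash cluster consumes from the topic:

```yaml
# filebeat.yml (sketch)
output.kafka:
  hosts: ["kafka1.example.com:9092"]
  topic: "filebeat-logs"
```

```
# Input on each Logstash cluster node (sketch)
input {
  kafka {
    bootstrap_servers => "kafka1.example.com:9092"
    topics => ["filebeat-logs"]
    group_id => "logstash"  # shared consumer group, so the nodes split the partitions
  }
}
```

That way Kafka would buffer bursts while the grok-heavy filters catch up, at the cost of one more moving part to operate.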