We are currently testing a Graylog cluster with 4 nodes (1 Frontend/WebUI & 3 backend DB nodes):
1 - HAProxy (frontend, not cutting it so far…) + Graylog WebUI
2 - Graylog/Elasticsearch/MongoDB
3 - Graylog/Elasticsearch/MongoDB
4 - Graylog/Elasticsearch/MongoDB
We are running into major messaging backlogs as HAProxy only binds to one of the backend nodes instead of evenly distributing the log load across all three backend nodes. The logs are VERY bursty with large bursts of logs in the thousands/tens of thousands…
What is the recommended architecture and/or best option for us to queue up these large bursts of log messages to prevent them from creating huge backlogs of message processing on a single backend node? HAProxy doesn’t seem to be doing the trick…
This is using TCP inputs, sending logs from rsyslog via TCP… we want TCP for reliability. We’ve tried HAProxy’s roundrobin lb method and also the leastconn lb method, and neither is working out.