I have more than 4 GL nodes and an ES cluster.
The problem is that I have more than 100 active streams. In the beginning I had only a few. As I needed more streams and pushed more logs into the system, I saw that it needed more CPU power, so I added more Graylog nodes. Now I am in a situation where adding a new GL node does not add much value, because every Graylog node needs to evaluate the regexes from all of those 100+ streams.
In the beginning each node was processing over 10000 msg/sec, and now it is processing ~1000 msg/sec (and it's not because of Elasticsearch: I tested by pausing all the streams but one, and the ingestion rate increased).
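To illustrate the effect, here is a minimal sketch, assuming every message is checked against one regex rule per stream. The stream names, patterns, and the `route` helper are made up for illustration and are not Graylog's actual code; the point is only that the per-message cost grows with the number of streams, no matter how many nodes there are.

```python
import re

def build_stream_rules(n_streams):
    # Hypothetical rules: one regex per stream, matching a made-up source name.
    return {f"stream-{i}": re.compile(rf"^app-{i}-\d+$") for i in range(n_streams)}

def route(message_source, rules):
    """Return (names of matching streams, number of regex evaluations performed)."""
    matched = []
    evaluations = 0
    for name, pattern in rules.items():
        evaluations += 1  # every stream's rule is tried for every message
        if pattern.search(message_source):
            matched.append(name)
    return matched, evaluations

few = build_stream_rules(5)
many = build_stream_rules(100)

matched_few, cost_few = route("app-3-42", few)
matched_many, cost_many = route("app-3-42", many)
print(matched_few, cost_few)    # same match, 5 evaluations
print(matched_many, cost_many)  # same match, 100 evaluations
```

The message ends up in the same stream in both cases, but with 100 streams it costs 20 times as many regex evaluations, and that cost is paid on whichever node handles the message.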
Is there an architectural consideration for this?
I was thinking of using a common Elasticsearch cluster with 2 or more Graylog clusters that are independent of each other.
Each Graylog cluster would use the same Mongo replica set, but of course a different database; likewise, each Graylog cluster would use a different set of Elasticsearch indices. In front of them I would put an Apache reverse proxy that points to each Graylog cluster (e.g. www.site.org/gl1/ for GL cluster 1 and www.site.org/gl2/ for GL cluster 2).
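A rough sketch of that layout, written in Python just to make the idea concrete. The backend hostnames, port, database names, and index prefixes are all assumptions for illustration, and the path mapping only mimics what the reverse proxy would do; it is not an Apache configuration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GraylogCluster:
    url_prefix: str      # path exposed by the reverse proxy
    backend: str         # hypothetical internal address of the cluster
    mongo_database: str  # database name inside the shared Mongo replica set
    index_prefix: str    # prefix of this cluster's Elasticsearch indices

# Hypothetical values; only the /gl1/ and /gl2/ paths come from the plan above.
CLUSTERS = [
    GraylogCluster("/gl1/", "http://gl1.internal:9000/", "graylog_gl1", "gl1"),
    GraylogCluster("/gl2/", "http://gl2.internal:9000/", "graylog_gl2", "gl2"),
]

def check_isolation(clusters):
    """True if no two clusters share a URL prefix, Mongo database, or index prefix."""
    for field in ("url_prefix", "mongo_database", "index_prefix"):
        values = [getattr(c, field) for c in clusters]
        if len(set(values)) != len(values):
            return False
    return True

def resolve_backend(path, clusters):
    """Map a request path to the backend URL the proxy would forward to, or None."""
    for c in clusters:
        if path.startswith(c.url_prefix):
            return c.backend + path[len(c.url_prefix):]
    return None

print(check_isolation(CLUSTERS))
print(resolve_backend("/gl1/search", CLUSTERS))
print(resolve_backend("/gl2/streams", CLUSTERS))
```

Each cluster would then only carry its own subset of the streams, so a message is evaluated against fewer regexes, while Mongo and Elasticsearch stay shared at the infrastructure level.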
Any ideas or hints?