Hi Everyone,
We’re currently running Graylog Open 6.x, and over time our log ingestion has grown significantly, from around 10 GB/month to roughly 100–200 GB/month today.
Our current setup is a single VM running all Graylog components (Graylog server, MongoDB, and OpenSearch). As expected, this setup is now struggling: we often see data pull timeouts, unresponsive VM instances, and missing logs.
Before we consider moving to the Graylog Enterprise License, I’d like to explore whether we can improve or re-architect our current setup using the Open version.
We’re planning to test a new setup with:
- 2 Graylog Open instances
- 4-node DataNode cluster
However, from what I understand, clustering multiple Graylog application nodes is only supported under the Enterprise version.
So my main questions are:
- Would this kind of setup (2 Graylog Open instances + 4 DataNodes) actually work in practice?
- If not, what’s the best way to scale or distribute the load using Graylog Open?
- Are there any proven setups or AWS best practices that can help stabilize performance before going Enterprise?
For context:
- Hosted on AWS
- Logs stored on EFS, currently under one EC2 instance
Any advice, references, or insights from those who’ve scaled Graylog Open would be greatly appreciated.
Thanks in advance!