It depends on what you want to do. We have a load balancer on the firewall, which makes it easy for me.
Well, that depends on your environment. Keep in mind whether you want to expand in the future or whether this is just for a few servers. To better answer this question, can you give us a mockup of the environment you will have?
This is a good read prior to setting up your Graylog servers.
Again, you might want to check the link above to fill in the gaps of what you want. Things to consider: Elasticsearch can be a resource hog, so it’s generally recommended to put Elasticsearch on its own node and Graylog/MongoDB together on a different node. You could start with 4 CPU / 4 GB RAM / 500 GB HDD and go from there. To be honest, it’s hard to answer these questions because there are many things to consider, and without knowing what your environment will look like (or does look like), I can’t give you a direct answer.
Thank you for taking the time to reply, really appreciate it.
I’m anticipating a huge amount of logs from a few thousand machines. For a start, I’ll be deploying 3 nodes and scaling as we go. Also, it seems to me that Elasticsearch scales best horizontally with small-to-medium machines.
Perhaps I should phrase my question in a different way.
What are some good reasons to separate Graylog and Graylog/MongoDB onto their own nodes, respectively?
There is really no need to separate Graylog from MongoDB.
Just a brief description of all three services: Graylog is your frontend, MongoDB stores settings/metadata, and Elasticsearch is where messages get indexed. Elasticsearch will use most of your resources.
As for the original question, I believe it was about separating each service onto its own node. This is done because ES will be using most of the resources, especially when executing searches, dashboards with widgets, etc. It also prevents ES and Graylog from competing for resources. I had that happen to me and it was not a pretty sight. This is also shown in the documentation.
My environment example:
I have a lab Graylog server that receives 28 million+ messages a day, which works out to around 30 GB a day of storage. The inputs used are GELF TCP/TLS.
This is all running on CentOS 7 with 12 vCPU, 12 GB RAM, and a 500 HDD. I retain 40 days of messages. I would show you pictures, but it’s the weekend and I’m BBQing.
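To put those lab numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the two figures quoted above (28 million messages and about 30 GB per day); the function names are made up for illustration, and treating 1 GB as 10**9 bytes is an assumption. It gives a daily average only, so real peaks will be higher.

```python
# Figures quoted in the post above. Decimal units (1 GB = 10**9 bytes)
# are an assumption, not something stated in the thread.
MESSAGES_PER_DAY = 28_000_000
STORAGE_PER_DAY_GB = 30
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(messages_per_day: int) -> float:
    """Average messages per second over a full day (ignores peaks)."""
    return messages_per_day / SECONDS_PER_DAY


def average_message_bytes(storage_gb: float, messages_per_day: int) -> float:
    """Average stored bytes per message."""
    return storage_gb * 10**9 / messages_per_day


if __name__ == "__main__":
    rate = average_rate(MESSAGES_PER_DAY)
    size = average_message_bytes(STORAGE_PER_DAY_GB, MESSAGES_PER_DAY)
    print(f"{rate:.0f} msg/s on average")
    print(f"{size:.0f} bytes per message on average")
```

You can plug in your own expected message count and daily volume to get a first feel for the load before picking node sizes.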
Yes, that would be correct, BUT again this depends on your environment. You can have two VMs: one for ES and one for GL/MongoDB.
That should be able to handle it pretty well. I can’t tell you for sure, since I don’t know how many nodes you have or how you want to configure Graylog.