Scaling Graylog

Hi all,

I’m planning a production setup of Graylog.
I saw the ‘Architectural considerations’ and would like to build something like that.

A few questions I have…

  1. How do I set up and configure the load balancer? Is it done through nginx/Apache, or does Graylog have its own load balancer? I can’t find any guide on it.

  2. Instead of splitting Graylog onto its own node/cluster/server, is it okay to have Graylog, MongoDB & Elasticsearch together on each node?

  3. How do you decide how much RAM to allocate for Graylog and MongoDB? I know how much to allocate for Elasticsearch from their docs.

Thank you!

@syntax
Hello,

Maybe I can answer your questions.

Yes, you can build a load balancer with nginx/Apache, but you can also set one up on an enterprise firewall or on a dedicated load-balancing device.

Here are a couple of examples.

FortiGate Firewall

https://docs.fortinet.com/document/fortigate/6.0.0/handbook/759836/server-load-balancing

Nginx

http://nginx.org/en/docs/http/load_balancing.html


It depends on what you want to do. We have a load balancer on our firewall, which makes it easy for me.
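To give a rough idea of what any of these load balancers does for you, here is a minimal Python sketch of round-robin distribution with unhealthy nodes skipped. This is only an illustration, not how nginx or a FortiGate is implemented, and the node addresses and the `RoundRobinBalancer` class are made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in turn, skipping any that are marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical Graylog nodes (host:port values are assumptions).
lb = RoundRobinBalancer(["graylog1:12201", "graylog2:12201", "graylog3:12201"])
picks = [lb.next_backend() for _ in range(4)]
print(picks)
# ['graylog1:12201', 'graylog2:12201', 'graylog3:12201', 'graylog1:12201']
```

A real load balancer adds health checks, timeouts and connection handling on top of this, which is why using nginx or a dedicated device is the practical choice.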

Well, that depends on your environment. Keep in mind whether you want to expand in the future or whether this is only for a few servers. To better answer this question, can you give us a mockup of the environment you will have?
This is a good read before setting up your Graylog servers.

https://docs.graylog.org/en/4.0/pages/getting_started/planning.html

It’s hard to say, but here are some questions.

  1. What kind of install method are you looking at?
  2. How many nodes will be sending logs to Graylog?
  3. What kind of input configuration are you looking at?

Again, you might want to check the link above to fill in the gaps of what you want. Things to consider: Elasticsearch can be a resource hog, so it is generally recommended to put Elasticsearch on its own node and Graylog/MongoDB together on a different node. You could start with 4 CPUs / 4 GB RAM / 500 GB HDD and go from there. To be honest, it’s hard to answer these questions because there are many things to consider, and without knowing what your environment will look like, I can’t give you a direct answer.

Hope that helps

Thank you for taking the time to reply, really appreciate it.
I’m anticipating a huge amount of logs from a few thousand machines. To start, I’ll be deploying 3 nodes and scaling as we go. Also, it seems to me that Elasticsearch scales best horizontally with small-to-medium machines.

Perhaps I should phrase my question in a different way.
What are some good reasons to separate graylog and graylog/mongodb on its own nodes respectively?

Hello,

There is really no need to separate Graylog from MongoDB.
Just a brief description of all three services: Graylog is your frontend, MongoDB stores settings/metadata, and Elasticsearch is where messages get indexed. Elasticsearch will use most of your resources.
As for the original question, I believe it was about separating each service onto its own node. This is done because ES will be using most of the resources, especially when executing searches, dashboards with widgets, etc. It also prevents ES and Graylog from competing for resources. I had that happen to me and it is not a pretty sight. This is the layout shown in the documentation.

My environment example:
I have a lab Graylog server that receives 28 million+ messages a day, which equals around 30 GB a day of storage. The inputs used are GELF TCP/TLS.
This all runs on CentOS 7 with 12 vCPUs, 12 GB RAM, and 500 HDD. I retain 40 days of messages. I would show you pictures but it’s the weekend and I’m BBQing :blush:
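To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It only uses the figures above (28 million messages and about 30 GB per day) and assumes 1 GB = 10^9 bytes. The `estimate_daily_storage_gb` helper is just an illustration, and it assumes your messages are about the same average size as mine, which may well not hold.

```python
MESSAGES_PER_DAY = 28_000_000   # from the lab server above
STORAGE_PER_DAY_GB = 30         # approximate, from the lab server above
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_GB = 10**9            # assumption: decimal gigabytes

# Average ingest rate in messages per second (peaks will be higher).
avg_rate = MESSAGES_PER_DAY / SECONDS_PER_DAY

# Average stored size per message in bytes.
avg_size_bytes = STORAGE_PER_DAY_GB * BYTES_PER_GB / MESSAGES_PER_DAY


def estimate_daily_storage_gb(messages_per_day, bytes_per_message=avg_size_bytes):
    """Scale the lab figures to another message volume (rough estimate only)."""
    return messages_per_day * bytes_per_message / BYTES_PER_GB


print(f"average rate: {avg_rate:.0f} msg/s")           # about 324 msg/s
print(f"average size: {avg_size_bytes:.0f} bytes/msg")  # about 1071 bytes
print(f"56M msgs/day: {estimate_daily_storage_gb(56_000_000):.0f} GB/day")
```

So this box averages a little over 300 messages per second at roughly 1 KB each. Treat it as a starting point for sizing, not a guarantee.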

Thanks gsmith.
I had a typo in my earlier question. I meant separating *Elasticsearch and Graylog+MongoDB onto their own nodes.

I take your point on preventing ES & Graylog from competing for resources.

In the diagram shown above, does that mean I’ll need 6 physical servers?

Hello,

Yes, that would be correct, BUT again this depends on your environment. You can have two VMs, one for ES and one for GL/MongoDB.
That would be able to handle things pretty well. I can’t tell you for sure since I don’t know how many nodes you have or how you want to configure Graylog.

Hope that helps