Cluster Infrastructure

GuillaumeDB · April 9, 2019, 9:39am

Hello

For 4 weeks now I have been running Graylog on a single virtual machine, with MongoDB and Elasticsearch on that same machine. I am on a work placement and one of my projects is to build an infrastructure with Graylog. I was thinking of starting with 4 virtual machines: 2 Graylog nodes and a cluster of 2 Elasticsearch servers. But to be honest I am a little lost. Is it possible to do what I described above? Is it a good idea? Do I need to install MongoDB on both Graylog servers? Do I need to deploy a replica set for MongoDB?

I don’t really know how to start. Later I will certainly add a load balancer, but for now I haven’t even started on the cluster. Nothing is urgent, but after some searching on the web I feel lost and confused about all of this.

It would be nice if someone could enlighten me.

Thanks for your help

macko003 · April 9, 2019, 1:34pm

First, calculate how many servers you need…
https://groups.google.com/forum/#!msg/graylog2/lSTKgFvEAyQ/lzUdCyH3AQAJ
After that…
http://docs.graylog.org/en/3.0/pages/configuration/multinode_setup.html
It contains a lot of useful information, and as far as I can see you have forgotten some points.

tmacgbay · April 9, 2019, 1:43pm

This link helped me create a MongoDB replica set. (Don’t do it out of order like I did.)

MongoDB replica instructions

GuillaumeDB · April 9, 2019, 1:52pm

Thanks for this help. I think my company’s needs will generate 10 - 50 GB of logs per day. So I need 2 Graylog servers and 2 ES servers too, according to your documentation. I install MongoDB on the Graylog servers, right? Then I just need to follow the instructions in order? (First MongoDB, then the ES cluster, then Graylog?)

Sorry, I’m not English, so there are a lot of things I don’t really understand well.

Thanks tmacgbay, I think it will be useful.

tmacgbay · April 9, 2019, 2:09pm

I would order it with ES first, then MongoDB, then Graylog. Graylog keeps its settings in MongoDB. You want ES and Mongo set up and their replication working first. If you want to keep your current data, it is possible to start from your current machine, set up replication for ES and MongoDB, and eventually drop your original machine after making sure you have moved all master functionality to your new machines. It will take a little research into MongoDB and ES for the details if that is the way you go.

macko003 · April 9, 2019, 2:19pm

What @tmacgbay wrote is true, but I suggest you RTFM first.

eg.

Most important is that you have an odd number of MongoDB servers in the replica set.

So what do you think?

I think if you spend a day collecting information you will have fewer “OHHH…” moments in the future, e.g. when your Mongo cluster with two members stops working because you restarted one of your servers.

GuillaumeDB · April 9, 2019, 2:27pm

@tmacgbay Thanks for your help. It is starting to become clearer in my idiot head. Just one thing: I will restart from the beginning and will not use my old machine. It was just to learn a little about how Graylog works.

@macko003 I don’t really see what you mean. Actually the part about MongoDB, especially the part that talks about an odd number of servers, is really fuzzy to me.

I have read the manual but this part gives me trouble.

macko003 · April 9, 2019, 2:37pm

If you install two Mongo servers, you will su…s.

GuillaumeDB · April 9, 2019, 2:45pm

But I will use 4 different servers and there is no plan to add one more, so what do you advise me to do?

tmacgbay · April 9, 2019, 3:05pm

Always RTFM - particularly with live data!

Two MongoDB (or ES) servers will work … they will improve consumption without improving resilience much. For better redundancy, both prefer a minimum of three. An odd number would allow you to carefully shut one down for maintenance or because of a problem without stopping the DB.

I don’t know what su…s is but I hope it never happens to me!
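To make the odd-number advice concrete, here is a minimal sketch of the majority rule that MongoDB replica sets use to elect a primary: more than half of the voting members must be reachable. The function names are made up for illustration, and the sketch ignores arbiters, priorities and non-voting members.

```python
def majority(members: int) -> int:
    """Votes needed to elect a primary: more than half of the voting members."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be down while a majority is still reachable."""
    return members - majority(members)


for n in (2, 3, 4, 5):
    print(f"{n} members: majority={majority(n)}, can lose {tolerated_failures(n)}")
```

With 2 members the set cannot lose any, and 4 members tolerate no more failures than 3, which is why an odd count is the usual recommendation.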

GuillaumeDB · April 9, 2019, 3:11pm

Oh OK, I see. Thank you very much, I understand better now. How long do you think a beginner needs to set up this kind of architecture?

tmacgbay · April 9, 2019, 3:32pm

How long do you think a beginner needs to set up this kind of architecture?

Too many unknown factors. xkcd: Success

benvanstaveren · April 9, 2019, 7:58pm

“It will be done when it’s done” I’d say

Personally (I’m a little biased), I prefer a setup with 3 Graylog servers, each machine running Graylog and a MongoDB instance (because 3 is nice and odd numbered), with inputs loadbalanced across the Graylogs (with Filebeat you can do this in the config of filebeat, with other inputs you need a TCP loadbalancer).

Then as many Elasticsearch nodes as you need to satisfy the storage requirements. If you ingest 50 GB of logs per day, you need 100 GB of storage capacity per day if you use 1 replica (advised), so now you have to consider how long you want to keep logs searchable. If you say 90 days, then you need 90 * 100 GB = +/- 9 TB worth of storage space. Ideally you then spread that across at least 3 data nodes so you can lose a node and still have the data.

Your Elasticsearch setup would then be: 3 master servers and 3 data nodes, for a total of 6 servers. Ideally on bare metal, with the masters needing 32 GB of memory and not that much disk space, and the data nodes needing about 64 GB of memory and either large SSD drives (the 1 TB type in RAID 0) or SATA drives (the SSDs obviously being much faster with their IO).

But that’s just my own personal preference for “how would I…”
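The storage estimate in the post above can be written down as a small calculation. This is only a sketch: the function name is made up, it treats indexed size as equal to ingested size, and it ignores any index overhead or headroom you would want on the disks.

```python
def required_storage_gb(daily_ingest_gb: float, replicas: int, retention_days: int) -> float:
    """Primary copy plus `replicas` extra copies, kept for `retention_days`."""
    return daily_ingest_gb * (1 + replicas) * retention_days


# Figures taken from the post: 50 GB/day, 1 replica, 90 days of retention.
total_gb = required_storage_gb(daily_ingest_gb=50, replicas=1, retention_days=90)
print(f"{total_gb:.0f} GB, about {total_gb / 1000:.0f} TB")
```

Changing the retention or the replica count shows quickly how the total scales; for example, 10 GB per day with the same settings comes to 1800 GB.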

macko003 · April 10, 2019, 6:23am

Everything is based on needs…
Usually I have to install geo-redundant clusters, so 3 Graylog nodes are not applicable in my case.

Installing a working cluster is not that much work; maybe a week is enough to play with everything. The harder part is setting up what you need in Graylog, e.g. pipelines, extractors, alerts, streams, rights, etc… That’s another 1-2 decades

benvanstaveren · April 10, 2019, 6:35am

Setting up pipelines, alerts, streams, user rights etc. actually just never stops, I found. There’s always something to change >.>

GuillaumeDB · April 10, 2019, 7:24am

@benvanstaveren @macko003 I haven’t even seen all of the features yet; I mean I don’t know how to use them all well. I think I have some work to do now.

benvanstaveren · April 10, 2019, 8:51am

The only bit of advice I can give you is to just set it up in a test setup (OVA image comes to mind as an easy way to get started, just don’t use it for production), throw some logs at it, and play with it until you’re comfortable (and have potentially discovered all features relevant for your use case).

There are also a lot of things on the forums here (search is your friend) where people have explained how they’ve done things, or asked questions about how to accomplish a certain thing, so… yeah. Play, read, fiddle, and then deploy in production at some point

GuillaumeDB · April 10, 2019, 3:05pm

Hi again !

There is something I don’t really understand about the configuration file for the Graylog nodes. In the “multinode setup” documentation it is written:

“After the installation of Graylog, you should take care that only one Graylog node is configured to be master with the configuration setting is_master = true .”

It seems to be a very light configuration. Is there something else to do, or just this? I don’t see how the nodes connect to each other and transfer data. I have looked for an answer on the web and in this forum but I haven’t really found what I want.

tmacgbay · April 10, 2019, 3:40pm

MongoDB replication handles keeping the Graylog servers in sync, the Elasticsearch replication keeps your log data in sync.

is_master = true - A configuration variable for “periodical and maintenance actions” not handled on slave GL servers. (per documentation here: http://docs.graylog.org/en/3.0/pages/installation/manual_setup.html)
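As a sanity check on the “only one master” rule, a small script could read each node’s configuration and report which nodes set is_master = true. The sketch below is an illustration only: the node names and config contents are hypothetical, the parser handles just plain key = value lines, and a missing setting is treated as “not master”, which is an assumption of this sketch and not necessarily Graylog’s default.

```python
def parse_conf(text: str) -> dict[str, str]:
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def master_nodes(configs: dict[str, str]) -> list[str]:
    """Return the names of nodes whose config sets is_master = true."""
    return [
        name
        for name, text in configs.items()
        # Assumption: a node without the setting is counted as non-master.
        if parse_conf(text).get("is_master", "").lower() == "true"
    ]


# Hypothetical two-node setup; names and contents are made up for the example.
configs = {
    "graylog-1": "is_master = true\n",
    "graylog-2": "# second node\nis_master = false\n",
}
masters = master_nodes(configs)
print(masters)
if len(masters) != 1:
    print("Expected exactly one master node")
```

In a real deployment the texts would come from each node’s server configuration file rather than from an inline dictionary.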

GuillaumeDB · April 10, 2019, 3:47pm

@tmacgbay Thank you. So everything in the infrastructure is connected, and my Graylog nodes just depend on how my servers are set up. I mean there are no nodes without the ES cluster and the MongoDB replica set.
