Will messages be lost when changing from standalone to a cluster?

limanzhang · November 16, 2020, 1:46pm

We currently have only one standalone node in our environment. I am upgrading Graylog, MongoDB, and Elasticsearch because they are very old. Here is my plan:

1. Clone the production server graylog01 to graylog02
2. Upgrade Graylog to 3.3, MongoDB to 4.4, and Elasticsearch to 6.0 on graylog02
3. Change the rsyslog settings on our servers to send their logs to graylog02
4. Upgrade graylog01
5. Join graylog01 and graylog02 into a cluster

My question is: if I do it this way, will the data be messy or lost because the two nodes are not synchronized with each other? Or can they be synchronized after the cluster is set up?

Thank you in advance!

cawfehman · November 19, 2020, 4:33pm

Why are you clustering? Need redundancy or capacity on the front end or the back end?

MongoDB and Elasticsearch both recommend that no cluster be built with fewer than 3 nodes. So if you are going from a standalone to a “cluster”, perhaps consider going to 3 nodes.

Based on your needs, there may be other options as well.

limanzhang · November 20, 2020, 7:59am

I am clustering for redundancy. When I was upgrading MongoDB and Elasticsearch, I found that some data was lost during the upgrade. With a cluster, I can do a rolling upgrade.

cawfehman · November 20, 2020, 2:29pm

Cool… so for input redundancy, you technically only need a Graylog cluster. ES can still be a single node. I do not recommend this, but it would work.

This would work mainly because if you need to upgrade ES (which is a single node), the messages will queue up on Graylog until the upgrade is complete. So in essence, you don’t lose a message, but you cannot search for anything either. If this is an OK scenario, then just build the Graylog cluster with a MongoDB replica set, and point it at the ES node on the back end. Optionally, put a load balancer in front of the Graylog cluster.
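
As a rough illustration of "messages will queue up on Graylog": that queue is Graylog's on-disk message journal, which is bounded by size and age, so it only covers an ES outage of limited length. Here is a minimal Python sketch for estimating whether a planned upgrade window fits. The message rate, average message size, and outage length are hypothetical numbers, the 5 GiB / 12 h limits are what I believe the defaults for `message_journal_max_size` and `message_journal_max_age` to be (check your own server.conf), and the estimate ignores journal overhead.

```python
GIB = 1024 ** 3


def journal_bytes_needed(msgs_per_sec, avg_msg_bytes, outage_seconds):
    """Rough number of bytes that pile up while Elasticsearch is unavailable."""
    return msgs_per_sec * avg_msg_bytes * outage_seconds


def outage_fits_journal(msgs_per_sec, avg_msg_bytes, outage_seconds,
                        max_size_bytes=5 * GIB, max_age_seconds=12 * 3600):
    """True if the backlog stays within both the size and the age limit.

    The default limits are assumptions; use the values from your own config.
    """
    needed = journal_bytes_needed(msgs_per_sec, avg_msg_bytes, outage_seconds)
    return needed <= max_size_bytes and outage_seconds <= max_age_seconds


if __name__ == "__main__":
    # Hypothetical load: 500 msg/s, ~1000 bytes each, 30 minute ES upgrade.
    needed = journal_bytes_needed(500, 1000, 30 * 60)
    print(f"backlog: {needed / GIB:.2f} GiB")
    print("fits:", outage_fits_journal(500, 1000, 30 * 60))
```

If the estimate does not fit, either shorten the upgrade window or raise the journal limits before you start; once the journal is full or too old, messages can be dropped.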

Obviously, any single node is not redundant if something were to happen to the hardware or system that it is running on, so if full true redundancy is something you want, you’ll need to make both redundant by clustering them.

So a better (and more forward-thinking) solution in your case, based on my understanding, is to set up a 3 node Graylog cluster leveraging a MongoDB replica set, plus a 3 node Elasticsearch cluster, and then perhaps put a load balancer in front of the Graylog cluster.
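
For illustration only, the Graylog side of that layout could look roughly like this in each node's server.conf on Graylog 3.x. The hostnames, the replica set name `rs01`, and the database name are hypothetical placeholders, so adapt them to your environment; this is a sketch, not a complete config.

```
# Exactly one node in the cluster should have is_master = true;
# set it to false on the other Graylog nodes.
is_master = true

# All Graylog nodes point at the same MongoDB replica set
# (hypothetical hosts and replica set name).
mongodb_uri = mongodb://graylog01:27017,graylog02:27017,graylog03:27017/graylog?replicaSet=rs01

# List every Elasticsearch node so Graylog can keep writing if one goes down
# (hypothetical hosts).
elasticsearch_hosts = http://es01:9200,http://es02:9200,http://es03:9200

# Keep the on-disk journal enabled so messages queue up while ES is unavailable.
message_journal_enabled = true
```

With a single ES node, `elasticsearch_hosts` would just list that one host, and the journal is what carries you through its upgrade.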

Also, you can have the same 3 hosts simultaneously be part of both the Graylog/MongoDB cluster and the ES Cluster, but unless you are tight on resources… I’m not a fan of this design.

limanzhang · November 25, 2020, 1:37pm

So you mean that if I set up a Graylog cluster and then upgrade MongoDB and Elasticsearch, no messages will be lost as long as Graylog is still running? Is my understanding correct?

cawfehman · November 25, 2020, 1:50pm

Sort of… that’s overly simplified and, as I mentioned, only provides redundancy on the ingest side. If you lose ES, you could still lose messages. Also, if you’re not using a load balancer and global inputs, you could lose messages if one of your GL nodes goes down.

