Cluster Infrastructure

benvanstaveren · April 10, 2019, 4:16pm

Depends on how you set it up. At my company we have 3 physical machines that just run the Graylog server process, plus a mongo daemon each to form a replica set, and then 25 (!) physical machines that form our ES cluster.

Then again we have some hilarious retention requirements so we’re sort of “way out there” as far as infrastructure goes.

tmacgbay · April 10, 2019, 6:22pm

25 physical machines that form our ES cluster … hilarious retention requirements

benvanstaveren · April 10, 2019, 8:57pm

Okay so… just for shits and giggles.

3 dedicated master nodes
3 dedicated routing nodes
2 cold storage (40TB/piece) ES nodes
14 warm storage (6TB/piece) ES nodes
3 hot storage/indexing (2TB/piece, SSD) ES nodes
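
As a quick sanity check on the list above, here is a minimal Python sketch that tallies the node count and the raw disk capacity of the data tiers. The role names are just labels chosen for illustration, and the master and routing nodes are left without a disk size because the post does not give one.

```python
# (node count, disk TB per node); None where the post gives no disk size.
CLUSTER = {
    "master": (3, None),
    "routing": (3, None),
    "cold": (2, 40),
    "warm": (14, 6),
    "hot": (3, 2),
}

# Total number of ES machines across all roles.
total_nodes = sum(count for count, _ in CLUSTER.values())

# Raw capacity of the data tiers only (before replicas and filesystem overhead).
raw_capacity_tb = sum(
    count * tb for count, tb in CLUSTER.values() if tb is not None
)

print(total_nodes, raw_capacity_tb)
```

The total matches the 25 machines mentioned earlier; the capacity figure is raw disk, not usable index space.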

All indexing happens on the hot nodes (i.e. new indices are allocated to them automatically). Graylog’s index retention is turned off (“do nothing”), as well as index optimisation. All indices are set to rotate on a P1D (one day) period.

Since it’s ES 6.7, we use the new Index Lifecycle Management (ILM) to move any index older than 36 hours (after Graylog has rotated it away) to a warm node, where it stays available for 60 days (90 days for certain index sets). After that, ILM moves it to the cold storage nodes, reduces the replica count to 0, and freezes it. We keep indices there until we need to clear up disk space, at which point we take a snapshot of a few indices to S3 and then delete them.
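
For readers who want to picture what such a policy might look like, here is a rough Python sketch that builds an ILM policy body with a warm and a cold phase. It is not the actual policy from this cluster: the node attribute name `box_type` and the function `build_ilm_policy` are hypothetical, and the `min_age` values are simply taken from the numbers in the post. ILM measures `min_age` from index creation when rollover is not used, so the real thresholds would likely need adjusting. The snapshot-and-delete step is described as manual, so no delete phase is included.

```python
import json


def build_ilm_policy(warm_after="36h", cold_after="60d", attr="box_type"):
    """Build an ILM policy body (hypothetical attribute name and ages)."""
    return {
        "policy": {
            "phases": {
                "warm": {
                    "min_age": warm_after,
                    # Relocate shards to nodes tagged as warm.
                    "actions": {"allocate": {"require": {attr: "warm"}}},
                },
                "cold": {
                    "min_age": cold_after,
                    "actions": {
                        # Relocate to cold nodes and drop all replicas.
                        "allocate": {
                            "number_of_replicas": 0,
                            "require": {attr: "cold"},
                        },
                        # Freeze the index to reduce its memory footprint.
                        "freeze": {},
                    },
                },
            }
        }
    }


policy = build_ilm_policy()
print(json.dumps(policy, indent=2))
```

The resulting JSON would be sent to the `_ilm/policy/<name>` endpoint and attached to indices through an index template; both of those steps are left out here.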

The reason is that in our business sector we often need to be able to pull up historical data at will, and we need to retain data for at least 5 years. Our current setup lets us satisfy that requirement with minimal effort due to being able to un-freeze an index and have it available in under 15 minutes. Snapshot restores take too long sometimes.

This whole setup ingests, as of today, about 5000 msg/sec continuously, for a little over 120GB of data daily. Since we run with 2 replicas on important indices (because we need the availability guaranteed), that ends up being 360GB of storage space required in the cluster, daily. We still have to roll out to 3000+ devices, so by the time all is said and done we may even have to expand the cluster to keep our “data that needs to be live” requirement intact.
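
The arithmetic behind those numbers can be sketched in a few lines of Python. The function names are made up for illustration, decimal gigabytes (10^9 bytes) are assumed, and the 360GB figure is an upper bound because only the important indices carry 2 replicas.

```python
SECONDS_PER_DAY = 86_400


def daily_cluster_storage_gb(primary_gb, replicas):
    """Disk used per day: one primary copy plus each replica copy."""
    return primary_gb * (1 + replicas)


def avg_message_bytes(daily_gb, msgs_per_sec):
    """Average stored size per message, assuming 1 GB = 10**9 bytes."""
    return daily_gb * 1e9 / (msgs_per_sec * SECONDS_PER_DAY)


print(daily_cluster_storage_gb(120, 2))       # 120 GB x 3 copies
print(round(avg_message_bytes(120, 5000)))    # bytes per message, roughly
```

With 120GB/day and 2 replicas the first call gives 360, and the second works out to a bit under 280 bytes per message.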

Fun times

GuillaumeDB · April 11, 2019, 7:27am

This is funny, and I can see what we can do with ES, MongoDB and Graylog, but I don’t think I’ll ever do that, at least not for now. My work placement lasts 4 months (3 months left now), and I can’t spend 100% of my time on Graylog because I have many other projects. But who knows? Maybe one day I will set up 25 machines too, haha

benvanstaveren · April 11, 2019, 2:47pm

Ah I see! Okay, well, who knows. I’ll say though that knowing Graylog and all its supporting bits and pieces (mongo, Elastic) is a good thing

GuillaumeDB · April 12, 2019, 9:08am

I will start setting up the infrastructure today or Monday. I just want to know if someone has already set up a 2-member replica set on MongoDB? If I understand correctly, there is no arbiter with 2 members? What do I have to change in the configuration compared to a 3-member replica set?
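
On the 2-member question, the relevant arithmetic is the election majority: a replica set needs a strict majority of its voting members to elect a primary. A minimal Python sketch of that rule follows (the function names are illustrative, not MongoDB API calls):

```python
def majority(voting_members):
    """Votes needed to elect a primary: strictly more than half."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many voting members can be down while a primary can still be elected."""
    return voting_members - majority(voting_members)


for n in (2, 3):
    print(n, majority(n), fault_tolerance(n))
```

With 2 members both must be up to elect a primary, so losing either one leaves the set without a writable primary; with 3 members one can be lost. That is why a third voting member (a data node or an arbiter) is usually added.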

tmacgbay · April 12, 2019, 1:34pm

The mongodb replica instructions I posted above should give you what you need to add another member…

GuillaumeDB · April 19, 2019, 9:48am

Hello, did you have a problem restarting the mongod service after changing the configuration files?

At this step in the documentation:

" On each of your Linodes, make the following changes to your /etc/mongod.conf file:

/etc/mongod.conf

net:
  port: 27017
  bindIp: 127.0.0.1,192.0.2.1

security:
  keyFile: /opt/mongo/mongo-keyfile

replication:
  replSetName: rs0

Once you’ve made these changes, restart the mongod service:

sudo systemctl restart mongod"

Thanks for the help

tmacgbay · April 19, 2019, 1:49pm

I am not sure what you are asking… but I did have problems with bindIp. It is very particular about which address it listens on. I ended up putting 0.0.0.0 …which is a security issue - we balance the risk.
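
To illustrate the trade-off, here is a small Python sketch that splits a `bindIp` value into its entries and flags the case where mongod would listen on every interface. The helper names are hypothetical and this only inspects the string; it does not talk to MongoDB.

```python
import ipaddress


def parse_bind_ip(value):
    """Split a comma-separated bindIp value into trimmed entries."""
    return [part.strip() for part in value.split(",") if part.strip()]


def exposes_all_interfaces(value):
    """True if any entry is an unspecified address such as 0.0.0.0 or ::."""
    for entry in parse_bind_ip(value):
        try:
            if ipaddress.ip_address(entry).is_unspecified:
                return True
        except ValueError:
            # bindIp may also hold hostnames or socket paths; skip those here.
            continue
    return False


print(parse_bind_ip("127.0.0.1,192.0.2.1"))
print(exposes_all_interfaces("127.0.0.1,192.0.2.1"))
print(exposes_all_interfaces("0.0.0.0"))
```

Listing the loopback address plus the node's own address keeps the listener narrow; 0.0.0.0 works everywhere but relies on the firewall and authentication to limit access.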

GuillaumeDB · April 23, 2019, 7:12am

That was what I was asking, thanks. But unfortunately I still have the problem. Can’t fix it for now

GuillaumeDB · April 23, 2019, 8:13am

To explain my problem: I’m trying to set up a MongoDB replica set with 3 members. On each node I install MongoDB and configure the file /etc/mongod.conf. On the node which will be the primary I also create an administrative user. When I restart the service, only my primary server starts and the others return:
Job for mongod.service failed because the control process exited with error code. See “systemctl status mongod.service” and “journalctl -xe” for details.

I wanted to check the MongoDB log but there is nothing in the file, and journalctl -xe doesn’t really give me any information.

Edit: I also opened the corresponding port, disabled SELinux and applied some permissions

tmacgbay · April 23, 2019, 2:15pm

Need more details… always post details of what you have done, tried and looked at…

Are the other servers set properly in your host file?
Did you rs.initiate() the primary? Results?
Have you executed rs.add("<server2>")? Results?
What is the output of rs.conf() and rs.status() ?
Do you see anything in /var/log/mongodb/mongodb.log?

GuillaumeDB · April 23, 2019, 2:31pm

Hi, thanks for the reply

After many restarts and many reinstalls of MongoDB, it now works well, even though I didn’t change anything in how I installed the service and entered the configuration…

rs.status() returns this:

"set" : "rs_mongo0",
        "date" : ISODate("2019-04-23T14:30:09.707Z"),
        "myState" : 1,
        "term" : NumberLong(1),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "srv1-gl-gdb.crtinformatique.local:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 24038,
                        "optime" : {
                                "ts" : Timestamp(1556026325, 1),
                                "t" : NumberLong(1)
                        },
                        "optimeDate" : ISODate("2019-04-23T13:32:05Z"),
                        "electionTime" : Timestamp(1556010390, 2),
                        "electionDate" : ISODate("2019-04-23T09:06:30Z"),
                        "configVersion" : 3,
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "srv2-gl-gdb:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 19169,
                        "optime" : {
                                "ts" : Timestamp(1556026325, 1),
                                "t" : NumberLong(1)
                        },
                        "optimeDate" : ISODate("2019-04-23T13:32:05Z"),
                        "lastHeartbeat" : ISODate("2019-04-23T14:30:09.245Z"),
                        "lastHeartbeatRecv" : ISODate("2019-04-23T14:30:09.244Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "srv1-gl-gdb.crtinformatique.local:27017",
                        "configVersion" : 3
                },
                {
                        "_id" : 2,
                        "name" : "srv1-es-gdb:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 18934,
                        "optime" : {
                                "ts" : Timestamp(1556026325, 1),
                                "t" : NumberLong(1)
                        },
                        "optimeDate" : ISODate("2019-04-23T13:32:05Z"),
                        "lastHeartbeat" : ISODate("2019-04-23T14:30:09.036Z"),
                        "lastHeartbeatRecv" : ISODate("2019-04-23T14:30:09.035Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "srv2-gl-gdb:27017",
                        "configVersion" : 3
                }
        ],
        "ok" : 1

I guess it works?
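
One way to read that output is sketched below in Python: exactly one PRIMARY, every member reporting health 1, and no member stuck in a transitional state. The function `check_replica_set` is hypothetical, and the sample dict is a trimmed copy of the output above rather than a live query.

```python
def check_replica_set(status):
    """Return (ok, problems) for a dict shaped like rs.status() output."""
    problems = []
    members = status.get("members", [])

    # A healthy set has exactly one primary.
    primaries = [m["name"] for m in members if m.get("stateStr") == "PRIMARY"]
    if len(primaries) != 1:
        problems.append(f"expected exactly 1 PRIMARY, found {len(primaries)}")

    for m in members:
        # health 1 means the member is reachable from the node queried.
        if m.get("health") != 1:
            problems.append(f"{m['name']} reports health {m.get('health')}")
        # States such as STARTUP2 or RECOVERING are flagged as not settled.
        if m.get("stateStr") not in ("PRIMARY", "SECONDARY", "ARBITER"):
            problems.append(f"{m['name']} is in state {m.get('stateStr')}")

    return (not problems, problems)


# Trimmed copy of the rs.status() output posted above.
status = {
    "set": "rs_mongo0",
    "members": [
        {"name": "srv1-gl-gdb.crtinformatique.local:27017",
         "health": 1, "stateStr": "PRIMARY"},
        {"name": "srv2-gl-gdb:27017", "health": 1, "stateStr": "SECONDARY"},
        {"name": "srv1-es-gdb:27017", "health": 1, "stateStr": "SECONDARY"},
    ],
}

ok, problems = check_replica_set(status)
print(ok, problems)
```

By that reading the posted status is fine: one primary, two secondaries, all healthy. In the full output the optime is also identical on all three members, which suggests the secondaries are caught up.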

tmacgbay · April 23, 2019, 2:48pm

I am no expert but it looks good to me. There was a method for testing replication in the original link I sent if you want to double check that it is replicating OK.

GuillaumeDB · April 23, 2019, 2:52pm

Yes indeed! I didn’t tell you about this test but it was OK.

I am now trying to connect ES and MongoDB to Graylog in graylog.conf.

Are there only 2 lines for this? One for ES (elasticsearch_hosts) and one for MongoDB (mongodb_uri)?

According to the official Graylog multi-node documentation there are only these two
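
As a rough illustration of what those two values might look like for this setup, here is a Python sketch that assembles them. The MongoDB host names and the replica set name are taken from the rs.status() output above; the database name `graylog`, the Elasticsearch host names `es-node-1` and `es-node-2`, and the helper functions are assumptions made for the example.

```python
def build_mongodb_uri(hosts, replica_set, database="graylog", port=27017):
    """Join replica set members into a single MongoDB connection string."""
    host_list = ",".join(f"{host}:{port}" for host in hosts)
    return f"mongodb://{host_list}/{database}?replicaSet={replica_set}"


def build_elasticsearch_hosts(hosts, port=9200, scheme="http"):
    """Join Elasticsearch node URLs into a comma-separated list."""
    return ",".join(f"{scheme}://{host}:{port}" for host in hosts)


mongodb_uri = build_mongodb_uri(
    ["srv1-gl-gdb.crtinformatique.local", "srv2-gl-gdb", "srv1-es-gdb"],
    replica_set="rs_mongo0",
)
# Hypothetical ES host names; replace with the real nodes.
elasticsearch_hosts = build_elasticsearch_hosts(["es-node-1", "es-node-2"])

print(f"mongodb_uri = {mongodb_uri}")
print(f"elasticsearch_hosts = {elasticsearch_hosts}")
```

Listing every replica set member in the URI lets the driver find the current primary after a failover, and every Graylog node in the cluster would carry the same two values.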

GuillaumeDB · April 25, 2019, 10:00am

Hello,

Everything is set up, but I got an error concerning the Graylog deflector. After taking a look on the forum I tried this command:

curl -XDELETE http://localhost:9200/graylog_*/

and I added this line to the Elasticsearch configuration file:

action.auto_create_index: false

After this my Elasticsearch cluster became green (it was yellow before), but I still have the problem with the deflector. There was nothing under indexer errors before I executed these commands. Now I get this:

2 minutes ago graylog_deflector 91344300-6740-11e9-8366-005056952112 “IndexMissingException[[graylog_deflector] missing]”

Can someone help me with this? Thanks

system · May 9, 2019, 10:08am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
