Cluster Infrastructure

It depends on how you set it up. At my company we have 3 physical machines that run only the Graylog server process plus a mongod each (the three mongod instances form a replica set), and then 25 (!) physical machines that form our ES cluster.

Then again we have some hilarious retention requirements so we’re sort of “way out there” as far as infrastructure goes.

25 physical machines that form our ES cluster … hilarious retention requirements

Okay so… just for shits and giggles.

3 dedicated master nodes
3 dedicated routing nodes
2 cold storage (40TB/piece) ES nodes
14 warm storage (6TB/piece) ES nodes
3 hot storage/indexing (2TB/piece, SSD) ES nodes

All indexing happens on the hot nodes (i.e. new indices are allocated to them automatically). Graylog’s index retention is turned off (“do nothing”), as is index optimisation. All indices are set to rotate on a P1D (one day) period.

Since it’s ES 6.7, we use the new Index Lifecycle Management (ILM) to move any index older than 36 hours (so after Graylog has rotated it away) to a warm node, where it stays available for 60 days (90 days for certain index sets). After that, ILM moves it to the cold storage nodes, reduces the replica count to 0, and freezes it. We keep indices there until we need to clear up disk space, at which point we take a snapshot of a few indices to S3 and then delete them.
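
To make the lifecycle described above easier to picture, here is a rough sketch of what such an ILM policy body could look like, built as a Python dict and printed as JSON. It is not the actual policy from this cluster. The node attribute name `box_type` and its values are assumptions (they depend on how the nodes are tagged in elasticsearch.yml), and since `min_age` counts from index creation, the exact ages would need adjusting to match the 36 hour and 60 day figures.

```python
import json


def build_ilm_policy(warm_after="36h", cold_after="60d", attr="box_type"):
    """Return an ILM policy body that moves indices to warm, then cold nodes.

    `attr` is a hypothetical custom node attribute used for shard allocation.
    """
    return {
        "policy": {
            "phases": {
                "warm": {
                    "min_age": warm_after,
                    # Relocate shards to nodes tagged as warm.
                    "actions": {"allocate": {"require": {attr: "warm"}}},
                },
                "cold": {
                    "min_age": cold_after,
                    "actions": {
                        # Relocate to cold nodes and drop all replicas.
                        "allocate": {
                            "number_of_replicas": 0,
                            "require": {attr: "cold"},
                        },
                        # Freeze the index to keep its memory footprint low.
                        "freeze": {},
                    },
                },
            }
        }
    }


policy = build_ilm_policy()
print(json.dumps(policy, indent=2))
```

A body like this would be sent with `PUT _ilm/policy/<name>`, and the policy then has to be attached to the indices through the `index.lifecycle.name` index setting.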

The reason is that in our business sector we often need to be able to pull up historical data at will, and we need to retain data for at least 5 years. Our current setup lets us satisfy that requirement with minimal effort due to being able to un-freeze an index and have it available in under 15 minutes. Snapshot restores take too long sometimes.

This whole setup ingests, as of today, about 5000 msg/sec continuously, for a little over 120 GB of data daily. Since we run with 2 replicas on important indices (because we need the availability guaranteed), every primary shard is stored three times, so that ends up being about 360 GB of storage space required in the cluster, daily. We still have to roll out to 3000+ devices, so by the time all is said and done we may even have to expand the cluster to keep our “data that needs to be live” requirement intact.
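
The storage arithmetic above can be written out as a small sketch. It assumes, for simplicity, that every index runs with 2 replicas (the post only says the important ones do) and rounds the daily volume down to 120 GB.

```python
def daily_cluster_storage_gb(primary_gb_per_day, replicas):
    """Daily on-disk footprint: one primary copy plus one copy per replica."""
    return primary_gb_per_day * (1 + replicas)


daily = daily_cluster_storage_gb(120, replicas=2)
print(daily)        # 360
print(daily * 60)   # 21600, i.e. roughly 21.6 TB for a 60-day warm window
```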

Fun times :slight_smile:


This is funny, and I can see what can be done with ES, MongoDB and Graylog, but I don’t think I’ll ever do that, at least not for now. My work placement lasts 4 months (3 months left now), and I can’t spend 100% of my time on Graylog because I have many other projects. But who knows? Maybe one day I will set up 25 machines too, haha.

Ah I see! Okay, well, who knows :slight_smile: I’ll say though that knowing Graylog and all its supporting bits and pieces (mongo, Elastic) is a good thing :slight_smile:

I will start setting up the infrastructure today or Monday. I just want to know whether anyone has already set up a 2-member replica set on MongoDB. If I understand correctly, there is no arbiter with 2 members? What do I have to change in the configuration compared to a 3-member replica set?
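
As background for the 2-member question: a replica set can only elect a primary while a majority of its voting members is reachable. The sketch below just works out that majority rule for 2 and 3 members. The function names are made up for illustration.

```python
def majority(voting_members):
    """Votes needed to elect a primary."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many voting members can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)


for n in (2, 3):
    print(f"{n} members: majority={majority(n)}, can lose {fault_tolerance(n)}")
```

With 2 members the majority is 2, so losing either node leaves the set without a primary. That is why a third data member or an arbiter is usually added.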

The mongodb replica instructions I posted above should give you what you need to add another member…

Hello, did you run into a problem when restarting the mongod service after changing the configuration files?

It is this step in the documentation:

" On each of your Linodes, make the following changes to your /etc/mongod.conf file:


net:
  port: 27017

security:
  keyFile: /opt/mongo/mongo-keyfile

replication:
  replSetName: rs0

Once you’ve made these changes, restart the mongod service:

sudo systemctl restart mongod"

Thanks for the help.

I am not sure what you are asking… but I did have problems with the bindIP. It is very particular with what port it is listening to. I ended up putting …which is a security issue - we balance the risk.

That was what I was asking, thanks. Unfortunately I still have the problem and can’t fix it for now.

To explain my problem: I’m trying to set up a MongoDB replica set with 3 members. On each node I installed MongoDB and configured /etc/mongod.conf. On the node that will be the primary I also created an administrative user. When I restart the service, only my primary server starts; the others return:
Job for mongod.service failed because the control process exited with error code. See "systemctl status mongod.service" and "journalctl -xe" for details.

I wanted to check the MongoDB log, but there is nothing in the file. journalctl -xe doesn’t give me any useful information either.

Edit: I also opened the corresponding port, disabled SELinux and adjusted some file permissions.
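
The thread never establishes what was wrong on the secondaries. One thing that may be worth ruling out when mongod refuses to start with a keyFile configured is the key file's permissions: on Unix systems mongod rejects a key file that grants any access to group or others, and the file must also be readable by the user mongod runs as. A minimal check, assuming the path from the documentation quoted above (the helper name is made up):

```python
import os
import stat


def keyfile_permissions_ok(path):
    """True if the file grants no permissions at all to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Path taken from the quoted documentation; adjust to your own setup.
    path = "/opt/mongo/mongo-keyfile"
    if os.path.exists(path):
        print(path, "ok" if keyfile_permissions_ok(path) else "too open")
    else:
        print(path, "not found")
```

A mode such as 400 or 600 passes this check, while 644 does not.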

Need more details… always post details of what you have done, tried and looked at…

  • Are the other servers set properly in your hosts file?
  • Did you rs.initiate() the primary? Results?
  • Have you executed rs.add("<server2>")? Results?
  • What is the output of rs.conf() and rs.status() ?
  • Do you see anything in /var/log/mongodb/mongodb.log?

Hi, thanks for the reply

After many restarts and several reinstalls of MongoDB, it now works, although I didn’t change anything in how I installed the service or entered the configuration…

rs.status() returns this:

{
        "set" : "rs_mongo0",
        "date" : ISODate("2019-04-23T14:30:09.707Z"),
        "myState" : 1,
        "term" : NumberLong(1),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "srv1-gl-gdb.crtinformatique.local:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 24038,
                        "optime" : {
                                "ts" : Timestamp(1556026325, 1),
                                "t" : NumberLong(1)
                        },
                        "optimeDate" : ISODate("2019-04-23T13:32:05Z"),
                        "electionTime" : Timestamp(1556010390, 2),
                        "electionDate" : ISODate("2019-04-23T09:06:30Z"),
                        "configVersion" : 3,
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "srv2-gl-gdb:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 19169,
                        "optime" : {
                                "ts" : Timestamp(1556026325, 1),
                                "t" : NumberLong(1)
                        },
                        "optimeDate" : ISODate("2019-04-23T13:32:05Z"),
                        "lastHeartbeat" : ISODate("2019-04-23T14:30:09.245Z"),
                        "lastHeartbeatRecv" : ISODate("2019-04-23T14:30:09.244Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "srv1-gl-gdb.crtinformatique.local:27017",
                        "configVersion" : 3
                },
                {
                        "_id" : 2,
                        "name" : "srv1-es-gdb:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 18934,
                        "optime" : {
                                "ts" : Timestamp(1556026325, 1),
                                "t" : NumberLong(1)
                        },
                        "optimeDate" : ISODate("2019-04-23T13:32:05Z"),
                        "lastHeartbeat" : ISODate("2019-04-23T14:30:09.036Z"),
                        "lastHeartbeatRecv" : ISODate("2019-04-23T14:30:09.035Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "srv2-gl-gdb:27017",
                        "configVersion" : 3
                }
        ],
        "ok" : 1
}
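
One way to read output like this at a glance is to check that there is exactly one PRIMARY and that every member reports health 1. The sketch below does that on a hand-copied subset of the fields above. The function name is made up, and in practice the dict would come from whatever driver or tooling you use to run replSetGetStatus.

```python
def summarize_replica_set(status):
    """Return (is_healthy, problems) for a replSetGetStatus-style dict."""
    problems = []
    members = status.get("members", [])
    primaries = [m["name"] for m in members if m.get("stateStr") == "PRIMARY"]
    if len(primaries) != 1:
        problems.append(f"expected exactly 1 PRIMARY, found {len(primaries)}")
    for m in members:
        if m.get("health") != 1:
            problems.append(f"{m['name']} is unhealthy")
        elif m.get("stateStr") not in ("PRIMARY", "SECONDARY", "ARBITER"):
            problems.append(f"{m['name']} is in state {m.get('stateStr')}")
    return (not problems, problems)


# Subset of the rs.status() output posted above, copied by hand.
status = {
    "set": "rs_mongo0",
    "members": [
        {"name": "srv1-gl-gdb.crtinformatique.local:27017", "health": 1, "stateStr": "PRIMARY"},
        {"name": "srv2-gl-gdb:27017", "health": 1, "stateStr": "SECONDARY"},
        {"name": "srv1-es-gdb:27017", "health": 1, "stateStr": "SECONDARY"},
    ],
}
print(summarize_replica_set(status))  # (True, [])
```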

I guess it works?

I am no expert but it looks good to me. There was a method for testing replication in the original link I sent if you want to double check that it is replicating OK.

Yes indeed! I didn’t mention it, but I ran that test and it was OK.

Now I’m trying to connect ES and MongoDB to Graylog in graylog.conf.

Are there only 2 settings for this? One for ES, elasticsearch_hosts, and one for MongoDB, mongodb_uri.

According to the official Graylog multi-node documentation, these are the only two.
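
For illustration, the two values could be assembled like this. The MongoDB host names and the replica set name are taken from the rs.status() output earlier in the thread. The database name `graylog`, the Elasticsearch node names and the helper functions themselves are assumptions. Credentials are left out, although a replica set with authentication enabled would need `user:password@` in the URI.

```python
def mongodb_uri(hosts, replica_set, database="graylog", port=27017):
    """Build a replica-set connection string listing every member."""
    members = ",".join(f"{h}:{port}" for h in hosts)
    return f"mongodb://{members}/{database}?replicaSet={replica_set}"


def elasticsearch_hosts(nodes, port=9200):
    """Build the comma-separated list of Elasticsearch HTTP endpoints."""
    return ",".join(f"http://{n}:{port}" for n in nodes)


mongo_members = ["srv1-gl-gdb.crtinformatique.local", "srv2-gl-gdb", "srv1-es-gdb"]
es_nodes = ["es-node1", "es-node2"]  # hypothetical node names

print("mongodb_uri =", mongodb_uri(mongo_members, "rs_mongo0"))
print("elasticsearch_hosts =", elasticsearch_hosts(es_nodes))
```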


Everything is set up, but I got an error concerning the Graylog deflector. After looking around on the forum I tried this command:

curl -XDELETE http://localhost:9200/graylog_*/

and I added this line to the Elasticsearch configuration file:

action.auto_create_index: false

After this my Elasticsearch cluster became green; it was yellow before. But I still have the problem with the deflector. Before executing my commands there was nothing in the indexer errors. Now I get this:

2 minutes ago graylog_deflector 91344300-6740-11e9-8366-005056952112 “IndexMissingException[[graylog_deflector] missing]”
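
Some context that may help anyone reading this later: graylog_deflector is meant to be an alias that points at the current write index, not an index of its own, and the wildcard delete above removes every graylog_* index together with any alias pointing at them. The sketch below only shows how the JSON returned by `GET /_alias/graylog_deflector` could be interpreted. Both sample responses are hypothetical and the helper name is made up.

```python
import json


def deflector_targets(alias_response, alias="graylog_deflector"):
    """List the indices the alias points at, given the parsed JSON of
    GET /_alias/<alias>. An empty list means no index carries the alias."""
    return sorted(
        index
        for index, body in alias_response.items()
        if isinstance(body, dict) and alias in body.get("aliases", {})
    )


# Hypothetical response when the alias exists and points at graylog_0.
sample_ok = json.loads('{"graylog_0": {"aliases": {"graylog_deflector": {}}}}')
# Hypothetical error response when the alias is missing.
sample_missing = {"error": "alias missing", "status": 404}

print(deflector_targets(sample_ok))       # ['graylog_0']
print(deflector_targets(sample_missing))  # []
```

An empty list would mean there is currently no write index behind the alias.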

Can someone help me with this? Thanks.
