Graylog 2.2 Cluster Issue

meyju · April 5, 2017, 9:43am

Hi,

I’m currently setting up a new Graylog 2.2 cluster with four nodes. One node is causing trouble for the whole cluster.

Node 1 is the master
Node 2 causes the trouble whenever it is running
Node 3 works fine
Node 4 works fine

If I start Node 2, the cluster loses its master server, or at least its cluster state. I can only say this from the master's point of view, because I access the web interface only on the master node.

On the overview page I get this message:
“There was no master Graylog server node detected in the cluster.”

In the logfile of the master node I see the following message every second:
2017-04-05T10:25:57.802+02:00 WARN [NodePingThread] Did not find meta info of this node. Re-registering.

The logfile of the second node shows no errors or anything suspicious.

I tried setting a new node-id on node 2 to register it as a new node, but this didn’t help.

I checked the configs three times. The master flag is only set on node 1.

I posted the logfiles on Gist: Graylog Logfiles

Any suggestions?

Kind regards
Julian

jochen · April 5, 2017, 10:04am

How did you install and configure these Graylog nodes?
Are all nodes using the same MongoDB database?
Are all node IDs of the Graylog nodes unique?

meyju · April 5, 2017, 10:24am

I installed the RPM packages (graylog-server-2.2.0-11) on RHEL 7 systems.

The MongoDB database is the same on all four nodes. It is a replica set in which all nodes are running without any problems.

Yes, every Graylog node has its own unique ID, which is a UUID.

I diffed the configs of all four nodes. The only differences are is_master, which is true only on node 1, and elasticsearch_network_host, which is set to the IP of the respective server.

jochen · April 5, 2017, 10:52am

Please upgrade to the latest stable release in the 2.2.x line (Graylog 2.2.3) to rule out any bugs which have been fixed since Graylog 2.2.0.

Additionally, please post the complete logs of all Graylog nodes.

meyju · April 5, 2017, 1:06pm

After the update I saw a few log messages "[NodePingThread] Did not find meta info of this node. Re-registering.", this time on node 2, but the rest of the cluster was OK. I did a clean cluster restart (stopped all nodes, then started all nodes) and now it is working normally.

I will keep an eye on it for the next few days.

Thanks so far.

meyju · April 10, 2017, 8:05am

After the weekend I had the same messages on node 2 again, but I found the solution.

If you get messages like:
[NodePingThread] Did not find meta info of this node. Re-registering.

Check your server clock! The time on all systems should be in sync!
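
A quick way to spot a drifting node is to collect the current epoch time from every node at roughly the same moment and compare each value against the median. The following is only a minimal sketch: the function name find_skewed_nodes, the sample timestamps, and the 1-second threshold are all made up for illustration and are not taken from Graylog or its documentation. In practice you would rather rely on NTP/chrony status output than on a script like this.

```python
from statistics import median


def find_skewed_nodes(node_times, max_skew_seconds=1.0):
    """Return {node: offset_in_seconds} for every node whose clock differs
    from the median of all clocks by more than max_skew_seconds.

    node_times maps a node name to an epoch timestamp (seconds) that was
    read from that node at approximately the same moment.
    The 1.0 second default is an arbitrary, illustrative threshold.
    """
    reference = median(node_times.values())
    return {
        node: ts - reference
        for node, ts in node_times.items()
        if abs(ts - reference) > max_skew_seconds
    }


# Hypothetical readings: node2 is roughly 30 seconds ahead of the others.
sample = {
    "node1": 1491811500.0,
    "node2": 1491811530.0,
    "node3": 1491811500.2,
    "node4": 1491811499.9,
}

skewed = find_skewed_nodes(sample)
for node, offset in skewed.items():
    print(f"{node} is off by {offset:+.1f} s relative to the cluster median")
```

Using the median as the reference means that a single badly drifting node does not shift the baseline the other nodes are compared against.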

Maybe this helps someone else, too.

jochen · April 10, 2017, 11:29am

Also see http://docs.graylog.org/en/2.2/pages/configuration/multinode_setup.html?highlight=ntp#prerequisites
