Cluster only has two [Manager] Nodes

Greetings, it's been a while.

I recently had a minute to breathe and set up a Zabbix server to monitor my Graylog cluster. By way of background, my cluster is as follows:
Manager Nodes (Manager Only Eligible)
  opnsrchmgr-0
  opnsrchmgr-1
  opnsrchmgr-2
Data Nodes (Data Only)
  opnsrchdata-0 (Hot)
  opnsrchdata-1 (Hot)
  opnsrchdata-2 (Hot)
  opnsrchcold-0 (Cold)
Server:
  graylogsvr-0
ILM:
  OpenSearch Dashboard

Everything appears to be running well, except that Zabbix has alerted me that my cluster has only two [Manager]-eligible nodes. This shouldn’t be the case, as I have explicitly configured all three manager nodes as manager-eligible. Further, OpenSearch bears out what Zabbix is reporting:

curl -s -k -u <username>:'<password>' 'https://opnsrchmgr-0.foo.bar:9200/_cat/nodes?v'
ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role node.roles      cluster_manager name
192.168.8.6            51          97  16   16.13   12.82     8.64 d         data            -               opnsrchdata-1
192.168.8.4            63          97  38   16.13   12.82     8.64 -         cluster-manager -               opnsrchmgr-2
192.168.8.3            24          97   1   16.13   12.82     8.64 m         cluster_manager -               opnsrchmgr-1
192.168.8.7            63          97  19   16.13   12.82     8.64 d         data            -               opnsrchdata-2
192.168.8.5            46          97   8   16.13   12.82     8.64 d         data            -               opnsrchdata-0
192.168.8.14           64          97  20   16.13   12.82     8.64 d         data            -               opnsrchcold-0
192.168.8.2            62          97   5   16.13   12.82     8.64 m         cluster_manager *               opnsrchmgr-0

As you can see, OpenSearch “knows” that opnsrchmgr-2 is meant to be a cluster-manager-only node (the node.roles column), but its node.role column shows only “-”, so no role is actually in effect for it.

Further, a cluster master has been elected:

curl -s -k -u <username>:'<password>' 'https://opnsrchmgr-0.foo.bar:9200/_cluster/health?pretty'
{
  "cluster_name" : "graylog",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 7,
  "number_of_data_nodes" : 4,
  "discovered_master" : true,
  "discovered_cluster_manager" : true,
  "active_primary_shards" : 350,
  "active_shards" : 540,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

I have attempted to manually force opnsrchmgr-2 into role “m”, but nothing doing.

Has anyone encountered this issue?

Thank you!

Keeping this for posterity (and to keep me humble).

This thing vexed me for over two weeks but I guess all it took was to paste the output to a community board.

The issue is purely a typo.

In the opensearch.yml for opnsrchmgr-2, I had cluster-manager (with a hyphen) instead of cluster_manager (with an underscore).

:roll_eyes:
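
For anyone who lands here with the same symptom, here is a rough sketch of a check that would have caught this sooner. It reads the plain-text output of `_cat/nodes?v` and flags any entry in the node.roles column that is not a recognized role name. The `KNOWN_ROLES` set and the function name `find_unknown_roles` are my own assumptions for illustration, not anything OpenSearch ships, so adjust the set to the roles your version actually supports.

```python
# Assumption: the role names your OpenSearch version recognizes.
# Extend or trim this set to match your own cluster.
KNOWN_ROLES = {
    "cluster_manager", "master", "data", "ingest",
    "remote_cluster_client", "search", "ml",
}


def find_unknown_roles(cat_nodes_text):
    """Map node name -> list of unrecognized roles, given `_cat/nodes?v` text."""
    lines = [ln for ln in cat_nodes_text.splitlines() if ln.strip()]
    if not lines:
        return {}

    # Locate columns by header name rather than by fixed position.
    header = lines[0].split()
    roles_col = header.index("node.roles")
    name_col = header.index("name")

    suspects = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip rows that do not line up with the header
        # A bare "-" means the node reports no roles at all.
        roles = [r for r in fields[roles_col].split(",") if r != "-"]
        unknown = [r for r in roles if r not in KNOWN_ROLES]
        if unknown:
            suspects[fields[name_col]] = unknown
    return suspects
```

Fed the `_cat/nodes?v` output from my original post, this should single out opnsrchmgr-2 with the hyphenated cluster-manager, which is exactly the typo.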

Happy Friday, everyone!
