1. Describe your incident:
The OpenSearch cluster was running low on storage and hitting the high watermark threshold, so I extended the logical volume (LV) holding the OpenSearch data, upgraded OpenSearch to 1.3.8, and restarted the OpenSearch service.
After the restart, Graylog’s web interface complains with:
2. Describe your environment:
- OS Information:
Ubuntu 20.04 LTS for Graylog
CentOS 7.9 for OpenSearch
- Package Version:
Graylog 4.3.9
OpenSearch 1.3.8
- Service logs, configurations, and environment variables:
Relevant config:
$ grep elasticsearch_ /etc/graylog/server/server.conf
elasticsearch_version = 7
elasticsearch_hosts = http://admin:admin@node-1:9200,http://admin:admin@node-2:9200,http://admin:admin@node-3:9200
From the Graylog logs:
2023-03-02T17:22:07.788+01:00 INFO [SearchDbPreflightCheck] Connected to (Elastic/Open)Search version OpenSearch:1.3.8
3. What steps have you already taken to try and solve the problem?
I checked that the OpenSearch status can be queried from the Graylog cluster nodes:
$ curl http://opensearch-node1:9200/_cluster/health?pretty -u admin:admin -k
{
"cluster_name" : "opensearch-cluster",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"discovered_master" : true,
"active_primary_shards" : 1798,
"active_shards" : 1827,
"relocating_shards" : 0,
"initializing_shards" : 4,
"unassigned_shards" : 1169,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 60.9
}
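As a rough sanity check on those numbers, the reported percentage seems to be the active shards divided by all shards (active, initializing, and unassigned). The snippet below is only a sketch under that assumption; `active_percent` is a hypothetical helper name of mine, not an OpenSearch or Graylog command, and it expects at least one shard in total.

```shell
# Hypothetical helper (not part of OpenSearch): estimate the active shard
# percentage from the counts shown in the _cluster/health output.
# Assumption: percent = active / (active + initializing + unassigned).
active_percent() {
  # $1 = active_shards, $2 = initializing_shards, $3 = unassigned_shards
  awk -v a="$1" -v i="$2" -v u="$3" \
    'BEGIN { printf "%.1f\n", 100 * a / (a + i + u) }'
}

# Counts taken from the health output above.
active_percent 1827 4 1169
```

With the counts above this gives the same 60.9 that the cluster reports, so about 39% of the shards were still unassigned or initializing when I ran the query.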
4. How can the community help?
Is there any reason why this would suddenly stop working? Or should I just be patient and wait for OpenSearch to catch up?
TIA!