Elasticsearch cluster unhealthy (RED) - Shards unassigned

jslagger · July 14, 2017, 6:36pm

I am using the latest Graylog v2.2.3 OVA running in VMware Player v12.5.0 build-4352439. Everything seemed to work great until I rebooted the Graylog server. After that, the Elasticsearch cluster went RED and showed the shards as unassigned.

I ran the following command and restarted the graylog server:
curl -XPUT ':9200/_all/_settings' -d '{"number_of_replicas": 0}'

This reset the unassigned shard count to 0, but the cluster is still RED.

Elasticsearch cluster unhealthy (RED) (triggered 2 days ago)
The Elasticsearch cluster state is RED which means shards are unassigned. This usually indicates a crashed and corrupt cluster and needs to be investigated. Graylog will write into the local disk journal. Read how to fix this in the Elasticsearch setup documentation.

Elasticsearch cluster
The possible Elasticsearch cluster states and more related information is available in the Graylog documentation.
Elasticsearch cluster is yellow. Shards: 4 active, 0 initializing, 0 relocating, 0 unassigned, What does this mean?
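
A RED state means at least one primary shard is unassigned, and setting `number_of_replicas` to 0 only removes replica shards, so it cannot bring back a missing primary. A minimal sketch for listing the shards that are still unassigned, assuming the `_cat/shards` endpoint is reachable on port 9200 and returns its default columns (index, shard, prirep, state, ...). `ES_HOST` and the function name `unassigned_shards` are placeholders made up for this example, not something from the post above.

```shell
#!/bin/sh
# Hypothetical helper: reads _cat/shards output on stdin and keeps only
# the rows whose state column (4th field) is UNASSIGNED.
unassigned_shards() {
  awk '$4 == "UNASSIGNED"'
}

# Usage (ES_HOST is an assumed variable holding the Elasticsearch host):
#   curl -s "http://${ES_HOST}:9200/_cat/shards" | unassigned_shards
```

If this prints rows with `p` in the third column, primaries are missing and the cluster stays RED; rows with `r` are replicas and only make it YELLOW.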

jslagger · July 14, 2017, 7:13pm

The cluster shows GREEN now but there are no new messages being displayed on the Syslog stream I have set up. This is the only stream outside the defaults and it was working until the reboot that put the shards into unassigned. Messages from nginx requests for example work fine.

curl -XGET http://<SERVER>:9200/_cluster/health?pretty

{
  "cluster_name" : "graylog",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 4,
  "active_shards" : 4,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
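
For anyone scripting this check: a small sketch that pulls just the `status` value out of a health response like the one above. It assumes only `sed` and `head` are available (no `jq`), and the function name `health_status` and the `ES_HOST` variable are invented for the example. Note that green only says all shards are allocated; it says nothing about whether inputs are receiving messages.

```shell
#!/bin/sh
# Hypothetical helper: reads a _cluster/health JSON response on stdin
# (pretty-printed or compact) and prints the value of its "status" field.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Usage (ES_HOST is an assumed variable holding the Elasticsearch host):
#   curl -s "http://${ES_HOST}:9200/_cluster/health?pretty" | health_status
```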

jan · July 16, 2017, 5:51pm

Did you check the available space inside of the VM?

jslagger · July 17, 2017, 12:16pm

Yes, I checked and it appears fine.

df -ah
Filesystem      Size  Used Avail Use% Mounted on
sysfs              0     0     0    - /sys
proc               0     0     0    - /proc
udev            2.0G  4.0K  2.0G   1% /dev
devpts             0     0     0    - /dev/pts
tmpfs           396M  624K  395M   1% /run
/dev/dm-0        15G  4.0G   11G  29% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none               0     0     0    - /sys/fs/fuse/connections
none               0     0     0    - /sys/kernel/debug
none               0     0     0    - /sys/kernel/security
none            5.0M     0  5.0M   0% /run/lock
none            2.0G     0  2.0G   0% /run/shm
none            100M     0  100M   0% /run/user
none               0     0     0    - /sys/fs/pstore
/dev/sda1       236M   74M  150M  34% /boot
systemd            0     0     0    - /sys/fs/cgroup/systemd

jochen · July 17, 2017, 12:20pm

Have you tried restarting the virtual machine?

jslagger · July 17, 2017, 12:36pm

Yes, reboots did not help.

OK, I am able to use a syslog test utility to send a message to Graylog, and it displays properly. So the issue appears to be with my devices communicating with Graylog. I am now checking whether anything changed with our access rules, etc. on the network.

Should I see both UDP 514 and UDP6 514 in netstat on the Graylog server? I only see the UDP6 entry and want to make sure that is not the issue.

jochen · July 17, 2017, 12:56pm

No. On a dual-stack system, you’ll only see one entry in the netstat output, and it covers both IPv4 and IPv6.
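
A quick way to confirm the listener is there, as a sketch: filter `netstat -uln` for UDP sockets bound to the port in question. This assumes the Linux net-tools column layout (local address in the 4th column) and that `net.ipv6.bindv6only` is 0, the Linux default, so a `udp6` socket bound to `::` also accepts IPv4 traffic. The function name `udp_listeners` is invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -uln` output on stdin and prints the
# UDP/UDP6 rows whose local address (4th field) ends in ":<port>".
udp_listeners() {
  port="$1"
  awk -v p=":${port}" \
    '$1 ~ /^udp/ && substr($4, length($4) - length(p) + 1) == p'
}

# Usage:
#   netstat -uln | udp_listeners 514
```

A single `udp6` row with `:::514` as the local address is the expected result on a dual-stack host.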

jslagger · July 17, 2017, 4:46pm

OK, I figured it out. There were two issues: first the cluster going RED, and then the Symantec firewall on the local PC that the OVA is running on. At some point, Symantec started blocking the incoming UDP traffic. Everything appears to be working properly now. Thanks for everyone’s time!

system · July 31, 2017, 4:46pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
