Help with shards


(mic) #1

Can someone please help me make sense of the below, and whether there is a way it can be solved?
I am running Graylog version 2.4, deployed as an OVA.

Elasticsearch cluster

The possible Elasticsearch cluster states and more related information are available in the Graylog documentation.

Elasticsearch cluster is red. Shards: 79 active, 0 initializing, 0 relocating, 81 unassigned.

Indexer failures

Every message that was not successfully indexed will be logged as an indexer failure.

There were 158,424 failed indexing attempts in the last 24 hours.

```
"type":"unavailable_shards_exception","reason":"[graylog_496][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[graylog_496][0]] containing [115] requests]"
```


(Jan Doberstein) #2

Did you check whether you are out of space on the OVA?


(Tess) #3

That would definitely mess up your Elastic, as I’ve learned on these forums :slight_smile:

Currently working on getting some Elastic monitoring set up on our env to prevent bad situations like these.


(mic) #4

Hello Jan,
Thanks. The server space is good.
```
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G  4.0K  3.9G   1% /dev
tmpfs           799M  468K  798M   1% /run
/dev/dm-0        15G  3.1G   12G  22% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            3.9G     0  3.9G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/sda1       236M   75M  149M  34% /boot
/dev/sdb1       187G   37G  141G  21% /var/opt/graylog/data
```


(Tess) #5

Could you please poke around the ElasticSearch API, based on instructions from here?

https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-shards.html

Checking the status of the shards in your ElasticSearch will give you a precise status code, as per the list below:

https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-shards.html#reason-unassigned

This will make it clear why the shards are “unassigned”.
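For reference, a minimal check along those lines might look like this (a sketch, assuming Elasticsearch is reachable on the default port 9200 on the OVA; it needs a live cluster to run):

```shell
# List every shard with its state and, for unassigned shards, the reason code
curl -s 'http://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason'

# Overall cluster health, including the active/unassigned shard counts
curl -s 'http://localhost:9200/_cluster/health?pretty'
```

The `unassigned.reason` column maps directly to the reason codes in the second link above.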

Could it be that someone has been manually deleting files in /var/opt/graylog/data, to clean up more disk space? :wink:


(Ben van Staveren) #6

I can recommend using Telegraf + Grafana to yank metrics out of ES and do alerting out of Grafana, with ElasticHQ installed somewhere as a neat little tool to quickly ascertain what your cluster’s up to.


#7

hi,

  • how many elasticsearch nodes do you have?
  • how many replicas do you have?

The number of replicas needs to be less than the number of servers.
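You can confirm both numbers from the Elasticsearch API (a sketch, again assuming the default port 9200 on the local machine):

```shell
# How many nodes does the cluster actually have?
curl -s 'http://localhost:9200/_cat/nodes?v'

# Primary and replica counts per index (the "pri" and "rep" columns)
curl -s 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,health'
```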


(mic) #8

It's one node, and 1 replica.


(mic) #9

Hi Tess,
Thanks for the response. I tried out the links but couldn't make much out of them.
And no, the disk, as you can see from the information I posted, has about 60% of it free.


#10

You cannot have replicas if you have only one node. You should set that to 0.

Probably about 79 of your unassigned shards come from the replica shards that the cluster is not able to put anywhere, since you do not have a second node for them.

After setting the number of replicas to 0 and restarting Elasticsearch, you can safely delete the replica shards with curl.
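A sketch of how the replica count could be changed with curl (assuming Elasticsearch on the default port 9200; the index pattern `graylog_*` is an assumption based on the `graylog_496` index named in the error above, so adjust it to your own index prefix):

```shell
# Set the replica count to 0 on all matching indices, so the cluster
# stops trying to allocate replica shards it has no second node for
curl -s -X PUT 'http://localhost:9200/graylog_*/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index":{"number_of_replicas":0}}'
```

Newly created Graylog indices take their replica count from Graylog's own `elasticsearch_replicas` setting, so that should also be set to 0 to make the change permanent.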


(system) closed #11

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.