Help with shards


(mic) #1

Can someone please help me make sense of the below, and whether there is a way it can be solved?
I am running Graylog version 2.4, deployed as an OVA.

Elasticsearch cluster

The possible Elasticsearch cluster states and more related information are available in the Graylog documentation.

Elasticsearch cluster is red. Shards: 79 active, 0 initializing, 0 relocating, 81 unassigned.

Indexer failures

Every message that was not successfully indexed will be logged as an indexer failure.

There were 158,424 failed indexing attempts in the last 24 hours.

```
"type":"unavailable_shards_exception","reason":"[graylog_496][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[graylog_496][0]] containing [115] requests]"
```


(Jan Doberstein) #2

Did you check whether you are out of space on the OVA?


(Tess) #3

That would definitely mess up your Elastic, as I’ve learned on these forums :slight_smile:

Currently working on getting some Elastic monitoring set up on our env to prevent bad situations like these.


(mic) #4

Hello Jan,
Thanks. The server space is good.
```
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G  4.0K  3.9G   1% /dev
tmpfs           799M  468K  798M   1% /run
/dev/dm-0        15G  3.1G   12G  22% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            3.9G     0  3.9G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/sda1       236M   75M  149M  34% /boot
/dev/sdb1       187G   37G  141G  21% /var/opt/graylog/data
```


(Tess) #5

Could you please poke around the ElasticSearch API, based on instructions from here?

https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-shards.html

Checking the status of the shards in your ElasticSearch will give you a precise status code, as per the list below:

https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-shards.html#reason-unassigned

This will make it clear why the shards are “unassigned”.
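For reference, a minimal check along those lines might look like this (a sketch, assuming Elasticsearch is reachable on the default port 9200 on the OVA; it needs a live cluster to run):

```shell
# List every shard with its state and, for unassigned shards, the reason code
curl -s 'http://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason'

# Overall cluster health, including the active/unassigned shard counts
curl -s 'http://localhost:9200/_cluster/health?pretty'
```

The `unassigned.reason` column maps directly to the reason codes in the second link above.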

Could it be that someone has been manually deleting files in /var/opt/graylog/data, to clean up more disk space? :wink:


(Ben van Staveren) #6

I can recommend using Telegraf + Grafana to yank metrics out of ES and do alerting out of Grafana, with ElasticHQ installed somewhere as a neat little tool to quickly ascertain what your cluster’s up to.


#7

hi,

  • how many elasticsearch nodes do you have?
  • how many replicas do you have?

The number of replicas needs to be less than the number of servers.
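You can confirm both numbers from the Elasticsearch API (a sketch, again assuming the default port 9200 on the local machine):

```shell
# How many nodes does the cluster actually have?
curl -s 'http://localhost:9200/_cat/nodes?v'

# Primary and replica counts per index (the "pri" and "rep" columns)
curl -s 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,health'
```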


(mic) #8

It's one node, and 1 replica.


(mic) #9

Hi Tess,
Thanks for the response. I tried out the links but couldn't make much out of them.
And no, the disk, as you can see from the information I posted, has about 60% of it free.


#10

You cannot have replicas if you have only one node. You should set that to 0.

Probably about 79 of your unassigned shards come from the replica shards that the cluster is not able to put anywhere, since you do not have a second node for them.

After setting the number of replicas to 0 and restarting Elasticsearch, you can safely delete the replica shards with curl.
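A sketch of how the replica count could be changed with curl (assuming Elasticsearch on the default port 9200; the index pattern `graylog_*` is an assumption based on the `graylog_496` index named in the error above, so adjust it to your own index prefix):

```shell
# Set the replica count to 0 on all matching indices, so the cluster
# stops trying to allocate replica shards it has no second node for
curl -s -X PUT 'http://localhost:9200/graylog_*/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index":{"number_of_replicas":0}}'
```

Newly created Graylog indices take their replica count from Graylog's own `elasticsearch_replicas` setting, so that should also be set to 0 to make the change permanent.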


(system) closed #11

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.