Still confused over ElasticSearch green/red meaning

jason · August 12, 2017, 9:59pm

Hi there

I asked this question on the ES list recently and still haven’t got an answer that I understand.

Basically what I don’t get is this: when I shut down graylog-server and then shut down elasticsearch (which is “green” according to /_cluster/health), elasticsearch always comes up “red” on restart, with lots of unassigned shards. They get processed and eventually it goes “green”, but my point is: why does a formal shutdown leave ES in a poor state? This really impacts availability. If (say) I was doing an incremental ES upgrade, I’d expect (as is the case with all other software I’ve ever used) that shutting down a service and restarting it would always bring it back in an “OK” state. Unmounting a file system is the analogy I have in mind, if it helps.

Is this expected ES behaviour (which is weird) or does it imply something’s wrong? The logs certainly don’t show any problem. Part of the reason I’ve come across this quite a lot is that I’d start seeing graylog slow down in searches and nothing but a restart would fix it, which implies these unassigned shards are causing problems. But they don’t even show up until an ES restart, so I’m looking for better ways of detecting this.
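
One possible way to look for unassigned shards without restarting is to read the `_cat/shards` output and filter on the state column. The following is only a minimal sketch: the function names and the `http://localhost:9200` address are assumptions for illustration, and it has not been checked against this particular ES 2.4.6 setup.

```python
import urllib.request


def unassigned_shards(cat_shards_text):
    """Return (index, shard, prirep) for every row whose state is UNASSIGNED.

    Expects the plain-text output of
    GET /_cat/shards?h=index,shard,prirep,state
    which is one whitespace-separated row per shard copy.
    """
    rows = []
    for line in cat_shards_text.splitlines():
        cols = line.split()
        if len(cols) >= 4 and cols[3] == "UNASSIGNED":
            rows.append((cols[0], cols[1], cols[2]))
    return rows


def fetch_cat_shards(base_url="http://localhost:9200"):
    """Fetch the shard listing as text. base_url is an assumed address."""
    url = base_url + "/_cat/shards?h=index,shard,prirep,state"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")
```

Calling `unassigned_shards(fetch_cat_shards())` from a cron job or monitoring check would give a list to alert on; an empty list means every shard copy is assigned.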

This is graylog-server-2.3.0-7.noarch and elasticsearch-2.4.6-1.noarch on CentOS-7.

Thanks

Jason

dustintennill · August 13, 2017, 7:06pm

Jason,

Someone else will probably come along with a more correct answer, but here is mine.

This is expected behavior - it is just how Elasticsearch nodes work as they start up.

I read your question and the answers posted on the ES forum, and tried to find a better description of the process.

https://books.google.com/books?id=8kLfBgAAQBAJ&pg=PA19&lpg=PA19&dq=elasticsearch+startup+process+detailed&source=bl&ots=VjVZEY_8Iv&sig=oDRwI6o0h1jN75XVjHtZkrXQcvI&hl=en&sa=X&ved=0ahUKEwir1c6o8dTVAhVW6mMKHYOGBaEQ6AEIVjAI#v=onepage&q=elasticsearch%20startup%20process%20detailed&f=false

In short:

On startup, a node running ES seems to say “Everything is in bad/unknown shape until I have had a chance to check”. At this point everything shows red.
After learning the status of all indexes and nodes, and making sure all primary shards are online, the status moves to yellow.
After all replica shards are happy and there are no unassigned shards, the status moves to green.
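
The red, yellow, green progression above can be watched from a script after a restart. This is a rough sketch rather than a tested tool: it polls `/_cluster/health` and reads the `status` and `unassigned_shards` fields from the JSON response. The function names, the `http://localhost:9200` address, and the polling intervals are assumptions.

```python
import json
import time
import urllib.request

# Health states ordered from worst to best.
ORDER = {"red": 0, "yellow": 1, "green": 2}


def at_least(status, wanted):
    """True if status is as good as, or better than, wanted."""
    return ORDER[status] >= ORDER[wanted]


def get_health(base_url):
    """Return the parsed JSON body of GET /_cluster/health."""
    with urllib.request.urlopen(base_url + "/_cluster/health", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for(wanted, base_url="http://localhost:9200", poll_s=10, max_wait_s=1800):
    """Poll until the cluster reaches `wanted` or max_wait_s elapses."""
    deadline = time.monotonic() + max_wait_s
    while time.monotonic() < deadline:
        try:
            health = get_health(base_url)
        except OSError:
            # The node may not be accepting HTTP connections yet.
            time.sleep(poll_s)
            continue
        print(health["status"], "unassigned:", health.get("unassigned_shards"))
        if at_least(health["status"], wanted):
            return True
        time.sleep(poll_s)
    return False
```

For example, `wait_for("yellow")` returns once all primary shards are online, and `wait_for("green")` returns once the replicas are assigned too.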

When we restart our ES Cluster (12 nodes) running behind Graylog, it may take 30 minutes for everything to arrive at green. In terms of detection, our concern is that the cluster goes RED when we don’t expect it. Startup and node restarts generate status changes.

Hope this helps.

Dustin Tennill
EKU

jason · August 13, 2017, 10:00pm

Thanks Dustin. If it’s expected behaviour, then “phew!”. But as the status is “red”, does that mean that after ES restarts (or the system reboots), graylog will notice the “red” and block putting data into ES until it goes “green”?

The reason that matters is that you’d have to ensure your message_journal can handle the volume. If it’s too small (5G by default), or ES takes too long to come back “green”, will you lose data?
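
The journal sizing concern comes down to simple arithmetic: journal capacity divided by incoming bytes per second gives the time it can buffer while nothing is being written to ES. The sketch below uses made-up numbers (2000 messages per second, 1 KiB per message) purely for illustration, and it ignores on-disk overhead and any age-based journal limit.

```python
def journal_buffer_seconds(journal_bytes, msgs_per_sec, avg_msg_bytes):
    """Seconds until a journal of journal_bytes fills at the given input rate."""
    return journal_bytes / (msgs_per_sec * avg_msg_bytes)


five_gb = 5 * 1024 ** 3  # the 5G default size mentioned above

# Hypothetical load: 2000 msg/s at 1 KiB each.
seconds = journal_buffer_seconds(five_gb, msgs_per_sec=2000, avg_msg_bytes=1024)
print(round(seconds / 60, 1), "minutes")  # prints "43.7 minutes"
```

With those assumed figures, a 5G journal would cover roughly 44 minutes of outage, so the real message rate and message size are what decide whether the default is enough.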

Just trying to understand the idiosyncrasies and ensuring we’re operating it all correctly

Thanks!
Jason

jtkarvo · August 14, 2017, 5:34am

hi

“red” does not mean it will not accept new messages. ES can accept new data when red. Red means more like: “I might have lost some of your data.” Yellow is more like: “At the moment I am not redundant, so if a node goes down, you might lose some of your data.”

system · August 28, 2017, 5:35am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
