Elasticsearch partially unavailable from Graylog

Hi all,

When one Elasticsearch host is completely shut down, Graylog takes a very long time to write data to the second ES node and logs error messages. Searching data is no longer possible.

My test environment:

  • 2 VMs with Elasticsearch (data, master), 1 VM with Graylog and Elasticsearch (master)
  • Graylog Version 2.5.1
  • Elasticsearch Version 5.6.14

Elasticsearch settings configured in the Graylog server.conf:
elasticsearch_hosts = http://x.x.x.1:9200,http://x.x.x.2:9200,http://x.x.x.3:9200
All other elasticsearch_ settings use their default values.

I ran some tests to narrow down the problem.
Initial situation for Cases 1 and 2: all Elasticsearch nodes and the Graylog server are up and running. Data is written to Elasticsearch and is searchable.

Case 1: Stop Elasticsearch on one node (systemctl stop elasticsearch). The VM is still running.
Result: Graylog is still able to write data to Elasticsearch and search it. The Elasticsearch cluster state is yellow, as expected.

Case 2: Shut down one VM running Elasticsearch.
Result: Graylog still writes data to Elasticsearch, but extremely slowly. Searching data is no longer possible. Error messages appear in the log file.
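
One difference between Case 1 and Case 2 can be checked from the Graylog VM. With the service stopped but the VM running, the operating system normally rejects a connection to port 9200 immediately (connection refused). A powered-off VM does not answer at all, so the client waits until its connect timeout expires. A minimal Python sketch to tell the two apart, where the function name probe and the 3-second timeout are assumptions chosen for illustration and not Graylog settings:

```python
import socket
import sys


def probe(host, port=9200, timeout=3.0):
    """Classify one TCP connect attempt as 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        # create_connection gives up after `timeout` seconds if nothing answers.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host is up, but nothing listens on the port (as in Case 1).
        return "refused"
    except socket.timeout:
        # No answer at all before the timeout (as with a powered-off VM).
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. "no route to host" or a name resolution failure.
        return "error: %s" % exc


if __name__ == "__main__":
    # Pass the Elasticsearch host addresses on the command line.
    for host in sys.argv[1:]:
        print(host, probe(host))
```

If the powered-off node shows up as "timeout" rather than "refused", that would be consistent with the ConnectTimeoutException in the log output from Case 2.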

Initial situation for Case 3: all Elasticsearch nodes and the Graylog server are up and running. Data is written to Elasticsearch and is searchable. The following setting is set in the Graylog server.conf:
elasticsearch_discovery_enabled = true

Case 3: Shut down one VM running Elasticsearch.
Result: Graylog is still able to write data to Elasticsearch and search it. The Elasticsearch cluster state is yellow, as expected.
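
The cluster state mentioned in the results can be read from each node's HTTP endpoint via GET /_cluster/health. A small sketch, assuming plain HTTP without authentication as in this test environment; cluster_status is a hypothetical helper name and the 3-second timeout is an arbitrary choice:

```python
import json
import sys
import urllib.request


def cluster_status(base_url, timeout=3.0):
    """Return the 'status' field of <base_url>/_cluster/health, or None if unreachable."""
    url = base_url.rstrip("/") + "/_cluster/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # The response body is JSON, e.g. {"status": "yellow", ...}
            return json.load(resp).get("status")
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts and HTTP errors;
        # ValueError covers a body that is not valid JSON.
        return None


if __name__ == "__main__":
    # Pass the base URLs from elasticsearch_hosts on the command line.
    for base_url in sys.argv[1:]:
        print(base_url, cluster_status(base_url))
```

Querying every entry of elasticsearch_hosts separately shows which nodes still answer and what state each of them reports.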

To work around the problem in Case 2, I can set elasticsearch_discovery_enabled = true. In the near future, however, I want to use authentication for Elasticsearch, and then the elasticsearch_discovery_enabled setting no longer works.

Does anybody know this problem? Is Graylog's behaviour in Case 1 and Case 2 expected?

These are the error messages from Case 2:

2019-01-09T16:53:59.569+01:00 ERROR [IndexRotationThread] Couldn't point deflector to a new index
org.graylog2.indexer.ElasticsearchException: Couldn't collect indices for alias 30dayretindexset_deflector
at org.graylog2.indexer.cluster.jest.JestUtils.execute(JestUtils.java:51) ~[graylog.jar:?]
at org.graylog2.indexer.cluster.jest.JestUtils.execute(JestUtils.java:62) ~[graylog.jar:?]
at org.graylog2.indexer.indices.Indices.aliasTarget(Indices.java:338) ~[graylog.jar:?]
at org.graylog2.indexer.MongoIndexSet.getActiveWriteIndex(MongoIndexSet.java:204) ~[graylog.jar:?]
at org.graylog2.periodical.IndexRotationThread.checkAndRepair(IndexRotationThread.java:144) ~[graylog.jar:?]
at org.graylog2.periodical.IndexRotationThread.lambda$doRun$0(IndexRotationThread.java:76) ~[graylog.jar:?]
at java.lang.Iterable.forEach(Iterable.java:75) [?:1.8.0_191]
at org.graylog2.periodical.IndexRotationThread.doRun(IndexRotationThread.java:73) [graylog.jar:?]
at org.graylog2.plugin.periodical.Periodical.run(Periodical.java:77) [graylog.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_191]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_191]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to :9200 [/] failed: connect timed out
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151) ~[graylog.jar:?]
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373) ~[graylog.jar:?]
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381) ~[graylog.jar:?]
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237) ~[graylog.jar:?]
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[graylog.jar:?]
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) ~[graylog.jar:?]
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[graylog.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[graylog.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108) ~[graylog.jar:?]
at io.searchbox.client.http.JestHttpClient.executeRequest(JestHttpClient.java:151) ~[graylog.jar:?]
at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:77) ~[graylog.jar:?]
at org.graylog2.indexer.cluster.jest.JestUtils.execute(JestUtils.java:46) ~[graylog.jar:?]
... 15 more
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_191]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_191]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_191]
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75) ~[graylog.jar:?]
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[graylog.jar:?]
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373) ~[graylog.jar:?]
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381) ~[graylog.jar:?]
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237) ~[graylog.jar:?]
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[graylog.jar:?]
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) ~[graylog.jar:?]
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[graylog.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[graylog.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108) ~[graylog.jar:?]
at io.searchbox.client.http.JestHttpClient.executeRequest(JestHttpClient.java:151) ~[graylog.jar:?]
at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:77) ~[graylog.jar:?]
at org.graylog2.indexer.cluster.jest.JestUtils.execute(JestUtils.java:46) ~[graylog.jar:?]
... 15 more

Regards,
Marcel
