hi people,
My setup:
- 4 Graylog servers doing the message processing: 24 vCores, 64 GB RAM, 30 GB dedicated to the Java heap
- 1 Graylog master, which is also the web server: 15 vCores, 32 GB RAM, 16 GB Java heap
- (all 5 Graylog servers run MongoDB, with one primary)
- 3 Elasticsearch data nodes: 24 vCores, 64 GB RAM, 30 GB Java heap
- 3 Elasticsearch master nodes: 10 vCores, 32 GB RAM, 16 GB Java heap
- every index set has a replica count of 1
I have a new issue with Elasticsearch/Graylog.
Every time I run a query in Graylog, the load on the Elasticsearch data nodes goes above 26 on the 2nd data node and above 35 on the 3rd. Even when no query is running, the 3rd node usually sits at a load of around 30.
This is very weird in my opinion, because heap usage is fine, disk space too, and the CPU is barely utilized.
On top of that, every time there is high load on the 3rd node I get index failures on it:
6 minutes ago row_firewall_215 551cedb5-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184091][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[row_firewall_215][2]] containing [637] requests, target allocation id: r9jkAo4HRcWrWdS_o2BKcg, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 214, completed tasks = 769470]]"}
6 minutes ago firewall_782 551cc6cf-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184088][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[firewall_782][1]] containing [1601] requests, target allocation id: y5jCsR5HSiORK2Kh2SRBXA, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 213, completed tasks = 769469]]"}
6 minutes ago row_firewall_215 551cedb0-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184091][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[row_firewall_215][2]] containing [637] requests, target allocation id: r9jkAo4HRcWrWdS_o2BKcg, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 214, completed tasks = 769470]]"}
6 minutes ago firewall_782 551cc6cb-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184088][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[firewall_782][1]] containing [1601] requests, target allocation id: y5jCsR5HSiORK2Kh2SRBXA, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 213, completed tasks = 769469]]"}
6 minutes ago firewall_782 551cc6ca-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184088][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[firewall_782][1]] containing [1601] requests, target allocation id: y5jCsR5HSiORK2Kh2SRBXA, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 213, completed tasks = 769469]]"}
6 minutes ago firewall_782 551cc6c5-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184088][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[firewall_782][1]] containing [1601] requests, target allocation id: y5jCsR5HSiORK2Kh2SRBXA, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 213, completed tasks = 769469]]"}
6 minutes ago row_firewall_215 551cc6c8-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184061][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[row_firewall_215][0]] containing [636] requests, target allocation id: ASEjD0PhT6-AGDai5iPx4w, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 203, completed tasks = 769466]]"}
6 minutes ago row_firewall_215 551cc6c9-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184091][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[row_firewall_215][2]] containing [637] requests, target allocation id: r9jkAo4HRcWrWdS_o2BKcg, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 214, completed tasks = 769470]]"}
6 minutes ago row_firewall_215 551cc6c3-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184071][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[row_firewall_215][3]] containing [619] requests, target allocation id: OKJn1c-CRw6emsR_VaoHaQ, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 206, completed tasks = 769466]]"}
6 minutes ago row_firewall_215 551cc6c2-d467-11e9-90b7-005056867a00 {"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [2184091][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[row_firewall_215][2]] containing [637] requests, target allocation id: r9jkAo4HRcWrWdS_o2BKcg, primary term: 1 on EsThreadPoolExecutor[name = data-node-3/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@eb770bf[Running, pool size = 24, active threads = 24, queued tasks = 214, completed tasks = 769470]]"}
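All of these rejections show the same thing: the write thread pool on data-node-3 has its queue full (queue capacity = 200, queued tasks > 200), so bulk requests from Graylog get rejected. I've been checking this with `GET _cat/thread_pool/write?v&h=node_name,active,queue,rejected`; here is a small sketch (plain Python, with hypothetical sample rows shaped like that API's output, not real data from my cluster) of spotting the saturated node:

```python
# Sketch: flag data nodes whose write thread pool queue is saturated.
# Row shape assumed from:
#   GET _cat/thread_pool/write?h=node_name,active,queue,rejected
# The sample data below is hypothetical, not real cluster output.

QUEUE_CAPACITY = 200  # default write queue size in Elasticsearch 6.x

def saturated_nodes(rows, capacity=QUEUE_CAPACITY):
    """Return node names whose write queue is at or above capacity."""
    return [r["node_name"] for r in rows if r["queue"] >= capacity]

sample = [
    {"node_name": "data-node-1", "active": 4,  "queue": 0,   "rejected": 0},
    {"node_name": "data-node-2", "active": 12, "queue": 37,  "rejected": 0},
    {"node_name": "data-node-3", "active": 24, "queue": 214, "rejected": 1520},
]

print(saturated_nodes(sample))  # only data-node-3 is at/over capacity
```

In my case only the 3rd node ever shows up this way, which matches the failures above.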
In the logs of the 3rd node I'm also seeing Java messages like the following:
[2019-09-11T03:18:27,207][DEBUG][o.e.a.s.TransportSearchAction] [data-node-3] [graylog_3][3], node[bh-CSNy5SlmX7OJbmALsXg], [P], s[STARTED], a[id=bVQUsG6KRjeXcocZNq_GDg]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[graylog_3], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=false, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false, ignore_throttled=true], types=[message], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=15, batchedReduceSize=512, preFilterShardSize=64, allowPartialSearchResults=true, localClusterAlias=null, getOrCreateAbsoluteStartMillis=-1, source={"from":0,"size":0,"query":{"bool":{"filter":[{"query_string":{"query":"EventID:4625 AND SubStatus:0xc000006a AND NOT LogonType:5 AND streams:000000000000000000000001","fields":[],"type":"best_fields","tie_breaker":0.0,"default_operator":"or","max_determinized_states":10000,"allow_leading_wildcard":true,"enable_position_increments":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"escape":false,"auto_generate_synonyms_phrase_query":true,"fuzzy_transpositions":true,"boost":1.0}},{"range":{"timestamp":{"from":"2019-09-09 15:03:54.030","to":"2019-09-09 15:08:54.030","include_lower":true,"include_upper":true,"boost":1.0}}},{"bool":{"should":[{"term":{"streams":{"value":"000000000000000000000001","boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"aggregations":{"pivot-1-series-max(TargetUserName)":{"max":{"field":"TargetUserName"}},"timestamp-min":{"min":{"field":"timestamp"}},"timestamp-max":{"max":{"field":"timestamp"}}}}}]
org.elasticsearch.transport.RemoteTransportException: [data-node-2][10.161.90.45:9300][indices:data/read/search[phase/query]]
Caused by: java.lang.IllegalArgumentException: Expected numeric type on field [TargetUserName], but got [keyword]
at org.elasticsearch.search.aggregations.support.ValuesSourceConfig.numericField(ValuesSourceConfig.java:309) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.support.ValuesSourceConfig.originalValuesSource(ValuesSourceConfig.java:292) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.support.ValuesSourceConfig.toValuesSource(ValuesSourceConfig.java:249) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.support.ValuesSourceAggregatorFactory.createInternal(ValuesSourceAggregatorFactory.java:55) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.AggregatorFactory.create(AggregatorFactory.java:216) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.AggregatorFactories.createTopLevelAggregators(AggregatorFactories.java:217) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.AggregationPhase.preProcess(AggregationPhase.java:55) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:112) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesService.lambda$loadIntoContext$17(IndicesService.java:1253) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesService.lambda$cacheShardLevelResult$18(IndicesService.java:1309) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:164) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:147) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:433) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:119) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1315) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1251) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:348) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:394) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService.access$100(SearchService.java:126) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:359) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:355) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService$4.doRun(SearchService.java:1107) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.8.3.jar:6.8.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_212]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
[2019-09-11T03:18:27,212][DEBUG][o.e.a.s.TransportSearchAction] [data-node-3] All shards failed for phase: [query]
org.elasticsearch.ElasticsearchException$1: Expected numeric type on field [TargetUserName], but got [keyword]
at org.elasticsearch.ElasticsearchException.guessRootCauses(ElasticsearchException.java:657) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:131) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:259) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.InitialSearchPhase.onShardFailure(InitialSearchPhase.java:100) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.InitialSearchPhase.access$100(InitialSearchPhase.java:48) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.InitialSearchPhase$2.lambda$onFailure$1(InitialSearchPhase.java:220) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.InitialSearchPhase.maybeFork(InitialSearchPhase.java:174) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.InitialSearchPhase.access$000(InitialSearchPhase.java:48) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.InitialSearchPhase$2.onFailure(InitialSearchPhase.java:220) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.SearchExecutionStatsCollector.onFailure(SearchExecutionStatsCollector.java:73) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:463) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1114) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.transport.TcpTransport.lambda$handleException$24(TcpTransport.java:1011) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:193) [elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.transport.TcpTransport.handleException(TcpTransport.java:1009) [elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.transport.TcpTransport.handlerResponseError(TcpTransport.java:1001) [elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:950) [elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:763) [elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:53) [transport-netty4-client-6.8.3.jar:6.8.3]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [netty-codec-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) [netty-codec-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: java.lang.IllegalArgumentException: Expected numeric type on field [TargetUserName], but got [keyword]
at org.elasticsearch.search.aggregations.support.ValuesSourceConfig.numericField(ValuesSourceConfig.java:309) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.support.ValuesSourceConfig.originalValuesSource(ValuesSourceConfig.java:292) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.support.ValuesSourceConfig.toValuesSource(ValuesSourceConfig.java:249) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.support.ValuesSourceAggregatorFactory.createInternal(ValuesSourceAggregatorFactory.java:55) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.AggregatorFactory.create(AggregatorFactory.java:216) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.AggregatorFactories.createTopLevelAggregators(AggregatorFactories.java:217) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.aggregations.AggregationPhase.preProcess(AggregationPhase.java:55) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:112) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesService.lambda$loadIntoContext$17(IndicesService.java:1253) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesService.lambda$cacheShardLevelResult$18(IndicesService.java:1309) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:164) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:147) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:433) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:119) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1315) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1251) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:348) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:394) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService.access$100(SearchService.java:126) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:359) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:355) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.search.SearchService$4.doRun(SearchService.java:1107) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-6.8.3.jar:6.8.3]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.8.3.jar:6.8.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_212]
... 1 more
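The root cause in that trace at least is clear: some widget is running a max aggregation on TargetUserName, which is mapped as keyword, and max/min only work on numeric fields. The mapping can be confirmed with `GET graylog_3/_mapping/field/TargetUserName`; a small sketch (Python, with the response shape assumed from the Elasticsearch 6.x get-field-mapping API and a hypothetical body) of pulling the type out of that response:

```python
import json

# Hypothetical response body from:
#   GET graylog_3/_mapping/field/TargetUserName
# (shape assumed from the Elasticsearch 6.x get-field-mapping API).
response = json.loads("""
{
  "graylog_3": {
    "mappings": {
      "message": {
        "TargetUserName": {
          "full_name": "TargetUserName",
          "mapping": {"TargetUserName": {"type": "keyword"}}
        }
      }
    }
  }
}
""")

def field_type(resp, index, doc_type, field):
    """Extract the mapped type of a field from a get-field-mapping response."""
    return resp[index]["mappings"][doc_type][field]["mapping"][field]["type"]

print(field_type(response, "graylog_3", "message", "TargetUserName"))
```

If that comes back as keyword, the fix is on the Graylog side: point the widget at a numeric field, or use count/cardinality instead of max.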
Previously I was running Elasticsearch 6.5.2; yesterday I upgraded to 6.8.3 and things became more usable.
I have also re-adjusted the indices so each one lands in the range of 20-40 GB.
Right now every data node is managing something like 5.5 TB of data, and I was thinking of bringing that down to roughly 4 TB per Elasticsearch data node. What do you think?
Honestly I'm lost; I have no idea what to do to reduce the load on the 3rd node. Any ideas?
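For the 20-40 GB index sizing, this is the back-of-the-envelope calculation I've been using (plain Python; the index sizes below are just illustrations, not my actual indices):

```python
import math

def shard_count(index_size_gb, target_shard_gb=30):
    """Primary shard count so each shard lands near target_shard_gb."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))

# Illustrative index sizes, not taken from the cluster above:
for size in (15, 90, 240):
    n = shard_count(size)
    print(f"{size} GB index -> {n} primary shard(s), ~{size / n:.0f} GB each")
```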
thanks,
Marius.