Elasticsearch reports OutOfMemoryError, causing Graylog to queue messages

I have a set of Graylog 2.2.3 servers and an Elasticsearch 2.4.4 cluster with 3 master-eligible nodes and 10 data nodes. The master nodes have 4 CPUs and 16 GB RAM with 8 GB allocated to the Java heap; the data nodes have 8 CPUs and 64 GB RAM with 31 GB allocated to the Java heap. I have 4,112 shards across 439 indices totalling approximately 29 TB. Even after following the Graylog and Elasticsearch memory-tuning documentation I still get OutOfMemoryError, at which point the node fails, its shards become unassigned, and Graylog begins to queue messages while Elasticsearch reallocates the data:
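For context, some back-of-the-envelope math on the figures above (this assumes shards are spread roughly evenly across the 10 data nodes, which Elasticsearch attempts by default):

```python
# Rough cluster sizing math from the numbers in this post.
# Assumption: shards are balanced evenly across the data nodes.
data_nodes = 10
shards = 4112
total_tb = 29
heap_gb_per_node = 31

shards_per_node = shards / data_nodes               # roughly 411 shards per data node
avg_shard_gb = total_tb * 1024 / shards             # roughly 7.2 GB per shard
shards_per_gb_heap = shards_per_node / heap_gb_per_node  # roughly 13 shards per GB of heap

print(f"{shards_per_node:.0f} shards/node, {avg_shard_gb:.1f} GB/shard, "
      f"{shards_per_gb_heap:.1f} shards per GB of heap")
```

So each data node is carrying on the order of 400 shards against a 31 GB heap, which is the kind of per-node overhead that matters when reasoning about heap pressure.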

[2017-04-27 05:49:56,738][INFO ][monitor.jvm ] [es-node-08.example.com] [gc][old][28914][348] duration [6.7s], collections [1]/[6.7s], total [6.7s]/[27.1m], memory [30.8gb]->[30.9gb]/[30.9gb], all_pools {[young] [507mb]->[532.5mb]/[532.5mb]}{[survivor] [0b]->[37.6mb]/[66.5mb]}{[old] [30.3gb]->[30.3gb]/[30.3gb]}
[2017-04-27 05:50:47,914][WARN ][transport.netty ] [es-node-08.example.com] exception caught on transport layer [[id: 0xc5cd721b, / => /]], closing connection java.lang.OutOfMemoryError: Java heap space
[2017-04-27 05:51:14,697][WARN ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the selector loop. java.lang.OutOfMemoryError: Java heap space
[2017-04-27 05:51:22,189][INFO ][monitor.jvm ] [es-node-08.example.com] [gc][old][28916][367] duration [7.1s], collections [1]/[7.4s], total [7.1s]/[28.5m], memory [30.3gb]->[30gb]/[30.9gb], all_pools {[young] [84mb]->[90.2mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [30.2gb]->[29.9gb]/[30.3gb]}
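Reading the first `[monitor.jvm]` line: the old generation is at its ceiling (30.3gb of 30.3gb) and a full GC frees essentially nothing, so the live set no longer fits in the heap. A small sketch of pulling those numbers out of a GC log line (the regex here is my own, written against the log format shown above):

```python
import re

# The memory [before]->[after]/[limit] figures from the first GC line above.
line = ("[2017-04-27 05:49:56,738][INFO ][monitor.jvm ] [es-node-08.example.com] "
        "[gc][old][28914][348] duration [6.7s], collections [1]/[6.7s], "
        "total [6.7s]/[27.1m], memory [30.8gb]->[30.9gb]/[30.9gb]")

m = re.search(r"memory \[([\d.]+)gb\]->\[([\d.]+)gb\]/\[([\d.]+)gb\]", line)
before, after, limit = map(float, m.groups())

# When "after" is still at "limit" following an old-gen collection,
# the heap is effectively exhausted and an OutOfMemoryError follows.
print(f"after GC: {after:.1f}/{limit:.1f} GB ({after / limit:.0%} full)")
```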

Hey @JoeG,

what exactly is your question?

Where do I start troubleshooting or tuning to prevent the OutOfMemoryError? Is there a recommended shard-to-node ratio or a shard-size sweet spot?