What causes a graylog instance to start caching up out of the blue?

karlt · January 15, 2018, 7:06pm

Testing out new hardware on 2 servers. Both have 48 threads and 128GB RAM.
The Graylog server has 4x SSDs in RAID 10; the Elastic server has 6x spindle drives in RAID 6. The two are connected to each other with a 10GbE link.
Graylog has a 30GB JVM heap and Elastic has a 31GB JVM heap. Did all the tweaks for Elastic, such as setting the indexing to 60s. Indexes are set at 28GB, 5 shards.
In Graylog, I’ve adjusted the output buffers to 30 and the batch size to 8000.

Running Graylog 2.4 and Elastic 2.4.4. No streams, pipelines or anything else set up in Graylog.

I’m able to get bursts of about 30k eps and a sustained 10k eps without it touching the journal too much. I ran it all weekend with sysloggen pushing 10k eps at it and things were fine; about 3TB went into it over the weekend. This morning I increased the rate to 15k eps. It handled that for several minutes, and then the output to Elastic sat at 0 while the journal was caching up. I turned off sysloggen to let the journal clear. After that it could barely handle 10k eps without leaning on the output buffer a considerable amount, and the output to Elastic frequently sat at 0. The Graylog and Elastic logs didn’t show any issue during this time.

Restarted Elastic, which didn’t make much of a difference. Restarted Graylog and it holds steady for about 20 minutes, and then the output and process buffers start loading up.

jan · January 16, 2018, 6:44pm

Did you record any metrics while doing all this?

That would shed some light!

karlt · January 24, 2018, 4:16am

I’ll see what I can do to reproduce the scenario. Do you have a guide for metrics, or guidelines beyond disk IO, memory, CPU and network stats?

jochen · January 24, 2018, 8:52am

You should also record Graylog-internal metrics.
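
A rough sketch of what recording those could look like (not something posted in the thread, just an illustration). It assumes the node's REST API serves single metrics under `/system/metrics/<name>` and accepts basic auth, and the metric names in `METRICS` are assumptions too, so check them against whatever your own node lists. The response bodies are logged as-is rather than parsed, since their exact shape isn't confirmed here. `output_stalled` is a hypothetical helper that flags the pattern described above: output at 0 while the journal keeps growing.

```python
import json
import time
import urllib.request
from base64 import b64encode

# Assumed metric names; check them against the list your own node reports.
METRICS = [
    "org.graylog2.journal.entries-uncommitted",
    "org.graylog2.buffers.process.usage",
    "org.graylog2.buffers.output.usage",
    "org.graylog2.throughput.output.1-sec-rate",
]


def metric_url(api_base, name):
    """Build the URL for one metric under the assumed /system/metrics path."""
    return api_base.rstrip("/") + "/system/metrics/" + name


def fetch_metric(api_base, name, user, password, timeout=5):
    """Return the parsed JSON body for one metric, without assuming its shape."""
    token = b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        metric_url(api_base, name),
        headers={"Authorization": "Basic " + token, "Accept": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def record(api_base, user, password, path, interval=10, rounds=6):
    """Append one JSON line per metric per round so the run can be reviewed later."""
    with open(path, "a") as out:
        for i in range(rounds):
            for name in METRICS:
                body = fetch_metric(api_base, name, user, password)
                line = {"ts": time.time(), "metric": name, "body": body}
                out.write(json.dumps(line) + "\n")
            out.flush()
            if i < rounds - 1:
                time.sleep(interval)


def output_stalled(samples, window=3):
    """samples: (journal_uncommitted, output_rate) pairs, oldest first.

    True if, over the last `window` samples, the output rate stayed at 0
    while the journal count strictly increased.
    """
    recent = samples[-window:]
    if len(recent) < window:
        return False
    rates_zero = all(rate == 0 for _, rate in recent)
    journal_growing = all(b[0] > a[0] for a, b in zip(recent, recent[1:]))
    return rates_zero and journal_growing
```

Something like `record("http://graylog.example:9000/api", "admin", "secret", "graylog-metrics.jsonl")` would be the call, with the host, credentials and file name all placeholders. Logging these next to the disk IO, CPU and network stats makes it easier to see which side stops first when the journal starts to fill.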

system · February 7, 2018, 8:53am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
