I currently run a cluster of 5 Graylog nodes on AWS c5 EC2 instances (16 CPUs and 32 GB RAM each).
On these machines I also run Elasticsearch coordinating-only nodes.
Heap size for Graylog: 12 GB
Heap size for the ES coordinating-only node: 8 GB
http.max_content_length in Elasticsearch: 1024mb
Index refresh interval: 15s
The Graylog nodes are configured to send messages to all 5 coordinating-only nodes.
The coordinating-only nodes are part of an Elasticsearch cluster consisting of 16 data nodes and 3 dedicated master nodes.
Each data node has a 3.5 TB NVMe SSD, 16 cores, and 122 GB RAM.
Output batch size: 1000
Refresh rate: 1s
Max Elasticsearch connections: 160
Max connections per route: 32
Our median log flow is 15,000 msg/sec.
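For reference, here is roughly how that maps onto config files. The setting names below are the usual Graylog server.conf and elasticsearch.yml keys, the host names are placeholders, and mapping my "refresh rate: 1s" to output_flush_interval is my own reading; treat this as a sketch of my setup, not a template.

```
# graylog server.conf (Graylog-to-Elasticsearch output settings)
# es-coord-1..5 are placeholders for the 5 coordinating-only nodes
elasticsearch_hosts = http://es-coord-1:9200,http://es-coord-2:9200,http://es-coord-3:9200,http://es-coord-4:9200,http://es-coord-5:9200
output_batch_size = 1000
output_flush_interval = 1
elasticsearch_max_total_connections = 160
elasticsearch_max_total_connections_per_route = 32
```

```
# elasticsearch.yml on the coordinating-only nodes
node.master: false
node.data: false
node.ingest: false
http.max_content_length: 1024mb
# index.refresh_interval (15s in our case) is a per-index setting,
# applied through the index template rather than elasticsearch.yml
```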
Does it make sense to raise the batch size to 10,000, or could that have a negative effect on performance due to the very large bulk size?
A large bulk size may also cause ES to reject it. On our cluster I found that a batch size of about 2048 with more outputbuffer_processors can raise performance, up to a certain extent. We run 3 Graylog nodes with 8 outputbuffer_processors and 16 processbuffer_processors (on 24-core machines).
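A minimal sketch of what that looks like in server.conf (these are our values, not a recommendation for every setup):

```
# server.conf on each of our 3 Graylog nodes (24 cores each)
output_batch_size = 2048
outputbuffer_processors = 8
processbuffer_processors = 16
# keep the total number of buffer processors below the core count so the
# JVM, the inputs and the OS still have headroom
```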
Realistically, since everyone's setup is different, the only advice I can give you is: experiment. Give it a shot and see what happens. There's currently no real "silver bullet" for larger setups.
You're right. I was just wondering about the rule of thumb "set the batch size to your median log rate", but it looks like it is not going to work for extremely heavily loaded setups.
I think raising the number of connections per route will also help. We use 64 per route with Graylog pointed at 3 coordinating nodes, for a maximum of 3 * 64 connections (because, well, math and random reasons).
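In server.conf terms that is roughly the following (host names are placeholders; the total is simply per-route times the number of target nodes):

```
# Graylog pointed at 3 coordinating-only nodes
elasticsearch_hosts = http://es-coord-1:9200,http://es-coord-2:9200,http://es-coord-3:9200
elasticsearch_max_total_connections_per_route = 64
elasticsearch_max_total_connections = 192   # 3 nodes * 64 per route
```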
Actually, I am not complaining about performance: since I set http.max_content_length and the index refresh interval properly, everything works fantastically with an output batch size of 1000. But since our log traffic keeps growing, I want to be ready for higher throughput.