Hey,
I’m preparing an academic paper about Graylog, and I wanted to ask how exactly Graylog processes data internally (the buffers) and how the different configuration parameters influence its performance (throughput).
I have one server (4 CPUs, 8 GB RAM) on which everything runs: Graylog, MongoDB, and Elasticsearch. I was recently testing the throughput of this node with different configuration parameters, but unfortunately nothing seems to influence the throughput much. I’m able to process only around 1,000 EPS; if I test with 1,500+ EPS, the output buffer and the journal both fill up quickly.
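A quick back-of-envelope of the backlog during such a test (the ~1 KB average message size is my assumption, not a measured value):

```python
# Backlog growth during sustained overload. Only the ~1000 EPS indexing rate
# is measured; the average message size is an assumption.
ingest_eps = 1500         # events/s sent by the load generator
index_eps = 1000          # events/s the node sustainably indexes (measured)
avg_msg_bytes = 1024      # assumed average raw message size (~1 KB)

deficit_eps = ingest_eps - index_eps
mb_per_min = deficit_eps * avg_msg_bytes * 60 / 1024**2
print(f"journal backlog grows by {deficit_eps} msg/s ≈ {mb_per_min:.0f} MB/min")
```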
I have read a lot of similar threads, but I was not able to find a clear and detailed explanation of how exactly Graylog processes data.
Is this how Graylog passes data/logs from one buffer to the next?
UPDATE: This picture has been edited according to the comments below.
Can you please tell me whether this is the buffer architecture that Graylog uses when processing data?
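For reference, this is the flow as I currently understand it from the documentation and the comments below (please correct me if any stage is wrong):

```
network input
  → input buffer    (inputbuffer_ring_size slots, drained by inputbuffer_processors)
  → disk journal
  → process buffer  (ring_size slots; extractors/streams/pipelines run by processbuffer_processors)
  → output buffer   (ring_size slots, drained by outputbuffer_processors)
  → Elasticsearch   (bulk requests of up to output_batch_size messages)
```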
My configuration is as follows:
ring_size = 65536 (number of message slots in each buffer)
inputbuffer_ring_size = 65536
inputbuffer_processors = 4
processbuffer_processors = 5
outputbuffer_processors = 3
output_batch_size = 2000
SERVER RAM (left for the OS and MongoDB) = 2 GB
GRAYLOG RAM = 2 GB
ELASTICSEARCH RAM = 4 GB
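While experimenting I also made a rough estimate of how much heap the rings alone can pin when they are full (again, the ~1 KB average message size is just my assumption):

```python
# Worst-case heap held by full ring buffers. ring_size applies to the process
# and output buffers; the per-message size is an assumed average.
ring_size = 65536
inputbuffer_ring_size = 65536
avg_msg_bytes = 1024      # assumption: ~1 KB per decoded message

total_slots = inputbuffer_ring_size + 2 * ring_size   # input + process + output
full_rings_mb = total_slots * avg_msg_bytes / 1024**2
print(f"{total_slots} slots ≈ {full_rings_mb:.0f} MB when all three rings are full")
```

That comes out to roughly 10% of the 2 GB Graylog heap in the worst case.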
I tried different ring sizes, different batch sizes, and different numbers of processors, but to no avail.
For me, the best combination seemed to be:
output_batch_size = 10000
inputbuffer_processors = 4
processbuffer_processors = 5
outputbuffer_processors = 2
inputbuffer_ring_size = 65536
By “the best” I don’t mean that the throughput was higher, but that under a heavier load (1,500+ EPS) the buffers seemed to fill up more slowly.
Is there any mathematical formula that estimates the maximum throughput?
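I could not find one, so this is the naive upper bound I have been reasoning with; the per-batch flush time to Elasticsearch is an assumption I cannot measure precisely:

```python
# Naive output-side ceiling: each output processor flushes at most one bulk
# batch per round trip. batch_seconds (ES bulk indexing time) is an assumption.
outputbuffer_processors = 3
output_batch_size = 2000
batch_seconds = 1.0       # assumed time for ES to accept one bulk batch

max_output_eps = outputbuffer_processors * output_batch_size / batch_seconds
print(f"theoretical output ceiling ≈ {max_output_eps:.0f} EPS")
```

Even with pessimistic batch times this ceiling is well above the ~1,000 EPS I actually measure, which makes me suspect the real bottleneck is Elasticsearch indexing speed on the shared CPUs and disk rather than the buffer configuration itself.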
I don’t have more computational resources available, and I want to tune this server for the highest possible throughput. Can you please advise me on what I could do better?