Single Node Question

min · September 8, 2020, 6:43pm

Hello everyone!

I have been running performance tests on a single node of Graylog (GL) and Elasticsearch (ES) to understand the impact of things like parsers, message sizes, pipeline rules, etc.

The variables I’m changing in these tests are:

output_batch_size
processbuffer_processors
outputbuffer_processors
ES memory
GL memory
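
One way to keep a test series like this systematic is to enumerate every combination of the settings up front and run one load test per combination. A minimal sketch in Python; the candidate values below are hypothetical placeholders, not numbers from this thread, and the memory settings are left out for brevity:

```python
from itertools import product

# Hypothetical candidate values for the Graylog settings listed above.
GRID = {
    "output_batch_size": [500, 1000, 2000],
    "processbuffer_processors": [2, 4, 8],
    "outputbuffer_processors": [2, 4],
}

def iter_runs(grid):
    """Yield one dict of settings per combination in the grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

runs = list(iter_runs(GRID))
print(len(runs), "runs; first:", runs[0])
```

Changing one variable at a time against a fixed baseline is a cheaper alternative when the full grid is too large to run.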

I noticed that after upgrading my VMs (giving them more CPU and memory), beyond a certain point I didn’t see much difference in throughput.

So that raised a question: is there a limit (I think there probably is) on how much CPU and memory a single-node setup can make use of? I mean, past that point, giving it more CPU/memory won’t improve performance and may even reduce it.

ttsandrew · September 10, 2020, 3:47pm

Hello @min

Are you increasing the Graylog and Elasticsearch JVM heap sizes when you allocate more memory?
If so, are you also monitoring disk performance?

If adding more CPU and memory is yielding diminishing returns I would suggest you check that you are actually increasing load during testing and that you are not constrained by disk performance. There is also a limitation on the amount of heap memory you can allocate to a single JVM.
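
The heap limitation is not quantified in this thread. A commonly cited rule of thumb for Elasticsearch is to give the heap no more than half of the machine's RAM and to stay below roughly 32 GB, since above that the JVM can no longer use compressed object pointers. A minimal sketch of that rule, assuming a 31 GiB ceiling (the function name and the ceiling are illustrative choices, and on a single node the RAM is also shared with Graylog):

```python
GIB = 1024 ** 3

def suggested_heap_bytes(ram_bytes, ceiling_bytes=31 * GIB):
    """Half of RAM, capped at a ceiling just under 32 GiB (rule of thumb)."""
    return min(ram_bytes // 2, ceiling_bytes)

for ram_gib in (16, 64, 128):
    heap = suggested_heap_bytes(ram_gib * GIB)
    print(f"{ram_gib} GiB RAM -> {heap // GIB} GiB heap")
```

Under this rule, growing a VM from 64 GiB to 128 GiB of RAM does not change the suggested heap at all, which is one way extra memory can stop showing up as extra throughput.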

min · September 21, 2020, 5:03pm

Hi @ttsandrew!

Thank you for your response.

Yes, I’m also increasing the heap memory, but I was not watching the disk performance.

I will run some more tests to check it.

cawfehman · October 2, 2020, 2:58am

Beware, there is a maximum heap size…

IMHO, on a single node, your bottleneck will almost always be your disk. Part of the reason is that Graylog, by design, writes everything it receives to the journal FIRST, and the data is written again once processing is done. So if you think about it, data comes in and is written to the disk (on the node), then it is read from the disk (on the node), then it is processed and written again to the disk (on the node). While all of that is going on, the system (Elasticsearch mainly) is optimizing the data, rotating indices, creating new indices, etc., and doing that via the IO of the same disk, with Graylog and ES both contending for that IO.

That being said, some disks nowadays are very fast, and some, such as NVMe, can even handle reads and writes simultaneously, but NVMe storage at scale is expensive.
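
The write, read, write path described above can be turned into a rough back-of-the-envelope estimate of disk traffic. This is only a sketch under stated assumptions: the message rate, the message size and the `index_overhead` multiplier are made-up inputs, and in practice journal reads may be served from the page cache rather than the disk.

```python
def disk_traffic_mb_per_s(msgs_per_s, avg_msg_bytes, index_overhead=1.0):
    """Rough per-second disk traffic for one node running Graylog and ES.

    Assumes each message is written to the journal once, read back once,
    and written once more by Elasticsearch. index_overhead is a
    hypothetical multiplier for the Elasticsearch write (merges, etc.).
    """
    raw = msgs_per_s * avg_msg_bytes / 1_000_000
    return {
        "journal_write": raw,
        "journal_read": raw,
        "es_write": raw * index_overhead,
        "total": raw * (2 + index_overhead),
    }

est = disk_traffic_mb_per_s(msgs_per_s=10_000, avg_msg_bytes=1_000)
print(est)
```

Comparing the total against what the disk sustains under a mixed read/write load gives a quick sanity check on whether the disk, rather than CPU or memory, is the limit.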

