Single Node Question

Hello everyone!

I have been running some performance tests on a single node running both Graylog (GL) and Elasticsearch (ES) to understand the impact of things like parsers, message sizes, and pipeline rules.

The variables I’m changing in these tests are:

output_batch_size
processbuffer_processors
outputbuffer_processors
ES memory
GL memory

I noticed that after upgrading my VMs (giving them more CPU and memory), beyond a certain point I didn’t see much difference in throughput.

So a question came to mind: is there a limit (I think there probably is) to how much CPU and memory a single-node setup can make use of? I mean, beyond some point, will giving it more CPU/memory stop improving performance, or maybe even reduce it?

Hello @min

Are you increasing graylog and elasticsearch jvm heap sizes when you allocate more memory?
If so, are you also monitoring disk performance?

If adding more CPU and memory is yielding diminishing returns I would suggest you check that you are actually increasing load during testing and that you are not constrained by disk performance. There is also a limitation on the amount of heap memory you can allocate to a single JVM.
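To check whether the disk is the constraint during a test run, one option on Linux is to sample /proc/diskstats twice and compare the counters. The sketch below is only an illustration: the function names are made up here, and it assumes the standard /proc/diskstats field layout (512-byte sectors, with "time spent doing I/O" in milliseconds as the 13th field).

```python
SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors


def parse_diskstats(text):
    """Parse the contents of /proc/diskstats into {device: counters}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpectedly short lines
        stats[fields[2]] = {
            "sectors_read": int(fields[5]),
            "sectors_written": int(fields[9]),
            "ms_doing_io": int(fields[12]),
        }
    return stats


def disk_load(before, after, interval_s):
    """Compare two samples of one device taken interval_s seconds apart."""
    def mib_per_s(sectors):
        return sectors * SECTOR_BYTES / (1024 * 1024) / interval_s

    busy_ms = after["ms_doing_io"] - before["ms_doing_io"]
    return {
        "read_mib_s": mib_per_s(after["sectors_read"] - before["sectors_read"]),
        "write_mib_s": mib_per_s(after["sectors_written"] - before["sectors_written"]),
        # Share of the interval during which the device had I/O in flight.
        "util_pct": busy_ms / (interval_s * 1000) * 100,
    }
```

Read /proc/diskstats once, wait a known interval while the load test runs, read it again, and pass the two entries for the data disk to disk_load. If util_pct sits near 100 while throughput has stopped growing, the disk is a likely suspect.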

Hi @ttsandrew!

Thank you for your response.

Yes, I’m also increasing the heap memory, but I was not watching the disk performance.

I will make some more tests to check it.

Beware, there is a maximum heap size…
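If the limit meant here is the usual JVM compressed-object-pointers threshold (somewhere below 32 GB), a rough way to pick an Elasticsearch heap size looks like the sketch below. This is an assumption, not something stated in this thread: it follows the common guidance of using at most half of the available RAM and staying under the threshold. The 26 GB cap is a conservative placeholder, since the exact threshold varies by system, and the function and parameter names are hypothetical.

```python
# Assumed conservative cap; the real compressed-oops threshold varies by JVM and system.
HEAP_CAP_GB = 26


def suggested_es_heap_gb(ram_gb, reserved_for_graylog_gb=0):
    """Rough ES heap suggestion for a node that may also run Graylog.

    ram_gb: total RAM of the VM, in whole GB.
    reserved_for_graylog_gb: RAM set aside for the Graylog JVM (hypothetical knob).
    """
    available = max(ram_gb - reserved_for_graylog_gb, 0)
    # At most half of what is left (the rest goes to the OS page cache),
    # and never above the cap.
    return min(available // 2, HEAP_CAP_GB)
```

With this rule, going from a 64 GB VM to a 128 GB VM would not change the suggested heap at all, which is one way extra memory can stop helping.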

IMHO, on a single node, your bottleneck will almost always be your disk. Part of the reason is that Graylog by design writes everything it receives to the journal FIRST, and the data is written again once processing is done. So if you think about it: data comes in and is written to the disk (on the node), then it is read from the disk (on the node), then it is processed and written to the disk again (on the node). While all of that is going on, the system (mainly Elasticsearch) is optimizing the data, rotating indices, creating new indices, etc., all through the same disk IO, with Graylog and ES both contending for it.
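As a back-of-the-envelope illustration of the write, read, write path described above, here is a tiny model of the disk traffic generated by a given ingest rate. It is a sketch under loud assumptions: index_overhead is a made-up multiplier for how much Elasticsearch writes per byte ingested, journal reads may in practice be served from the page cache, and background work such as merges and index rotation is ignored.

```python
def estimated_disk_traffic(ingest_mb_s, index_overhead=1.0, journal_reads_hit_disk=True):
    """Estimate disk traffic (MB/s) on a single node for a given ingest rate.

    index_overhead: assumed ratio of bytes ES writes per byte ingested (placeholder).
    journal_reads_hit_disk: False if journal reads are assumed to come from page cache.
    """
    journal_write = ingest_mb_s                    # Graylog writes to the journal first
    journal_read = ingest_mb_s if journal_reads_hit_disk else 0.0
    index_write = ingest_mb_s * index_overhead     # ES writes the processed message
    return {
        "write_mb_s": journal_write + index_write,
        "read_mb_s": journal_read,
        "total_mb_s": journal_write + journal_read + index_write,
    }
```

Even with these optimistic assumptions, every MB/s of incoming logs turns into roughly three MB/s of traffic on the same disk, so the disk can saturate well before CPU or memory does.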

That being said, some disks nowadays are very fast, and some, such as NVMe drives, can even handle reads and writes simultaneously, but NVMe storage at scale is expensive.
