Sizing - how many cores do I *really* need for a small setup? And can I set up storage on a separate disk?

Hello Graylog community!

I’m trying to set up a new environment. Initially, we are going to see far below 1GB/day of logs.

I’ve watched an older video about the Graylog Reference Architecture, but it doesn’t go very deep on small setups. I understand that it is strongly advised not to install OpenSearch/DataNode on the same system as the Graylog WebUI, so I was planning on creating separate front-end and back-end nodes even though we are starting small.

My environment is a fairly small virtualization cluster (Nutanix), and I don’t have a ton of RAM to go around. Additionally, the physical hosts only have 12-core CPUs. Because of the low physical core count of our environment, I’m concerned about running into “CPU Ready” contention when trying to schedule wider VMs with 4-8 cores.

The sizing guidelines suggest that both the front and back end should have 8 CPUs/cores, and a total of 40GB of RAM between them (16 FE, 24 BE)… how necessary is this in a very low volume deployment? Can I get by with a WebUI using only 1 or 2 cores, and a datanode with 2-4? I’m assuming that most of the CPU usage on the frontend comes from users interfacing with it, so if the system isn’t expected to see a lot of searching, I may be able to get away with less vCPU and RAM?

I figure that the datanode is going to be more RAM-hungry since it has the database… but on Nutanix, the storage system is already RAM-cached by design, and then flushed down to SSD (and eventually migrated to spinning disk if the bits stay cold for long enough), so I’m not sure how much benefit there will be to gobs of RAM on the datanode.
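For context, my understanding is that both Java processes have their heap fixed at startup rather than sized dynamically, so the RAM question mostly comes down to what heap I give each one and leaving the rest to the OS page cache. A minimal sketch of what I mean, with file paths assumed from the stock Debian/Ubuntu packages:

```
# /etc/default/graylog-server  (path is my assumption from the Debian package)
# Graylog server heap - min/max pinned to the same value so it doesn't resize at runtime.
GRAYLOG_SERVER_JAVA_OPTS="-Xms2g -Xmx2g"

# /etc/opensearch/jvm.options  (stock OpenSearch package path, also an assumption)
# Common guidance is roughly half the VM's RAM for heap, leaving the remainder
# for the OS page cache that OpenSearch leans on for reads.
-Xms4g
-Xmx4g
```

Does that line up with how people size small nodes in practice?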

On virtualized systems especially, it seems that the best way to size vCPU count is to increase it only when the system is consistently at or near 100% usage. It seems incredibly wasteful to throw 8 cores at each of these VMs, especially knowing that their “width” may start to cause scheduling problems on the hypervisor.

I would much rather need to increase RAM and/or vCPU on nodes later than have them oversized for their purpose right now (again, due to the cluster constraints we have). Will there be obvious “signs” or indicators that the frontend or datanode is undersized?
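If it helps, the kind of check I had in mind for “is the node keeping up” is watching the disk journal. My assumption from reading about the Graylog REST API is that something like this would show whether uncommitted messages are piling up (host and credentials are placeholders):

```
# A steadily growing uncommitted-entry count would suggest the node
# can't process messages as fast as they arrive (i.e., undersized CPU).
curl -s -u admin:PASSWORD http://graylog.example.com:9000/api/system/journal | jq .
```

Is that a reasonable proxy, or are there better indicators?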

My other concern - on the datanodes/OpenSearch, I don’t want a filled disk to kill the host, so I want the storage separate from the root partition. Is there a guideline somewhere on how to put that storage on its own volume? I know almost nothing about OpenSearch, other than that it’s similar to Elasticsearch and is Java-based.

Can you run Graylog on really small hardware? Yeah, theoretically - it’s hard to calculate how small, but I run a low-volume lab environment with specs like you’re describing and it’s fine. Graylog’s CPU is actually mostly used for processing messages, so it really depends on how much processing you do to your messages.

As for storing the data somewhere else, there is a setting in the OpenSearch config that you can use to set the data directory.
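Something along these lines, assuming a plain OpenSearch package install - the device name and mount point are just examples, adjust for your setup:

```
# Format and mount a dedicated disk for the index data (example device/mount point)
mkfs.xfs /dev/sdb
mkdir -p /srv/opensearch/data
echo '/dev/sdb /srv/opensearch/data xfs defaults 0 0' >> /etc/fstab
mount /srv/opensearch/data
chown -R opensearch:opensearch /srv/opensearch/data

# Then point OpenSearch at it in /etc/opensearch/opensearch.yml:
#   path.data: /srv/opensearch/data
```

That way a full data disk fills its own volume instead of the root partition. If you’re using the Graylog Data Node rather than a bare OpenSearch install, I believe it has its own config option for the data directory - check the Data Node docs for the exact setting name.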
