Sizing - how many cores do I *really* need for a small setup? And can I set up storage on a separate disk?

Hello Graylog community!

I’m trying to set up a new environment. Initially, we are going to see far below 1GB/day of logs.

I’ve watched an older video about the Graylog Reference Architecture, but it doesn’t go very deep on small setups. I understand that it is strongly advised not to install OpenSearch/DataNode on the same system as the Graylog WebUI, so I was planning on creating separate front-end and back-end nodes even though we are starting small.

My environment is a fairly small virtualization cluster (Nutanix), and I don’t have a ton of RAM to go around. Additionally, the physical hosts only have 12-core CPUs. Because of the low physical core count of our environment, I’m concerned about running into “CPU Ready” contention when trying to schedule wider VMs with 4-8 cores.

The sizing guidelines suggest that both the front and back end should have 8 CPUs/cores, and a total of 40GB of RAM between them (16 FE, 24 BE)… how necessary is this in a very low volume deployment? Can I get by with a WebUI using only 1 or 2 cores, and a datanode with 2-4? I’m assuming that most of the CPU usage on the frontend comes from users interfacing with it, so if the system isn’t expected to see a lot of searching, I may be able to get away with less vCPU and RAM?

I figure that the datanode is going to be more RAM-hungry since it has the database… but on Nutanix, the storage system is already RAM-cached by design, and then flushed down to SSD (and eventually migrated to spinning disk if the bits stay cold for long enough), so I’m not sure how much benefit there will be to gobs of RAM on the datanode.
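For context, my understanding is that both Java processes have their heap fixed at startup rather than sized dynamically, so the RAM question mostly comes down to what heap I give each one and leaving the rest to the OS page cache. A minimal sketch of what I mean, with file paths assumed from the stock Debian/Ubuntu packages:

```
# /etc/default/graylog-server  (path is my assumption from the Debian package)
# Graylog server heap - min/max pinned to the same value so it doesn't resize at runtime.
GRAYLOG_SERVER_JAVA_OPTS="-Xms2g -Xmx2g"

# /etc/opensearch/jvm.options  (stock OpenSearch package path, also an assumption)
# Common guidance is roughly half the VM's RAM for heap, leaving the remainder
# for the OS page cache that OpenSearch leans on for reads.
-Xms4g
-Xmx4g
```

Does that line up with how people size small nodes in practice?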

On virtualized systems especially, it seems that the best way to size vCPU count is to increase it only when the system is consistently at or near 100% usage. It seems incredibly wasteful to throw 8 cores at each of these VMs, especially knowing that their “width” may start to cause scheduling problems on the hypervisor.

I would much rather need to increase RAM and/or vCPU on nodes later than have them oversized for their purpose right now (again, due to the cluster constraints we have). Will there be obvious “signs” or indicators that the frontend or datanode is undersized?
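If it helps, the kind of check I had in mind for “is the node keeping up” is watching the disk journal. My assumption from reading about the Graylog REST API is that something like this would show whether uncommitted messages are piling up (host and credentials are placeholders):

```
# A steadily growing uncommitted-entry count would suggest the node
# can't process messages as fast as they arrive (i.e., undersized CPU).
curl -s -u admin:PASSWORD http://graylog.example.com:9000/api/system/journal | jq .
```

Is that a reasonable proxy, or are there better indicators?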

My other concern - on the datanodes/OpenSearch, I don’t want a filled disk to kill the host, so I want the storage separate from the root partition. Is there a guideline somewhere on how to put that storage on its own volume? I know almost nothing about OpenSearch, other than that it’s similar to Elasticsearch and is Java-based.

Can you run Graylog on really small hardware? Yeah, theoretically - it’s hard to calculate how small, but I run a low-volume lab environment with specs like you’re describing and it’s fine. Graylog’s CPU is actually mostly used for processing messages, so it really depends on how much processing you do to your messages.

As for storing the data somewhere else, there is a setting in the OpenSearch config that you can use to set the data directory.
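Something along these lines, assuming a plain OpenSearch package install - the device name and mount point are just examples, adjust for your setup:

```
# Format and mount a dedicated disk for the index data (example device/mount point)
mkfs.xfs /dev/sdb
mkdir -p /srv/opensearch/data
echo '/dev/sdb /srv/opensearch/data xfs defaults 0 0' >> /etc/fstab
mount /srv/opensearch/data
chown -R opensearch:opensearch /srv/opensearch/data

# Then point OpenSearch at it in /etc/opensearch/opensearch.yml:
#   path.data: /srv/opensearch/data
```

That way a full data disk fills its own volume instead of the root partition. If you’re using the Graylog Data Node rather than a bare OpenSearch install, I believe it has its own config option for the data directory - check the Data Node docs for the exact setting name.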
