Self-Host On-Prem: If You Could Spec Out the Perfect Graylog Rig

What specs would you target? This isn’t an idle question; I will be presenting the request to the higher-ups, as my current setup is reaching its limits.

We ingest about 20 GB of logs a day. I would likely use a Proxmox host so that I could potentially split OpenSearch and Graylog into two different containers.

So, any suggestions? My preference is to use an AMD CPU.

Hi @accidentaladmin,

You definitely want to separate Graylog and OpenSearch onto their own hosts. Don’t make them share resources.

As for the perfect rig, I wouldn’t overthink it. Pretty much any modern system with appropriate resources assigned will handle the load you’re describing. You can run it on physical servers, in virtual machines, or in containers.

The key resources to consider are CPU, RAM, Java heap, storage, and I/O. The diagram below gives suggested CPU and RAM recommendations. Storage is determined by how much log data you wish to retain. I/O is not as important at this low ingestion level, but stay away from spinning disks if at all possible. The rule of thumb for Java heap is that it should be half of system RAM, up to a maximum of 31 GB.
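To make that heap rule concrete, here is a minimal sketch (just the guideline above expressed as arithmetic, not a hard requirement):

```python
def recommended_heap_gb(system_ram_gb: int) -> int:
    """Rule of thumb: Java heap is roughly half of system RAM, capped at 31 GB."""
    return min(system_ram_gb // 2, 31)

# A 16 GB node gets an 8 GB heap, a 32 GB node gets 16 GB,
# and a 96 GB node is still capped at 31 GB.
for ram_gb in (16, 32, 96):
    print(f"{ram_gb} GB RAM -> {recommended_heap_gb(ram_gb)} GB heap")
```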

Hope this helps. Feel free to ask any follow up questions you may have.

Thank you so much! This helps a great deal.

I am curious, though. Is there a reason for two OpenSearch nodes? Also, am I doing my math correctly?

OpenSearch:
Node 1: 8 cores / 32 GB RAM
Node 2: 8 cores / 32 GB RAM

Graylog:
Server: 8 cores / 16 GB RAM

So if I wanted one rig (running Proxmox), I’d probably want something with at minimum 24 cores? (Or am I conflating cores with threads?)
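Spelling out the arithmetic behind that 24-core figure (the per-node numbers are just my reading of your recommendations above):

```python
# (cores, GB RAM) per guest, taken from the figures above
opensearch_nodes = [(8, 32), (8, 32)]
graylog_nodes = [(8, 16)]

total_cores = sum(cores for cores, _ in opensearch_nodes + graylog_nodes)
total_ram_gb = sum(ram for _, ram in opensearch_nodes + graylog_nodes)
print(total_cores, "cores,", total_ram_gb, "GB RAM")  # 24 cores, 80 GB RAM
```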

Thank you!

You may not need two OS nodes. It depends on your retention requirements.

That’s the TL;DR. There is a much longer answer relating to index shards and the heap space assigned to each of them.

If you are interested in the topic of sharding, this Elasticsearch blog post is a great reference. It applies equally to OpenSearch.

Again, thank you!

We retain logs for 90 days (if that helps).
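Back-of-the-envelope, that retention works out to roughly the following raw volume (before index overhead and any replicas, which would come on top):

```python
daily_ingest_gb = 20
retention_days = 90

raw_retained_gb = daily_ingest_gb * retention_days
print(raw_retained_gb, "GB")  # 1800 GB, i.e. roughly 1.8 TB of raw logs
```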

EDIT: The Java heap is handled by the Graylog server and not specifically by the OpenSearch servers, correct?

No. Each has its own JVM settings. The exact location depends on the operating system and the application. Default file locations are listed here: Default file locations

If your individual indices are small, you should probably reduce the number of shards. As you saw in the link I posted earlier, the size of each shard should be between 20 and 40 GB. So, for a small index, one or two shards should be plenty and will reduce the total number of shards you have to deal with.

From the post:

"TIP: The number of shards you can hold on a node will be proportional to the amount of heap you have available, but there is no fixed limit enforced by Elasticsearch. A good rule-of-thumb is to ensure you keep the number of shards per node below 20 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600 shards, but the further below this limit you can keep it the better. This will generally help the cluster stay in good health."

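If it helps, here is a rough sketch of how those two guidelines interact; the index sizes and heap values are only illustrative:

```python
import math

def shards_for_index(index_size_gb: float, target_shard_gb: float = 30) -> int:
    """Primary shards needed to keep each shard in the 20-40 GB sweet spot."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))

def max_shards_per_node(heap_gb: float, shards_per_gb_heap: int = 20) -> int:
    """Upper bound from the rule of thumb quoted above; stay well below it."""
    return int(heap_gb * shards_per_gb_heap)

print(shards_for_index(20))      # a ~20 GB index -> 1 shard
print(shards_for_index(120))     # a ~120 GB index -> 4 shards of ~30 GB each
print(max_shards_per_node(16))   # 16 GB heap -> keep well under ~320 shards
print(max_shards_per_node(30))   # 30 GB heap -> the 600-shard ceiling from the quote
```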
Thank you. Yes, I read through your link prior to asking and reduced the number of shards per index from the default using this formula:

Approximate Number of Primary Shards =
(Source Data + Room to Grow) * (1 + Indexing Overhead) / Desired Shard Size

I was shooting for 20 GB shard sizes and basically came to one shard per index for all but one or two indices. I am already seeing a performance boost.
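For reference, this is roughly how the numbers worked out; the per-index sizes below are hypothetical stand-ins, and I used the commonly cited ~10% indexing overhead:

```python
import math

def approx_primary_shards(source_gb, growth_gb, indexing_overhead, desired_shard_gb):
    """The formula above, rounded up to at least one primary shard."""
    return max(1, math.ceil((source_gb + growth_gb) * (1 + indexing_overhead) / desired_shard_gb))

print(approx_primary_shards(12, 2, 0.10, 20))  # 1 shard for a typical smaller index
print(approx_primary_shards(35, 5, 0.10, 20))  # 3 shards for one of the larger ones
```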

Good work. That’s exactly what I would recommend.

Good luck!
