Graylog can't process the amount of data

Hi guys,
I have a problem and I think someone here might be able to solve it.
After increasing the logging verbosity, the amount of data has grown considerably.
I have increased the resources, but I think it is still not enough.
I hope one of you can tell me which resources are insufficient.

Graylog details:
Graylog 5.2.7
Elasticsearch 7.10.2
MongoDB 6.0.15
on Debian 10

Change details:
From 120 GB to 300-480 GB per day
Deleted all extractors (regex is dangerous)

  • the Problem: [screenshot]

  • the amount of data (gaps come from a downtime of the server): [screenshot]

  • Server conf: [screenshot]

  • JVM Settings: [screenshot]

hey @Marvin1

the settings for inputbuffer_processors, outputbuffer_processors and processbuffer_processors each create CPU threads. It's common practice for the sum of those three settings to match how many CPUs you have on your instance. Looking at your settings, you should have at least 21 CPU cores. What sometimes happens is that you run out of resources, so also check your logs in case something has happened to Elasticsearch. Also take note: having 38+ million logs will take some time to digest.
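For reference, those knobs live in Graylog's server.conf. A sketch with illustrative values only, for a 32-core box like yours (tune the split to your own workload):

```ini
# server.conf - buffer processor threads (values here are illustrative)
# rule of thumb: the three settings together should not exceed your core count,
# leaving headroom for the JVM, OS and Elasticsearch if co-located
processbuffer_processors = 10
outputbuffer_processors = 6
inputbuffer_processors = 2
```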

1 Like

How many nodes do you have in this cluster? Do Graylog and Elasticsearch share the same host? If so, you will need to split them onto their own nodes. For that volume, you likely need more than one node of each.

1 Like

Hey @gsmith, thanks for your response.
The resources for Graylog are 70GB RAM and 32vCPU.
I changed the wait strategy of the buffers to busy_spinning.

Elastic is green; I didn't find any logs to worry about.
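For anyone following along, the wait-strategy change mentioned above is also a server.conf setting. A sketch; note that busy_spinning keeps a core busy per waiting buffer, so it only pays off when there is spare CPU:

```ini
# server.conf - how processor threads wait for work
# default is "blocking"; busy_spinning trades CPU for lower latency
processor_wait_strategy = busy_spinning
```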

Hey @joe.gross, thanks for the response.
I am running a single node setup. Initially we did not need a multi-node setup in our production.
I’m now planning a multi node setup.
5 virtual machines
1 VM for load balancing with nginx
2 VMs, one for each elastic node
and 2 VMs for the Graylog nodes, each running MongoDB on a separate partition.

[diagram]

Can you recommend any documentation that describes such a multinode setup?

I am currently still worried about the configuration of Nginx and reaching the web UI via HTTPS.

Unfortunately, we don’t have a great set of docs to describe multi-node setup. Fortunately, it’s not terribly hard. For Graylog, you will configure each node the same. The only difference between them is that, in the server.conf, one will be designated Primary, and one will be Secondary.
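In Graylog 5.x that designation is the `is_leader` flag in server.conf; only one node should have it set to true:

```ini
# server.conf on node 1 (the leader)
is_leader = true

# server.conf on node 2 would instead have:
# is_leader = false
```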

If you are rebuilding, I would recommend Opensearch, rather than Elasticsearch. It won’t be long before we don’t support Elasticsearch at all anymore. You should go ahead and make the change while you have the opportunity.

Setting up multiple Opensearch nodes is also pretty easy. Just make sure that each node has the address of its peers, and they both belong to the same cluster, and it should do most of the work for you.
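A minimal opensearch.yml sketch for a two-node cluster; the hostnames and node names are made up, and the `cluster.initial_cluster_manager_nodes` setting is the OpenSearch 2.x name (1.x used `cluster.initial_master_nodes`):

```yaml
# opensearch.yml on os-node-1 (mirror on os-node-2 with its own node.name)
cluster.name: graylog-cluster
node.name: os-node-1
network.host: 0.0.0.0
discovery.seed_hosts: ["os-node-1.example.org", "os-node-2.example.org"]
cluster.initial_cluster_manager_nodes: ["os-node-1", "os-node-2"]
```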

The setup instructions still apply, so you can use those to make sure your configuration is right.

As for Nginx, many people choose to configure TLS termination on Nginx, rather than Graylog.
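A sketch of that, assuming two Graylog nodes listening on port 9000 (the addresses and certificate paths are placeholders); Graylog expects the `X-Graylog-Server-URL` header when it runs behind a proxy:

```nginx
upstream graylog {
    server 10.0.0.11:9000;
    server 10.0.0.12:9000;
}

server {
    listen 443 ssl;
    server_name graylog.example.org;

    ssl_certificate     /etc/nginx/tls/graylog.crt;
    ssl_certificate_key /etc/nginx/tls/graylog.key;

    location / {
        proxy_pass http://graylog;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Graylog-Server-URL https://$host/;
    }
}
```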

Good luck!

1 Like

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.