Graylog can't process the amount of data

Hi guys,
I have a problem and I think someone here might be able to solve it.
After increasing the logging verbosity, the amount of data has grown considerably.
I have increased the resources, but I think it is still not enough.
I hope one of you can tell me which resources are insufficient.

Graylog details:
Graylog 5.2.7
Elasticsearch 7.10.2
MongoDB 6.0.15
on Debian 10

Change details:
From 120 GB to 300-480 GB per day
Deleted all extractors (regex is dangerous)

  • the Problem: [screenshot]

  • the amount of data (gaps come from a downtime of the server): [screenshot]

  • Server conf: [screenshot]

  • JVM Settings: [screenshot]

hey @Marvin1

the settings for inputbuffer_processors, outputbuffer_processors and processbuffer_processors each create CPU threads. It's common practice for the sum of those three settings to match how many CPUs you have on your instance. Looking at your settings, you should have at least 21 CPU cores. What sometimes happens is that you run out of resources, so also check your logs in case something has happened to Elasticsearch. Also take note: having 38+ million logs will take some time to digest.
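For reference, those knobs live in Graylog's server.conf. A sketch with illustrative values only, for a 32-core box like yours (tune the split to your own workload):

```ini
# server.conf - buffer processor threads (values here are illustrative)
# rule of thumb: the three settings together should not exceed your core count,
# leaving headroom for the JVM, OS and Elasticsearch if co-located
processbuffer_processors = 10
outputbuffer_processors = 6
inputbuffer_processors = 2
```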

1 Like

How many nodes do you have in this cluster? Do Graylog and Elasticsearch share the same host? If so, you will need to split them onto their own nodes. For that volume, you likely need more than one node of each.

1 Like

Hey @gsmith, thanks for your response.
The resources for Graylog are 70GB RAM and 32vCPU.
I changed the wait strategy of the buffers to busy_spinning.

Elastic is green; I didn't find any logs to worry about.
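For anyone following along, the wait-strategy change mentioned above is also a server.conf setting. A sketch; note that busy_spinning keeps a core busy per waiting buffer, so it only pays off when there is spare CPU:

```ini
# server.conf - how processor threads wait for work
# default is "blocking"; busy_spinning trades CPU for lower latency
processor_wait_strategy = busy_spinning
```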

Hey @joe.gross, thanks for the response.
I am running a single node setup. Initially we did not need a multi-node setup in our production.
I’m now planning a multi node setup.
5 virtual machines
1 VM for load balancing with nginx
2 VMs, one for each elastic node
and 2 VMs for the Graylog nodes, each running MongoDB on a separate partition.

[diagram]

Can you recommend any documentation that describes such a multinode setup?

I am currently still worried about the configuration of Nginx and reaching the web UI via HTTPS.

Unfortunately, we don’t have a great set of docs to describe multi-node setup. Fortunately, it’s not terribly hard. For Graylog, you will configure each node the same. The only difference between them is that, in the server.conf, one will be designated Primary, and one will be Secondary.
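In Graylog 5.x that designation is the `is_leader` flag in server.conf; only one node should have it set to true:

```ini
# server.conf on node 1 (the leader)
is_leader = true

# server.conf on node 2 would instead have:
# is_leader = false
```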

If you are rebuilding, I would recommend Opensearch, rather than Elasticsearch. It won’t be long before we don’t support Elasticsearch at all anymore. You should go ahead and make the change while you have the opportunity.

Setting up multiple Opensearch nodes is also pretty easy. Just make sure that each node has the address of its peers, and they both belong to the same cluster, and it should do most of the work for you.
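A minimal opensearch.yml sketch for a two-node cluster; the hostnames and node names are made up, and the `cluster.initial_cluster_manager_nodes` setting is the OpenSearch 2.x name (1.x used `cluster.initial_master_nodes`):

```yaml
# opensearch.yml on os-node-1 (mirror on os-node-2 with its own node.name)
cluster.name: graylog-cluster
node.name: os-node-1
network.host: 0.0.0.0
discovery.seed_hosts: ["os-node-1.example.org", "os-node-2.example.org"]
cluster.initial_cluster_manager_nodes: ["os-node-1", "os-node-2"]
```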

The setup instructions still apply, so you can use those to make sure your configuration is right.

As for Nginx, many people choose to configure TLS termination on Nginx, rather than Graylog.
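A sketch of that, assuming two Graylog nodes listening on port 9000 (the addresses and certificate paths are placeholders); Graylog expects the `X-Graylog-Server-URL` header when it runs behind a proxy:

```nginx
upstream graylog {
    server 10.0.0.11:9000;
    server 10.0.0.12:9000;
}

server {
    listen 443 ssl;
    server_name graylog.example.org;

    ssl_certificate     /etc/nginx/tls/graylog.crt;
    ssl_certificate_key /etc/nginx/tls/graylog.key;

    location / {
        proxy_pass http://graylog;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Graylog-Server-URL https://$host/;
    }
}
```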

Good luck!

1 Like

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.