Assistance Required: Enhancing Graylog Efficiency for Huge Log Volumes

Hello Everyone :hugs:,

I currently manage a Graylog deployment that handles a large volume of log data, and I’ve been running into performance problems. Any tips from the community on how to improve my setup would be greatly appreciated.

Context:

Graylog: version 4.2, running on a three-node cluster.
Elasticsearch: version 7.10.2, three-node cluster.
MongoDB: version 4.4, single node.
Log Volume: about 500 GB of logs ingested per day.
Hardware: each Graylog node has 32 GB RAM, 8 vCPUs, and SSD storage. The Elasticsearch nodes have comparable specs.
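
To put the daily volume in perspective, here is a rough back-of-envelope conversion to an average ingest rate. It is only a sketch: it assumes decimal units (1 GB = 1000 MB) and that traffic is spread evenly across the three Graylog nodes, which real traffic with spikes will not match.

```python
# Back-of-envelope average ingest rate from the figures above.
DAILY_VOLUME_GB = 500            # from the post
GRAYLOG_NODES = 3                # from the post
SECONDS_PER_DAY = 24 * 60 * 60


def average_ingest_mb_per_s(daily_gb: float) -> float:
    """Average rate in MB/s, taking 1 GB = 1000 MB."""
    return daily_gb * 1000 / SECONDS_PER_DAY


cluster_rate = average_ingest_mb_per_s(DAILY_VOLUME_GB)
# Assumption: inputs are balanced evenly across the Graylog nodes.
per_node_rate = cluster_rate / GRAYLOG_NODES

print(f"cluster:  {cluster_rate:.2f} MB/s")
print(f"per node: {per_node_rate:.2f} MB/s")
```

This is an average only; peak rates during spikes can be several times higher.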

Problems:

Search Performance: Searches return slowly, sometimes taking minutes.
Indexing Delays: Logs take a while to appear in the Graylog interface after ingestion.
High CPU Utilisation: All nodes, especially the Graylog nodes, show consistently high CPU utilisation.

What I’ve Tried:

  • Increased the JVM heap sizes for Elasticsearch and Graylog.
  • Modified the shard/replica counts and refresh intervals for the Elasticsearch indices.
  • Put index rotation and retention strategies in place to control disk utilisation.
  • Enabled and configured input/output filters in Graylog to manage spikes in log data.
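
For the shard counts, a rough sizing sketch may help frame the discussion. Everything other than the 500 GB/day figure and the three Elasticsearch nodes is an assumption: daily index rotation, an on-disk index size equal to the raw log volume, one replica, and a 30 GB target shard size (chosen from the commonly cited 10 to 50 GB guidance for Elasticsearch shards).

```python
import math

DAILY_VOLUME_GB = 500   # from the post; assumed equal to on-disk primary size
ES_DATA_NODES = 3       # from the post
TARGET_SHARD_GB = 30    # assumption: within the commonly cited 10-50 GB range
REPLICAS = 1            # assumption: not stated in the post


def primary_shards(index_size_gb: float, target_shard_gb: float) -> int:
    """Smallest shard count that keeps each shard at or below the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))


def round_up_to_multiple(n: int, multiple: int) -> int:
    """Round up so shards can be spread evenly across the data nodes."""
    return math.ceil(n / multiple) * multiple


# Assumption: one index per day (daily rotation).
shards = primary_shards(DAILY_VOLUME_GB, TARGET_SHARD_GB)
balanced_shards = round_up_to_multiple(shards, ES_DATA_NODES)
disk_per_day_gb = DAILY_VOLUME_GB * (1 + REPLICAS)

print(f"primary shards per daily index: {shards}")
print(f"rounded to a multiple of {ES_DATA_NODES} nodes: {balanced_shards}")
print(f"disk consumed per day incl. replicas: {disk_per_day_gb} GB")
```

The real index size depends on field mappings and compression, so the numbers should be checked against what the cluster actually reports.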

Questions:

Are there any particular configurations or best practices for handling large log volumes in Graylog that I may be overlooking? :thinking:

Would scaling out the Elasticsearch or Graylog cluster help, and if so, how should I go about it? :thinking:

Are there any metrics or monitoring tools I should pay particular attention to in order to better understand where the inefficiencies in my setup are? :thinking:

I also checked this :point_right: https://community.graylog.org/t/graylog-journal-getting-full/tableau

Thank you :pray: in advance for your help and suggestions.

How full are the disks on the Elasticsearch nodes?
Does the Elasticsearch cluster show as green inside Graylog?
How long are you retaining data in the indices?
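
If it helps, the first two questions can be answered straight from the Elasticsearch REST API. The sketch below uses the `_cluster/health` and `_cat/allocation` endpoints. The script name and the URL passed on the command line are placeholders, it assumes the cluster is reachable without authentication, and the 85% threshold is Elasticsearch's default low disk watermark, which may have been changed in your config.

```python
import json
import sys
from urllib.request import urlopen

LOW_WATERMARK_PERCENT = 85  # Elasticsearch's default low disk watermark


def fetch_json(base_url, path):
    """GET a JSON document from the Elasticsearch REST API."""
    with urlopen(f"{base_url}{path}", timeout=10) as resp:
        return json.load(resp)


def summarize(health, allocation):
    """Turn the two API responses into short, readable status lines."""
    lines = [
        f"cluster status: {health['status']}, "
        f"unassigned shards: {health['unassigned_shards']}"
    ]
    for row in allocation:
        percent = row.get("disk.percent")
        if percent is None:  # the UNASSIGNED row carries no disk figures
            continue
        over = int(percent) >= LOW_WATERMARK_PERCENT
        flag = " <-- above low watermark" if over else ""
        lines.append(f"{row['node']}: {percent}% disk used{flag}")
    return lines


if __name__ == "__main__":
    if len(sys.argv) != 2:
        # Hypothetical script name; pass the base URL of any Elasticsearch node.
        print("usage: python es_check.py http://<es-host>:9200")
    else:
        base = sys.argv[1].rstrip("/")
        health = fetch_json(base, "/_cluster/health")
        allocation = fetch_json(base, "/_cat/allocation?format=json")
        print("\n".join(summarize(health, allocation)))
```

A status other than green, or any node at or above the watermark, would be worth posting back here along with the retention settings.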
