High traffic, very slow dashboards

I apologize, as I know slow dashboards are not a new topic, but I have yet to find a solution that helps.

Infrastructure AWS hosted:
2 x Graylog nodes (m5.2xlarge - 8 cpu and 32GB) (also running mongo - default config)
5 x Elasticsearch 6.8 nodes (i3.4xlarge - 16 cpu and 128GB, 3.5 TB software Raid 0 of physical SSDs)
1 x HAProxy load balancer in front of Graylog

Graylog version 3.3 (16GB allocated for heap) - configured to talk to all 5 Elasticsearch nodes
Elasticsearch 6.8.7 (30GB allocated for heap) - configured to use local SSDs in Raid 0 for ES storage.

Pretty much default configs for each (can provide parts if it would help the investigation).

A separate index set was created for this particular stream. It currently holds 4TB in 144 indices, with 5 shards per index and 2 replicas. I adjusted to 20 shards per index for several days, but it had no effect; if anything, the dashboard felt slower to load. Most of our queries do not look past 8 hours, but even a 30 minute window is very slow to load (>60 seconds).

Rotation strategy: 1 index per hour
Field type refresh interval: 30 seconds
Max segments: 1
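
For reference, here is a rough sketch of the shard arithmetic implied by the numbers above. The function name and parameters are hypothetical and only for illustration. The post does not say whether the 4TB figure includes replicas, so both cases are computed.

```python
def shard_layout(indices, primaries_per_index, replicas, data_nodes,
                 total_tb, total_includes_replicas):
    """Estimate shard counts and average primary shard size.

    All inputs come from the figures in the post; whether total_tb
    includes replica copies is an assumption the caller must choose.
    """
    primaries = indices * primaries_per_index
    total_shards = primaries * (1 + replicas)
    copies = (1 + replicas) if total_includes_replicas else 1
    primary_tb = total_tb / copies
    return {
        "primary_shards": primaries,
        "total_shards": total_shards,
        "shards_per_node": total_shards / data_nodes,
        "avg_primary_shard_gb": primary_tb * 1024 / primaries,
    }


# 144 hourly indices, 5 primaries, 2 replicas, 5 Elasticsearch nodes, 4TB
layout = shard_layout(144, 5, 2, 5, 4, total_includes_replicas=False)
layout_with_replicas = shard_layout(144, 5, 2, 5, 4, total_includes_replicas=True)

print(layout)
print(layout_with_replicas)
```

Under these assumptions the cluster holds 720 primary shards (2160 including replicas, about 432 per node), and the average primary shard is roughly 5.7GB if 4TB is primary data only, or roughly 1.9GB if 4TB includes replicas.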

Graylog server load average (normal traffic time): 1.40, 1.40, 1.40
ES load average (normal traffic time): 2.5, 2.5, 2.5
These numbers are estimates, but they show that the CPUs are nowhere near saturated.

The dashboard has 4 tabs, each with roughly the following charts:
1-4 x 1 minute interval line graphs, grouped by a data type (default 8 hour window)
4-6 x pie charts, each by a different data type (30 minute window)
1-2 x table chart of recent messages sorted by timestamp
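
To put the widget list in perspective, here is a back-of-the-envelope estimate of how many shard-level searches one tab could trigger per refresh. It assumes each widget runs its own search, that a search touches only the hourly indices overlapping its time window, and that one copy of each shard in those indices is queried. The helper names are hypothetical, and the table widgets are left out because their time window is not stated.

```python
import math


def indices_touched(window_minutes, rotation_minutes=60):
    # A window that is not aligned to rotation boundaries can
    # straddle one extra index, so add 1 for the worst case.
    return math.ceil(window_minutes / rotation_minutes) + 1


def shard_searches(widgets, window_minutes, primaries_per_index):
    # One search per widget, one shard copy per primary per index touched.
    return widgets * indices_touched(window_minutes) * primaries_per_index


# Worst case for one tab: 4 line graphs (8 hours) and 6 pie charts (30 minutes)
per_tab_5_shards = shard_searches(4, 480, 5) + shard_searches(6, 30, 5)
per_tab_20_shards = shard_searches(4, 480, 20) + shard_searches(6, 30, 20)

print(per_tab_5_shards, per_tab_20_shards)
```

With 5 shards per index this comes to up to 240 shard-level searches per tab refresh, and up to 960 with 20 shards per index, which would be consistent with the higher shard count feeling slower rather than faster.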

This particular dashboard is extremely slow to load, usually taking a minute or longer for each refresh. I see no spikes in CPU usage on any of the ES nodes or the Graylog nodes when loading the page. Other dashboards, which use different indices with significantly less data, load just fine. Reducing every widget to a 30 minute window does not help; the dashboard is still extremely slow to load.

Any help is appreciated. Please let me know if any specific parts of the configuration files would be helpful.