Enhance Graylog search performance by adding new Elastic nodes?

zahnd · August 10, 2017, 9:31am

Hi community

We are seeing some search performance issues. Before investing in new hardware, I would like to be sure we are heading in the right direction, so I am asking the community for their experience.

We’re running an infrastructure with 3 Graylog and 3 Elasticsearch nodes. One of the Elasticsearch nodes is only a master-eligible node without data, so there are only 2 data nodes in the Elasticsearch cluster. Unfortunately, those data nodes, or at least one of them, aren’t equipped with very fast disks.

Input is around 4k messages per second. Constant maximum output is around 10k messages per second. We have one index per day each for audit and other messages, and each index is configured with one primary and one replica shard. At the moment there are 225 indices, 450 shards and 17’977’430’475 documents in the Elasticsearch cluster, consuming 15.44TB of disk space.
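As a rough back-of-the-envelope check on these figures, here is a minimal Python sketch that turns them into averages per index, per shard and per data node. It assumes data is spread evenly across shards and nodes and that 1TB = 1024GB, neither of which is stated in the post; `cluster_averages` is a hypothetical helper name, not part of Graylog or Elasticsearch.

```python
def cluster_averages(indices, shards, documents, disk_tb, data_nodes):
    """Return rough averages, assuming data is spread evenly (an assumption)."""
    return {
        "shards_per_index": shards / indices,
        "shards_per_node": shards / data_nodes,
        "docs_per_index": documents / indices,
        # Assumption: 1 TB = 1024 GB
        "gb_per_shard": disk_tb * 1024 / shards,
        "tb_per_node": disk_tb / data_nodes,
    }


# Figures quoted in the post: 225 indices, 450 shards,
# 17'977'430'475 documents, 15.44TB on disk, 2 data nodes.
stats = cluster_averages(225, 450, 17_977_430_475, 15.44, 2)
for name, value in stats.items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions each data node holds half of all shards and roughly half of the disk footprint, which is the baseline any additional data node would be dividing up.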

A search over the last 5 minutes takes about 20 seconds when run over all messages, and approx. 3 seconds less in streams. A search over the last day (24 hours) takes about 60 seconds, and after 20 seconds or so of searching, the output stops sending messages to Elasticsearch. After the search has finished, messages are sent again and the journal is emptied.

During a search (no matter how far back in time), the Elasticsearch node holding the primary shards runs at >90% CPU. The average load of the system is around 8 and peaks at 12 during a search.

From my understanding of the Elasticsearch infrastructure, the search load per cluster node decreases as the number of nodes increases. Is my assumption correct? Does anybody have experience with such performance issues? The goal is to have search times under 3 seconds for short-range searches and under 10 seconds for long-range searches.
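The assumption can be sketched with a very simple model, offered here only as an illustration: if shards are distributed evenly, the number of shards each data node holds drops as nodes are added. One caveat worth modelling is that an index with one primary and one replica has only two shard copies, so a search hitting a single daily index can be served by at most two nodes regardless of cluster size. Both function names below are hypothetical, and real search load also depends on disks, caches and which shards a query touches.

```python
import math


def shards_per_node(total_shards, data_nodes):
    """Shards each data node holds, assuming an even distribution."""
    return math.ceil(total_shards / data_nodes)


def nodes_usable_by_one_index(primaries, replicas_per_primary, data_nodes):
    """Upper bound on nodes that can hold copies of a single index."""
    shard_copies = primaries * (1 + replicas_per_primary)
    return min(shard_copies, data_nodes)


# 450 shards in total, one primary plus one replica per index (from the post)
for nodes in (2, 3, 4, 6):
    print(
        f"{nodes} data nodes: "
        f"{shards_per_node(450, nodes)} shards per node, "
        f"one index spans at most {nodes_usable_by_one_index(1, 1, nodes)} nodes"
    )
```

In this model, long-range searches spanning many indices benefit from more nodes, while a short-range search against one index would only spread further if that index had more shards.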

Thank you very much in advance for any suggestions on this.

Best regards, Stefan

jtkarvo · August 10, 2017, 9:43am

Hi,

You probably have too little memory in the ES nodes. For 15T of data per node you could probably use a full 64G RAM and 32G JVM per node, but for performance it could be more efficient to have 4 servers with 32G RAM and 16G JVM each.

This is just a gut feeling, but it is something you could look at.
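The sizing above follows the commonly cited Elasticsearch rule of thumb of giving the JVM heap about half of the machine's RAM, leaving the rest for the filesystem cache. A minimal sketch of that rule is below; the 31GB ceiling is an assumption reflecting the usual guidance to stay below roughly 32GB so that compressed object pointers remain enabled, and `suggested_heap_gb` is a hypothetical helper, not an Elasticsearch API.

```python
def suggested_heap_gb(ram_gb, ceiling_gb=31):
    """Half of RAM, capped at a ceiling (assumed here to be 31GB)."""
    return min(ram_gb // 2, ceiling_gb)


# The two server sizes mentioned in the post
for ram in (64, 32):
    print(f"{ram}G RAM -> about {suggested_heap_gb(ram)}G JVM heap")
```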

zahnd · August 10, 2017, 11:51am

Hi

Thank you for your answer and your thoughts on this. I forgot to mention that both cluster nodes already have 64GB of RAM and a 31GB JVM heap. I share your gut feeling, too.

Best regards

system · August 24, 2017, 11:51am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
