OpenSearch is very disk-hungry

1. Describe your incident:

We had an old instance of Graylog where everything ran on a single node with these main specs: 32 GB RAM and 1.1 TB for the Elasticsearch data.
That old node handled a lot of load just fine. The best part: the data storage disk used by ES was always under control and we never needed to extend it to accommodate more data:

/dev/dm-1 1.1T 613G 414G 60% /var/graylog

I deployed a new Graylog instance some time ago consisting of 2x Graylog servers + 3x OpenSearch servers.
They also handle the load pretty well. My only issue is that I must extend the data disks for OpenSearch on a regular basis. It started with 1.5 TB and nowadays they are:

/dev/mapper/vg_data-log_data 5.5T 4.3T 986G 82% /data

/dev/mapper/vg_data-log_data 5.5T 4.3T 986G 82% /data

/dev/mapper/vg_data-log_data 5.5T 4.3T 986G 82% /data

…and they keep growing.

2. Describe your environment:

  • OS Information:
    Graylog: Ubuntu 20.04 LTS
    OpenSearch: CentOS 7.9

  • Package Version:
    Graylog: 4.3.9
    OpenSearch: 1.3.8-1

  • Service logs, configurations, and environment variables:

Not relevant for now, I think. They can be provided upon request, though.

3. What steps have you already taken to try and solve the problem?

Read the documentation and the forum.
Perhaps I overlooked the obvious… O_o

4. How can the community help?

What do you guys do to keep your OpenSearch from eating up your disk resources like there was no tomorrow? :slight_smile:


What is the volume you are ingesting daily (you can see this on system>overview)?

What are the retention settings of your indices (system>indices)?

Hi @Joel_Duffield

Most indices stick to the defaults:


You are pushing in up to 600 GB a day, 15 TB in the last 30 days. I can’t see the sizes of those indexes, but the bottom two only rotate once a month and keep 12, so if those are large you are keeping that data for at least a year. From just those data points, yeah, I would guess you are burning through a ton of disk space.
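As a rough back-of-the-envelope check (assuming, purely for illustration, that all of the ~600 GB/day landed in one monthly-rotated index set with the default of 12 indices kept):

```python
# Rough disk estimate for a time-rotated index set (numbers from this thread).
GB_PER_DAY = 600        # approximate daily ingest seen on System > Overview
ROTATION_DAYS = 30      # index rotates once a month
INDICES_KEPT = 12       # default max number of indices for these index sets

per_index_gb = GB_PER_DAY * ROTATION_DAYS          # size of one monthly index
total_tb = per_index_gb * INDICES_KEPT / 1000      # total retained, in TB

print(f"one monthly index ≈ {per_index_gb} GB")    # 18000 GB
print(f"retained on disk  ≈ {total_tb:.0f} TB")    # 216 TB
```

In reality only part of the ingest goes into those monthly sets, but the arithmetic shows why monthly rotation with 12 retained indices eats disks fast.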

How much data are you wanting to keep?

1 Like

Hello m_mlk

Can you tell us how many machines you are ingesting from?

Thanks a lot.

Hello Joel_Duffield

Can you tell me how to configure Graylog to keep data for 1 year?
Can I keep Graylog data “forever”? Does it only depend on the disk size?
Can you help me set up an encryption configuration?

Best regards.

How long data is stored is governed by the index settings. An index (where the data is written in OpenSearch) has two key settings: rotation (how often the active index is retired from being written to and becomes read-only) and retention (how many of those old indices to keep around).

Rotation can be based on multiple things, but for this use case time is easiest. So if we set the rotation to every 1 day and retain the last 90, we now have 90 days of data. There are a million advanced topics on this, but that is the basics of it.
You can read all about it here: Index model
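The rotation-times-retention arithmetic can be sketched like this (the numbers are just examples):

```python
def days_of_data(rotation_period_days: int, max_indices: int) -> int:
    """Roughly how many days of logs a time-rotated index set keeps:
    the rotation period multiplied by the number of indices retained."""
    return rotation_period_days * max_indices

print(days_of_data(1, 90))    # daily rotation, keep 90  -> 90 days
print(days_of_data(30, 12))   # monthly rotation, keep 12 -> 360 days (about a year)
```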

For encryption you would either want to look at OS disk encryption or see whether OpenSearch has encryption options; Graylog doesn’t have data encryption as a built-in feature.


Hello Joel_Duffield

When I asked about encryption, I meant encrypting the communications between the agents and the Graylog server, not the data on the hard disk.
I still have to read a lot of the indices documentation.

Thanks. If there is anything else you want to tell me, please don’t hesitate.

1 Like

Hi @Joel_Duffield

those indices rotating once a month are Graylog’s defaults:

I guess it’s OK to reduce that and keep, say, 1 month of Graylog’s internal logging…


Hello again @Joel_Duffield

Different applications use different indices and streams. Therefore, they also have different rotation/retention times, some of them must comply with rules and regulations for auditors and all that jazz…

My idea was to send rotated indices to some cold storage but I read that this option is only available in the enterprise version … :frowning:

My poor man’s approach would be to take full snapshots of the cluster and send them to a cold-storage disk, reduce the rotation times for the indices, and use the cold storage as a sort of point-in-time restore… Would that work?
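If you go the snapshot route, the first step is registering a repository. A minimal sketch against the OpenSearch snapshot API (the host, repository name, and path below are assumptions for illustration; the location must also be listed under `path.repo` in `opensearch.yml`):

```python
import json
import urllib.request

OPENSEARCH = "http://localhost:9200"      # assumed cluster address
REPO = "cold_storage"                     # hypothetical repository name
LOCATION = "/mnt/cold/snapshots"          # assumed cold-storage mount point

def repo_request(location: str) -> tuple[str, bytes]:
    """Build the PUT request that registers a shared-filesystem snapshot repository."""
    url = f"{OPENSEARCH}/_snapshot/{REPO}"
    body = json.dumps({"type": "fs", "settings": {"location": location}}).encode()
    return url, body

url, body = repo_request(LOCATION)
print(url)

# To actually register the repository (requires a running cluster):
# req = urllib.request.Request(url, data=body, method="PUT",
#                              headers={"Content-Type": "application/json"})
# urllib.request.urlopen(req)
```

After the repository exists, a `PUT {OPENSEARCH}/_snapshot/{REPO}/<snapshot_name>` creates a snapshot of the indices into it.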


Hi m_mlk

I have been thinking about the same thing today (and all last week).
I am working through all the Graylog documentation (and OpenSearch’s too). One thing to know is that OpenSearch snapshots are incremental: each new snapshot only copies data that is not already in the repository, and unchanged data stays in the older snapshot files, so they don’t use a lot of hard disk space. Because newer snapshots reference data stored with older ones, you should delete old snapshots through the snapshot API (which keeps any data that newer snapshots still reference) rather than by removing files from the repository; otherwise restoring a newer snapshot may fail.
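A toy model of that incremental, shared-data behaviour (this is an illustration only, not OpenSearch’s real implementation):

```python
# Toy model: a snapshot repository stores each segment once; snapshots share them.
repository: dict[str, bytes] = {}        # segment id -> data, stored only once
snapshots: dict[str, set[str]] = {}      # snapshot name -> segment ids it references

def take_snapshot(name: str, segments: dict[str, bytes]) -> None:
    """Copy only segments the repository does not already hold (incremental)."""
    for seg_id, data in segments.items():
        repository.setdefault(seg_id, data)
    snapshots[name] = set(segments)

def delete_snapshot(name: str) -> None:
    """Remove a snapshot, keeping segments still referenced by other snapshots."""
    gone = snapshots.pop(name)
    still_needed = set().union(*snapshots.values()) if snapshots else set()
    for seg_id in gone - still_needed:
        del repository[seg_id]

take_snapshot("snap1", {"a": b"old", "b": b"shared"})
take_snapshot("snap2", {"b": b"shared", "c": b"new"})  # only "c" is actually copied
delete_snapshot("snap1")                               # "b" survives: snap2 needs it
print(sorted(repository))                              # ['b', 'c'] -> snap2 restorable
```

Deleting `snap1` through the API-equivalent function only frees the segment no newer snapshot references, which is why API-managed deletion is safe while deleting repository files by hand is not.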

Best regards. Have a nice day.

OK, thanks.
So that is how it works and it can’t be changed, right?

Sorry for replying so late, but I was setting up a mock server for a work teammate.
Best regards. Good luck.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.