How to zip old Elasticsearch archives

Hello Friends,

How can I zip old Elasticsearch directories? I don’t have enough space in my Elasticsearch backup directory.

Hi,

I don’t think Elasticsearch has an option to zip its data, and even if you do it manually, your data won’t be searchable.

It’s a good idea to have more than one Elasticsearch node, so you can configure Elasticsearch Curator to manage your data life cycle, moving old data to another node and purging data older than “X” days, or something like that.

Curator also allows you to take snapshots for backup purposes.
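As a sketch, a minimal Curator action file for snapshots could look like the one below. The repository name "backups" is a placeholder; you have to register a snapshot repository in Elasticsearch before Curator can use it:

```yaml
# Hypothetical Curator action file: snapshot all graylog_ indexes
# into a pre-registered snapshot repository named "backups" (placeholder)
actions:
  1:
    action: snapshot
    description: "Snapshot graylog_ indexes for backup"
    options:
      repository: backups
      name: "curator-%Y%m%d%H%M%S"  # timestamped snapshot name
      wait_for_completion: True
      ignore_empty_list: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
```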

Thanks reimlima,

Can you please help with the installation and configuration steps of Curator, or point me to any documentation?

Hi, sure thing!

Here’s the solution that solved a problem similar to yours.


Graylog

In your Graylog UI, go to “System / Indices” and edit your “Default index set” to rotate every day.

Elasticsearch

Next, on every node in your ES cluster, identify “who is who” in your cluster configuration with these options in elasticsearch.yml:

node.attr.data: hot     # change here to "warm" for warm nodes or "cold" for your cold ones
node.attr.box_type: hot #
  • On the node where you will manage indexes with Curator, follow the steps below:

Create a script with the content below; I named it wire.sh:

#!/bin/bash

# Find the newest graylog_ index: list all indexes, keep the graylog_ ones,
# take the index name (3rd column) and sort numerically on the number after "_"
CURRENT_INDEX=$(curl -s -XGET "0.0.0.0:9200/_cat/indices?pretty" -H 'Content-Type: application/json' | grep 'graylog_' | awk '{print $3}' | sort -t _ -k 2 -rn | head -1)

# Apply the shard allocation attributes to that index
curl -s --output /dev/null -XPUT "0.0.0.0:9200/${CURRENT_INDEX}/_settings?pretty" -H 'Content-Type: application/json' --data '
{
  "index.routing.allocation.include.data": "hot",
  "index.routing.allocation.include.box_type": "warm,cold",
  "index.routing.allocation.require.box_type": "warm,cold"
}'
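If you want to see how the index-selection pipeline in this script picks the newest index, you can feed it sample index names offline, no cluster needed:

```shell
# The same sort pipeline as in wire.sh, fed sample index names:
# split on "_", sort descending numerically on the number, keep the first
printf 'graylog_8\ngraylog_9\ngraylog_10\n' | sort -t _ -k 2 -rn | head -1
# prints graylog_10
```

The numeric sort on the second field matters here: a plain lexical sort would put graylog_9 after graylog_10 and pick the wrong index.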

This script adds these allocation settings to the newest index created by Graylog, so you need to run it shortly after each new index is created.

Next, create Curator’s client configuration file (the cron job below uses /etc/curator/curator.yml) with the content below:

client:
  hosts:
    - ?.?.?.? # Add here all node in your ES Cluster
    - ?.?.?.? # 
    - ?.?.?.? #
    - ?.?.?.? #
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  username:
  password:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile: /var/log/curator.log
  logformat: default
  blacklist: []
  • Next, create a file called data-migration.yml (or name it whatever you want) with the content below:

This is the “playbook” that tells Curator what to do with the indexes.
In this case it forces new indexes to be allocated to hot nodes, then migrates them to warm after 1 day, to cold after 30 days, and deletes indexes older than 60 days.
Adjust these values according to your needs.

actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to newest indexes"
    options:
      key: box_type
      value: hot
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: younger
        unit: days
        unit_count: 1

  2:
    action: allocation
    description: "Apply shard allocation filtering rules to the hot log"
    options:
      key: box_type
      value: warm
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 1

  3:
    action: allocation
    description: "Apply shard allocation filtering rules to the warm log"
    options:
      key: box_type
      value: cold
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 30

  4:
    action: delete_indices
    description: "Delete index after 60 days of life"
    options:
      timeout_override: 3600
      continue_if_exception: False
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 60

  5:
    action: forcemerge
    description: "Perform forceMerge to 'max_num_segments' per shard"
    options:
      max_num_segments: 1
      delay:
      timeout_override: 21600
      continue_if_exception: false
      disable_action: false
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 3
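The age filters above compare each index’s creation_date against “now minus unit_count days”. A quick offline sketch of the 60-day deletion cutoff in epoch seconds:

```shell
# Compute the epoch-seconds cutoff for "older than 60 days",
# which is effectively what Curator's creation_date age filter checks
now=$(date +%s)
cutoff=$((now - 60 * 24 * 3600))
echo "indexes created before epoch ${cutoff} would be deleted"
```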

Manual tests

Run “wire.sh” and check if it changes the ES Index Settings.

bash wire.sh
curl -s -XGET "0.0.0.0:9200/graylog_[INDEX NUMBER]/_settings?pretty"

Run Curator (add --dry-run first if you want to preview the actions without applying them) and check that your data has been migrated:

curator --config /etc/curator/curator.yml /etc/curator/data-migration.yml
less /var/log/curator.log

CronJob

If everything worked fine, create cron jobs for wire.sh and Curator:

/etc/cron.d/wire

MAILTO=""
SHELL=/bin/bash
10 1 * * *       root    bash /path/to/wire.sh  # use the full path where you saved wire.sh

/etc/cron.d/curator

MAILTO=""
SHELL=/bin/bash
0 1 * * *       root    curator --config /etc/curator/curator.yml /etc/curator/data-migration.yml >/dev/null 2>&1

Hope it helps. Let me know if you have any other questions.


Nice, Thanks reimlima…!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.