How to zip old Elasticsearch archives

Hi, sure thing!

Here’s a solution that solved a similar problem for me.


Graylog

In your Graylog UI, go to “system/indices” and edit your “Default index set” to rotate every day:

Elasticsearch

Next, on every node in your ES cluster, identify “who is who” in your cluster configuration with these options in elasticsearch.yml:

node.attr.data: hot     # change here to "warm" for warm nodes or "cold" for your cold ones
node.attr.box_type: hot #
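For example, assuming one hot, one warm, and one cold node, each node’s elasticsearch.yml would differ only in these two lines (the hostnames in the comments are hypothetical):

```yaml
# elasticsearch.yml on the hot node (e.g. es-hot-1)
node.attr.data: hot
node.attr.box_type: hot

# elasticsearch.yml on the warm node (e.g. es-warm-1)
node.attr.data: warm
node.attr.box_type: warm

# elasticsearch.yml on the cold node (e.g. es-cold-1)
node.attr.data: cold
node.attr.box_type: cold
```

After restarting each node, `curl -s "0.0.0.0:9200/_cat/nodeattrs?v"` should list these attributes per node.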
  • On the node where you will manage indices with Curator, follow the steps below:

Create a script with the content below; I named it wire.sh:

#!/bin/bash

CURRENT_INDEX=$(curl -s -XGET "0.0.0.0:9200/_cat/indices?pretty" -H 'Content-Type: application/json' | grep 'graylog_' | awk '{print $3}' | sort -t _ -k 2 -rn | head -1)

curl -s --output /dev/null -XPUT "0.0.0.0:9200/${CURRENT_INDEX}/_settings?pretty" -H 'Content-Type: application/json' --data '
{
  "index.routing.allocation.include.data": "hot",
  "index.routing.allocation.include.box_type": "warm,cold",
  "index.routing.allocation.require.box_type": "warm,cold"
}'
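A quick note on how wire.sh finds the newest index: it sorts the `graylog_<N>` names numerically on the part after the underscore and takes the top entry. You can check that pipeline on sample names without touching a cluster:

```shell
# Same pipeline as wire.sh: split on "_", sort field 2 numerically (-n),
# descending (-r), and keep the first line (the newest index).
printf 'graylog_8\ngraylog_9\ngraylog_10\n' | sort -t _ -k 2 -rn | head -1
# → graylog_10
```

The numeric sort matters: a plain lexical sort would rank graylog_9 above graylog_10.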

This script adds those allocation settings to the newest index created by Graylog, so you need to run it sometime after a new index is created.

Next, install Curator and create its client configuration file (the commands below assume /etc/curator/curator.yml) with the content below:

client:
  hosts:
    - ?.?.?.? # Add here all nodes in your ES cluster
    - ?.?.?.? # 
    - ?.?.?.? #
    - ?.?.?.? #
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  username:
  password:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile: /var/log/curator.log
  logformat: default
  blacklist: []
  • Next, create a file called data-migration.yml (or name it whatever you want) with the content below:

This is the “playbook” that tells Curator what to do with the indices.
In this case it forces new indices onto hot nodes, then migrates indices to warm after 1 day and to cold after 30 days, and deletes indices older than 60 days.
Adjust according to your needs.

actions:
  1:
    action: allocation
    description: "Pin the newest indices (less than 1 day old) to hot nodes"
    options:
      key: box_type
      value: hot
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: younger
        unit: days
        unit_count: 1

  2:
    action: allocation
    description: "Migrate indices older than 1 day to warm nodes"
    options:
      key: box_type
      value: warm
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 1

  3:
    action: allocation
    description: "Migrate indices older than 30 days to cold nodes"
    options:
      key: box_type
      value: cold
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 30

  4:
    action: delete_indices
    description: "Delete indices older than 60 days"
    options:
      timeout_override: 3600
      continue_if_exception: False
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 60

  5:
    action: forcemerge
    description: "Perform forceMerge to 'max_num_segments' per shard"
    options:
      max_num_segments: 1
      delay:
      timeout_override: 21600
      continue_if_exception: false
      disable_action: false
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 3

Manual tests

Run “wire.sh” and check whether it changes the index settings:

bash wire.sh
curl -s -XGET "0.0.0.0:9200/graylog_[INDEX NUMBER]/_settings?pretty"

Run curator and check if your data has been migrated:

curator --config /etc/curator/curator.yml /etc/curator/data-migration.yml
less /var/log/curator.log
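You can also confirm where each shard actually lives with the `_cat/shards` API; the awk filter below keeps just the index and node columns. It is shown here on one captured sample line (node name es-hot-1 is hypothetical), since `_cat/shards` output ends in the ip and node columns:

```shell
# Against a live cluster you would run:
#   curl -s "0.0.0.0:9200/_cat/shards/graylog_*" | awk '{print $1, $8}'
# The same awk applied to a sample _cat/shards line
# (columns: index shard prirep state docs store ip node):
printf 'graylog_10 0 p STARTED 1000 1mb 10.0.0.1 es-hot-1\n' | awk '{print $1, $8}'
# → graylog_10 es-hot-1
```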

CronJob

If everything worked fine, add to cron:

  • Create a cron job for wire.sh and one for Curator:

/etc/cron.d/wire

MAILTO=""
SHELL=/bin/bash
10 1 * * *       root    bash wire.sh

/etc/cron.d/curator

MAILTO=""
SHELL=/bin/bash
0 1 * * *       root    curator --config /etc/curator/curator.yml /etc/curator/data-migration.yml >/dev/null 2>&1

Hope it helps, let me know if you have any other questions.
