How to zip old Elasticsearch archives

Hello Friends,

How can I zip old Elasticsearch directories? I don’t have enough space in my Elasticsearch backup directory.

Hi,

I don’t think Elasticsearch has an option to zip its data, and even if you do it manually, your data won’t be searchable.

It’s a good idea to have more than one Elasticsearch node, so you can configure Elasticsearch Curator to manage your data life cycle, moving old data to another node and purging data older than “X” days, or something like that.

Curator also allows you to take snapshots for backup purposes.
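As a sketch, a minimal Curator action file for snapshots could look like the one below. The repository name "backups" is a placeholder; you have to register a snapshot repository in Elasticsearch before Curator can use it:

```yaml
# Hypothetical Curator action file: snapshot all graylog_ indexes
# into a pre-registered snapshot repository named "backups" (placeholder)
actions:
  1:
    action: snapshot
    description: "Snapshot graylog_ indexes for backup"
    options:
      repository: backups
      name: "curator-%Y%m%d%H%M%S"  # timestamped snapshot name
      wait_for_completion: True
      ignore_empty_list: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
```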

Thanks reimlima,

Can you please help with the installation and configuration steps of Curator, or point me to any documentation?

Hi, sure thing!

Here’s the solution that solved a problem similar to yours.


Graylog

In your Graylog UI, go to “System / Indices” and edit your “Default index set” to rotate every day.

Elasticsearch

Next, on every node in your ES cluster, identify “who is who” in your cluster configuration with these options in elasticsearch.yml:

node.attr.data: hot     # change here to "warm" for warm nodes or "cold" for your cold ones
node.attr.box_type: hot #
  • On the node where you will manage indexes with Curator, follow the steps below:

Create a script with the content below; I named it wire.sh:

#!/bin/bash

# Find the newest graylog_ index: list all indexes, keep the graylog_ ones,
# take the index name (3rd column) and sort numerically on the number after "_"
CURRENT_INDEX=$(curl -s -XGET "0.0.0.0:9200/_cat/indices?pretty" -H 'Content-Type: application/json' | grep 'graylog_' | awk '{print $3}' | sort -t _ -k 2 -rn | head -1)

# Apply the shard allocation attributes to that index
curl -s --output /dev/null -XPUT "0.0.0.0:9200/${CURRENT_INDEX}/_settings?pretty" -H 'Content-Type: application/json' --data '
{
  "index.routing.allocation.include.data": "hot",
  "index.routing.allocation.include.box_type": "warm,cold",
  "index.routing.allocation.require.box_type": "warm,cold"
}'
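If you want to see how the index-selection pipeline in this script picks the newest index, you can feed it sample index names offline, no cluster needed:

```shell
# The same sort pipeline as in wire.sh, fed sample index names:
# split on "_", sort descending numerically on the number, keep the first
printf 'graylog_8\ngraylog_9\ngraylog_10\n' | sort -t _ -k 2 -rn | head -1
# prints graylog_10
```

The numeric sort on the second field matters here: a plain lexical sort would put graylog_9 after graylog_10 and pick the wrong index.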

This script adds these allocation settings to the newest index created by Graylog, so you need to run it shortly after each new index is created.

Next, create Curator’s client configuration file (the cron job below uses /etc/curator/curator.yml) with the content below:

client:
  hosts:
    - ?.?.?.? # Add here all node in your ES Cluster
    - ?.?.?.? # 
    - ?.?.?.? #
    - ?.?.?.? #
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  username:
  password:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile: /var/log/curator.log
  logformat: default
  blacklist: []
  • Next, create a file called data-migration.yml (or name it whatever you want) with the content below:

This is the “playbook” that tells Curator what to do with the indexes.
In this case it forces new indexes to be allocated to hot nodes, then migrates them to warm after 1 day, to cold after 30 days, and deletes indexes older than 60 days.
Adjust these values according to your needs.

actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to newest indexes"
    options:
      key: box_type
      value: hot
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: younger
        unit: days
        unit_count: 1

  2:
    action: allocation
    description: "Apply shard allocation filtering rules to the hot log"
    options:
      key: box_type
      value: warm
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 1

  3:
    action: allocation
    description: "Apply shard allocation filtering rules to the warm log"
    options:
      key: box_type
      value: cold
      allocation_type: require
      wait_for_completion: True
      max_wait: 3600
      timeout_override:
      continue_if_exception: False
      disable_action: False
      allow_ilm_indices: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 30

  4:
    action: delete_indices
    description: "Delete index after 60 days of life"
    options:
      timeout_override: 3600
      continue_if_exception: False
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 60

  5:
    action: forcemerge
    description: "Perform forceMerge to 'max_num_segments' per shard"
    options:
      max_num_segments: 1
      delay:
      timeout_override: 21600
      continue_if_exception: false
      disable_action: false
    filters:
      - filtertype: pattern
        kind: prefix
        value: graylog_
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 3
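The age filters above compare each index’s creation_date against “now minus unit_count days”. A quick offline sketch of the 60-day deletion cutoff in epoch seconds:

```shell
# Compute the epoch-seconds cutoff for "older than 60 days",
# which is effectively what Curator's creation_date age filter checks
now=$(date +%s)
cutoff=$((now - 60 * 24 * 3600))
echo "indexes created before epoch ${cutoff} would be deleted"
```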

Manual tests

Run “wire.sh” and check if it changes the ES Index Settings.

bash wire.sh
curl -s -XGET "0.0.0.0:9200/graylog_[INDEX NUMBER]/_settings?pretty"

Run Curator (add --dry-run first if you want to preview the actions without applying them) and check that your data has been migrated:

curator --config /etc/curator/curator.yml /etc/curator/data-migration.yml
less /var/log/curator.log

CronJob

If everything worked fine, create cron jobs for wire.sh and Curator:

/etc/cron.d/wire

MAILTO=""
SHELL=/bin/bash
10 1 * * *       root    bash /path/to/wire.sh  # use the full path where you saved wire.sh

/etc/cron.d/curator

MAILTO=""
SHELL=/bin/bash
0 1 * * *       root    curator --config /etc/curator/curator.yml /etc/curator/data-migration.yml >/dev/null 2>&1

Hope it helps. Let me know if you have any other questions.


Nice, Thanks reimlima…!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.