Deleting logs from Graylog/Elasticsearch (a howto)

I found some older topics on doing this, but they didn’t exactly match my environment and use case, so I wrote a script to do what I needed. I figured I would share it, along with a slightly modified version of my internal documentation and the basic methodology behind the process.

My setup:
CentOS 6
ElasticSearch 5.5.1
Graylog 2.3.0

This script is exactly the way I use it, so you’ll need to modify the variables at the top of the script for your environment, and you’ll need to change the Elasticsearch query so that it matches the messages you want deleted.

Basic Concepts:

  • For a given index “set”, Graylog has one index it is currently writing to and some number of older indices that are read-only: you can search them, but they can’t be written to anymore.
  • Graylog rotates indices based on criteria you configure (for example, an index set can keep 10 indices of 2GB each). Once the 2GB index size is reached, Graylog creates a new index for writing by incrementing the number at the end of the index name by 1. So if you were previously writing to graylog_0, Graylog will now write to graylog_1, and graylog_0 becomes read-only. Indices can also be rotated manually.
  • I’ve only tried deleting log entries from indices that Graylog is not currently writing to. Someone smarter than me can comment on whether it’s possible to do it otherwise, but deleting “old” logs suits my needs.
  • You’ll actually be using Elasticsearch, not Graylog, to do all the index manipulation, including deleting the messages.
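To make the naming scheme concrete, here is a minimal sketch of how the index name changes on rotation. The helper name next_index_name is made up for illustration and is not part of Graylog or Elasticsearch; it only assumes index names of the form prefix_number, as described above.

```shell
#!/bin/bash
# Hypothetical helper (not a Graylog command): given an index name such as
# graylog_0, print the name Graylog would write to after the next rotation.
next_index_name() {
  local name="$1"
  local prefix="${name%_*}"    # everything before the last underscore
  local number="${name##*_}"   # the numeric suffix after the last underscore
  echo "${prefix}_$((number + 1))"
}

next_index_name graylog_0   # prints graylog_1
```

So once Graylog is writing to graylog_1, graylog_0 is a rotated index and a candidate for the cleanup described here.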

The basic Process:

  • In Graylog: Make sure the index containing the logs you want to delete is not being actively written to.
  • In Elasticsearch: Make the index writable.
  • In Elasticsearch: Run a delete command using the query you copied from the Graylog UI.
  • In Elasticsearch: Make the index read-only again.
  • In Elasticsearch: Optimize/force-merge the index to actually remove the messages, which at this point are only marked as deleted.
  • In Graylog: Recalculate the index ranges.
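The first step can be checked from the command line instead of the UI. The sketch below assumes that Graylog marks the active write index with an alias whose name ends in _deflector (graylog_deflector for the default index set); verify that against your own setup before relying on it. The function names is_write_target and check_live are made up for this example, and the host and index values are placeholders.

```shell
#!/bin/bash
ES_HOST="localhost"     # placeholder, same meaning as in the script below
ES_INDEX="graylog_0"    # placeholder, the index you want to clean up

# Reads "alias index" pairs on stdin and succeeds (exit 0) only if the given
# index is the target of an alias ending in _deflector.
# Assumption: that alias is how Graylog marks the active write index.
is_write_target() {
  awk -v idx="$1" '$1 ~ /_deflector$/ && $2 == idx { found = 1 } END { exit !found }'
}

# Hypothetical wrapper for a live cluster: lists aliases via the _cat API
# and refuses to continue if the index is still being written to.
check_live() {
  if curl -s "http://$ES_HOST:9200/_cat/aliases?h=alias,index" | is_write_target "$ES_INDEX"; then
    echo "$ES_INDEX is still the write target; rotate it first." >&2
    return 1
  fi
}

# Offline demonstration with canned alias output instead of a live cluster
if printf 'graylog_deflector graylog_1\n' | is_write_target graylog_0; then
  echo "graylog_0 is active, do not touch it"
else
  echo "graylog_0 is rotated, safe to continue"
fi
```

Note that check_live treats an unreachable Elasticsearch the same as "not the write target", so it is only a convenience check and not a substitute for looking at the index set in Graylog.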

Why? In my case I had a bunch of junk test messages and logs that didn’t conform to new standards, and I wanted them gone, because I’m like that.

My script:

#!/bin/bash
#Define Index here - This CANNOT be an index being actively written to. Must be a "Rotated" index as far as graylog is concerned. 
ES_INDEX=graylog_0
#This can be a username/password combo, or an api/token combo
GL_USERPASS="adminuser:adminpassword"
GL_API_HOST="192.168.10.222:9000"
ES_HOST="localhost"


##Enable writing for the index:
echo -e "Enabling writing for $ES_INDEX...\n"
curl -XPUT 'http://'"$ES_HOST"':9200/'"$ES_INDEX"'/_settings' -d '{
"index" : {
"blocks.write" : false
}
}'


##Deleting messages based on a search query:
echo -e "\n\nDeleting messages based on your search query..."
#######CHANGE THE QUERY PORTION BELOW######
#To get the search query, run a search in graylog returning all messages you want to delete, and click More Actions > Show Query
#Don't copy "from" or "size", start with "query", and end with the square right bracket ]
#The last Right Curly bracket } is to close the curl input, and not a part of the search query
curl -XPOST 'http://'"$ES_HOST"':9200/'"$ES_INDEX"'/_delete_by_query' -d '{


"query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "UniqeString123DeleteMe",
          "allow_leading_wildcard": true
        }
      },
      "filter": {
        "bool": {
          "must": {
            "range": {
              "timestamp": {
                "from": "1970-01-01 00:00:00.000",
                "to": "2017-08-31 17:29:08.987",
                "include_lower": true,
                "include_upper": true
              }
            }
          }
        }
      }
    }
  },
  "sort": [
    {
      "timestamp": {
        "order": "desc"
      }
    }
  ]


}'


##Making index read-only again.
echo -e "\n\nMaking $ES_INDEX read-only again..."
curl -XPUT 'http://'"$ES_HOST"':9200/'"$ES_INDEX"'/_settings' -d '{
"index" : {
"blocks.write" : true
}
}'

##Optimizing/forcemerge index - In previous versions of ES this was called optimizing. New versions use forcemerge
echo -e "\n\nRunning ElasticSearch ForceMerge to optimize index"
curl -XPOST 'http://'"$ES_HOST"':9200/'"$ES_INDEX"'/_forcemerge?only_expunge_deletes=true'

##Graylog Re-calculating index ranges
#Can also be done manually via UI. System > Indices > Choose index group > Show Details on index you modified > Recalculate index ranges
echo -e "\n\nRebuilding $ES_INDEX in graylog"
curl -u "$GL_USERPASS" -XPOST 'http://'"$GL_API_HOST"'/api/system/indices/ranges/'"$ES_INDEX"'/rebuild'
echo -e "\n\nRemember to have your pets spayed or neutered... Goodbye everybody!"
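
Since the delete cannot be undone, it may be worth counting the matching messages before running the script. This is a rough sketch and not part of my original script: it sends only the "query" part to the Elasticsearch _count endpoint (leave out "sort", "from" and "size"; as far as I know the count body takes just the query) and pulls the number out of the response. The function names extract_count and count_matches are made up for this example, the query string is the same placeholder used in the script, and the Content-Type header is there because newer Elasticsearch versions insist on it.

```shell
#!/bin/bash
ES_HOST="localhost"     # placeholder, same meaning as in the script
ES_INDEX="graylog_0"    # placeholder, the rotated index to inspect

# Extract the numeric "count" field from a _count response read on stdin
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical wrapper for a live cluster: ask Elasticsearch how many
# documents match the query, without deleting anything.
count_matches() {
  curl -s -XPOST "http://$ES_HOST:9200/$ES_INDEX/_count" \
    -H 'Content-Type: application/json' \
    -d '{"query": {"query_string": {"query": "UniqeString123DeleteMe"}}}' | extract_count
}

# Offline demonstration with a canned response instead of a live cluster
echo '{"count":42,"_shards":{"total":4,"successful":4,"failed":0}}' | extract_count   # prints 42
```

If the number from count_matches is far larger than what the Graylog search showed, stop and re-check the query before deleting anything.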