Old indexes have new data


(Markuchi) #1

We are creating new indexes every 12 hours.
We backup indexes daily using snapshots.
Deletion of indexes occurs after 90 days.

What we are finding is that new data is ending up in old indexes.
If we delete the old indexes then we will be losing new data.

The bulk of the messages are from the period when the index was the active write index. However, it is very concerning that we will lose some messages when we delete old indexes.

Is this normal behavior?
How do we ensure that old indexes only contain messages from when the index was the active write index?

Also, it would greatly help if Graylog allowed the index creation date in UTC to be used in place of the index number, so instead of graylog_123 it would be graylog_2018091809T212418.
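
To illustrate the naming idea, here is a minimal sketch of how such a name could be derived from a creation time. This is not something Graylog offers; the function name `timestamped_index_name` and the `YYYYMMDDTHHMMSS` layout are assumptions made for the example.

```python
from datetime import datetime, timezone


def timestamped_index_name(prefix, created_at):
    """Build an index name from a prefix and a timezone-aware creation time.

    Hypothetical helper: the name layout is an assumption, not a Graylog feature.
    """
    # Normalise to UTC so names sort chronologically regardless of server timezone.
    utc = created_at.astimezone(timezone.utc)
    return f"{prefix}_{utc.strftime('%Y%m%dT%H%M%S')}"


created = datetime(2018, 9, 18, 21, 24, 18, tzinfo=timezone.utc)
print(timestamped_index_name("graylog", created))
```

Names built this way sort in creation order as plain strings, which is what would make it easy to see which period an index covers.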


(Markuchi) #2

To add to this: we are up to index graylog_194, yet we are seeing some messages right now going into graylog_13, which was the active write index on the 20th of June.


(Markuchi) #3

Bump. Does anyone know why new logs might be going into old indexes, and how to stop this?
Can we make indexes read-only once they are no longer the active write index?


(Philipp Ruland) #4

Heyo :slight_smile:

Well, AFAIK Graylog marks any index that is not the active write index as read-only.
Do you have any logs that show newer data being deleted along with older indices?

I think there is a GitHub issue somewhere talking about this, but I’m not sure. Have a look at the Graylog GitHub Issues and give that issue a bump (or open a new one if you cannot find it). This is your best bet to get something like this implemented :slight_smile:

Greetings,
Philipp


(Markuchi) #5

I can see in Elasticsearch that some of our indexes do not have the following set:

    "blocks" : {
      "write" : "true",
      "metadata" : "false",
      "read" : "false"
    }

The majority of the indexes have this set; however, I have found that one of the old indexes does not.
I guess it's safe to add the missing settings.

However, I would imagine Graylog should by default periodically check that old indexes are read-only and set this if it is missing?
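
A check like this can also be scripted by hand. The sketch below is an assumption about how one might do it, not Graylog behaviour: it takes the parsed JSON of an Elasticsearch `GET /graylog_*/_settings` response and lists every index other than the active write index whose `blocks.write` is not `"true"`. The function name and the sample data are made up for illustration.

```python
def indices_missing_write_block(settings_response, active_write_index):
    """Return names of non-active indices that have no write block set."""
    missing = []
    for name, body in sorted(settings_response.items()):
        if name == active_write_index:
            continue  # the active write index must stay writable
        blocks = body.get("settings", {}).get("index", {}).get("blocks", {})
        # Elasticsearch returns setting values as strings, e.g. "true".
        if str(blocks.get("write", "false")).lower() != "true":
            missing.append(name)
    return missing


# Illustrative sample shaped like a GET /graylog_*/_settings response.
sample = {
    "graylog_13": {"settings": {"index": {"blocks": {"write": "true"}}}},
    "graylog_126": {"settings": {"index": {"number_of_shards": "4"}}},
    "graylog_194": {"settings": {"index": {"number_of_shards": "4"}}},
}
print(indices_missing_write_block(sample, "graylog_194"))
```

Any index this reports is still writable and could still be receiving messages.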

I've raised an issue on GitHub for the index naming.


(Markuchi) #6

I did some digging and it looks like 3 different indexes were not made read-only.
Normally the logs would contain “Flushed and set <graylog_183> to read-only”; however, for these 3 indexes this entry was missing.

I ran the following to fix the 3 affected indexes:

    curl -X PUT "localhost:9200/graylog_126,graylog_181,graylog_182/_settings" -H 'Content-Type: application/json' -d'
    {
      "blocks" : {
        "write" : "true",
        "metadata" : "false",
        "read" : "false"
      }
    }'
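
For anyone who would rather script this than type the curl command, here is a rough Python equivalent using only the standard library. It is a sketch under assumptions: the helper name `build_write_block_request` is made up, the base URL assumes Elasticsearch on localhost:9200 as in the command above, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_write_block_request(index_names, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT _settings request that blocks writes."""
    url = f"{base_url}/{','.join(index_names)}/_settings"
    # Same settings body as the curl command: block writes, allow reads and metadata.
    body = {"blocks": {"write": "true", "metadata": "false", "read": "false"}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_write_block_request(["graylog_126", "graylog_181", "graylog_182"])
print(req.get_method(), req.full_url)
# To apply the change, send it with: urllib.request.urlopen(req)
```

Keeping the send step separate makes it possible to review the URL and body before touching the cluster.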

(Philipp Ruland) #7

Well, then you should open an issue in the Graylog GitHub Issues to get this checked and resolved :slight_smile:

Greetings,
Philipp


(Markuchi) #8

I've raised an issue on GitHub as well. Thanks.

