Delete old Indexer failures

Hello,

We had a problem with our Elasticsearch cluster about two years ago, which resulted in approximately 1.5 million indexer failures. The index has already been deleted (retention), but the failures still exist. How can we get rid of them? We don’t need that information any longer, and the API call /api/system/indexer/failures shows them all, which makes the call very slow.

I hope it’s possible to get rid of them?

Best regards,
Christoph

Hi Christoph,

There’s currently no way to delete them in the Graylog web interface, but you can delete the documents in the index_failures collection in MongoDB.

See https://docs.mongodb.com/v3.4/tutorial/remove-documents/ for details.

Hello Jochen,

Thanks for the quick reply! Can I safely remove all the unneeded entries in the MongoDB collection index_failures? With Graylog running, or should I shut it down first?

Best regards,
Christoph

You can remove them while Graylog is running.

Hello Jochen,

It’s not that easy …

> db.index_failures.remove({ index : "graylog2_8" })
WriteResult({
        "nRemoved" : 0,
        "writeError" : {
                "code" : 20,
                "errmsg" : "cannot remove from a capped collection: graylog2.index_failures"
        }
})

So, I had to dump the collection from MongoDB, drop index_failures and restore the index_failures collection without data:

Dump the collection
/usr/bin/mongodump --out /tmp/backup/ --host localhost --port 27017 --db graylog2 --collection index_failures --username <user>

Delete the data file
/tmp/backup/graylog2/index_failures.bson

Drop Collection
db.index_failures.drop()

Restore index_failures collection without data
/usr/bin/mongorestore --host localhost --port 27017 --db graylog2 --username <user> /tmp/backup/graylog2/

Now no index failures are visible any more in the Graylog interface or the API.
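
The dump, drop and restore steps above could probably also be done directly in the mongo shell, since a capped collection does not allow documents to be removed but can be dropped and recreated. The following is only a minimal sketch under assumptions: the helper name recreateCappedCollection is made up for illustration, it relies on the shell helpers db.getCollectionInfos(), drop() and db.createCollection(), and it does not recreate any secondary indexes the collection may have had.

```javascript
// Hypothetical helper (not part of Graylog or MongoDB): empties a capped
// collection by dropping it and recreating it with the same options.
function recreateCappedCollection(db, name) {
  // getCollectionInfos returns an array of { name, options, ... } documents.
  const infos = db.getCollectionInfos({ name: name });
  if (infos.length === 0) {
    throw new Error("collection not found: " + name);
  }
  const options = infos[0].options || {};
  if (!options.capped) {
    throw new Error("collection is not capped: " + name);
  }
  db.getCollection(name).drop();
  // Reuse the original capped/size/max options so the limits stay the same.
  db.createCollection(name, options);
  return options;
}

// Usage in the mongo shell, after "use graylog2":
//   recreateCappedCollection(db, "index_failures")
```

Taking a mongodump of the collection first, as in the steps above, keeps a way back if the recreated collection turns out to be wrong.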

Thanks for your help.
Christoph

Hi KeX,

Did you have to stop or restart Graylog before or after restoring the index_failures collection?

Best regards,

Arnaud

Hello Arnaud,

No, I didn’t restart Graylog for this. In my case, one replica was corrupt. I just reduced the number of replicas to one and then increased it again.

Hope that helps.

Christoph

The mongo shell shows an error message when I try to remove all documents:

Graylog:PRIMARY> db.index_failures.remove({})
WriteResult({
        "nRemoved" : 0,
        "writeError" : {
                "code" : 20,
                "errmsg" : "cannot remove from a capped collection: graylog.index_failures"
        }
})

It looks like graylog.index_failures is a capped collection.
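
Whether a collection is capped can be checked from the mongo shell before attempting a remove. A minimal sketch, assuming the shell helpers isCapped() and db.getCollectionInfos() are available; the function name cappedInfo is made up for illustration:

```javascript
// Hypothetical helper: reports whether a collection is capped and, if the
// collection options expose them, its size (bytes) and max (document count).
function cappedInfo(db, name) {
  const infos = db.getCollectionInfos({ name: name });
  const options = infos.length > 0 ? (infos[0].options || {}) : {};
  return {
    capped: db.getCollection(name).isCapped(),
    size: options.size,
    max: options.max
  };
}

// Usage in the mongo shell, after switching to the Graylog database:
//   printjson(cappedInfo(db, "index_failures"))
```

If capped comes back as true, remove() will fail with error code 20 as shown above.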