Index rotation failure

Hello. I’m running Graylog 3.1.0 on a test system, and apparently it ran into problems rotating its indices last Friday. The problem persists; here’s a sample from the current log (server.log):

2019-09-09T08:22:22.890+02:00 WARN [IndexRotationThread] Deflector is pointing to [firewall-1_10], not the newest one: [firewall-1_11]. Re-pointing.
2019-09-09T08:22:22.902+02:00 ERROR [IndexRotationThread] Couldn't point deflector to a new index
org.graylog2.indexer.ElasticsearchException: Couldn't switch alias firewall-1_deflector from index firewall-1_10 to index firewall-1_11

blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];
at org.graylog2.indexer.cluster.jest.JestUtils.specificException( ~[graylog.jar:?]
at org.graylog2.indexer.cluster.jest.JestUtils.execute( ~[graylog.jar:?]
at org.graylog2.indexer.cluster.jest.JestUtils.execute( ~[graylog.jar:?]
at org.graylog2.indexer.indices.Indices.cycleAlias( ~[graylog.jar:?]
at org.graylog2.indexer.MongoIndexSet.pointTo( ~[graylog.jar:?]
at org.graylog2.periodical.IndexRotationThread.checkAndRepair( ~[graylog.jar:?]
at org.graylog2.periodical.IndexRotationThread.lambda$doRun$0( ~[graylog.jar:?]
at java.lang.Iterable.forEach( [?:1.8.0_222]
at org.graylog2.periodical.IndexRotationThread.doRun( [graylog.jar:?]
at [graylog.jar:?]
at java.util.concurrent.Executors$ [?:1.8.0_222]
at java.util.concurrent.FutureTask.runAndReset( [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301( [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor.runWorker( [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$ [?:1.8.0_222]
at [?:1.8.0_222]

This type of error applies to all six indices on the system: the four standard indices and two that I defined myself.

What can I do to remedy the situation? Index rotation has worked until now…



I need to correct my previous statement: only the two indices that I defined myself are affected. At least, I can’t find “Couldn’t switch alias” messages for the other four in the server log.

I am, however, also seeing errors like the following:

2019-09-09T08:38:25.191+02:00 WARN [Messages] Failed to index message: index=<graylog_6> id=<6ca513f0-d2cc-11e9-a445-005056842309> error=<{"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}>
2019-09-09T08:38:25.191+02:00 WARN [Messages] Failed to index message: index=<graylog_6> id=<6ca513f1-d2cc-11e9-a445-005056842309> error=<{"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}>
2019-09-09T08:38:25.191+02:00 ERROR [Messages] Failed to index [2] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.

I think I’ve found the web interface’s index error log, but that doesn’t really tell me more than the server.log entries. These are the most current entries:

Timestamp Index Letter ID Error message
a few seconds ago graylog_6 e0e7a6b1-d2cc-11e9-a445-005056842309 {"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}
a few seconds ago graylog_6 e0e7cdc0-d2cc-11e9-a445-005056842309 {"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}
a few seconds ago graylog_6 e0e842f0-d2cc-11e9-a445-005056842309 {"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}
a few seconds ago graylog_6 e0e89110-d2cc-11e9-a445-005056842309 {"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}

I think I may know the cause of my problem: available disk space had dropped below 15%. I have deleted the oldest index, and used disk space is now down to 79%, so Elasticsearch should be fine. Simply restarting Elasticsearch and Graylog doesn’t make the problem go away, however. Any hints?
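For anyone hitting the same symptom, two read-only queries may help confirm the state before changing anything. This is only a sketch: it assumes Elasticsearch is reachable on localhost:9200 without authentication, and the function names and the ES_URL variable are made up for illustration.

```shell
# Assumption: Elasticsearch listens on localhost:9200 with no auth or TLS.
ES_URL="${ES_URL:-localhost:9200}"

# Print disk usage per data node (see the disk.percent column).
show_disk_usage() {
  curl -s "$ES_URL/_cat/allocation?v"
}

# Print the read_only_allow_delete setting for every index that has it set.
show_read_only_blocks() {
  curl -s "$ES_URL/_all/_settings/index.blocks.read_only_allow_delete?pretty"
}
```

Calling show_disk_usage and then show_read_only_blocks should show whether the disk is still tight and which indices still carry the block. Neither call modifies anything.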

OK, problem solved, hopefully. I reset the read-only flag on all indices with

curl -X PUT "localhost:9200/_all/_settings?pretty" -H 'Content-Type: application/json' -d '
{ "index.blocks.read_only_allow_delete": null }
'

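And that appears to have done the trick. From what I gather, ES marks indices read-only (read_only_allow_delete) once disk usage on a node exceeds the flood-stage watermark, 95% by default, rather than the low watermark. It did not lift the block on its own after I freed up space, which is why the restarts didn’t help.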
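To see which disk thresholds the cluster is actually using, and to clear the block on one index set rather than on everything, something along these lines might work. It is a sketch under the same assumptions as the thread (Elasticsearch on localhost:9200, no authentication). The function names and ES_URL are hypothetical, and firewall-1_* is simply the index prefix from the log excerpt above.

```shell
# Assumption: Elasticsearch listens on localhost:9200 with no auth or TLS.
ES_URL="${ES_URL:-localhost:9200}"

# Print the effective disk watermark settings, including built-in defaults.
show_watermarks() {
  curl -s "$ES_URL/_cluster/settings?include_defaults=true&flat_settings=true&pretty" \
    | grep 'disk.watermark'
}

# Remove the read-only block from the indices matching $1.
# $1: index name or wildcard pattern, e.g. 'firewall-1_*'
clear_read_only_block() {
  curl -s -X PUT "$ES_URL/$1/_settings?pretty" \
    -H 'Content-Type: application/json' \
    -d '{"index.blocks.read_only_allow_delete": null}'
}
```

Usage would be show_watermarks first, then clear_read_only_block 'firewall-1_*' once enough disk space has been freed. Quote the pattern so the shell does not expand it. If the disk is still above the flood-stage threshold, the block will probably be reapplied.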

