I am still reading over the post you sent, @drewmiranda-gl. In the meantime, here is what I have from the logs.
The Graylog server log says:
2023-06-21T20:42:00.023-09:00 ERROR [MessagesAdapterES7] Failed to index [3] messages. Please check the index error log in your web interface for the reason. Error: failure in bulk execution:
[0]: index [graylog_39], type [_doc], id [2f0b5aab-1052-11ee-8e15-005056a99d43], message [ElasticsearchException[Elasticsearch exception [type=unavailable_shards_exception, reason=[graylog_39][3] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[graylog_39][3]] containing [3] requests]]]]
[1]: index [graylog_39], type [_doc], id [2f0bcfd7-1052-11ee-8e15-005056a99d43], message [ElasticsearchException[Elasticsearch exception [type=unavailable_shards_exception, reason=[graylog_39][3] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[graylog_39][3]] containing [3] requests]]]]
[2]: index [graylog_39], type [_doc], id [2f8b6019-1052-11ee-8e15-005056a99d43], message [ElasticsearchException[Elasticsearch exception [type=unavailable_shards_exception, reason=[graylog_39][3] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[graylog_39][3]] containing [3] requests]]]]
Elasticsearch errors:
[2023-06-21T07:39:24,219][WARN ][o.e.c.r.a.AllocationService] [elastic-01-in-prod] failing shard [failed shard, shard [graylog_39][3], node[POHn_aN0R-CE7kteDGVRfA], [P], s[STARTED], a[id=KnnvS0goSM2tlczol_G5Rg], message [shard failure, reason [lucene commit failed]], failure [NoSuchFileException[/var/lib/elasticsearch/nodes/0/indices/utxx9Zc_TKaeZqRwpTRkfw/3/index/_l_Lucene84_0.tim]], markAsStale [true]]
java.nio.file.NoSuchFileException: /var/lib/elasticsearch/nodes/0/indices/utxx9Zc_TKaeZqRwpTRkfw/3/index/_l_Lucene84_0.tim
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:224) ~[?:?]
at java.nio.channels.FileChannel.open(FileChannel.java:308) ~[?:?]
at java.nio.channels.FileChannel.open(FileChannel.java:367) ~[?:?]
at org.apache.lucene.util.IOUtils.fsync(IOUtils.java:469) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.store.FSDirectory.fsync(FSDirectory.java:331) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.store.FSDirectory.sync(FSDirectory.java:286) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.store.FilterDirectory.sync(FilterDirectory.java:84) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.store.FilterDirectory.sync(FilterDirectory.java:84) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.store.LockValidatingDirectoryWrapper.sync(LockValidatingDirectoryWrapper.java:68) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:5099) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3460) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3770) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3728) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.elasticsearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2793) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.engine.InternalEngine.flush(InternalEngine.java:2075) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.shard.IndexShard.flush(IndexShard.java:1432) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.shard.IndexShard$8.doRun(IndexShard.java:3818) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-7.17.10.jar:7.17.10]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
at java.lang.Thread.run(Thread.java:1623) [?:?]
[2023-06-21T07:39:24,418][INFO ][o.e.c.r.a.AllocationService] [elastic-01-in-prod] Cluster health status changed from [GREEN] to [RED] (reason: [shards failed [[graylog_39][3]]]).
[2023-06-21T07:39:25,682][WARN ][o.e.c.r.a.AllocationService] [elastic-01-in-prod] failing shard [failed shard, shard [graylog_39][3], node[POHn_aN0R-CE7kteDGVRfA], [P], recovery_source[existing store recovery; bootstrap_history_uuid=false], s[INITIALIZING], a[id=KnnvS0goSM2tlczol_G5Rg], unassigned_info[[reason=ALLOCATION_FAILED], at[2023-06-21T16:39:24.209Z], failed_attempts[1], delayed=false, details[failed shard on node [POHn_aN0R-CE7kteDGVRfA]: shard failure, reason [lucene commit failed], failure NoSuchFileException[/var/lib/elasticsearch/nodes/0/indices/utxx9Zc_TKaeZqRwpTRkfw/3/index/_l_Lucene84_0.tim]], allocation_status[no_valid_shard_copy]], message [shard failure, reason [corrupt file (source: [start])]], failure [CorruptIndexException[Problem reading index. (resource=/var/lib/elasticsearch/nodes/0/indices/utxx9Zc_TKaeZqRwpTRkfw/3/index/_l_Lucene84_0.tim)]; nested: NoSuchFileException[/var/lib/elasticsearch/nodes/0/indices/utxx9Zc_TKaeZqRwpTRkfw/3/index/_l_Lucene84_0.tim]; ], markAsStale [true]]
org.apache.lucene.index.CorruptIndexException: Problem reading index. (resource=/var/lib/elasticsearch/nodes/0/indices/utxx9Zc_TKaeZqRwpTRkfw/3/index/_l_Lucene84_0.tim)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:144) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:83) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:171) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:213) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.IndexWriter.lambda$getReader$0(IndexWriter.java:571) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:108) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:629) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:121) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:97) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.elasticsearch.index.engine.InternalEngine.createReaderManager(InternalEngine.java:669) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:261) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:199) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:14) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:2064) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog(IndexShard.java:2028) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:472) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:90) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:436) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:88) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:2361) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-7.17.10.jar:7.17.10]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
at java.lang.Thread.run(Thread.java:1623) [?:?]
Caused by: java.nio.file.NoSuchFileException: /var/lib/elasticsearch/nodes/0/indices/utxx9Zc_TKaeZqRwpTRkfw/3/index/_l_Lucene84_0.tim
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:224) ~[?:?]
at java.nio.channels.FileChannel.open(FileChannel.java:308) ~[?:?]
at java.nio.channels.FileChannel.open(FileChannel.java:367) ~[?:?]
at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.elasticsearch.index.store.FsDirectoryFactory$HybridDirectory.openInput(FsDirectoryFactory.java:126) ~[elasticsearch-7.17.10.jar:7.17.10]
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:100) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:100) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:141) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.codecs.lucene84.Lucene84PostingsFormat.fieldsProducer(Lucene84PostingsFormat.java:441) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:315) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:395) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:114) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]
... 25 more
[2023-06-21T07:45:38,828][WARN ][o.e.i.e.Engine ] [elastic-01-in-prod] [graylog_39][2] failed engine [lucene commit failed]
java.nio.file.NoSuchFileException: /var/lib/elasticsearch/nodes/0/indices/utxx9Zc_TKaeZqRwpTRkfw/2/index/_b_Lucene84_0.tim
...
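
Since the log is long and truncated, here is a rough sketch of how the missing segment files could be summarized per shard. This is only an illustration: the function name `missing_files_by_shard` and the regular expression are my own, and the pattern assumes the `NoSuchFileException` messages always carry a path of the form `.../indices/<index-uuid>/<shard>/index/<file>`, as they do in the excerpts above.

```python
import re
from collections import defaultdict

# Assumed shape of the path in the messages above (not an official log format):
#   NoSuchFileException[/.../indices/<index-uuid>/<shard>/index/<file>]
#   NoSuchFileException: /.../indices/<index-uuid>/<shard>/index/<file>
MISSING_FILE = re.compile(
    r"NoSuchFileException[:\[]\s*"
    r"/\S*?/indices/(?P<uuid>[^/]+)/(?P<shard>\d+)/index/(?P<file>[^\]\s]+)"
)


def missing_files_by_shard(log_text):
    """Return {(index_uuid, shard_number): set of missing file names}."""
    found = defaultdict(set)
    for match in MISSING_FILE.finditer(log_text):
        key = (match.group("uuid"), int(match.group("shard")))
        found[key].add(match.group("file"))
    return dict(found)


if __name__ == "__main__":
    # Two lines copied from the Elasticsearch log above.
    sample = (
        "java.nio.file.NoSuchFileException: /var/lib/elasticsearch/nodes/0/"
        "indices/utxx9Zc_TKaeZqRwpTRkfw/3/index/_l_Lucene84_0.tim\n"
        "java.nio.file.NoSuchFileException: /var/lib/elasticsearch/nodes/0/"
        "indices/utxx9Zc_TKaeZqRwpTRkfw/2/index/_b_Lucene84_0.tim\n"
    )
    for (uuid, shard), files in missing_files_by_shard(sample).items():
        print(uuid, shard, sorted(files))
```

Running the full Elasticsearch log text through the function should show at a glance whether shards other than 2 and 3 of this index are reporting missing files; it only reads the log and does not touch the cluster.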