Stuck at Graylog is Restarting

Hello All,

I'm running Graylog 2.2 from the OVA. I ran the sudo graylog-ctl set-admin-password command, which completed without errors, but after a reboot the web page is stuck at "Graylog is restarting".

I even ran graylog-ctl reconfigure…to no avail.

Ideas?

Thanks

TP

Does the problem repeat after rebooting the virtual machine?

Yes, it does…I rebooted a few times. When I SSH into the system, graylog-ctl status shows everything running.

Are there any error messages in the system logs of the virtual machine?

Check http://docs.graylog.org/en/2.2/pages/configuration/file_location.html#omnibus-package for the correct file locations.

How can one get there on the OVA with the ubuntu user? I'm trying to change directories, and it keeps saying "permission denied" when I get to the server folder.

You can change to the super user (root) with the following command:

$ sudo -i

OK…so under Elasticsearch, the graylog.log shows this:

[2017-03-14 14:27:05,185][WARN ][cluster.action.shard     ] [Green Goblin IV] [graylog_25][2] received shard failed for target shard [[graylog_25][2], node[m$
[2017-03-14 14:27:05,207][WARN ][index.translog           ] [Green Goblin IV] [graylog_25][3] deleted previously created, but not yet committed, next generat$
[2017-03-14 14:27:05,216][WARN ][index.translog           ] [Green Goblin IV] [graylog_25][0] failed to delete temp file /var/opt/graylog/data/elasticsearch/$
java.nio.file.NoSuchFileException: /var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices/graylog_25/0/translog/translog-4114821162271181852.tlog
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:244)
        at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
        at java.nio.file.Files.delete(Files.java:1126)
        at org.elasticsearch.index.translog.Translog.recoverFromFiles(Translog.java:358)
        at org.elasticsearch.index.translog.Translog.<init>(Translog.java:179)
        at org.elasticsearch.index.engine.InternalEngine.openTranslog(InternalEngine.java:205)
        at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:148)
        at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25)
        at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1513)
        at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1497)
        at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:970)
        at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:942)
        at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:241)
        at org.elasticsearch.index.shard.StoreRecoveryService.access$100(StoreRecoveryService.java:56)
        at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:129)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
[2017-03-14 14:27:05,216][WARN ][indices.cluster          ] [Green Goblin IV] [[graylog_25][0]] marking and sending shard failed due to [failed recovery]
[graylog_25][[graylog_25][0]] IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; $
        at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:250)
        at org.elasticsearch.index.shard.StoreRecoveryService.access$100(StoreRecoveryService.java:56)
        at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:129)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: [graylog_25][[graylog_25][0]] EngineCreationFailureException[failed to create engine]; nested: FileSystemException[/var/opt/graylog/data/elasticse$
        at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:152)
        at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25)
        at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1513)
        at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1497)
        at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:970)
        at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:942)
        at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:241)
        ... 5 more
Caused by: java.nio.file.FileSystemException: /var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices/graylog_25/0/translog/translog.ckp -> /var/opt/gray$
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixCopyFile.copyFile(UnixCopyFile.java:253)
        at sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:581)
        at sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:253)
        at java.nio.file.Files.copy(Files.java:1274)
        at org.elasticsearch.index.translog.Translog.recoverFromFiles(Translog.java:344)
        at org.elasticsearch.index.translog.Translog.<init>(Translog.java:179)
        at org.elasticsearch.index.engine.InternalEngine.openTranslog(InternalEngine.java:205)
        at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:148)
        ... 11 more
[2017-03-14 14:27:05,226][WARN ][index.translog           ] [Green Goblin IV] [graylog_26][1] failed to delete temp file /var/opt/graylog/data/elasticsearch/$

There are no other log files in any of the other directories.

Have you manually deleted any of the files in the directory /var/opt/graylog/data/elasticsearch/graylog/?

I have not…but I can…is that the next step?

Thanks

TP

No.

Have you run out of disk space in your virtual machine?

That’s EXACTLY what happened. I just ran df -h and see that the data partition is full.

280 GB have been used up. How do I free up some space to get it back up?

TP
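For anyone who wants to script the same check that df -h performs, here is a minimal Python sketch. It assumes the data lives under /var/opt/graylog/data (the path that appears in the Elasticsearch log above); the function names are made up for illustration, and the percentage is computed against the total size, so it can differ slightly from what df reports.

```python
import os
import shutil

# Assumption: data directory taken from the paths in the Elasticsearch log above.
DATA_DIR = "/var/opt/graylog/data"

GB = 1024 ** 3


def usage_report(path):
    """Return (total_gb, used_gb, free_gb, percent_used) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return usage.total / GB, usage.used / GB, usage.free / GB, percent


def format_report(path):
    """Build a one-line, human-readable summary for path."""
    total, used, free, percent = usage_report(path)
    return (f"{path}: {used:.1f} GB used of {total:.1f} GB "
            f"({percent:.0f}% full), {free:.1f} GB free")


if __name__ == "__main__":
    # Fall back to the current directory when run on a machine without the OVA layout.
    target = DATA_DIR if os.path.exists(DATA_DIR) else "."
    print(format_report(target))
```

If the report shows the filesystem at or near 100%, that would match the failed translog writes in the log above.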

Try removing the directory /var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices/graylog_25 while Elasticsearch is stopped.

Well…I deleted the graylog_25 directory and that got me about 1.5 GB back. I rebooted…and still got the same thing.

Should I delete some more of the indices?

TP
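Before deleting anything else, it may help to see which index directories actually take up the space. The following is a rough sketch, not an official Graylog tool: it only reads sizes and deletes nothing. The indices path is the one from the log output above, the function names are hypothetical, and it most likely has to run as root (see the sudo -i hint earlier in the thread).

```python
import os

# Assumption: indices directory as it appears in the Elasticsearch log above.
INDICES_DIR = "/var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices"


def dir_size(path):
    """Sum the sizes in bytes of all files below path, skipping unreadable ones."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is not readable
    return total


def index_sizes(indices_dir):
    """Return [(index_name, size_in_bytes), ...] sorted largest first."""
    sizes = []
    for entry in os.scandir(indices_dir):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((entry.name, dir_size(entry.path)))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes


if __name__ == "__main__":
    if os.path.isdir(INDICES_DIR):
        for name, size in index_sizes(INDICES_DIR):
            print(f"{name:<20} {size / 1024 ** 3:8.2f} GB")
    else:
        print(f"{INDICES_DIR} not found; adjust INDICES_DIR for your system")
```

The largest entries at the top of the list are the ones worth looking at first. As with graylog_25, index directories should only be removed while Elasticsearch is stopped.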

You can also try removing the (probably corrupted) data files of etcd from /var/opt/graylog/data/etcd/*.

But make sure to check all log files for error and warning messages.
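Reading long logs by hand is tedious, so here is a small sketch that counts WARN and ERROR lines per logger. The regular expression is an assumption based only on the Elasticsearch lines quoted earlier in this thread ([timestamp][LEVEL ][logger ] message); other log files on the appliance may use a different layout and would need their own pattern. The function name is made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumption: line layout copied from the Elasticsearch log excerpt in this thread.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"          # [2017-03-14 14:27:05,185]
    r"\[(?P<level>[A-Z]+)\s*\]"     # [WARN ]
    r"\[(?P<logger>[^\]]+?)\s*\]"   # [index.translog           ]
    r"\s*(?P<msg>.*)$"
)


def summarize(lines, levels=("WARN", "ERROR")):
    """Count log lines per (level, logger) for the given levels."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") in levels:
            counts[(match.group("level"), match.group("logger"))] += 1
    return counts


if __name__ == "__main__":
    # Usage: python3 scan_log.py /path/to/logfile
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            for (level, logger), count in summarize(handle).most_common():
                print(f"{count:6d}  {level:<5}  {logger}")
```

Stack trace lines (the ones starting with "at ...") do not match the pattern and are ignored, so the counts reflect only the timestamped entries.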