Ran out of disk, now "Failed to acquire lock on file .lock"


(Carl C. Longnecker) #1

I ran out of disk space, so I expanded the partition, grew the file system, confirmed the free space and restarted the server. However, graylog-server will not start.

I get the following error in the error log, followed by a stack trace:

2018-05-29T14:00:54.726-04:00 ERROR [LogManager] There was an error in one of the threads during logs loading: java.lang.IllegalArgumentException
2018-05-29T14:00:54.737-04:00 INFO  [InputBufferImpl] Message journal is enabled.
2018-05-29T14:00:54.743-04:00 ERROR [KafkaJournal] Unable to start logmanager.
kafka.common.KafkaException: Failed to acquire lock on file .lock in /var/lib/graylog-server/journal. A Kafka instance in another process or thread is using this directory.

I have tried deleting the .lock file, but that doesn’t help. What else should I do?

I’m running Graylog 2.4.5 on Ubuntu 16.04.4 LTS.


(Pedro Miguel Pereira Serrano Martins) #2

From the error I see you also have Kafka.
Did you install it on your own, or do you have just a simple Graylog 2 setup?


(Carl C. Longnecker) #3

I installed it using the DEB/APT instructions here: http://docs.graylog.org/en/2.4/pages/installation/operating_system_packages.html


(Jochen) #4

That’s only the Kafka journal implementation used by Graylog as its disk journal.

Make sure that only one instance of Graylog is running.

Additionally, if the journal directory is on its own disk partition, make sure to read this FAQ item:
http://docs.graylog.org/en/2.4/pages/faq.html#dedicated-partition-for-the-journal


(Pedro Miguel Pereira Serrano Martins) #5

Have you tried stopping all the services with sudo systemctl stop (Elasticsearch, MongoDB, and graylog-server)?

If nothing works, once all services are stopped, you can always get the PID of the process holding the file via:

and kill it with kill -9 XXXX, where XXXX is the PID.

Then start the services and try again!
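
The command for finding the PID did not survive in the post above. As a rough sketch only, and not necessarily what was originally suggested, one common way to see which processes have a file open is lsof (assuming it is installed). The helper name lock_holders is made up for illustration; the path in the usage comment comes from the error message in post #1.

```shell
# Hypothetical helper: print the PIDs of processes that have a file open.
# Assumes lsof is installed; -t asks lsof for terse output (PIDs only).
lock_holders() {
    # lsof exits non-zero when nothing has the file open, so ignore the status.
    lsof -t -- "$1" 2>/dev/null || true
}

# Usage (run as root to see processes owned by other users):
#   lock_holders /var/lib/graylog-server/journal/.lock
```

If this prints nothing, no process visible to the current user has the file open, which points to a cause other than a second running instance.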


(Carl C. Longnecker) #6

I have confirmed only one instance of Graylog is running. And using lsof I have confirmed that no other process is using the .lock file.

The journal directory is not on its own disk partition.

But that reminded me of something I read in my prior research: someone mentioned that having any extra files in the journal folder would cause this error. I didn’t have any extra files, but I did rename the journal folder as a test. Graylog starts up correctly now.

So now the question becomes: how can I process the messages in the journal file I moved? Can I just move the log and index files into the correct messagejournal-0 directory?
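
For anyone repeating the rename test described above, here is a minimal sketch of setting the journal directory aside under a timestamped name so nothing is deleted. The function name set_aside_journal and the .bak suffix are arbitrary choices for illustration, and it assumes graylog-server has been stopped first.

```shell
# Hypothetical helper: rename a journal directory out of the way instead of
# deleting it, and print the new location so the files can be inspected later.
set_aside_journal() {
    journal_dir=$1
    backup_dir="${journal_dir}.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$journal_dir" "$backup_dir" && printf '%s\n' "$backup_dir"
}

# Usage (stop the service first; path taken from the error message in post #1):
#   sudo systemctl stop graylog-server
#   set_aside_journal /var/lib/graylog-server/journal
```

Messages still sitting in the directory that was set aside are not processed after the restart, which is exactly the open question here.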


(Carl C. Longnecker) #7

Now that my Graylog server has caught up with the backlog, I decided to try moving the message journal files back into the messagejournal-0 folder and restarting the server. No good. I get exactly the same error about the .lock file. I also tried deleting the .index file and leaving the .log, but got the same error. So there is clearly something wrong with that journal file.

Is there a way to repair the journal files? Or a different way I should be trying to get the data from them?
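
Before trying anything on the files, it may help to record exactly what is in the moved directory. The sketch below only lists the .log and .index files with their sizes in bytes; it does not repair or validate anything. The function name list_segments is made up for illustration.

```shell
# Hypothetical helper: list the .log and .index files in a journal segment
# directory, one per line, as "<size in bytes> <file name>".
list_segments() {
    for f in "$1"/*.log "$1"/*.index; do
        # Skip patterns that matched nothing.
        [ -e "$f" ] || continue
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$(basename "$f")"
    done
}

# Usage (replace the argument with wherever the old messagejournal-0 was moved):
#   list_segments /path/to/old/messagejournal-0
```

A listing like this is also useful to include when asking for help, since the running-out-of-disk event may have left a segment file cut short.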

