Negative number of unprocessed messages

Hello,

I have three Graylog (3.1.2) nodes running on Kubernetes. Under unexpected load, the disk journals filled up (and some messages were deleted); afterwards, utilization dropped back to ~70% with ~5,000,000 messages in each node's journal.

I increased capacity by replacing the nodes, and processing sped up for a short time. A little later, however, one of the journals reported -454,097,877 unprocessed messages, while the other two had only a few hundred messages in their journals.

What does this mean?

I replaced the nodes one by one again, but the numbers stayed the same.

The negative number is slowly increasing.
It seems the messages in the journals have been lost.
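For reference, the counters above are what each node reports over its REST API. This is roughly how I am pulling them (the node address and credentials are placeholders, and the field names are my best guess at what GET /api/system/journal returns):

import requests

# Placeholder node address and credentials - adjust for your deployment.
NODE = "http://graylog-0.graylog.svc:9000"
AUTH = ("admin", "password")

# Each node reports its own journal state under /api/system/journal.
resp = requests.get(f"{NODE}/api/system/journal", auth=AUTH)
resp.raise_for_status()
journal = resp.json()

# Field names assumed from what the web interface's journal widget shows.
print("uncommitted entries:", journal.get("uncommitted_journal_entries"))
print("journal size:", journal.get("journal_size"), "of", journal.get("journal_size_limit"))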

Tiger

@Tiger It looks like your Elasticsearch hosts are not accepting messages for some reason; reviewing the ES logs should tell you why. Also check the journal directory utilization; if it is full, you need to free up some space there. Please see the FAQ documentation below for reference.
https://docs.graylog.org/en/latest/pages/faq.html#what-does-uncommited-messages-deleted-from-journal-mean
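If you want a quick way to confirm whether Elasticsearch is accepting writes, something like the sketch below can help (the ES address is a placeholder). A red cluster status, or growing rejected counts on the write thread pool, would explain why the journal keeps filling up:

import requests

# Placeholder Elasticsearch address - adjust for your cluster.
ES = "http://elasticsearch:9200"

# Cluster health: a "red" status usually means indexing is failing.
health = requests.get(f"{ES}/_cluster/health").json()
print("cluster status:", health["status"])

# Rejected write requests are another common reason Graylog cannot drain
# its journal fast enough (the pool is named "bulk" on older ES versions).
pools = requests.get(
    f"{ES}/_cat/thread_pool/write",
    params={"v": "true", "h": "node_name,active,queue,rejected"},
).text
print(pools)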

Hope this helps you :slight_smile:

I found a warning on the Graylog node that may be related to this issue:
2020-08-07 20:47:54,158 WARN [AbstractTcpTransport] - receiveBufferSize (SO_RCVBUF) for input GELFTCPInput{title=Gelf_TCP_Graylog, type=org.graylog2.inputs.gelf.tcp.GELFTCPInput, nodeId=null} (channel [id: 0xa27b08c9, L:/0.0.0.0:12201]) should be 1048576 but is 425984. - {}
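From what I understand, that warning just means the input asked for a 1 MB receive buffer but the kernel capped it at net.core.rmem_max, so it is probably not the cause of the negative counter by itself. A quick way to check the cap on the node (raising it, e.g. with sysctl -w net.core.rmem_max=1048576, makes the warning go away):

# Check the kernel cap that limits SO_RCVBUF on Linux.
# On Kubernetes this reflects the host node's setting, not the pod's.
with open("/proc/sys/net/core/rmem_max") as f:
    rmem_max = int(f.read())

wanted = 1048576  # what the GELF TCP input requested
print(f"net.core.rmem_max = {rmem_max}")
if rmem_max < wanted:
    print(f"cap is below {wanted}; raise it with: sysctl -w net.core.rmem_max={wanted}")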

It seems the bad node was only taking in messages without processing them, and I could not figure out how to fix it. In the non-prod environment, turning on graylog.journal.deleteBeforeStart and restarting the node brought it back to normal.
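For anyone else on a Helm-based deployment: assuming the stable/graylog chart, that setting can presumably be toggled as a chart value (e.g. --set graylog.journal.deleteBeforeStart=true on helm upgrade) before restarting the pod. Keep in mind it wipes whatever is still in the journal, so it is only an option when losing those messages is acceptable.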
