Input not receiving any new messages

Hi. I’m new to Graylog and trying to work my way through this.

Last week, I was having issues with Elasticsearch filling up, so I deleted the home volume and expanded the root volume. Since then, I have not been receiving any new messages. Can anyone help me through this issue? I’m using 3.8. I’m not sure which logs you would need to see, but I can provide them.

Hello @zrevans826, welcome!

Are the core services (elasticsearch, mongod, graylog-server) running?
Are the inputs you’re expecting to receive messages through running?
Are your streams started?
Do you see queueing in the in/out buffers or the disk journal?
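If it helps, on a systemd-based install (service names assumed to be the defaults) you can check all three core services from the shell with something like:

sudo systemctl status elasticsearch mongod graylog-server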

Hi. Yes, those are running, except that when I just checked the status of graylog-server, I am getting this error: “Failed to load class ‘org.slf4j.impl.StaticLoggerBinder’”.

My inputs show as running.

Streams look like so.

In/Out Buffer looks like so…

I see 352 in / 0 out in the second image. Is your disk journal filling up?

It’s not giving me any errors indicating that it is. The only notification I have is about an outdated version of Graylog.

Right below the buffers in the second image there’s “Disk journal”. You would see it there.

Basically I’m trying to determine if anything is being indexed to help isolate this to a problem with Graylog or Elasticsearch.

Also, on the inputs page, do you see traffic being received by the pertinent input?

I see. No, it does not look like the journal is filling, though that was the issue I was having before I deleted the home partition.

I only have 1 input so far (I just started about 3 weeks ago).

The number of unprocessed messages in the journal is concerning. I honestly don’t know what to make of such a large negative number of messages in the journal; it suggests that something is confused. @jan @aaronsachs @tmacgbay @shoothub, do you have any input on that?

@zrevans826, is there anything new showing up in your all messages stream? On the indices page, for the index to which these messages should be routed, what does it show for most recent message? How old is it?

Are there any errors in your Graylog log file?

Follow-up: what do the contents of /var/lib/graylog-server/journal look like? We may try flushing all of the messages (clearing/resetting the disk journal) to see if messages start processing again. This means you will lose everything that has not yet been indexed. You will have to stop the graylog-server service to perform that task.

Here it is.

If you don’t have an issue with resetting the journal, stop the graylog-server service and delete the contents of /var/lib/graylog-server/journal.

Then, start graylog-server and evaluate.
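As a rough sketch, assuming a systemd-based install and the default journal path mentioned above, that would look something like:

sudo systemctl stop graylog-server
sudo rm -rf /var/lib/graylog-server/journal/*
sudo systemctl start graylog-server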

I completed that, but it’s still not receiving any messages.

The disk journal is still showing a large number of unprocessed messages.

Here is what my graylog-server.log says for today.

2020-11-30T13:29:56.164-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogUDPInput{title=syslog_udp, type=org.graylog2.inputs.syslog.udp.SyslogUDPInput, nodeId=79bd119d-aca8-483a-8988-fcb31b02867e} (channel [id: 0x0d5dc0a6, L:/0:0:0:0:0:0:0:0%0:1514]) should be 262144 but is 425984.
2020-11-30T13:29:56.164-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input GELFUDPInput{title=nxlog_udp, type=org.graylog2.inputs.gelf.udp.GELFUDPInput, nodeId=79bd119d-aca8-483a-8988-fcb31b02867e} (channel [id: 0x52015045, L:/0:0:0:0:0:0:0:0%0:3514]) should be 262144 but is 425984.
2020-11-30T13:29:56.165-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogUDPInput{title=syslog_udp, type=org.graylog2.inputs.syslog.udp.SyslogUDPInput, nodeId=79bd119d-aca8-483a-8988-fcb31b02867e} (channel [id: 0x72862052, L:/0:0:0:0:0:0:0:0%0:1514]) should be 262144 but is 425984.
2020-11-30T13:29:56.164-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogUDPInput{title=Cisco ASA, type=org.graylog2.inputs.syslog.udp.SyslogUDPInput, nodeId=null} (channel [id: 0x2a4264e8, L:/0:0:0:0:0:0:0:0%0:5341]) should be 262144 but is 425984.
2020-11-30T13:29:56.165-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input GELFUDPInput{title=nxlog_udp, type=org.graylog2.inputs.gelf.udp.GELFUDPInput, nodeId=79bd119d-aca8-483a-8988-fcb31b02867e} (channel [id: 0x5c1aa3e6, L:/0:0:0:0:0:0:0:0%0:3514]) should be 262144 but is 425984.
2020-11-30T13:29:56.165-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogUDPInput{title=Cisco ASA, type=org.graylog2.inputs.syslog.udp.SyslogUDPInput, nodeId=null} (channel [id: 0xa34a3a66, L:/0:0:0:0:0:0:0:0%0:5341]) should be 262144 but is 425984.
2020-11-30T13:29:56.165-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogUDPInput{title=syslog_udp, type=org.graylog2.inputs.syslog.udp.SyslogUDPInput, nodeId=79bd119d-aca8-483a-8988-fcb31b02867e} (channel [id: 0x75959ce6, L:/0:0:0:0:0:0:0:0%0:1514]) should be 262144 but is 425984.
2020-11-30T13:29:56.165-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input GELFUDPInput{title=nxlog_udp, type=org.graylog2.inputs.gelf.udp.GELFUDPInput, nodeId=79bd119d-aca8-483a-8988-fcb31b02867e} (channel [id: 0x34f588d5, L:/0:0:0:0:0:0:0:0%0:3514]) should be 262144 but is 425984.
2020-11-30T13:29:56.165-05:00 WARN [AbstractTcpTransport] receiveBufferSize (SO_RCVBUF) for input Beats2Input{title=Beats, type=org.graylog.plugins.beats.Beats2Input, nodeId=null} (channel [id: 0x1bb8b779, L:/0:0:0:0:0:0:0:0%0:5044]) should be 1048576 but is 425984.
2020-11-30T13:29:56.166-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input GELFUDPInput{title=nxlog_udp, type=org.graylog2.inputs.gelf.udp.GELFUDPInput, nodeId=79bd119d-aca8-483a-8988-fcb31b02867e} (channel [id: 0x29c75722, L:/0:0:0:0:0:0:0:0%0:3514]) should be 262144 but is 425984.
2020-11-30T13:29:56.166-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogUDPInput{title=Cisco ASA, type=org.graylog2.inputs.syslog.udp.SyslogUDPInput, nodeId=null} (channel [id: 0xa321863a, L:/0:0:0:0:0:0:0:0%0:5341]) should be 262144 but is 425984.
2020-11-30T13:29:56.165-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogUDPInput{title=Cisco ASA, type=org.graylog2.inputs.syslog.udp.SyslogUDPInput, nodeId=null} (channel [id: 0x282c3459, L:/0:0:0:0:0:0:0:0%0:5341]) should be 262144 but is 425984.
2020-11-30T13:29:56.165-05:00 WARN [UdpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogUDPInput{title=syslog_udp, type=org.graylog2.inputs.syslog.udp.SyslogUDPInput, nodeId=79bd119d-aca8-483a-8988-fcb31b02867e} (channel [id: 0xf4018082, L:/0:0:0:0:0:0:0:0%0:1514]) should be 262144 but is 425984.
2020-11-30T13:29:56.167-05:00 INFO [InputStateListener] Input [Syslog UDP/5fa0329a6604fc1a29e78847] is now RUNNING
2020-11-30T13:29:58.322-05:00 WARN [Messages] Retrying 51 messages, because their indices are blocked with status [read-only / allow delete]
2020-11-30T13:30:00.292-05:00 WARN [Messages] Retrying 500 messages, because their indices are blocked with status [read-only / allow delete]
2020-11-30T13:30:01.119-05:00 WARN [Messages] Retrying 500 messages, because their indices are blocked with status [read-only / allow delete]
2020-11-30T13:30:02.344-05:00 WARN [Messages] Retrying 500 messages, because their indices are blocked with status [read-only / allow delete]
2020-11-30T13:34:23.330-05:00 WARN [LicenseChecker] License violation - Detected irregular traffic records

Your Elasticsearch indices are read-only. Are there any errors in the Elasticsearch logs?

If not, we can try setting them back to allow writes.
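To confirm the block first (assuming Elasticsearch is listening on localhost:9200), you can check the block settings and cluster health with something like:

curl -s "localhost:9200/_all/_settings/index.blocks.*?pretty"
curl -s "localhost:9200/_cluster/health?pretty"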

This is what I see in .current:

Desired survivor size 17891328 bytes, new threshold 6 (max 6)

  - age 1: 309752 bytes, 309752 total
    : 314513K->479K(314560K), 0.0118425 secs] 726308K->429759K(1013632K), 0.0119893 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]
    2020-11-30T13:37:58.063-0500: 597857.421: Total time for which application threads were stopped: 0.0124783 seconds, Stopping threads took: 0.0000787 seconds
    2020-11-30T13:38:14.070-0500: 597873.428: Total time for which application threads were stopped: 0.0003141 seconds, Stopping threads took: 0.0000742 seconds
    2020-11-30T13:38:33.077-0500: 597892.435: Total time for which application threads were stopped: 0.0003228 seconds, Stopping threads took: 0.0000779 seconds
    2020-11-30T13:38:34.078-0500: 597893.436: Total time for which application threads were stopped: 0.0006645 seconds, Stopping threads took: 0.0000526 seconds
    2020-11-30T13:38:35.081-0500: 597894.439: Total time for which application threads were stopped: 0.0003457 seconds, Stopping threads took: 0.0000529 seconds
    2020-11-30T13:38:54.088-0500: 597913.446: Total time for which application threads were stopped: 0.0003934 seconds, Stopping threads took: 0.0000793 seconds
    2020-11-30T13:38:56.092-0500: 597915.450: Total time for which application threads were stopped: 0.0003115 seconds, Stopping threads took: 0.0000740 seconds
    2020-11-30T13:39:00.096-0500: 597919.454: Total time for which application threads were stopped: 0.0003017 seconds, Stopping threads took: 0.0000551 seconds
    2020-11-30T13:39:14.099-0500: 597933.457: Total time for which application threads were stopped: 0.0003184 seconds, Stopping threads took: 0.0000771 seconds
    2020-11-30T13:39:30.108-0500: 597949.466: Total time for which application threads were stopped: 0.0003428 seconds, Stopping threads took: 0.0000731 seconds
    2020-11-30T13:39:49.122-0500: 597968.479: Total time for which application threads were stopped: 0.0003137 seconds, Stopping threads took: 0.0000741 seconds
    2020-11-30T13:40:03.130-0500: 597982.488: Total time for which application threads were stopped: 0.0002666 seconds, Stopping threads took: 0.0000587 seconds
    2020-11-30T13:40:33.146-0500: 598012.504: Total time for which application threads were stopped: 0.0002235 seconds, Stopping threads took: 0.0000562 seconds
    2020-11-30T13:40:34.147-0500: 598013.505: Total time for which application threads were stopped: 0.0002290 seconds, Stopping threads took: 0.0000520 seconds
    2020-11-30T13:41:00.153-0500: 598039.511: Total time for which application threads were stopped: 0.0002542 seconds, Stopping threads took: 0.0000474 seconds
    2020-11-30T13:41:05.155-0500: 598044.513: Total time for which application threads were stopped: 0.0003047 seconds, Stopping threads took: 0.0000755 seconds
    2020-11-30T13:41:30.160-0500: 598069.518: Total time for which application threads were stopped: 0.0002678 seconds, Stopping threads took: 0.0000626 seconds
    2020-11-30T13:44:24.304-0500: 598243.662: Total time for which application threads were stopped: 0.0003928 seconds, Stopping threads took: 0.0001333 seconds
    2020-11-30T13:44:33.311-0500: 598252.668: Total time for which application threads were stopped: 0.0002901 seconds, Stopping threads took: 0.0000646 seconds
    2020-11-30T13:46:33.368-0500: 598372.726: Total time for which application threads were stopped: 0.0004696 seconds, Stopping threads took: 0.0001045 seconds
    2020-11-30T13:46:53.270-0500: 598392.627: [GC (Allocation Failure) 2020-11-30T13:46:53.270-0500: 598392.627: [ParNew
    Desired survivor size 17891328 bytes, new threshold 6 (max 6)
  - age 1: 277248 bytes, 358728 total
  - age 2: 81480 bytes, 358728 total
    : 280095K->436K(314560K), 0.0114550 secs] 709375K->429715K(1013632K), 0.0116189 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]

That just looks like GC activity; if the JVM isn’t stopping/restarting, then I don’t think it’s related.

Have you tried cycling the active write index? If not, give that a try. If that doesn’t work, try clearing the global read-only block.

Assuming ES is running on the Graylog node and only serves Graylog:
curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d '{ "index.blocks.read_only_allow_delete": null }'
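Elasticsearch usually applies that read_only_allow_delete block when the data disk crosses the flood-stage watermark, so it’s also worth confirming there is free space again (path assumed to be the default data directory):

df -h /var/lib/elasticsearch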

I’m finally starting to see both incoming and outgoing data. Right now it is showing 154 in and 8,500 out, but it’s still showing unprocessed messages.

Utilization: 4.83%

158,362 unprocessed messages are currently in the journal, in 3 segments.
277 messages have been appended in the last second, 12,910 messages have been read in the last second.


What change did you make?

That’s going to be the backlog processing through if you just made a change to allow writes.

I cycled the active write index and then ran the “index.blocks.read_only_allow_delete” command you mentioned above. I then restarted graylog-server.

OK. Are you still seeing that negative number for the journal size? It sounds like the backlogged messages are clearing, but I’m still a bit confused about that. Did you happen to resize the filesystem while the Graylog and Elasticsearch services were running?

Here’s my journal

Elasticsearch was running (oops), but I stopped graylog-server.