Graylog 3.0.1 - io.netty.handler.timeout.ReadTimeoutException


I’m getting lots of these errors after a Graylog upgrade:
2019-05-02T16:55:18.605-07:00 ERROR [AbstractTcpTransport] Error in Input [GELF HTTP/5cc0dc1ddb8cf7506033f8af] (channel [id: 0xc5381725, L:/ ! R:/:18011]) (cause io.netty.handler.timeout.ReadTimeoutException)

I upgraded to 3.0.1-2 after I saw this issue raised: "Gelf Http input ReadTimeoutException".

Version of graylog installed
graylog-server/stable,stable,now 3.0.1-2 all [installed]

The error occurs roughly an order of magnitude less often after the upgrade than before, but the symptom still appears.

Any thoughts? Let me know if you need more information…

This could be an issue with your sender, but without context that is hard to tell. Are the connections kept open? Does the sender close the connection, or does it just drop it?

Thanks for the response.

The sender in this case is Serilog.Sinks.Graylog which delegates to System.Net.Http.HttpClient

Client Source Code:

From inspecting open TCP connections on the Graylog server, it looks like a connection stays established for a while and then disappears. This disappearance correlates with the timing of the io.netty.handler.timeout.ReadTimeoutException:

tcp FIN-WAIT-2 0 0 [::ffff:]:12201 [::ffff:SENDER_IP]:12332
tcp ESTAB 0 0 [::ffff:]:12201 [::ffff:SENDER_IP]:52538
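
To line up the disappearing connections with the error timestamps, it may help to tally socket states from repeated captures of a listing like the one above. The following is a minimal sketch in Python that assumes the six-column layout shown in this post; the function name `count_states` is hypothetical, and how the listing was captured is not stated in this thread.

```python
from collections import Counter


def count_states(listing, port=12201):
    """Count TCP states for sockets whose local address ends in :port."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Assumed layout, as in the post: proto state recv-q send-q local peer
        if len(fields) < 6 or fields[0] != "tcp":
            continue
        state, local = fields[1], fields[4]
        if local.endswith(f":{port}"):
            counts[state] += 1
    return counts


# The two lines from the post, used as sample input.
sample = """\
tcp FIN-WAIT-2 0 0 [::ffff:]:12201 [::ffff:SENDER_IP]:12332
tcp ESTAB 0 0 [::ffff:]:12201 [::ffff:SENDER_IP]:52538
"""
print(dict(count_states(sample)))
```

Running this every few seconds and logging the counts with a timestamp would show whether ESTAB entries turn into FIN-WAIT-2 at the moments the exception is logged. A FIN-WAIT-2 entry on the server's side of port 12201 would be consistent with the server having closed the connection first, though that alone does not prove the read timeout is the cause.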

Hazarding a guess at a high level: perhaps an exception is being logged when it shouldn’t be. I haven’t noticed any dropped messages because of this behavior, so why log an error? My confidence in this guess is pretty low, though :)

Maybe a reasonable next step is to create a simpler reproducible case and go from there. I might pull down the repo and try to do that.
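
A simpler reproduction might look like the sketch below: send one GELF message over a keep-alive HTTP connection, then leave the connection idle and watch the Graylog server log. It is written in Python rather than .NET to keep it short, so it only approximates what Serilog.Sinks.Graylog and HttpClient do. Assumptions: the input listens on port 12201 (taken from the listing above) and accepts POSTs at `/gelf`; the function names, the `repro-client` host field, and the 120-second idle period are hypothetical, since the input's idle timeout is not stated in this thread.

```python
import http.client
import json
import time


def build_gelf_payload(short_message, host="repro-client"):
    """Return a minimal GELF 1.1 message as UTF-8 encoded JSON bytes."""
    return json.dumps({
        "version": "1.1",
        "host": host,
        "short_message": short_message,
    }).encode("utf-8")


def send_then_idle(graylog_host, port=12201, idle_seconds=120):
    """POST one message over a keep-alive connection, then leave it idle."""
    conn = http.client.HTTPConnection(graylog_host, port, timeout=10)
    try:
        conn.request(
            "POST",
            "/gelf",  # assumed path of the GELF HTTP input
            body=build_gelf_payload("idle keep-alive test"),
            headers={
                "Content-Type": "application/json",
                "Connection": "keep-alive",
            },
        )
        response = conn.getresponse()
        response.read()  # drain the body so the socket stays reusable
        print("status:", response.status)
        # Hold the connection open without sending anything further.
        time.sleep(idle_seconds)
    finally:
        conn.close()
```

Calling `send_then_idle` with the Graylog server's address and checking whether a ReadTimeoutException is logged during the idle period would show whether an idle but otherwise healthy keep-alive connection is enough to trigger the error.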

That would be really helpful, as we currently would not be able to investigate this in a timely manner.
