Log messages not retransmitted in case of network interruption


#1

Dear Team Members and Community,

I observed the following issue when using Graylog in a setup where a local Graylog instance collects syslog messages and forwards them to a central Graylog instance using GELF TCP/TLS.

(local net ---- Graylog forwarder ---- DSL router ---- Internet ---- firewall ---- central Graylog)
UDP --------------------> ------- GELF ------------------------------------->

When the network connection is interrupted (e.g. by a firewall reboot or a DSL outage), the forwarder keeps retransmitting the last received local message over the network. It does not recognize the connection outage.

Tcpdump:

Flags [P.], seq 8592478:8593108, ack 2541, win 274, options [nop,nop,TS val 78770560 ecr 2539788814], length 630
Flags [P.], seq 8592478:8593108, ack 2541, win 274, options [nop,nop,TS val 78800640 ecr 2539788814], length 630
Flags [P.], seq 8592478:8593108, ack 2541, win 274, options [nop,nop,TS val 78830720 ecr 2539788814], length 630
Flags [P.], seq 8592478:8593108, ack 2541, win 274, options [nop,nop,TS val 78860800 ecr 2539788814], length 630
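The identical seq range in every line is what marks these packets as retransmissions of one unacknowledged segment. As a rough sketch only, a capture saved as text could be checked for this pattern as follows. The function name count_retransmissions is hypothetical, and the sketch assumes the capture covers a single connection in one direction:

```python
import re
from collections import Counter

# Matches the "seq START:END" field printed by tcpdump for data segments.
SEQ_RE = re.compile(r"seq (\d+):(\d+)")


def count_retransmissions(lines):
    """Return {(start, end): extra_copies} for seq ranges seen more than once."""
    seen = Counter()
    for line in lines:
        match = SEQ_RE.search(line)
        if match:
            seen[(int(match.group(1)), int(match.group(2)))] += 1
    # The first occurrence is the original transmission; the rest are repeats.
    return {segment: count - 1 for segment, count in seen.items() if count > 1}
```

A non-empty result with a growing count for the same range, and no newer seq ranges in between, matches the behaviour described here.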

At the same time, the forwarder keeps receiving more messages from the local network.

Then, after approx. 15 minutes, Graylog detects the interrupted connection and initiates a new one.

2018-08-31T06:45:42.189+02:00 INFO [GelfTcpTransport] Channel disconnected!

But all messages the forwarder received in the meantime are never forwarded.

Screenshot forwarder:

Screenshot receiver:

My expectation and requirement for this kind of setup is that the forwarder collects and caches all messages during a network outage and forwards them to the central receiver later, once it is available again.
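To illustrate the behaviour I would expect, here is a minimal store-and-forward sketch. It is not how Graylog's output is implemented. The class StoreAndForwardBuffer and the send callable are hypothetical, and the sketch assumes send raises OSError (for example ConnectionError) when delivery fails:

```python
from collections import deque
from typing import Callable, Deque


class StoreAndForwardBuffer:
    """Keeps messages until the transport confirms they were sent."""

    def __init__(self, send: Callable[[str], None], max_size: int = 10000) -> None:
        self._send = send
        # Bounded queue: once full, the oldest message is dropped on append.
        self._pending: Deque[str] = deque(maxlen=max_size)

    def enqueue(self, message: str) -> None:
        self._pending.append(message)

    def flush(self) -> int:
        """Try to send everything that is pending, oldest first.

        Stops at the first failure and keeps the failed message queued,
        so a later flush() retries it. Returns the number of messages sent.
        """
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except OSError:
                break
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

A message is removed from the queue only after send returns without error. Note that with plain TCP a successful write only means the data reached the local kernel buffer, so a strict guarantee would additionally need an application-level acknowledgement from the receiver.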

In the meantime I tweaked the Linux TCP settings so that only 1-2 messages are lost, by setting:

/proc/sys/net/ipv4/tcp_retries1 to 1
/proc/sys/net/ipv4/tcp_retries2 to 1

With these settings it no longer takes 15 minutes to detect the interrupted connection. The connection is re-established after a few seconds.

Is there any other way to make sure that all locally received messages are forwarded to a central instance?


(Jan Doberstein) #2

Which forwarder did you use? Which Graylog version are you running?


#3

I used Graylog version 2.4.6 for the forwarder as well as for the central instance.


(Jan Doberstein) #4

So to clarify, you have a local Graylog that uses Graylog's output capability to forward messages over an unstable line to a central Graylog.

The problem lies in the way Graylog's output management works. Could you please check on GitHub whether such a bug report already exists and, if not, open a new one?

thanks
Jan


#5

Yes, the forwarder and the central instance are both Graylog systems, using Graylog's built-in output and input mechanisms.

I opened a GitHub issue. Let me know if you need further information.

