NXLOG SSL error: Connection timed out

Hello guys, I have been receiving this error on the collectors installed on Red Hat servers. It happens from time to time, but afterwards the collectors stop working. Once I restart them they work normally again, so I can’t work out what the cause could be. I’m using collector-sidecar-0.1.1-1.x86_64 and nxlog-ce-2.9.1716-1_rhel7.x86_64.

Thanks in advance.

Please post the complete logs of the Collector Sidecar or at least the complete error message.

Hello, here is a sample of the last lines of the log file:

2017-09-26 18:43:43 INFO reconnecting in 1 seconds
2017-09-26 18:43:43 INFO reconnecting in 2 seconds
2017-09-26 18:43:43 INFO reconnecting in 1 seconds
2017-09-26 18:43:43 INFO reconnecting in 2 seconds
2017-09-26 18:43:43 INFO reconnecting in 1 seconds
2017-09-26 18:43:43 INFO reconnecting in 2 seconds
2017-09-26 18:43:44 INFO connecting to graylog.mcmcg.com:5044
2017-09-26 18:43:44 INFO connecting to graylog.mcmcg.com:5047
2017-09-26 18:43:44 INFO connecting to graylog.mcmcg.com:5044
2017-09-26 18:43:44 INFO successfully connected to graylog.mcmcg.com:5044
2017-09-26 18:43:44 INFO successfully connected to graylog.mcmcg.com:5044
2017-09-26 18:43:44 INFO successfully connected to graylog.mcmcg.com:5047
2017-09-26 20:55:20 INFO reconnecting in 1 seconds
2017-09-26 20:55:20 INFO reconnecting in 2 seconds
2017-09-26 20:55:20 INFO reconnecting in 1 seconds
2017-09-26 20:55:20 INFO reconnecting in 1 seconds
2017-09-26 20:55:20 INFO reconnecting in 2 seconds
2017-09-26 20:55:20 INFO reconnecting in 2 seconds
2017-09-26 20:55:21 INFO connecting to graylog.mcmcg.com:5044
2017-09-26 20:55:21 INFO connecting to graylog.mcmcg.com:5047
2017-09-26 20:55:21 INFO connecting to graylog.mcmcg.com:5044
2017-09-26 20:55:21 INFO successfully connected to graylog.mcmcg.com:5044
2017-09-26 20:55:21 INFO successfully connected to graylog.mcmcg.com:5044
2017-09-26 20:55:21 INFO successfully connected to graylog.mcmcg.com:5047
2017-09-26 23:06:57 INFO reconnecting in 1 seconds
2017-09-26 23:06:57 INFO last message repeated 2 times
2017-09-26 23:06:57 ERROR last message repeated 0 times
2017-09-26 23:06:57 INFO reconnecting in 2 seconds
2017-09-26 23:06:57 INFO last message repeated 1 times
2017-09-26 23:06:57 INFO reconnecting in 2 seconds
2017-09-26 23:06:58 INFO connecting to graylog.mcmcg.com:5044
2017-09-26 23:06:58 INFO connecting to graylog.mcmcg.com:5047
2017-09-26 23:06:58 INFO connecting to graylog.mcmcg.com:5044
2017-09-26 23:06:58 INFO successfully connected to graylog.mcmcg.com:5044
2017-09-26 23:06:58 INFO successfully connected to graylog.mcmcg.com:5044
2017-09-26 23:06:58 INFO successfully connected to graylog.mcmcg.co
2017-09-27 01:18:35 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 01:18:35 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 01:18:35 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 03:30:12 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 03:30:12 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 03:30:12 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 05:41:49 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 05:41:49 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 05:41:49 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 07:53:26 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 07:53:26 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out
2017-09-27 07:53:26 ERROR SSL error, SSL_ERROR_SYSCALL: retval -1, errno: 110;Connection timed out

Which process emits these logs? NXLOG or the Graylog Collector Sidecar?

Hi @jochen, that is being printed to nxlog_stderr.log, which is generated in /var/log/graylog/collector-sidecar/.

Please post the configuration of the Graylog Collector Sidecar and the generated NXLOG configuration.

Here is the Sidecar configuration:

 cat /etc/graylog/collector-sidecar/collector_sidecar.yml
server_url: https://graylog.mcmcg.com:9000/api/
update_interval: 1
tls_skip_verify: true
send_status: true
list_log_files:
node_id: phxior2kcorep01-graylog-collector-sidecar
collector_id: file:/etc/graylog/collector-sidecar/collector-id
log_path: /var/log/graylog/collector-sidecar
log_rotation_time: 86400
log_max_age: 604800
tags:
    - phxior2kcorep01
    - r2kerror
    - httpd
    - letterProcessorService

backends:
    - name: nxlog
      enabled: true
      binary_path: /usr/bin/nxlog
      configuration_path: /etc/graylog/collector-sidecar/generated/nxlog.conf

And here is the generated NXLOG configuration:

cat /etc/graylog/collector-sidecar/generated/nxlog.conf
define ROOT /usr/bin

<Extension gelf>
  Module xm_gelf
</Extension>




<Input 5966d00e862ea90af08fa522>
        Module im_file
        File '/opt/httpd/logs/*_log'
        PollInterval 1
        SavePos True
        ReadFromLast True
        Recursive True
        RenameCheck False
        Exec $FileName = file_name(); # Send file name with each message
        Exec $software = "httpd";
</Input>
<Input 59809cee862ea92c9762900e>
        Module im_file
        File '/opt/tomcat_r2k-core/logs/letterProcessorService.log'
        PollInterval 1
        SavePos True
        ReadFromLast True
        Recursive True
        RenameCheck False
        Exec $FileName = file_name(); # Send file name with each message
        Exec $application = "letterProcessorService";
        Exec $software = "tomcat";
</Input>
<Input 59c94dfdc6203b046c8a94b2>
        Module im_file
        File '/opt/tomcat_r2k-core/logs/r2kerror.log'
        PollInterval 1
        SavePos True
        ReadFromLast True
        Recursive True
        RenameCheck False
        Exec $FileName = file_name(); # Send file name with each message
        Exec $application = "r2kerror";
        Exec $software = "tomcat";
</Input>






<Output 5966cfe2862ea90af08fa4f2>
        Module om_ssl
        Host graylog.mcmcg.com
        Port 5047
        OutputType GELF_TCP
        CAFile /opt/certs/graylog_client_prd.cer
        Exec $short_message = $raw_event; # Avoids truncation of the short_message field.
        Exec $gl2_source_collector = 'adcf7a28-f752-4bc8-82a8-277af2737e4d';
        Exec $collector_node_id = 'phxior2kcorep01-graylog-collector-sidecar';
        Exec $Hostname = hostname_fqdn();
</Output>
<Output 59809cee862ea92c9762900d>
        Module om_ssl
        Host graylog.mcmcg.com
        Port 5044
        OutputType GELF_TCP
        CAFile /opt/certs/graylog_client_prd.cer
        Exec $short_message = $raw_event; # Avoids truncation of the short_message field.
        Exec $gl2_source_collector = 'adcf7a28-f752-4bc8-82a8-277af2737e4d';
        Exec $collector_node_id = 'phxior2kcorep01-graylog-collector-sidecar';
        Exec $Hostname = hostname_fqdn();
</Output>
<Output 59c94dfdc6203b046c8a94b1>
        Module om_ssl
        Host graylog.mcmcg.com
        Port 5044
        OutputType GELF_TCP
        CAFile /opt/certs/graylog_client_prd.cer
        Exec $short_message = $raw_event; # Avoids truncation of the short_message field.
        Exec $gl2_source_collector = 'adcf7a28-f752-4bc8-82a8-277af2737e4d';
        Exec $collector_node_id = 'phxior2kcorep01-graylog-collector-sidecar';
        Exec $Hostname = hostname_fqdn();
</Output>

<Route route-0>
  Path 5966d00e862ea90af08fa522 => 5966cfe2862ea90af08fa4f2
</Route>
<Route route-1>
  Path 59809cee862ea92c9762900e => 59809cee862ea92c9762900d
</Route>
<Route route-2>
  Path 59c94dfdc6203b046c8a94b2 => 59c94dfdc6203b046c8a94b1
</Route>

Hi,

You seem to have two output modules sending to a single Graylog input, graylog.mcmcg.com:5044/TCP, and the two look practically identical to me. Are you sure you need both, or could you just put the two inputs in the same path and send them to a single output?

(This probably would not solve the issue, though.)
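
To illustrate the suggestion: NXLOG allows a route to list several inputs before the arrow, so the two Tomcat inputs could share one output. The following is only a sketch built from the IDs in the generated configuration above; which of the two identical outputs to keep is an arbitrary choice here.

```
<Route route-1>
  Path 59809cee862ea92c9762900e, 59c94dfdc6203b046c8a94b2 => 59809cee862ea92c9762900d
</Route>
```

With a route like this, route-2 and the output 59c94dfdc6203b046c8a94b1 would no longer be needed. Since nxlog.conf is generated by the Sidecar, the change would presumably have to be made in the collector configuration in Graylog (pointing both inputs at the same output) rather than in the file itself.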

Hello Guys,

Any idea on this?

I think I also ran into this problem on CentOS.

This one could be a bug in nxlog-ce. If you cannot debug it, there are several options: if it is a bug, it might be fixed in nxlog-ee, but I don’t know. Another option is a work-around.

I made the following workaround:

  1. I added a schedule block to the output that reconnects regularly. My main motivation was to allow load balancing to work, though.
  2. I added a line to /etc/crontab, something like the following (check the correct path and service name): 15 1 * * * root <path>/systemctl restart graylog-collector-sidecar.service. This restarts the sidecar every day at 01:15. Restarting the sidecar also restarts nxlog, which will then work OK until the next restart.
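
A scheduled reconnect like the one in item 1 could be sketched as follows. This is an illustration under assumptions, not necessarily the block used above: the one-hour interval is made up, and the output shown is the first one from the generated configuration posted earlier. reconnect() is the procedure the NXLog reference manual lists for om_tcp and om_ssl for use from a Schedule block.

```
<Output 5966cfe2862ea90af08fa4f2>
        Module om_ssl
        Host graylog.mcmcg.com
        Port 5047
        OutputType GELF_TCP
        CAFile /opt/certs/graylog_client_prd.cer
        # The existing Exec lines stay as they are (left out here for brevity).

        # Assumed interval: drop and re-establish the connection once an hour.
        <Schedule>
                Every 1 hour
                Exec reconnect();
        </Schedule>
</Output>
```

Because nxlog.conf is generated by the Sidecar, a hand edit to that file is likely to be overwritten, so the block would need to be added through the collector configuration on the Graylog side.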

Hi Man,

Checking the logs on the Graylog server, I noticed a lot of failures (below) in the Apache error log. These errors occur when Apache starts while Graylog is not fully up yet, so I’m now delaying the Apache start so that it comes up last, and that seems to be working. I’m monitoring to see whether that was the issue.

[Fri Oct 06 09:45:30.529533 2017] [proxy:error] [pid 8561] (111)Connection refused: AH00957: HTTPS: attempt to connect to 10.100.83.17:9000 (phxiograylogp03.internal.mcmcg.com) failed
[Fri Oct 06 09:45:30.529637 2017] [proxy:error] [pid 8561] AH00959: ap_proxy_connect_backend disabling worker for (phxiograylogp03.internal.mcmcg.com) for 60s
[Fri Oct 06 09:45:30.529654 2017] [proxy_http:error] [pid 8561] [client 10.100.83.252:4120] AH01114: HTTP: failed to make connection to backend: phxiograylogp03.internal.mcmcg.com
[Fri Oct 06 09:45:32.138932 2017] [proxy:error] [pid 8562] (111)Connection refused: AH00957: HTTPS: attempt to connect to 10.100.83.17:9000 (phxiograylogp03.internal.mcmcg.com) failed
[Fri Oct 06 09:45:32.139020 2017] [proxy:error] [pid 8562] AH00959: ap_proxy_connect_backend disabling worker for (phxiograylogp03.internal.mcmcg.com) for 60s
[Fri Oct 06 09:45:32.139037 2017] [proxy_http:error] [pid 8562] [client 10.100.83.253:12867] AH01114: HTTP: failed to make connection to backend: phxiograylogp03.internal.mcmcg.com
[Fri Oct 06 09:45:37.191386 2017] [proxy:error] [pid 8561] AH00940: HTTPS: disabled connection for (phxiograylogp03.internal.mcmcg.com

False alarm, that did not fix the issue!
