Hello everyone, and thanks for that template. That’s a very good idea!
I have been struggling with this cluster / load balancer setup since my previous post.
Things have moved on and I fixed a few things. Thanks @gsmith!
I was trying to get the sidecars for Linux and Windows to work behind an NGINX load balancer, but I couldn’t, so I gave up (the problem is explained in the previous topic).
I then decided to use rsyslog for Linux. It was faster at first, not as convenient, but native to Linux, so I can definitely work with that.
For Windows, I tried NXLog, which is pretty good too and works well with the Windows event log.
But I can’t send those logs to my Graylog cluster through my NGINX load balancer. It’s been days and days… I’m probably doing something wrong: feel free to tell me!
Description of your problem
- I can’t get rsyslog and NXLog to send logs via the NGINX load balancer.
- The connection between the rsyslog clients and the NGINX load balancer initializes correctly when the rsyslog service starts, but error messages seem to appear once I close my SSH session.
- If I bypass the load balancer and send the logs directly to one of the Graylog nodes, I get the logs without any problem and without losing the connection.
- I’m still getting some logs from NXLog, but it can’t seem to hold a stable connection.
- I have no problem with other inputs using the same NGINX configurations (RAW UDP, PAN OS).
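A quick way to check whether the proxied path drops idle connections would be a small probe that connects, sends one line, stays silent for a while, and then tries to send again. The sketch below is only an illustration: `probe_idle_connection` is a hypothetical helper (not part of Graylog, rsyslog or NGINX), the test payload is a placeholder, and the idle period is a guess to be tuned, for example to a value just above the `proxy_timeout` set in the NGINX configuration.

```python
import socket
import time


def probe_idle_connection(host, port, idle_seconds,
                          payload=b"<14>probe: idle connection test\n"):
    """Send once, stay idle, then try to send again on the same connection.

    Returns True if the connection is still usable after the idle period,
    False if the peer (or a proxy in between) closed or reset it.
    `payload` is a placeholder test line, not a real log message.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
        time.sleep(idle_seconds)
        try:
            sock.settimeout(0.5)
            try:
                # An orderly close by the peer shows up as an empty read.
                if sock.recv(1) == b"":
                    return False
            except socket.timeout:
                # Nothing to read within 0.5 s: the connection is still open.
                pass
            sock.sendall(payload)
            return True
        except OSError:
            # Reset or broken pipe: the connection is gone.
            return False
```

Running it once against 192.168.1.21:10514 and once against 192.168.1.22:10514 with the same idle period would show whether only the path through the load balancer loses the connection.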
Description of steps you’ve taken to attempt to solve the issue
This setup is already a “backup plan”: I can’t send the logs over TLS through the load balancer. TLS works fine on my pre-production server, where Graylog, MongoDB and Elasticsearch all run on the same machine. But first things first: I’m trying to get this setup working.
- Tried to use a more traditional format for rsyslog.
- Reloaded rsyslog.service
- Reloaded NGINX
- Stopped and restarted the inputs
- Ran tcpdump on the relevant ports
- Sent logs directly to one of the nodes
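To take rsyslog out of the picture when comparing the two paths, a test message can also be sent by hand. The sketch below is a minimal illustration, assuming the Syslog TCP input accepts newline-terminated RFC 5424 messages. `format_rfc5424` and `send_syslog_tcp` are hypothetical helper names, and the hostname and application name passed in are placeholders.

```python
import socket
from datetime import datetime, timezone


def format_rfc5424(hostname, app_name, message,
                   facility=1, severity=6, when=None):
    """Build a minimal RFC 5424 line (no PROCID, MSGID or structured data)."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    pri = facility * 8 + severity  # facility 1 (user), severity 6 (info) -> 14
    timestamp = when.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    return f"<{pri}>1 {timestamp} {hostname} {app_name} - - - {message}"


def send_syslog_tcp(host, port, message, timeout=5.0):
    """Send one newline-terminated message over TCP; return the bytes sent."""
    data = (message + "\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(data)
    return len(data)
```

Sending the same line to 192.168.1.21:10514 (the load balancer) and to 192.168.1.22:10514 (a node directly), then searching for it in Graylog, shows which path delivers it.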
Environmental information
192.168.1.20 = Graylog UI Load Balancer - Apache
192.168.1.21 = Graylog Inputs Load Balancer - NGINX
192.168.1.22 = Graylog Node 1
192.168.1.23 = Graylog Node 2
192.168.1.24 = Graylog Node 3
192.168.1.25 = Elasticsearch 1
192.168.1.26 = Elasticsearch 2
192.168.1.27 = Elasticsearch 3
Configuration Files
/etc/rsyslog.d/my.conf
*.* action (
    type = "omfwd"
    target = "192.168.1.21"
    port = "10514"
    protocol = "tcp"
    template = "RSYSLOG_SyslogProtocol23Format"
    action.resumeRetryCount = "-1"
    queue.fileName = "rsyslog_metsys_tcp"
    queue.type = "linkedList"
    queue.size = "10000"
    queue.saveonshutdown = "on"
)
/etc/nginx/nginx.conf
# rsyslog TCP
upstream graylog_rsyslog_tcp
{
    server 192.168.1.22:10514 max_fails=3 fail_timeout=30s;
    server 192.168.1.23:10514 max_fails=3 fail_timeout=30s;
    server 192.168.1.24:10514 max_fails=3 fail_timeout=30s;
}
server
{
    listen 10514;
    proxy_pass graylog_rsyslog_tcp;
    proxy_timeout 10s;
    error_log /var/log/nginx/graylog_rsyslog_tcp.log;
}
# NXLog TCP
upstream nxlog_tcp
{
    server 192.168.1.22:10517 max_fails=3 fail_timeout=30s;
    server 192.168.1.23:10517 max_fails=3 fail_timeout=30s;
    server 192.168.1.24:10517 max_fails=3 fail_timeout=30s;
}
server
{
    listen 10517;
    proxy_pass nxlog_tcp;
    proxy_timeout 10s;
    error_log /var/log/nginx/graylog_nxlog_tcp.log;
}
nxlog.conf
define ROOT C:\Program Files (x86)\nxlog
define CERTDIR %ROOT%\cert
define CONFDIR %ROOT%\conf
define LOGDIR %ROOT%\data
define LOGFILE %LOGDIR%\nxlog.log
LogFile %LOGFILE%
Moduledir %ROOT%\modules
CacheDir %ROOT%\data
Pidfile %ROOT%\data\nxlog.pid
SpoolDir %ROOT%\data
<Extension _gelf>
    Module xm_gelf
</Extension>
<Input in>
    Module im_msvistalog
    <QueryXML>
        [ SOME EVENT IDs FROM WINDOWS SERVER ]
    </QueryXML>
</Input>
<Output graylog_gelf>
    Module om_tcp
    Host 192.168.1.21
    Port 10517
    OutputType GELF_TCP
</Output>
<Route 1>
    Path in => graylog_gelf
</Route>
Errors
- GELF Input for NXLog on 10517
No active connections, but a lot of connections in total, for a single server.
Throughput / Metrics
1 minute average rate: 0 msg/s
Network IO: 0B 0B (total: 1.7MiB 0B )
Active connections: 0 (669 total)
Empty messages discarded: 0
- Syslog TCP Input on 10514
No active connections, but a lot of connections in total, for 8 servers.
Throughput / Metrics
1 minute average rate: 0 msg/s
Network IO: 0B 0B (total: 55.2MiB 0B )
Active connections: 0 (2942 total)
Empty messages discarded: 0
/var/log/syslog
rsyslogd: omfwd: TCPSendBuf error -2027, destruct TCP Connection to 10.100.11.21:10514 [v8.1901.0 try https://www.rsyslog.com/e/2027 ]
rsyslogd: action 'action-0-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.1901.0 try https://www.rsyslog.com/e/2007 ]
rsyslogd: action 'action-0-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.1901.0 try https://www.rsyslog.com/e/2359 ]
rsyslogd: omfwd: TCPSendBuf error -2027, destruct TCP Connection to 10.100.11.21:10514 [v8.1901.0 try https://www.rsyslog.com/e/2027 ]
rsyslogd: action 'action-0-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.1901.0 try https://www.rsyslog.com/e/2007 ]
rsyslogd: action 'action-0-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.1901.0 try https://www.rsyslog.com/e/2359 ]
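To see how often the action is suspended and resumed, the log lines above can be tallied with a few lines of Python. This is a rough sketch: the regular expressions are written against the exact messages shown here and are only an assumption about their general format, and `summarize_rsyslog_errors` is a hypothetical name.

```python
import re
from collections import Counter

# Patterns modelled on the rsyslogd messages quoted above (an assumption,
# not an official description of the rsyslog log format).
SEND_ERROR_RE = re.compile(
    r"omfwd: TCPSendBuf error (?P<code>-?\d+), "
    r"destruct TCP Connection to (?P<target>\S+)"
)
ACTION_RE = re.compile(
    r"action '(?P<action>[^']+)' (?P<state>suspended|resumed)"
)


def summarize_rsyslog_errors(lines):
    """Count send errors per target and suspend/resume events per action."""
    counts = Counter()
    for line in lines:
        match = SEND_ERROR_RE.search(line)
        if match:
            counts[(match["target"], "send error " + match["code"])] += 1
            continue
        match = ACTION_RE.search(line)
        if match:
            counts[(match["action"], match["state"])] += 1
    return counts
```

Feeding it the lines of /var/log/syslog, for example via `open("/var/log/syslog")`, gives one counter entry per target and per action state, which makes it easier to compare a run through the load balancer with a run sent directly to a node.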
Operating system information
- Debian 10
Package versions
- Graylog = 4.1.2+20cd592, codename Noir
- MongoDB = v5.0.1
- Elasticsearch = 7.10.2 (build_flavor: oss)
Thanks to everyone who took the time to read this!
It’s the end of the day and I’m pretty tired, so I hope my explanations are clear!
See you!