New problem here: I'm not sure whether I should continue in this topic or open a new one. Feel free to tell me if I'm doing this wrong, but it feels like a reverse proxy / load balancing problem. I never saw this kind of message on a single-instance Graylog Server / MongoDB / Elasticsearch setup.
From time to time, I get a 401 error from Sidecars running Filebeat.
Here are typical log lines from /var/log/graylog-sidecar/sidecar.log
(I can also see them from the Graylog Server UI):
[...]
time="2021-08-09T14:25:18+02:00" level=error msg="[RequestBackendList] Bad response status from Graylog server: 401 Unauthorized"
time="2021-08-09T14:25:18+02:00" level=error msg="Can't fetch collector list from Graylog API: GET http://graylog.my.domain/api/sidecar/collectors: 401 "
time="2021-08-09T14:25:28+02:00" level=error msg="[RequestBackendList] Bad response status from Graylog server: 401 Unauthorized"
time="2021-08-09T14:25:28+02:00" level=error msg="Can't fetch collector list from Graylog API: GET http://graylog.my.domain/api/sidecar/collectors: 401 "
time="2021-08-09T14:25:38+02:00" level=error msg="[RequestBackendList] Bad response status from Graylog server: 401 Unauthorized"
time="2021-08-09T14:25:38+02:00" level=error msg="Can't fetch collector list from Graylog API: GET http://graylog.my.domain/api/sidecar/collectors: 401 "
time="2021-08-09T14:25:48+02:00" level=error msg="[RequestBackendList] Bad response status from Graylog server: 401 Unauthorized"
time="2021-08-09T14:25:48+02:00" level=error msg="Can't fetch collector list from Graylog API: GET http://graylog.my.domain/api/sidecar/collectors: 401 "
time="2021-08-09T14:25:58+02:00" level=error msg="[RequestConfiguration] Bad response status from Graylog server: 401 Unauthorized"
time="2021-08-09T14:25:58+02:00" level=error msg="Can't fetch configuration from Graylog API: GET http://graylog.my.domain/api/sidecar/configurations/render/15520d3d-b40e-4316-97ad-e7f200e61d40/61093806666bd536b0701b2g: 401 "
[...]
This happens after the workday or over lunch. It feels like it starts some time after the web UI has been closed.
When I restart graylog-sidecar, it works again until the next 401.
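To check the timing impression above against the actual log, a small script could list how many 401 errors occur per hour. This is only a rough sketch: it assumes every line has the `time="..." level=... msg="..."` shape shown in the excerpt, and the function names are my own.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the line format from the excerpt: time="..." level=... msg="..."
LINE_RE = re.compile(
    r'time="(?P<time>[^"]+)"\s+level=(?P<level>\w+)\s+msg="(?P<msg>.*)"\s*$'
)


def parse_401_times(lines):
    """Return the timestamps of error lines whose message reports a 401."""
    times = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("level") == "error" and ": 401" in match.group("msg"):
            # Timestamps such as 2021-08-09T14:25:18+02:00 are ISO 8601.
            times.append(datetime.fromisoformat(match.group("time")))
    return times


def count_by_hour(times):
    """Group timestamps into 'YYYY-MM-DD HH:00' buckets."""
    return Counter(t.strftime("%Y-%m-%d %H:00") for t in times)


def summarize(path):
    """Print one line per hour that contains at least one 401 error."""
    with open(path, encoding="utf-8") as handle:
        counts = count_by_hour(parse_401_times(handle))
    for hour, count in sorted(counts.items()):
        print(hour, count)
```

Calling `summarize("/var/log/graylog-sidecar/sidecar.log")` on a Sidecar host should show whether the errors really cluster around lunch and the end of the workday.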
My Graylog cluster is behind an Apache load balancer for the web interface, configured as follows (as shown in a previous message):
<VirtualHost *:80>
    ServerName graylog.my.domain
    ProxyRequests Off
    Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
    <Location /balancer-manager>
        SetHandler balancer-manager
        AuthType Basic
        AuthName "Load_Balancer_Manager"
        AuthBasicProvider file
        AuthUserFile "/path/to/passwords/files"
        Require user secretuser
    </Location>
    ProxyPass /balancer-manager !
    <Proxy balancer://graylog>
        BalancerMember "http://node01.my.domain:9000" route=node1
        BalancerMember "http://node02.my.domain:9000" route=node2
        BalancerMember "http://node03.my.domain:9000" route=node3
        ProxySet lbmethod=byrequests
        ProxySet stickysession=ROUTEID
    </Proxy>
    RequestHeader set X-Graylog-Server-URL "http://graylog.my.domain"
    ProxyPass / balancer://graylog/
    ProxyPassReverse / balancer://graylog/
</VirtualHost>
It is also behind an Nginx load balancer for the inputs, configured as follows (inspired by Nginx Config Examples):
user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
events {
    worker_connections 1024;
}
stream {
    upstream graylog_beats {
        server 10.100.10.22:5044 max_fails=3 fail_timeout=30s;
        server 10.100.10.23:5044 max_fails=3 fail_timeout=30s;
        server 10.100.10.24:5044 max_fails=3 fail_timeout=30s;
    }
    server {
        listen 5044;
        proxy_pass graylog_beats;
        proxy_timeout 10s;
        error_log /var/log/nginx/graylog_beats.log;
    }
}
We only ingest Filebeat data for now, but we will add more types of inputs soon.
The Beats input, which is a global input, is configured like this:
bind_address: 0.0.0.0
no_beats_prefix: false
number_worker_threads: 1
override_source: <empty>
port: 5044
recv_buffer_size: 1048576
tcp_keepalive: true
tls_cert_file: <empty>
tls_client_auth: disabled
tls_client_auth_cert_file: <empty>
tls_enable: false
tls_key_file: <empty>
tls_key_password: ********
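To narrow down whether the 401 comes from one particular node or only appears through the balancer, I could send the same API request the Sidecar makes to each of them and compare the status codes. This is a sketch, not a verified procedure: the node URLs and the API path come from the configs and logs above, the token value is a placeholder, and it assumes the API token is sent as the HTTP Basic auth username with the literal password `token`, which is my understanding of how Graylog API tokens are used.

```python
import base64
import urllib.error
import urllib.request

# Taken from the Apache config and the Sidecar log above.
BALANCER = "http://graylog.my.domain"
NODES = [
    "http://node01.my.domain:9000",
    "http://node02.my.domain:9000",
    "http://node03.my.domain:9000",
]
API_PATH = "/api/sidecar/collectors"


def build_request(base_url, api_token):
    """Build a GET request authenticated with an API token.

    Assumption: the token is the Basic auth username and the
    password is the literal string "token".
    """
    credentials = base64.b64encode(f"{api_token}:token".encode()).decode()
    request = urllib.request.Request(base_url.rstrip("/") + API_PATH)
    request.add_header("Authorization", "Basic " + credentials)
    request.add_header("Accept", "application/json")
    return request


def probe(base_url, api_token, timeout=5):
    """Return the HTTP status code, or a short note if the host is unreachable."""
    try:
        with urllib.request.urlopen(build_request(base_url, api_token), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; the code is what we want.
        return exc.code
    except urllib.error.URLError as exc:
        return f"unreachable: {exc.reason}"


def probe_all(api_token):
    """Print the status returned by the balancer and by each node directly."""
    for url in [BALANCER] + NODES:
        print(url, "->", probe(url, api_token))
```

Running `probe_all("<sidecar API token>")` while the Sidecars are failing would show whether all three nodes reject the token or only some of them. If every node answers 200 directly but the balancer returns 401, that would point at the proxy layer rather than at Graylog itself.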
If anyone has any ideas, they would be more than welcome.
Have a good day!