Graylog leader node keeps dropping out of the cluster. It is still processing logs, but it is sometimes missing from the UI, and an alert is triggered every 5 minutes.
Graylog:
OS: Ubuntu 22.04
RAM: 48
CPU: 32
Nodes: 5
Elasticsearch:
RAM: 128
CPU: 16
Nodes: 9
The only error I am getting in graylog-server's journal is java.net.SocketTimeoutException: Connect timed out. There are no other errors in any logs.
This is the configuration:
http_bind_address = 192.168.133.147:9000
http_publish_uri = http://192.168.133.147:9000/
http_external_uri = http://example-graylog01.example.com/
Every node has its own correct addresses in these settings.
I have tried changing those three settings, but nothing worked. Ping and telnet succeed from every node to the leader node. What do you think I am missing?
This Graylog cluster is about a year and a half old, but this error only appeared 2-3 months ago. There are no errors in Elasticsearch or in the MongoDB replica set. I also tried ping and telnet from the Elasticsearch and MongoDB hosts, and those succeeded too.
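Since a one-off ping or telnet can succeed while connects still time out intermittently, a repeated, timed TCP connect check against the leader's HTTP port might narrow this down. Below is a minimal sketch of that idea in Python. The function names, the 3-second timeout, and the attempt count are my own assumptions for illustration; this is not part of Graylog and not necessarily how Graylog itself checks node reachability.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Attempt one TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, repr(exc)


def probe_many(host, port, attempts=60, interval=1.0, timeout=3.0):
    """Repeat the probe so that intermittent connect failures show up.

    Returns a dict with the attempt count, a list of
    (attempt_index, seconds_elapsed, error_text) for each failure,
    and the slowest observed connect time in seconds.
    """
    failures = []
    slowest = 0.0
    for i in range(attempts):
        ok, elapsed, err = probe(host, port, timeout)
        slowest = max(slowest, elapsed)
        if not ok:
            failures.append((i, elapsed, err))
        time.sleep(interval)
    return {"attempts": attempts, "failures": failures, "slowest": slowest}
```

Run from each of the other Graylog nodes, something like probe_many("192.168.133.147", 9000, attempts=300) would cover the 5-minute alert interval at one probe per second. Any entries in "failures", or a "slowest" value close to the timeout, would suggest the connect timeouts are real network or socket-level stalls rather than a wrong address setting.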
Here are some screenshots: