We have been running Graylog Open for some time now and it has been stable. We apply the monthly updates a week or so after release. We have just updated from 6.0.7 to 6.1.1 and hit some issues.
Initially Graylog works fine, but after a random amount of time searches hang and return no results. When that happens, the following warning appears in server.log:
WARN [ProxiedResource] Failed to call API on node , cause: Failed to connect to REDACTED/127.0.0.1:9000 (duration: 2 ms)
If I restart the VM (or sometimes just the Graylog service) everything starts working again.
When it’s not working, running curl against the URL on the Graylog VM itself gives connection refused (although externally I still get the web interface). When it’s working, curl returns the correct results.
Sometimes, if I restart the Graylog service, the curl results are fine but I can’t access the web page externally.
When I ran the update it did prompt to replace the server.conf and I said no.
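The curl comparison above (refused on the VM itself, fine from outside) can be repeated with a few lines of Python. This is only a minimal sketch: `can_connect` is a hypothetical helper name, and 127.0.0.1:9000 is taken from the log line above. It only tests whether a TCP connection is accepted, not whether Graylog answers correctly.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port is accepted, else False."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address and port as seen in the server.log warning; add the server's
    # real IP here to compare loopback against the external interface.
    print("127.0.0.1:9000 reachable:", can_connect("127.0.0.1", 9000))
```

Running it once while searches work and once while they hang should show whether the loopback address is the one that stops accepting connections.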
I’m not going to hold my breath but I think it’s working…
I added the FQDN to /etc/hosts, pointing to the actual IP address of the server. I’ve not had to do this before, as the FQDN is resolved by DNS and it just worked. It seems that having the hosts entry has become more of a requirement with 6.1, as it doesn’t seem to like relying on DNS so much.
Was there an entry in the hosts file for the FQDN against 127.0.0.1? The server seemed to be failing because it was calling the API on 127.0.0.1, which it got from looking up the FQDN it’s bound to.
There was no complete entry for that FQDN in the hosts file. The only names against 127.0.0.1 were localhost and the actual host FQDN (which isn’t the FQDN it’s accessed through).
I did initially try adding an entry mapping 127.0.0.1 to the FQDN in the hosts file, but then I couldn’t access the web interface. Changing it to map the server’s actual IP address to the FQDN got it working fine.
It’s now been nearly 3 hours and it’s still working, so it does look like it’s fixed.
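For anyone wanting to check the same thing, the lookup discussed in this thread (which address a name resolves to on the server, and whether that address is loopback) can be sketched in Python. The helper names are hypothetical, and the script uses the operating system resolver (hosts file, then DNS, depending on local configuration). That this matches what Graylog itself sees is an assumption, not something confirmed here.

```python
import ipaddress
import socket


def resolved_addresses(hostname):
    """Return the sorted, de-duplicated IPs the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None)
    # info[4][0] is the address string; drop any IPv6 scope suffix such as "%eth0"
    return sorted({info[4][0].split("%")[0] for info in infos})


def loopback_addresses(hostname):
    """Return only those resolved addresses that are loopback (127.0.0.0/8 or ::1)."""
    return [a for a in resolved_addresses(hostname)
            if ipaddress.ip_address(a).is_loopback]


if __name__ == "__main__":
    # socket.getfqdn() is only a starting point; substitute the FQDN that
    # Graylog is actually accessed through.
    name = socket.getfqdn()
    try:
        print(name, "->", resolved_addresses(name))
        print("loopback entries:", loopback_addresses(name))
    except socket.gaierror as exc:
        print("could not resolve", name, ":", exc)
```

If the FQDN used to reach Graylog shows up with a loopback address, that would be consistent with the 127.0.0.1:9000 failures in the log.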