I’m setting up a new instance of Graylog with an OpenSearch cluster, but I’m having trouble connecting Graylog to OpenSearch.
Here are the logs:
2023-08-30T04:02:43.754-03:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: unexpected end of stream on http://opensearch01:9200/... - \n not found: limit=0 content=….
2023-08-30T04:02:43.757-03:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: unexpected end of stream on http://opensearch02:9200/... - \n not found: limit=0 content=….
2023-08-30T04:02:43.761-03:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: unexpected end of stream on http://opensearch03:9200/... - \n not found: limit=0 content=….
2023-08-30T04:02:43.761-03:00 INFO [VersionProbe] Elasticsearch is not available. Retry #176
Does anyone know what this problem could be?
I’ve already tried setting elasticsearch_disable_version_check = true, but it still doesn’t work. I also tried elasticsearch_version = 7 to see if startup would get any further, but it didn’t. I then tried the value 2, which is the OpenSearch major version, with the same result.
In the Graylog config, I tested both with and without a user for authentication. Since the logs show no authentication error, I don’t think this is the cause.
The Graylog service starts, but port 9000 never opens.
Graylog Version: 5.1 (single node, for now)
MongoDB: 7.0 (3 node cluster, within graylog server)
OpenSearch Version: 2.9 (3 node cluster)
Operating System: Ubuntu 23.04 on all servers.
Thank you, this helps! I can see what is wrong but not how to fix it. Long story short, there is a very specific way to configure Graylog and OpenSearch to work with TLS auth. I’m looking for resources we can share, but wanted to give you an update in the meantime.
Looking at the OpenSearch cluster logs, I found messages about SSL not being trusted, so I disabled SSL on the HTTP layer, for testing purposes only:
plugins.security.ssl.http.enabled: false
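For anyone hitting the same VersionProbe error: one way to check whether a node’s port 9200 expects TLS is to attempt a TLS handshake against it. The sketch below is a minimal Python example and not part of Graylog or OpenSearch. The helper name speaks_tls is hypothetical, and certificate verification is deliberately switched off because the goal is only to detect the protocol, not to validate trust.

```python
import socket
import ssl


def speaks_tls(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TLS handshake succeeds on host:port.

    Returns False when the peer answers with non-TLS data (for example
    plain HTTP). Connection errors such as a refused connection or a
    timeout are not caught and propagate to the caller.
    """
    context = ssl.create_default_context()
    # Detection only: skip hostname and certificate checks on purpose.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    with socket.create_connection((host, port), timeout=timeout) as raw:
        try:
            with context.wrap_socket(raw, server_hostname=host):
                return True
        except ssl.SSLError:
            # The handshake failed at the TLS layer, so the peer is
            # most likely not speaking TLS on this port.
            return False


# Example usage with the hostnames from the logs above:
# for node in ("opensearch01", "opensearch02", "opensearch03"):
#     print(node, "TLS" if speaks_tls(node, 9200) else "plain HTTP")
```

If this returns True for a node while Graylog is pointed at an http:// URL for it, that mismatch may be what the “unexpected end of stream” error is reporting.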
Then another, unrelated problem showed up. I’m pasting and explaining it here in case anyone else runs into it.
2023-09-05T15:39:51.615-03:00 INFO [ServiceManagerListener] Services are now stopped.
2023-09-05T15:39:51.615-03:00 ERROR [ServerBootstrap] Graylog startup failed. Exiting. Exception was:
java.lang.IllegalStateException: Expected to be healthy after starting. The following services are not running: {FAILED=[JerseyService [FAILED]]}
at com.google.common.util.concurrent.ServiceManager$ServiceManagerState.checkHealthy(ServiceManager.java:769) ~[graylog.jar:?]
at com.google.common.util.concurrent.ServiceManager$ServiceManagerState.awaitHealthy(ServiceManager.java:581) ~[graylog.jar:?]
at com.google.common.util.concurrent.ServiceManager.awaitHealthy(ServiceManager.java:295) ~[graylog.jar:?]
at org.graylog2.bootstrap.ServerBootstrap.startCommand(ServerBootstrap.java:321) [graylog.jar:?]
at org.graylog2.bootstrap.CmdLineTool.doRun(CmdLineTool.java:323) [graylog.jar:?]
at org.graylog2.bootstrap.CmdLineTool.run(CmdLineTool.java:259) [graylog.jar:?]
at org.graylog2.bootstrap.Main.main(Main.java:45) [graylog.jar:?]
Suppressed: com.google.common.util.concurrent.ServiceManager$FailedService: JerseyService [FAILED]
Caused by: java.net.BindException: Permission denied
at sun.nio.ch.Net.bind0(Native Method) ~[?:?]
at sun.nio.ch.Net.bind(Unknown Source) ~[?:?]
at sun.nio.ch.ServerSocketChannelImpl.netBind(Unknown Source) ~[?:?]
at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) ~[?:?]
at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) ~[?:?]
at org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler.bindToChannelAndAddress(TCPNIOBindingHandler.java:107) ~[graylog.jar:?]
at org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler.bind(TCPNIOBindingHandler.java:64) ~[graylog.jar:?]
at org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:215) ~[graylog.jar:?]
at org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:195) ~[graylog.jar:?]
at org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:186) ~[graylog.jar:?]
at org.glassfish.grizzly.http.server.NetworkListener.start(NetworkListener.java:711) ~[graylog.jar:?]
at org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:256) ~[graylog.jar:?]
at org.graylog2.shared.initializers.JerseyService.startUpApi(JerseyService.java:203) ~[graylog.jar:?]
at org.graylog2.shared.initializers.JerseyService.startUp(JerseyService.java:157) ~[graylog.jar:?]
at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) ~[graylog.jar:?]
at com.google.common.util.concurrent.Callables$4.run(Callables.java:121) ~[graylog.jar:?]
at java.lang.Thread.run(Unknown Source) ~[?:?]
2023-09-05T15:39:51.622-03:00 INFO [Server] SIGNAL received. Shutting down.
2023-09-05T15:39:51.631-03:00 INFO [GracefulShutdown] Graceful shutdown initiated.
2023-09-05T15:39:51.632-03:00 INFO [GracefulShutdown] Node status: [Override lb:DEAD [LB:DEAD]]. Waiting <3sec> for possible load balancers to recognize state change.
2023-09-05T15:39:54.641-03:00 INFO [GracefulShutdown] Goodbye.
After some trial and error, I realized I had configured port 443 instead of 9000 while trying to make this work. That caused the “Permission denied” error when binding the socket. I had simply messed up my config along the way. Now it’s running (on port 9000).
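The bind failure fits how Linux treats low ports: by default, ports below 1024 can only be bound by root or by a process with the CAP_NET_BIND_SERVICE capability, and a Graylog service running as an unprivileged user would have neither. The sketch below shows a quick sanity check, assuming the port was set through http_bind_address in server.conf (the post does not say which setting was changed). The function names are hypothetical.

```python
# Linux default: ports below this value need root or CAP_NET_BIND_SERVICE.
PRIVILEGED_PORT_LIMIT = 1024


def bind_port(http_bind_address: str) -> int:
    """Extract the port from a 'host:port' value such as '0.0.0.0:9000'."""
    # rpartition splits on the last colon, so '[::]:9000' also works.
    _host, _sep, port = http_bind_address.rpartition(":")
    return int(port)


def needs_privileges(http_bind_address: str) -> bool:
    """Return True if binding this address would need elevated privileges."""
    return bind_port(http_bind_address) < PRIVILEGED_PORT_LIMIT


print(needs_privileges("0.0.0.0:443"))   # True
print(needs_privileges("0.0.0.0:9000"))  # False
```

To serve the web interface on 443 anyway, a common approach is to keep Graylog on 9000 and put a reverse proxy in front of it.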