Graylog 5.1 and OpenSearch 2.9 integration error

Hello, everyone.

I’m setting up a new instance of Graylog with an OpenSearch cluster, but I’m having trouble connecting Graylog to OpenSearch.

Here are the logs:

2023-08-30T04:02:43.754-03:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: unexpected end of stream on http://opensearch01:9200/... - \n not found: limit=0 content=….
2023-08-30T04:02:43.757-03:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: unexpected end of stream on http://opensearch02:9200/... - \n not found: limit=0 content=….
2023-08-30T04:02:43.761-03:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: unexpected end of stream on http://opensearch03:9200/... - \n not found: limit=0 content=….
2023-08-30T04:02:43.761-03:00 INFO  [VersionProbe] Elasticsearch is not available. Retry #176

Does anyone know what this problem could be?

I’ve already tried setting elasticsearch_disable_version_check = true, but it still doesn’t work. I also tried setting elasticsearch_version = 7 to see if startup would get any further, but it didn’t. I also tried the value 2, which is the OpenSearch major version, and that didn’t work either.

In the Graylog config, I tested both with and without a user for authentication. Since the logs don’t show an authentication error, I don’t think this makes a difference.

The Graylog service starts, but port 9000 never opens.

Graylog Version: 5.1 (single node, for now)
MongoDB: 7.0 (3 node cluster, on the Graylog servers)
OpenSearch Version: 2.9 (3 node cluster)
Operating System: Ubuntu 23.04 on all servers.

Config file:

is_leader = true
node_id_file = /etc/graylog/server/node-id
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = graylog01:9000
stream_aware_field_types=false
elasticsearch_hosts = http://opensearch01:9200,http://opensearch02:9200,http://opensearch03:9200
elasticsearch_max_number_of_indices = 60
elasticsearch_disable_version_check = true
elasticsearch_version = 7
allow_leading_wildcard_searches = true
allow_highlighting = true
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://graylog01:27017,graylog02:27017,graylog03:27017/graylog?replicaSet=gl-mongodb-cluster
mongodb_max_connections = 1000

It looks like Graylog cannot communicate with OpenSearch. Can you share your OpenSearch config?

Can you verify that you can reach the OpenSearch cluster and get a response when you query its API (even the default page, http://host:9200)?

Sure…

Opensearch config:

cluster.name: graylog-cluster
node.name: opensearch03
path.data: /zfsStore/data
path.logs: /zfsStore/log
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["opensearch01", "opensearch02", "opensearch03"]
cluster.initial_cluster_manager_nodes: ["opensearch01", "opensearch02", "opensearch03"]
plugins.security.ssl.transport.pemcert_filepath: esnode.pem
plugins.security.ssl.transport.pemkey_filepath: esnode-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: esnode.pem
plugins.security.ssl.http.pemkey_filepath: esnode-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem
plugins.security.allow_unsafe_democertificates: true
plugins.security.allow_default_init_securityindex: true
plugins.security.authcz.admin_dn:
  - CN=kirk,OU=client,O=client,L=test, C=de
plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [".plugins-ml-model", ".plugins-ml-task", ".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opensearch-notifications-*", ".opensearch-notebooks", ".opensearch-observability", ".opendistro-asynchronous-search-response*", ".replication-metadata-store"]
node.max_local_storage_nodes: 3
action.auto_create_index: false

Communication test.

root@graylog01:~# curl -X GET -k -u admin:admin https://opensearch01:9200/_cluster/health?pretty
{
  "cluster_name" : "graylog-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "discovered_master" : true,
  "discovered_cluster_manager" : true,
  "active_primary_shards" : 1,
  "active_shards" : 3,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Thank you, this helps! I can see what is wrong but not how to fix it. Long story short, there is a very specific way to configure Graylog and OpenSearch to work with TLS auth. I’m looking for resources we can share, but wanted to give you an update in the meantime.
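
For anyone comparing the snippets above: the successful curl test talks to https://opensearch01:9200, while elasticsearch_hosts in the Graylog config uses http://, and the OpenSearch config has plugins.security.ssl.http.enabled: true. A plaintext request sent to a TLS-only port would plausibly explain the "unexpected end of stream" error. Below is a minimal shell sketch for checking which URL schemes a Graylog config uses. The function name hosts_schemes is made up for illustration, and the config path in the example comment is an assumption.

```shell
# Hypothetical helper: print the distinct URL schemes found in elasticsearch_hosts.
# $1 = path to the Graylog server config file
hosts_schemes() {
  grep -E '^[[:space:]]*elasticsearch_hosts[[:space:]]*=' "$1" \
    | cut -d= -f2- \
    | tr ',' '\n' \
    | sed -E 's/^[[:space:]]+//; s#://.*$##' \
    | sort -u
}

# Example (path is an assumption, adjust to your install):
#   hosts_schemes /etc/graylog/server/server.conf
```

For the config posted in this thread it prints http. If the nodes only serve HTTPS, switching the URLs to https:// is the likely first step, in addition to making Graylog trust the certificate.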

If it’s a TLS issue, you only need to add the CA certificate (the one that signed the OpenSearch certificate) to the Graylog truststore, and it works fine.
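
A rough sketch of that truststore step, assuming the CA file is the root-ca.pem referenced in the OpenSearch config above and that Graylog is pointed at a copy of the JVM truststore rather than the original. The function name, the alias, and all paths are placeholders; changeit is the usual default password of the JVM cacerts file.

```shell
# Hypothetical helper: import a CA certificate into a Java truststore.
# $1 = CA certificate (PEM), $2 = truststore file, $3 = truststore password (default: changeit)
import_ca() {
  if [ ! -r "$1" ]; then
    echo "CA certificate not readable: $1" >&2
    return 1
  fi
  keytool -importcert -noprompt \
    -alias opensearch-root-ca \
    -file "$1" \
    -keystore "$2" \
    -storepass "${3:-changeit}"
}

# Example (paths are assumptions, adjust to your install):
#   cp "$JAVA_HOME/lib/security/cacerts" /etc/graylog/server/cacerts.jks
#   import_ca /path/to/root-ca.pem /etc/graylog/server/cacerts.jks
# Then point the Graylog JVM at the copy through its Java options:
#   -Djavax.net.ssl.trustStore=/etc/graylog/server/cacerts.jks
```

Working on a copy means a JVM or package upgrade does not silently drop the imported CA.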

Looking at the OpenSearch cluster logs, I found something about SSL not being trusted, so I tried disabling SSL, for testing purposes only.

plugins.security.ssl.http.enabled: false

Then another problem showed up, unrelated to the first. I’m pasting and explaining it anyway in case anyone needs it.

2023-09-05T15:39:51.615-03:00 INFO  [ServiceManagerListener] Services are now stopped.
2023-09-05T15:39:51.615-03:00 ERROR [ServerBootstrap] Graylog startup failed. Exiting. Exception was:
java.lang.IllegalStateException: Expected to be healthy after starting. The following services are not running: {FAILED=[JerseyService [FAILED]]}
        at com.google.common.util.concurrent.ServiceManager$ServiceManagerState.checkHealthy(ServiceManager.java:769) ~[graylog.jar:?]
        at com.google.common.util.concurrent.ServiceManager$ServiceManagerState.awaitHealthy(ServiceManager.java:581) ~[graylog.jar:?]
        at com.google.common.util.concurrent.ServiceManager.awaitHealthy(ServiceManager.java:295) ~[graylog.jar:?]
        at org.graylog2.bootstrap.ServerBootstrap.startCommand(ServerBootstrap.java:321) [graylog.jar:?]
        at org.graylog2.bootstrap.CmdLineTool.doRun(CmdLineTool.java:323) [graylog.jar:?]
        at org.graylog2.bootstrap.CmdLineTool.run(CmdLineTool.java:259) [graylog.jar:?]
        at org.graylog2.bootstrap.Main.main(Main.java:45) [graylog.jar:?]
        Suppressed: com.google.common.util.concurrent.ServiceManager$FailedService: JerseyService [FAILED]
        Caused by: java.net.BindException: Permission denied
                at sun.nio.ch.Net.bind0(Native Method) ~[?:?]
                at sun.nio.ch.Net.bind(Unknown Source) ~[?:?]
                at sun.nio.ch.ServerSocketChannelImpl.netBind(Unknown Source) ~[?:?]
                at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) ~[?:?]
                at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) ~[?:?]
                at org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler.bindToChannelAndAddress(TCPNIOBindingHandler.java:107) ~[graylog.jar:?]
                at org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler.bind(TCPNIOBindingHandler.java:64) ~[graylog.jar:?]
                at org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:215) ~[graylog.jar:?]
                at org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:195) ~[graylog.jar:?]
                at org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:186) ~[graylog.jar:?]
                at org.glassfish.grizzly.http.server.NetworkListener.start(NetworkListener.java:711) ~[graylog.jar:?]
                at org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:256) ~[graylog.jar:?]
                at org.graylog2.shared.initializers.JerseyService.startUpApi(JerseyService.java:203) ~[graylog.jar:?]
                at org.graylog2.shared.initializers.JerseyService.startUp(JerseyService.java:157) ~[graylog.jar:?]
                at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) ~[graylog.jar:?]
                at com.google.common.util.concurrent.Callables$4.run(Callables.java:121) ~[graylog.jar:?]
                at java.lang.Thread.run(Unknown Source) ~[?:?]
2023-09-05T15:39:51.622-03:00 INFO  [Server] SIGNAL received. Shutting down.
2023-09-05T15:39:51.631-03:00 INFO  [GracefulShutdown] Graceful shutdown initiated.
2023-09-05T15:39:51.632-03:00 INFO  [GracefulShutdown] Node status: [Override lb:DEAD [LB:DEAD]]. Waiting <3sec> for possible load balancers to recognize state change.
2023-09-05T15:39:54.641-03:00 INFO  [GracefulShutdown] Goodbye.

After some trial and error, I realized I had been using port 443 instead of 9000 while trying to make this work. That caused the “Permission denied” error when binding the socket. I had simply messed up my config along the way. Now it’s running (on port 9000).
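
For context on the bind error: on Linux, ports below 1024 are privileged by default, so a service running as a non-root user cannot bind to 443 without extra capabilities, while 9000 is fine. A small sketch of that rule follows. The function name is made up, and the 1024 threshold is the usual Linux default (net.ipv4.ip_unprivileged_port_start), which may differ on a tuned system.

```shell
# Hypothetical helper: succeed (exit 0) if binding the port needs elevated privileges.
# $1 = TCP port, $2 = lowest unprivileged port (optional, defaults to 1024)
is_privileged_port() {
  [ "$1" -lt "${2:-1024}" ]
}

for port in 443 9000; do
  if is_privileged_port "$port"; then
    echo "$port: needs root or CAP_NET_BIND_SERVICE"
  else
    echo "$port: fine for an unprivileged user"
  fi
done
```

If the web interface really has to be reachable on 443, a reverse proxy in front of port 9000 is a common alternative to running the service with extra privileges.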

Thank you all.

Configuring authentication in opensearch

/graylog/opensearch/config/opensearch-security/config.yml
