DNS resolution issues with multi-node setup

I’m running Graylog in Kubernetes, but when I add an additional node (as a pod), I can’t view any metrics for the second pod in the UI.

I’ve tried changing the publish URI and the HTTP bind URI, to no avail. I verified that pod-to-pod connectivity is good. I think the cause is that one pod is trying to reach another by a name that Kubernetes DNS can’t resolve, but I’m not sure whether that is actually the issue or how to fix it. I would very much appreciate a tip!

Running Graylog v3.3.15 as a StatefulSet in OKD 4.13.0-0.okd-2023-09-03-082426.
Here is some log output:
2024-02-23 20:57:36,038 WARN : org.graylog2.shared.rest.resources.ProxiedResource - Unable to call http://graylog-workers-1:9000/api/system/metrics/multiple on node : graylog-workers-1

I don’t know k8s well enough to help directly, but the publish URI is exactly the setting Graylog uses to know how the nodes communicate with each other.

Thanks. I should set that to the endpoint where I access Graylog, then, right? For example: http_publish_uri = k8s-graylog-dev.local

I ask because I just tried setting it to the same value as my external URI (external_uri = k8s-graylog-dev.local), to no avail.

The publish URI is the address that other nodes use to talk to this node. The external URI is the address of the load balancer, that is, what you type into your web browser to reach the web interface.

Thanks. It seems the nodes know about each other, since one of my worker pods attempts to call the second one once it has spun up. But the metrics calls can’t succeed because of how k8s DNS works. The pod is calling “graylog-workers-1”, but I need it to call the other node via its k8s DNS name (graylog-workers-0.service-name.namespace.svc.cluster.local). Is there any way to support this, aside from directly renaming our nodes to match k8s DNS?
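One possible approach, sketched here rather than confirmed in this thread, is to give each pod a publish URI built from its own stable StatefulSet DNS name instead of the short hostname. The helper below is hypothetical; the service and namespace values are the placeholders from the post above, and port 9000 is taken from the log line. It assumes the StatefulSet is backed by a headless service, which is what makes per-pod names of this form resolvable.

```python
def publish_uri(pod_name: str, service: str, namespace: str,
                port: int = 9000, cluster_domain: str = "cluster.local") -> str:
    """Build a per-pod URI from the StatefulSet stable DNS pattern.

    Pattern: <pod>.<headless-service>.<namespace>.svc.<cluster-domain>
    All argument values are supplied by the caller; nothing is looked up.
    """
    fqdn = f"{pod_name}.{service}.{namespace}.svc.{cluster_domain}"
    # Trailing slash kept so the value can be used directly as a base URI.
    return f"http://{fqdn}:{port}/"


# Example with the placeholder names used in this thread (assumptions):
uri = publish_uri("graylog-workers-1", "service-name", "namespace")
print(uri)
```

The resulting string would then be set as http_publish_uri for that specific pod, for example from an entrypoint script that reads the pod's hostname. Whether this resolves the metrics error in this particular cluster is untested here.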

One more thing: do you happen to know which variable Graylog uses to make this node-specific API call? /api/system/metrics/multiple

If you open the browser inspector on any page of the UI that shows constantly updating information, you will quickly see how this API works, since the UI calls it every couple of seconds. Ideally, do that on a page that displays the metrics you want to fetch via the API.
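To reproduce outside the browser what the inspector shows, a request can be built by hand. This is a minimal sketch under assumptions: the JSON body shape ({"metrics": [...]}), the X-Requested-By header, and basic authentication should all be checked against what your own inspector displays. The function name, metric name, and credentials are hypothetical placeholders. The code only constructs the request; it does not send it.

```python
import base64
import json
import urllib.request


def build_metrics_request(node_base_url, metric_names, username, password):
    """Construct (but do not send) a POST to the metrics endpoint of one node."""
    url = node_base_url.rstrip("/") + "/api/system/metrics/multiple"
    # Assumed body shape; compare with the payload shown in the browser inspector.
    body = json.dumps({"metrics": list(metric_names)}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": f"Basic {token}",
        # Assumption: the API expects this header on non-GET requests.
        "X-Requested-By": "manual-check",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder node address (from the log line), metric name, and credentials.
req = build_metrics_request("http://graylog-workers-1:9000/",
                            ["example.metric.name"], "user", "pass")
print(req.get_method(), req.full_url)
```

Sending it with urllib.request.urlopen(req) from inside another Graylog pod would show whether the target name resolves from there, which is the same call that fails in the log above.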
