I’m running a new Graylog 6.2.0 cluster with nodes distributed across multiple geographical regions. When I log in to a node in one region, I’m unable to view performance metrics (e.g., Memory/Heap, Buffers, Journal) for nodes in other regions. However, when I log in to a node in the same region as the node I’m inspecting, all of its metrics display correctly.
In the logs, I’m consistently seeing inter-node API timeouts like this:
2025-04-29T18:30:10.070Z WARN [ProxiedResource] Failed to call API on node <1ce8335f-e3a9-4d66-b1eb-4bdcdedf827b>, cause: timeout (duration: 1002 ms)
2025-04-29T18:30:10.070Z WARN [ProxiedResource] Failed to call API on node <09ac9ab1-ec2c-4e54-88a0-1c74f0291dfe>, cause: timeout (duration: 1001 ms)
2025-04-29T18:30:10.070Z WARN [ProxiedResource] Failed to call API on node <a1e3a52c-7d66-4514-afe0-3e4d7afb5e68>, cause: timeout (duration: 1001 ms)
2025-04-29T18:30:10.070Z WARN [ProxiedResource] Failed to call API on node <1db2c9da-8c1c-4f24-9b6e-b85f49fadd94>, cause: timeout (duration: 1002 ms)
2025-04-29T18:31:10.071Z WARN [ProxiedResource] Failed to call API on node <c81809a0-b020-489f-892c-15211dd73696>, cause: timeout (duration: 1001 ms)
2025-04-29T18:31:10.071Z WARN [ProxiedResource] Failed to call API on node <1ce8335f-e3a9-4d66-b1eb-4bdcdedf827b>, cause: timeout (duration: 1001 ms)
2025-04-29T18:31:10.071Z WARN [ProxiedResource] Failed to call API on node <1db2c9da-8c1c-4f24-9b6e-b85f49fadd94>, cause: timeout (duration: 1001 ms)
2025-04-29T18:31:10.071Z WARN [ProxiedResource] Failed to call API on node <09ac9ab1-ec2c-4e54-88a0-1c74f0291dfe>, cause: timeout (duration: 1001 ms)
2025-04-29T18:31:10.071Z WARN [ProxiedResource] Failed to call API on node <a1e3a52c-7d66-4514-afe0-3e4d7afb5e68>, cause: timeout (duration: 1001 ms)
2025-04-29T18:31:10.071Z WARN [ProxiedResource] Failed to call API on node <7569ace1-6ea9-4f67-8531-da9c0919ee3d>, cause: timeout (duration: 1002 ms)
2025-04-29T18:31:58.277Z WARN [ProxiedResource] Failed to call API on node <a1e3a52c-7d66-4514-afe0-3e4d7afb5e68>, cause: timeout (duration: 1001 ms)
2025-04-29T18:31:58.277Z WARN [ProxiedResource] Failed to call API on node <c81809a0-b020-489f-892c-15211dd73696>, cause: timeout (duration: 1001 ms)
2025-04-29T18:31:58.277Z WARN [ProxiedResource] Failed to call API on node <1ce8335f-e3a9-4d66-b1eb-4bdcdedf827b>, cause: timeout (duration: 1001 ms)
2025-04-29T18:31:58.277Z WARN [ProxiedResource] Failed to call API on node <1db2c9da-8c1c-4f24-9b6e-b85f49fadd94>, cause: timeout (duration: 1001 ms)
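To see which nodes are affected and how close each failure sits to the 1-second mark, the warnings above can be tallied with a short script. This is only a minimal Python sketch, assuming the log lines have exactly the format pasted above; summarize_timeouts and sample are names made up for illustration and are not part of Graylog.

```python
import re
from collections import Counter

# Matches the ProxiedResource timeout warnings in the format pasted above.
# Assumption: the real log file uses this same layout.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) WARN \[ProxiedResource\] Failed to call API on node "
    r"<(?P<node>[0-9a-f-]+)>, cause: timeout \(duration: (?P<ms>\d+) ms\)"
)


def summarize_timeouts(lines):
    """Return (Counter of timeouts per node ID, largest duration in ms)."""
    counts = Counter()
    max_ms = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a ProxiedResource timeout warning
        counts[match.group("node")] += 1
        max_ms = max(max_ms, int(match.group("ms")))
    return counts, max_ms


# Two of the lines from the excerpt above, used as sample input.
sample = [
    "2025-04-29T18:30:10.070Z WARN [ProxiedResource] Failed to call API on node "
    "<1ce8335f-e3a9-4d66-b1eb-4bdcdedf827b>, cause: timeout (duration: 1002 ms)",
    "2025-04-29T18:31:10.071Z WARN [ProxiedResource] Failed to call API on node "
    "<1ce8335f-e3a9-4d66-b1eb-4bdcdedf827b>, cause: timeout (duration: 1001 ms)",
]

counts, max_ms = summarize_timeouts(sample)
for node, count in counts.most_common():
    print(node, count)
print("max duration (ms):", max_ms)
```

If every duration stays just above 1000 ms regardless of the configured value, that would suggest the 5s setting is not the timeout being applied to these calls.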
I’ve tried uncommenting and setting proxied_requests_default_call_timeout = 5s in server.conf, but it doesn’t seem to have any effect: the timeouts still occur at around 1 second. I’ve also reviewed the config file but couldn’t find any other relevant settings to adjust this timeout.
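One thing that can be ruled out mechanically is that the setting is still commented out or defined more than once in the file. The following is a rough Python sketch of that check, assuming server.conf is a plain key = value properties file; effective_setting and the sample text are hypothetical and only illustrate the idea.

```python
def effective_setting(conf_text, key):
    """Return (active_values, commented_values) for `key`.

    Assumes a properties-style file: one `key = value` per line,
    with `#` marking a commented-out line.
    """
    active, commented = [], []
    for raw in conf_text.splitlines():
        line = raw.strip()
        is_comment = line.startswith("#")
        body = line.lstrip("#").strip()
        if "=" not in body:
            continue
        name, _, value = body.partition("=")
        if name.strip() != key:
            continue
        (commented if is_comment else active).append(value.strip())
    return active, commented


# Hypothetical excerpt: one commented-out default and one active override.
sample_conf = """\
#proxied_requests_default_call_timeout = 5s
proxied_requests_default_call_timeout = 5s
"""

active, commented = effective_setting(
    sample_conf, "proxied_requests_default_call_timeout"
)
print("active:", active)
print("commented out:", commented)
```

To run it against a real node, replace sample_conf with the contents of that node's server.conf (on package installs this is commonly /etc/graylog/server/server.conf, but the path is an assumption here). An empty active list would mean the line is still commented out, and more than one entry would mean the key is defined multiple times. The check would need to be repeated on every node, and a node only picks up the change after a restart.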
Everything else in the cluster appears to be functioning properly. This issue is specifically with viewing node performance stats across regions.
Environment Details:
- OS: Ubuntu 24.04
- Graylog Version: 6.2.0
- Number of Graylog nodes: 9 (will be scaling to 20+)
- OpenSearch Version: 2.15.0
- Number of OpenSearch data nodes: 32
Is there a new or alternative setting in Graylog 6.2.0 to increase the inter-node API call timeout? Any help would be appreciated.