1. Describe your incident:
Overnight, the physical server that hosts my Graylog cluster went down. Since bringing it back up, I get a blank page when attempting to access the cluster via Cloudflare Load Balancing.
My setup is as follows:
2. Describe your environment:
- OS Information:
All environments are Debian 12 within an LXD container hosted by PVE
- Package Version:
Graylog 5.1.5
MongoDB 6.0.10
OpenSearch 2.9.0
- Service logs, configurations, and environment variables:
All server.confs are the same with the following difference:
is_leader = true
applies to grayserver-1 / 192.168.1.1
is_leader = false
applies to grayserver-{2,3} / 192.168.1.{2,3}
and the IPs/FQDNs are changed to match the respective host:
http_bind_address = 192.168.1.1:9000
http_publish_uri = http://192.168.1.1.9000/
http_external_uri = https://grayserver-1.foo.bar/
http_enable_cors = true
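As a sanity check on the URI-valued settings above, here is a minimal Python sketch that parses each value exactly as pasted in this post and flags a host that looks like an IP address but does not parse as one. The helper name `check_uri` is hypothetical and not part of Graylog; the values are copied from this post, so whether anything it flags is a typo here or in the actual server.conf would need to be confirmed on the server itself.

```python
from ipaddress import ip_address
from urllib.parse import urlsplit


def check_uri(name: str, value: str) -> list[str]:
    """Return a list of problems found in one URI-valued setting."""
    problems = []
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https"):
        problems.append(f"{name}: unexpected scheme {parts.scheme!r}")
    host = parts.hostname or ""
    if not host:
        problems.append(f"{name}: no host in {value!r}")
    # A host made only of digits and dots should parse as an IP address.
    elif host.replace(".", "").isdigit():
        try:
            ip_address(host)
        except ValueError:
            problems.append(f"{name}: {host!r} is not a valid IP address")
    return problems


# Values as written in this post for grayserver-1 (assumption: they match the file).
settings = {
    "http_publish_uri": "http://192.168.1.1.9000/",
    "http_external_uri": "https://grayserver-1.foo.bar/",
}

for name, value in settings.items():
    for problem in check_uri(name, value):
        print(problem)
```

The check is deliberately narrow: it only catches a malformed scheme or host, not whether the address is reachable or matches `http_bind_address`.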
Further Info:
If I navigate to https://graylog.foo.bar, I get the blank screen.
If I navigate to any of https://grayserver-{1,2,3}.foo.bar, it loads properly.
The server.log on each of the grayserver-{1,2,3} servers shows only this recent error (on grayserver-1):
2023-09-26T10:29:49.883-04:00 ERROR [IndexRotationThread] Couldn't point deflector to a new index
java.lang.IllegalStateException: No index size
    at org.graylog2.indexer.rotation.strategies.TimeBasedSizeOptimizingStrategy.lambda$shouldRotate$1(TimeBasedSizeOptimizingStrategy.java:81) ~[graylog.jar:?]
    at java.util.Optional.orElseThrow(Unknown Source) ~[?:?]
    at org.graylog2.indexer.rotation.strategies.TimeBasedSizeOptimizingStrategy.shouldRotate(TimeBasedSizeOptimizingStrategy.java:81) ~[graylog.jar:?]
    at org.graylog2.indexer.rotation.strategies.AbstractRotationStrategy.rotate(AbstractRotationStrategy.java:77) ~[graylog.jar:?]
    at org.graylog2.periodical.IndexRotationThread.checkForRotation(IndexRotationThread.java:127) ~[graylog.jar:?]
    at org.graylog2.periodical.IndexRotationThread.lambda$doRun$0(IndexRotationThread.java:91) ~[graylog.jar:?]
    at java.lang.Iterable.forEach(Unknown Source) [?:?]
    at org.graylog2.periodical.IndexRotationThread.doRun(IndexRotationThread.java:87) [graylog.jar:?]
    at org.graylog2.plugin.periodical.Periodical.run(Periodical.java:99) [graylog.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
    at java.util.concurrent.FutureTask.runAndReset(Unknown Source) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
I presume this is related to an incident last week, when a separate physical server housing another OpenSearch node that was part of the cluster crashed:
The browser throws these errors when navigating to https://graylog.foo.bar:
But no errors when navigating to https://grayserver-{1,2,3}.foo.bar
NOTE: All HTTPS is handled by Cloudflare via Cloudflare Tunnel.
EDIT: I forgot to add, here is an example of how the Cloudflare tunnel is configured:
EDIT 2: A bit more info: Sidecars appear to be working fine (i.e. Graylog sees them, and they are configured to look for https://graylog.foo.bar). However, the log data itself is sent to an NGINX load balancer that is defined within the configs pushed by the sidecar.
Any help is greatly appreciated; thank you!