1. Incident:
Hello,
After upgrading Graylog from version 4.2.7 to version 4.3.3, my Graylog node started but did not connect to the cluster. I found the following Java error in the log:
2022-08-04 13:32:22,514 INFO [ServerBootstrap] - Graylog server 4.3.3+86369d3 starting up - {}
2022-08-04 13:32:22,514 INFO [ServerBootstrap] - JRE: Oracle Corporation 1.8.0_332 on Linux 5.4.170+ - {}
2022-08-04 13:32:22,514 INFO [ServerBootstrap] - Deployment: docker - {}
2022-08-04 13:32:22,515 INFO [ServerBootstrap] - OS: Debian GNU/Linux 11 (bullseye) (debian) - {}
2022-08-04 13:32:22,515 INFO [ServerBootstrap] - Arch: amd64 - {}
2022-08-04 13:32:22,564 INFO [PeriodicalsService] - Starting 7 periodicals ... - {}
2022-08-04 13:32:22,564 INFO [PeriodicalsService] - Delaying start of 21 periodicals until this node becomes leader ... - {}
2022-08-04 13:32:22,565 INFO [Periodicals] - Starting [org.graylog2.periodical.GarbageCollectionWarningThread] periodical, running forever. - {}
2022-08-04 13:32:22,577 INFO [Periodicals] - Starting [org.graylog2.periodical.TrafficCounterCalculator] periodical in [0s], polling every [1s]. - {}
2022-08-04 13:32:22,597 INFO [Periodicals] - Starting [org.graylog2.periodical.NodePingThread] periodical in [0s], polling every [1s]. - {}
2022-08-04 13:32:22,608 INFO [Periodicals] - Starting [org.graylog2.events.ClusterEventPeriodical] periodical in [0s], polling every [1s]. - {}
2022-08-04 13:32:22,611 INFO [Periodicals] - Starting [org.graylog2.periodical.ThroughputCalculator] periodical in [0s], polling every [1s]. - {}
2022-08-04 13:32:22,638 INFO [Periodicals] - Starting [org.graylog2.periodical.BatchedElasticSearchOutputFlushThread] periodical in [0s], polling every [1s]. - {}
2022-08-04 13:32:22,642 INFO [Periodicals] - Starting [org.graylog2.periodical.ThrottleStateUpdaterThread] periodical in [1s], polling every [1s]. - {}
2022-08-04 13:32:22,670 INFO [connection] - Opened connection [connectionId{localValue:12, serverValue:2123918}] to mongodb-secondary-0.mongodb.graylog.svc.cluster.local:27017 - {}
2022-08-04 13:32:22,854 INFO [PrometheusExporterHTTPServer] - Exporting Prometheus metrics on <0.0.0.0:9833> via HTTP - {}
2022-08-04 13:32:22,882 INFO [JerseyService] - Enabling CORS for HTTP endpoint - {}
2022-08-04 13:32:24,455 INFO [NetworkListener] - Started listener bound to [0.0.0.0:9000] - {}
2022-08-04 13:32:24,457 INFO [HttpServer] - [HttpServer] Started. - {}
2022-08-04 13:32:24,457 INFO [JerseyService] - Started REST API at <0.0.0.0:9000> - {}
2022-08-04 13:32:24,457 INFO [ServiceManagerListener] - Services are healthy - {}
2022-08-04 13:32:24,457 INFO [JobSchedulerService] - Job scheduler execution is disabled. Waiting and trying again until enabled. - {}
2022-08-04 13:32:24,458 INFO [ServerBootstrap] - Services started, startup times in ms: {FailureHandlingService [RUNNING]=5, UserSessionTerminationService [RUNNING]=11, JobSchedulerService [RUNNING]=27, InputSetupService [RUNNING]=28, BufferSynchronizerService [RUNNING]=32, OutputSetupService [RUNNING]=32, LocalKafkaMessageQueueWriter [RUNNING]=33, UrlWhitelistService [RUNNING]=33, GracefulShutdownService [RUNNING]=34, LocalKafkaMessageQueueReader [RUNNING]=36, EtagService [RUNNING]=84, ConfigurationEtagService [RUNNING]=86, LocalKafkaJournal [RUNNING]=89, MongoDBProcessingStatusRecorderService [RUNNING]=93, LookupTableService [RUNNING]=94, PeriodicalsService [RUNNING]=105, StreamCacheService [RUNNING]=115, PrometheusExporter [RUNNING]=283, JerseyService [RUNNING]=1895} - {}
2022-08-04 13:32:24,458 INFO [InputSetupService] - Triggering launching persisted inputs, node transitioned from Uninitialized [LB:DEAD] to Running [LB:ALIVE] - {}
2022-08-04 13:32:24,467 INFO [ServerBootstrap] - Graylog server up and running. - {}
2022-08-04 13:32:24,472 INFO [InputLauncher] - Launching input [Beats/PubSub-input/620a4cab38926a16a72d774c] - desired state is RUNNING - {}
2022-08-04 13:32:24,477 INFO [InputStateListener] - Input [Beats/620a4cab38926a16a72d774c] is now STARTING - {}
2022-08-04 13:32:24,542 INFO [InputStateListener] - Input [Beats/620a4cab38926a16a72d774c] is now RUNNING - {}
2022-08-04 13:33:00,226 ERROR [AnyExceptionClassMapper] - Unhandled exception in REST resource - {}
java.lang.NullPointerException: null
at org.graylog2.cluster.NodeImpl.isLeader(NodeImpl.java:51) ~[graylog.jar:?]
at org.graylog2.rest.resources.system.ClusterResource.nodeSummary(ClusterResource.java:110) ~[graylog.jar:?]
at org.graylog2.rest.resources.system.ClusterResource.nodes(ClusterResource.java:76) ~[graylog.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_332]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_332]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_332]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_332]
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) ~[graylog.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124) ~[graylog.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167) ~[graylog.jar:?]
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219) ~[graylog.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79) ~[graylog.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:469) ~[graylog.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:391) ~[graylog.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:80) ~[graylog.jar:?]
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255) [graylog.jar:?]
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) [graylog.jar:?]
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) [graylog.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:292) [graylog.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:274) [graylog.jar:?]
at org.glassfish.jersey.internal.Errors.process(Errors.java:244) [graylog.jar:?]
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) [graylog.jar:?]
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234) [graylog.jar:?]
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:680) [graylog.jar:?]
at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:356) [graylog.jar:?]
at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200) [graylog.jar:?]
at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180) [graylog.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_332]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_332]
at java.lang.Thread.run(Thread.java:750) [?:1.8.0_332]
I don’t know what this error means. I tried the upgrade on a very small cluster with only one Beats input. The cluster has 3 nodes and I was updating them one by one, but this error appeared on the first node. After I downgraded the Graylog node to the previous version, it connected to the Graylog cluster successfully.
2. Environment:
The Graylog cluster is running in Kubernetes.
Kubernetes version: 1.21.9-gke.300
Elasticsearch version: 7.10.1
MongoDB version: 4.2.1
I don’t know whether this error is related to the Graylog input or to the connection to one of the databases (Elasticsearch or MongoDB), but I didn’t find any errors about database connections in the logs.
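For anyone who wants to repeat the log check, here is a minimal sketch of how the log can be filtered for non-INFO entries. It assumes the line layout shown in the log above (timestamp, level, logger in square brackets, message); the function name `find_problems` and the sample lines are only illustrative and are not part of Graylog.

```python
import re

# Matches lines shaped like the ones in the log above, for example:
# 2022-08-04 13:32:22,514 INFO [ServerBootstrap] - Graylog server ... - {}
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\] - (?P<msg>.*)$"
)


def find_problems(lines, levels=("WARN", "ERROR", "FATAL")):
    """Return (timestamp, level, logger, message) for entries at the given levels.

    Lines that do not match the layout (stack trace frames, for example)
    are skipped.
    """
    problems = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            problems.append(
                (
                    match.group("ts"),
                    match.group("level"),
                    match.group("logger"),
                    match.group("msg"),
                )
            )
    return problems


# Sample lines copied from the log above.
sample = [
    "2022-08-04 13:32:24,467 INFO [ServerBootstrap] - Graylog server up and running. - {}",
    "2022-08-04 13:33:00,226 ERROR [AnyExceptionClassMapper] - Unhandled exception in REST resource - {}",
    "java.lang.NullPointerException: null",
]

for entry in find_problems(sample):
    print(entry)
```

To run it against a saved log, pass an open file object instead of `sample`. In the excerpt above, the only entry it would report is the `AnyExceptionClassMapper` error.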
Thanks for any help or information about what could be wrong.