Graylog does not start after upgrade to 4.3

I upgraded Graylog to version 4.3 using yum on my CentOS 7 server. However, after the upgrade the server did not start, and I found the following messages in “/var/log/graylog-server/server.log”:

2022-05-30T20:14:57.326+03:00 INFO  [ImmutableFeatureFlagsCollector] Following feature flags are used: {}
2022-05-30T20:14:59.717+03:00 INFO  [CmdLineTool] Loaded plugin: AWS plugins 4.3.0 [org.graylog.aws.AWSPlugin]
2022-05-30T20:14:59.718+03:00 INFO  [CmdLineTool] Loaded plugin: Collector 4.3.0 [org.graylog.plugins.collector.CollectorPlugin]
2022-05-30T20:14:59.720+03:00 INFO  [CmdLineTool] Loaded plugin: Threat Intelligence Plugin 4.3.0 [org.graylog.plugins.threatintel.ThreatIntelPlugin]
2022-05-30T20:14:59.720+03:00 INFO  [CmdLineTool] Loaded plugin: Elasticsearch 6 Support 4.3.0+7c09aad [org.graylog.storage.elasticsearch6.Elasticsearch6Plugin]
2022-05-30T20:14:59.720+03:00 INFO  [CmdLineTool] Loaded plugin: Elasticsearch 7 Support 4.3.0+7c09aad [org.graylog.storage.elasticsearch7.Elasticsearch7Plugin]
2022-05-30T20:14:59.820+03:00 INFO  [CmdLineTool] Running with JVM arguments: -Xms1g -Xmx1g -XX:NewRatio=1 -XX:+ResizeTLAB -XX:-OmitStackTraceInFastThrow -Djdk.tls.acknowledgeCloseNotify=true -Dlog4j2.formatMsgNoLookups=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -Dlog4j.configurationFile=file:///etc/graylog/server/log4j2.xml -Djava.library.path=/usr/share/graylog-server/lib/sigar -Dgraylog2.installation_source=rpm
2022-05-30T20:15:01.689+03:00 INFO  [cluster] Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=5000}
2022-05-30T20:15:01.850+03:00 INFO  [cluster] Cluster description not yet available. Waiting for 30000 ms before timing out
2022-05-30T20:15:01.967+03:00 INFO  [connection] Opened connection [connectionId{localValue:1, serverValue:389}] to localhost:27017
2022-05-30T20:15:01.987+03:00 INFO  [cluster] Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 0, 28]}, minWireVersion=0, maxWireVersion=7, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=17597790}
2022-05-30T20:15:02.077+03:00 INFO  [connection] Opened connection [connectionId{localValue:2, serverValue:390}] to localhost:27017
2022-05-30T20:15:02.175+03:00 INFO  [connection] Closed connection [connectionId{localValue:2, serverValue:390}] to localhost:27017 because the pool has been closed.
2022-05-30T20:15:02.178+03:00 INFO  [MongoDBPreflightCheck] Connected to MongoDB version 4.0.28
2022-05-30T20:15:02.707+03:00 INFO  [SearchDbPreflightCheck] Connected to (Elastic/Open)Search version <Elasticsearch:6.8.12>
2022-05-30T20:15:03.305+03:00 INFO  [Version] HV000001: Hibernate Validator null
2022-05-30T20:15:13.178+03:00 INFO  [InputBufferImpl] Message journal is enabled.
2022-05-30T20:15:13.265+03:00 INFO  [NodeId] Node ID: 8eab9e8e-2e3e-4409-8ee5-06609758fe11
2022-05-30T20:15:13.998+03:00 INFO  [LogManager] Loading logs.
2022-05-30T20:15:14.101+03:00 WARN  [Log] Found a corrupted index file, /var/lib/graylog-server/journal/messagejournal-0/00000000004392243507.index, deleting and rebuilding index...
2022-05-30T20:15:19.815+03:00 INFO  [LogManager] Logs loading complete.
2022-05-30T20:15:19.821+03:00 INFO  [LocalKafkaJournal] Initialized Kafka based journal at /var/lib/graylog-server/journal
2022-05-30T20:15:19.845+03:00 INFO  [cluster] Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=5000}
2022-05-30T20:15:19.859+03:00 INFO  [cluster] Cluster description not yet available. Waiting for 30000 ms before timing out
2022-05-30T20:15:19.869+03:00 INFO  [connection] Opened connection [connectionId{localValue:3, serverValue:391}] to localhost:27017
2022-05-30T20:15:19.871+03:00 INFO  [cluster] Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 0, 28]}, minWireVersion=0, maxWireVersion=7, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=1095149}
2022-05-30T20:15:19.887+03:00 INFO  [connection] Opened connection [connectionId{localValue:4, serverValue:392}] to localhost:27017
2022-05-30T20:15:20.303+03:00 INFO  [InputBufferImpl] Initialized InputBufferImpl with ring size <65536> and wait strategy <BlockingWaitStrategy>, running 2 parallel message handlers.
2022-05-30T20:15:21.511+03:00 INFO  [ElasticsearchVersionProvider] Elasticsearch cluster is running Elasticsearch:6.8.12
2022-05-30T20:15:21.673+03:00 INFO  [AbstractJestClient] Setting server pool to a list of 1 servers: [http://127.0.0.1:9200]
2022-05-30T20:15:21.674+03:00 INFO  [JestClientFactory] Using multi thread/connection supporting pooling connection manager
2022-05-30T20:15:21.931+03:00 INFO  [JestClientFactory] Using custom ObjectMapper instance
2022-05-30T20:15:21.931+03:00 INFO  [JestClientFactory] Node Discovery disabled...
2022-05-30T20:15:21.931+03:00 INFO  [JestClientFactory] Idle connection reaping disabled...
2022-05-30T20:15:22.605+03:00 INFO  [connection] Opened connection [connectionId{localValue:5, serverValue:393}] to localhost:27017
2022-05-30T20:15:23.734+03:00 INFO  [ProcessBuffer] Initialized ProcessBuffer with ring size <65536> and wait strategy <BlockingWaitStrategy>.
2022-05-30T20:15:24.865+03:00 INFO  [OutputBuffer] Initialized OutputBuffer with ring size <65536> and wait strategy <BlockingWaitStrategy>.
2022-05-30T20:15:27.751+03:00 INFO  [ServerBootstrap] Graylog server 4.3.0+7c09aad starting up
2022-05-30T20:15:27.752+03:00 INFO  [ServerBootstrap] JRE: Red Hat, Inc. 1.8.0_332 on Linux 3.10.0-1127.19.1.el7.x86_64
2022-05-30T20:15:27.752+03:00 INFO  [ServerBootstrap] Deployment: rpm
2022-05-30T20:15:27.752+03:00 INFO  [ServerBootstrap] OS: CentOS Linux 7 (Core) (centos)
2022-05-30T20:15:27.752+03:00 INFO  [ServerBootstrap] Arch: amd64
2022-05-30T20:15:28.347+03:00 INFO  [ServerBootstrap] Running 44 migrations...
2022-05-30T20:15:30.891+03:00 WARN  [ServerBootstrap] Exception while running migrations
java.lang.RuntimeException: Could not resolve new ref for condition on EventDefinition <5e2afd8f39b22502c984949e>. oldref <"161a79ec-6d31-4317-94db-5cb214b272aa"> refMap <{6a294191-44fa-4990-8a30-b01384324d9b=count-}>
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.convertConditions(V20200102140000_UnifyEventSeriesId.java:116) ~[graylog.jar:?]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.lambda$convertConditions$1(V20200102140000_UnifyEventSeriesId.java:127) ~[graylog.jar:?]
        at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[?:1.8.0_332]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.convertConditions(V20200102140000_UnifyEventSeriesId.java:124) ~[graylog.jar:?]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.lambda$convertConditions$1(V20200102140000_UnifyEventSeriesId.java:127) ~[graylog.jar:?]
        at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[?:1.8.0_332]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.convertConditions(V20200102140000_UnifyEventSeriesId.java:124) ~[graylog.jar:?]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.unifySeriesId(V20200102140000_UnifyEventSeriesId.java:103) ~[graylog.jar:?]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_332]
        at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[?:1.8.0_332]
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) ~[?:1.8.0_332]
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_332]
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_332]
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_332]
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_332]
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566) ~[?:1.8.0_332]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.upgrade(V20200102140000_UnifyEventSeriesId.java:72) ~[graylog.jar:?]
        at org.graylog2.bootstrap.ServerBootstrap.lambda$runMigrations$0(ServerBootstrap.java:263) ~[graylog.jar:?]
        at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422) ~[graylog.jar:?]
        at com.google.common.collect.RegularImmutableSortedSet.forEach(RegularImmutableSortedSet.java:88) ~[graylog.jar:?]
        at org.graylog2.bootstrap.ServerBootstrap.runMigrations(ServerBootstrap.java:261) ~[graylog.jar:?]
        at org.graylog2.bootstrap.ServerBootstrap.startCommand(ServerBootstrap.java:187) [graylog.jar:?]
        at org.graylog2.bootstrap.CmdLineTool.run(CmdLineTool.java:311) [graylog.jar:?]
        at org.graylog2.bootstrap.Main.main(Main.java:45) [graylog.jar:?]

Is there a way to manually handle that ref for condition on EventDefinition, or a configuration option that lets you omit that check? How should I start fixing this?
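
One possible first step is to pull the identifiers out of the exception message, so the affected event definition can be looked up later. The following is a minimal sketch, not an official Graylog tool: the function name `parse_migration_error` and the regular expression are assumptions based only on the message format quoted above.

```python
import re

# Pattern written against the exception text quoted in this thread; it is an
# assumption, not a format documented by Graylog.
MIGRATION_ERROR = re.compile(
    r'EventDefinition <(?P<definition_id>[0-9a-f]{24})>\. '
    r'oldref <"(?P<old_ref>[^"]+)"> '
    r'refMap <\{(?P<ref_map>[^}]*)\}>'
)


def parse_migration_error(line):
    """Return the identifiers named in the migration error, or None if no match."""
    match = MIGRATION_ERROR.search(line)
    if match is None:
        return None
    ref_map = {}
    for pair in match.group("ref_map").split(", "):
        if "=" in pair:
            key, value = pair.split("=", 1)
            ref_map[key] = value
    old_ref = match.group("old_ref")
    return {
        "definition_id": match.group("definition_id"),
        "old_ref": old_ref,
        "ref_map": ref_map,
        # True only if the old reference has an entry in the map.
        "old_ref_known": old_ref in ref_map,
    }


line = (
    "java.lang.RuntimeException: Could not resolve new ref for condition on "
    "EventDefinition <5e2afd8f39b22502c984949e>. "
    'oldref <"161a79ec-6d31-4317-94db-5cb214b272aa"> '
    "refMap <{6a294191-44fa-4990-8a30-b01384324d9b=count-}>"
)
info = parse_migration_error(line)
print(info["definition_id"], info["old_ref_known"])
```

For the message above, `old_ref_known` comes out false: the old reference is not among the keys of the refMap, which matches the "Could not resolve new ref" wording.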

Thanks in advance for any help you can offer!

Hello,

You may need to redo that Event Definition. Depending on what version you upgraded from, the condition may have changed.

Thank you for the quick reply!

Is there an “offline” method of redoing/deleting an event? The server doesn’t start, so I’m unable to access the web UI.

I upgraded from the latest version, I think it was 4.2.9.

Hello,

My apologies, I overlooked the statement about the Graylog service not starting; I was assuming you were able to get it running.

A couple of things you can look at:

  1. Ensure that the plugin directory has the correct version, corresponding to the version of Graylog (4.2.9) that was installed.
  2. If you are using certificates, double-check those and the permissions of the files and folders that Graylog uses.
  3. Ensure the other services are running without issues (i.e. Elasticsearch, MongoDB). You may want to check their logs in case you find more clues about what’s going on.

As for an offline method: not that I know of.
We can help you further but need more info. The first thing would be the full log file, if possible captured right after starting the Graylog service. You can get this by running tail -f on Graylog's log file. It shows clearly how Graylog is starting up, which gives a better idea of how to troubleshoot this issue.
Next, post the Graylog configuration file here so we can see if any settings were missed. It would also be helpful if you could brief us on how you upgraded this server from version x.x.x to y.y.y; from what I read, I'm not sure what is installed now or what version you started with. When posting configuration files or logs, please use the markdown </>. I did adjust your log file post above, as it was hard to read.

By chance, is there any configuration made outside the Graylog default configuration? Meaning, do you have a custom index template, etc.?

Thanks

Thanks for the response.

I checked the plugin directory and it had the following files:

graylog-storage-elasticsearch7-4.3.0.jar
graylog-storage-elasticsearch6-4.3.0.jar
graylog-plugin-threatintel-4.3.0.jar
graylog-plugin-collector-4.3.0.jar
graylog-plugin-aws-4.3.0.jar

Based on this the plugin directory would correspond to version 4.3.0?
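
The same check can be done mechanically. This is a minimal sketch that assumes plugin jar names end in `-<major>.<minor>.<patch>.jar`, as the five files listed above do; the helper names `plugin_versions` and `mismatched` are made up for illustration.

```python
import re

# Assumes jar names end in "-<major>.<minor>.<patch>.jar", like the files above.
JAR_VERSION = re.compile(r"-(\d+\.\d+\.\d+)\.jar$")


def plugin_versions(filenames):
    """Map each jar filename to the version embedded in its name (or None)."""
    versions = {}
    for name in filenames:
        match = JAR_VERSION.search(name)
        versions[name] = match.group(1) if match else None
    return versions


def mismatched(filenames, expected):
    """Return the sorted filenames whose embedded version differs from `expected`."""
    return sorted(
        name
        for name, version in plugin_versions(filenames).items()
        if version != expected
    )


jars = [
    "graylog-storage-elasticsearch7-4.3.0.jar",
    "graylog-storage-elasticsearch6-4.3.0.jar",
    "graylog-plugin-threatintel-4.3.0.jar",
    "graylog-plugin-collector-4.3.0.jar",
    "graylog-plugin-aws-4.3.0.jar",
]
print(mismatched(jars, "4.3.0"))
```

An empty list means every jar in the directory carries the expected version in its name.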

I double checked the configuration regarding certificates, and they are not being used. I should really enable https, but first I’ll need to fix the server.

I checked the MongoDB service and it’s active without any problems. Elasticsearch is the same; it has been running nonstop for one year and 7 months.

I’ll paste the output of the log files at the end of this post. Let’s see if they offer any useful information (Elasticsearch hasn’t been creating new log files):

/var/log/mongodb/mongod.log

2022-06-13T11:05:13.854+0300 I NETWORK  [listener] connection accepted from 127.0.0.1:54836 #129180 (1 connection now open)
2022-06-13T11:05:13.894+0300 I NETWORK  [conn129180] received client metadata from 127.0.0.1:54836 conn129180: { driver: { name: "mongo-java-driver|legacy", version: "3.12.1" }, os: { type: "Linux", name: "Linux", architecture: "amd64", version: "3.10.0-1127.19.1.el7.x86_64" }, platform: "Java/Red Hat, Inc./1.8.0_332-b09" }
2022-06-13T11:05:14.017+0300 I NETWORK  [listener] connection accepted from 127.0.0.1:54838 #129181 (2 connections now open)
2022-06-13T11:05:14.028+0300 I NETWORK  [conn129181] received client metadata from 127.0.0.1:54838 conn129181: { driver: { name: "mongo-java-driver|legacy", version: "3.12.1" }, os: { type: "Linux", name: "Linux", architecture: "amd64", version: "3.10.0-1127.19.1.el7.x86_64" }, platform: "Java/Red Hat, Inc./1.8.0_332-b09" }
2022-06-13T11:05:14.178+0300 I NETWORK  [conn129180] end connection 127.0.0.1:54836 (1 connection now open)
2022-06-13T11:05:14.178+0300 I NETWORK  [conn129181] end connection 127.0.0.1:54838 (0 connections now open)
2022-06-13T11:05:31.442+0300 I NETWORK  [listener] connection accepted from 127.0.0.1:54842 #129182 (1 connection now open)
2022-06-13T11:05:31.457+0300 I NETWORK  [conn129182] received client metadata from 127.0.0.1:54842 conn129182: { driver: { name: "mongo-java-driver|legacy", version: "3.12.1" }, os: { type: "Linux", name: "Linux", architecture: "amd64", version: "3.10.0-1127.19.1.el7.x86_64" }, platform: "Java/Red Hat, Inc./1.8.0_332-b09" }
2022-06-13T11:05:31.473+0300 I NETWORK  [listener] connection accepted from 127.0.0.1:54844 #129183 (2 connections now open)
2022-06-13T11:05:31.486+0300 I NETWORK  [conn129183] received client metadata from 127.0.0.1:54844 conn129183: { driver: { name: "mongo-java-driver|legacy", version: "3.12.1" }, os: { type: "Linux", name: "Linux", architecture: "amd64", version: "3.10.0-1127.19.1.el7.x86_64" }, platform: "Java/Red Hat, Inc./1.8.0_332-b09" }
2022-06-13T11:05:35.403+0300 I NETWORK  [listener] connection accepted from 127.0.0.1:54848 #129184 (3 connections now open)
2022-06-13T11:05:35.441+0300 I NETWORK  [conn129184] received client metadata from 127.0.0.1:54848 conn129184: { driver: { name: "mongo-java-driver|legacy", version: "3.12.1" }, os: { type: "Linux", name: "Linux", architecture: "amd64", version: "3.10.0-1127.19.1.el7.x86_64" }, platform: "Java/Red Hat, Inc./1.8.0_332-b09" }
2022-06-13T11:05:42.415+0300 I NETWORK  [conn129182] end connection 127.0.0.1:54842 (2 connections now open)
2022-06-13T11:05:42.415+0300 I NETWORK  [conn129183] end connection 127.0.0.1:54844 (1 connection now open)
2022-06-13T11:05:42.415+0300 I NETWORK  [conn129184] end connection 127.0.0.1:54848 (0 connections now open

The GC log (/var/log/elasticsearch/gc.log.10.current) keeps intermittently writing messages like the following, whose timing doesn’t coincide with my attempts to start the Graylog server:

2022-06-13T11:17:02.849+0300: 52518225,967: Total time for which application threads were stopped: 0,0145317 seconds, Stopping threads took: 0,0001303 seconds
2022-06-13T11:21:57.947+0300: 52518521,065: [GC (Allocation Failure) 2022-06-13T11:21:57.947+0300: 52518521,065: [ParNew
Desired survivor size 4358144 bytes, new threshold 6 (max 6)
- age   1:     337992 bytes,     337992 total
- age   2:       5440 bytes,     343432 total
- age   3:        408 bytes,     343840 total
- age   4:        704 bytes,     344544 total
- age   5:         32 bytes,     344576 total
- age   6:        224 bytes,     344800 total
: 68489K->436K(76672K), 0,0140126 secs] 582844K->514791K(1040064K), 0,0142258 secs] [Times: user=0,02 sys=0,00, real=0,01 secs]

And here is everything that happens in /var/log/graylog-server/server.log after trying to start the server:

[root@localhost ~]# tail -f -n0 /var/log/graylog-server/server.log
2022-06-13T11:05:09.732+03:00 INFO  [ImmutableFeatureFlagsCollector] Following feature flags are used: {}
2022-06-13T11:05:12.117+03:00 INFO  [CmdLineTool] Loaded plugin: AWS plugins 4.3.0 [org.graylog.aws.AWSPlugin]
2022-06-13T11:05:12.119+03:00 INFO  [CmdLineTool] Loaded plugin: Collector 4.3.0 [org.graylog.plugins.collector.CollectorPlugin]
2022-06-13T11:05:12.133+03:00 INFO  [CmdLineTool] Loaded plugin: Threat Intelligence Plugin 4.3.0 [org.graylog.plugins.threatintel.ThreatIntelPlugin]
2022-06-13T11:05:12.134+03:00 INFO  [CmdLineTool] Loaded plugin: Elasticsearch 6 Support 4.3.0+7c09aad [org.graylog.storage.elasticsearch6.Elasticsearch6Plugin]
2022-06-13T11:05:12.134+03:00 INFO  [CmdLineTool] Loaded plugin: Elasticsearch 7 Support 4.3.0+7c09aad [org.graylog.storage.elasticsearch7.Elasticsearch7Plugin]
2022-06-13T11:05:12.213+03:00 INFO  [CmdLineTool] Running with JVM arguments: -Xms1g -Xmx1g -XX:NewRatio=1 -XX:+ResizeTLAB -XX:-OmitStackTraceInFastThrow -Djdk.tls.acknowledgeCloseNotify=true -Dlog4j2.formatMsgNoLookups=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -Dlog4j.configurationFile=file:///etc/graylog/server/log4j2.xml -Djava.library.path=/usr/share/graylog-server/lib/sigar -Dgraylog2.installation_source=rpm
2022-06-13T11:05:13.746+03:00 INFO  [cluster] Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=5000}
2022-06-13T11:05:13.860+03:00 INFO  [cluster] Cluster description not yet available. Waiting for 30000 ms before timing out
2022-06-13T11:05:13.945+03:00 INFO  [connection] Opened connection [connectionId{localValue:1, serverValue:129180}] to localhost:27017
2022-06-13T11:05:13.965+03:00 INFO  [cluster] Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 0, 28]}, minWireVersion=0, maxWireVersion=7, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=17544223}
2022-06-13T11:05:14.061+03:00 INFO  [connection] Opened connection [connectionId{localValue:2, serverValue:129181}] to localhost:27017
2022-06-13T11:05:14.171+03:00 INFO  [connection] Closed connection [connectionId{localValue:2, serverValue:129181}] to localhost:27017 because the pool has been closed.
2022-06-13T11:05:14.173+03:00 INFO  [MongoDBPreflightCheck] Connected to MongoDB version 4.0.28
2022-06-13T11:05:14.695+03:00 INFO  [SearchDbPreflightCheck] Connected to (Elastic/Open)Search version <Elasticsearch:6.8.12>
2022-06-13T11:05:15.351+03:00 INFO  [Version] HV000001: Hibernate Validator null
2022-06-13T11:05:25.441+03:00 INFO  [InputBufferImpl] Message journal is enabled.
2022-06-13T11:05:25.529+03:00 INFO  [NodeId] Node ID: 8eab9e8e-2e3e-4409-8ee5-06609758fe11
2022-06-13T11:05:26.312+03:00 INFO  [LogManager] Loading logs.
2022-06-13T11:05:26.406+03:00 WARN  [Log] Found a corrupted index file, /var/lib/graylog-server/journal/messagejournal-0/00000000004392243507.index, deleting and rebuilding index...
2022-06-13T11:05:31.404+03:00 INFO  [LogManager] Logs loading complete.
2022-06-13T11:05:31.410+03:00 INFO  [LocalKafkaJournal] Initialized Kafka based journal at /var/lib/graylog-server/journal
2022-06-13T11:05:31.438+03:00 INFO  [cluster] Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=5000}
2022-06-13T11:05:31.443+03:00 INFO  [cluster] Cluster description not yet available. Waiting for 30000 ms before timing out
2022-06-13T11:05:31.463+03:00 INFO  [connection] Opened connection [connectionId{localValue:3, serverValue:129182}] to localhost:27017
2022-06-13T11:05:31.470+03:00 INFO  [cluster] Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 0, 28]}, minWireVersion=0, maxWireVersion=7, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=6757153}
2022-06-13T11:05:31.502+03:00 INFO  [connection] Opened connection [connectionId{localValue:4, serverValue:129183}] to localhost:27017
2022-06-13T11:05:32.080+03:00 INFO  [InputBufferImpl] Initialized InputBufferImpl with ring size <65536> and wait strategy <BlockingWaitStrategy>, running 2 parallel message handlers.
2022-06-13T11:05:33.194+03:00 INFO  [ElasticsearchVersionProvider] Elasticsearch cluster is running Elasticsearch:6.8.12
2022-06-13T11:05:33.319+03:00 INFO  [AbstractJestClient] Setting server pool to a list of 1 servers: [http://127.0.0.1:9200]
2022-06-13T11:05:33.320+03:00 INFO  [JestClientFactory] Using multi thread/connection supporting pooling connection manager
2022-06-13T11:05:33.556+03:00 INFO  [JestClientFactory] Using custom ObjectMapper instance
2022-06-13T11:05:33.557+03:00 INFO  [JestClientFactory] Node Discovery disabled...
2022-06-13T11:05:33.557+03:00 INFO  [JestClientFactory] Idle connection reaping disabled...
2022-06-13T11:05:35.255+03:00 INFO  [ProcessBuffer] Initialized ProcessBuffer with ring size <65536> and wait strategy <BlockingWaitStrategy>.
2022-06-13T11:05:35.458+03:00 INFO  [connection] Opened connection [connectionId{localValue:5, serverValue:129184}] to localhost:27017
2022-06-13T11:05:36.278+03:00 INFO  [OutputBuffer] Initialized OutputBuffer with ring size <65536> and wait strategy <BlockingWaitStrategy>.
2022-06-13T11:05:39.163+03:00 INFO  [ServerBootstrap] Graylog server 4.3.0+7c09aad starting up
2022-06-13T11:05:39.192+03:00 INFO  [ServerBootstrap] JRE: Red Hat, Inc. 1.8.0_332 on Linux 3.10.0-1127.19.1.el7.x86_64
2022-06-13T11:05:39.193+03:00 INFO  [ServerBootstrap] Deployment: rpm
2022-06-13T11:05:39.193+03:00 INFO  [ServerBootstrap] OS: CentOS Linux 7 (Core) (centos)
2022-06-13T11:05:39.193+03:00 INFO  [ServerBootstrap] Arch: amd64
2022-06-13T11:05:39.757+03:00 INFO  [ServerBootstrap] Running 44 migrations...
2022-06-13T11:05:42.138+03:00 WARN  [ServerBootstrap] Exception while running migrations
java.lang.RuntimeException: Could not resolve new ref for condition on EventDefinition <5e2afd8f39b22502c984949e>. oldref <"161a79ec-6d31-4317-94db-5cb214b272aa"> refMap <{6a294191-44fa-4990-8a30-b01384324d9b=count-}>
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.convertConditions(V20200102140000_UnifyEventSeriesId.java:116) ~[graylog.jar:?]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.lambda$convertConditions$1(V20200102140000_UnifyEventSeriesId.java:127) ~[graylog.jar:?]
        at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[?:1.8.0_332]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.convertConditions(V20200102140000_UnifyEventSeriesId.java:124) ~[graylog.jar:?]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.lambda$convertConditions$1(V20200102140000_UnifyEventSeriesId.java:127) ~[graylog.jar:?]
        at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[?:1.8.0_332]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.convertConditions(V20200102140000_UnifyEventSeriesId.java:124) ~[graylog.jar:?]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.unifySeriesId(V20200102140000_UnifyEventSeriesId.java:103) ~[graylog.jar:?]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_332]
        at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[?:1.8.0_332]
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) ~[?:1.8.0_332]
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_332]
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_332]
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_332]
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_332]
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566) ~[?:1.8.0_332]
        at org.graylog2.migrations.V20200102140000_UnifyEventSeriesId.upgrade(V20200102140000_UnifyEventSeriesId.java:72) ~[graylog.jar:?]
        at org.graylog2.bootstrap.ServerBootstrap.lambda$runMigrations$0(ServerBootstrap.java:263) ~[graylog.jar:?]
        at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422) ~[graylog.jar:?]
        at com.google.common.collect.RegularImmutableSortedSet.forEach(RegularImmutableSortedSet.java:88) ~[graylog.jar:?]
        at org.graylog2.bootstrap.ServerBootstrap.runMigrations(ServerBootstrap.java:261) ~[graylog.jar:?]
        at org.graylog2.bootstrap.ServerBootstrap.startCommand(ServerBootstrap.java:187) [graylog.jar:?]
        at org.graylog2.bootstrap.CmdLineTool.run(CmdLineTool.java:311) [graylog.jar:?]
        at org.graylog2.bootstrap.Main.main(Main.java:45) [graylog.jar:?]

I hope this information is of some help in solving the problem. Thank you for the time you’re taking to assist me with this!

Hello,

I see you have a connection to MongoDB at localhost:27017, and I also see a connection to Elasticsearch at 127.0.0.1:9200.
What I’m not able to see is the end of the log file stating running, shutdown or failed. Normally there is a reason given.

The WARN in the Graylog log just shows the Event Definition problem.
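
To pick the warnings out of a long startup log, something like the following can help. It is a minimal sketch that assumes the line layout seen in the server.log excerpts in this thread (timestamp, level, logger in square brackets, message); the function name `problems` is made up for illustration.

```python
import re

# Line layout assumed from the server.log excerpts in this thread:
# "<timestamp> <LEVEL>  [<logger>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+(?P<msg>.*)$"
)


def problems(lines, levels=("WARN", "ERROR")):
    """Yield (timestamp, level, logger, message) for lines at the given levels."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") in levels:
            yield (
                match.group("ts"),
                match.group("level"),
                match.group("logger"),
                match.group("msg"),
            )


sample = [
    "2022-06-13T11:05:26.406+03:00 WARN  [Log] Found a corrupted index file, "
    "/var/lib/graylog-server/journal/messagejournal-0/00000000004392243507.index, "
    "deleting and rebuilding index...",
    "2022-06-13T11:05:39.757+03:00 INFO  [ServerBootstrap] Running 44 migrations...",
    "2022-06-13T11:05:42.138+03:00 WARN  [ServerBootstrap] Exception while running migrations",
    "        at org.graylog2.bootstrap.Main.main(Main.java:45) [graylog.jar:?]",
]
for ts, level, logger, msg in problems(sample):
    print(ts, level, logger, msg)
```

Stack trace lines do not match the pattern and are skipped, so only the two WARN entries are reported. To run it against the real file, pass an open file object for /var/log/graylog-server/server.log instead of `sample`.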

What does Graylog status look like?

systemctl status graylog-server

What does Graylog configuration look like?

EDIT: Correction, the Graylog log shows this:

INFO  [ServerBootstrap] Graylog server 4.3.0+7c09aad starting up

So it looks like Graylog started up?

And on a closer look, I saw this:

[Log] Found a corrupted index file, /var/lib/graylog-server/journal/messagejournal-0/00000000004392243507.index, deleting and rebuilding index..

It looks like you have some journal corruption, but it also looks like Graylog took care of that, I assume.

Here is the output of printing the graylog-server status:

systemctl status graylog-server
● graylog-server.service - Graylog server
   Loaded: loaded (/usr/lib/systemd/system/graylog-server.service; enabled; vendor preset: disabled)
   Active: inactive (dead) (Result: exit-code) since ma 2022-06-13 11:05:42 EEST; 1 day 5h ago
     Docs: http://docs.graylog.org/
  Process: 18505 ExecStart=/usr/share/graylog-server/bin/graylog-server (code=exited, status=1/FAILURE)
 Main PID: 18505 (code=exited, status=1/FAILURE)

kesä 13 11:05:42 localhost.localdomain systemd[1]: graylog-server.service: m...
kesä 13 11:05:42 localhost.localdomain systemd[1]: Unit graylog-server.servi...
kesä 13 11:05:42 localhost.localdomain systemd[1]: graylog-server.service fa...
kesä 13 11:05:42 localhost.localdomain systemd[1]: Stopped Graylog server.
Hint: Some lines were ellipsized, use -l to show in full.
[root@localhost ~]# systemctl status graylog-server
● graylog-server.service - Graylog server
   Loaded: loaded (/usr/lib/systemd/system/graylog-server.service; enabled; vendor preset: disabled)
   Active: inactive (dead) (Result: exit-code) since ma 2022-06-13 11:05:42 EEST; 1 day 5h ago
     Docs: http://docs.graylog.org/
  Process: 18505 ExecStart=/usr/share/graylog-server/bin/graylog-server (code=exited, status=1/FAILURE)
 Main PID: 18505 (code=exited, status=1/FAILURE)

kesä 13 11:05:42 localhost.localdomain systemd[1]: graylog-server.service: main process exited, code=exited, status=1/FAILURE
kesä 13 11:05:42 localhost.localdomain systemd[1]: Unit graylog-server.service entered failed state.
kesä 13 11:05:42 localhost.localdomain systemd[1]: graylog-server.service failed.
kesä 13 11:05:42 localhost.localdomain systemd[1]: Stopped Graylog server.

Here is some output from

journalctl --unit=graylog-server 
kesä 13 11:05:06 localhost.localdomain systemd[1]: Started Graylog server.
kesä 13 11:05:06 localhost.localdomain graylog-server[18505]: OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
kesä 13 11:05:42 localhost.localdomain systemd[1]: graylog-server.service: main process exited, code=exited, status=1/FAILURE
kesä 13 11:05:42 localhost.localdomain systemd[1]: Unit graylog-server.service entered failed state.
kesä 13 11:05:42 localhost.localdomain systemd[1]: graylog-server.service failed.
kesä 13 11:05:42 localhost.localdomain systemd[1]: Stopped Graylog server.

Actually, it seems the server is unable to “fix” the journal corruption, because that line appears every time the server tries to start. I did try disabling the journal with the following setting, but it didn’t seem to help:

message_journal_enabled = false

I’ll post the server configuration in a separate message.

Here are the contents of server.conf file:

############################
# GRAYLOG CONFIGURATION FILE
############################
#
# This is the Graylog configuration file. The file has to use ISO 8859-1/Latin-1 character encoding.
# Characters that cannot be directly represented in this encoding can be written using Unicode escapes
# as defined in https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.3, using the \u prefix.
# For example, \u002c.
#
# * Entries are generally expected to be a single line of the form, one of the following:
#
# propertyName=propertyValue
# propertyName:propertyValue
#
# * White space that appears between the property name and property value is ignored,
#   so the following are equivalent:
#
# name=Stephen
# name = Stephen
#
# * White space at the beginning of the line is also ignored.
#
# * Lines that start with the comment characters ! or # are ignored. Blank lines are also ignored.
#
# * The property value is generally terminated by the end of the line. White space following the
#   property value is not ignored, and is treated as part of the property value.
#
# * A property value can span several lines if each line is terminated by a backslash (‘\’) character.
#   For example:
#
# targetCities=\
#         Detroit,\
#         Chicago,\
#         Los Angeles
#
#   This is equivalent to targetCities=Detroit,Chicago,Los Angeles (white space at the beginning of lines is ignored).
#
# * The characters newline, carriage return, and tab can be inserted with characters \n, \r, and \t, respectively.
#
# * The backslash character must be escaped as a double backslash. For example:
#
# path=c:\\docs\\doc1
#

# If you are running more than one instances of Graylog server you have to select one of these
# instances as master. The master will perform some periodical tasks that non-masters won't perform.
is_master = true

# The auto-generated node ID will be stored in this file and read after restarts. It is a good idea
# to use an absolute file path here if you are starting Graylog server from init scripts or similar.
node_id_file = /etc/graylog/server/node-id

# You MUST set a secret to secure/pepper the stored user passwords here. Use at least 64 characters.
# Generate one by using for example: pwgen -N 1 -s 96
password_secret = SKejsdfgNKB1203*Ä*FLErjmbn230483i1KGDSNvcxxxxxxxxxxxxxxxxxxxxx

# The default root user is named 'admin'
#root_username = admin

# You MUST specify a hash password for the root user (which you only need to initially set up the
# system and in case you lose connectivity to your authentication backend)
# This password cannot be changed using the API or via the web interface. If you need to change it,
# modify it in this file.
# Create one by using for example: echo -n yourpassword | shasum -a 256
# and put the resulting hash value into the following line
root_password_sha2 = 932b33b2de4f8f1a9fb326d5xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# The email address of the root user.
# Default is empty
root_email = "xxxxxx@xxx.xx"

# The time zone setting of the root user. See http://www.joda.org/joda-time/timezones.html for a list of valid time zones.
# Default is UTC
root_timezone = Europe/Helsinki

# Set the bin directory here (relative or absolute)
# This directory contains binaries that are used by the Graylog server.
# Default: bin
bin_dir = /usr/share/graylog-server/bin

# Set the data directory here (relative or absolute)
# This directory is used to store Graylog server state.
# Default: data
data_dir = /var/lib/graylog-server

# Set plugin directory here (relative or absolute)
plugin_dir = /usr/share/graylog-server/plugin

###############
# HTTP settings
###############

#### HTTP bind address
#
# The network interface used by the Graylog HTTP interface.
#
# This network interface must be accessible by all Graylog nodes in the cluster and by all clients
# using the Graylog web interface.
#
# If the port is omitted, Graylog will use port 9000 by default.
#
# Default: 127.0.0.1:9000
http_bind_address = 192.168.xxx.xxx:9000
#http_bind_address = [2001:db8::1]:9000

#### HTTP publish URI
#
# The HTTP URI of this Graylog node which is used to communicate with the other Graylog nodes in the cluster and by all
# clients using the Graylog web interface.
#
# The URI will be published in the cluster discovery APIs, so that other Graylog nodes will be able to find and connect to this Graylog node.
#
# This configuration setting has to be used if this Graylog node is available on another network interface than $http_bind_address,
# for example if the machine has multiple network interfaces or is behind a NAT gateway.
#
# If $http_bind_address contains a wildcard IPv4 address (0.0.0.0), the first non-loopback IPv4 address of this machine will be used.
# This configuration setting *must not* contain a wildcard address!
#
# Default: http://$http_bind_address/
#http_publish_uri = http://192.168.1.1:9000/

#### External Graylog URI
#
# The public URI of Graylog which will be used by the Graylog web interface to communicate with the Graylog REST API.
#
# The external Graylog URI usually has to be specified, if Graylog is running behind a reverse proxy or load-balancer
# and it will be used to generate URLs addressing entities in the Graylog REST API (see $http_bind_address).
#
# When using Graylog Collector, this URI will be used to receive heartbeat messages and must be accessible for all collectors.
#
# This setting can be overridden on a per-request basis with the "X-Graylog-Server-URL" HTTP request header.
#
# Default: $http_publish_uri
#http_external_uri =

#### Enable CORS headers for HTTP interface
#
# This is necessary for JS-clients accessing the server directly.
# If these are disabled, modern browsers will not be able to retrieve resources from the server.
# This is enabled by default. Uncomment the next line to disable it.
#http_enable_cors = false

#### Enable GZIP support for HTTP interface
#
# This compresses API responses and therefore helps to reduce
# overall round trip times. This is enabled by default. Uncomment the next line to disable it.
#http_enable_gzip = false

# The maximum size of the HTTP request headers in bytes.
#http_max_header_size = 8192

# The size of the thread pool used exclusively for serving the HTTP interface.
#http_thread_pool_size = 16

################
# HTTPS settings
################

#### Enable HTTPS support for the HTTP interface
#
# This secures the communication with the HTTP interface with TLS to prevent request forgery and eavesdropping.
#
# Default: false
#http_enable_tls = true

# The X.509 certificate chain file in PEM format to use for securing the HTTP interface.
#http_tls_cert_file = /path/to/graylog.crt

# The PKCS#8 private key file in PEM format to use for securing the HTTP interface.
#http_tls_key_file = /path/to/graylog.key

# The password to unlock the private key used for securing the HTTP interface.
#http_tls_key_password = secret


# Comma separated list of trusted proxies that are allowed to set the client address with X-Forwarded-For
# header. May be subnets, or hosts.
#trusted_proxies = 127.0.0.1/32, 0:0:0:0:0:0:0:1/128

# List of Elasticsearch hosts Graylog should connect to.
# Need to be specified as a comma-separated list of valid URIs for the http ports of your elasticsearch nodes.
# If one or more of your elasticsearch hosts require authentication, include the credentials in each node URI that
# requires authentication.
#
# Default: http://127.0.0.1:9200
#elasticsearch_hosts = http://node1:9200,http://user:password@node2:19200

# Maximum amount of time to wait for a successful connection to the Elasticsearch HTTP port.
#
# Default: 10 Seconds
#elasticsearch_connect_timeout = 10s

# Maximum amount of time to wait for reading back a response from an Elasticsearch server.
#
# Default: 60 seconds
#elasticsearch_socket_timeout = 60s

# Maximum idle time for an Elasticsearch connection. If this is exceeded, this connection will
# be torn down.
#
# Default: inf
#elasticsearch_idle_timeout = -1s

# Maximum number of total connections to Elasticsearch.
#
# Default: 20
#elasticsearch_max_total_connections = 20

# Maximum number of total connections per Elasticsearch route (normally this means per
# elasticsearch server).
#
# Default: 2
#elasticsearch_max_total_connections_per_route = 2

# Maximum number of times Graylog will retry failed requests to Elasticsearch.
#
# Default: 2
#elasticsearch_max_retries = 2

# Enable automatic Elasticsearch node discovery through Nodes Info,
# see https://www.elastic.co/guide/en/elasticsearch/reference/5.4/cluster-nodes-info.html
#
# WARNING: Automatic node discovery does not work if Elasticsearch requires authentication, e. g. with Shield.
#
# Default: false
#elasticsearch_discovery_enabled = true

# Filter for including/excluding Elasticsearch nodes in discovery according to their custom attributes,
# see https://www.elastic.co/guide/en/elasticsearch/reference/5.4/cluster.html#cluster-nodes
#
# Default: empty
#elasticsearch_discovery_filter = rack:42

# Frequency of the Elasticsearch node discovery.
#
# Default: 30s
# elasticsearch_discovery_frequency = 30s

# Enable payload compression for Elasticsearch requests.
#
# Default: false
#elasticsearch_compression_enabled = true

# Graylog will use multiple indices to store documents in. You can configure the strategy it uses to determine
# when to rotate the currently active write index.
# It supports multiple rotation strategies:
#   - "count" of messages per index, use elasticsearch_max_docs_per_index below to configure
#   - "size" per index, use elasticsearch_max_size_per_index below to configure
# valid values are "count", "size" and "time", default is "count"
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
rotation_strategy = count

# (Approximate) maximum number of documents in an Elasticsearch index before a new index
# is being created, also see no_retention and elasticsearch_max_number_of_indices.
# Configure this if you used 'rotation_strategy = count' above.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
elasticsearch_max_docs_per_index = 20000000

# (Approximate) maximum size in bytes per Elasticsearch index on disk before a new index is being created, also see
# no_retention and elasticsearch_max_number_of_indices. Default is 1GB.
# Configure this if you used 'rotation_strategy = size' above.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
#elasticsearch_max_size_per_index = 1073741824

# (Approximate) maximum time before a new Elasticsearch index is being created, also see
# no_retention and elasticsearch_max_number_of_indices. Default is 1 day.
# Configure this if you used 'rotation_strategy = time' above.
# Please note that this rotation period does not look at the time specified in the received messages, but is
# using the real clock value to decide when to rotate the index!
# Specify the time using a duration and a suffix indicating which unit you want:
#  1w  = 1 week
#  1d  = 1 day
#  12h = 12 hours
# Permitted suffixes are: d for day, h for hour, m for minute, s for second.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
#elasticsearch_max_time_per_index = 1d

# Disable checking the version of Elasticsearch for being compatible with this Graylog release.
# WARNING: Using Graylog with unsupported and untested versions of Elasticsearch may lead to data loss!
#elasticsearch_disable_version_check = true

# Disable message retention on this node, i. e. disable Elasticsearch index rotation.
#no_retention = false

# How many indices do you want to keep?
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
elasticsearch_max_number_of_indices = 20

# Decide what happens with the oldest indices when the maximum number of indices is reached.
# The following strategies are available:
#   - delete # Deletes the index completely (Default)
#   - close # Closes the index and hides it from the system. Can be re-opened later.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
retention_strategy = delete

# How many Elasticsearch shards and replicas should be used per index? Note that this only applies to newly created indices.
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
elasticsearch_shards = 4
elasticsearch_replicas = 0

# Prefix for all Elasticsearch indices and index aliases managed by Graylog.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
elasticsearch_index_prefix = graylog

# Name of the Elasticsearch index template used by Graylog to apply the mandatory index mapping.
# Default: graylog-internal
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
#elasticsearch_template_name = graylog-internal

# Do you want to allow searches with leading wildcards? This can be extremely resource hungry and should only
# be enabled with care. See also: http://docs.graylog.org/en/2.1/pages/queries.html
allow_leading_wildcard_searches = true

# Do you want to allow searches to be highlighted? Depending on the size of your messages this can be memory hungry and
# should only be enabled after making sure your Elasticsearch cluster has enough memory.
allow_highlighting = false

# Analyzer (tokenizer) to use for message and full_message field. The "standard" filter usually is a good idea.
# All supported analyzers are: standard, simple, whitespace, stop, keyword, pattern, language, snowball, custom
# Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis.html
# Note that this setting only takes effect on newly created indices.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
elasticsearch_analyzer = standard

# Global request timeout for Elasticsearch requests (e. g. during search, index creation, or index time-range
# calculations) based on a best-effort to restrict the runtime of Elasticsearch operations.
# Default: 1m
#elasticsearch_request_timeout = 1m

# Global timeout for index optimization (force merge) requests.
# Default: 1h
#elasticsearch_index_optimization_timeout = 1h

# Maximum number of concurrently running index optimization (force merge) jobs.
# If you are using lots of different index sets, you might want to increase that number.
# Default: 20
#elasticsearch_index_optimization_jobs = 20

# Time interval for index range information cleanups. This setting defines how often stale index range information
# is being purged from the database.
# Default: 1h
#index_ranges_cleanup_interval = 1h

# Time interval for the job that runs index field type maintenance tasks like cleaning up stale entries. This doesn't
# need to run very often.
# Default: 1h
#index_field_type_periodical_interval = 1h

# Batch size for the Elasticsearch output. This is the maximum (!) number of messages the Elasticsearch output
# module will get at once and write to Elasticsearch in a batch call. If the configured batch size has not been
# reached within output_flush_interval seconds, everything that is available will be flushed at once. Remember
# that every outputbuffer processor manages its own batch and performs its own batch write calls.
# ("outputbuffer_processors" variable)
output_batch_size = 500

# Flush interval (in seconds) for the Elasticsearch output. This is the maximum amount of time between two
# batches of messages written to Elasticsearch. It is only effective at all if your minimum number of messages
# for this time period is less than output_batch_size * outputbuffer_processors.
output_flush_interval = 1

# As stream outputs are loaded only on demand, an output which is failing to initialize will be tried over and
# over again. To prevent this, the following configuration options define after how many faults an output will
# not be tried again for an also configurable amount of seconds.
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30

# The number of parallel running processors.
# Raise this number if your buffers are filling up.
processbuffer_processors = 5
outputbuffer_processors = 3

# The following settings (outputbuffer_processor_*) configure the thread pools backing each output buffer processor.
# See https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html for technical details

# When the number of threads is greater than the core (see outputbuffer_processor_threads_core_pool_size),
# this is the maximum time in milliseconds that excess idle threads will wait for new tasks before terminating.
# Default: 5000
#outputbuffer_processor_keep_alive_time = 5000

# The number of threads to keep in the pool, even if they are idle, unless allowCoreThreadTimeOut is set
# Default: 3
#outputbuffer_processor_threads_core_pool_size = 3

# The maximum number of threads to allow in the pool
# Default: 30
#outputbuffer_processor_threads_max_pool_size = 30

# UDP receive buffer size for all message inputs (e. g. SyslogUDPInput).
#udp_recvbuffer_sizes = 1048576

# Wait strategy describing how buffer processors wait on a cursor sequence. (default: sleeping)
# Possible types:
#  - yielding
#     Compromise between performance and CPU usage.
#  - sleeping
#     Compromise between performance and CPU usage. Latency spikes can occur after quiet periods.
#  - blocking
#     High throughput, low latency, higher CPU usage.
#  - busy_spinning
#     Avoids syscalls which could introduce latency jitter. Best when threads can be bound to specific CPU cores.
processor_wait_strategy = blocking

# Size of internal ring buffers. Raise this if raising outputbuffer_processors does not help anymore.
# For optimum performance your LogMessage objects in the ring buffer should fit in your CPU L3 cache.
# Must be a power of 2. (512, 1024, 2048, ...)
ring_size = 65536

inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking

# Enable the disk based message journal.
message_journal_enabled = true

# The directory which will be used to store the message journal. The directory must be exclusively used by Graylog and
# must not contain any other files than the ones created by Graylog itself.
#
# ATTENTION:
#   If you create a separate partition for the journal files and use a file system creating directories like 'lost+found'
#   in the root directory, you need to create a sub directory for your journal.
#   Otherwise Graylog will log an error message that the journal is corrupt and Graylog will not start.
message_journal_dir = /var/lib/graylog-server/journal

# The journal holds messages before they can be written to Elasticsearch,
# for a maximum of 12 hours or 5 GB, whichever happens first.
# During normal operation the journal will be smaller.
#message_journal_max_age = 12h
message_journal_max_size = 4gb

#message_journal_flush_age = 1m
#message_journal_flush_interval = 1000000
#message_journal_segment_age = 1h
#message_journal_segment_size = 100mb

# Number of threads used exclusively for dispatching internal events. Default is 2.
#async_eventbus_processors = 2

# How many seconds to wait between marking node as DEAD for possible load balancers and starting the actual
# shutdown process. Set to 0 if you have no status checking load balancers in front.
lb_recognition_period_seconds = 3

# Journal usage percentage that triggers requesting throttling for this server node from load balancers. The feature is
# disabled if not set.
#lb_throttle_threshold_percentage = 95

# Every message is matched against the configured streams and it can happen that a stream contains rules which
# take an unusual amount of time to run, for example if it's using regular expressions that perform excessive backtracking.
# This will impact the processing of the entire server. To keep such misbehaving stream rules from impacting other
# streams, Graylog limits the execution time for each stream.
# The default values are noted below, the timeout is in milliseconds.
# If the stream matching for one stream took longer than the timeout value, and this happened more than "max_faults" times
# that stream is disabled and a notification is shown in the web interface.
#stream_processing_timeout = 2000
#stream_processing_max_faults = 3

# Length of the interval in seconds in which the alert conditions for all streams should be checked
# and alarms are being sent.
#alert_check_interval = 60

# Since 0.21 the Graylog server supports pluggable output modules. This means a single message can be written to multiple
# outputs. The next setting defines the timeout for a single output module, including the default output module where all
# messages end up.
#
# Time in milliseconds to wait for all message outputs to finish writing a single message.
#output_module_timeout = 10000

# Time in milliseconds after which a detected stale master node is being rechecked on startup.
#stale_master_timeout = 2000

# Time in milliseconds which Graylog is waiting for all threads to stop on shutdown.
#shutdown_timeout = 30000

# MongoDB connection string
# See https://docs.mongodb.com/manual/reference/connection-string/ for details
mongodb_uri = mongodb://localhost/graylog

# Authenticate against the MongoDB server
#mongodb_uri = mongodb://grayloguser:secret@localhost:27017/graylog

# Use a replica set instead of a single host
#mongodb_uri = mongodb://grayloguser:secret@localhost:27017,localhost:27018,localhost:27019/graylog

# Increase this value according to the maximum connections your MongoDB server can handle from a single client
# if you encounter MongoDB connection problems.
mongodb_max_connections = 1000

# Number of threads allowed to be blocked by MongoDB connections multiplier. Default: 5
# If mongodb_max_connections is 100, and mongodb_threads_allowed_to_block_multiplier is 5,
# then 500 threads can block. More than that and an exception will be thrown.
# http://api.mongodb.com/java/current/com/mongodb/MongoOptions.html#threadsAllowedToBlockForConnectionMultiplier
mongodb_threads_allowed_to_block_multiplier = 5


# Email transport
transport_email_enabled = true
transport_email_hostname = smtp.gmail.com
transport_email_port = 587
transport_email_use_auth = true
transport_email_use_tls = true
transport_email_use_ssl = false
transport_email_auth_username = xxxxxxxxx@xxx.xx
transport_email_auth_password = xxxxxxxxxxxx
transport_email_subject_prefix = [graylog]
transport_email_from_email = xxxxxxxxx@xxx.xx

# Encryption settings
#
# ATTENTION:
#    Using SMTP with STARTTLS *and* SMTPS at the same time is *not* possible.

# Use SMTP with STARTTLS, see https://en.wikipedia.org/wiki/Opportunistic_TLS
#transport_email_use_tls = true

# Use SMTP over SSL (SMTPS), see https://en.wikipedia.org/wiki/SMTPS
# This is deprecated on most SMTP services!
#transport_email_use_ssl = true


# Specify and uncomment this if you want to include links to the stream in your stream alert mails.
# This should define the fully qualified base url to your web interface exactly the same way as it is accessed by your users.
#transport_email_web_interface_url = https://graylog.example.com

# The default connect timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 5s
#http_connect_timeout = 5s

# The default read timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 10s
#http_read_timeout = 10s

# The default write timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 10s
#http_write_timeout = 10s

# HTTP proxy for outgoing HTTP connections
# ATTENTION: If you configure a proxy, make sure to also configure the "http_non_proxy_hosts" option so internal
#            HTTP connections with other nodes do not go through the proxy.
# Examples:
#   - http://proxy.example.com:8123
#   - http://username:password@proxy.example.com:8123
#http_proxy_uri =

# A list of hosts that should be reached directly, bypassing the configured proxy server.
# This is a list of patterns separated by ",". The patterns may start or end with a "*" for wildcards.
# Any host matching one of these patterns will be reached through a direct connection instead of through a proxy.
# Examples:
#   - localhost,127.0.0.1
#   - 10.0.*,*.example.com
#http_non_proxy_hosts =

# Disable the optimization of Elasticsearch indices after index cycling. This may take some load from Elasticsearch
# on heavily used systems with large indices, but it will decrease search performance. The default is to optimize
# cycled indices.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
#disable_index_optimization = true

# Optimize the index down to <= index_optimization_max_num_segments. A higher number may take some load from Elasticsearch
# on heavily used systems with large indices, but it will decrease search performance. The default is 1.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration.
#index_optimization_max_num_segments = 1

# The threshold of the garbage collection runs. If GC runs take longer than this threshold, a system notification
# will be generated to warn the administrator about possible problems with the system. Default is 1 second.
#gc_warning_threshold = 1s

# Connection timeout for a configured LDAP server (e. g. ActiveDirectory) in milliseconds.
#ldap_connection_timeout = 2000

# Disable the use of SIGAR for collecting system stats
#disable_sigar = false

# The default cache time for dashboard widgets. (Default: 10 seconds, minimum: 1 second)
#dashboard_widget_default_cache_time = 10s

# For some cluster-related REST requests, the node must query all other nodes in the cluster. This is the maximum number
# of threads available for this. Increase it, if '/cluster/*' requests take long to complete.
# Should be http_thread_pool_size * average_cluster_size if you have a high number of concurrent users.
proxied_requests_thread_pool_size = 32
```
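Two of the comments in the configuration above state constraints that can be checked by hand: `ring_size` must be a power of 2, and the number of threads allowed to block on MongoDB is `mongodb_max_connections` multiplied by `mongodb_threads_allowed_to_block_multiplier`. The following is only a minimal Python sketch of that arithmetic. It is not part of Graylog, the function names are made up for illustration, and the values are copied from the posted file.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two (the requirement stated for ring_size)."""
    return n > 0 and (n & (n - 1)) == 0


def max_blocked_threads(max_connections: int, multiplier: int) -> int:
    """Threads that may block waiting for a MongoDB connection, per the config comment."""
    return max_connections * multiplier


# Values copied from the posted configuration
ring_size = 65536
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5

print(is_power_of_two(ring_size))  # True
print(max_blocked_threads(mongodb_max_connections,
                          mongodb_threads_allowed_to_block_multiplier))  # 5000
```

Both values pass, and the product of 5000 matches the `maxWaitQueueSize=5000` shown in the MongoDB cluster line of the server log, so these settings do not look like the cause of the failed start.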

Hello,

I looked over your configuration file and didn’t see anything that stands out as the cause. To be honest, it looks like mostly default settings, although I did notice that the journal was reconfigured. The versions of Elasticsearch and MongoDB seem to be within the supported range.

As for this error

kesä 13 11:05:06 localhost.localdomain graylog-server[18505]: OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N

I found something similar here

A couple of ideas:
If you don’t mind losing data, you could delete everything in the journal and then restart the Graylog service.
You can find which file here

So it seems there is nothing in the Graylog log file after the service stops.
Perhaps some simple checks:
Look over this site and see whether anything pertains to your environment.
Some things have changed.

Check the permissions on the Graylog directories.
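One way to run that permission check is sketched below in Python. It lists every path under a directory whose owner is not the expected user. The directory comes from `data_dir` in the posted configuration; the service user name `graylog` and the helper names are assumptions, so adjust them to your installation.

```python
import os
import pwd


def owner_name(uid: int) -> str:
    """Resolve a numeric uid to a user name, falling back to the number itself."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def find_foreign_owned(root: str, expected_user: str) -> list:
    """Return all paths under root (root included) not owned by expected_user."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself plus every file directly inside it;
        # subdirectories are checked when os.walk descends into them.
        for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
            if owner_name(os.lstat(path).st_uid) != expected_user:
                mismatches.append(path)
    return mismatches


if __name__ == "__main__":
    # data_dir from the posted configuration; "graylog" is an assumed user name.
    for path in find_foreign_owned("/var/lib/graylog-server", "graylog"):
        print(path)
```

An empty result means ownership is consistent for that tree. It does not check mode bits, so a file owned by the right user can still be unreadable.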

The server won’t start up if it hits an unexpected exception during migrations.
Your log shows there is an inconsistency in event definitions, which is causing the exception.
To resolve it, you can delete the offending event definition 5e2afd8f39b22502c984949e in MongoDB and then recreate it later via the UI, if needed.


Thank you, patrickmann, that solved my problem. I logged into MongoDB and deleted the event definition in question. After that I tailed the log and got a nasty-looking error because the event in question no longer existed, but the server started despite that.

I’ll post the offending event definition here in case something is obviously wrong with it, which could help with debugging similar cases in the future:

{ "_id" : ObjectId("5e2afd8f39b22502c984949e"), "title" : "VehkaIrisHälytys", "description" : "", "priority" : 2, "alert" : true, "config" : { "type" : "aggregation-v1", "query" : "", "streams" : [ "5e2afafa39b22502c9848ec9" ], "group_by" : [ ], "series" : [ { "id" : "6a294191-44fa-4990-8a30-b01384324d9b", "function" : "count", "field" : null } ], "conditions" : { "expression" : { "expr" : "==", "left" : { "expr" : "number-ref", "ref" : "161a79ec-6d31-4317-94db-5cb214b272aa" }, "right" : { "expr" : "number", "value" : 0 } } }, "search_within_ms" : NumberLong(300000), "execute_every_ms" : NumberLong(300000) }, "field_spec" : {  }, "key_spec" : [ ], "notification_settings" : { "grace_period_ms" : NumberLong(300000), "backlog_size" : NumberLong(0) }, "notifications" : [ { "notification_id" : "5e2afd3c39b22502c98493c2", "notification_parameters" : null } ], "storage" : [ { "type" : "persist-to-streams-v1", "streams" : [ "000000000000000000000002" ] } ] }
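One mismatch is visible in the document above: the condition's `number-ref` points at `161a79ec-6d31-4317-94db-5cb214b272aa`, while the only entry in `series` has the id `6a294191-44fa-4990-8a30-b01384324d9b`. Whether this is the exact inconsistency the migration tripped over is an assumption; the thread does not confirm it. A minimal Python sketch for spotting such dangling references follows. The function names are illustrative, and only the relevant fields of the posted document are reproduced.

```python
def collect_refs(node) -> set:
    """Recursively gather the 'ref' of every number-ref node in an expression tree."""
    refs = set()
    if isinstance(node, dict):
        if node.get("expr") == "number-ref" and "ref" in node:
            refs.add(node["ref"])
        for value in node.values():
            refs |= collect_refs(value)
    elif isinstance(node, list):
        for item in node:
            refs |= collect_refs(item)
    return refs


def dangling_refs(event_definition: dict) -> set:
    """Return condition refs that do not match any series id in the definition."""
    config = event_definition.get("config", {})
    series_ids = {series["id"] for series in config.get("series", [])}
    expression = (config.get("conditions") or {}).get("expression")
    return collect_refs(expression) - series_ids


# Relevant fields copied from the posted event definition
definition = {
    "config": {
        "type": "aggregation-v1",
        "series": [
            {"id": "6a294191-44fa-4990-8a30-b01384324d9b",
             "function": "count", "field": None},
        ],
        "conditions": {
            "expression": {
                "expr": "==",
                "left": {"expr": "number-ref",
                         "ref": "161a79ec-6d31-4317-94db-5cb214b272aa"},
                "right": {"expr": "number", "value": 0},
            }
        },
    }
}

print(dangling_refs(definition))  # {'161a79ec-6d31-4317-94db-5cb214b272aa'}
```

Running the same check over an export of the other event definitions before upgrading might reveal further candidates, under the same assumption.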

Can you post the mongo commands you used to find and delete the event definition? That will help future searchers! :smiley:

I used these commands to delete the event:

[root@localhost ~]# mongo
> use graylog
> db.event_definitions.deleteOne({_id:ObjectId("5e2afd8f39b22502c984949e")})

Replace the value inside ObjectId("...") with the ID reported in your own log file.

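Since the ID has to be copied out of a log file by hand, a small guard against typos may help. The Python sketch below checks that the ID is a 24-character hexadecimal string and then prints the mongo shell lines to paste: a `find` to inspect the document first, followed by the same `deleteOne` as above. The helper name is hypothetical, and the collection name is taken from the commands in this thread.

```python
import re

# A MongoDB ObjectId is written as 24 hexadecimal characters.
OBJECT_ID_PATTERN = re.compile(r"[0-9a-fA-F]{24}")


def mongo_commands(object_id: str, collection: str = "event_definitions") -> list:
    """Build mongo shell commands to inspect and then delete one document by _id."""
    if not OBJECT_ID_PATTERN.fullmatch(object_id):
        raise ValueError(f"not a valid ObjectId string: {object_id!r}")
    selector = f'{{_id: ObjectId("{object_id}")}}'
    return [
        f"db.{collection}.find({selector}).pretty()",  # look before deleting
        f"db.{collection}.deleteOne({selector})",
    ]


for line in mongo_commands("5e2afd8f39b22502c984949e"):
    print(line)
```

Saving the output of the `find` command somewhere before running the delete keeps a copy in case the definition has to be recreated in the UI.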
