High CPU load on nodes despite stopped input/no incoming messages

Hello,

I am running a 3-node Graylog cluster and experiencing a high CPU load. At first, I thought this might be related to a high message load, so I turned the input off. But the high CPU load persisted on all nodes, even after I restarted them, and the WebUI is telling me that no messages are being processed. The message journal is empty.
Here is the output of top which shows the graylog process using CPU:

top - 07:22:57 up 45 min,  1 user,  load average: 6.06, 6.09, 5.67
Tasks: 111 total,   1 running,  62 sleeping,   0 stopped,   0 zombie
%Cpu(s): 49.7 us, 49.4 sy,  0.0 ni,  0.7 id,  0.1 wa,  0.0 hi,  0.0 si,  0.1 st
KiB Mem :  8167492 total,  4278456 free,  3174236 used,   714800 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  4765976 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                     
 2094 graylog   20   0 5828824 1.439g  25952 S 375.0 18.5  72:45.92 java                                                        
  995 mongodb   20   0 1512276  91524  35072 S   6.2  1.1   0:28.03 mongod                                                      
    1 root      20   0  225068   8824   6680 S   0.0  0.1   0:01.28 systemd                                                     
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd                                                    
    3 root      20   0       0      0      0 I   0.0  0.0   0:00.13 kworker/0:0                                                 
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H                                                
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_wq                                                
    7 root      20   0       0      0      0 S   0.0  0.0   0:00.01 ksoftirqd/0                                                 
    8 root      20   0       0      0      0 I   0.0  0.0   0:00.39 rcu_sched                                                   
    9 root      20   0       0      0      0 I   0.0  0.0   0:00.00 rcu_bh                                                      
   10 root      rt   0       0      0      0 S   0.0  0.0   0:00.00 migration/0                                                 
   11 root      rt   0       0      0      0 S   0.0  0.0   0:00.00 watchdog/0                                                  
   12 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/0                                                     
   13 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/1                                                     
   14 root      rt   0       0      0      0 S   0.0  0.0   0:00.00 watchdog/1                                                  
   15 root      rt   0       0      0      0 S   0.0  0.0   0:00.00 migration/1                                                 
   16 root      20   0       0      0      0 S   0.0  0.0   0:00.01 ksoftirqd/1                                                 
   18 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/1:0H                                                
   19 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/2                                                     
   20 root      rt   0       0      0      0 S   0.0  0.0   0:00.00 watchdog/2                                                  
   21 root      rt   0       0      0      0 S   0.0  0.0   0:00.01 migration/2                                                 
   22 root      20   0       0      0      0 S   0.0  0.0   0:00.01 ksoftirqd/2                                                 
   24 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/2:0H

And if I dig into the process further with 'top -n 1 -H -p 2094', it shows the input, process and output buffer processors using CPU:

top - 07:23:25 up 46 min,  1 user,  load average: 6.03, 6.08, 5.68
Threads: 143 total,   6 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s): 49.7 us, 49.4 sy,  0.0 ni,  0.7 id,  0.1 wa,  0.0 hi,  0.0 si,  0.1 st
KiB Mem :  8167492 total,  4276692 free,  3174696 used,   716104 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  4765388 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                                      
 2130 graylog   20   0 5828824 1.439g  25952 R 87.5 18.5  12:57.58 inputbufferproc                                              
 2147 graylog   20   0 5828824 1.439g  25952 R 87.5 18.5  12:47.88 processbufferpr                                              
 2131 graylog   20   0 5828824 1.439g  25952 R 56.2 18.5  12:52.01 inputbufferproc                                              
 2243 graylog   20   0 5828824 1.439g  25952 R 56.2 18.5   8:05.51 grizzly-nio-ker                                              
 2146 graylog   20   0 5828824 1.439g  25952 R 50.0 18.5  12:43.47 processbufferpr                                              
 2143 graylog   20   0 5828824 1.439g  25952 R 43.8 18.5  12:45.43 outputbufferpro                                              
 2094 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:00.00 java                                                         
 2100 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:07.83 java                                                         
 2101 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:02.65 java                                                         
 2105 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:02.68 java                                                         
 2106 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:02.68 java                                                         
 2107 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:02.63 java                                                         
 2108 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:00.57 java                                                         
 2109 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:02.10 VM Thread                                                    
 2111 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:00.02 Reference Handl                                              
 2112 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:00.03 Finalizer                                                    
 2120 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:00.00 Surrogate Locke                                              
 2121 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:00.00 Signal Dispatch                                              
 2122 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:18.01 C2 CompilerThre                                              
 2123 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:15.38 C2 CompilerThre                                              
 2124 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:06.67 C1 CompilerThre                                              
 2125 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:00.00 Service Thread                                               
 2126 graylog   20   0 5828824 1.439g  25952 S  0.0 18.5   0:00.58 VM Periodic Tas
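
A possible next step, sketched under assumptions: the PID column of `top -H` is the native thread ID in decimal, while a JVM thread dump (for example from `jstack 2094`) labels each thread with the same ID in hexadecimal as `nid=0x...`. Converting the busy IDs makes it easier to find their stack traces in such a dump. The helper name `to_nid` is hypothetical, and the thread IDs are copied from the output above.

```python
# Map thread IDs from `top -H` (decimal) to the nid=0x... form used in JVM thread dumps.
BUSY_THREADS = {
    2130: "inputbufferproc",
    2147: "processbufferpr",
    2131: "inputbufferproc",
    2243: "grizzly-nio-ker",
    2146: "processbufferpr",
    2143: "outputbufferpro",
}


def to_nid(tid: int) -> str:
    """Return the thread ID formatted like the nid field of a thread dump."""
    return f"nid={tid:#x}"


if __name__ == "__main__":
    for tid, name in BUSY_THREADS.items():
        print(f"{name:<16} {tid}  ->  {to_nid(tid)}")
```

Searching the thread dump for each printed `nid` value should show what the corresponding buffer processor is executing.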

Server.log seems normal. There is an error regarding the LookupDataAdapter, which I read can be ignored.

2021-05-27T07:04:35.197Z INFO  [CmdLineTool] Loaded plugin: AWS plugins 2.5.2 [org.graylog.aws.plugin.AWSPlugin]
2021-05-27T07:04:35.199Z INFO  [CmdLineTool] Loaded plugin: Elastic Beats Input 2.5.2 [org.graylog.plugins.beats.BeatsInputPlugin]
2021-05-27T07:04:35.200Z INFO  [CmdLineTool] Loaded plugin: CEF Input 2.5.2 [org.graylog.plugins.cef.CEFInputPlugin]
2021-05-27T07:04:35.201Z INFO  [CmdLineTool] Loaded plugin: Collector 2.5.2 [org.graylog.plugins.collector.CollectorPlugin]
2021-05-27T07:04:35.201Z INFO  [CmdLineTool] Loaded plugin: Enterprise Integration Plugin 2.5.2 [org.graylog.plugins.enterprise_integration.EnterpriseIntegrationPlugin]
2021-05-27T07:04:35.202Z INFO  [CmdLineTool] Loaded plugin: MapWidgetPlugin 2.5.2 [org.graylog.plugins.map.MapWidgetPlugin]
2021-05-27T07:04:35.203Z INFO  [CmdLineTool] Loaded plugin: NetFlow Plugin 2.5.2 [org.graylog.plugins.netflow.NetFlowPlugin]
2021-05-27T07:04:35.210Z INFO  [CmdLineTool] Loaded plugin: Pipeline Processor Plugin 2.5.2 [org.graylog.plugins.pipelineprocessor.ProcessorPlugin]
2021-05-27T07:04:35.210Z INFO  [CmdLineTool] Loaded plugin: Threat Intelligence Plugin 2.5.2 [org.graylog.plugins.threatintel.ThreatIntelPlugin]
2021-05-27T07:04:35.540Z INFO  [CmdLineTool] Running with JVM arguments: -Xms2048m -Xmx2048m -XX:NewRatio=1 -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow -Dlog4j.configurationFile=file:///etc/graylog/server/log4j2.xml -Djava.library.path=/usr/share/graylog-server/lib/sigar -Dgraylog2.installation_source=deb
2021-05-27T07:04:35.728Z INFO  [Version] HV000001: Hibernate Validator 5.1.3.Final
2021-05-27T07:04:37.710Z INFO  [InputBufferImpl] Message journal is enabled.
2021-05-27T07:04:37.729Z INFO  [NodeId] Node ID: b6d234b3-0de7-483f-9a2b-cdfe9e024f81
2021-05-27T07:04:37.910Z INFO  [LogManager] Loading logs.
2021-05-27T07:04:37.977Z INFO  [LogManager] Logs loading complete.
2021-05-27T07:04:37.977Z INFO  [KafkaJournal] Initialized Kafka based journal at /var/lib/graylog-server/journal
2021-05-27T07:04:37.992Z INFO  [InputBufferImpl] Initialized InputBufferImpl with ring size <8192> and wait strategy <YieldingWaitStrategy>, running 1 parallel message handlers.
2021-05-27T07:04:38.021Z INFO  [cluster] Cluster created with settings {hosts=[192.168.4.41:27017, 192.168.4.42:27017, 192.168.4.43:27017], mode=MULTIPLE, requiredClusterType=REPLICA_SET, serverSelectionTimeout='30000 ms', maxWaitQueueSize=40, requiredReplicaSetName='rs0'}
2021-05-27T07:04:38.021Z INFO  [cluster] Adding discovered server 192.168.4.41:27017 to client view of cluster
2021-05-27T07:04:38.100Z INFO  [cluster] Adding discovered server 192.168.4.42:27017 to client view of cluster
2021-05-27T07:04:38.104Z INFO  [cluster] Adding discovered server 192.168.4.43:27017 to client view of cluster
2021-05-27T07:04:38.146Z INFO  [cluster] No server chosen by ReadPreferenceServerSelector{readPreference=primary} from cluster description ClusterDescription{type=REPLICA_SET, connectionMode=MULTIPLE, serverDescriptions=[ServerDescription{address=192.168.4.42:27017, type=UNKNOWN, state=CONNECTING}, ServerDescription{address=192.168.4.41:27017, type=UNKNOWN, state=CONNECTING}, ServerDescription{address=192.168.4.43:27017, type=UNKNOWN, state=CONNECTING}]}. Waiting for 30000 ms before timing out
2021-05-27T07:04:38.183Z INFO  [connection] Opened connection [connectionId{localValue:2, serverValue:30}] to 192.168.4.42:27017
2021-05-27T07:04:38.185Z INFO  [connection] Opened connection [connectionId{localValue:1, serverValue:103}] to 192.168.4.43:27017
2021-05-27T07:04:38.190Z INFO  [cluster] Monitor thread successfully connected to server with description ServerDescription{address=192.168.4.42:27017, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 3]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, roundTripTimeNanos=2606495, setName='rs0', canonicalAddress=192.168.4.42:27017, hosts=[192.168.4.41:27017, 192.168.4.43:27017, 192.168.4.42:27017], passives=[], arbiters=[], primary='192.168.4.43:27017', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Thu May 27 07:04:38 UTC 2021, lastUpdateTimeNanos=1652423339669}
2021-05-27T07:04:38.193Z INFO  [cluster] Monitor thread successfully connected to server with description ServerDescription{address=192.168.4.43:27017, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 3]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, roundTripTimeNanos=7518284, setName='rs0', canonicalAddress=192.168.4.43:27017, hosts=[192.168.4.41:27017, 192.168.4.43:27017, 192.168.4.42:27017], passives=[], arbiters=[], primary='192.168.4.43:27017', tagSet=TagSet{[]}, electionId=7fffffff0000000000000019, setVersion=1, lastWriteDate=Thu May 27 07:04:38 UTC 2021, lastUpdateTimeNanos=1652429694037}
2021-05-27T07:04:38.194Z INFO  [cluster] Setting max election id to 7fffffff0000000000000019 from replica set primary 192.168.4.43:27017
2021-05-27T07:04:38.194Z INFO  [cluster] Setting max set version to 1 from replica set primary 192.168.4.43:27017
2021-05-27T07:04:38.194Z INFO  [cluster] Discovered replica set primary 192.168.4.43:27017
2021-05-27T07:04:38.197Z INFO  [connection] Opened connection [connectionId{localValue:3, serverValue:8}] to 192.168.4.41:27017
2021-05-27T07:04:38.200Z INFO  [connection] Opened connection [connectionId{localValue:4, serverValue:104}] to 192.168.4.43:27017
2021-05-27T07:04:38.202Z INFO  [cluster] Monitor thread successfully connected to server with description ServerDescription{address=192.168.4.41:27017, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 3]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, roundTripTimeNanos=4797726, setName='rs0', canonicalAddress=192.168.4.41:27017, hosts=[192.168.4.41:27017, 192.168.4.43:27017, 192.168.4.42:27017], passives=[], arbiters=[], primary='192.168.4.43:27017', tagSet=TagSet{[]}, electionId=null, setVersion=1, lastWriteDate=Thu May 27 07:04:38 UTC 2021, lastUpdateTimeNanos=1652438729303}
2021-05-27T07:04:38.641Z INFO  [AbstractJestClient] Setting server pool to a list of 3 servers: [http://192.168.4.41:9200,http://192.168.4.42:9200,http://192.168.4.43:9200]
2021-05-27T07:04:38.642Z INFO  [JestClientFactory] Using multi thread/connection supporting pooling connection manager
2021-05-27T07:04:38.714Z INFO  [JestClientFactory] Using custom ObjectMapper instance
2021-05-27T07:04:38.714Z INFO  [JestClientFactory] Node Discovery disabled...
2021-05-27T07:04:38.714Z INFO  [JestClientFactory] Idle connection reaping disabled...
2021-05-27T07:04:38.998Z INFO  [ProcessBuffer] Initialized ProcessBuffer with ring size <8192> and wait strategy <YieldingWaitStrategy>.
2021-05-27T07:04:40.568Z INFO  [RulesEngineProvider] No static rules file loaded.
2021-05-27T07:04:40.803Z WARN  [GeoIpResolverEngine] GeoIP database file does not exist: /etc/graylog/server/GeoLite2-City.mmdb
2021-05-27T07:04:40.812Z INFO  [OutputBuffer] Initialized OutputBuffer with ring size <8192> and wait strategy <YieldingWaitStrategy>.
2021-05-27T07:04:40.915Z WARN  [GeoIpResolverEngine] GeoIP database file does not exist: /etc/graylog/server/GeoLite2-City.mmdb
2021-05-27T07:04:40.922Z INFO  [connection] Opened connection [connectionId{localValue:5, serverValue:106}] to 192.168.4.43:27017
2021-05-27T07:04:40.925Z INFO  [connection] Opened connection [connectionId{localValue:6, serverValue:105}] to 192.168.4.43:27017
2021-05-27T07:04:41.341Z INFO  [ServerBootstrap] Graylog server 2.5.2+4f6d123 starting up
2021-05-27T07:04:41.342Z INFO  [ServerBootstrap] JRE: Private Build 1.8.0_292 on Linux 4.15.0-143-generic
2021-05-27T07:04:41.343Z INFO  [ServerBootstrap] Deployment: deb
2021-05-27T07:04:41.343Z INFO  [ServerBootstrap] OS: Ubuntu 18.04.5 LTS (bionic)
2021-05-27T07:04:41.343Z INFO  [ServerBootstrap] Arch: amd64
2021-05-27T07:04:41.438Z INFO  [PeriodicalsService] Starting 25 periodicals ...
2021-05-27T07:04:41.438Z INFO  [Periodicals] Starting [org.graylog2.periodical.ThroughputCalculator] periodical in [0s], polling every [1s].
2021-05-27T07:04:41.449Z INFO  [Periodicals] Starting [org.graylog2.periodical.AlertScannerThread] periodical in [10s], polling every [60s].
2021-05-27T07:04:41.578Z INFO  [Periodicals] Starting [org.graylog2.periodical.BatchedElasticSearchOutputFlushThread] periodical in [0s], polling every [1s].
2021-05-27T07:04:41.579Z INFO  [Periodicals] Starting [org.graylog2.periodical.ClusterHealthCheckThread] periodical in [120s], polling every [20s].
2021-05-27T07:04:41.581Z INFO  [Periodicals] Starting [org.graylog2.periodical.ContentPackLoaderPeriodical] periodical, running forever.
2021-05-27T07:04:41.591Z INFO  [Periodicals] Starting [org.graylog2.periodical.GarbageCollectionWarningThread] periodical, running forever.
2021-05-27T07:04:41.592Z INFO  [Periodicals] Starting [org.graylog2.periodical.IndexerClusterCheckerThread] periodical in [0s], polling every [30s].
2021-05-27T07:04:41.594Z INFO  [Periodicals] Starting [org.graylog2.periodical.IndexRetentionThread] periodical in [0s], polling every [300s].
2021-05-27T07:04:41.594Z INFO  [Periodicals] Starting [org.graylog2.periodical.IndexRotationThread] periodical in [0s], polling every [10s].
2021-05-27T07:04:41.597Z INFO  [Periodicals] Starting [org.graylog2.periodical.NodePingThread] periodical in [0s], polling every [1s].
2021-05-27T07:04:41.597Z INFO  [Periodicals] Starting [org.graylog2.periodical.VersionCheckThread] periodical in [300s], polling every [1800s].
2021-05-27T07:04:41.598Z INFO  [Periodicals] Starting [org.graylog2.periodical.ThrottleStateUpdaterThread] periodical in [1s], polling every [1s].
2021-05-27T07:04:41.598Z INFO  [Periodicals] Starting [org.graylog2.events.ClusterEventPeriodical] periodical in [0s], polling every [1s].
2021-05-27T07:04:41.602Z INFO  [Periodicals] Starting [org.graylog2.events.ClusterEventCleanupPeriodical] periodical in [0s], polling every [86400s].
2021-05-27T07:04:41.602Z INFO  [Periodicals] Starting [org.graylog2.periodical.ClusterIdGeneratorPeriodical] periodical, running forever.
2021-05-27T07:04:41.603Z INFO  [Periodicals] Starting [org.graylog2.periodical.IndexRangesMigrationPeriodical] periodical, running forever.
2021-05-27T07:04:41.604Z INFO  [Periodicals] Starting [org.graylog2.periodical.IndexRangesCleanupPeriodical] periodical in [15s], polling every [3600s].
2021-05-27T07:04:41.650Z INFO  [connection] Opened connection [connectionId{localValue:7, serverValue:107}] to 192.168.4.43:27017
2021-05-27T07:04:41.655Z INFO  [connection] Opened connection [connectionId{localValue:8, serverValue:108}] to 192.168.4.43:27017
2021-05-27T07:04:41.772Z INFO  [PeriodicalsService] Not starting [org.graylog2.periodical.UserPermissionMigrationPeriodical] periodical. Not configured to run on this node.
2021-05-27T07:04:41.772Z INFO  [Periodicals] Starting [org.graylog2.periodical.AlarmCallbacksMigrationPeriodical] periodical, running forever.
2021-05-27T07:04:41.775Z INFO  [Periodicals] Starting [org.graylog2.periodical.ConfigurationManagementPeriodical] periodical, running forever.
2021-05-27T07:04:41.779Z INFO  [Periodicals] Starting [org.graylog2.periodical.LdapGroupMappingMigration] periodical, running forever.
2021-05-27T07:04:41.780Z INFO  [Periodicals] Starting [org.graylog2.periodical.IndexFailuresPeriodical] periodical, running forever.
2021-05-27T07:04:41.781Z INFO  [Periodicals] Starting [org.graylog2.periodical.TrafficCounterCalculator] periodical in [0s], polling every [1s].
2021-05-27T07:04:41.790Z INFO  [Periodicals] Starting [org.graylog.plugins.pipelineprocessor.periodical.LegacyDefaultStreamMigration] periodical, running forever.
2021-05-27T07:04:41.812Z INFO  [Periodicals] Starting [org.graylog.plugins.collector.periodical.PurgeExpiredCollectorsThread] periodical in [0s], polling every [3600s].
2021-05-27T07:04:41.963Z INFO  [LookupTableService] Data Adapter tor-exit-node/5b46249628aa493469d7cc87 [@c5797f6] STARTING
2021-05-27T07:04:41.963Z INFO  [LookupTableService] Data Adapter otx-api-domain/5b46249628aa493469d7cc85 [@6de00355] STARTING
2021-05-27T07:04:41.964Z INFO  [LegacyDefaultStreamMigration] Legacy default stream has no connections, no migration needed.
2021-05-27T07:04:41.964Z WARN  [OTXDataAdapter] OTX API key is missing. Make sure to add the key to allow higher request limits.
2021-05-27T07:04:41.965Z INFO  [LookupTableService] Data Adapter otx-api-ip/5b46249628aa493469d7cc84 [@61cf1299] STARTING
2021-05-27T07:04:41.963Z ERROR [LookupDataAdapter] Couldn't start data adapter <tor-exit-node/5b46249628aa493469d7cc87/@c5797f6>
org.graylog.plugins.threatintel.tools.AdapterDisabledException: TOR service is disabled, not starting TOR exit addresses adapter. To enable it please go to System / Configurations.
	at org.graylog.plugins.threatintel.adapters.tor.TorExitNodeDataAdapter.doStart(TorExitNodeDataAdapter.java:73) ~[?:?]
	at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
	at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
	at com.google.common.util.concurrent.Callables$4.run(Callables.java:119) [graylog.jar:?]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
2021-05-27T07:04:41.970Z ERROR [LookupDataAdapter] Couldn't start data adapter <abuse-ch-ransomware-ip/5b46249728aa493469d7cc89/@7c391913>
org.graylog.plugins.threatintel.tools.AdapterDisabledException: Abuse.ch service is disabled, not starting adapter. To enable it please go to System / Configurations.
	at org.graylog.plugins.threatintel.adapters.abusech.AbuseChRansomAdapter.doStart(AbuseChRansomAdapter.java:80) ~[?:?]
	at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
	at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
	at com.google.common.util.concurrent.Callables$4.run(Callables.java:119) [graylog.jar:?]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
2021-05-27T07:04:41.970Z ERROR [LookupDataAdapter] Couldn't start data adapter <abuse-ch-ransomware-domains/5b46249728aa493469d7cc88/@61851462>
org.graylog.plugins.threatintel.tools.AdapterDisabledException: Abuse.ch service is disabled, not starting adapter. To enable it please go to System / Configurations.
	at org.graylog.plugins.threatintel.adapters.abusech.AbuseChRansomAdapter.doStart(AbuseChRansomAdapter.java:80) ~[?:?]
	at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
	at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
	at com.google.common.util.concurrent.Callables$4.run(Callables.java:119) [graylog.jar:?]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
2021-05-27T07:04:41.970Z ERROR [LookupDataAdapter] Couldn't start data adapter <spamhaus-drop/5b46249628aa493469d7cc83/@262ac7a6>
org.graylog.plugins.threatintel.tools.AdapterDisabledException: Spamhaus service is disabled, not starting (E)DROP adapter. To enable it please go to System / Configurations.
	at org.graylog.plugins.threatintel.adapters.spamhaus.SpamhausEDROPDataAdapter.doStart(SpamhausEDROPDataAdapter.java:68) ~[?:?]
	at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
	at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
	at com.google.common.util.concurrent.Callables$4.run(Callables.java:119) [graylog.jar:?]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
2021-05-27T07:04:41.983Z INFO  [LookupTableService] Data Adapter otx-api-domain/5b46249628aa493469d7cc85 [@6de00355] RUNNING
2021-05-27T07:04:41.983Z INFO  [LookupTableService] Data Adapter tor-exit-node/5b46249628aa493469d7cc87 [@c5797f6] RUNNING
2021-05-27T07:04:41.984Z INFO  [LookupTableService] Data Adapter whois/5b46249628aa493469d7cc86 [@6f4e6653] STARTING
2021-05-27T07:04:41.984Z INFO  [LookupTableService] Data Adapter spamhaus-drop/5b46249628aa493469d7cc83 [@262ac7a6] STARTING
2021-05-27T07:04:41.984Z INFO  [LookupTableService] Data Adapter spamhaus-drop/5b46249628aa493469d7cc83 [@262ac7a6] RUNNING
2021-05-27T07:04:41.984Z INFO  [LookupTableService] Data Adapter abuse-ch-ransomware-ip/5b46249728aa493469d7cc89 [@7c391913] STARTING
2021-05-27T07:04:41.984Z INFO  [LookupTableService] Data Adapter abuse-ch-ransomware-ip/5b46249728aa493469d7cc89 [@7c391913] RUNNING
2021-05-27T07:04:41.984Z WARN  [OTXDataAdapter] OTX API key is missing. Make sure to add the key to allow higher request limits.
2021-05-27T07:04:41.984Z INFO  [LookupTableService] Data Adapter abuse-ch-ransomware-domains/5b46249728aa493469d7cc88 [@61851462] STARTING
2021-05-27T07:04:41.985Z INFO  [LookupTableService] Data Adapter abuse-ch-ransomware-domains/5b46249728aa493469d7cc88 [@61851462] RUNNING
2021-05-27T07:04:41.986Z INFO  [LookupTableService] Data Adapter whois/5b46249628aa493469d7cc86 [@6f4e6653] RUNNING
2021-05-27T07:04:41.998Z INFO  [LookupTableService] Cache whois-cache/5b46249628aa493469d7cc81 [@47957787] STARTING
2021-05-27T07:04:41.998Z INFO  [LookupTableService] Data Adapter otx-api-ip/5b46249628aa493469d7cc84 [@61cf1299] RUNNING
2021-05-27T07:04:41.998Z INFO  [LookupTableService] Cache spamhaus-e-drop-cache/5b46249628aa493469d7cc7e [@4f67021f] STARTING
2021-05-27T07:04:41.999Z INFO  [LookupTableService] Cache otx-api-ip-cache/5b46249628aa493469d7cc7d [@43aee5b6] STARTING
2021-05-27T07:04:42.017Z INFO  [LookupTableService] Cache otx-api-domain-cache/5b46249628aa493469d7cc7f [@73abcb22] STARTING
2021-05-27T07:04:42.022Z INFO  [LookupTableService] Cache threat-intel-uncached-adapters/5b46249628aa493469d7cc80 [@347291fd] STARTING
2021-05-27T07:04:42.022Z INFO  [LookupTableService] Cache whois-cache/5b46249628aa493469d7cc81 [@47957787] RUNNING
2021-05-27T07:04:42.026Z INFO  [LookupTableService] Cache otx-api-ip-cache/5b46249628aa493469d7cc7d [@43aee5b6] RUNNING
2021-05-27T07:04:42.027Z INFO  [LookupTableService] Cache threat-intel-uncached-adapters/5b46249628aa493469d7cc80 [@347291fd] RUNNING
2021-05-27T07:04:42.027Z INFO  [LookupTableService] Cache otx-api-domain-cache/5b46249628aa493469d7cc7f [@73abcb22] RUNNING
2021-05-27T07:04:42.027Z INFO  [LookupTableService] Cache spamhaus-e-drop-cache/5b46249628aa493469d7cc7e [@4f67021f] RUNNING
2021-05-27T07:04:42.045Z INFO  [LookupTableService] Starting lookup table otx-api-ip/5b46249728aa493469d7cc8d [@558da04a] using cache otx-api-ip-cache/5b46249628aa493469d7cc7d [@43aee5b6], data adapter otx-api-ip/5b46249628aa493469d7cc84 [@61cf1299]
2021-05-27T07:04:42.045Z INFO  [LookupTableService] Starting lookup table spamhaus-drop/5b46249728aa493469d7cc8e [@2ba0380d] using cache spamhaus-e-drop-cache/5b46249628aa493469d7cc7e [@4f67021f], data adapter spamhaus-drop/5b46249628aa493469d7cc83 [@262ac7a6]
2021-05-27T07:04:42.045Z INFO  [LookupTableService] Starting lookup table abuse-ch-ransomware-domains/5b46249728aa493469d7cc90 [@2f71ba9b] using cache threat-intel-uncached-adapters/5b46249628aa493469d7cc80 [@347291fd], data adapter abuse-ch-ransomware-domains/5b46249728aa493469d7cc88 [@61851462]
2021-05-27T07:04:42.045Z INFO  [LookupTableService] Starting lookup table whois/5b46249728aa493469d7cc8f [@b8c02b0] using cache whois-cache/5b46249628aa493469d7cc81 [@47957787], data adapter whois/5b46249628aa493469d7cc86 [@6f4e6653]
2021-05-27T07:04:42.045Z INFO  [LookupTableService] Starting lookup table tor-exit-node-list/5b46249728aa493469d7cc91 [@333b6c7b] using cache threat-intel-uncached-adapters/5b46249628aa493469d7cc80 [@347291fd], data adapter tor-exit-node/5b46249628aa493469d7cc87 [@c5797f6]
2021-05-27T07:04:42.045Z INFO  [LookupTableService] Starting lookup table abuse-ch-ransomware-ip/5b46249728aa493469d7cc8c [@19f14a41] using cache threat-intel-uncached-adapters/5b46249628aa493469d7cc80 [@347291fd], data adapter abuse-ch-ransomware-ip/5b46249728aa493469d7cc89 [@7c391913]
2021-05-27T07:04:42.045Z INFO  [LookupTableService] Starting lookup table otx-api-domain/5b46249728aa493469d7cc8b [@3b768e86] using cache otx-api-domain-cache/5b46249628aa493469d7cc7f [@73abcb22], data adapter otx-api-domain/5b46249628aa493469d7cc85 [@6de00355]
2021-05-27T07:04:42.422Z INFO  [JerseyService] Enabling CORS for HTTP endpoint
2021-05-27T07:04:57.038Z INFO  [NetworkListener] Started listener bound to [192.168.3.41:9000]
2021-05-27T07:04:57.040Z INFO  [HttpServer] [HttpServer] Started.
2021-05-27T07:04:57.041Z INFO  [JerseyService] Started REST API at <https://192.168.3.41:9000/api/>
2021-05-27T07:04:57.041Z INFO  [JerseyService] Started Web Interface at <https://192.168.3.41:9000/>
2021-05-27T07:04:57.041Z INFO  [ServiceManagerListener] Services are healthy
2021-05-27T07:04:57.042Z INFO  [InputSetupService] Triggering launching persisted inputs, node transitioned from Uninitialized [LB:DEAD] to Running [LB:ALIVE]
2021-05-27T07:04:57.042Z INFO  [ServerBootstrap] Services started, startup times in ms: {InputSetupService [RUNNING]=24, ConfigurationEtagService [RUNNING]=83, OutputSetupService [RUNNING]=97, BufferSynchronizerService [RUNNING]=205, JournalReader [RUNNING]=249, KafkaJournal [RUNNING]=259, StreamCacheService [RUNNING]=502, PeriodicalsService [RUNNING]=550, LookupTableService [RUNNING]=585, JerseyService [RUNNING]=15605}
2021-05-27T07:04:57.051Z INFO  [ServerBootstrap] Graylog server up and running.
2021-05-27T07:04:57.066Z INFO  [AbstractTcpTransport] Enabled TLS for input [Beats/60a3c6e09a5e602b63b9be75]. key-file="/etc/vector-cloud/certs/graylog.fra1.private.beats-input_key.pem" cert-file="/etc/vector-cloud/certs/graylog.fra1.private.beats-input.pem"
2021-05-27T07:04:57.066Z INFO  [InputStateListener] Input [Beats/60a3c6e09a5e602b63b9be75] is now STARTING
2021-05-27T07:04:57.184Z INFO  [InputStateListener] Input [Beats/60a3c6e09a5e602b63b9be75] is now RUNNING
2021-05-27T07:10:17.978Z INFO  [connection] Opened connection [connectionId{localValue:9, serverValue:118}] to 192.168.4.43:27017
2021-05-27T07:13:25.588Z INFO  [InputStateListener] Input [Beats/60a3c6e09a5e602b63b9be75] is now STOPPING
2021-05-27T07:13:25.589Z INFO  [InputStateListener] Input [Beats/60a3c6e09a5e602b63b9be75] is now STOPPED
2021-05-27T07:13:25.591Z INFO  [InputStateListener] Input [Beats/60a3c6e09a5e602b63b9be75] is now TERMINATED
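
Since the startup log records how each buffer was initialized, the effective ring size and wait strategy can be read from server.log rather than taken from the config file on trust. A minimal sketch: the regular expression is written against the three `Initialized ...` lines in the log above and is an assumption about their format, and `buffer_settings` is a hypothetical helper.

```python
import re

# Matches lines such as:
# "... Initialized ProcessBuffer with ring size <8192> and wait strategy <YieldingWaitStrategy>."
PATTERN = re.compile(
    r"Initialized (?P<buffer>\w+) with ring size <(?P<ring>\d+)> "
    r"and wait strategy <(?P<strategy>\w+)>"
)


def buffer_settings(log_text: str) -> dict:
    """Return {buffer name: (ring size, wait strategy)} for every matching log line."""
    return {
        m["buffer"]: (int(m["ring"]), m["strategy"])
        for m in PATTERN.finditer(log_text)
    }


# Sample lines copied from the server.log excerpt in this post.
SAMPLE = """\
2021-05-27T07:04:37.992Z INFO  [InputBufferImpl] Initialized InputBufferImpl with ring size <8192> and wait strategy <YieldingWaitStrategy>, running 1 parallel message handlers.
2021-05-27T07:04:38.998Z INFO  [ProcessBuffer] Initialized ProcessBuffer with ring size <8192> and wait strategy <YieldingWaitStrategy>.
2021-05-27T07:04:40.812Z INFO  [OutputBuffer] Initialized OutputBuffer with ring size <8192> and wait strategy <YieldingWaitStrategy>.
"""

if __name__ == "__main__":
    for name, (ring, strategy) in buffer_settings(SAMPLE).items():
        print(f"{name}: ring_size={ring}, wait_strategy={strategy}")
```

Running the same function over the log after a config change is a quick way to confirm that a new wait strategy was actually picked up on restart.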

Any insight is appreciated.

Hardware Setup:
3 VMs with
4 vCPUs
8 GB RAM
Root disk: 20 GB
200GB Volume mounted on /var/lib/elasticsearch
The 3 VMs are sitting behind an Octavia loadbalancer with round robin.

Software Versions:
OS: Ubuntu 18.04.5 LTS (Bionic), Kernel Linux 4.15.0-143-generic
Java: OpenJDK Runtime Environment (build 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10)
Graylog: 2.5.2+4f6d123
Elasticsearch: 6.8.16
MongoDB: 3.6.3

Server.conf (the same on all nodes, except the master flag and the uri IPs)

is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret = *****
root_password_sha2 = *****
plugin_dir = /usr/share/graylog-server/plugin
rest_listen_uri = https://192.168.3.41:9000/api
rest_transport_uri = https://10.200.0.111:9000/api
rest_enable_tls = true
rest_tls_cert_file = ****
rest_tls_key_file = ****
rest_thread_pool_size = 8
web_listen_uri = https://192.168.3.41:9000
web_enable_tls = true
web_tls_cert_file = ***
web_tls_key_file = ***
web_thread_pool_size = 8
elasticsearch_hosts = http://192.168.4.41:9200,http://192.168.4.42:9200,http://192.168.4.43:9200
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 2
outputbuffer_processors = 1
outputbuffer_processor_threads_core_pool_size = 8
outputbuffer_processor_threads_max_pool_size = 64
processor_wait_strategy = yielding
ring_size = 8192
inputbuffer_ring_size = 8192
inputbuffer_processors = 1
inputbuffer_wait_strategy = yielding
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_age = 12h
message_journal_max_size = 1gb
message_journal_flush_age = 1m
message_journal_flush_interval = 1000000
message_journal_segment_age = 1h
message_journal_segment_size = 100mb
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://192.168.4.41,192.168.4.42,192.168.4.43/graylog?replicaSet=rs0
mongodb_max_connections = 8
mongodb_threads_allowed_to_block_multiplier = 5
transport_email_enabled = true
transport_email_hostname = ****
transport_email_port = 25
transport_email_use_auth = false
transport_email_use_tls = true
transport_email_use_ssl = false
transport_email_subject_prefix = [graylog]
transport_email_from_email = ****
transport_email_web_interface_url = https://10.200.0.111:9000
content_packs_dir = /usr/share/graylog-server/contentpacks
content_packs_auto_load = grok-patterns.json
proxied_requests_thread_pool_size = 24

Hello and welcome,

Your settings look standard. What have you tried to solve this issue?
I assume this issue just started recently. If so, was there any applied updates on any of the servers?
To be honest, my first thought would be to add more CPUs.

When I set up a graylog-server I try to leave one core for my operating system.
Example: 8 cores on my Virtual machine. I would divide them up as shown below.

processbuffer_processors = 3
outputbuffer_processors = 2
inputbuffer_processors = 2
Operating system = 1
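
A rough sketch of that split as a shell function, in case it helps. The name split_cores and the divide-by-three rule are assumptions, chosen only because they reproduce the 8-core example above; this is not an official Graylog sizing rule.

```shell
# Hypothetical helper: split vCPUs across the three buffer processor
# settings while reserving one core for the operating system.
# The thirds-based split is an assumption that happens to match the
# 8-core example (3 process / 2 output / 2 input / 1 OS).
split_cores() {
    local cores="$1"
    local usable=$((cores - 1))      # leave one core for the OS
    local input=$((usable / 3))
    local output=$((usable / 3))
    # Each setting needs at least one processor, even on tiny VMs
    # (which means very small VMs end up oversubscribed).
    if [ "$input" -lt 1 ]; then input=1; fi
    if [ "$output" -lt 1 ]; then output=1; fi
    local process=$((usable - input - output))
    if [ "$process" -lt 1 ]; then process=1; fi
    printf 'processbuffer_processors = %s\n' "$process"
    printf 'outputbuffer_processors = %s\n' "$output"
    printf 'inputbuffer_processors = %s\n' "$input"
}

# Example: print suggested values for the current machine.
# split_cores "$(nproc)"
```

Treat the output as a starting point only and adjust it based on which buffers actually fill up.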

If that doesn't work, have you thought about upgrading to GL3.3?

Hello gsmith,

Thank you for your reply. You are correct in the assumption that this problem started recently; the cluster was running fine before. The cluster is set up via Ansible, so I'm pretty confident that there aren't any unintended or unnoticed configuration changes.
To solve this issue, I previously added more CPUs; the VMs here started with 2 vCPUs.
I also fiddled with the processor and thread numbers, but that seemed to have no effect, except that I see more or fewer processes in top. Switching the wait_strategy from 'yielding' to 'blocking' only changed the results insofar as just one processor was then consuming all the CPU, regardless of how many processors were configured.
Regarding the updates: I noticed there was an automatic update of the JRE, from 8u162-b12-1 to 1.8.0_292-8u29, and I read that some newer Java versions have trouble with TLSv1.3 connections. But by temporarily downgrading, I could rule out that this update was the cause of my issues.

Nevertheless, I upgraded all VMs from 4 vCPUs to 8 vCPUs and also followed your advice to leave some processors for the OS. But the load remains high.

I recognize that Graylog 2.5 is an older version, but I'm hesitating a bit to upgrade, as this cluster is logging a legacy environment.
Could you share some insight as to why upgrading might solve my issues?

Hello,

Thank you for the detailed information, but unfortunately this is an odd situation, to be honest. I have seen the Java downgrade issue as well. What gets me about your situation is that no updates were applied; the fact that this just happened randomly makes me believe something else is interfering. Have you by chance been gathering metrics from these Graylog nodes since they were first built (e.g., with Zabbix)? What did the CPU % look like prior to this issue?

The only thing I can do for you right now is show you my environment; maybe something in there will shed some light for you. Bad news: I no longer have GL 2.5.

I have another test lab Graylog server, version 4.0.7, with 12 vCPUs, 10 GB RAM, and a 500 GB HDD, running CentOS 7. The information below shows that Graylog server under full load.

Here is my top output (screenshot not preserved):

We push about 30 GB of messages a day because we're testing the performance on a single node. We have verbose logging settings enabled on our AD DC.

Here is my Graylog configuration file for my lab VM. The only things I don't have that you do are the Elasticsearch and MongoDB connections and some web UI settings that have changed since 2.5.

Most, if not all, are pretty much default settings.

GL4_config
[root@graylog server]# grep -v "^#\|^$" /etc/graylog/server/server.conf | sed -e "s/#.*$//g"
is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret =epOqmLi7r7CdZxl76QOQxr8bRUPYstNdcBuajsaSNfG5bkXXFxyHAAsdgmCfyHhSKlKXjMQG9ojc0bn22EBT17elgGTUJgbD
root_password_sha2 =272c3ac6b26a795a4244d8d2caf1d19a072fbc1c88d497ba1df7fef0a4171ea6
root_email = "greg.smith@domain.com"
root_timezone = America/Chicago
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = graylog.domain.com:9000
http_publish_uri = https://graylog.domain.com:9000/
http_enable_cors = true
http_enable_tls = true
http_tls_cert_file = /etc/ssl/certs/graylog/graylog-certificate.pem
http_tls_key_file = /etc/ssl/certs/graylog/graylog-key.pem
http_tls_key_password = secret
trusted_proxies = 10.10.20.20/24 <--- This is enabled for testing SSO.
elasticsearch_hosts = http://10.10.10.10:9200
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = true
allow_highlighting = false
elasticsearch_analyzer = standard
elasticsearch_index_optimization_timeout = 1h
output_batch_size = 5000
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 6
outputbuffer_processors = 2
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 3
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_size = 12gb
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://mongo_admin:password@localhost:27017/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
transport_email_enabled = true
transport_email_hostname = localhost
tansport_email_port = 25
transport_email_subject_prefix = [graylog]
transport_email_from_email = root@doamin.com
transport_email_web_interface_url = https://10.10.10.10:9000
http_connect_timeout = 10s
proxied_requests_thread_pool_size = 32

My htop (screenshot not preserved):

As you can see, Java is soaking up CPU. Maybe someone else here has run into this before with a Graylog cluster and can shed some light on why this is happening to you.

There were some improvements in performance and security in the newer versions, but after what you stated, I'm not 100% sure that will help your CPU issue.

Here is the changelog starting from GL3.0.0.

https://docs.graylog.org/en/3.3/pages/changelog.html#graylog-3-0-0

Hope that helps

Hello,

The culprit was the combination of updates of the java and openssl packages. A downgrade with the following apt commands solved my problems:

sudo apt install openjdk-8-jre-headless=8u162-b12-1
sudo apt install openssl=1.1.0g-2ubuntu4

Both of these upgrades were installed by automatic (unattended) upgrades.
I noticed there was something strange going on with SSL/TLS because calls to the Graylog API (https://192.168.3.41:9000/api in my config) were "hanging up". The first call always went through fine; however, subsequent calls always got stuck on the TLS handshake part (visible with curl in verbose mode).
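
For anyone who wants to check for the same symptom, here is a minimal sketch that times the TLS handshake of several consecutive requests with curl. The function name probe_tls, the request count, and the 10-second limit are assumptions for illustration, and -k is only there on the assumption that the API uses a certificate curl does not trust.

```shell
# Hypothetical helper: time the TLS handshake of N consecutive requests.
# Usage: probe_tls URL [COUNT]
probe_tls() {
    local url="$1"
    local count="${2:-5}"
    local i=1
    local t
    while [ "$i" -le "$count" ]; do
        # time_appconnect is curl's write-out variable for the time (in
        # seconds) until the TLS handshake completed. --max-time keeps a
        # stuck handshake from blocking the loop forever.
        # -k skips certificate verification (assumption: untrusted cert).
        t="$(curl -sk -o /dev/null --max-time 10 \
                  -w '%{time_appconnect}' "$url")" || t="timeout/error"
        printf 'request %s: handshake %s\n' "$i" "$t"
        i=$((i + 1))
    done
}

# Example, using the API address from the config above:
# probe_tls https://192.168.3.41:9000/api 5
```

If the first request reports a small handshake time and the following ones report timeout/error, that would match the behavior described above.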
I am leaving this here in case anyone else encounters this problem. It should be noted, however, that the automatic upgrades were security upgrades (that's how they are configured), so use your own best judgement when downgrading.
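
One follow-up that may be worth considering: without a hold, unattended upgrades will most likely reinstall the newer packages. A minimal sketch, assuming you accept the security trade-off mentioned above; the helper name not_held is made up for illustration, while apt-mark hold and apt-mark showhold are standard apt commands.

```shell
# Hypothetical helper: given the hold list on stdin (one package name per
# line, as printed by `apt-mark showhold`), print every package named in
# the arguments that is NOT on hold.
not_held() {
    local held pkg
    held="$(cat)"
    for pkg in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        if ! printf '%s\n' "$held" | grep -qxF -- "$pkg"; then
            printf '%s\n' "$pkg"
        fi
    done
}

# Pin the two downgraded packages so they are not upgraded again:
# sudo apt-mark hold openjdk-8-jre-headless openssl
#
# Verify; no output means both packages are held:
# apt-mark showhold | not_held openjdk-8-jre-headless openssl
```

Remember to release the hold with apt-mark unhold once a fixed package version is available, since held packages no longer receive security updates.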

Thank you gsmith for your help and insight!
