java.lang.OutOfMemoryError: unable to create new native thread

Hello,

I have been using Graylog since about 2015, previously on Ubuntu 16.04 LTS. I have now upgraded to 18.04 LTS and can’t get it to work anymore. :cry: It runs on an OpenVZ server with an advertised 6 vCPU cores and 12 GB RAM.
All services run on the same system, which also hosts Kopano, Nextcloud and Apache. It is a hobby instance, used mostly by myself (rarely more than 2 concurrent users, apart from bots, port scans and the like; the previous installation received about 1 GB of messages per quarter). There is plenty of unused memory (usually more than 4 GB). Graylog ran fine on the same “hardware” with a similar configuration under Ubuntu 16.04, and before that under Debian 7 on a smaller vhost with far less RAM (I think 2 GB). Now, however, I got stuck right after installing it and setting up the first Syslog input.

The web frontend is behind an Apache reverse proxy. Installation and configuration were done according to this tutorial:
http://docs.graylog.org/en/3.0/pages/installation/os/ubuntu.html
I’m using Let’s Encrypt for Apache and iptables for filtering (inbound ports 80, 443 and SSH are accepted).

Graylog Version

Graylog 3.0.2+1686930 on MY_HOST (Oracle Corporation 1.8.0_212 on Linux 4.15.0)

Java

java -version
openjdk version “1.8.0_212”
OpenJDK Runtime Environment (build 1.8.0_212-8u212-b03-0ubuntu1.18.04.1-b03)
OpenJDK 64-Bit Server VM (build 25.212-b03, mixed mode)

Mongo

mongod --version
db version v4.0.10
git version: c389e7f69f637f7a1ac3cc9fae843b635f20b766
OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
allocator: tcmalloc
modules: none
build environment:
distmod: ubuntu1804
distarch: x86_64
target_arch: x86_64

Graylog configuration
(hardly any changes from the defaults, since version 3 seems to run fine behind a reverse proxy out of the box)

is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret =
root_username =
root_password_sha2 =
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
proxied_requests_thread_pool_size = 32

My problems are very similar to what has been described (but not solved) here: Input not Starting with java.lang.OutOfMemoryError: unable to create new native thread

Mongod and Elasticsearch are running, though with some warnings on startup … which I’ve mostly ignored so far. I’ll post the startup logs at the end of this post for the sake of overview.

I can start graylog-server and open the webfrontend. When I create and start an input (Syslog TCP or UDP), the log gets filled with errors. This is an excerpt of /var/log/graylog-server/graylog.log from a fresh restart to input creation and error state:

2019-07-11T23:38:58.498+02:00 INFO [ServerBootstrap] Graylog server up and running.
2019-07-11T23:38:58.547+02:00 INFO [InputStateListener] Input [Syslog TCP/5d27ab900a975a0ecb430e88] is now STARTING
2019-07-11T23:38:58.592+02:00 WARN [MacAddressUtil] Failed to find a usable hardware address from the network interfaces; using random bytes: 63:56:91:6e:b2:cc:08:6f
2019-07-11T23:38:58.648+02:00 INFO [InputStateListener] Input [Syslog TCP/5d27ab900a975a0ecb430e88] is now RUNNING
2019-07-11T23:38:58.655+02:00 WARN [AbstractTcpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogTCPInput{title=sys, type=org.graylog2.inputs.syslog.tcp.SyslogTCPInput, nodeId=null} (channel [id: 0x585b2c10, L:/127.0.0.1:10514]) should be 1048576 but is 425984.
2019-07-11T23:39:23.521+02:00 WARN [OutputBuffer] Unable to process event MessageEvent{raw=null, message=source: MY_HOST | message: Starting Clean php session files… { process_id: 1 | level: 6 | gl2_remote_ip: 127.0.0.1 | gl2_remote_port: 36924 | gl2_source_input: 5d27ab900a975a0ecb430e88 | application_name: systemd | gl2_source_node: b3f1c249-3375-4322-a370-e53f38873bd9 | _id: 594828f0-a424-11e9-851f-000000000000 | facility: system daemon | timestamp: 2019-07-11T23:39:23.512+02:00 }, messages=null}, sequence 7
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method) ~[?:1.8.0_212]
at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) ~[?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367) ~[?:1.8.0_212]
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112) ~[?:1.8.0_212]
at com.codahale.metrics.InstrumentedExecutorService.submit(InstrumentedExecutorService.java:72) ~[graylog.jar:?]
at org.graylog2.buffers.processors.OutputBufferProcessor.processMessage(OutputBufferProcessor.java:187) ~[graylog.jar:?]
at org.graylog2.buffers.processors.OutputBufferProcessor.processMessage(OutputBufferProcessor.java:168) ~[graylog.jar:?]
at org.graylog2.buffers.processors.OutputBufferProcessor.onEvent(OutputBufferProcessor.java:135) ~[graylog.jar:?]
at org.graylog2.buffers.processors.OutputBufferProcessor.onEvent(OutputBufferProcessor.java:51) ~[graylog.jar:?]
at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [graylog.jar:?]
at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [graylog.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
2019-07-11T23:39:23.752+02:00 WARN [OutputBuffer] Unable to process event MessageEvent{raw=null, message=source: MY_HOST | message: Started Clean php session files. { process_id: 1 | level: 6 | gl2_remote_ip: 127.0.0.1 | gl2_remote_port: 36924 | gl2_source_input: 5d27ab900a975a0ecb430e88 | application_name: systemd | gl2_source_node: b3f1c249-3375-4322-a370-e53f38873bd9 | _id: 596b8f70-a424-11e9-851f-000000000000 | facility: system daemon | timestamp: 2019-07-11T23:39:23.748+02:00 }, messages=null}, sequence 8
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method) ~[?:1.8.0_212]
at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957) ~[?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367) ~[?:1.8.0_212]
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112) ~[?:1.8.0_212]
at com.codahale.metrics.InstrumentedExecutorService.submit(InstrumentedExecutorService.java:72) ~[graylog.jar:?]
at org.graylog2.buffers.processors.OutputBufferProcessor.processMessage(OutputBufferProcessor.java:187) ~[graylog.jar:?]
at org.graylog2.buffers.processors.OutputBufferProcessor.processMessage(OutputBufferProcessor.java:168) ~[graylog.jar:?]
at org.graylog2.buffers.processors.OutputBufferProcessor.onEvent(OutputBufferProcessor.java:135) ~[graylog.jar:?]
at org.graylog2.buffers.processors.OutputBufferProcessor.onEvent(OutputBufferProcessor.java:51) ~[graylog.jar:?]
at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [graylog.jar:?]
at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [graylog.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]

After these occur, the web frontend doesn’t respond very well anymore and I get a lot of 500 errors from it. The Apache error log shows some of these:

[Thu Jul 11 23:38:57.164898 2019] [proxy:error] [pid 17064] (111)Connection refused: AH00957: HTTP: attempt to connect to 127.0.0.1:9000 (127.0.0.1) failed
[Thu Jul 11 23:38:57.164955 2019] [proxy_http:error] [pid 17064] [client 93.104.181.158:12265] AH01114: HTTP: failed to make connection to backend: 127.0.0.1, referer: https://LOG.MY.DOMAIN/system/inputs

Also, I’m puzzled by this message in the graylog-server log:

2019-07-11T23:38:58.655+02:00 WARN [AbstractTcpTransport] receiveBufferSize (SO_RCVBUF) for input SyslogTCPInput{title=sys, type=org.graylog2.inputs.syslog.tcp.SyslogTCPInput, nodeId=null} (channel [id: 0x585b2c10, L:/127.0.0.1:10514]) should be 1048576 but is 425984

I thought this might be related to http://docs.graylog.org/en/3.0/pages/faq.html#troubleshooting.
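In case it is the receive buffer, my understanding is that the fix suggested by that FAQ entry would look roughly like this (the file name is my own choice, and I’m not sure the OpenVZ container will honor the setting):

```shell
# Let sockets request up to the 1 MiB (1048576 bytes) receive buffer
# that the Syslog input asks for, and persist the setting
echo 'net.core.rmem_max = 1048576' | sudo tee /etc/sysctl.d/99-graylog.conf
sudo sysctl --system
```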

ulimit -u
62987

I tried to increase the limit via /etc/security/limits.d/graylog.conf:

more graylog.conf
graylog soft nproc 125974

To be honest, I don’t understand what value I should set it to. In any case, that change had no effect. I also tried adjusting the limits in /etc/systemd/system/multi-user.target.wants/graylog-server.service:

#LimitNOFILE=64000
LimitNOFILE=65535
LimitNPROC=1033211
LimitSIGPENDING=1033211

That didn’t change the behavior either. Checking the backups of my previous installations, I found that no adjustments had ever been made to limits there.
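In case it helps anyone diagnose the same thing, this is how I understand one can check how many threads the Graylog JVM actually holds and which systemd task limit applies to it (the pgrep pattern is an assumption for the standard deb package install):

```shell
#!/bin/sh
# Find the graylog-server JVM (process pattern assumed for the deb package)
GL_PID=$(pgrep -f 'graylog.jar' | head -n 1)

# Each thread appears as one entry under /proc/<pid>/task
echo "threads in use: $(ls "/proc/$GL_PID/task" | wc -l)"

# systemd counts every thread against TasksMax, which is a separate
# limit from ulimit -u / limits.conf
systemctl show graylog-server -p TasksCurrent -p TasksMax
```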

I’m hoping for some guidance on how to analyze the problem in more depth / what to adjust …

Many thanks,
Rafael

Startup logs:

Elasticsearch

[2019-07-11T23:06:55,943][INFO ][o.e.e.NodeEnvironment ] [dYVtR5d] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [247.3gb], net total_space [605.7gb], types [rootfs]
[2019-07-11T23:06:55,946][INFO ][o.e.e.NodeEnvironment ] [dYVtR5d] heap size [989.8mb], compressed ordinary object pointers [true]
[2019-07-11T23:06:55,959][INFO ][o.e.n.Node ] [dYVtR5d] node name derived from node ID [dYVtR5dXQv6JcbFIngwYbA]; set [node.name] to override
[2019-07-11T23:06:55,959][INFO ][o.e.n.Node ] [dYVtR5d] version[6.8.1], pid[2377], build[default/deb/1fad4e1/2019-06-18T13:16:52.517138Z], OS[Linux/4.15.0/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_212/25.212-b03]
[2019-07-11T23:06:55,960][INFO ][o.e.n.Node ] [dYVtR5d] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-2413074978885295782, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/var/lib/elasticsearch, -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:/var/log/elasticsearch/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/etc/elasticsearch, -Des.distribution.flavor=default, -Des.distribution.type=deb]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [aggs-matrix-stats]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [analysis-common]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [ingest-common]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [ingest-geoip]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [ingest-user-agent]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [lang-expression]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [lang-mustache]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [lang-painless]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [mapper-extras]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [parent-join]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [percolator]
[2019-07-11T23:06:58,434][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [rank-eval]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [reindex]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [repository-url]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [transport-netty4]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [tribe]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-ccr]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-core]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-deprecation]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-graph]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-ilm]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-logstash]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-ml]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-monitoring]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-rollup]
[2019-07-11T23:06:58,435][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-security]
[2019-07-11T23:06:58,436][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-sql]
[2019-07-11T23:06:58,436][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-upgrade]
[2019-07-11T23:06:58,436][INFO ][o.e.p.PluginsService ] [dYVtR5d] loaded module [x-pack-watcher]
[2019-07-11T23:06:58,436][INFO ][o.e.p.PluginsService ] [dYVtR5d] no plugins loaded
[2019-07-11T23:07:03,919][INFO ][o.e.x.s.a.s.FileRolesStore] [dYVtR5d] parsed [0] roles from file [/etc/elasticsearch/roles.yml]
[2019-07-11T23:07:04,566][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [dYVtR5d] [controller/2461] [Main.cc@109] controller (64 bit): Version 6.8.1 (Build 6e3432237cefa4) Copyright © 2019 Elasticsearch BV
[2019-07-11T23:07:05,250][DEBUG][o.e.a.ActionModule ] [dYVtR5d] Using REST wrapper from plugin org.elasticsearch.xpack.security.Security
[2019-07-11T23:07:05,633][INFO ][o.e.d.DiscoveryModule ] [dYVtR5d] using discovery type [zen] and host providers [settings]
[2019-07-11T23:07:06,679][INFO ][o.e.n.Node ] [dYVtR5d] initialized
[2019-07-11T23:07:06,680][INFO ][o.e.n.Node ] [dYVtR5d] starting …
[2019-07-11T23:07:06,852][WARN ][i.n.u.i.MacAddressUtil ] [dYVtR5d] Failed to find a usable hardware address from the network interfaces; using random bytes: 01:1e:3a:97:74:e3:e1:ba
[2019-07-11T23:07:07,030][INFO ][o.e.t.TransportService ] [dYVtR5d] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2019-07-11T23:07:07,094][WARN ][o.e.b.BootstrapChecks ] [dYVtR5d] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2019-07-11T23:07:10,226][INFO ][o.e.c.s.MasterService ] [dYVtR5d] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {dYVtR5d}{dYVtR5dXQv6JcbFIngwYbA}{aoQ5NAWRQs2LvcVwUABK6A}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=12884901888, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
[2019-07-11T23:07:10,231][INFO ][o.e.c.s.ClusterApplierService] [dYVtR5d] new_master {dYVtR5d}{dYVtR5dXQv6JcbFIngwYbA}{aoQ5NAWRQs2LvcVwUABK6A}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=12884901888, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, reason: apply cluster state (from master [master {dYVtR5d}{dYVtR5dXQv6JcbFIngwYbA}{aoQ5NAWRQs2LvcVwUABK6A}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=12884901888, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2019-07-11T23:07:10,316][INFO ][o.e.h.n.Netty4HttpServerTransport] [dYVtR5d] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}
[2019-07-11T23:07:10,317][INFO ][o.e.n.Node ] [dYVtR5d] started
[2019-07-11T23:07:10,763][WARN ][o.e.x.s.a.s.m.NativeRoleMappingStore] [dYVtR5d] Failed to clear cache for realms []
[2019-07-11T23:07:10,855][INFO ][o.e.l.LicenseService ] [dYVtR5d] license [c4cf3942-4288-4dd1-a3d8-a3c8b92f7364] mode [basic] - valid
[2019-07-11T23:07:10,870][INFO ][o.e.g.GatewayService ] [dYVtR5d] recovered [1] indices into cluster_state
[2019-07-11T23:07:11,390][INFO ][o.e.c.r.a.AllocationService] [dYVtR5d] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[graylog_0][3], [graylog_0][1]] …]).
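By the way, the vm.max_map_count warning above has a documented fix, though I’m not sure it can be applied inside an OpenVZ container, since the kernel belongs to the host:

```shell
# raise vm.max_map_count to the value the Elasticsearch bootstrap check expects
sudo sysctl -w vm.max_map_count=262144
# and make it persistent across reboots
echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-elasticsearch.conf
```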

Mongod

2019-07-11T23:16:58.750+0200 I CONTROL [main] ***** SERVER RESTARTED *****
2019-07-11T23:16:58.755+0200 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols ‘none’
2019-07-11T23:16:58.838+0200 I CONTROL [initandlisten] MongoDB starting : pid=2617 port=27017 dbpath=/var/lib/mongodb 64-bit host=MY_HOST.stratoserver.net
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] db version v4.0.10
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] git version: c389e7f69f637f7a1ac3cc9fae843b635f20b766
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] allocator: tcmalloc
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] modules: none
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] build environment:
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] distmod: ubuntu1804
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] distarch: x86_64
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] target_arch: x86_64
2019-07-11T23:16:58.839+0200 I CONTROL [initandlisten] options: { config: “/etc/mongod.conf”, net: { bindIp: “127.0.0.1”, port: 27017 }, processManagement: { timeZoneInfo: “/usr/share/zoneinfo” }, storage: { dbPath: “/var/lib/mongodb”, journal: { enabled: true } }, systemLog: { destination: “file”, logAppend: true, path: “/var/log/mongodb/mongod.log” } }
2019-07-11T23:16:58.845+0200 I STORAGE [initandlisten] Detected data files in /var/lib/mongodb created by the ‘wiredTiger’ storage engine, so setting the active storage engine to ‘wiredTiger’.
2019-07-11T23:16:58.845+0200 I STORAGE [initandlisten]
2019-07-11T23:16:58.845+0200 I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2019-07-11T23:16:58.845+0200 I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
2019-07-11T23:16:58.846+0200 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=5632M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2019-07-11T23:16:59.706+0200 I STORAGE [initandlisten] WiredTiger message [1562879819:706343][2617:0x7ff927412a40], txn-recover: Main recovery loop: starting at 5/12288 to 6/256
2019-07-11T23:16:59.818+0200 I STORAGE [initandlisten] WiredTiger message [1562879819:818526][2617:0x7ff927412a40], txn-recover: Recovering log 5 through 6
2019-07-11T23:16:59.900+0200 I STORAGE [initandlisten] WiredTiger message [1562879819:900641][2617:0x7ff927412a40], txn-recover: Recovering log 6 through 6
2019-07-11T23:16:59.954+0200 I STORAGE [initandlisten] WiredTiger message [1562879819:954523][2617:0x7ff927412a40], txn-recover: Set global recovery timestamp: 0
2019-07-11T23:16:59.975+0200 I RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(0, 0)
2019-07-11T23:17:00.059+0200 I CONTROL [initandlisten]
2019-07-11T23:17:00.059+0200 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2019-07-11T23:17:00.059+0200 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2019-07-11T23:17:00.059+0200 I CONTROL [initandlisten]
2019-07-11T23:17:00.059+0200 I CONTROL [initandlisten]
2019-07-11T23:17:00.059+0200 I CONTROL [initandlisten] ** WARNING: You are running in OpenVZ which can cause issues on versions of RHEL older than RHEL6.
2019-07-11T23:17:00.060+0200 I CONTROL [initandlisten]
2019-07-11T23:17:00.184+0200 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory ‘/var/lib/mongodb/diagnostic.data’
2019-07-11T23:17:00.186+0200 I NETWORK [initandlisten] waiting for connections on port 27017

I guess the main problem is:

2019-07-11T23:38:58.592+02:00 WARN [MacAddressUtil] Failed to find a usable hardware address from the network interfaces; using random bytes: 63:56:91:6e:b2:cc:08:6f

But that is only guessing.

Many thanks for your reply.
From what I can tell, this message originates from Netty’s MacAddressUtil class.

My network interface is the OpenVZ container’s virtual network (venet) device.

ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
loop txqueuelen 1000 (Lokale Schleife)
RX packets 8877228 bytes 22728067037 (22.7 GB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 8877228 bytes 22728067037 (22.7 GB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

venet0: flags=211<UP,BROADCAST,POINTOPOINT,RUNNING,NOARP> mtu 1500
inet 127.0.0.1 netmask 255.255.255.255 broadcast 0.0.0.0 destination 127.0.0.1
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 0 (UNSPEC)
RX packets 1500568 bytes 1108783656 (1.1 GB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1216511 bytes 984146317 (984.1 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

venet0:0: flags=211<UP,BROADCAST,POINTOPOINT,RUNNING,NOARP> mtu 1500
inet 85.214.105.251 netmask 255.255.255.255 broadcast 85.214.105.251 destination 85.214.105.251
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 0 (UNSPEC)

A Google search shows this often affects OpenVZ VPSes running a Netty-based service. Unfortunately, it seems I can’t change the interface settings, as they are defined in the VPS template. I found some tutorials on how to set up a virtual bridge with a MAC address, but I’m not sure that would work here, since it also requires certain kernel settings to be enabled, and there I’m limited because the kernel is configured by my provider.

I read that other Netty applications run despite this warning, and I set up Kibana in the meantime, which shows the same warning but runs fine (I miss Graylog though :smiling_face_with_three_hearts:). Since I had Graylog running here previously, this may be connected to how my provider set up the networking in their Ubuntu 18.04 template.
I have also tried Graylog 3.0.1, which I had been running on this same VPS under Ubuntu 16.04: same behaviour as described in the original post. So I’m guessing there may be a difference in the Ubuntu 18.04 OpenVZ template they use.

On the Netty GitHub there are two issues relating to that message; in both, the comments lean towards “it’s fine, ignore it”:


I don’t know how severe the MAC issue is for Graylog, but if you tell me it’s worth checking, I’ll try to work something out in that area.
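One thing I might still try: from what I read, Netty lets you pin the machine id it would otherwise derive from a MAC address via the io.netty.machineId system property, which should at least silence the warning (I haven’t verified whether it changes anything beyond that). For the deb package that would presumably go into /etc/default/graylog-server; the MAC-style value below is an arbitrary example:

```shell
# /etc/default/graylog-server
# give Netty a fixed machine id instead of random bytes (example value)
GRAYLOG_SERVER_JAVA_OPTS="$GRAYLOG_SERVER_JAVA_OPTS -Dio.netty.machineId=02:42:ac:11:00:02"
```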

Thank you for your detailed research. Would you mind opening a bug issue over at the graylog2-server GitHub repository?

I filed an issue on GitHub, but it turned out to be a user error after all :cold_sweat::

My systemd installation had DefaultTasksMax=165 set, which is obviously far too low for Graylog. Increasing the setting solved the problem. I only spotted this after spending hours trying to tweak the thread stack size, which I had suspected to be the culprit. :woozy_face: It also explains why my previous installations ran fine: Ubuntu 16.04 probably had a higher default / no limit on my VPS.
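For anyone landing here with the same error, these are the knobs involved, assuming the stock systemd setup (the value 4096 is just an example that should be comfortably above what a small Graylog instance needs):

```shell
# check which task limit is currently applied to the unit
systemctl show graylog-server -p TasksMax

# either raise the global default in /etc/systemd/system.conf:
#   [Manager]
#   DefaultTasksMax=4096
# or give only this unit a higher limit via a drop-in:
sudo systemctl edit graylog-server
#   [Service]
#   TasksMax=4096

sudo systemctl daemon-reload
sudo systemctl restart graylog-server
```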

This:


indicates it used to be 512, but my current installation was obviously limited further (sources provided by the hoster).

Sorry for the confusion.