Inputs not starting: java.lang.OutOfMemoryError: unable to create new native thread


(Meerkampj) #1

I installed Graylog on our new Graylog servers, which run SLES 12 SP2, the version we currently use.
Our other Graylog installations are on SLES 11 SP3 and work with the same configuration.
The new node appears in the nodes view but does not fully start any inputs (they all stay in the STARTING state).

There is enough memory on the node (128 GB); the heap is set to 8 GB. Limits are set too. Am I missing something?

OS: SLES 12-SP2 (sles)
JRE: Oracle Corporation 1.8.0_101 on Linux 4.4.21-69-default
Graylog server 2.2.3+7adc951

GRAYLOG_SERVER_JAVA_OPTS="-Xms8g -Xmx8g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow"

The error suggests a problem with the limits, but they are all defined the same way as on the working system:

cat /proc/56862/limits
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             1033211              1033211              processes 
Max open files            65535                65535                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       1033211              1033211              signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us

Sep 26 09:01:09 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:09,669 INFO : org.graylog2.inputs.InputStateListener - Input [GELF TCP/576bc66e47c6c85d3bf26b05] is now STARTING
Sep 26 09:01:09 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:09,671 INFO : org.graylog2.inputs.InputStateListener - Input [GELF TCP/58985f4df14e68768d67ced0] is now STARTING
Sep 26 09:01:09 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:09,673 INFO : org.graylog2.inputs.InputStateListener - Input [Beats/58ec5e3d47c6c8fafa7eaa22] is now STARTING
Sep 26 09:01:09 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:09,674 INFO : org.graylog2.inputs.InputStateListener - Input [Beats/58ec5e5547c6c8fafa7eaa7a] is now STARTING
Sep 26 09:01:09 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:09,675 INFO : org.graylog2.inputs.InputStateListener - Input [Beats/58ec5e6647c6c8fafa7eaa8f] is now STARTING
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:10,157 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-1 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:10,170 ERROR: org.elasticsearch.transport.netty - [graylog-9d7ffc6d-1bc5-4479-8577-46112dc47f02] failed to handle exception response [org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4@45f976e8]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: java.lang.OutOfMemoryError: unable to create new native thread
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at java.lang.Thread.start0(Native Method) ~[?:1.8.0_101]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at java.lang.Thread.start(Thread.java:714) ~[?:1.8.0_101]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950) ~[?:1.8.0_101]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1357) ~[?:1.8.0_101]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:85) ~[graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.elasticsearch.action.support.ThreadedActionListener.onFailure(ThreadedActionListener.java:101) ~[graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.handleException(TransportMasterNodeAction.java:210) ~[graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.elasticsearch.transport.netty.MessageChannelHandler.handleException(MessageChannelHandler.java:184) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:163) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:124) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [graylog.jar:?]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
Sep 26 09:01:10 mgmgray04 graylog-server[56859]: at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
Sep 26 09:01:12 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:12,156 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-11 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
Sep 26 09:01:14 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:14,156 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-0 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
Sep 26 09:01:16 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:16,157 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-4 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
Sep 26 09:01:18 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:18,156 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-9 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
Sep 26 09:01:20 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:20,160 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-2 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
Sep 26 09:01:22 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:22,157 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-13 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
Sep 26 09:01:24 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:24,157 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-15 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
Sep 26 09:01:26 mgmgray04 graylog-server[56859]: 2017-09-26 09:01:26,157 ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-18 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.
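
For anyone checking the same thing: /proc/<pid>/limits only lists the rlimits. "Max processes" is counted per user rather than per process, and if the service runs under systemd with the pids cgroup controller (an assumption; nothing in the output above confirms it), a separate per-unit task cap could apply that this listing would not show. Below is a minimal Python sketch that puts the live thread count next to both. The function names and the /sys/fs/cgroup/pids mount point are assumptions for illustration, not anything shipped with Graylog.

```python
import re
from pathlib import Path

# A limit name, then at least two spaces, then the soft and hard values.
LIMIT_LINE = re.compile(r"^(.+?)\s{2,}(\S+)\s+(\S+)")


def parse_limits(text):
    """Return {limit name: (soft, hard)} from the text of /proc/<pid>/limits."""
    limits = {}
    for line in text.splitlines()[1:]:  # the first line is the column header
        match = LIMIT_LINE.match(line)
        if match:
            name, soft, hard = match.groups()
            limits[name] = (soft, hard)
    return limits


def thread_count(status_text):
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no Threads: field found")


def pids_cgroup(cgroup_text):
    """Return the cgroup path of the v1 'pids' controller, or None if absent."""
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)  # hierarchy-ID:controller-list:path
        if len(parts) == 3 and "pids" in parts[1].split(","):
            return parts[2]
    return None


def report(pid):
    """Print the thread count next to the limits that can cap thread creation."""
    proc = Path("/proc") / str(pid)
    limits = parse_limits((proc / "limits").read_text())
    print("threads now:       ", thread_count((proc / "status").read_text()))
    print("Max processes soft:", limits["Max processes"][0])
    cgroup = pids_cgroup((proc / "cgroup").read_text())
    if cgroup is not None:
        # Assumed mount point of the cgroup v1 pids hierarchy.
        pids_max = Path("/sys/fs/cgroup/pids") / cgroup.lstrip("/") / "pids.max"
        if pids_max.exists():
            print("cgroup pids.max:   ", pids_max.read_text().strip())
```

Calling report(56862) as root or as the user that owns the process would print these values for the graylog-server process shown above.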

(Jochen) #2

Please post the complete configuration of Graylog.


(Meerkampj) #3

/etc/sysconfig/graylog-server

# Path to the java executable.
JAVA=/usr/bin/java

# Default Java options for heap and garbage collection.
GRAYLOG_SERVER_JAVA_OPTS="-Xms8g -Xmx8g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow"

# Pass some extra args to graylog-server. (i.e. "-d" to enable debug mode)
GRAYLOG_SERVER_ARGS=""

# Program that will be used to wrap the graylog-server command. Useful to
# support programs like authbind.
GRAYLOG_COMMAND_WRAPPER=""

/etc/graylog/server/server.conf

# If you are running more than one instances of Graylog server you have to select one of these
# instances as master. The master will perform some periodical tasks that non-masters won't perform.
is_master = false

# The auto-generated node ID will be stored in this file and read after restarts. It is a good idea
# to use an absolute file path here if you are starting Graylog server from init scripts or similar.
node_id_file = /etc/graylog/server/node-id

# You MUST set a secret to secure/pepper the stored user passwords here. Use at least 64 characters.
# Generate one by using for example: pwgen -N 1 -s 96
password_secret = *redacted*

# The default root user is named 'admin'
#root_username = admin

# You MUST specify a hash password for the root user (which you only need to initially set up the
# system and in case you lose connectivity to your authentication backend)
# This password cannot be changed using the API or via the web interface. If you need to change it,
# modify it in this file.
# Create one by using for example: echo -n yourpassword | shasum -a 256
# and put the resulting hash value into the following line
root_password_sha2 = *redacted*

# The email address of the root user.
# Default is empty
#root_email = ""

# The time zone setting of the root user. See http://www.joda.org/joda-time/timezones.html for a list of valid time zones.
# Default is UTC
#root_timezone = UTC

# Set plugin directory here (relative or absolute)
plugin_dir = /opt/graylog/server/plugin/

# REST API listen URI. Must be reachable by other Graylog server nodes if you run a cluster.
# When using Graylog Collectors, this URI will be used to receive heartbeat messages and must be accessible for all collectors.
rest_listen_uri = http://0.0.0.0:12900/

# REST API transport address. Defaults to the value of rest_listen_uri. Exception: If rest_listen_uri
# is set to a wildcard IP address (0.0.0.0) the first non-loopback IPv4 system address is used.
# If set, this will be promoted in the cluster discovery APIs, so other nodes may try to connect on
# this address and it is used to generate URLs addressing entities in the REST API. (see rest_listen_uri)
# You will need to define this, if your Graylog server is running behind a HTTP proxy that is rewriting
# the scheme, host name or URI.
# This must not contain a wildcard address (0.0.0.0).
rest_transport_uri = http://172.17.3.148:12900/

# Enable CORS headers for REST API. This is necessary for JS-clients accessing the server directly.
# If these are disabled, modern browsers will not be able to retrieve resources from the server.
# This is enabled by default. Uncomment the next line to disable it.
#rest_enable_cors = false
rest_enable_cors = true

# Enable GZIP support for REST API. This compresses API responses and therefore helps to reduce
# overall round trip times. This is enabled by default. Uncomment the next line to disable it.
#rest_enable_gzip = false

# Enable HTTPS support for the REST API. This secures the communication with the REST API with
# TLS to prevent request forgery and eavesdropping. This is disabled by default. Uncomment the
# next line to enable it.
#rest_enable_tls = true

# The X.509 certificate chain file in PEM format to use for securing the REST API.
#rest_tls_cert_file = /path/to/graylog.crt

# The PKCS#8 private key file in PEM format to use for securing the REST API.
#rest_tls_key_file = /path/to/graylog.key

# The password to unlock the private key used for securing the REST API.
#rest_tls_key_password = secret

# The maximum size of the HTTP request headers in bytes.
#rest_max_header_size = 8192

# The maximal length of the initial HTTP/1.1 line in bytes.
#rest_max_initial_line_length = 4096

# The size of the thread pool used exclusively for serving the REST API.
#rest_thread_pool_size = 16

# Comma separated list of trusted proxies that are allowed to set the client address with X-Forwarded-For
# header. May be subnets, or hosts.
#trusted_proxies = 127.0.0.1/32, 0:0:0:0:0:0:0:1/128

# Enable the embedded Graylog web interface.
# Default: true
#web_enable = false

# Web interface listen URI.
# Configuring a path for the URI here effectively prefixes all URIs in the web interface. This is a replacement
# for the application.context configuration parameter in pre-2.0 versions of the Graylog web interface.
#web_listen_uri = http://127.0.0.1:9000/
web_listen_uri = http://0.0.0.0:9000/

# Web interface endpoint URI. This setting can be overridden on a per-request basis with the X-Graylog-Server-URL header.
# Default: $rest_transport_uri
#web_endpoint_uri =

# Enable CORS headers for the web interface. This is necessary for JS-clients accessing the server directly.
# If these are disabled, modern browsers will not be able to retrieve resources from the server.
#web_enable_cors = false
web_enable_cors = true

# Enable/disable GZIP support for the web interface. This compresses HTTP responses and therefore helps to reduce
# overall round trip times. This is enabled by default. Uncomment the next line to disable it.
#web_enable_gzip = false
web_enable_gzip = true

# Enable HTTPS support for the web interface. This secures the communication of the web browser with the web interface
# using TLS to prevent request forgery and eavesdropping.
# This is disabled by default. Uncomment the next line to enable it and see the other related configuration settings.
#web_enable_tls = true

# The X.509 certificate chain file in PEM format to use for securing the web interface.
#web_tls_cert_file = /path/to/graylog-web.crt

# The PKCS#8 private key file in PEM format to use for securing the web interface.
#web_tls_key_file = /path/to/graylog-web.key

# The password to unlock the private key used for securing the web interface.
#web_tls_key_password = secret

# The maximum size of the HTTP request headers in bytes.
#web_max_header_size = 8192

# The maximal length of the initial HTTP/1.1 line in bytes.
#web_max_initial_line_length = 4096

# The size of the thread pool used exclusively for serving the web interface.
#web_thread_pool_size = 16

# Configuration file for the embedded Elasticsearch instance in Graylog.
# Pay attention to the working directory of the server, maybe use an absolute path here.
# Default: empty
#elasticsearch_config_file = /etc/graylog/server/elasticsearch.yml

# Graylog will use multiple indices to store documents in. You can configure the strategy it uses to determine
# when to rotate the currently active write index.
# It supports multiple rotation strategies:
#   - "count" of messages per index, use elasticsearch_max_docs_per_index below to configure
#   - "size" per index, use elasticsearch_max_size_per_index below to configure
# valid values are "count", "size" and "time", default is "count"
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
rotation_strategy = time

# (Approximate) maximum number of documents in an Elasticsearch index before a new index
# is being created, also see no_retention and elasticsearch_max_number_of_indices.
# Configure this if you used 'rotation_strategy = count' above.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
elasticsearch_max_docs_per_index = 20000000

# (Approximate) maximum size in bytes per Elasticsearch index on disk before a new index is being created, also see
# no_retention and elasticsearch_max_number_of_indices. Default is 1GB.
# Configure this if you used 'rotation_strategy = size' above.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#elasticsearch_max_size_per_index = 1073741824

# (Approximate) maximum time before a new Elasticsearch index is being created, also see
# no_retention and elasticsearch_max_number_of_indices. Default is 1 day.
# Configure this if you used 'rotation_strategy = time' above.
# Please note that this rotation period does not look at the time specified in the received messages, but is
# using the real clock value to decide when to rotate the index!
# Specify the time using a duration and a suffix indicating which unit you want:
#  1w  = 1 week
#  1d  = 1 day
#  12h = 12 hours
# Permitted suffixes are: d for day, h for hour, m for minute, s for second.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#elasticsearch_max_time_per_index = 1d

# Disable checking the version of Elasticsearch for being compatible with this Graylog release.
# WARNING: Using Graylog with unsupported and untested versions of Elasticsearch may lead to data loss!
#elasticsearch_disable_version_check = true

# Disable message retention on this node, i. e. disable Elasticsearch index rotation.
#no_retention = false

# How many indices do you want to keep?
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
elasticsearch_max_number_of_indices = 20

# Decide what happens with the oldest indices when the maximum number of indices is reached.
# The following strategies are available:
#   - delete # Deletes the index completely (Default)
#   - close # Closes the index and hides it from the system. Can be re-opened later.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
retention_strategy = delete

# How many Elasticsearch shards and replicas should be used per index? Note that this only applies to newly created indices.
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
elasticsearch_shards = 2
elasticsearch_replicas = 1

# Prefix for all Elasticsearch indices and index aliases managed by Graylog.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
elasticsearch_index_prefix = pru-graylog

# Name of the Elasticsearch index template used by Graylog to apply the mandatory index mapping.
# Default: graylog-internal
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#elasticsearch_template_name = graylog-internal

# Do you want to allow searches with leading wildcards? This can be extremely resource hungry and should only
# be enabled with care. See also: http://docs.graylog.org/en/2.1/pages/queries.html
allow_leading_wildcard_searches = false

# Do you want to allow searches to be highlighted? Depending on the size of your messages this can be memory hungry and
# should only be enabled after making sure your Elasticsearch cluster has enough memory.
allow_highlighting = false

# settings to be passed to elasticsearch's client (overriding those in the provided elasticsearch_config_file)
# all these
# this must be the same as for your Elasticsearch cluster
#elasticsearch_cluster_name = graylog
elasticsearch_cluster_name = pru-graylog

# The prefix being used to generate the Elasticsearch node name which makes it easier to identify the specific Graylog
# server running the embedded Elasticsearch instance. The node name will be constructed by concatenating this prefix
# and the Graylog node ID (see node_id_file), for example "graylog-17052010-1234-5678-abcd-1337cafebabe".
# Default: graylog-
#elasticsearch_node_name_prefix = graylog-

# A comma-separated list of Elasticsearch nodes which Graylog is using to connect to the Elasticsearch cluster,
# see https://www.elastic.co/guide/en/elasticsearch/reference/2.3/modules-discovery-zen.html for details.
# Default: 127.0.0.1
elasticsearch_discovery_zen_ping_unicast_hosts = mgmlog14:9300 ,mgmlog13:9300 ,mgmlog12:9300 , mgmlog11:9300 , mgmlog16:9300

# Use multiple Elasticsearch nodes as seed
elasticsearch_discovery_zen_ping_unicast_hosts = mgmlog14:9300 ,mgmlog13:9300 ,mgmlog12:9300 , mgmlog11:9300 , mgmlog16:9300

# we don't want the Graylog server to store any data, or be master node
#elasticsearch_node_master = false
#elasticsearch_node_data = false

# use a different port if you run multiple Elasticsearch nodes on one machine
#elasticsearch_transport_tcp_port = 9350

# we don't need to run the embedded HTTP server here
#elasticsearch_http_enabled = false

# Change the following setting if you are running into problems with timeouts during Elasticsearch cluster discovery.
# The setting is specified in milliseconds, the default is 5000ms (5 seconds).
#elasticsearch_cluster_discovery_timeout = 5000

# the following settings allow to change the bind addresses for the Elasticsearch client in Graylog
# these settings are empty by default, letting Elasticsearch choose automatically,
# override them here or in the 'elasticsearch_config_file' if you need to bind to a special address
# refer to https://www.elastic.co/guide/en/elasticsearch/reference/2.3/modules-network.html
# for special values here
#elasticsearch_network_host =
elasticsearch_network_host = 172.17.3.148
#elasticsearch_network_bind_host =
#elasticsearch_network_publish_host =

# The total amount of time discovery will look for other Elasticsearch nodes in the cluster
# before giving up and declaring the current node master.
#elasticsearch_discovery_initial_state_timeout = 3s

# Analyzer (tokenizer) to use for message and full_message field. The "standard" filter usually is a good idea.
# All supported analyzers are: standard, simple, whitespace, stop, keyword, pattern, language, snowball, custom
# Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis.html
# Note that this setting only takes effect on newly created indices.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
elasticsearch_analyzer = standard

# Global request timeout for Elasticsearch requests (e. g. during search, index creation, or index time-range
# calculations) based on a best-effort to restrict the runtime of Elasticsearch operations.
# Default: 1m
#elasticsearch_request_timeout = 1m

# Global timeout for index optimization (force merge) requests.
# Default: 1h
#elasticsearch_index_optimization_timeout = 1h

# Maximum number of concurrently running index optimization (force merge) jobs.
# If you are using lots of different index sets, you might want to increase that number.
# Default: 20
#elasticsearch_index_optimization_jobs = 20

# Time interval for index range information cleanups. This setting defines how often stale index range information
# is being purged from the database.
# Default: 1h
#index_ranges_cleanup_interval = 1h

# Batch size for the Elasticsearch output. This is the maximum (!) number of messages the Elasticsearch output
# module will get at once and write to Elasticsearch in a batch call. If the configured batch size has not been
# reached within output_flush_interval seconds, everything that is available will be flushed at once. Remember
# that every outputbuffer processor manages its own batch and performs its own batch write calls.
# ("outputbuffer_processors" variable)
output_batch_size = 4000

# Flush interval (in seconds) for the Elasticsearch output. This is the maximum amount of time between two
# batches of messages written to Elasticsearch. It is only effective at all if your minimum number of messages
# for this time period is less than output_batch_size * outputbuffer_processors.
output_flush_interval = 5

# As stream outputs are loaded only on demand, an output which is failing to initialize will be tried over and
# over again. To prevent this, the following configuration options define after how many faults an output will
# not be tried again for an also configurable amount of seconds.
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30

# The number of parallel running processors.
# Raise this number if your buffers are filling up.
processbuffer_processors = 7
outputbuffer_processors = 1

#outputbuffer_processor_keep_alive_time = 5000
#outputbuffer_processor_threads_core_pool_size = 3
#outputbuffer_processor_threads_max_pool_size = 30

# UDP receive buffer size for all message inputs (e. g. SyslogUDPInput).
#udp_recvbuffer_sizes = 1048576

# Wait strategy describing how buffer processors wait on a cursor sequence. (default: sleeping)
# Possible types:
#  - yielding
#     Compromise between performance and CPU usage.
#  - sleeping
#     Compromise between performance and CPU usage. Latency spikes can occur after quiet periods.
#  - blocking
#     High throughput, low latency, higher CPU usage.
#  - busy_spinning
#     Avoids syscalls which could introduce latency jitter. Best when threads can be bound to specific CPU cores.
processor_wait_strategy = blocking

# Size of internal ring buffers. Raise this if raising outputbuffer_processors does not help anymore.
# For optimum performance your LogMessage objects in the ring buffer should fit in your CPU L3 cache.
# Must be a power of 2. (512, 1024, 2048, ...)
ring_size = 65536

inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking

# Enable the disk based message journal.
message_journal_enabled = true

# The directory which will be used to store the message journal. The directory must be exclusively used by Graylog and
# must not contain any other files than the ones created by Graylog itself.
#
# ATTENTION:
#   If you create a separate partition for the journal files and use a file system creating directories like 'lost+found'
#   in the root directory, you need to create a sub directory for your journal.
#   Otherwise Graylog will log an error message that the journal is corrupt and Graylog will not start.
message_journal_dir = /daten/graylog-journal

# Journal hold messages before they could be written to Elasticsearch.
# For a maximum of 12 hours or 5 GB whichever happens first.
# During normal operation the journal will be smaller.
#message_journal_max_age = 12h
message_journal_max_age = 3d
#message_journal_max_size = 5gb
message_journal_max_size = 100gb

#message_journal_flush_age = 1m
#message_journal_flush_interval = 1000000
#message_journal_segment_age = 1h
#message_journal_segment_size = 100mb

# Number of threads used exclusively for dispatching internal events. Default is 2.
#async_eventbus_processors = 2

# How many seconds to wait between marking node as DEAD for possible load balancers and starting the actual
# shutdown process. Set to 0 if you have no status checking load balancers in front.
lb_recognition_period_seconds = 3

# Journal usage percentage that triggers requesting throttling for this server node from load balancers. The feature is
# disabled if not set.
#lb_throttle_threshold_percentage = 95

# Every message is matched against the configured streams and it can happen that a stream contains rules which
# take an unusual amount of time to run, for example if its using regular expressions that perform excessive backtracking.
# This will impact the processing of the entire server. To keep such misbehaving stream rules from impacting other
# streams, Graylog limits the execution time for each stream.
# The default values are noted below, the timeout is in milliseconds.
# If the stream matching for one stream took longer than the timeout value, and this happened more than "max_faults" times
# that stream is disabled and a notification is shown in the web interface.
#stream_processing_timeout = 2000
#stream_processing_max_faults = 3

# Length of the interval in seconds in which the alert conditions for all streams should be checked
# and alarms are being sent.
#alert_check_interval = 60

# Since 0.21 the Graylog server supports pluggable output modules. This means a single message can be written to multiple
# outputs. The next setting defines the timeout for a single output module, including the default output module where all
# messages end up.
#
# Time in milliseconds to wait for all message outputs to finish writing a single message.
#output_module_timeout = 10000

# Time in milliseconds after which a detected stale master node is being rechecked on startup.
#stale_master_timeout = 2000

# Time in milliseconds which Graylog is waiting for all threads to stop on shutdown.
#shutdown_timeout = 30000

# MongoDB connection string
# See https://docs.mongodb.com/manual/reference/connection-string/ for details
mongodb_uri = mongodb://mgmlog14,mgmlog13,mgmlog12,mgmlog11,mgmgray02,mgmgray04/graylog

# Authenticate against the MongoDB server
#mongodb_uri = mongodb://mgmlog14,mgmlog13,mgmlog12,mgmlog11,mgmgray02,mgmgray04/graylog

# Use a replica set instead of a single host
#mongodb_uri = mongodb://mgmlog14,mgmlog13,mgmlog12,mgmlog11,mgmgray02,mgmgray04/graylog

# Increase this value according to the maximum connections your MongoDB server can handle from a single client
# if you encounter MongoDB connection problems.
mongodb_max_connections = 1000

# Number of threads allowed to be blocked by MongoDB connections multiplier. Default: 5
# If mongodb_max_connections is 100, and mongodb_threads_allowed_to_block_multiplier is 5,
# then 500 threads can block. More than that and an exception will be thrown.
# http://api.mongodb.com/java/current/com/mongodb/MongoOptions.html#threadsAllowedToBlockForConnectionMultiplier
mongodb_threads_allowed_to_block_multiplier = 5

# Drools Rule File (Use to rewrite incoming log messages)
# See: http://docs.graylog.org/en/2.1/pages/drools.html
#rules_file = /etc/graylog/server/rules.drl
rules_file = /etc/graylog/server/graylog.drl

# Email transport
transport_email_from_email = graylog2@graylog.adm.mgm.plis
transport_email_subject_prefix = [graylog2]
transport_email_use_ssl = false
transport_email_use_tls = false
transport_email_use_auth = false
transport_email_port = 25
transport_email_hostname = localhost
transport_email_enabled = true
#transport_email_enabled = false
#transport_email_hostname = mail.example.com
#transport_email_port = 587
#transport_email_use_auth = true
#transport_email_use_tls = true
#transport_email_use_ssl = true
#transport_email_auth_username = you@example.com
#transport_email_auth_password = secret
#transport_email_subject_prefix = [graylog]
#transport_email_from_email = graylog@example.com

# Specify and uncomment this if you want to include links to the stream in your stream alert mails.
# This should define the fully qualified base url to your web interface exactly the same way as it is accessed by your users.
#transport_email_web_interface_url = https://graylog.example.com
transport_email_web_interface_url = https://172.17.3.134

# The default connect timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 5s
#http_connect_timeout = 5s

# The default read timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 10s
#http_read_timeout = 10s

# The default write timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 10s
#http_write_timeout = 10s

# HTTP proxy for outgoing HTTP connections
#http_proxy_uri =

# Disable the optimization of Elasticsearch indices after index cycling. This may take some load from Elasticsearch
# on heavily used systems with large indices, but it will decrease search performance. The default is to optimize
# cycled indices.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#disable_index_optimization = true

# Optimize the index down to <= index_optimization_max_num_segments. A higher number may take some load from Elasticsearch
# on heavily used systems with large indices, but it will decrease search performance. The default is 1.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#index_optimization_max_num_segments = 1

# The threshold of the garbage collection runs. If GC runs take longer than this threshold, a system notification
# will be generated to warn the administrator about possible problems with the system. Default is 1 second.
#gc_warning_threshold = 1s

# Connection timeout for a configured LDAP server (e. g. ActiveDirectory) in milliseconds.
#ldap_connection_timeout = 2000

# Disable the use of SIGAR for collecting system stats
#disable_sigar = false

# The default cache time for dashboard widgets. (Default: 10 seconds, minimum: 1 second)
#dashboard_widget_default_cache_time = 10s

# Automatically load content packs in "content_packs_dir" on the first start of Graylog.
#content_packs_loader_enabled = true

# The directory which contains content packs which should be loaded on the first start of Graylog.
#content_packs_dir = data/contentpacks

# A comma-separated list of content packs (files in "content_packs_dir") which should be applied on
# the first start of Graylog.
# Default: empty
content_packs_auto_load = grok-patterns.json

# For some cluster-related REST requests, the node must query all other nodes in the cluster. This is the maximum number
# of threads available for this. Increase it, if '/cluster/*' requests take long to complete.
# Should be rest_thread_pool_size * average_cluster_size if you have a high number of concurrent users.
proxied_requests_thread_pool_size = 32

(Jochen) #4

Have you been using the official RPM package to install Graylog on SLES 12-SP2?
What’s the complete log output of Graylog?
What’s the installation directory of Graylog and what’s the working directory of the Java process?
(You might want to check the permissions on /usr/share/graylog-server/data; also see http://docs.graylog.org/en/2.2/pages/configuration/file_location.html#rpm-package)

What’s the output of the following commands?

namei -l /daten/graylog-journal
namei -l /usr/share/graylog-server/data
namei -l /var/lib/graylog-server/journal


(Meerkampj) #5

We deploy the tgz to /opt/graylog/graylog-2.2.3.
There is a symlink, /opt/graylog/server -> /opt/graylog/graylog-2.2.3, which gets updated with each new release.
Elasticsearch will not be installed on this node; we want to separate Graylog and ES.

/usr/share/graylog-server/data and /var/lib/graylog-server/journal do not exist on this node.

namei -l /daten/graylog-journal
f: /daten/graylog-journal
drwxr-xr-x root    root    /
drwxr-xr-x root    root    daten
drwxr-xr-x graylog graylog graylog-journal

I modified the systemd unit file:

[Unit]
Description=Graylog server
Documentation=http://docs.graylog.org/
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
#Restart=on-failure
Restart=no
RestartSec=10
User=graylog
Group=graylog
LimitNOFILE=65535
LimitNPROC=1033211
LimitSIGPENDING=1033211

ExecStart=/opt/graylog/server/bin/graylog-server

# When a JVM receives a SIGTERM signal it exits with 143.
SuccessExitStatus=143

# Make sure stderr/stdout is captured in the systemd journal.
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target

/opt/graylog/server/bin/graylog-server

#!/bin/sh

set -e

# For Debian/Ubuntu based systems.
if [ -f "/etc/default/graylog-server" ]; then
    . "/etc/default/graylog-server"
fi

# For RedHat/Fedora based systems.
if [ -f "/etc/sysconfig/graylog-server" ]; then
    . "/etc/sysconfig/graylog-server"
fi

if [ -f "/usr/share/graylog-server/installation-source.sh" ]; then
    . "/usr/share/graylog-server/installation-source.sh"
fi

$GRAYLOG_COMMAND_WRAPPER ${JAVA:=/usr/bin/java} $GRAYLOG_SERVER_JAVA_OPTS \
    -jar -Djava.library.path=/opt/graylog/server/lib/sigar/ \
    /opt/graylog/server/graylog.jar server -f /etc/graylog/server/server.conf -np \
    $GRAYLOG_SERVER_ARGS

The log is too long to post, so here it is: https://pastebin.com/pGLCUmdQ


(Jochen) #6

Please post the output of the following commands:

namei -l /opt/graylog/server
namei -l /opt/graylog/graylog-2.2.3
namei -l /opt/graylog/graylog-2.2.3/data
namei -l /etc/graylog/server/graylog.drl

The problem seems to be the embedded Elasticsearch node in Graylog, which is unable to start properly.

Just out of curiosity, is there a specific reason you’re staying on Graylog 2.2.x and don’t upgrade to Graylog 2.3.x?


(Meerkampj) #7

I will update to 2.3.x the moment I get a maintenance window (in about a week or two), but I also want to get the new nodes up and running, as they are desperately needed.

namei -l /opt/graylog/server
f: /opt/graylog/server
drwxr-xr-x root    root    /
drwxr-xr-x root    root    opt
drwxr-xr-x graylog graylog graylog
lrwxrwxrwx root    root    server -> /opt/graylog/graylog-2.2.3
drwxr-xr-x root    root      /
drwxr-xr-x root    root      opt
drwxr-xr-x graylog graylog   graylog
drwxr-xr-x root    root      graylog-2.2.3

namei -l /opt/graylog/graylog-2.2.3
f: /opt/graylog/graylog-2.2.3
drwxr-xr-x root    root    /
drwxr-xr-x root    root    opt
drwxr-xr-x graylog graylog graylog
drwxr-xr-x root    root    graylog-2.2.3

namei -l /opt/graylog/graylog-2.2.3/data
f: /opt/graylog/graylog-2.2.3/data
drwxr-xr-x root    root    /
drwxr-xr-x root    root    opt
drwxr-xr-x graylog graylog graylog
drwxr-xr-x root    root    graylog-2.2.3
drwxr-xr-x root    root    data

namei -l /etc/graylog/server/graylog.drl
f: /etc/graylog/server/graylog.drl
drwxr-xr-x root root /
drwxr-xr-x root root etc
drwxr-xr-x root root graylog
drwxr-xr-x root root server
-rw-r----- root root graylog.drl

I already fixed the drl issue :zipper_mouth_face:

namei -l /etc/graylog/server/graylog.drl
f: /etc/graylog/server/graylog.drl
drwxr-xr-x root    root    /
drwxr-xr-x root    root    etc
drwxr-xr-x root    root    graylog
drwxr-xr-x root    root    server
-rw-r----- graylog graylog graylog.drl

(Meerkampj) #8

I forgot to mention that I also want to update the running cluster to SLES 12 SP2, so that's why this node is on a different service pack.


(Jochen) #9

Try using another directory for temporary files (writable by Graylog) or fix the mount options of your /tmp directory.

See https://support.cloudbees.com/hc/en-us/articles/215281717-Jenkins-fails-to-start-with-JNA-error for a related KB article for Jenkins (for the same issue).
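A minimal sketch of how one might check this, assuming `findmnt` from util-linux is available. The directory /var/tmp/graylog is a hypothetical example, not a Graylog default; it would have to exist and be writable by the graylog user. `java.io.tmpdir` is a standard JVM system property.

```shell
#!/bin/sh
# Return 0 if the given comma-separated mount-options string contains "noexec".
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Mount options of the filesystem that holds /tmp (empty if findmnt is missing).
tmp_opts=$(findmnt -n -o OPTIONS --target /tmp 2>/dev/null || true)

if has_noexec "$tmp_opts"; then
    echo "/tmp is mounted noexec; native libraries extracted there cannot be loaded."
    echo "One option is to point the JVM at another directory in the sourced defaults file:"
    # /var/tmp/graylog is a hypothetical path chosen for illustration.
    echo 'GRAYLOG_SERVER_JAVA_OPTS="$GRAYLOG_SERVER_JAVA_OPTS -Djava.io.tmpdir=/var/tmp/graylog"'
else
    echo "/tmp does not appear to be mounted noexec (options: ${tmp_opts:-unknown})."
fi
```

The start script posted above sources /etc/sysconfig/graylog-server if it exists, so that file would be one place to set the option.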


(Meerkampj) #10

Our hardening sets noexec on the /tmp mount.

I fixed this, and the UnsatisfiedLinkError […] message no longer appears.

There is still no started message from the inputs, and I still get:
ERROR: org.graylog2.shared.bindings.SchedulerBindings - Thread scheduled-55 failed by not catching exception: java.lang.OutOfMemoryError: unable to create new native thread.

I'm lost :confused: I will clean up the server tomorrow morning and reinstall with the RPM, so it may be easier to locate the error.

For now, thank you very much for your help.


(Jochen) #11

Are there any kernel security extensions or system limits (e. g. standard ulimits, grsecurity, SELinux, AppArmor) in place which might deny creating new threads (or a certain number of threads)?


(Meerkampj) #12

The ulimits are set via the .service file and then checked in /proc/<pid>/limits (shown in the first post).
No security extension is currently active.
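The values in /proc/<pid>/limits do not cover every ceiling on thread creation. A hedged sketch of a few further things that could be inspected is below. The unit name `graylog-server.service` is an assumption (adjust it to the name the unit file was installed under), and `TasksMax` is only reported by systemd versions that implement it. Whether any of these is the actual cause on this node is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# Hypothetical unit name; change it to match the installed unit file.
unit=graylog-server.service

# Print the part after the first "=" of a "Key=Value" line (systemctl show format).
prop_value() {
    printf '%s\n' "${1#*=}"
}

# systemd counts threads as tasks; a per-unit task limit is not shown in /proc/<pid>/limits.
if command -v systemctl >/dev/null 2>&1; then
    line=$(systemctl show "$unit" -p TasksMax 2>/dev/null || true)
    echo "TasksMax for $unit: $(prop_value "$line")"
fi

# Kernel-wide ceilings that also bound the number of threads.
for f in /proc/sys/kernel/threads-max /proc/sys/kernel/pid_max /proc/sys/vm/max_map_count; do
    if [ -r "$f" ]; then
        echo "$f: $(cat "$f")"
    fi
done

# Current thread count of the Graylog JVM, if one is running.
if command -v pgrep >/dev/null 2>&1; then
    pid=$(pgrep -f graylog.jar | head -n 1 || true)
    if [ -n "$pid" ] && [ -r "/proc/$pid/status" ]; then
        grep '^Threads:' "/proc/$pid/status"
    fi
fi
```

If `TasksMax` turns out to be a small number, raising it in the [Service] section next to the existing Limit* lines (for example `TasksMax=infinity`) would be one thing to try.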


(system) #13

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.