ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /127.0.0.1:9200. - Connection refused (Connection refused)

cesq · December 13, 2022, 3:05pm

I updated my Graylog server from 4.2.7 to 4.3.9 yesterday, and everything was working perfectly. Today, when I started the server, I got a “connection refused” error when trying to access the UI.
I have looked at the logs and narrowed it down to this:

INFO  [VersionProbe] Elasticsearch is not available. Retry #1
ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /127.0.0.1:9200. - Connection refused (Connection refused).
INFO  [VersionProbe] Elasticsearch is not available. Retry #2
ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /127.0.0.1:9200. - Connection refused (Connection refused).
...

It goes on like this for a while. I have made sure that the correct version of Elasticsearch is installed: elasticsearch-oss.x86_64 (7.10.2-1).
I have also googled this error, but people mostly seem to hit it when running OpenSearch instead of Elasticsearch, which is not the case here.
I tried setting elastic_search_version = 7 in my server.conf file, but it didn’t solve the issue.
I’ve also made sure that this isn’t a firewall or SELinux issue, so now I really don’t know what to do.
My server.conf file:





is_master = true


node_id_file = /etc/graylog/server/node-id


password_secret = <super secret>

# The default root user is named 'admin'
#root_username = admin


root_password_sha2 = <supersupersecret>

# The email address of the root user.
# Default is empty
#root_email = ""

# The time zone setting of the root user. See http://www.joda.org/joda-time/timezones.html for a list of valid time zones.
# Default is UTC
#root_timezone = UTC


bin_dir = /usr/share/graylog-server/bin

# Set the data directory here (relative or absolute)
# This directory is used to store Graylog server state.
# Default: data
data_dir = /var/lib/graylog-server

# Set plugin directory here (relative or absolute)
plugin_dir = /usr/share/graylog-server/plugin

###############
# HTTP settings
###############

#### HTTP bind address
#
# The network interface used by the Graylog HTTP interface.
#
# This network interface must be accessible by all Graylog nodes in the cluster and by all clients
# using the Graylog web interface.
#
# If the port is omitted, Graylog will use port 9000 by default.
#
# Default: 127.0.0.1:9000
http_bind_address = 192.168.100.45:9000
#http_bind_address = [2001:db8::1]:9000

#### HTTP publish URI
#
# The HTTP URI of this Graylog node which is used to communicate with the other Graylog nodes in the cluster and by all
# clients using the Graylog web interface.
#

#http_publish_uri = http://127.0.0.1:9000/

#### External Graylog URI
#
# The public URI of Graylog which will be used by the Graylog web interface to communicate with the Graylog REST API.
#
# The external Graylog URI usually has to be specified, if Graylog is running behind a reverse proxy or load-balancer
# and it will be used to generate URLs addressing entities in the Graylog REST API (see $http_bind_address).
#
# When using Graylog Collector, this URI will be used to receive heartbeat messages and must be accessible for all collectors.
#
# This setting can be overridden on a per-request basis with the "X-Graylog-Server-URL" HTTP request header.
#
# Default: $http_publish_uri
#http_external_uri =

#### Enable CORS headers for HTTP interface
#
# This allows browsers to make Cross-Origin requests from any origin.
# This is disabled for security reasons and typically only needed if running graylog
# with a separate server for frontend development.
#
# Default: false
#http_enable_cors = false

#### Enable GZIP support for HTTP interface
#
# This compresses API responses and therefore helps to reduce
# overall round trip times. This is enabled by default. Uncomment the next line to disable it.
#http_enable_gzip = false

# The maximum size of the HTTP request headers in bytes.
#http_max_header_size = 8192

# The size of the thread pool used exclusively for serving the HTTP interface.
#http_thread_pool_size = 16

################
# HTTPS settings
################

#### Enable HTTPS support for the HTTP interface
#
# This secures the communication with the HTTP interface with TLS to prevent request forgery and eavesdropping.
#
# Default: false
#http_enable_tls = true

# The X.509 certificate chain file in PEM format to use for securing the HTTP interface.
#http_tls_cert_file = /path/to/graylog.crt

# The PKCS#8 private key file in PEM format to use for securing the HTTP interface.
#http_tls_key_file = /path/to/graylog.key

# The password to unlock the private key used for securing the HTTP interface.
#http_tls_key_password = secret


# Comma separated list of trusted proxies that are allowed to set the client address with X-Forwarded-For
# header. May be subnets, or hosts.
#trusted_proxies = 127.0.0.1/32, 0:0:0:0:0:0:0:1/128

# List of Elasticsearch hosts Graylog should connect to.
# Need to be specified as a comma-separated list of valid URIs for the http ports of your elasticsearch nodes.
# If one or more of your elasticsearch hosts require authentication, include the credentials in each node URI that
# requires authentication.
#
# Default: http://127.0.0.1:9200
#elasticsearch_hosts = http://node1:9200,http://user:password@node2:19200

# Maximum number of retries to connect to elasticsearch on boot for the version probe.
#
# Default: 0, retry indefinitely with the given delay until a connection could be established
#elasticsearch_version_probe_attempts = 5

# Waiting time in between connection attempts for elasticsearch_version_probe_attempts
#
# Default: 5s
#elasticsearch_version_probe_delay = 5s

# Maximum amount of time to wait for successful connection to Elasticsearch HTTP port.
#
# Default: 10 Seconds
#elasticsearch_connect_timeout = 10s

# Maximum amount of time to wait for reading back a response from an Elasticsearch server.
# (e. g. during search, index creation, or index time-range calculations)
#
# Default: 60 seconds
#elasticsearch_socket_timeout = 60s

# Maximum idle time for an Elasticsearch connection. If this is exceeded, this connection will
# be torn down.

# Frequency of the Elasticsearch node discovery.
#
# Default: 30s
# elasticsearch_discovery_frequency = 30s

# Set the default scheme when connecting to Elasticsearch discovered nodes
#
# Default: http (available options: http, https)
#elasticsearch_discovery_default_scheme = http

# Enable payload compression for Elasticsearch requests.
#
# Default: false
#elasticsearch_compression_enabled = true


rotation_strategy = count

# (Approximate) maximum number of documents in an Elasticsearch index before a new index
# is being created, also see no_retention and elasticsearch_max_number_of_indices.
# Configure this if you used 'rotation_strategy = count' above.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_max_docs_per_index = 20000000

# (Approximate) maximum size in bytes per Elasticsearch index on disk before a new index is being created, also see
# no_retention and elasticsearch_max_number_of_indices. Default is 1GB.
# Configure this if you used 'rotation_strategy = size' above.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#elasticsearch_max_size_per_index = 1073741824

# (Approximate) maximum time before a new Elasticsearch index is being created, also see
# no_retention and elasticsearch_max_number_of_indices. Default is 1 day.
# Configure this if you used 'rotation_strategy = time' above.
# Please note that this rotation period does not look at the time specified in the received messages, but is
# using the real clock value to decide when to rotate the index!
# Specify the time using a duration and a suffix indicating which unit you want:
#  1w  = 1 week
#  1d  = 1 day
#  12h = 12 hours
# Permitted suffixes are: d for day, h for hour, m for minute, s for second.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#elasticsearch_max_time_per_index = 1d

# Optional upper bound on elasticsearch_max_time_per_index
# elasticsearch_max_write_index_age = 1d

# Disable checking the version of Elasticsearch for being compatible with this Graylog release.
# WARNING: Using Graylog with unsupported and untested versions of Elasticsearch may lead to data loss!
#elasticsearch_disable_version_check = true
elastic_search_version = 7
# Disable message retention on this node, i. e. disable Elasticsearch index rotation.
#no_retention = false

# How many indices do you want to keep?
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_max_number_of_indices = 20

# Decide what happens with the oldest indices when the maximum number of indices is reached.
# The following strategies are available:
#   - delete # Deletes the index completely (Default)
#   - close # Closes the index and hides it from the system. Can be re-opened later.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
retention_strategy = delete

# How many Elasticsearch shards and replicas should be used per index? Note that this only applies to newly created indices.
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_shards = 1
elasticsearch_replicas = 0

# Prefix for all Elasticsearch indices and index aliases managed by Graylog.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_index_prefix = graylog

# Name of the Elasticsearch index template used by Graylog to apply the mandatory index mapping.
# Default: graylog-internal
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#elasticsearch_template_name = graylog-internal

# Do you want to allow searches with leading wildcards? This can be extremely resource hungry and should only
# be enabled with care. See also: https://docs.graylog.org/docs/query-language
allow_leading_wildcard_searches = false

# Do you want to allow searches to be highlighted? Depending on the size of your messages this can be memory hungry and
# should only be enabled after making sure your Elasticsearch cluster has enough memory.
allow_highlighting = false

# Analyzer (tokenizer) to use for message and full_message field. The "standard" filter usually is a good idea.
# All supported analyzers are: standard, simple, whitespace, stop, keyword, pattern, language, snowball, custom
# Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis.html
# Note that this setting only takes effect on newly created indices.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_analyzer = standard

# Global timeout for index optimization (force merge) requests.
# Default: 1h
#elasticsearch_index_optimization_timeout = 1h

# Maximum number of concurrently running index optimization (force merge) jobs.
# If you are using lots of different index sets, you might want to increase that number.
# Default: 20
#elasticsearch_index_optimization_jobs = 20

# Mute the logging-output of ES deprecation warnings during REST calls in the ES RestClient
#elasticsearch_mute_deprecation_warnings = true

# Time interval for index range information cleanups. This setting defines how often stale index range information
# is being purged from the database.
# Default: 1h
#index_ranges_cleanup_interval = 1h

# Time interval for the job that runs index field type maintenance tasks like cleaning up stale entries. This doesn't
# need to run very often.
# Default: 1h
#index_field_type_periodical_interval = 1h

# Batch size for the Elasticsearch output. This is the maximum (!) number of messages the Elasticsearch output
# module will get at once and write to Elasticsearch in a batch call. If the configured batch size has not been
# reached within output_flush_interval seconds, everything that is available will be flushed at once. Remember
# that every outputbuffer processor manages its own batch and performs its own batch write calls.
# ("outputbuffer_processors" variable)
output_batch_size = 500

# Flush interval (in seconds) for the Elasticsearch output. This is the maximum amount of time between two
# batches of messages written to Elasticsearch. It is only effective at all if your minimum number of messages
# for this time period is less than output_batch_size * outputbuffer_processors.
output_flush_interval = 1

# As stream outputs are loaded only on demand, an output which is failing to initialize will be tried over and
# over again. To prevent this, the following configuration options define after how many faults an output will
# not be tried again for an also configurable amount of seconds.
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30

# The number of parallel running processors.
# Raise this number if your buffers are filling up.
processbuffer_processors = 5
outputbuffer_processors = 3

# The following settings (outputbuffer_processor_*) configure the thread pools backing each output buffer processor.
# See https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html for technical details

# When the number of threads is greater than the core (see outputbuffer_processor_threads_core_pool_size),
# this is the maximum time in milliseconds that excess idle threads will wait for new tasks before terminating.
# Default: 5000
#outputbuffer_processor_keep_alive_time = 5000

# The number of threads to keep in the pool, even if they are idle, unless allowCoreThreadTimeOut is set
# Default: 3
#outputbuffer_processor_threads_core_pool_size = 3

# The maximum number of threads to allow in the pool
# Default: 30
#outputbuffer_processor_threads_max_pool_size = 30

# UDP receive buffer size for all message inputs (e. g. SyslogUDPInput).
#udp_recvbuffer_sizes = 1048576

# Wait strategy describing how buffer processors wait on a cursor sequence. (default: sleeping)
# Possible types:
#  - yielding
#     Compromise between performance and CPU usage.
#  - sleeping
#     Compromise between performance and CPU usage. Latency spikes can occur after quiet periods.
#  - blocking
#     High throughput, low latency, higher CPU usage.
#  - busy_spinning
#     Avoids syscalls which could introduce latency jitter. Best when threads can be bound to specific CPU cores.
processor_wait_strategy = blocking

# Size of internal ring buffers. Raise this if raising outputbuffer_processors does not help anymore.
# For optimum performance your LogMessage objects in the ring buffer should fit in your CPU L3 cache.
# Must be a power of 2. (512, 1024, 2048, ...)
ring_size = 65536

inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking

# Enable the message journal.
message_journal_enabled = true

# The directory which will be used to store the message journal. The directory must be exclusively used by Graylog and
# must not contain any other files than the ones created by Graylog itself.
#
# ATTENTION:
#   If you create a separate partition for the journal files and use a file system creating directories like 'lost+found'
#   in the root directory, you need to create a sub directory for your journal.
#   Otherwise Graylog will log an error message that the journal is corrupt and Graylog will not start.
message_journal_dir = /var/lib/graylog-server/journal

# Journal hold messages before they could be written to Elasticsearch.
# For a maximum of 12 hours or 5 GB whichever happens first.
# During normal operation the journal will be smaller.
#message_journal_max_age = 12h
#message_journal_max_size = 5gb

#message_journal_flush_age = 1m
#message_journal_flush_interval = 1000000
#message_journal_segment_age = 1h
#message_journal_segment_size = 100mb

# Number of threads used exclusively for dispatching internal events. Default is 2.
#async_eventbus_processors = 2

# How many seconds to wait between marking node as DEAD for possible load balancers and starting the actual
# shutdown process. Set to 0 if you have no status checking load balancers in front.
lb_recognition_period_seconds = 3

# Journal usage percentage that triggers requesting throttling for this server node from load balancers. The feature is
# disabled if not set.
#lb_throttle_threshold_percentage = 95

# Every message is matched against the configured streams and it can happen that a stream contains rules which
# take an unusual amount of time to run, for example if its using regular expressions that perform excessive backtracking.
# This will impact the processing of the entire server. To keep such misbehaving stream rules from impacting other
# streams, Graylog limits the execution time for each stream.
# The default values are noted below, the timeout is in milliseconds.
# If the stream matching for one stream took longer than the timeout value, and this happened more than "max_faults" times
# that stream is disabled and a notification is shown in the web interface.
#stream_processing_timeout = 2000
#stream_processing_max_faults = 3

# Since 0.21 the Graylog server supports pluggable output modules. This means a single message can be written to multiple
# outputs. The next setting defines the timeout for a single output module, including the default output module where all
# messages end up.
#
# Time in milliseconds to wait for all message outputs to finish writing a single message.
#output_module_timeout = 10000

# Time in milliseconds after which a detected stale master node is being rechecked on startup.
#stale_master_timeout = 2000

# Time in milliseconds which Graylog is waiting for all threads to stop on shutdown.
#shutdown_timeout = 30000

# MongoDB connection string
# See https://docs.mongodb.com/manual/reference/connection-string/ for details
mongodb_uri = mongodb://localhost/graylog

# Authenticate against the MongoDB server
# '+'-signs in the username or password need to be replaced by '%2B'
#mongodb_uri = mongodb://grayloguser:secret@localhost:27017/graylog

# Use a replica set instead of a single host
#mongodb_uri = mongodb://grayloguser:secret@localhost:27017,localhost:27018,localhost:27019/graylog?replicaSet=rs01

# DNS Seedlist https://docs.mongodb.com/manual/reference/connection-string/#dns-seedlist-connection-format
#mongodb_uri = mongodb+srv://server.example.org/graylog

# Increase this value according to the maximum connections your MongoDB server can handle from a single client
# if you encounter MongoDB connection problems.
mongodb_max_connections = 1000

# Number of threads allowed to be blocked by MongoDB connections multiplier. Default: 5
# If mongodb_max_connections is 100, and mongodb_threads_allowed_to_block_multiplier is 5,
# then 500 threads can block. More than that and an exception will be thrown.
# http://api.mongodb.com/java/current/com/mongodb/MongoOptions.html#threadsAllowedToBlockForConnectionMultiplier
mongodb_threads_allowed_to_block_multiplier = 5


# Email transport
#transport_email_enabled = false
#transport_email_hostname = mail.example.com
#transport_email_port = 587
#transport_email_use_auth = true
#transport_email_auth_username = you@example.com
#transport_email_auth_password = secret
#transport_email_subject_prefix = [graylog]
#transport_email_from_email = graylog@example.com

# Encryption settings
#
# ATTENTION:
#    Using SMTP with STARTTLS *and* SMTPS at the same time is *not* possible.

# Use SMTP with STARTTLS, see https://en.wikipedia.org/wiki/Opportunistic_TLS
#transport_email_use_tls = true

# Use SMTP over SSL (SMTPS), see https://en.wikipedia.org/wiki/SMTPS
# This is deprecated on most SMTP services!
#transport_email_use_ssl = false


# Specify and uncomment this if you want to include links to the stream in your stream alert mails.
# This should define the fully qualified base url to your web interface exactly the same way as it is accessed by your users.
#transport_email_web_interface_url = https://graylog.example.com

# The default connect timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 5s
#http_connect_timeout = 5s

# The default read timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 10s
#http_read_timeout = 10s

# The default write timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 10s
#http_write_timeout = 10s

# HTTP proxy for outgoing HTTP connections
# ATTENTION: If you configure a proxy, make sure to also configure the "http_non_proxy_hosts" option so internal
#            HTTP connections with other nodes do not go through the proxy.
# Examples:
#   - http://proxy.example.com:8123
#   - http://username:password@proxy.example.com:8123
#http_proxy_uri =

# A list of hosts that should be reached directly, bypassing the configured proxy server.
# This is a list of patterns separated by ",". The patterns may start or end with a "*" for wildcards.
# Any host matching one of these patterns will be reached through a direct connection instead of through a proxy.
# Examples:
#   - localhost,127.0.0.1
#   - 10.0.*,*.example.com
#http_non_proxy_hosts =

# Disable the optimization of Elasticsearch indices after index cycling. This may take some load from Elasticsearch
# on heavily used systems with large indices, but it will decrease search performance. The default is to optimize
# cycled indices.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#disable_index_optimization = true

# Optimize the index down to <= index_optimization_max_num_segments. A higher number may take some load from Elasticsearch
# on heavily used systems with large indices, but it will decrease search performance. The default is 1.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#index_optimization_max_num_segments = 1

# The threshold of the garbage collection runs. If GC runs take longer than this threshold, a system notification
# will be generated to warn the administrator about possible problems with the system. Default is 1 second.
#gc_warning_threshold = 1s

# Connection timeout for a configured LDAP server (e. g. ActiveDirectory) in milliseconds.
#ldap_connection_timeout = 2000

# Disable the use of a native system stats collector (currently OSHI)
disable_native_system_stats_collector = true

# The default cache time for dashboard widgets. (Default: 10 seconds, minimum: 1 second)
#dashboard_widget_default_cache_time = 10s

# For some cluster-related REST requests, the node must query all other nodes in the cluster. This is the maximum number
# of threads available for this. Increase it, if '/cluster/*' requests take long to complete.
# Should be http_thread_pool_size * average_cluster_size if you have a high number of concurrent users.
proxied_requests_thread_pool_size = 32

# The server is writing processing status information to the database on a regular basis. This setting controls how
# often the data is written to the database.
# Default: 1s (cannot be less than 1s)
#processing_status_persist_interval = 1s

# Configures the threshold for detecting outdated processing status records. Any records that haven't been updated
# in the configured threshold will be ignored.
# Default: 1m (one minute)
#processing_status_update_threshold = 1m

# Configures the journal write rate threshold for selecting processing status records. Any records that have a lower
# one minute rate than the configured value might be ignored. (dependent on number of messages in the journal)
# Default: 1
#processing_status_journal_write_rate_threshold = 1

# Configures the prefix used for graylog event indices
# Default: gl-events
#default_events_index_prefix = gl-events

# Configures the prefix used for graylog system event indices
# Default: gl-system-events
#default_system_events_index_prefix = gl-system-events

# Automatically load content packs in "content_packs_dir" on the first start of Graylog.
#content_packs_loader_enabled = false

# The directory which contains content packs which should be loaded on the first start of Graylog.
#content_packs_dir = data/contentpacks

# A comma-separated list of content packs (files in "content_packs_dir") which should be applied on
# the first start of Graylog.
# Default: empty
#content_packs_auto_install = grok-patterns.json

# The allowed TLS protocols for system wide TLS enabled servers. (e.g. message inputs, http interface)
# Setting this to an empty value, leaves it up to system libraries and the used JDK to choose a default.
# Default: TLSv1.2,TLSv1.3  (might be automatically adjusted to protocols supported by the JDK)
#enabled_tls_protocols= TLSv1.2,TLSv1.3

# Enable Prometheus exporter HTTP server.
# Default: false
#prometheus_exporter_enabled = false

# IP address and port for the Prometheus exporter HTTP server.
# Default: 127.0.0.1:9833
#prometheus_exporter_bind_address = 127.0.0.1:9833

# Path to the Prometheus exporter core mapping file. If this option is enabled, the full built-in core mapping is
# replaced with the mappings in this file.
# This file is monitored for changes and updates will be applied at runtime.
# Default: none
#prometheus_exporter_mapping_file_path_core = prometheus-exporter-mapping-core.yml

# Path to the Prometheus exporter custom mapping file. If this option is enabled, the mappings in this file are
# configured in addition to the built-in core mappings. The mappings in this file cannot overwrite any core mappings.
# This file is monitored for changes and updates will be applied at runtime.
# Default: none
#prometheus_exporter_mapping_file_path_custom = prometheus-exporter-mapping-custom.yml

# Configures the refresh interval for the monitored Prometheus exporter mapping files.
# Default: 60s
#prometheus_exporter_mapping_file_refresh_interval = 60s

# Optional allowed paths for Graylog data files. If provided, certain operations in Graylog will only be permitted
# if the data file(s) are located in the specified paths (for example, with the CSV File lookup adapter).
# All subdirectories of indicated paths are allowed by default. This Provides an additional layer of security,
# and allows administrators to control where in the file system Graylog users can select files from.
#allowed_auxiliary_paths = /etc/graylog/data-files,/etc/custom-allowed-path

I had to trim the server.conf file a little because it exceeded the character limit (I only deleted lines that were commented out).

tmacgbay · December 13, 2022, 8:14pm

It looks as if the Elasticsearch service isn’t running. Check its status:

$ sudo systemctl status elasticsearch
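
If the status output is hard to read, a rough sketch like the following checks whether anything answers on the HTTP port at all. `es_reachable` is just a name made up for this example, and it assumes `curl` is installed and that Elasticsearch is expected on the default http://127.0.0.1:9200 shown in the error message:

```shell
# Hypothetical helper: succeeds only if an HTTP endpoint answers at the given URL.
# Defaults to the address from the VersionProbe error (an assumption for other setups).
es_reachable() {
  url="${1:-http://127.0.0.1:9200}"
  # -f: treat HTTP errors as failure, -sS: quiet but keep error text,
  # --max-time: give up after 5 seconds instead of hanging.
  curl -fsS --max-time 5 "$url" >/dev/null 2>&1
}
```

Running `es_reachable && echo up || echo down` on the Graylog host should print `down` for as long as the connection is being refused.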

You can condense the server.conf file for posting with this command, which strips comments and blank lines:

grep -Ev '^\s*(#|$)' /etc/graylog/server/server.conf

Don’t forget to obfuscate/remove secrets etc…
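
As a sketch of doing both steps in one go: the function below drops comments and blank lines and masks the values of a few settings that hold secrets. The name `condensed_conf` and the list of keys are assumptions for illustration (the keys are taken from the server.conf posted above), so extend the list for your own setup:

```shell
# Hypothetical helper: print only the active lines of a Graylog config,
# replacing the values of secret-bearing settings with a placeholder.
condensed_conf() {
  conf="${1:-/etc/graylog/server/server.conf}"
  # Step 1: drop comment lines and blank lines.
  grep -Ev '^[[:space:]]*(#|$)' "$conf" |
    # Step 2: keep "key =" and overwrite whatever follows it.
    sed -E 's/^([[:space:]]*(password_secret|root_password_sha2|transport_email_auth_password|http_tls_key_password)[[:space:]]*=).*/\1 <redacted>/'
}
```

Note that this does not touch credentials embedded in a URI, such as a username and password inside mongodb_uri, so those still need to be checked by hand.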

cesq · December 14, 2022, 6:11pm

I suspected it had something to do with Elasticsearch, and it looks like Java is involved as well:

Dec 14 19:04:18 graylog-server.cesq.com kernel: Out of memory: Killed process 2076 (java) total-vm:7617500kB, anon-rss:4136984kB, file-rss:0kB, shmem-rss:0kB, UID:992 pgtables:10536kB oom_score_adj:0
Dec 14 19:04:19 graylog-server.cesq.com systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
Dec 14 19:04:19 graylog-server.cesq.com systemd[1]: elasticsearch.service: Failed with result 'signal'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- The unit elasticsearch.service has entered the 'failed' state with result 'signal'.
Dec 14 19:04:19 graylog-server.cesq.com systemd[1]: Failed to start Elasticsearch.
-- Subject: Unit elasticsearch.service has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit elasticsearch.service has failed.
-- 
-- The result is failed.
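
The kernel line at the top of that log names the killed process and the memory it held. A minimal sketch for pulling those fields out of kernel log text is below. The function name oom_kills is made up for illustration, and the pattern assumes the message format shown in the log above; other kernel versions may word it differently.

```shell
# Read kernel log lines on stdin and print one summary line per OOM kill.
# Assumes the "Out of memory: Killed process PID (name) ... anon-rss:NkB" format.
oom_kills() {
  sed -nE 's/.*Out of memory: Killed process ([0-9]+) \(([^)]+)\).*anon-rss:([0-9]+)kB.*/\2 pid=\1 anon-rss=\3kB/p'
}
```

On a systemd host, something like `journalctl -k | oom_kills` should list the killed processes, assuming your user can read the journal.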

I have recently edited the heap size by changing the jvm arguments to 6GB. I’ll try to revert to 1GB or 2GB and post an update. In the meantime, here’s a log from /var/log/elasticsearch/graylog.log, maybe you can make something of it, but I don’t think there’s anything of use here:

[2022-12-12T18:43:24,221][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] JVM home [/usr/share/elasticsearch/jdk], using bundled JDK [true]
[2022-12-12T18:43:24,221][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] JVM arguments [-Xshare:auto, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -XX:+ShowCodeDetailsInExceptionMessages, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.locale.providers=SPI,COMPAT, -Xms1g, -Xmx1g, -XX:+UseG1GC, -XX:G1ReservePercent=25, -XX:InitiatingHeapOccupancyPercent=30, -Djava.io.tmpdir=/tmp/elasticsearch-5942910150687526801, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/var/lib/elasticsearch, -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -XX:MaxDirectMemorySize=536870912, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/etc/elasticsearch, -Des.distribution.flavor=oss, -Des.distribution.type=rpm, -Des.bundled_jdk=true]
[2022-12-12T18:43:24,789][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [aggs-matrix-stats]
[2022-12-12T18:43:24,789][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [analysis-common]
[2022-12-12T18:43:24,789][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [geo]
[2022-12-12T18:43:24,790][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [ingest-common]
[2022-12-12T18:43:24,790][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [ingest-geoip]
[2022-12-12T18:43:24,790][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [ingest-user-agent]
[2022-12-12T18:43:24,790][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [kibana]
[2022-12-12T18:43:24,790][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [lang-expression]
[2022-12-12T18:43:24,790][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [lang-mustache]
[2022-12-12T18:43:24,791][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [lang-painless]
[2022-12-12T18:43:24,791][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [mapper-extras]
[2022-12-12T18:43:24,791][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [parent-join]
[2022-12-12T18:43:24,791][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [percolator]
[2022-12-12T18:43:24,791][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [rank-eval]
[2022-12-12T18:43:24,791][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [reindex]
[2022-12-12T18:43:24,791][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [repository-url]
[2022-12-12T18:43:24,792][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [systemd]
[2022-12-12T18:43:24,792][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] loaded module [transport-netty4]
[2022-12-12T18:43:24,792][INFO ][o.e.p.PluginsService     ] [graylog-server.cesq.com] no plugins loaded
[2022-12-12T18:43:24,818][INFO ][o.e.e.NodeEnvironment    ] [graylog-server.cesq.com] using [1] data paths, mounts [[/ (/dev/mapper/rl-root)]], net usable_space [6gb], net total_space [9.7gb], types [xfs]
[2022-12-12T18:43:24,818][INFO ][o.e.e.NodeEnvironment    ] [graylog-server.cesq.com] heap size [1gb], compressed ordinary object pointers [true]
[2022-12-12T18:43:24,927][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] node name [graylog-server.cesq.com], node ID [jMWyIy_iS76MuqTwNW4hOA], cluster name [graylog], roles [master, remote_cluster_client, data, ingest]
[2022-12-12T18:43:26,939][INFO ][o.e.t.NettyAllocator     ] [graylog-server.cesq.com] creating NettyAllocator with the following configs: [name=unpooled, suggested_max_allocation_size=256kb, factors={es.unsafe.use_unpooled_allocator=null, g1gc_enabled=true, g1gc_region_size=1mb, heap_size=1gb}]
[2022-12-12T18:43:26,985][INFO ][o.e.d.DiscoveryModule    ] [graylog-server.cesq.com] using discovery type [zen] and seed hosts providers [settings]
[2022-12-12T18:43:27,149][WARN ][o.e.g.DanglingIndicesState] [graylog-server.cesq.com] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2022-12-12T18:43:27,251][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] initialized
[2022-12-12T18:43:27,251][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] starting ...
[2022-12-12T18:43:27,342][INFO ][o.e.t.TransportService   ] [graylog-server.cesq.com] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2022-12-12T18:43:27,493][WARN ][o.e.b.BootstrapChecks    ] [graylog-server.cesq.com] the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
[2022-12-12T18:43:27,494][INFO ][o.e.c.c.Coordinator      ] [graylog-server.cesq.com] cluster UUID [nI-mLLzwR924UvKii0PcIg]
[2022-12-12T18:43:27,499][INFO ][o.e.c.c.ClusterBootstrapService] [graylog-server.cesq.com] no discovery configuration found, will perform best-effort cluster bootstrapping after [3s] unless existing master is discovered
[2022-12-12T18:43:27,546][INFO ][o.e.c.s.MasterService    ] [graylog-server.cesq.com] elected-as-master ([1] nodes joined)[{graylog-server.cesq.com}{jMWyIy_iS76MuqTwNW4hOA}{DIVyLQdTTkORcVK1CAAHjw}{127.0.0.1}{127.0.0.1:9300}{dimr} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 21, version: 452, delta: master node changed {previous [], current [{graylog-server.cesq.com}{jMWyIy_iS76MuqTwNW4hOA}{DIVyLQdTTkORcVK1CAAHjw}{127.0.0.1}{127.0.0.1:9300}{dimr}]}
[2022-12-12T18:43:27,585][INFO ][o.e.c.s.ClusterApplierService] [graylog-server.cesq.com] master node changed {previous [], current [{graylog-server.cesq.com}{jMWyIy_iS76MuqTwNW4hOA}{DIVyLQdTTkORcVK1CAAHjw}{127.0.0.1}{127.0.0.1:9300}{dimr}]}, term: 21, version: 452, reason: Publication{term=21, version=452}
[2022-12-12T18:43:27,619][INFO ][o.e.h.AbstractHttpServerTransport] [graylog-server.cesq.com] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2022-12-12T18:43:27,620][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] started
[2022-12-12T18:43:27,672][INFO ][o.e.g.GatewayService     ] [graylog-server.cesq.com] recovered [22] indices into cluster_state
[2022-12-12T18:43:29,113][INFO ][o.e.c.r.a.AllocationService] [graylog-server.cesq.com] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[graylog_0][0]]]).
[2022-12-12T19:57:58,230][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] stopping ...
[2022-12-12T19:57:58,360][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] stopped
[2022-12-12T19:57:58,361][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] closing ...
[2022-12-12T19:57:58,371][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] closed

Update: Reverting the JVM options didn’t help.

gsmith · December 14, 2022, 11:01pm

Hello,
What I got from those first logs was an out-of-memory condition.
If Elasticsearch, Graylog, and MongoDB are on the same node, they might be fighting over resources.

Check /etc/elasticsearch/jvm.options

##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms4g
-Xmx4g

Depending on your OS, also check the Java options file for Graylog (see the default file locations):

https://go2docs.graylog.org/5-0/setting_up_graylog/default_file_locations.html

[root@graylog opensearch]# cat /etc/sysconfig/graylog-server
# Path to the java executable.
JAVA=/usr/bin/java

# Default Java options for heap and garbage collection.
GRAYLOG_SERVER_JAVA_OPTS="-Xms3g -Xmx3g -XX:NewRatio=1

Together, the two examples above need at least 7 GB of RAM for the heaps (4 GB + 3 GB), plus about 2 GB more for the OS.
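
The same arithmetic can be sketched as a small shell check. The function names heap_to_mb and fits_in_ram are hypothetical, and the OS allowance is whatever you choose to pass in. Treat the result as a lower bound, since a JVM also uses memory outside the heap.

```shell
# Convert a JVM heap size such as 4g or 512m to megabytes.
heap_to_mb() {
  case "$1" in
    *[gG]) echo $(( ${1%[gG]} * 1024 )) ;;
    *[mM]) echo $(( ${1%[mM]} )) ;;
    *) echo "unsupported size: $1" >&2; return 1 ;;
  esac
}

# Usage: fits_in_ram RAM_MB OS_MB HEAP...
# Sums the heaps plus the OS allowance and succeeds only if that fits in RAM_MB.
fits_in_ram() {
  ram_mb=$1; os_mb=$2; shift 2
  total=$os_mb
  for size in "$@"; do
    mb=$(heap_to_mb "$size") || return 1
    total=$(( total + mb ))
  done
  echo "needed=${total}MB available=${ram_mb}MB"
  [ "$total" -le "$ram_mb" ]
}
```

With the example heaps above on a 6 GB machine, `fits_in_ram 6144 2048 4g 3g` prints `needed=9216MB available=6144MB` and returns a non-zero status.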

[2022-12-12T18:43:29,113][INFO ][o.e.c.r.a.AllocationService] [graylog-server.cesq.com] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[graylog_0][0]]]).
[2022-12-12T19:57:58,230][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] stopping ...
[2022-12-12T19:57:58,360][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] stopped
[2022-12-12T19:57:58,361][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] closing ...
[2022-12-12T19:57:58,371][INFO ][o.e.n.Node               ] [graylog-server.cesq.com] closed

From the last set of logs, it looks like the service stopped. There might be something wrong with the configuration.

cesq · December 15, 2022, 6:57pm

Thank you all for helping me. Turns out I am a complete moron. I had assigned 6 GB to Graylog when my VM's maximum amount of RAM is 6 GB…
Don’t drink and sysadmin, kids…

gsmith · December 15, 2022, 10:00pm

Hello,
Made my day @cesq

system · December 29, 2022, 10:00pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
