Fresh Graylog 5.0 Install w/ OpenSearch; OpenSearch invariably fails after a few hours

1. Describe your incident:
I have a (relatively) fresh install of Graylog 5.0 with OpenSearch in an LXC container. Everything seems to be going quite well, except that, invariably, OpenSearch will fail. Sometimes it takes hours, sometimes a day, but it will fail. The problem is that I am not familiar enough with OpenSearch to triage this, so bear with me; I'll do my best and apologize in advance for any omitted or extraneous information.
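In case it helps with triage, these are the checks I can run right after the next failure and post the output of (the `opensearch` systemd unit name is an assumption based on the standard package install; adjust if yours differs):

# was the service stopped cleanly, crashed, or killed?
systemctl status opensearch
# service log around the time of the failure
journalctl -u opensearch --since "1 hour ago" --no-pager
# did the kernel OOM killer terminate the JVM inside the container?
dmesg -T | grep -iE "out of memory|killed process"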

2. Describe your environment:

  • OS Information: LXC (Ubuntu 22.04), on a Debian Bookworm Host.

  • Package Version: Graylog 5.0.3+a82acb2

  • Service logs, configurations, and environment variables:
    /graylog/opensearch/config/opensearch.yml

# ======================== OpenSearch Configuration =========================
#
# NOTE: OpenSearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.opensearch.org
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: graylog
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: graylogopen
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /graylog/opensearch/data
#
# Path to log files:
#
path.logs: /var/log/opensearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# OpenSearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
discovery.type: single-node
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["SERVERNAME01", "SERVERNAME02", "SERVERNAME03"]
#
# Bootstrap the cluster using an initial set of cluster-manager-eligible nodes:
#
#cluster.initial_cluster_manager_nodes: ["SERVERNAME01", "SERVERNAME02", "SERVERNAME03"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
action.auto_create_index: false
plugins.security.disabled: true
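When OpenSearch is running, this is the quick health check I use against the loopback address and port configured above (plain HTTP, since plugins.security.disabled is true):

curl -s "http://127.0.0.1:9200/_cluster/health?pretty"
curl -s "http://127.0.0.1:9200/_cat/indices?v"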

/graylog/opensearch/config/jvm.options

## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://opensearch.org/docs/opensearch/install/important-settings/
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms24g
-Xmx24g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
8-10:-XX:+UseConcMarkSweepGC
8-10:-XX:CMSInitiatingOccupancyFraction=75
8-10:-XX:+UseCMSInitiatingOccupancyOnly

## G1GC Configuration
# NOTE: G1 GC is only supported on JDK version 10 or later
# to use G1GC, uncomment the next two lines and update the version on the
# following three lines to your version of the JDK
# 10:-XX:-UseConcMarkSweepGC
# 10:-XX:-UseCMSInitiatingOccupancyOnly
11-:-XX:+UseG1GC
11-:-XX:G1ReservePercent=25
11-:-XX:InitiatingHeapOccupancyPercent=30

## JVM temporary directory
-Djava.io.tmpdir=${OPENSEARCH_TMPDIR}

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=data

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log

## JDK 8 GC logging
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m

# Explicitly allow security manager (https://bugs.openjdk.java.net/browse/JDK-8270380)
18-:-Djava.security.manager=allow
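The GC flags above are JDK-version-gated (the 8-10:, 11-:, and 18-: prefixes), so for completeness this is how I'd confirm which Java the node actually runs. The path is a guess based on my /graylog/opensearch install prefix; the tarball distribution normally ships its bundled JDK under jdk/:

/graylog/opensearch/jdk/bin/java -version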

/etc/default/graylog-server

# Path to a custom java executable. By default the java executable of the
# bundled JVM is used.
#JAVA=/usr/bin/java

# Default Java options for heap and garbage collection.
GRAYLOG_SERVER_JAVA_OPTS="-Xms24g -Xmx24g -server -XX:+UseG1GC -XX:-OmitStackTraceInFastThrow"

# Avoid endless loop with some TLSv1.3 implementations.
GRAYLOG_SERVER_JAVA_OPTS="$GRAYLOG_SERVER_JAVA_OPTS -Djdk.tls.acknowledgeCloseNotify=true"

# Fix for log4j CVE-2021-44228
GRAYLOG_SERVER_JAVA_OPTS="$GRAYLOG_SERVER_JAVA_OPTS -Dlog4j2.formatMsgNoLookups=true"

# Pass some extra args to graylog-server. (i.e. "-d" to enable debug mode)
GRAYLOG_SERVER_ARGS=""

# Program that will be used to wrap the graylog-server command. Useful to
# support programs like authbind.
GRAYLOG_COMMAND_WRAPPER=""
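One thing I wasn't sure about while writing this up: the OpenSearch heap (jvm.options above) and the Graylog server heap (GRAYLOG_SERVER_JAVA_OPTS) are both set to 24 GB, and both JVMs live in the same LXC. This is what I'd run inside the container to confirm how much memory it really has available (which cgroup file exists depends on whether the host uses cgroup v1 or v2):

free -h
# cgroup v2 (likely on a Debian Bookworm host)
cat /sys/fs/cgroup/memory.max
# cgroup v1 fallback
cat /sys/fs/cgroup/memory/memory.limit_in_bytes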

/etc/graylog/server/server.conf

############################
# GRAYLOG CONFIGURATION FILE
############################
...
#

# If you are running more than one instances of Graylog server you have to select one of these
# instances as leader. The leader will perform some periodical tasks that non-leaders won't perform.
is_leader = true

# The auto-generated node ID will be stored in this file and read after restarts. It is a good idea
# to use an absolute file path here if you are starting Graylog server from init scripts or similar.
node_id_file = /etc/graylog/server/node-id

...

# The time zone setting of the root user. See http://www.joda.org/joda-time/timezones.html for a list of valid time zones.
# Default is UTC
root_timezone = UTC

# Set the bin directory here (relative or absolute)
# This directory contains binaries that are used by the Graylog server.
# Default: bin
bin_dir = /usr/share/graylog-server/bin

# Set the data directory here (relative or absolute)
# This directory is used to store Graylog server state.
# Default: data
data_dir = /var/lib/graylog-server

# Set plugin directory here (relative or absolute)
plugin_dir = /usr/share/graylog-server/plugin

###############
# HTTP settings
###############

#### HTTP bind address
#
# The network interface used by the Graylog HTTP interface.
#
# This network interface must be accessible by all Graylog nodes in the cluster and by all clients
# using the Graylog web interface.
#
# If the port is omitted, Graylog will use port 9000 by default.
#
# Default: 127.0.0.1:9000
http_bind_address = 127.0.0.1:9000
#http_bind_address = [2001:db8::1]:9000

#### HTTP publish URI
#
# The HTTP URI of this Graylog node which is used to communicate with the other Graylog nodes in the cluster and by all
# clients using the Graylog web interface.
#
# The URI will be published in the cluster discovery APIs, so that other Graylog nodes will be able to find and connect to this Graylog node.
#
# This configuration setting has to be used if this Graylog node is available on another network interface than $http_bind_address,
# for example if the machine has multiple network interfaces or is behind a NAT gateway.
#
# If $http_bind_address contains a wildcard IPv4 address (0.0.0.0), the first non-loopback IPv4 address of this machine will be used.
# This configuration setting *must not* contain a wildcard address!
#
# Default: http://$http_bind_address/
http_publish_uri = http://192.168.128.253

#### External Graylog URI
#
# The public URI of Graylog which will be used by the Graylog web interface to communicate with the Graylog REST API.
#
# The external Graylog URI usually has to be specified, if Graylog is running behind a reverse proxy or load-balancer
# and it will be used to generate URLs addressing entities in the Graylog REST API (see $http_bind_address).
#
# When using Graylog Collector, this URI will be used to receive heartbeat messages and must be accessible for all collectors.
#
# This setting can be overriden on a per-request basis with the "X-Graylog-Server-URL" HTTP request header.
#
# Default: $http_publish_uri
http_external_uri = http://sub.domain.help/

#### Enable CORS headers for HTTP interface
#
# This allows browsers to make Cross-Origin requests from any origin.
# This is disabled for security reasons and typically only needed if running graylog
# with a separate server for frontend development.
#
# Default: false
#http_enable_cors = false

#### Enable GZIP support for HTTP interface
#
# This compresses API responses and therefore helps to reduce
# overall round trip times. This is enabled by default. Uncomment the next line to disable it.
#http_enable_gzip = false

# The maximum size of the HTTP request headers in bytes.
#http_max_header_size = 8192

# The size of the thread pool used exclusively for serving the HTTP interface.
#http_thread_pool_size = 64

################
# HTTPS settings
################

...

# If set to "true", Graylog will periodically investigate indices to figure out which fields are used in which streams.
# It will make field list in Graylog interface show only fields used in selected streams, but can decrease system performance,
# especially on systems with great number of streams and fields.
stream_aware_field_types=false

# Comma separated list of trusted proxies that are allowed to set the client address with X-Forwarded-For
# header. May be subnets, or hosts.
trusted_proxies = 127.0.0.1/32, 192.168.128.0/24

# List of Elasticsearch hosts Graylog should connect to.
# Need to be specified as a comma-separated list of valid URIs for the http ports of your elasticsearch nodes.
# If one or more of your elasticsearch hosts require authentication, include the credentials in each node URI that
# requires authentication.
#
# Default: http://127.0.0.1:9200
elasticsearch_hosts = http://192.168.128.253:9200

# Maximum number of attempts to connect to elasticsearch on boot for the version probe.
#
# Default: 0, retry indefinitely with the given delay until a connection could be established
#elasticsearch_version_probe_attempts = 5

# Waiting time in between connection attempts for elasticsearch_version_probe_attempts
#
# Default: 5s
#elasticsearch_version_probe_delay = 5s

# Maximum amount of time to wait for successful connection to Elasticsearch HTTP port.
#
# Default: 10 Seconds
#elasticsearch_connect_timeout = 10s

# Maximum amount of time to wait for reading back a response from an Elasticsearch server.
# (e. g. during search, index creation, or index time-range calculations)
#
# Default: 60 seconds
#elasticsearch_socket_timeout = 60s

# Maximum idle time for an Elasticsearch connection. If this is exceeded, this connection will
# be tore down.
#
# Default: inf
#elasticsearch_idle_timeout = -1s

# Maximum number of total connections to Elasticsearch.
#
# Default: 200
#elasticsearch_max_total_connections = 200

# Maximum number of total connections per Elasticsearch route (normally this means per
# elasticsearch server).
#
# Default: 20
#elasticsearch_max_total_connections_per_route = 20

# Maximum number of times Graylog will retry failed requests to Elasticsearch.
#
# Default: 2
#elasticsearch_max_retries = 2

# Enable automatic Elasticsearch node discovery through Nodes Info,
# see https://www.elastic.co/guide/en/elasticsearch/reference/5.4/cluster-nodes-info.html
#
# WARNING: Automatic node discovery does not work if Elasticsearch requires authentication, e. g. with Shield.
#
# Default: false
#elasticsearch_discovery_enabled = true

# Filter for including/excluding Elasticsearch nodes in discovery according to their custom attributes,
# see https://www.elastic.co/guide/en/elasticsearch/reference/5.4/cluster.html#cluster-nodes
#
# Default: empty
#elasticsearch_discovery_filter = rack:42

# Frequency of the Elasticsearch node discovery.
#
# Default: 30s
# elasticsearch_discovery_frequency = 30s

# Set the default scheme when connecting to Elasticsearch discovered nodes
#
# Default: http (available options: http, https)
#elasticsearch_discovery_default_scheme = http

# Enable payload compression for Elasticsearch requests.
#
# Default: false
#elasticsearch_compression_enabled = true

# Enable use of "Expect: 100-continue" Header for Elasticsearch index requests.
# If this is disabled, Graylog cannot properly handle HTTP 413 Request Entity Too Large errors.
#
# Default: true
#elasticsearch_use_expect_continue = true

# Graylog will use multiple indices to store documents in. You can configure the strategy it uses to determine
# when to rotate the currently active write index.
# It supports multiple rotation strategies, the default being "count":
#   - "count" of messages per index, use elasticsearch_max_docs_per_index below to configure
#   - "size" per index, use elasticsearch_max_size_per_index below to configure
#   - "time" interval between index rotations, use elasticsearch_max_time_per_index to configure
# A strategy may be disabled by specifying the optional enabled_index_rotation_strategies list and excluding that strategy.
#enabled_index_rotation_strategies = count,size,time

# Provides a hard upper limit for the retention period of any index set at configuration time.
#
# This setting is used to validate the value a user chooses for the maximum number of retained indexes, when configuring
# an index set. However, it is only in effect, when a time-based rotation strategy is chosen.
#
# If a rotation strategy other than time-based is selected and/or no value is provided for this setting, no upper limit
# for index retention will be enforced. This is also the default.
# max_index_retention_period = P90d

# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
rotation_strategy = count

# (Approximate) maximum number of documents in an Elasticsearch index before a new index
# is being created, also see no_retention and elasticsearch_max_number_of_indices.
# Configure this if you used 'rotation_strategy = count' above.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_max_docs_per_index = 20000000

# (Approximate) maximum size in bytes per Elasticsearch index on disk before a new index is being created, also see
# no_retention and elasticsearch_max_number_of_indices. Default is 1GB.
# Configure this if you used 'rotation_strategy = size' above.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#elasticsearch_max_size_per_index = 1073741824

# (Approximate) maximum time before a new Elasticsearch index is being created, also see
# no_retention and elasticsearch_max_number_of_indices. Default is 1 day.
# Configure this if you used 'rotation_strategy = time' above.
# Please note that this rotation period does not look at the time specified in the received messages, but is
# using the real clock value to decide when to rotate the index!
# Specify the time using a duration and a suffix indicating which unit you want:
#  1w  = 1 week
#  1d  = 1 day
#  12h = 12 hours
# Permitted suffixes are: d for day, h for hour, m for minute, s for second.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#elasticsearch_max_time_per_index = 1d

# Optional upper bound on elasticsearch_max_time_per_index
# elasticsearch_max_write_index_age = 1d

# Disable checking the version of Elasticsearch for being compatible with this Graylog release.
# WARNING: Using Graylog with unsupported and untested versions of Elasticsearch may lead to data loss!
elasticsearch_disable_version_check = true

# Disable message retention on this node, i. e. disable Elasticsearch index rotation.
#no_retention = false

# How many indices do you want to keep?
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_max_number_of_indices = 20

# Decide what happens with the oldest indices when the maximum number of indices is reached.
# The following strategies are availble:
#   - delete # Deletes the index completely (Default)
#   - close # Closes the index and hides it from the system. Can be re-opened later.
#
# ATTENTION: These settings have been moved to the database in 2.0. When you upgrade, make sure to set these
#            to your previous 1.x settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
retention_strategy = delete

# How many Elasticsearch shards and replicas should be used per index? Note that this only applies to newly created indices.
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_shards = 4
elasticsearch_replicas = 0

# Prefix for all Elasticsearch indices and index aliases managed by Graylog.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_index_prefix = graylog

# Name of the Elasticsearch index template used by Graylog to apply the mandatory index mapping.
# Default: graylog-internal
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#elasticsearch_template_name = graylog-internal

# Do you want to allow searches with leading wildcards? This can be extremely resource hungry and should only
# be enabled with care. See also: https://docs.graylog.org/docs/query-language
allow_leading_wildcard_searches = false

# Do you want to allow searches to be highlighted? Depending on the size of your messages this can be memory hungry and
# should only be enabled after making sure your Elasticsearch cluster has enough memory.
allow_highlighting = false

# Analyzer (tokenizer) to use for message and full_message field. The "standard" filter usually is a good idea.
# All supported analyzers are: standard, simple, whitespace, stop, keyword, pattern, language, snowball, custom
# Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis.html
# Note that this setting only takes effect on newly created indices.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
elasticsearch_analyzer = standard

# Global timeout for index optimization (force merge) requests.
# Default: 1h
#elasticsearch_index_optimization_timeout = 1h

# Maximum number of concurrently running index optimization (force merge) jobs.
# If you are using lots of different index sets, you might want to increase that number.
# This value should be set lower than elasticsearch_max_total_connections_per_route, otherwise index optimization
# could deplete all the client connections to the search server and block new messages ingestion for prolonged
# periods of time.
# Default: 10
#elasticsearch_index_optimization_jobs = 10

# Mute the logging-output of ES deprecation warnings during REST calls in the ES RestClient
#elasticsearch_mute_deprecation_warnings = true

# Time interval for index range information cleanups. This setting defines how often stale index range information
# is being purged from the database.
# Default: 1h
#index_ranges_cleanup_interval = 1h

# Time interval to trigger a full refresh of the index field types for all indexes. This will query ES for all indexes
# and populate any missing field type information to the database.
# Default: 5m
#index_field_type_periodical_full_refresh_interval = 5m

# Batch size for the Elasticsearch output. This is the maximum (!) number of messages the Elasticsearch output
# module will get at once and write to Elasticsearch in a batch call. If the configured batch size has not been
# reached within output_flush_interval seconds, everything that is available will be flushed at once. Remember
# that every outputbuffer processor manages its own batch and performs its own batch write calls.
# ("outputbuffer_processors" variable)
output_batch_size = 500

# Flush interval (in seconds) for the Elasticsearch output. This is the maximum amount of time between two
# batches of messages written to Elasticsearch. It is only effective at all if your minimum number of messages
# for this time period is less than output_batch_size * outputbuffer_processors.
output_flush_interval = 2

# As stream outputs are loaded only on demand, an output which is failing to initialize will be tried over and
# over again. To prevent this, the following configuration options define after how many faults an output will
# not be tried again for an also configurable amount of seconds.
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30

# The number of parallel running processors.
# Raise this number if your buffers are filling up.
processbuffer_processors = 5
outputbuffer_processors = 3

# The size of the thread pool in the output buffer processor.
# Default: 3
#outputbuffer_processor_threads_core_pool_size = 3

# UDP receive buffer size for all message inputs (e. g. SyslogUDPInput).
#udp_recvbuffer_sizes = 1048576

# Wait strategy describing how buffer processors wait on a cursor sequence. (default: sleeping)
# Possible types:
#  - yielding
#     Compromise between performance and CPU usage.
#  - sleeping
#     Compromise between performance and CPU usage. Latency spikes can occur after quiet periods.
#  - blocking
#     High throughput, low latency, higher CPU usage.
#  - busy_spinning
#     Avoids syscalls which could introduce latency jitter. Best when threads can be bound to specific CPU cores.
processor_wait_strategy = blocking

# Size of internal ring buffers. Raise this if raising outputbuffer_processors does not help anymore.
# For optimum performance your LogMessage objects in the ring buffer should fit in your CPU L3 cache.
# Must be a power of 2. (512, 1024, 2048, ...)
ring_size = 65536

inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking

# Manually stopped inputs are no longer auto-restarted. To re-enable the previous behavior, set auto_restart_inputs to true.
#auto_restart_inputs = true

# Enable the message journal.
message_journal_enabled = true

# The directory which will be used to store the message journal. The directory must be exclusively used by Graylog and
# must not contain any other files than the ones created by Graylog itself.
#
# ATTENTION:
#   If you create a seperate partition for the journal files and use a file system creating directories like 'lost+found'
#   in the root directory, you need to create a sub directory for your journal.
#   Otherwise Graylog will log an error message that the journal is corrupt and Graylog will not start.
message_journal_dir = /var/lib/graylog-server/journal

# Journal hold messages before they could be written to Elasticsearch.
# For a maximum of 12 hours or 5 GB whichever happens first.
# During normal operation the journal will be smaller.
#message_journal_max_age = 12h
#message_journal_max_size = 5gb

#message_journal_flush_age = 1m
#message_journal_flush_interval = 1000000
#message_journal_segment_age = 1h
#message_journal_segment_size = 100mb

# Number of threads used exclusively for dispatching internal events. Default is 2.
#async_eventbus_processors = 2

# How many seconds to wait between marking node as DEAD for possible load balancers and starting the actual
# shutdown process. Set to 0 if you have no status checking load balancers in front.
lb_recognition_period_seconds = 3

# Journal usage percentage that triggers requesting throttling for this server node from load balancers. The feature is
# disabled if not set.
#lb_throttle_threshold_percentage = 95

# Every message is matched against the configured streams and it can happen that a stream contains rules which
# take an unusual amount of time to run, for example if its using regular expressions that perform excessive backtracking.
# This will impact the processing of the entire server. To keep such misbehaving stream rules from impacting other
# streams, Graylog limits the execution time for each stream.
# The default values are noted below, the timeout is in milliseconds.
# If the stream matching for one stream took longer than the timeout value, and this happened more than "max_faults" times
# that stream is disabled and a notification is shown in the web interface.
#stream_processing_timeout = 2000
#stream_processing_max_faults = 3

# Since 0.21 the Graylog server supports pluggable output modules. This means a single message can be written to multiple
# outputs. The next setting defines the timeout for a single output module, including the default output module where all
# messages end up.
#
# Time in milliseconds to wait for all message outputs to finish writing a single message.
#output_module_timeout = 10000

# Time in milliseconds after which a detected stale leader node is being rechecked on startup.
#stale_leader_timeout = 2000

# Time in milliseconds which Graylog is waiting for all threads to stop on shutdown.
#shutdown_timeout = 30000

# MongoDB connection string
# See https://docs.mongodb.com/manual/reference/connection-string/ for details
mongodb_uri = mongodb://localhost/graylog

# Authenticate against the MongoDB server
# '+'-signs in the username or password need to be replaced by '%2B'
#mongodb_uri = mongodb://grayloguser:secret@localhost:27017/graylog

# Use a replica set instead of a single host
#mongodb_uri = mongodb://grayloguser:secret@localhost:27017,localhost:27018,localhost:27019/graylog?replicaSet=rs01

# DNS Seedlist https://docs.mongodb.com/manual/reference/connection-string/#dns-seedlist-connection-format
#mongodb_uri = mongodb+srv://server.example.org/graylog

# Increase this value according to the maximum connections your MongoDB server can handle from a single client
# if you encounter MongoDB connection problems.
mongodb_max_connections = 1000

# Maximum number of attempts to connect to MongoDB on boot for the version probe.
#
# Default: 0, retry indefinitely until a connection can be established
#mongodb_version_probe_attempts = 5

# Email transport
redacted

...

server.conf continued

# Encryption settings
#
# ATTENTION:
#    Using SMTP with STARTTLS *and* SMTPS at the same time is *not* possible.

# Use SMTP with STARTTLS, see https://en.wikipedia.org/wiki/Opportunistic_TLS
#transport_email_use_tls = true

# Use SMTP over SSL (SMTPS), see https://en.wikipedia.org/wiki/SMTPS
# This is deprecated on most SMTP services!
#transport_email_use_ssl = false


# Specify and uncomment this if you want to include links to the stream in your stream alert mails.
# This should define the fully qualified base url to your web interface exactly the same way as it is accessed by your users.
#transport_email_web_interface_url = https://graylog.example.com

# The default connect timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 5s
#http_connect_timeout = 5s

# The default read timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 10s
#http_read_timeout = 10s

# The default write timeout for outgoing HTTP connections.
# Values must be a positive duration (and between 1 and 2147483647 when converted to milliseconds).
# Default: 10s
#http_write_timeout = 10s

# HTTP proxy for outgoing HTTP connections
# ATTENTION: If you configure a proxy, make sure to also configure the "http_non_proxy_hosts" option so internal
#            HTTP connections with other nodes does not go through the proxy.
# Examples:
#   - http://proxy.example.com:8123
#   - http://username:password@proxy.example.com:8123
#http_proxy_uri =

# A list of hosts that should be reached directly, bypassing the configured proxy server.
# This is a list of patterns separated by ",". The patterns may start or end with a "*" for wildcards.
# Any host matching one of these patterns will be reached through a direct connection instead of through a proxy.
# Examples:
#   - localhost,127.0.0.1
#   - 10.0.*,*.example.com
#http_non_proxy_hosts =

# Disable the optimization of Elasticsearch indices after index cycling. This may take some load from Elasticsearch
# on heavily used systems with large indices, but it will decrease search performance. The default is to optimize
# cycled indices.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#disable_index_optimization = true

# Optimize the index down to <= index_optimization_max_num_segments. A higher number may take some load from Elasticsearch
# on heavily used systems with large indices, but it will decrease search performance. The default is 1.
#
# ATTENTION: These settings have been moved to the database in Graylog 2.2.0. When you upgrade, make sure to set these
#            to your previous settings so they will be migrated to the database!
#            This configuration setting is only used on the first start of Graylog. After that,
#            index related settings can be changed in the Graylog web interface on the 'System / Indices' page.
#            Also see https://docs.graylog.org/docs/index-model#index-set-configuration
#index_optimization_max_num_segments = 1

# The threshold of the garbage collection runs. If GC runs take longer than this threshold, a system notification
# will be generated to warn the administrator about possible problems with the system. Default is 1 second.
#gc_warning_threshold = 1s

# Connection timeout for a configured LDAP server (e. g. ActiveDirectory) in milliseconds.
#ldap_connection_timeout = 2000

# Disable the use of a native system stats collector (currently OSHI)
#disable_native_system_stats_collector = false

# The default cache time for dashboard widgets. (Default: 10 seconds, minimum: 1 second)
#dashboard_widget_default_cache_time = 10s

# For some cluster-related REST requests, the node must query all other nodes in the cluster. This is the maximum number
# of threads available for this. Increase it, if '/cluster/*' requests take long to complete.
# Should be http_thread_pool_size * average_cluster_size if you have a high number of concurrent users.
#proxied_requests_thread_pool_size = 64

# The server is writing processing status information to the database on a regular basis. This setting controls how
# often the data is written to the database.
# Default: 1s (cannot be less than 1s)
#processing_status_persist_interval = 1s

# Configures the threshold for detecting outdated processing status records. Any records that haven't been updated
# in the configured threshold will be ignored.
# Default: 1m (one minute)
#processing_status_update_threshold = 1m

# Configures the journal write rate threshold for selecting processing status records. Any records that have a lower
# one minute rate than the configured value might be ignored. (dependent on number of messages in the journal)
# Default: 1
#processing_status_journal_write_rate_threshold = 1

# Configures the prefix used for graylog event indices
# Default: gl-events
#default_events_index_prefix = gl-events

# Configures the prefix used for graylog system event indices
# Default: gl-system-events
#default_system_events_index_prefix = gl-system-events

# Automatically load content packs in "content_packs_dir" on the first start of Graylog.
#content_packs_loader_enabled = false

# The directory which contains content packs which should be loaded on the first start of Graylog.
#content_packs_dir = data/contentpacks

# A comma-separated list of content packs (files in "content_packs_dir") which should be applied on
# the first start of Graylog.
# Default: empty
#content_packs_auto_install = grok-patterns.json

# The allowed TLS protocols for system wide TLS enabled servers. (e.g. message inputs, http interface)
# Setting this to an empty value, leaves it up to system libraries and the used JDK to chose a default.
# Default: TLSv1.2,TLSv1.3  (might be automatically adjusted to protocols supported by the JDK)
#enabled_tls_protocols = TLSv1.2,TLSv1.3

# Enable Prometheus exporter HTTP server.
# Default: false
#prometheus_exporter_enabled = false

# IP address and port for the Prometheus exporter HTTP server.
# Default: 127.0.0.1:9833
#prometheus_exporter_bind_address = 127.0.0.1:9833

# Path to the Prometheus exporter core mapping file. If this option is enabled, the full built-in core mapping is
# replaced with the mappings in this file.
# This file is monitored for changes and updates will be applied at runtime.
# Default: none
#prometheus_exporter_mapping_file_path_core = prometheus-exporter-mapping-core.yml

# Path to the Prometheus exporter custom mapping file. If this option is enabled, the mappings in this file are
# configured in addition to the built-in core mappings. The mappings in this file cannot overwrite any core mappings.
# This file is monitored for changes and updates will be applied at runtime.
# Default: none
#prometheus_exporter_mapping_file_path_custom = prometheus-exporter-mapping-custom.yml

# Configures the refresh interval for the monitored Prometheus exporter mapping files.
# Default: 60s
#prometheus_exporter_mapping_file_refresh_interval = 60s

# Optional allowed paths for Graylog data files. If provided, certain operations in Graylog will only be permitted
# if the data file(s) are located in the specified paths (for example, with the CSV File lookup adapter).
# All subdirectories of indicated paths are allowed by default. This Provides an additional layer of security,
# and allows administrators to control where in the file system Graylog users can select files from.
#allowed_auxiliary_paths = /etc/graylog/data-files,/etc/custom-allowed-path

# Do not perform any preflight checks when starting Graylog
# Default: false
#skip_preflight_checks = false

# Ignore any exceptions encountered when running migrations
# Use with caution - skipping failing migrations may result in an inconsistent DB state.
# Default: false
#ignore_migration_failures = false
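One more check before the logs: elasticsearch_hosts above points at http://192.168.128.253:9200 while opensearch.yml binds to 127.0.0.1, so when things next look healthy I can verify that the URI Graylog actually uses is reachable from the Graylog node:

curl -s "http://192.168.128.253:9200/"
curl -s "http://192.168.128.253:9200/_cluster/health?pretty"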

Finally, logs:

Examples of /var/log/graylog-server/server.log entries around the time of the failures:

2023-02-16T22:14:51.518Z ERROR [MessagesAdapterOS2] Failed to index [2] messages. Please check the index error log in your web interface for the reason. Error: failure in bulk execution:
[160]: index [graylog_0], id [52a17f21-ae47-11ed-aacf-00163ef2bcdd], message [OpenSearchException[OpenSearch exception [type=mapper_parsing_exception, reason=failed to parse field [ListBaseType] of type [long] in document with id '52a17f21-ae47-11ed-aacf-00163ef2bcdd'. Preview of field's value: 'GenericList']]; nested: OpenSearchException[OpenSearch exception [type=illegal_argument_exception, reason=For input string: "GenericList"]];]
[170]: index [graylog_0], id [52b79f30-ae47-11ed-aacf-00163ef2bcdd], message [OpenSearchException[OpenSearch exception [type=mapper_parsing_exception, reason=failed to parse field [ListBaseType] of type [long] in document with id '52b79f30-ae47-11ed-aacf-00163ef2bcdd'. Preview of field's value: 'GenericList']]; nested: OpenSearchException[OpenSearch exception [type=illegal_argument_exception, reason=For input string: "GenericList"]];]
2023-02-16T22:29:43.362Z WARN  [MaxmindDataAdapter] Unable to look up city data for IP address /70.241.37.1, returning empty result.

 Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #22).

2023-02-16T22:29:50.664Z ERROR [MessagesAdapterOS2] Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: failure in bulk execution:
[253]: index [graylog_0], id [6a946871-ae49-11ed-aacf-00163ef2bcdd], message [OpenSearchException[OpenSearch exception [type=mapper_parsing_exception, reason=failed to parse field [ListBaseType] of type [long] in document with id '6a946871-ae49-11ed-aacf-00163ef2bcdd'. Preview of field's value: 'DocumentLibrary']]; nested: OpenSearchException[OpenSearch exception [type=illegal_argument_exception, reason=For input string: "DocumentLibrary"]];]
2023-02-16T22:29:52.126Z ERROR [MessagesAdapterOS2] Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: failure in bulk execution:
[5]: index [graylog_0], id [6b0a82d0-ae49-11ed-aacf-00163ef2bcdd], message [OpenSearchException[OpenSearch exception [type=mapper_parsing_exception, reason=failed to parse field [ListBaseType] of type [long] in document with id '6b0a82d0-ae49-11ed-aacf-00163ef2bcdd'. Preview of field's value: 'DocumentLibrary']]; nested: OpenSearchException[OpenSearch exception [type=illegal_argument_exception, reason=For input string: "DocumentLibrary"]];]
2023-02-16T22:29:54.981Z ERROR [MessagesAdapterOS2] Failed to index [3] messages. Please check the index error log in your web interface for the reason. Error: failure in bulk execution:
[101]: index [graylog_0], id [69a01d60-ae49-11ed-aacf-00163ef2bcdd], message [OpenSearchException[OpenSearch exception [type=mapper_parsing_exception, reason=failed to parse field [ListBaseType] of type [long] in document with id '69a01d60-ae49-11ed-aacf-00163ef2bcdd'. Preview of field's value: 'DocumentLibrary']]; nested: OpenSearchException[OpenSearch exception [type=illegal_argument_exception, reason=For input string: "DocumentLibrary"]];]
[168]: index [graylog_0], id [6a1e4e10-ae49-11ed-aacf-00163ef2bcdd], message [OpenSearchException[OpenSearch exception [type=mapper_parsing_exception, reason=failed to parse field [ListBaseType] of type [long] in document with id '6a1e4e10-ae49-11ed-aacf-00163ef2bcdd'. Preview of field's value: 'DocumentLibrary']]; nested: OpenSearchException[OpenSearch exception [type=illegal_argument_exception, reason=For input string: "DocumentLibrary"]];]
[169]: index [graylog_0], id [6a215b51-ae49-11ed-aacf-00163ef2bcdd], message [OpenSearchException[OpenSearch exception [type=mapper_parsing_exception, reason=failed to parse field [ListBaseType] of type [long] in document with id '6a215b51-ae49-11ed-aacf-00163ef2bcdd'. Preview of field's value: 'DocumentLibrary']]; nested: OpenSearchException[OpenSearch exception [type=illegal_argument_exception, reason=For input string: "DocumentLibrary"]];]
2023-02-16T22:44:46.660Z WARN  [MaxmindDataAdapter] Unable to look up city data for IP address /99.69.21.79, returning empty result.
java.lang.UnsupportedOperationException: Invalid attempt to open a GeoLite2-ASN database using the city method

2023-02-17T15:34:09.295Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #1).
2023-02-17T15:34:09.317Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #10).
2023-02-17T15:34:09.320Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #2).
2023-02-17T15:34:09.322Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #1).
2023-02-17T15:34:09.336Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #2).
2023-02-17T15:34:09.345Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #3).
2023-02-17T15:34:09.352Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #3).
2023-02-17T15:34:09.373Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #4).
2023-02-17T15:34:09.374Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #4).
2023-02-17T15:34:09.402Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #5).
2023-02-17T15:34:09.403Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #10).
2023-02-17T15:34:09.413Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #5).
2023-02-17T15:34:09.447Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #6).
2023-02-17T15:34:09.474Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #6).
2023-02-17T15:34:09.523Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #7).
2023-02-17T15:34:09.559Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #7).
2023-02-17T15:34:09.664Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #8).
2023-02-17T15:34:09.707Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #8).

at java.lang.Thread.run(Unknown Source) [?:?]
2023-02-17T15:34:09.932Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #9).
2023-02-17T15:34:09.985Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #9).
2023-02-17T15:34:10.343Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #11).
2023-02-17T15:34:10.450Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #11).
2023-02-17T15:34:10.451Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #10).
2023-02-17T15:34:10.508Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #10).
2023-02-17T15:34:11.490Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #11).
2023-02-17T15:34:11.553Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #11).
2023-02-17T15:34:12.394Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #12).
2023-02-17T15:34:12.532Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #12).
2023-02-17T15:34:13.550Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #12).
2023-02-17T15:34:13.621Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #12).
2023-02-17T15:34:16.492Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #13).
2023-02-17T15:34:16.659Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #13).
2023-02-17T15:34:17.666Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #13).
2023-02-17T15:34:17.751Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #13).
2023-02-17T15:34:21.627Z ERROR [ClusterAdapterOS2] An error occurred:  (Connection refused)
2023-02-17T15:34:21.627Z INFO  [IndexerClusterCheckerThread] Indexer not fully initialized yet. Skipping periodic cluster check.
2023-02-17T15:34:21.644Z ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /127.0.0.1:9200. - Connection refused.
2023-02-17T15:34:21.644Z INFO  [VersionProbe] Elasticsearch is not available. Retry #1
2023-02-17T15:34:24.690Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #14).
2023-02-17T15:34:24.865Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #14).
2023-02-17T15:34:25.873Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #14).
2023-02-17T15:34:25.972Z ERROR [Messages] Caught exception during bulk indexing: ElasticsearchException{message=OpenSearchException[An error occurred: ]; nested: ConnectException[Connection refused]; nested: ConnectException[Connection refused];, errorDetails=[]}, retrying (attempt #14).
2023-02-17T15:34:26.645Z ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /127.0.0.1:9200. - Connection refused.
2023-02-17T15:34:26.646Z INFO  [VersionProbe] Elasticsearch is not available. Retry #2
2023-02-17T15:34:31.647Z ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /127.0.0.1:9200. - Connection refused.
2023-02-17T15:34:31.647Z INFO  [VersionProbe] Elasticsearch is not available. Retry #3
2023-02-17T15:34:36.648Z ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /127.0.0.1:9200. - Connection refused.

2023-02-19T03:00:43.613Z ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /172.0.0.1:9200. - Connection refused.
2023-02-19T03:00:43.613Z INFO  [VersionProbe] Elasticsearch is not available. Retry #178


/var/log/opensearch/graylog.log

[2023-02-19T00:03:52,047][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:08:55,034][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:13:55,037][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:18:55,040][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:23:55,042][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:28:55,044][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:33:55,047][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:38:55,050][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:43:55,052][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:48:55,056][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:53:55,058][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T00:58:50,331][INFO ][o.o.a.t.CronTransportAction] [graylogopen] Start running AD hourly cron.
[2023-02-19T00:58:50,337][INFO ][o.o.a.t.ADTaskManager    ] [graylogopen] Start to maintain running historical tasks
[2023-02-19T00:58:50,338][INFO ][o.o.a.c.HourlyCron       ] [graylogopen] Hourly maintenance succeeds
[2023-02-19T00:58:50,544][INFO ][o.o.i.i.IndexStateManagementHistory] [graylogopen] No Old History Indices to delete
[2023-02-19T00:58:55,063][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:03:55,064][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:08:55,067][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:13:55,069][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:18:55,073][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:23:55,075][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:28:55,078][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:33:55,080][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:38:55,083][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:43:55,094][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:48:55,100][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:53:55,107][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T01:58:50,332][INFO ][o.o.a.t.CronTransportAction] [graylogopen] Start running AD hourly cron.
[2023-02-19T01:58:50,336][INFO ][o.o.a.t.ADTaskManager    ] [graylogopen] Start to maintain running historical tasks
[2023-02-19T01:58:50,340][INFO ][o.o.a.c.HourlyCron       ] [graylogopen] Hourly maintenance succeeds
[2023-02-19T01:58:55,115][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:03:55,116][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:08:55,119][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:13:55,122][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:18:55,125][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:23:55,133][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:28:55,138][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:33:55,148][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:38:55,154][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T02:43:55,162][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T04:17:38,520][INFO ][o.o.n.Node               ] [graylogopen] version[2.0.1], pid[206], build[tar/6462a546240f6d7a158519499729bce12dc1058b/2022-06-15T08:47:42.243126494Z], OS[Linux/5.19.0-1-amd64/amd64], JVM[Eclipse Adoptium/OpenJDK 64-Bit Server VM/17.0.3/17.0.3+7]
[2023-02-19T04:17:38,523][INFO ][o.o.n.Node               ] [graylogopen] JVM home [/graylog/opensearch/jdk], using bundled JDK [true]
[2023-02-19T04:17:38,524][INFO ][o.o.n.Node               ] [graylogopen] JVM arguments [-Xshare:auto, -Dopensearch.networkaddress.cache.ttl=60, -Dopensearch.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -XX:+ShowCodeDetailsInExceptionMessages, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.locale.providers=SPI,COMPAT, -Xms24g, -Xmx24g, -XX:+UseG1GC, -XX:G1ReservePercent=25, -XX:InitiatingHeapOccupancyPercent=30, -Djava.io.tmpdir=/tmp/opensearch-16420459775048320987, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -XX:MaxDirectMemorySize=12884901888, -Dopensearch.path.home=/graylog/opensearch, -Dopensearch.path.conf=/graylog/opensearch/config, -Dopensearch.distribution.type=tar, -Dopensearch.bundled_jdk=true]
[2023-02-19T04:17:40,709][WARN ][stderr                   ] [graylogopen] SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
[2023-02-19T04:17:40,710][WARN ][stderr                   ] [graylogopen] SLF4J: Defaulting to no-operation (NOP) logger implementation
[2023-02-19T04:17:40,710][WARN ][stderr                   ] [graylogopen] SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[2023-02-19T04:17:40,721][INFO ][o.o.s.s.t.SSLConfig      ] [graylogopen] SSL dual mode is disabled
[2023-02-19T04:17:40,722][WARN ][o.o.s.OpenSearchSecurityPlugin] [graylogopen] OpenSearch Security plugin installed but disabled. This can expose your configuration (including passwords) to the public.
[2023-02-19T04:17:42,827][INFO ][o.o.p.c.PluginSettings   ] [graylogopen] Trying to create directory /dev/shm/performanceanalyzer/.
[2023-02-19T04:17:42,828][INFO ][o.o.p.c.PluginSettings   ] [graylogopen] Config: metricsLocation: /dev/shm/performanceanalyzer/, metricsDeletionInterval: 1, httpsEnabled: false, cleanup-metrics-db-files: true, batch-metrics-retention-period-minutes: 7, rpc-port: 9650, webservice-port 9600
[2023-02-19T04:17:43,762][INFO ][o.o.i.r.ReindexPlugin    ] [graylogopen] ReindexPlugin reloadSPI called
[2023-02-19T04:17:43,764][INFO ][o.o.i.r.ReindexPlugin    ] [graylogopen] Unable to find any implementation for RemoteReindexExtension
[2023-02-19T04:17:43,872][INFO ][o.o.j.JobSchedulerPlugin ] [graylogopen] Loaded scheduler extension: reports-scheduler, index: .opendistro-reports-definitions
[2023-02-19T04:17:43,877][INFO ][o.o.j.JobSchedulerPlugin ] [graylogopen] Loaded scheduler extension: opendistro_anomaly_detector, index: .opendistro-anomaly-detector-jobs
[2023-02-19T04:17:43,878][INFO ][o.o.j.JobSchedulerPlugin ] [graylogopen] Loaded scheduler extension: opendistro-index-management, index: .opendistro-ism-config
[2023-02-19T04:17:43,906][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [aggs-matrix-stats]
[2023-02-19T04:17:43,907][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [analysis-common]
[2023-02-19T04:17:43,907][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [geo]
[2023-02-19T04:17:43,907][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [ingest-common]
[2023-02-19T04:17:43,908][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [ingest-geoip]
[2023-02-19T04:17:43,908][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [ingest-user-agent]
[2023-02-19T04:17:43,908][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [lang-expression]
[2023-02-19T04:17:43,908][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [lang-mustache]
[2023-02-19T04:17:43,908][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [lang-painless]
[2023-02-19T04:17:43,908][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [mapper-extras]
[2023-02-19T04:17:43,909][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [opensearch-dashboards]
[2023-02-19T04:17:43,909][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [parent-join]
[2023-02-19T04:17:43,909][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [percolator]
[2023-02-19T04:17:43,909][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [rank-eval]
[2023-02-19T04:17:43,909][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [reindex]
[2023-02-19T04:17:43,910][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [repository-url]
[2023-02-19T04:17:43,910][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded module [transport-netty4]
[2023-02-19T04:17:43,910][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-alerting]
[2023-02-19T04:17:43,911][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-anomaly-detection]
[2023-02-19T04:17:43,911][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-asynchronous-search]
[2023-02-19T04:17:43,911][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-cross-cluster-replication]
[2023-02-19T04:17:43,911][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-index-management]
[2023-02-19T04:17:43,911][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-job-scheduler]
[2023-02-19T04:17:43,912][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-knn]
[2023-02-19T04:17:43,912][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-ml]
[2023-02-19T04:17:43,912][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-notifications]
[2023-02-19T04:17:43,912][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-notifications-core]
[2023-02-19T04:17:43,912][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-observability]
[2023-02-19T04:17:43,913][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-performance-analyzer]
[2023-02-19T04:17:43,913][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-reports-scheduler]
[2023-02-19T04:17:43,913][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-security]
[2023-02-19T04:17:43,913][INFO ][o.o.p.PluginsService     ] [graylogopen] loaded plugin [opensearch-sql]
[2023-02-19T04:17:43,946][INFO ][o.o.e.NodeEnvironment    ] [graylogopen] using [1] data paths, mounts [[/ (Dauntless_zfs/containers/Graylog-Open)]], net usable_space [831.2gb], net total_space [852.4gb], types [zfs]
[2023-02-19T04:17:43,946][INFO ][o.o.e.NodeEnvironment    ] [graylogopen] heap size [24gb], compressed ordinary object pointers [true]
[2023-02-19T04:17:44,150][INFO ][o.o.n.Node               ] [graylogopen] node name [graylogopen], node ID [LtIiAaGeRY6dy9Q3bzhZrA], cluster name [graylog], roles [cluster_manager, remote_cluster_client, data, ingest]
[2023-02-19T04:17:47,396][INFO ][o.o.a.b.ADCircuitBreakerService] [graylogopen] Registered memory breaker.
[2023-02-19T04:17:47,713][INFO ][o.o.m.c.b.MLCircuitBreakerService] [graylogopen] Registered ML memory breaker.
[2023-02-19T04:17:48,125][INFO ][o.o.t.NettyAllocator     ] [graylogopen] creating NettyAllocator with the following configs: [name=opensearch_configured, chunk_size=1mb, suggested_max_allocation_size=1mb, factors={opensearch.unsafe.use_netty_default_chunk_and_page_size=false, g1gc_enabled=true, g1gc_region_size=16mb}]
[2023-02-19T04:17:48,176][INFO ][o.o.d.DiscoveryModule    ] [graylogopen] using discovery type [single-node] and seed hosts providers [settings]
[2023-02-19T04:17:48,478][WARN ][o.o.g.DanglingIndicesState] [graylogopen] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2023-02-19T04:17:48,773][INFO ][o.o.p.h.c.PerformanceAnalyzerConfigAction] [graylogopen] PerformanceAnalyzer Enabled: false
[2023-02-19T04:17:48,836][INFO ][o.o.n.Node               ] [graylogopen] initialized
[2023-02-19T04:17:48,836][INFO ][o.o.n.Node               ] [graylogopen] starting ...
[2023-02-19T04:17:48,961][INFO ][o.o.t.TransportService   ] [graylogopen] publish_address {192.168.128.253:9300}, bound_addresses {192.168.128.253:9300}
[2023-02-19T04:17:49,290][INFO ][o.o.c.c.Coordinator      ] [graylogopen] cluster UUID [ao2IvgcKTlOMX516DzIbfw]
[2023-02-19T04:17:49,444][INFO ][o.o.c.s.MasterService    ] [graylogopen] elected-as-cluster-manager ([1] nodes joined)[{graylogopen}{LtIiAaGeRY6dy9Q3bzhZrA}{bgWUisw6TE6cT05IB5sByQ}{192.168.128.253}{192.168.128.253:9300}{dimr}{shard_indexing_pressure_enabled=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 27, version: 608, delta: cluster-manager node changed {previous [], current [{graylogopen}{LtIiAaGeRY6dy9Q3bzhZrA}{bgWUisw6TE6cT05IB5sByQ}{192.168.128.253}{192.168.128.253:9300}{dimr}{shard_indexing_pressure_enabled=true}]}
[2023-02-19T04:17:49,827][INFO ][o.o.c.s.ClusterApplierService] [graylogopen] cluster-manager node changed {previous [], current [{graylogopen}{LtIiAaGeRY6dy9Q3bzhZrA}{bgWUisw6TE6cT05IB5sByQ}{192.168.128.253}{192.168.128.253:9300}{dimr}{shard_indexing_pressure_enabled=true}]}, term: 27, version: 608, reason: Publication{term=27, version=608}
[2023-02-19T04:17:49,840][INFO ][o.o.a.c.ADClusterEventListener] [graylogopen] Cluster is not recovered yet.
[2023-02-19T04:17:49,849][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:17:49,889][INFO ][o.o.i.i.ManagedIndexCoordinator] [graylogopen] Cache cluster manager node onClusterManager time: 1676780269889
[2023-02-19T04:17:49,897][WARN ][o.o.p.c.s.h.ConfigOverridesClusterSettingHandler] [graylogopen] Config override setting update called with empty string. Ignoring.
[2023-02-19T04:17:49,936][INFO ][o.o.h.AbstractHttpServerTransport] [graylogopen] publish_address {192.168.128.253:9200}, bound_addresses {192.168.128.253:9200}
[2023-02-19T04:17:49,937][INFO ][o.o.n.Node               ] [graylogopen] started
[2023-02-19T04:17:49,937][INFO ][o.o.s.OpenSearchSecurityPlugin] [graylogopen] Node started
[2023-02-19T04:17:49,939][INFO ][o.o.s.OpenSearchSecurityPlugin] [graylogopen] 0 OpenSearch Security modules loaded so far: []
[2023-02-19T04:17:51,539][INFO ][o.o.c.s.ClusterSettings  ] [graylogopen] updating [plugins.index_state_management.template_migration.control] from [0] to [-1]
[2023-02-19T04:17:51,542][INFO ][o.o.a.c.HashRing         ] [graylogopen] Node added: [LtIiAaGeRY6dy9Q3bzhZrA]
[2023-02-19T04:17:51,545][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:17:51,547][INFO ][o.o.a.c.HashRing         ] [graylogopen] Add data node to AD version hash ring: LtIiAaGeRY6dy9Q3bzhZrA
[2023-02-19T04:17:51,552][INFO ][o.o.a.c.HashRing         ] [graylogopen] All nodes with known AD version: {LtIiAaGeRY6dy9Q3bzhZrA=ADNodeInfo{version=2.0.1, isEligibleDataNode=true}}
[2023-02-19T04:17:51,553][INFO ][o.o.a.c.HashRing         ] [graylogopen] Rebuild AD hash ring for realtime AD with cooldown, nodeChangeEvents size 0
[2023-02-19T04:17:51,553][INFO ][o.o.a.c.HashRing         ] [graylogopen] Build AD version hash ring successfully
[2023-02-19T04:17:51,555][INFO ][o.o.a.c.ADDataMigrator   ] [graylogopen] Start migrating AD data
[2023-02-19T04:17:51,556][INFO ][o.o.a.c.ADDataMigrator   ] [graylogopen] AD job index doesn't exist, no need to migrate
[2023-02-19T04:17:51,557][INFO ][o.o.a.c.ADClusterEventListener] [graylogopen] Init AD version hash ring successfully
[2023-02-19T04:17:51,569][INFO ][o.o.g.GatewayService     ] [graylogopen] recovered [5] indices into cluster_state
[2023-02-19T04:17:53,022][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:00,158][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:00,375][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:00,542][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:01,437][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:12,030][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:15,215][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:19,984][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:21,897][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:24,187][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:26,364][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:28,049][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:31,110][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:34,167][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:38,021][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:40,388][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:43,821][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:46,135][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:47,591][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:49,892][INFO ][o.o.i.i.ManagedIndexCoordinator] [graylogopen] Performing move cluster state metadata.
[2023-02-19T04:18:49,893][INFO ][o.o.i.i.MetadataService  ] [graylogopen] ISM config index not exist, so we cancel the metadata migration job.
[2023-02-19T04:18:50,247][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:18:54,576][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:19:02,663][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:19:03,818][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:19:49,892][INFO ][o.o.i.i.ManagedIndexCoordinator] [graylogopen] Cancel background move metadata process.
[2023-02-19T04:19:49,893][INFO ][o.o.i.i.ManagedIndexCoordinator] [graylogopen] Performing move cluster state metadata.
[2023-02-19T04:19:49,894][INFO ][o.o.i.i.MetadataService  ] [graylogopen] Move metadata has finished.
[2023-02-19T04:20:36,910][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:20:41,125][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:22:49,293][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T04:22:59,452][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:23:07,442][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:23:17,016][INFO ][o.o.c.r.a.AllocationService] [graylogopen] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[graylog_0][0]]]).
[2023-02-19T04:23:20,264][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylogopen] Detected cluster change event for destination migration
[2023-02-19T04:24:09,774][WARN ][o.o.c.InternalClusterInfoService] [graylogopen] Failed to update node information for ClusterInfoUpdateJob within 15s timeout
[2023-02-19T04:24:11,970][WARN ][o.o.t.TransportService   ] [graylogopen] Received response for a request that has timed out, sent [17207ms] ago, timed out [2201ms] ago, action [cluster:monitor/nodes/stats[n]], node [{graylogopen}{LtIiAaGeRY6dy9Q3bzhZrA}{bgWUisw6TE6cT05IB5sByQ}{192.168.128.253}{192.168.128.253:9300}{dimr}{shard_indexing_pressure_enabled=true}], id [3195]
[2023-02-19T04:27:49,295][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T04:32:49,296][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep
[2023-02-19T04:37:49,297][INFO ][o.o.j.s.JobSweeper       ] [graylogopen] Running full sweep

Note: 192.168.128.253 is the local private IP of the Graylog server. I get the same results whether I set it to 127.0.0.1 or 192.168.128.253.
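
A quick way to confirm which address OpenSearch is actually listening on (a sketch for anyone reading along; the two addresses below are just the candidates discussed in this thread):

ss -tlnp | grep 9200                    # show the listening socket and the owning process
curl -s http://127.0.0.1:9200/          # only answers if network.host is 127.0.0.1
curl -s http://192.168.128.253:9200/    # only answers if network.host is the LAN IP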

3. What steps have you already taken to try and solve the problem?

Several days of Google-fu: making changes to configs, increasing index limits, etc.

4. How can the community help?

Hopefully a solution?

Thank you for everything!

Woke up to this (only one widget failure, aggregate count)

However:

The data for these widgets is obtained via this community plugin:

I’m ripping my hair out

UPDATE
It's been about 24 hours since my last major change. I had set the heap in /etc/default/graylog-server to half my available memory, per the site instructions. Since my earlier Elasticsearch-backed Graylog instance never hit these issues with much less memory defined in /etc/default/graylog-server, I backed it down to -Xms16g -Xmx16g on my current build. So far so good.
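
For reference, the heap line I changed looks roughly like this (a sketch; /etc/default/graylog-server ships other JVM flags too, which I left alone, so only the -Xms/-Xmx values are the point here):

GRAYLOG_SERVER_JAVA_OPTS="-Xms16g -Xmx16g"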

Hey @accidentaladmin
The only thing that looks strange in your configuration is this.

Your opensearch.yml

# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1
#

Graylog config

# Default: http://127.0.0.1:9200
elasticsearch_hosts = http://192.168.128.253:9200

Then you have…

# Default: 127.0.0.1:9000
http_bind_address = 127.0.0.1:9000

With

# Default: http://$http_bind_address/
http_publish_uri = http://192.168.128.253

Suggestion: if you want to use your local IP address, then this should work.

opensearch.yml

[root@graylog graylog-server]# cat /etc/opensearch/opensearch.yml  | egrep -v "^\s*(#|$)"
cluster.name: graylog
path.data: /var/lib/opensearch
path.logs: /var/log/opensearch
network.host: 192.168.128.253
http.port: 9200
action.auto_create_index: false
discovery.type: single-node
bootstrap.memory_lock: true
plugins.security.disabled: true
plugins.security.system_indices.enabled: false
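
One caveat on that sketch: bootstrap.memory_lock: true only helps if the process is actually allowed to lock memory. With a systemd-managed OpenSearch (assuming the unit is named opensearch), that usually means raising the memlock limit, e.g. via systemctl edit opensearch:

[Service]
LimitMEMLOCK=infinity

Otherwise the node can't lock its heap and will log a memory-lock warning at startup.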

graylog config settings

[root@graylog graylog-server]# cat /etc/graylog/server/server.conf  | egrep -v "^\s*(#|$)"
is_leader = true
node_id_file = /etc/graylog/server/node-id
password_secret = Holy Secrets Batman!
root_password_sha2 = yes, robin.  Secrets.
root_email = "greg.smith@domain.com"
root_timezone = America/Chicago
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
elasticsearch_hosts = http://192.168.128.253:9200
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0

Next, shut down the graylog-server service and restart OpenSearch. Once OpenSearch is back up, verify that the cluster is green:

curl -X GET 192.168.128.253:9200/_cluster/health?pretty

The output should look something like this:

{
  "cluster_name" : "graylog",
  "status" : "green", <--------GREEN
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "discovered_master" : true,
  "active_primary_shards" : 408,
  "active_shards" : 408,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Now start the graylog-server service back up.
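
Put together, the restart order above is something like this (assuming the systemd units are named opensearch and graylog-server, as in the package installs):

sudo systemctl stop graylog-server
sudo systemctl restart opensearch
curl -X GET 192.168.128.253:9200/_cluster/health?pretty    # wait until "status" : "green"
sudo systemctl start graylog-server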

I'm not sure about your proxy setup, but what I suggested should work without needing this setting:

# Comma separated list of trusted proxies that are allowed to set the client address with X-Forwarded-For
# header. May be subnets, or hosts.
trusted_proxies = 127.0.0.1/32, 192.168.128.0/24

Thank you, I appreciate the response! Unfortunately, what you noticed is due to my own sloppiness in pasting. When I attempted 127.0.0.1, it was 127.0.0.1 across the board, and when I attempted 192.168.128.253, it was 192.168.128.253; they were never mixed and matched (except in the confusing trail of evidence I left here on the site).

I did try something new that seems to be working, though:

Hey @accidentaladmin

Unless you're using the 127.0.0.1 bind address to log on to the Graylog web UI, those other settings seem like more configuration than is needed. Simple is good.

History repeats itself :laughing: If you have it working, awesome :+1:


Unless you're using the 127.0.0.1 bind address to log on to the Graylog web UI, those other settings seem like more configuration than is needed. Simple is good.

For sure! The problem is, I picked up so much “rubber-chicken” ceremony from my first go at Graylog that it's hard to separate what actually fixed a problem from what merely coincided, quite accidentally, with a fix.

I'm hoping the 24 GB to 16 GB heap reduction is not one of those rubber-chicken moments! haha

As always, thank you for your assistance!


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.