1. Describe your incident:
I completed a fresh install of Graylog 6.1.1 Enterprise on Ubuntu 22.04 LTS (VM) with MongoDB 7. I completed the pre-flight data node setup and logged into Graylog with my admin password. I am greeted with an “Elasticsearch nodes disk usage above low watermark” error in the overview.
2. Describe your environment:
OS Information:
Ubuntu 22.04 LTS (VM)
Package Version:
Enterprise 6.1.1
Service logs, configurations, and environment variables:
3. What steps have you already taken to try and solve the problem?
The OpenSearch cluster status is: “datanode-cluster is green. Shards: 13 active, 0 initializing, 0 relocating, 0 unassigned”. Clearing the error only results in it coming back a few minutes later. There are no inputs configured yet. Output of a df command shows roughly 70% of the disk free:
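For anyone verifying the same thing locally, here is a minimal sketch of the disk-usage check, assuming the data node stores its indices on the root filesystem (adjust the path if yours is mounted elsewhere):

```python
import shutil

# Check disk usage against the default 85% "low" watermark.
# "/" is an assumption; point it at the filesystem backing the data node.
usage = shutil.disk_usage("/")
pct_used = usage.used / usage.total * 100
print(f"{pct_used:.1f}% used")
# With ~70% free (i.e., ~30% used), this is well below the 85% low watermark.
```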
Thanks for the suggestions and for pointing me to the right way to query the data node API.
In case anyone reads this thread later, the correct URL for the data node health API call is (replace hostname:hostport with the correct values):
http://hostname:hostport/api/datanodes/any/opensearch/_cluster/health?pretty=true
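As a sketch of filling in the placeholders (the host and port below are examples, not from this thread; 9000 is the common Graylog HTTP API port, but yours may differ):

```python
from urllib.parse import urlsplit

# Hypothetical values: substitute your own Graylog host and API port.
host, port = "hostname", 9000
url = f"http://{host}:{port}/api/datanodes/any/opensearch/_cluster/health?pretty=true"

# Fetching it, e.g. with urllib.request.urlopen(url) or curl -s "$url",
# returns the OpenSearch cluster health as JSON.
parts = urlsplit(url)
print(parts.netloc, parts.path)
```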
I monitored both the data node logs and the graylog-server logs. I was not able to find anything anomalous-looking in the data node logs. I can see the following error in the graylog-server logs:
2024-10-29T18:38:26.621Z WARN [IndexerClusterCheckerThread] Elasticsearch node [127.0.1.1] triggered [ES_NODE_DISK_WATERMARK_LOW] due to low free disk space
I don’t know much about the data node architecture, but it seems odd that 127.0.1.1 (rather than 127.0.0.1) would be the address generating the error.
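As an aside on why 127.0.1.1 appears at all: Debian/Ubuntu installers conventionally map the machine’s own hostname to 127.0.1.1 in /etc/hosts (with localhost on 127.0.0.1), so a service that resolves the local hostname can end up reporting 127.0.1.1. A small illustration with a sample hosts file (the hostname graylog-vm is made up):

```python
# Sample of the /etc/hosts layout Ubuntu installers typically generate;
# "graylog-vm" is a made-up hostname for illustration.
sample_hosts = """
127.0.0.1 localhost
127.0.1.1 graylog-vm
"""

mapping = {}
for line in sample_hosts.splitlines():
    parts = line.split()
    if len(parts) >= 2 and not parts[0].startswith("#"):
        mapping[parts[0]] = parts[1]

print(mapping)
# Resolving "graylog-vm" via this file yields 127.0.1.1, not 127.0.0.1.
```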
Given that the data node status, health check, and logs all look clean, this feels like a bug in graylog-server (i.e., nothing is actually wrong with the data node and the error is being displayed incorrectly).
Wine_Merchant I compared the watermark settings between my production VM (20 GB disk with OpenSearch) and the clean-install VM (20 GB disk with data node), and the settings are the same:
"disk": {
  "threshold_enabled": "true",
  "watermark": {
    "flood_stage": "95%",
    "high": "90%",
    "low": "85%",
    "enable_for_single_data_node": "false"
  }
}
I only have a single data node in both installs. I assume that since enable_for_single_data_node is false, there should not be any watermark errors displayed by Graylog.
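For reference, here is a sketch of the percentage comparison a watermark check presumably makes, using the thresholds quoted above and the ~30% usage implied by the earlier df output (the helper function is illustrative, not Graylog’s actual code):

```python
# Watermark thresholds quoted from the cluster settings above.
watermark = {"flood_stage": "95%", "high": "90%", "low": "85%"}

def breached(pct_used: float, threshold: str) -> bool:
    """True if disk usage meets or exceeds a threshold like '85%'."""
    return pct_used >= float(threshold.rstrip("%"))

pct_used = 30.0  # ~70% free, as reported by df earlier in the thread
for name, value in watermark.items():
    print(name, breached(pct_used, value))
# At 30% used, none of the watermarks should trip, so the warning looks spurious.
```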