Dockerized Graylog-Datanode Memory Consumption

1. Describe your incident:

Since we upgraded to datanode/opensearch, memory consumption has been rising endlessly.
It looks like the settings for the opensearch heap are not used correctly.

2. Describe your environment:

  • OS Information: Debian 12

  • Package Version: Docker Compose

    • graylog:7.0.3
    • graylog-datanode:7.0.3
  • Service logs, configurations, and environment variables:

    • datanode-environment (excerpt from compose file):
      GRAYLOG_DATANODE_OPENSEARCH_HEAP: "3g"
      GRAYLOG_DATANODE_JAVA_OPTS: "-Xms1g -Xmx1g"
      GRAYLOG_DATANODE_OPENSEARCH_JAVA_OPTS: "-XX:MaxDirectMemorySize=1g"
    • jps -lv (excerpt from within the datanode container):
      org.opensearch.bootstrap.OpenSearch -Xshare:auto -Dopensearch.networkaddress.cache.ttl=60 -Dopensearch.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -XX:+ShowCodeDetailsInExceptionMessages -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.security.manager=allow -Djava.locale.providers=SPI,COMPAT -Xms1g -Xmx1g -XX:+UseG1GC -XX:G1ReservePercent=25 -XX:InitiatingHeapOccupancyPercent=30 -Djava.io.tmpdir=/tmp/opensearch-10431750803038713797 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=/tmp/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=/tmp/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Djava.security.manager=allow -Djava.security.policy=file:///var/lib/graylog-datanode/opensearch/config/opensearch2907361363010041971/opensearch.policy -Xms3g -Xmx
    • docker stats (excerpt):
      CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM %
      aaabbbcccdd datanode 18.62% 12.9GiB / 33.2GiB 38.85%

4. How can the community help?

As I understand it, the value for Xmx should also be 3g, and the memory usage of the datanode container shouldn't exceed a single-digit number of GiB.

In addition, the option MaxDirectMemorySize is not used at all.

Can anyone tell me what I am doing wrong?

Hi @dlehnen ,

Regarding Xmx, I believe the output of your command is just cropped there. The value is always set, and always to the same number as Xms: both get 3g, controlled by your GRAYLOG_DATANODE_OPENSEARCH_HEAP env var. This should be fine, and it is what OpenSearch recommends.
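
For what it's worth, the jps excerpt contains two pairs of heap flags: -Xms1g -Xmx1g, and later -Xms3g followed by the cropped -Xmx. As far as I know, HotSpot uses the last occurrence when a flag is repeated, so the later pair should be the effective one. Below is a minimal Python sketch of that check, assuming last-one-wins. The function name effective_heap_flags and the shortened sample line (including its final -Xmx3g) are made up for illustration and are not taken from the real output.

```python
import re

# Matches -Xms<size> / -Xmx<size>, e.g. -Xmx3g or -Xms512m.
# A bare "-Xmx" without a value (as in a cropped paste) does not match.
HEAP_FLAG = re.compile(r"^-(Xms|Xmx)(\d+[kKmMgG]?)$")


def effective_heap_flags(command_line: str) -> dict:
    """Return the last -Xms/-Xmx value found on a JVM command line."""
    effective = {}
    for token in command_line.split():
        match = HEAP_FLAG.match(token)
        if match:
            # Assumption: a later occurrence overrides an earlier one.
            effective[match.group(1)] = match.group(2)
    return effective


if __name__ == "__main__":
    # Hypothetical, shortened command line; the trailing -Xmx3g is assumed.
    sample = (
        "org.opensearch.bootstrap.OpenSearch "
        "-Xms1g -Xmx1g -XX:+UseG1GC -Xms3g -Xmx3g"
    )
    print(effective_heap_flags(sample))
```

Feeding it the full, uncropped line from `jps -lv` should show whether both values really end up at 3g.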

MaxDirectMemorySize is currently not configurable.

Regarding the actual memory consumption, I hope one of my colleagues can guide you and help with debugging.

Hi @Tdvorak ,

Thank you very much for your quick response.
I will remove MaxDirectMemorySize from my compose file then.

Hope that someone else has an idea about the memory issue.

Hi @dlehnen ,

Does the container's memory usage ever reach its limit, and/or does the container crash with an out-of-memory error?

I am asking because the memory usage reported by docker stats can include the file cache (page cache), which is reclaimable, so it might not be a real problem.
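
One way to see how much of the reported usage is page cache is to look at the container's cgroup memory.stat file. Here is a rough Python sketch, assuming cgroup v2 (the default on Debian 12), where the file is usually found at /sys/fs/cgroup/memory.stat inside the container and has `anon` and `file` counters in bytes. The helper names are hypothetical, and the script can also be pointed at a copy of the file if Python is not available in the container.

```python
import sys
from pathlib import Path

GIB = 1024 ** 3


def parse_memory_stat(text: str) -> dict:
    """Parse 'key value' lines from a cgroup memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def summarize(stats: dict) -> dict:
    """Split usage into anonymous memory (heap etc.) and file/page cache, in GiB."""
    return {
        "anon_gib": stats.get("anon", 0) / GIB,
        "file_gib": stats.get("file", 0) / GIB,
    }


if __name__ == "__main__":
    # Assumed cgroup v2 location; pass another path as the first argument.
    path = Path(sys.argv[1] if len(sys.argv) > 1 else "/sys/fs/cgroup/memory.stat")
    if path.exists():
        summary = summarize(parse_memory_stat(path.read_text()))
        print(f"anon: {summary['anon_gib']:.2f} GiB, file cache: {summary['file_gib']:.2f} GiB")
    else:
        print(f"{path} not found (cgroup v1 uses a different layout)")
```

If `file` accounts for most of the usage while `anon` stays near the configured heap plus some overhead, the growth is likely cache rather than a leak.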