65536 messages in process buffer, 100% utilized

Hi everyone,
I am using the Graylog open version to receive syslog (mostly from network devices such as FortiGate firewalls and RADIUS servers), and I am facing a problem: as soon as the FortiGate logs reach the Graylog server, the process and output buffers hit 100%, and soon the alert "Journal utilization is too high" appears.
Graylog is installed with docker-compose.
The images are graylog/graylog:4.3, mongo:4.2, and docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2.
The YAML file was created from the official sample.
I have tried some methods from this community, but they did not work for me.
Can anyone help me with this problem? If you have any idea or solution, please leave a comment; it may help me in some way. Thanks a lot!

The input traffic looks like this:

Hello,

Could you show your compose.yaml file? It might help.
Normally, when the buffers are getting full it can mean a couple of things:

1. A bad regex, Grok pattern, or pipeline.
2. Not enough resources for your buffers. These are set in your graylog.conf file; you can use "locate graylog.conf" to find where it is (see the sketch after this list).
3. Graylog cannot connect to Elasticsearch to index the messages sitting in the journal.
4. Your JVM heap is too low.
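
For point 2, here is a minimal sketch of the buffer-related settings in graylog.conf that usually matter; the numbers are only examples, not recommendations, so size them to your CPU count and ingest rate:

# graylog.conf -- example values only
processbuffer_processors = 5
outputbuffer_processors = 3
output_batch_size = 1000
# In the official Docker image the same settings can be passed as environment
# variables (e.g. GRAYLOG_PROCESSBUFFER_PROCESSORS=5), and the Graylog JVM heap
# via GRAYLOG_SERVER_JAVA_OPTS.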

Just a suggestion, if you have the resources.

If you currently have this:

- "ES_JAVA_OPTS=-Xms512m -Xmx512m"

then try this:

- "ES_JAVA_OPTS=-Xms2048m -Xmx2048m"

NOTE: Confirm that Elasticsearch is "green" and that Graylog and Elasticsearch are connected. When the journal fills up, Elasticsearch may have gone into read-only mode. Your best bet is to pause the input (or stop the logs coming in) before more issues arise, and dig through the log files. In your case:

root # docker ps

root # docker logs -f <container_id>
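
If you also want to check the Elasticsearch side directly, a couple of hedged curl examples (adjust host and port to your setup; if 9200 is not published on the host, run them inside the Elasticsearch container with docker exec, assuming curl is available there):

curl -s 'http://localhost:9200/_cluster/health?pretty'
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,heap.max,heap.percent'
# or, from the Docker host, inside the container:
docker exec -it <es_container_id> curl -s 'http://localhost:9200/_cluster/health?pretty'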

To help you further, please take a look here:

Thanks for your reply!
Here is my current docker-compose.yaml. I have changed "ES_JAVA_OPTS=-Xms512m -Xmx512m" to "ES_JAVA_OPTS=-Xms2048m -Xmx2048m", but it does not seem to work; the process and output buffers are still high. Could you please give me another suggestion?

version: '3'
services:
    mongo:
      image: mongo:4.2
      container_name: graylog_mongo
      restart: unless-stopped
      networks:
        - graylog
      volumes:
        - ./mongo_data:/data/db
    elasticsearch:
      image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2
      container_name: graylog_elasticsearch
      restart: unless-stopped
      volumes:
        - ./es_data:/usr/share/elasticsearch/data
      environment:
        - http.host=0.0.0.0
        - transport.host=localhost
        - network.host=0.0.0.0
        - "ES_JAVA_OPTS=-Dlog4j2.formatMsgNoLookups=true -Xms2048m -Xmx2048m"
      ulimits:
        memlock:
          soft: -1
          hard: -1
      deploy:
        resources:
          limits:
            memory: 16g
      networks:
        - graylog
    graylog:
      image: graylog/graylog:4.3
      container_name: graylog_graylog
      volumes:
        - ./graylog_data_journal:/usr/share/graylog/data/journal
        # - ./graylog.conf:/usr/share/graylog/data/config/graylog.conf
        # - /etc/localtime:/etc/localtime:ro
      environment:
        # CHANGE ME (must be at least 16 characters)!
        - GRAYLOG_PASSWORD_SECRET=somepasswordpepper
        # Password: admin
        - GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
        - GRAYLOG_HTTP_EXTERNAL_URI=x.x.x.x:9000
        # timezone setting
        - GRAYLOG_TIMEZONE=Asia/Shanghai
        - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
        - TZ=Asia/Shanghai
      entrypoint: /usr/bin/tini -- wait-for-it elasticsearch:9200 --  /docker-entrypoint.sh
      networks:
        - graylog
      restart: always
      depends_on:
        - mongo
        - elasticsearch
      ports:
        # Graylog web interface and REST API
        - 9000:9000
        # Syslog TCP
        - 1514:1514
        # Syslog UDP
        # - 1514:1514/udp
        - 1514-1550:1514-1550/udp
        # GELF TCP
        - 12201:12201
        # GELF UDP
        - 12201:12201/udp
networks:
    graylog:
      driver: bridge

How are you parsing the incoming logs? Extractors? Pipeline? Can you show an example message and exactly how you are parsing it?


Hello @masterdou

The following are some odd and/or missing settings I found. Not sure whether they were supposed to be there or were forgotten.

links:
  - mongodb:mongo
  - elasticsearch

- GRAYLOG_HTTP_BIND_ADDRESS=0.0.0.0:9000

### This setting is kind of high unless you actually have that much memory available.
deploy:
  resources:
    limits:
      memory: 16g

# Container time Zone
      - TZ=America/Chicago

I don’t think you need this VAR. :point_down:

- GRAYLOG_TIMEZONE=Asia/Shanghai

EXAMPLE from my lab Graylog-Docker Enterprise setup. I don't have any issues with it; it does 2 GB a day. This should give you an idea of some troubleshooting steps.

root@ansible:/usr/local/bin# cat docker-compose.yaml
version: '3'
services:
   
  mongodb:
    image: mongo:4.4
    network_mode: bridge
   
    volumes:
      - mongo_data:/data/db
   
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2-amd64
    
    network_mode: bridge
    
    volumes:
      - es_data:/usr/share/elasticsearch/data
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0      
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
   
  graylog:
    image: graylog/graylog-enterprise:4.3.5-jre11
    network_mode: bridge
    dns:
      - 10.10.10.15
      - 10.10.10.16
    # journal and config directories in local NFS share for persistence
    volumes:
      - graylog_bin:/usr/share/graylog/bin
      - graylog_data:/usr/share/graylog/data/config
      - graylog_log:/usr/share/graylog/data/log
      - graylog_plugin:/usr/share/graylog/data/plugin
      - graylog_content:/usr/share/graylog/data/contentpacks
       
    environment:
      # Container time Zone
      - TZ=America/Chicago
      # CHANGE ME (must be at least 16 characters)!
      - GRAYLOG_PASSWORD_SECRET=pJod1TRZAckHmqM2oQPqX1qnLVJS99jN
      # Password: admin
      - GRAYLOG_ROOT_PASSWORD_SHA2=ef92b778bafe771e89
      - GRAYLOG_HTTP_BIND_ADDRESS=0.0.0.0:9000
      - GRAYLOG_HTTP_EXTERNAL_URI=http://192.168.1.28:9000/
      - GRAYLOG_ROOT_TIMEZONE=America/Chicago
      - GRAYLOG_ROOT_EMAIL=greg.smith@enseva.com
      - GRAYLOG_HTTP_PUBLISH_URI=http://192.168.1.28:9000/      
      - GRAYLOG_HTTP_ENABLE_CORS=true
      
    links:
      - mongodb:mongo
      - elasticsearch
    depends_on:
      - mongodb
      - elasticsearch
    ports:
      # Graylog web interface and REST API
      - 9000:9000
      # Syslog TCP
      - 8514:8514
      # Elasticsearch
      - 9200:9200
      - 9300:9300
      # Syslog UDP
      - 8514:8514/udp
      # GELF TCP
      #- 12201:12201
      # GELF UDP
      - 12201:12201/udp
      # Reports
      - 9515:9515
      - 9515:9515/udp
      # beats
      - 5044:5044
      # email
      - 25:25
      - 25:25/udp
      # web
      - 80:80
      - 443:443
      - 21:21
      # Forwarder
      - 13302:13302
      - 13301:13301
      # keycloak
      - 8443:8443
      # packetbeat
      - 5055:5055
      # CEF Messages
      - 5555:5555/udp

volumes:
  mongo_data:
    driver: local
  es_data:
    driver: local
  graylog_journal:
    driver: local
  graylog_bin:
    driver: local
  graylog_data:
    driver: local
  graylog_log:
    driver: local
  graylog_plugin:
    driver: local
  graylog_content:
    driver: local
root@ansible:/usr/local/bin# ^C

This is not meant to be copied and pasted; you may need to adjust these settings to your setup, but it should give you a better understanding.
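
After adapting it, you can validate the file and bring the stack up, for example:

docker-compose config -q && docker-compose up -d
# "config -q" only checks that the YAML is valid; it prints nothing on success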

I did, but you did not show the output from the suggestions I gave. We would need more information to troubleshoot your issue further. Perhaps start with the logs and see if you can find a clue instead of guessing; the logs may lead directly to the issue, just a thought.


There are only 8 extractors for each input, and they use similar regular expressions, like this:


As for pipelines, none are configured.

Thanks for your reply, @gsmith!
In my compose YAML, all three containers use the same bridge network named graylog, so I don't think the links configuration is necessary:
links:
  - mongodb:mongo
  - elasticsearch
As for the other settings, I have changed them per your advice, but unfortunately nothing works.
Here are the three running containers and their logs from when the process buffer utilization is very high.


graylog logs:

elasticsearch logs:

mongodb logs:

And here are some other screenshots I took while the problem was occurring:

Thank you for the added info. I'll look at it closer when I get time away from Graylog GO. Real quick: the last time I saw this it was from this post; to sum it up, Elasticsearch couldn't point the deflector to a new index: Couldn't point deflector to a new index org.graylog2.indexer.ElasticsearchException elasticsearch couldn't point deflector to a new index - #2 by tmacgbay
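
If it turns out to be the same deflector problem, a quick hedged check of where the write alias currently points (adjust host/port, and the graylog* pattern if your index set uses a different prefix):

curl -s 'http://localhost:9200/_cat/aliases/graylog*?v'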

Hello @masterdou

So I went back over this; the only reasons I can think of why this would happen are the following:

  • The journal is corrupt, which would mean deleting the journal and then restarting the service.
  • A bad GROK pattern or regex in an extractor/pipeline.
  • The index template mapping is incorrect.
  • Elasticsearch is not indexing the logs from the journal.
  • The index is in read-only mode.

To know exactly what is going on, tail the container log files after a restart and get the full trace.
How much log data are you ingesting per day? If it is only a few GB, then I would look at Elasticsearch not indexing those logs.

If this is an Elasticsearch issue, perhaps execute some curl commands to find out what's going on, for example:
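
A few hedged examples (adjust host and port to your setup; graylog* is the default index prefix, change it if your index sets use a different one):

curl -s 'http://localhost:9200/_cat/indices/graylog*?v&h=index,health,status,docs.count,store.size'
curl -s 'http://localhost:9200/_cat/thread_pool/write?v&h=name,active,queue,rejected'
curl -s 'http://localhost:9200/_all/_settings/index.blocks.*?pretty'
# a growing "rejected" count or an index.blocks.read_only_allow_delete=true setting
# points at Elasticsearch, not Graylog, as the bottleneck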

I had this happen before; logs coming in on the wrong input somehow corrupted my journal. What I did was the following:
Stopped the Graylog service.
Deleted the journal.
Started the Graylog service and manually rotated the index.

Ensured the indices were NOT in read-only mode; if they were, executed a curl command to make them writable again. Roughly, the steps look like the sketch below.
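
With the compose file from earlier in the thread, this would look something like the following. This is only a sketch: the service name and journal path are assumptions taken from that YAML, deleting the journal discards any unprocessed messages, and the index rotation is done in the web UI (System > Indices):

docker-compose stop graylog
rm -rf ./graylog_data_journal/*      # the journal bind mount from the compose file
docker-compose start graylog

# if indices were flipped to read-only (e.g. by the disk watermark), clear the block:
curl -s -X PUT 'http://localhost:9200/_all/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.read_only_allow_delete": null}'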

Hi @gsmith,
thanks for your reply!
I have read your analysis and I'm ready to confirm what the cause of this problem is.
But sadly I realise I don't know how to use these commands to solve the problem (I'm not familiar with Linux, its applications, or Docker).
So could you please list the commands I need to use?

:+1:
I'll write something up; give me a few and I'll post it here.

And by the way, the amount Graylog is receiving is approximately 350 GB per day.

Oh well then, that's huge!!! That makes a big difference in what I was going to suggest.

350 GB a day is way more than one node can handle; you would need a cluster for something that big.