Graylog problems after upgrade to 3.0

Hi, I’m experiencing some problems during/after the upgrade from 2.4.6-1 to 3.0.

I have successfully upgraded Elasticsearch to 5.6.13, which should be supported according to the GL docs. It seems healthy and responds to various curl tests.

I’m running the GL setup on Docker on Ubuntu. Before starting the GL upgrade I did a full OS and Docker upgrade, so both are on the latest versions.
The pull was made on 11 February 2019.

I have been using a custom GL config which gets mapped in when the container is built with docker-compose. This has been working fine, but I downloaded the new 3.0 version of the file, compared it to my 2.4.6-1 version, and found several places where it had changed, so I moved my previous custom settings into the 3.0 file. I am a bit uncertain how to set the various HTTP/binding settings. I think the default file could benefit from better explanations/examples here.

My old working installation was configured to run on port 80, and only on ordinary http, no https/SSL.

There’s an Nginx proxy in front of the containers, and right now nginx always redirects to https with a “bad gateway” error. This might be because GL never gets fully up and running, so when nginx wants to forward the request there’s no service listening, and it then redirects to its own https error page. I don’t know exactly.

Anyway, the only errors I’m getting are four regarding the startup of lookup tables and adapters, which I think can be ignored according to other support blogs:

2019-03-12 09:47:14,493 ERROR: org.graylog2.plugin.lookup.LookupDataAdapter - Couldn’t start data adapter spamhaus-drop/5b912293b9d8210001ef241a/@76ec50a8
org.graylog.plugins.threatintel.tools.AdapterDisabledException: Spamhaus service is disabled, not starting (E)DROP adapter. To enable it please go to System / Configurations.
at org.graylog.plugins.threatintel.adapters.spamhaus.SpamhausEDROPDataAdapter.doStart(SpamhausEDROPDataAdapter.java:85) ~[?:?]
at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
at com.google.common.util.concurrent.Callables$4.run(Callables.java:119) [graylog.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2019-03-12 09:47:14,493 INFO : org.graylog2.lookup.LookupTableService - Data Adapter spamhaus-drop/5b912293b9d8210001ef241a [@76ec50a8] STARTING
2019-03-12 09:47:14,487 ERROR: org.graylog2.plugin.lookup.LookupDataAdapter - Couldn’t start data adapter tor-exit-node/5b912293b9d8210001ef241b/@442655cb
org.graylog.plugins.threatintel.tools.AdapterDisabledException: TOR service is disabled, not starting TOR exit addresses adapter. To enable it please go to System / Configurations.
at org.graylog.plugins.threatintel.adapters.tor.TorExitNodeDataAdapter.doStart(TorExitNodeDataAdapter.java:89) ~[?:?]
at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
at com.google.common.util.concurrent.Callables$4.run(Callables.java:119) [graylog.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2019-03-12 09:47:14,496 WARN : org.graylog.plugins.threatintel.adapters.otx.OTXDataAdapter - OTX API key is missing. Make sure to add the key to allow higher request limits.
2019-03-12 09:47:14,498 ERROR: org.graylog2.plugin.lookup.LookupDataAdapter - Couldn’t start data adapter abuse-ch-ransomware-domains/5b912294b9d8210001ef241f/@3882a6b9
org.graylog.plugins.threatintel.tools.AdapterDisabledException: Abuse.ch service is disabled, not starting adapter. To enable it please go to System / Configurations.
at org.graylog.plugins.threatintel.adapters.abusech.AbuseChRansomAdapter.doStart(AbuseChRansomAdapter.java:96) ~[?:?]
at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
at com.google.common.util.concurrent.Callables$4.run(Callables.java:119) [graylog.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2019-03-12 09:47:14,487 INFO : org.graylog2.lookup.LookupTableService - Data Adapter abuse-ch-ransomware-domains/5b912294b9d8210001ef241f [@3882a6b9] STARTING
2019-03-12 09:47:14,494 INFO : org.graylog2.lookup.LookupTableService - Data Adapter whois/5b912294b9d8210001ef241e [@7e773466] STARTING
2019-03-12 09:47:14,517 INFO : org.graylog2.lookup.LookupTableService - Data Adapter whois/5b912294b9d8210001ef241e [@7e773466] RUNNING
2019-03-12 09:47:14,518 INFO : org.graylog2.lookup.LookupTableService - Data Adapter spamhaus-drop/5b912293b9d8210001ef241a [@76ec50a8] RUNNING
2019-03-12 09:47:14,488 ERROR: org.graylog2.plugin.lookup.LookupDataAdapter - Couldn’t start data adapter abuse-ch-ransomware-ip/5b912293b9d8210001ef2418/@6c037e04
org.graylog.plugins.threatintel.tools.AdapterDisabledException: Abuse.ch service is disabled, not starting adapter. To enable it please go to System / Configurations.
at org.graylog.plugins.threatintel.adapters.abusech.AbuseChRansomAdapter.doStart(AbuseChRansomAdapter.java:96) ~[?:?]
at org.graylog2.plugin.lookup.LookupDataAdapter.startUp(LookupDataAdapter.java:59) [graylog.jar:?]
at com.google.common.util.concurrent.AbstractIdleService$DelegateService$1.run(AbstractIdleService.java:62) [graylog.jar:?]
at com.google.common.util.concurrent.Callables$4.run(Callables.java:119) [graylog.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2019-03-12 09:47:14,524 INFO : org.graylog2.lookup.LookupTableService - Data Adapter otx-api-ip/5b912294b9d8210001ef2420 [@3e80d0a5] RUNNING

The last log lines I get from GL are:

2019-03-12 09:47:35,891 INFO : org.graylog2.bootstrap.ServerBootstrap - Services started, startup times in ms: {KafkaJournal [RUNNING]=26, BufferSynchronizerService [RUNNING]=32, OutputSetupService [RUNNING]=77, JournalReader [RUNNING]=131, EtagService [RUNNING]=190, ConfigurationEtagService [RUNNING]=191, StreamCacheService [RUNNING]=242, InputSetupService [RUNNING]=247, PeriodicalsService [RUNNING]=282, LookupTableService [RUNNING]=400, JerseyService [RUNNING]=21653}
2019-03-12 09:47:35,896 INFO : org.graylog2.shared.initializers.ServiceManagerListener - Services are healthy
2019-03-12 09:47:35,898 INFO : org.graylog2.shared.initializers.InputSetupService - Triggering launching persisted inputs, node transitioned from Uninitialized [LB:DEAD] to Running [LB:ALIVE]
2019-03-12 09:47:35,897 INFO : org.graylog2.bootstrap.ServerBootstrap - Graylog server up and running.
2019-03-12 09:47:35,916 INFO : org.graylog2.inputs.InputStateListener - Input [Beats (deprecated)/5bbf267cf264de00010dd0f7] is now STARTING
2019-03-12 09:47:36,006 INFO : org.graylog2.inputs.InputStateListener - Input [Beats (deprecated)/5bbf267cf264de00010dd0f7] is now RUNNING
2019-03-12 09:47:36,016 WARN : org.graylog2.plugin.inputs.transports.AbstractTcpTransport - receiveBufferSize (SO_RCVBUF) for input BeatsInput{title=Beat input on port 5044, type=org.graylog.plugins.beats.BeatsInput, nodeId=null} (channel [id: 0xa0ea90b2, L:/0.0.0.0:5044]) should be 1048576 but is 425984.

But the Docker container never reaches a healthy state, so it has obviously not completed its startup cycle.

a56367c1ced6 graylog/graylog:3.0.0 “/docker-entrypoint.…” 27 minutes ago Up 25 minutes (unhealthy) 0.0.0.0:514->514/tcp, 0.0.0.0:5044->5044/tcp, 0.0.0.0:9001->9001/tcp, 0.0.0.0:514->514/udp, 0.0.0.0:12201->12201/tcp, 0.0.0.0:12201->12201/udp, 9000/tcp docker-build_graylog_1
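
One way to see why Docker keeps reporting the container as unhealthy is to make the probe explicit in the compose file. The following is only a minimal sketch, not the image's own health check. It assumes a compose file format that supports `healthcheck`, that `curl` is available inside the graylog/graylog image, and that the API is reachable on the address and port set in http_bind_address (the default port 9000 is used here as a placeholder). The `/api/system/lbstatus` path is Graylog's load balancer status endpoint, which corresponds to the `[LB:ALIVE]` state shown in the log above.

```yaml
# Sketch only: replaces the image's built-in health check with an explicit probe.
# Assumptions: curl exists in the image; the port must match http_bind_address.
services:
  graylog:
    image: graylog/graylog:3.0.0
    healthcheck:
      # Exit code 0 marks the container healthy, anything else counts as a failure.
      test: ["CMD-SHELL", "curl --silent --fail http://127.0.0.1:9000/api/system/lbstatus || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 12
```

If the probe targets a different address or port than the one Graylog actually binds to, the container stays unhealthy even though the log says the server is up and running. The output of the most recent probes is recorded in the State.Health section of the container's `docker inspect` output.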

My docker template looks like this:

  ########################
  ## GRAYLOG LOG SERVER ##
  ########################
  mongo:
    restart: unless-stopped
    image: mongo:3
    volumes:
      - /docker-data/graylog/mongodb:/data/db

  elasticsearch:
    restart: unless-stopped
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.13
    volumes:
      - /log/elasticsearch:/log/elasticsearch

    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    #ports:
    #  - "9200"
    ulimits:
      memlock:
        soft: -1
        hard: -1

  graylog:
    restart: unless-stopped
    image: graylog/graylog:3.0.0
    volumes:
      - /docker-data/graylog/graylog:/usr/share/graylog/data/journal
      - ./graylog/config:/usr/share/graylog/data/config
    environment:
      #- GRAYLOG_HTTP_BIND_ADDRESS=127.0.0.1:80
      #- GRAYLOG_HTTP_EXTERNAL_URI=http://graylog.ngt.dbb.dk/
      #- GRAYLOG_WEB_ENDPOINT_URI=https://graylog.ngt.dbb.dk/api
      - VIRTUAL_HOST=graylog.ngt.dbb.dk,graylog.dbb.dk
      - HTTPS_METHOD=noredirect
    depends_on:
      - mongo
      - elasticsearch
    ports:
      - "80"
      - "514:514"
      - "514:514/udp"
      - "5044:5044"
      - "12201:12201"
      - "12201:12201/udp"

I tried many different settings in the Graylog config. Currently it looks like this; all other settings are back to default:

http_bind_address = 127.0.0.1:80
#http_publish_uri = http://0.0.0.0:80/ (Using default)
http_external_uri = http://graylog.ngt.dbb.dk

The domain of the Graylog server (graylog.ngt.dbb.dk) is registered in our own DNS, so it resolves to the IP of the Docker host. The Graylog server should only be accessible from inside our network. Everything was working fine before the upgrade.

I haven’t done anything with regards to MongoDB. Does it have to be upgraded as well?

I hope someone can suggest some changes or troubleshooting hints, because I’m running out of ideas…

Best regards, Peter Meldgaard, Denmark

Would you mind editing your post to make the configuration and log parts more readable with code blocks, as mentioned in the FAQ: https://community.graylog.org/faq#format-markdown

It is not nice that the lookup adapters fail that “loud”; this will be fixed in upcoming versions.

When did you pull the 3.0 image? We have improved that image in the last week.

In the 3.0 image Graylog now runs as the user graylog, so I guess that is the reason binding to port 80/443 does not work.

What you should configure where is hard to tell without knowing how your internal networking is organized and what is reachable from where. Please edit your post and clarify.

Hi Jan

Sorry about the formatting, my bad. It’s been fixed now, including some extra details.

There’s not much to say about the network. It’s not complicated at all. We have a domain for the GL setup in our own DNS. It’s only for internal use. It resolves graylog.ngt.dbb.dk to the IP of the Docker host.

From there the Nginx proxy takes over and routes the request to the registered container, using the environment settings in the docker-compose file.

I suspect that nothing gets through nginx-proxy until the container is healthy.

So I need to get GL on its feet first.

Can I enable some more startup tracing somehow?

Br Peter

You set http_bind_address = 127.0.0.1:80, which makes Graylog listen only on the localhost interface inside your Docker container, so it is not reachable from the outside. That is the first thing that comes to mind.
The second thing to check: Graylog runs as the user graylog, which should not be allowed to bind to port 80 because it is a privileged port. You also do not need that if you have nginx in front, since nginx can listen on port 80.

My advice:

Listen on all interfaces on the “default” port, and let the nginx proxy listen on port 80 and route requests to it.

http_bind_address = 0.0.0.0:9000
http_external_uri = http://graylog.ngt.dbb.dk:9000

Now nginx connects to http://graylog.ngt.dbb.dk:9000, using the configuration taken from http://docs.graylog.org/en/3.0/pages/configuration/web_interface.html#nginx

server
{
    listen 80 default_server;
    listen [::]:80 default_server ipv6only=on;
    server_name graylog.example.org;

    location / {
      proxy_set_header Host $http_host;
      proxy_set_header X-Forwarded-Host $host;
      proxy_set_header X-Forwarded-Server $host;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Graylog-Server-URL http://$server_name/;
      proxy_pass       http://graylog.ngt.dbb.dk:9000;
    }
}

The server_name above needs to be adjusted to your configuration.

Thanks, this might work if we were using nginx with a config file, but we are using environment variables to configure the proxy.
Secondly, we want Graylog to be available to users on the default port 80, not on port 9000.
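
For a proxy that is driven by environment variables, the two requirements do not have to conflict: Graylog can stay on its unprivileged default port inside the container while users still reach it on port 80 through the proxy. A minimal sketch, assuming the proxy reads `VIRTUAL_HOST` and `VIRTUAL_PORT` from the container environment (as jwilder/nginx-proxy does) and that the `GRAYLOG_`-prefixed variables are used instead of the matching lines in the mounted config file:

```yaml
# Sketch only: Graylog listens on 9000 inside the container,
# the proxy publishes it to users on port 80.
services:
  graylog:
    image: graylog/graylog:3.0.0
    environment:
      # Listen on all container interfaces, on an unprivileged port.
      - GRAYLOG_HTTP_BIND_ADDRESS=0.0.0.0:9000
      # The address users type into the browser (served by the proxy on port 80).
      - GRAYLOG_HTTP_EXTERNAL_URI=http://graylog.ngt.dbb.dk/
      - VIRTUAL_HOST=graylog.ngt.dbb.dk,graylog.dbb.dk
      # Tells the proxy which container port to forward to.
      - VIRTUAL_PORT=9000
      - HTTPS_METHOD=noredirect
    expose:
      # Already exposed by the image; listed here only for clarity.
      - "9000"
```

With this layout no host port mapping for the web interface is needed on the Graylog service itself; only the input ports (514, 5044, 12201) still have to be published.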

But I really think the first problem to solve is that the container does not enter a healthy state. I’m quite sure it does not expose a healthy web service to the outside yet. Could it be because the journal was not deleted after the upgrade? I don’t know how to troubleshoot this.

By the way, the proxy is this one:

And the docker config(less) is:

  nginx-proxy:
    restart: unless-stopped
    image: jwilder/nginx-proxy
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/certs:/etc/nginx/certs
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - ./nginx/increase_max_body_size.conf:/etc/nginx/conf.d/increase_max_body_size.conf

Update: it seems I somehow solved the problems yesterday. I’m not 100% sure what fixed it, but I tweaked the config and the docker-compose file and also updated jwilder/nginx-proxy to latest, and then I could connect to Graylog on the standard port 80.

My settings in graylog.conf are now:

http_bind_address = 0.0.0.0:80
#Default: http://$http_bind_address/
#http_publish_uri = http://0.0.0.0:80/
http_external_uri = http://graylog.ngt.dbb.dk:80/

And docker compose:

  - VIRTUAL_HOST=graylog.ngt.dbb.dk,graylog.dbb.dk
  - VIRTUAL_PORT=80
  - HTTPS_METHOD=noredirect
ports:
  - "80"
  - "514:514"
  ...Input ports...

I also deleted the journal folder, but I don’t think it had any effect on my problems.
