Elasticsearch does not start after upgrading from 5 to 6

Hey everyone!

This is more for posterity and because https://twitter.com/jalogisch asked for it.
See tweet + thread here: https://twitter.com/Ilyas_RL/status/1227394577012838403
Also super thanks to ‘gimmic’ on IRC for pointing me to the solution.

tl;dr: switching from ‘elasticsearch’ (5.6.x) to ‘elasticsearch-oss’ (6.8.x) may break file ownership, preventing ES from starting.
Back up both your Graylog AND Elasticsearch configs to recreate them later.
Check for root:elasticsearch ownership on at least /etc/default/elasticsearch and chown it if needed.
Also check for existence of the ‘elasticsearch’ user and group.

This story is recreated from memory, since it has been 2-3 days since I had to fix this particular thing.
So you know, caveat emptor.

Components in play:
Debian 9 VM on Proxmox
Graylog 3.1, being upgraded to 3.2
Installed from the deb packages in the Graylog repo.
Elasticsearch 5.6.x, being upgraded to 6.8.
Installed from the deb packages in the Elastic repo.

I started with upgrading Graylog to 3.2 because a new version came out and you know, why not.
Following the docs, this upgrade went pretty smoothly.
I chose to overwrite the config files and apply my own settings again afterward.
Make a copy of your settings before you do this.

The documentation for upgrading Elasticsearch, however, left me a bit confused.
There are bits and pieces in the Graylog docs, and they in turn point to the Elastic docs.
If you have only installed the ‘elasticsearch-oss’ packages then you should be fine.
I however installed the ‘elasticsearch’ 5.x packages in the past since I have been upgrading Graylog through the months.

If you follow the documentation you will add the new 6.x OSS repository and install the ‘elasticsearch-oss’ package.
On my machine I found that this REMOVES ‘elasticsearch’ and installs the new ‘elasticsearch-oss’ package.
My advice would be to BACK UP YOUR ELASTICSEARCH CONFIG FIRST, then upgrade and overwrite all config files with the ones from the -oss package.
THEN re-apply your config, adapting it to the new ES6.x rules. See the Graylog and/or Elasticsearch docs for this.
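
For anyone who wants to script the backup step, here is a rough Python sketch of what I mean. It copies the given files and directories into a timestamped folder. The function name `backup_configs` is made up for this post, and the paths in the usage comment are what I would expect from the Debian packages, so treat them as assumptions and check your own system.

```python
import shutil
import time
from pathlib import Path


def backup_configs(sources, dest_root):
    """Copy each existing file or directory in `sources` into a new
    timestamped folder under `dest_root` and return that folder."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(dest_root) / f"config-backup-{stamp}"
    target.mkdir(parents=True)
    for src in sources:
        src = Path(src).absolute()
        if not src.exists():
            print(f"skipping {src}: not found")
            continue
        # Mirror the absolute path below the backup folder so that
        # /etc/elasticsearch and /etc/default/elasticsearch do not collide.
        dest = target / src.relative_to(src.anchor)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            shutil.copytree(src, dest, symlinks=True)
        else:
            shutil.copy2(src, dest)
    return target


# Example call (assumed Debian package paths, run as root):
# backup_configs(
#     ["/etc/elasticsearch", "/etc/default/elasticsearch", "/etc/graylog/server"],
#     "/root",
# )
```

A plain cp -a of those directories does the same job; the point is just to have a copy somewhere the package manager will not touch.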

In my particular case, the file ownership somehow did not get set correctly. It is supposed to be root:elasticsearch for all relevant files.
Also check /etc/passwd and /etc/group to see if the ‘elasticsearch’ user and group exist.
You may need to add the user to the group if things still won’t start.
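
The checks above can be sketched in Python roughly like this. It only uses the standard library (`pwd`, `grp`, `os`), and the helper names `account_exists`, `owner_of` and `wrong_owner` are invented for this post. Which files count as "relevant" is an assumption on my side; /etc/default/elasticsearch is the one I know about.

```python
import grp
import os
import pwd


def account_exists(name="elasticsearch"):
    """Return (user_exists, group_exists) for the given account name."""
    try:
        pwd.getpwnam(name)
        user_ok = True
    except KeyError:
        user_ok = False
    try:
        grp.getgrnam(name)
        group_ok = True
    except KeyError:
        group_ok = False
    return user_ok, group_ok


def owner_of(path):
    """Return the owner of `path` as a 'user:group' string.
    Falls back to the numeric id when no name is known for it."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return f"{user}:{group}"


def wrong_owner(paths, expected="root:elasticsearch"):
    """Return (path, owner) pairs whose ownership differs from `expected`."""
    result = []
    for path in paths:
        owner = owner_of(path)
        if owner != expected:
            result.append((path, owner))
    return result


# Example (run as root on the ES host):
# print(account_exists())
# print(wrong_owner(["/etc/default/elasticsearch"]))
# To fix a file: shutil.chown(path, user="root", group="elasticsearch")
```

Anything `wrong_owner` returns is a candidate for a chown, by hand or with `shutil.chown` as in the last comment.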

After that, double-check that your Elasticsearch config is correct and make sure the Graylog config points to your ES instance.
In my case, I changed the ES IP to a non-local address and this needs to be explicitly set in the Graylog config.
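
As far as I know, the Graylog setting in question is `elasticsearch_hosts` in the server config (a comma-separated list of URLs), which falls back to a local address when it is left commented out. Here is a small sketch to see which hosts a config actually names. The helper names are made up, and the path in the usage comment is the one I would expect from the deb package, so take it as an assumption.

```python
from urllib.parse import urlsplit


def read_setting(conf_text, key):
    """Return the last uncommented value of `key` in a
    'key = value' style config, or None if it is not set."""
    value = None
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()
    return value


def es_hosts(conf_text):
    """Return the host names listed in elasticsearch_hosts, if any."""
    raw = read_setting(conf_text, "elasticsearch_hosts")
    if raw is None:
        return []
    return [urlsplit(url.strip()).hostname for url in raw.split(",") if url.strip()]


# Example (assumed package path):
# with open("/etc/graylog/server/server.conf") as fh:
#     print(es_hosts(fh.read()))
```

An empty list means the setting is not set explicitly, which is exactly the situation that bit me after moving ES to a non-local address.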

That’s all I can think of, hope that helps anyone out there!