I want to share a debugging experience I had today after we upgraded our production Elasticsearch environment from 6.8 to 7.10. We have a custom process that ships performance metrics for Graylog, Elasticsearch, and the hosts involved into Graylog so that we can review a dashboard and generate alerts. That process includes an index with custom field mappings so that the data the extractor pulls out into fields is associated with the correct data type.
Following the upgrade, new data suddenly stopped appearing in the stream. I thought perhaps there was a problem with the input, the stream, or the host shipping the metrics in. However, only after we had investigated all of those and finally built a new index set did the stream start writing to the index correctly.
Due to the change in ES, the Graylog documentation on this page https://docs.graylog.org/en/4.0/pages/configuration/elasticsearch.html is not accurate. If you examine the deflector you’ll see the difference. Once we identified the difference, we modified our template to match the new model, loaded the template to a new index set, and updated the stream to use the new index. Once that was done everything started working as expected.
If you have custom templates, you can avoid building a new index set or potentially losing data: when you perform the ES upgrade to 7.x, plan to remove the custom template and load a revised version of it. Then you’ll just need to rotate the affected indices.
I hope this saves others some time and frustration.
With ES 6.8 this worked fine; we used this model and the ES documentation to build what we needed. However, with ES 7.10, suddenly no messages were being indexed in the set with that custom template applied, even after we rotated the active index. We tried removing and reloading the custom template, but received the following error:
So, we looked at the pertinent deflector and reviewed the documentation and determined that we just needed to remove the ‘message’ structure and move ‘properties’ and its children a level up.
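For anyone who wants to script that change, here is a minimal sketch of the restructuring described above: it removes the 'message' level and lifts 'properties' (and anything else under it) directly under 'mappings'. The helper name `flatten_mapping_type`, the `metrics_*` pattern, and the field names are made up for illustration and are not our actual template.

```python
import copy
import json


def flatten_mapping_type(template, type_name="message"):
    """Return a copy of a 6.x-style template with the mapping type level removed.

    Hypothetical helper: moves everything under mappings[type_name]
    (for example 'properties') up to sit directly under 'mappings'.
    """
    converted = copy.deepcopy(template)  # leave the caller's dict untouched
    mappings = converted.get("mappings", {})
    if type_name in mappings:
        converted["mappings"] = mappings[type_name]
    return converted


# Illustrative 6.x-style template; pattern and fields are assumptions.
legacy_template = {
    "template": "metrics_*",
    "mappings": {
        "message": {
            "properties": {
                "cpu_load": {"type": "float"},
                "heap_used_bytes": {"type": "long"},
            }
        }
    },
}

converted = flatten_mapping_type(legacy_template)
print(json.dumps(converted, indent=2))
```

The converted template would still need to be loaded into ES and the affected index rotated; this only covers the structural change.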
Thanks @ttsandrew. I do have a custom mapping applied in my setup on ES 6.8, exactly as instructed in the GL documentation. I would imagine there are many other users who have a custom mapping as well. For all of us, it sounds like the upgrade to ES 7 is a time bomb waiting to explode, as data and/or uptime would inevitably be lost (unless the custom mapping is addressed before the upgrade)!
Hopefully one of the GL/ES ninjas on this forum (hey @bernd, @aaronsachs and @jan) can chime in to confirm and to make the right changes in the documentation.
Howdy! This definitely seems like an artifact that’s been held over from docs referencing ES 6.X. It’s definitely something that I think we need to remedy in the docs. Obviously we’d need to fix the example there. Do y’all feel it would be worth adding a cautionary note in the upgrade section of the docs so folks don’t breeze past this? cc @ttsandrew
@aaronsachs, I think it’s worth a note. We did lose a few thousand messages identifying and fixing it. For us the messages in that stream aren’t critical so we were able to move on with the gap. For others it might be a bigger issue.
It looks like for now it’s just generating a deprecation warning and translating the field name. Our template used the field name [template] and it worked fine. Looks like it should be updated in the documentation though.
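If you would rather not rely on that translation, a rough sketch of renaming the field yourself is below. It assumes the replacement field is `index_patterns` (the name Elasticsearch uses for the list of index patterns in a template); the helper name `use_index_patterns` and the sample template are hypothetical.

```python
import copy


def use_index_patterns(template):
    """Return a copy of the template with 'template' renamed to 'index_patterns'.

    Hypothetical helper: 'index_patterns' holds a list, so a single
    pattern string is wrapped in a list.
    """
    converted = copy.deepcopy(template)  # do not modify the input
    if "template" in converted and "index_patterns" not in converted:
        pattern = converted.pop("template")
        converted["index_patterns"] = (
            [pattern] if isinstance(pattern, str) else list(pattern)
        )
    return converted


# Illustrative template; the pattern and settings are assumptions.
old_style = {"template": "metrics_*", "settings": {"number_of_shards": 1}}
new_style = use_index_patterns(old_style)
print(new_style)
```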