Field retype during runtime, questions about indexes/shards

Hi all,

I took over a Graylog server from one of my colleagues and, as usual, issues started to happen :slight_smile: So I have two questions.

  1. A few days ago, out of the blue, one team noticed that their messages were no longer being processed by Graylog. They found that one of the fields, “job_id”, had changed from string to long, and Graylog was reporting “There were 93,640 failed indexing attempts in the last 24 hours.” We fixed this by creating a separate index for this team, and now the messages are being processed. I read there is something called “Dynamic field mapping”. Can it change the field type depending on the majority of the data in that field, or is it set up once for the index and then not touched at all?

  2. With this, another question arose. We currently have three main indexes: 7 days retention, 15 days and 30 days. These three are used by the whole company to store data (which is why the issue in the first question could happen?). Is this OK, or would it be better for each team to have its own index? I read that there should be only one shard since we are running one instance, right? Anyway, a picture is worth a thousand words :slight_smile:

Some tech details. We have one instance running as a VMware VM on Linux (running in Docker), version 4.3.5 (we are planning to upgrade to v5).

Thank you for your time and guidance! :slight_smile:

Hello && Welcome!

  1. Elasticsearch dynamically evaluates each field's type every time the index rotates, based on the data coming in just after the rotation. So if your text field happens to be “777” on the first message coming in after the rotation, Elasticsearch will happily create it as a long. If you manually rotate the index, it will do it all over again, meaning that it will switch back to keyword if that is what comes into the field on the next message. You can force a field to be keyword with a custom mapping in Elasticsearch. There are posts about that in the forum, including one I wrote about custom mapping and historical correction, that you can search for.

  2. The way you have your indexes set up is fine if that works well for you. The considerations for storage indexes are really based on retention and the related storage space. There are secondary considerations if you need to keep data separated or you need redundancy, etc.
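To make the custom mapping from point 1 a bit more concrete, here is a minimal sketch of what such a template could look like. It assumes Elasticsearch 7.x reachable at `localhost:9200` and the default `graylog` index prefix; the template name `graylog-custom-mapping` and the `order` value are illustrative choices, not something taken from this thread. The script only builds and prints the request, it does not send anything.

```python
import json
import urllib.request

# Assumptions (not from the thread): Elasticsearch 7.x on localhost:9200,
# the default Graylog index prefix "graylog", and a made-up template name.
ES_URL = "http://localhost:9200"
TEMPLATE_NAME = "graylog-custom-mapping"  # hypothetical name


def build_template(index_prefix, keyword_fields):
    """Return a legacy index template body that pins the given fields to keyword."""
    return {
        "index_patterns": [f"{index_prefix}_*"],
        # Assumed to be higher than the order of Graylog's built-in template,
        # so these field definitions win when the templates are merged.
        "order": 10,
        "mappings": {
            "properties": {name: {"type": "keyword"} for name in keyword_fields}
        },
    }


def build_put_request(es_url, template_name, body):
    """Prepare (but do not send) the PUT /_template/<name> request."""
    return urllib.request.Request(
        url=f"{es_url}/_template/{template_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


template = build_template("graylog", ["job_id"])
request = build_put_request(ES_URL, TEMPLATE_NAME, template)

# Show what would be sent, without touching the cluster.
print(request.get_method(), request.full_url)
print(json.dumps(template, indent=2))

# To actually apply it: urllib.request.urlopen(request)
```

A template like this only affects indices created after it is stored, so the index set has to be rotated before “job_id” is reliably mapped as keyword. Indices that already exist keep whatever type they were created with.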

Hello && welcome @deb0ro

Adding on, and I agree with @tmacgbay,

This was an issue with my setup too. A while back I did have 7-day, 15-day and 30-day retention index sets. Now I instead have index sets for Windows, Linux, firewalls, switches, databases, etc. This way different log types don't get mixed up together. Hence, I don't have that issue with ES/OS dynamic mapping, since the logs/fields within each index set are very similar.

For example, instead of 4 shards per index set, I have 1 shard for the database index set with a retention of 2 weeks (i.e., it became more flexible and saves room on the volume). You can still have 7, 15 & 30 days, or a well-defined setup can be configured for just 30 days if need be, depending on what you want to do. Just an idea.

Hi guys :slight_smile:

Thank you very much for your valuable input, both @tmacgbay and @gsmith!

From what I know, the previous setup had separate indexes for different apps and this issue never happened. Maybe it's time to split it again :slight_smile:

