Search issue after maintenance index rotation between two different indices with custom mapping applied

We have a custom index mapping applied. The data type for the pertinent field is float, and all of the documents indexed for as far back as I can search show a data type of “float” for that field. The index rotates once per hour. Now, searches that cross the rotation threshold into the prior index (which worked fine until it was rotated out) and attempt to aggregate data for trending return one of two errors: “type “keyword” is not valid for an aggregation” or “exception [type=aggregation_execution_exception, reason=merging/reducing the aggregations failed when computing the aggregation [agg-1] because the field you gave in the aggregation query existed as two different types in two different indices].”

If I search only within the timeframe of the new index everything works fine. If I search in any timeframe in the previous index, it returns this aggregation error. The messages themselves show the correct field data type as does the custom index mapping on that index.

We look at these graphs every day to monitor trends. Something about the index rotation has manifested this issue. Has anyone else run into this?

This problem is the same as the one in this thread:

We checked, and within this one index set no index maps the aggregated field with a different type. We completely deleted all indices and even recreated the index set from scratch.

There are other index sets in the Elasticsearch cluster that contain this field, but the search and aggregation are performed only in the one changed index set.

Graylog 4.2.1+5442e44 + Elasticsearch "version": {
  "number": "7.10.2",
  "build_flavor": "oss",
  "build_type": "rpm"
}

We have applied custom mappings in various ways:

http://domain:9200/_template/l7filter-mapping?pretty
{
  "template" : "l7filter_*",
    "mappings": {      
      "properties": {
        "BC": {
          "type": "long"
        },
        "BE": {
          "type": "keyword"
        },
        "BS": {
          "type": "keyword"
        },
        "BT": {
          "type": "long"
        },
        "CC": {
          "type": "long"
        },
        "CH": {
          "type": "keyword"
        },
        "DA": {
          "type": "keyword"
        },
        "DM": {
          "type": "keyword"
        },
        "DP": {
          "type": "keyword"
        },
        "GI": {
          "type": "keyword"
        },
        "HC": {
          "type": "long"
        },
        "ID": {
          "type": "keyword"
        },
        "IDT": {
          "type": "keyword"
        },
        "ITD": {
          "type": "keyword"
        },
        "PID": {
          "type": "keyword"
        },
        "RA": {
          "type": "keyword"
        },
        "RDNS": {
          "type": "keyword"
        },
        "REC": {
          "type": "keyword"
        },
        "RH": {
          "type": "keyword"
        },
        "RID": {
          "type": "keyword"
        },
        "RL": {
          "type": "long"
        },
        "RQ": {
          "type": "keyword"
        },
        "RR": {
          "type": "keyword"
        },
        "RS": {
          "type": "keyword"
        },
        "RT": {
          "type": "float"
        },
        "SID": {
          "type": "keyword"
        },
        "SR": {
          "type": "keyword"
        },
        "T": {
          "type": "keyword"
        },
        "UA": {
          "type": "keyword"
        },
        "V": {
          "type": "long"
        }               
      }
   }  
}

AND

http://domain:9200/_template/l7filter-mapping?pretty
{
  "template" : "l7filter_*",
  "mappings": {
    "dynamic_templates": [
      {
        "l7filter": {
          "match_mapping_type": "string",
          "match":   "RT",          
          "mapping": {
            "type": "float"
          }
        }
      }
    ]
  }
}

AND

http://domain:9200/_template/l7filter-mapping?pretty
{
  "template" : "l7filter_*",
  "mappings": {
    "numeric_detection": true
  }
}

Each method works correctly and produces exactly the custom mapping that we need in all indices. We checked this by querying the mapping of each index. (In the example below, we changed only the field “RT” and tried to run an aggregation on it.)

http://domain:9200/l7filter_33/_mapping/
{
  "l7filter_33": {
    "mappings": {
      "dynamic_templates": [
        {
          "internal_fields": {
            "match": "gl2_*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        },
        {
          "store_generic": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ],
      "properties": {
        "BC": {
          "type": "keyword"
        },
        "BE": {
          "type": "keyword"
        },
        "BS": {
          "type": "keyword"
        },
        "BT": {
          "type": "keyword"
        },
        "CC": {
          "type": "long"
        },
        "CH": {
          "type": "keyword"
        },
        "DA": {
          "type": "keyword"
        },
        "DM": {
          "type": "keyword"
        },
        "DP": {
          "type": "keyword"
        },
        "GI": {
          "type": "keyword"
        },
        "HC": {
          "type": "keyword"
        },
        "ID": {
          "type": "keyword"
        },
        "IDT": {
          "type": "keyword"
        },
        "ITD": {
          "type": "keyword"
        },
        "PID": {
          "type": "keyword"
        },
        "RA": {
          "type": "keyword"
        },
        "RD": {
          "type": "keyword"
        },
        "RDNS": {
          "type": "keyword"
        },
        "REC": {
          "type": "keyword"
        },
        "RH": {
          "type": "keyword"
        },
        "RID": {
          "type": "keyword"
        },
        "RL": {
          "type": "keyword"
        },
        "RQ": {
          "type": "keyword"
        },
        "RR": {
          "type": "keyword"
        },
        "RS": {
          "type": "keyword"
        },
        "RT": {
          "type": "float"
        },
        "SID": {
          "type": "keyword"
        },
        "SR": {
          "type": "keyword"
        },
        "T": {
          "type": "keyword"
        },
        "UA": {
          "type": "keyword"
        },
        "V": {
          "type": "long"
        },
        "full_message": {
          "type": "text",
          "analyzer": "standard"
        },
        "gl2_accounted_message_size": {
          "type": "long"
        },
        "gl2_message_id": {
          "type": "keyword"
        },
        "gl2_processing_timestamp": {
          "type": "date",
          "format": "uuuu-MM-dd HH:mm:ss.SSS"
        },
        "gl2_receive_timestamp": {
          "type": "date",
          "format": "uuuu-MM-dd HH:mm:ss.SSS"
        },
        "gl2_remote_ip": {
          "type": "keyword"
        },
        "gl2_remote_port": {
          "type": "long"
        },
        "gl2_source_input": {
          "type": "keyword"
        },
        "gl2_source_node": {
          "type": "keyword"
        },
        "message": {
          "type": "text",
          "analyzer": "standard"
        },
        "source": {
          "type": "text",
          "analyzer": "analyzer_keyword",
          "fielddata": true
        },
        "streams": {
          "type": "keyword"
        },
        "timestamp": {
          "type": "date",
          "format": "uuuu-MM-dd HH:mm:ss.SSS"
        }
      }
    }
  }
}

When you create a template you have to account for the index rotating to the next which increments the number (like from l7filter_33 to l7filter_34) so your template should have a wildcard like so:

"template" : "l7filter_*"

On a side note, when posting, use the forum tools such as the </> to make code/logs readable
I did this with the template snippet above.
The other side note is you can snip out the part of code/logs that are redundant… :stuck_out_tongue:

Hopefully that helps - Mark this as the answer for future searches if it fixed it!

We did use the wildcard l7filter_*; it just didn’t show up in the post because of HTML formatting. I have fixed the thread on your advice, but this is not the cause of the problem, since we used the correct l7filter_* wildcard.

I created an issue on GitHub:

@Rinat-Sadykov: It looks like in the new index the first value which was indexed for that field was of type keyword and there was no explicit mapping for the field at this point. This means that this field in this index cannot be aggregated together with indices which have this field with a float type. To fix this, you need to delete that index (probably not an option) or reindex the offending index with an explicit float type for this field.
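The reindex suggestion above could be sketched roughly as follows. This only builds the two request bodies (create the destination index with an explicit float mapping, then copy the documents with the `_reindex` API); it does not talk to a cluster. The index names `l7filter_34` and `l7filter_34_reindexed` and the helper name `build_reindex_requests` are assumptions for illustration, not values taken from this thread, and whether Graylog picks up an index with such a name is not covered here.

```python
import json


def build_reindex_requests(source_index, dest_index, field, field_type):
    """Return (create_index_body, reindex_body) as plain dicts.

    create_index_body is meant for `PUT /<dest_index>` and pins the field
    to an explicit type before any document is written.
    reindex_body is meant for `POST /_reindex`.
    """
    create_index_body = {
        "mappings": {
            "properties": {
                field: {"type": field_type},
            }
        }
    }
    reindex_body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return create_index_body, reindex_body


# Hypothetical index names, used only to show the shape of the requests.
create_body, reindex_body = build_reindex_requests(
    "l7filter_34", "l7filter_34_reindexed", "RT", "float"
)

print("PUT /l7filter_34_reindexed")
print(json.dumps(create_body, indent=2))
print("POST /_reindex")
print(json.dumps(reindex_body, indent=2))
```

The destination index is created first on purpose: if `_reindex` were allowed to create it, the field type would again be decided by whatever templates and dynamic mapping apply at that moment.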

We are using GitHub issues for tracking bugs in Graylog itself, but this doesn’t look like one. Please post this issue to our public mailing list or join the #graylog channel on freenode IRC.

Thank you!

Rinat-Sadykov commented 17 hours ago

When I try to follow the link to the public mailing list (it redirects to Google Groups), I get an “access error”.

Rinat-Sadykov commented 17 hours ago

When I try to join the freenode IRC channel, I get the error “We couldn’t connect to that server :frowning: Unknown error”.

Rinat-Sadykov commented 17 hours ago

When I use “Rotate active write index” manually, everything works fine; I get the error only after an automatic index rotation.

To be honest, I do not know what to do next with this problem…

Perhaps it is a bug, on the other hand there are plenty of people, myself included, that have applied custom mappings…

Did you apply all three of your examples and then rotate your index, or did you try each of the three with a rotation, none of them with success? Since they all have the same name (“l7filter-mapping”), they will overwrite each other, so the last one applied will be the one in effect. When I was creating my custom mapping it looked slightly different from yours. Here is one I was using:

{
  "template": "GroupOne_*",
  "mappings" : {
    "message" : {
      "properties" : {
        "test_duration" : {
          "type" : "long"
        }
      }
    }
  }
}

I tried all three options, each time removing all indices and index sets. When the index is rotated manually everything works fine; when it is rotated automatically once per hour, the problem appears.

Is this a new Index OR are you modifying a default one?
EDIT: I came across this post; maybe it can help.

https://megamorf.gitlab.io/2021/01/20/graylog-and-elasticsearch-troubleshooting/#update-custom-field-mappings

This is a new index, and everything described in the linked article has been done, but the problem still persists.

Because the names of your three custom index templates are the same, only the last one will apply… which is the one setting numeric detection to true. Elasticsearch is not great at numeric detection.

If you want to apply all three, each should have a unique name so as not to overwrite the changes of the others.
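To make the overwrite behaviour concrete, here is a toy model of name-keyed template storage. It is a plain Python dictionary, not Elasticsearch itself; the class name `TemplateStore` and the unique template names in the second half are made up for the example.

```python
class TemplateStore:
    """Toy stand-in for templates stored by name (PUT _template/<name>)."""

    def __init__(self):
        self._templates = {}

    def put(self, name, body):
        # Storing under an existing name replaces the previous body entirely.
        self._templates[name] = body

    def names(self):
        return sorted(self._templates)

    def get(self, name):
        return self._templates[name]


properties_variant = {"mappings": {"properties": {"RT": {"type": "float"}}}}
dynamic_variant = {
    "mappings": {
        "dynamic_templates": [
            {
                "l7filter": {
                    "match_mapping_type": "string",
                    "match": "RT",
                    "mapping": {"type": "float"},
                }
            }
        ]
    }
}
numeric_variant = {"mappings": {"numeric_detection": True}}

# Same name three times: only the last body survives.
same_name = TemplateStore()
for body in (properties_variant, dynamic_variant, numeric_variant):
    same_name.put("l7filter-mapping", body)
print(same_name.names(), same_name.get("l7filter-mapping"))

# Hypothetical unique names: all three bodies are kept.
unique_names = TemplateStore()
unique_names.put("l7filter-mapping-properties", properties_variant)
unique_names.put("l7filter-mapping-dynamic", dynamic_variant)
unique_names.put("l7filter-mapping-numeric", numeric_variant)
print(unique_names.names())
```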

As I wrote earlier, we tried to do the custom mapping in different ways and described each one. Of course, all three were not applied at once: we applied one, then ran a test with both manual index rotation and automatic rotation on a schedule. The problem arises with every custom mapping option.

Looking at the custom mappings you have posted, they are slightly different from the ones I have applied in the past. Yours use template->mappings->properties, whereas the one I used was template->mappings->message->properties. Here is the custom mapping I would apply to enforce RT as float in the l7filter_* indices:

{
  "template": "l7filter_*",
  "mappings" : {
    "message" : {
      "properties" : {
        "RT" : {
          "type" : "float"
        }
      }
    }
  }
}

I tried this mapping format; when I apply it, I get this error:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Root mapping definition has unsupported parameters:  [message : {properties={RT={type=float}}}]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters:  [message : {properties={RT={type=float}}}]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "Root mapping definition has unsupported parameters:  [message : {properties={RT={type=float}}}]"
    }
  },
  "status": 400
}

There is a mention of this here:
https://docs.graylog.org/v1/docs/elasticsearch

Note

The above template is only compatible with Elasticsearch 6.X. If using Graylog 4.0 with Elasticsearch 7.x, use the template below, saving it as graylog-custom-mapping-7x.json .

{
  "template": "graylog_*",
  "mappings": {
    "properties": {
      "http_method": {
        "type": "keyword"
      },
      "http_response_code": {
        "type": "long"
      },
      "ingest_time": {
        "type": "date",
        "format": "strict_date_time"
      },
      "took_ms": {
        "type": "long"
      }
    }
  }
}
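The difference between the two template formats can be illustrated with a small conversion sketch: it removes the mapping-type level (`message`) that the 6.x-style template carries, which is the level the `mapper_parsing_exception` above complains about. The helper name `to_typeless` is hypothetical, and the sketch assumes the type level is the only thing that needs to change.

```python
import copy


def to_typeless(template, type_name="message"):
    """Return a copy of `template` with the mapping-type level removed.

    A 6.x-style template nests fields as mappings -> <type> -> properties;
    the 7.x-style template shown above uses mappings -> properties.
    Templates that already have `properties` directly under `mappings`
    are returned unchanged.
    """
    converted = copy.deepcopy(template)
    mappings = converted.get("mappings", {})
    if type_name in mappings and "properties" not in mappings:
        converted["mappings"] = mappings[type_name]
    return converted


es6_style = {
    "template": "l7filter_*",
    "mappings": {
        "message": {
            "properties": {
                "RT": {"type": "float"},
            }
        }
    },
}

es7_style = to_typeless(es6_style)
print(es7_style)
```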

In fact, I have no problem defining a custom mapping; that is not the question. The question is why, when the correct custom mapping is specified, aggregation stops working after the automatic index rotation, as I described above. I checked the old indices and the mapping for the field is correct, but aggregation still doesn’t work. This looks like some kind of bug at the Graylog or Elasticsearch level. I created issues on GitHub about it, but I have not received an answer yet.

Hello,

If you’re having issues with rotating indices, mappings, and/or templates that are not being applied, this would be an Elasticsearch problem, I believe.

I opened an issue on the Elasticsearch GitHub:

You were right, last time I applied a custom template I was on Elasticsearch 6.x.

I am on Elasticsearch 7.x (technically 7.14, which is not recommended… but that’s another story) and I happen to have a test index I can play with… so I took a field that was of type keyword and used a custom mapping to change it to a long. Both when rotating manually and when pumping in enough messages to make it rotate automatically, the field stayed a long. I also changed the mapping to float, and it still remained float regardless of whether the index was rotated manually or automatically. Each time I ran a sum aggregation on the field with no issue. I am unable to recreate the problem; it’s possible that it is an issue with the Elasticsearch 7.10.2 that you are running.

EDIT: The only way I could recreate this is if I did not restrict my aggregation to the stream (index) I had applied the custom mapping to.

curl -X GET "elastic01:9200/l7filter_*/_mapping/field/*?pretty" | grep -B 5 -A 3 RT
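The JSON returned by a field-mapping request like the one above can also be checked programmatically. The sketch below groups indices by the type they report for one field and flags the field when more than one type shows up. The `sample_response` data is invented for illustration (including the index name `graylog_12`), and the response layout is assumed to follow the Elasticsearch 7.x get-field-mapping format.

```python
from collections import defaultdict


def types_by_index(response, field):
    """Map each index name to the type it reports for `field`."""
    result = {}
    for index, body in response.items():
        entry = body.get("mappings", {}).get(field)
        if not entry:
            continue  # this index has no mapping for the field
        for leaf in entry.get("mapping", {}).values():
            if "type" in leaf:
                result[index] = leaf["type"]
    return result


def find_conflicts(response, field):
    """Return {type: [indices]} if the field has more than one type, else {}."""
    grouped = defaultdict(list)
    for index, field_type in types_by_index(response, field).items():
        grouped[field_type].append(index)
    return dict(grouped) if len(grouped) > 1 else {}


# Made-up sample data in the assumed response shape.
sample_response = {
    "l7filter_33": {
        "mappings": {"RT": {"full_name": "RT", "mapping": {"RT": {"type": "float"}}}}
    },
    "l7filter_34": {
        "mappings": {"RT": {"full_name": "RT", "mapping": {"RT": {"type": "float"}}}}
    },
    "graylog_12": {
        "mappings": {"RT": {"full_name": "RT", "mapping": {"RT": {"type": "keyword"}}}}
    },
}

conflicts = find_conflicts(sample_response, "RT")
print(conflicts)
```

Running the same check against a wider index pattern than `l7filter_*` would show whether another index set maps the field differently, which matches the situation where an aggregation is not restricted to the stream with the custom mapping.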