Search issue after index rotation

We have a custom index mapping applied. The data type for the pertinent field is float. All of the documents indexed for as far back as I can search show a data type of “float” for that field. The index rotated overnight. Now, searches that cross the rotation threshold into the prior index (which worked fine until it was rotated out) and attempt to aggregate data for trending return an error that type “keyword” is not valid for an aggregation. If I search only within the timeframe of the new index, everything works fine. If I search in any timeframe in the previous index, it returns this aggregation error. The messages themselves show the correct field data type, as does the custom index mapping on that index.

We look at these graphs every day to monitor trends. Something about the index rotation has manifested this issue. Has anyone else run into this?

@aaronsachs any thoughts? We are on Elasticsearch 7.11.2, which I know is not strictly supported, but the upgrade was performed before I was able to check compatibility, so we just rolled with it and paused future updates. Possibly this is an issue introduced by that version incompatibility?

Graylog 4.0.5

I actually have this same issue with two of my fields, rx_bytes and tx_bytes. Both are stored as longs, and I’ve verified that in both GL and ES. But for some reason, when I try to chart either of those… it errors out saying it was expecting a numeric value but instead received [keyword].

One thing about my situation is that some of the messages are in multiple streams and potentially in multiple indices. That being said, I’ve verified the mapping in the indices in ES to ensure they are stored as long. So I’m a bit at a loss now also.

I am also on GL 4.0.5, but am still running ES 6.8.14

FWIW I also have these messages stored in multiple streams and indices, but they’re mapped as floats in all cases. And, the graph is only using one stream/index set. Does it work correctly for you if you limit the search window to only the current write index?

It does not… I just tried… again… are you casting to float or do you have a custom mapping set up?

Custom index mapping, and when I pull the mappings for the current or any prior indices the mapping is present. I can also verify by checking any message.
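For anyone who wants to repeat that mapping check across every index in one go, here is a minimal sketch in Python. It assumes an unauthenticated Elasticsearch 7.x node at `http://localhost:9200` and the default `graylog_*` index prefix (both are assumptions, adjust as needed), and the helper names are made up for illustration. It uses the get field mapping API (`GET /<index>/_mapping/field/<field>`). On 6.x the response nests one extra level for the mapping type, so the parser would need adapting there. The sample response is hypothetical, not output from a real cluster.

```python
import json
import urllib.request


def types_by_index(mapping_response):
    """Map index name -> sorted list of types found in a get-field-mapping response.

    Assumes the ES 7.x response shape:
    {index: {"mappings": {field: {"full_name": ..., "mapping": {leaf: {"type": ...}}}}}}
    """
    result = {}
    for index, body in mapping_response.items():
        types = set()
        for field_info in body.get("mappings", {}).values():
            for leaf in field_info.get("mapping", {}).values():
                types.add(leaf.get("type", "unknown"))
        # An empty list means the index has no explicit mapping for the field.
        result[index] = sorted(types)
    return result


def fetch_field_mapping(field, base_url="http://localhost:9200", pattern="graylog_*"):
    """Fetch the field mapping from ES (base_url and pattern are assumptions)."""
    url = f"{base_url}/{pattern}/_mapping/field/{field}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


# Hypothetical sample response, only to show what a mismatch would look like.
sample = {
    "graylog_11": {"mappings": {"supn_percent": {
        "full_name": "supn_percent",
        "mapping": {"supn_percent": {"type": "keyword"}}}}},
    "graylog_12": {"mappings": {"supn_percent": {
        "full_name": "supn_percent",
        "mapping": {"supn_percent": {"type": "float"}}}}},
}
print(types_by_index(sample))
```

Against a live node, `types_by_index(fetch_field_mapping("supn_percent"))` would give the same kind of summary. An index that comes back with an empty list has no explicit mapping for the field at all.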

More telling, however, is that if I set my search window to only the last millisecond of any index other than the current write index, I receive the error even if there’s no document captured in that search window, and if I set my search window to any time in the current write index it works fine.

Unable to perform search query: Elasticsearch exception [type=illegal_argument_exception, reason=Field [supn_percent] of type [keyword] is not supported for aggregation [max]].
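The error means that, for at least one index in the search, ES resolved the field as `keyword`. One way to see which indices disagree is the field capabilities API (`GET /graylog_*/_field_caps?fields=supn_percent`; the `graylog_*` pattern is an assumption). Below is a rough Python sketch that picks conflicts out of such a response. The function name and the sample response are hypothetical. If the mappings really are identical everywhere, it should return nothing, which would be useful to know in itself.

```python
def conflicting_types(field_caps_response, field):
    """Return {type: indices} when a field resolves to more than one type.

    Returns an empty dict when every index agrees on the type.
    """
    per_type = field_caps_response.get("fields", {}).get(field, {})
    if len(per_type) < 2:
        return {}
    # When types conflict, each type entry lists the indices that use it.
    return {t: caps.get("indices", []) for t, caps in per_type.items()}


# Hypothetical sample response, not taken from a real cluster.
sample = {
    "indices": ["graylog_11", "graylog_12"],
    "fields": {
        "supn_percent": {
            "keyword": {"type": "keyword", "searchable": True,
                        "aggregatable": True, "indices": ["graylog_11"]},
            "float": {"type": "float", "searchable": True,
                      "aggregatable": True, "indices": ["graylog_12"]},
        }
    },
}
print(conflicting_types(sample, "supn_percent"))
```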

Interesting… do you have any other numeric values in the same stream/index that do work? It seems that all of my numeric fields are behaving the same way. How many shards do you have allocated for the index?

Yeah, I actually have 3 additional floats and 1 long that are working fine. 4 shards.

Wondering if maybe my issue started after an index rotation and I never noticed… have you rotated indices again (either manually or scheduled), and has the issue followed?

I’m curious if this is indicative of some issue with either the deflector or the closing of the old index and the opening of the new.

Additional information. This was also the case for me on 4.0.2. I don’t recall it being an issue on my previous 3.x version.

I doubt rotating the index will have any effect since searching any timeframe in any prior index manifests the issue, but I’ll give it a try.

This definitely only just started for me on the most recent version following the first scheduled rotation. Searches within what is the immediate previous index while it was the current index worked fine, but we did upgrade to 4.0.5 in that timeframe so this would be the first index rotation following the upgrade to 4.0.5.

Alas, rotating the active write index did not change anything, but curiously searching in the prior index still works. So it appears it’s only occurring in indices created before the latest update?

What version did you upgrade from?

We follow the releases as they come out, so Graylog 4.0.4. But the upgrade for Graylog would have occurred right at the end of February, whereas the upgrade beyond 7.10 to 7.11.2 happened around 20 days ago. So I’m leaning towards this being a version incompatibility rather than a Graylog bug, at least from my perspective. It is curious that you’re seeing something similar however.

Hm, so although I don’t see any case where this could apply because the index mappings are in place, there is something pertinent listed in breaking changes for 7.11.

The significant text aggregation now throws an error if applied to a numeric field.

The significant text aggregation could previously be applied to fields that were defined as numeric, which made little sense and would always return an empty result. Given that applying a text-specific aggregation to a non-text field is almost certainly a mistake, this has now been changed to throw an error.

Possibly, but I don’t really know what a “significant text aggregation” is. And I don’t see anything similar in the 6.8.x notes…

I wonder if there’s a way to analyze the requests between GL and ES to see whether something is being requested incorrectly from ES or returned incorrectly by ES.

You could inspect the requests and replies by watching network activity in the developer console in Chrome. Not sure if other browsers have the same. I may do that next week, it’s a good thought.
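Another way to narrow it down, as a sketch only: replay the same kind of aggregation against ES directly, one index at a time, and see which indices reject it. The Python below assumes an unauthenticated node at `http://localhost:9200`. The helper names, the aggregation name `max_value` and the sample error body are made up for illustration, and the sample is built around the error text quoted earlier rather than captured from a real cluster.

```python
import json
import urllib.error
import urllib.request


def max_agg_body(field):
    """Search body that asks only for max(field), with no hits returned."""
    return {"size": 0, "aggs": {"max_value": {"max": {"field": field}}}}


def error_reasons(error_response):
    """Collect the root-cause reasons from an ES error response."""
    causes = error_response.get("error", {}).get("root_cause", [])
    return [c.get("reason", "") for c in causes]


def try_max(index, field, base_url="http://localhost:9200"):
    """Run the aggregation on one index; return (ok, details)."""
    req = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(max_agg_body(field)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return True, json.load(resp)["aggregations"]["max_value"]
    except urllib.error.HTTPError as err:
        # ES returns a JSON error body alongside the 4xx/5xx status.
        return False, error_reasons(json.load(err))


# Hypothetical error body, shaped like a failed search response.
sample_error = {
    "error": {
        "root_cause": [{
            "type": "illegal_argument_exception",
            "reason": "Field [supn_percent] of type [keyword] is not "
                      "supported for aggregation [max]",
        }],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
    },
    "status": 400,
}
print(error_reasons(sample_error))
```

Calling `try_max("graylog_11", "supn_percent")` for each index name in turn (the index name here is a placeholder) would show whether ES itself rejects the aggregation for that index, independent of whatever Graylog sends.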