Index template for ambiguous fields


(Zoulja) #1

I need to process Nginx access logs and am trying to understand how to handle fields that can contain either a number or a string.
Example: upstream_response_length can be "123", or "-" if no upstream response was used.
Purpose (as I understand it currently):
if the field is a number, I must be able to use math operations (max, min, mean, etc.) on it.
if the field is a string, the record must still be searchable by its other fields.
After playing with grok patterns I came up with an extractor like this:
"(%{BASE10NUM:upstream_response_length;int}|-)"
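
To make the behaviour of this pattern concrete, here is a rough Python sketch of the same alternation. It is not grok and not Graylog code: BASE10NUM is approximated as a signed integer, the function name extract is hypothetical, and it assumes the extractor simply leaves the field out when the "-" branch matches.

```python
import re

# Simplified stand-in for the grok pattern above.
# Assumption: BASE10NUM is approximated here as a signed integer.
PATTERN = re.compile(r"(?:(?P<upstream_response_length>[+-]?\d+)|-)")


def extract(value: str) -> dict:
    """Return the fields this pattern would yield for one raw value (hypothetical helper)."""
    match = PATTERN.fullmatch(value)
    fields = {}
    if match and match.group("upstream_response_length") is not None:
        # Counterpart of the ;int conversion in the grok pattern.
        fields["upstream_response_length"] = int(match.group("upstream_response_length"))
    return fields


print(extract("123"))  # {'upstream_response_length': 123}
print(extract("-"))    # {}
```

Under these assumptions a "-" value produces no upstream_response_length field at all, so the field only ever receives integers while the rest of the message is still stored.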

curl 'http://localhost:9200/_all/_mapping?pretty' shows
"upstream_response_length" : {
"type" : "long"
},

What’s the standard solution/recommendation for such cases?


(Tess) #2

Interesting situation. I believe an Elasticsearch index allows only one type per field, so it's either/or.


(Jan Doberstein) #3

The mapping in Elasticsearch can only be either-or, so I would solve this with the processing pipeline. Keep one field that holds the number (the length) so that math operations work on it. In a pipeline rule, test whether that field contains a number; if it does not, write a zero into it and set a second field that holds the original string (if needed), or a second field that simply indicates that the original value was not a number.
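
The logic described here can be sketched in a few lines of Python. This only illustrates the idea and is not Graylog pipeline rule syntax; the function name normalize_upstream_length and the extra field names upstream_response_length_raw and upstream_response_used are assumptions made up for the example.

```python
def normalize_upstream_length(message: dict) -> dict:
    """Keep upstream_response_length numeric and record whether it was a real number.

    Hypothetical helper: the two extra field names are illustrative only.
    """
    raw = str(message.get("upstream_response_length", "-")).strip()
    result = dict(message)  # leave the caller's message untouched
    try:
        result["upstream_response_length"] = int(raw)
        result["upstream_response_used"] = True
    except ValueError:
        # Not a number (for example "-"): store zero so the field stays numeric,
        # and keep the original string plus an indicator in separate fields.
        result["upstream_response_length"] = 0
        result["upstream_response_length_raw"] = raw
        result["upstream_response_used"] = False
    return result


print(normalize_upstream_length({"upstream_response_length": "123"}))
print(normalize_upstream_length({"upstream_response_length": "-"}))
```

One consequence of writing a zero: those records count toward min and mean, so aggregations would probably need to filter on the indicator field to exclude them.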

