Some statistics still NaN?

uclnj · June 18, 2018, 12:14pm

I have a few extractors on a feed, pulling out bytes sent, bytes received and message level. All are grok patterns, all typed as number (int). The fields get added to the stream, but when I view field statistics, sent and received show NaN, while level (a number that doesn’t change; it’s almost always 6) works just fine. I limited the query to messages where the field sent_bytes exists.

Each of the 17 results in the attachment has a value in sent and received bytes but when I create a quick values chart, it only shows 7 messages with field rcvd_bytes.

Bug?

jan · June 18, 2018, 1:29pm

I guess those values are not saved as numbers in Elasticsearch. You should add a custom index template to ES so that they are saved as numbers.

http://docs.graylog.org/en/2.4/pages/configuration/elasticsearch.html#custom-index-mappings
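As a rough sketch of what such a custom mapping could look like for the two fields in this thread, the snippet below builds the template body in Python and prints it as JSON. It assumes the Graylog 2.4 conventions (index pattern `graylog_*`, mapping type `message`); the helper name `build_custom_template` and the choice of `long` are illustrative, not something prescribed by Graylog.

```python
import json


def build_custom_template(fields, index_pattern="graylog_*"):
    """Build an index template body that maps each given field as a long.

    Assumption: Graylog 2.4 style layout, where messages live under the
    "message" mapping type and indices match "graylog_*".
    """
    properties = {name: {"type": "long"} for name in fields}
    return {
        "template": index_pattern,
        "mappings": {"message": {"properties": properties}},
    }


# Field names taken from this thread.
template = build_custom_template(["sent_bytes", "rcvd_bytes"])
print(json.dumps(template, indent=2))
```

The printed JSON would be saved to a file and sent to Elasticsearch with a PUT to `_template/<name>`, as the linked documentation describes. A template only affects indices created after it is installed, so the active write index has to be rotated before the new types show up.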

uclnj · June 18, 2018, 1:35pm

That’s a bit of a problem then, no? The grok pattern is set to numeric:int. The message fed in is level="6" bytes="4444"; the extractor stores the 6 as a number, but the 4444 is stored as something else?

What’s the point of the grok extractor then?

jochen · June 18, 2018, 1:36pm

Elasticsearch guesses the data type of a field from the first message in the index that contains that field. If there are other messages with the message field “rcvd_bytes” or “sent_bytes” with a different data type, and one of those is indexed first, the field ends up with that type.

As @jan already said, create a custom Elasticsearch index mapping for the fields you want to analyze.
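To make the "first message wins" behaviour concrete, here is a toy model in Python. It is not Elasticsearch code and ignores most of the real dynamic mapping rules (date detection, dynamic templates and so on); the type labels and function names are made up for illustration.

```python
def guess_type(value):
    """Very rough stand-in for dynamic mapping: pick a type from a JSON value."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    # The real string type name depends on the ES version and index template.
    return "string"


def build_mapping(docs):
    """Fix each field's type from the first document that contains it."""
    mapping = {}
    for doc in docs:
        for field, value in doc.items():
            # setdefault keeps the first guess; later documents cannot change it.
            mapping.setdefault(field, guess_type(value))
    return mapping


# A string value arrives first, so the field stays a string type
# even though the next message carries a real number.
print(build_mapping([{"sent_bytes": "4444"}, {"sent_bytes": 4444}]))
```

In this model, statistics on `sent_bytes` would only make sense if the numeric value had been indexed first, which is why pinning the type with an explicit mapping is the more reliable option.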

uclnj · June 18, 2018, 3:47pm

The syslog feed will only ever have bytes for rcvd/sent and what the hell is the point of me telling it %{NUMBER:int} if Elasticsearch is going to guess the data type? Why not just %{Guess}. It’s pointless to include a feature where you can specify the data type if the backend is just going to guess the type to be stored based on what it sees.

Is there a way to view the schema of what elasticsearch thinks is the data type being stored?

jochen · June 18, 2018, 3:52pm

It would indeed work, if there were only messages having that one data type for the message field in question.
That doesn’t seem to be the case in your Elasticsearch cluster.

You probably want to read about dynamic mapping in Elasticsearch.

As for viewing what Elasticsearch thinks the data type is: yes, it’s called the index mapping.
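For reference, the mapping can be fetched over the Elasticsearch REST API with a GET on `/<index>/_mapping`. The sketch below shows one way to pull out the type of a single field per index. It assumes Elasticsearch on `http://localhost:9200`, indices matching `graylog_*`, and the ES 5.x/6.x response layout with a mapping type level (such as `message`); the function names are hypothetical.

```python
import json
import urllib.request


def fetch_mapping(base_url="http://localhost:9200", index="graylog_*"):
    """GET /<index>/_mapping and return the parsed JSON response."""
    with urllib.request.urlopen(f"{base_url}/{index}/_mapping") as resp:
        return json.load(resp)


def field_types(mapping_response, field):
    """Return {index_name: type} for every index that maps the given field."""
    result = {}
    for index_name, body in mapping_response.items():
        # Assumed layout: {"mappings": {"<type>": {"properties": {...}}}}
        for type_body in body.get("mappings", {}).values():
            if not isinstance(type_body, dict):
                continue
            props = type_body.get("properties", {})
            if field in props:
                result[index_name] = props[field].get("type")
    return result


if __name__ == "__main__":
    # Requires a reachable Elasticsearch node at the assumed address.
    print(field_types(fetch_mapping(), "sent_bytes"))
```

If `sent_bytes` comes back with a string type such as `keyword` in some index, that would explain the NaN in field statistics for messages stored there.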

uclnj · June 18, 2018, 4:09pm

Perfect, thank you all.
