1. Describe your incident:
I have had a sudden CPU spike that has been ongoing for most of the day. Prior to this, CPU utilization was low, and no changes have been made to the deployment.
2. Describe your environment:
OS Information:
Ubuntu 20.04.5 LTS
Package Version:
4.3.9
Service logs, configurations, and environment variables:
In the server.log file I see the following:
@graylog:/var/log/graylog-server$ tail server.log
at org.graylog.shaded.elasticsearch7.org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:496)
at org.graylog.shaded.elasticsearch7.org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:407)
at org.graylog.shaded.elasticsearch7.org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:437)
... 33 more
Caused by: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [message] in order to load field data by uninverting the inverted index. Note that this can use significant memory.]]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:496)
at org.graylog.shaded.elasticsearch7.org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:407)
at org.graylog.shaded.elasticsearch7.org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:437)
... 35 more
3. What steps have you already taken to try and solve the problem?
Ran `htop` and saw that the graylog-server process is utilizing all cores. Considering a reboot but have not done so yet.
From what I can understand, this is the relevant part of the logs:
reason=Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [message] in order to load field data by uninverting the inverted index. Note that this can use significant memory.]]
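For context, the error itself describes the standard fix: aggregations and sorting should target a `keyword` field rather than an analyzed `text` field. As an illustrative (not Graylog-specific) Elasticsearch example, assuming a `message.keyword` sub-field is mapped, a terms aggregation would look like this:

```json
{
  "size": 0,
  "aggs": {
    "top_messages": {
      "terms": { "field": "message.keyword" }
    }
  }
}
```

The alternative the error mentions, setting `fielddata=true` on the `text` field, loads field data onto the JVM heap by uninverting the index, which can use significant memory — hence the warning in the message.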
We need some more information to help you further. Chances are something is going on with Elasticsearch.
It could be a bad GROK pattern, a pipeline, a dashboard widget, an index template, etc…
Hi @gsmith. Thanks for responding! My Graylog is still in the POC/QA stage, so this is just a one-node cluster. For months it has been operating flawlessly. I've made the application upgrades without issues, so it's strange that this issue cropped up all of a sudden. Interestingly enough, the CPU has dialed down a bit but is still very high, so the alerts aren't as constant. I do have some dashboards, but no pipelines. What information can I start giving you? I'm the only one in charge of this deployment, so I know there haven't been any modifications/changes to the Graylog configuration.
Interesting development. I shut down one of my inputs, as I know it receives more data than the others, and wouldn't you know it, the CPU utilization dropped immediately afterward. This is very strange, as I made no changes to the configured extractors. This was all working fine. I will continue to debug, but I honestly don't know where to begin.
I had this phenomenon once, and after hours of digging deep, it turned out that one of the Graylog users had forgotten a Firefox tab running a heavy “All time” query with the “Play” button enabled…
I do indeed do parsing from that input.
So for everyone watching, this turned out to be the extractor I've been using. What's interesting is that absolutely nothing has changed in its configuration. Perhaps the volume of the logs increased, which is something I would need to investigate further. But for now that seems to be the root cause.
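For anyone who hits a similar situation: an unchanged extractor can still become expensive when the *content* of incoming messages changes. As a hypothetical illustration (this is not Graylog's actual extractor code), a regex with nested quantifiers backtracks exponentially on input that almost, but not quite, matches:

```python
import re
import time

# Illustrative pathological pattern: nested quantifiers like (a+)+ force
# exponential backtracking when the overall match fails at the very end.
pathological = re.compile(r"^(a+)+$")

def match_time(pattern, text):
    """Return (matched, seconds elapsed) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start

# Input that matches: the engine succeeds quickly.
ok, fast = match_time(pathological, "a" * 22)

# One trailing character that breaks the match: the engine must try
# every way of splitting the 'a's between the two quantifiers.
bad, slow = match_time(pathological, "a" * 22 + "b")

print(f"match: {ok} in {fast:.6f}s, no match: {bad} in {slow:.6f}s")
```

So a spike in log volume, or a shift toward messages that only partially match an extractor's pattern, can peg the CPU even though the extractor itself never changed. Graylog's per-extractor metrics (on the input's Extractors page) are a good place to confirm this.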
Thanks to everyone who chimed in and offered assistance. Appreciate you!