Our Graylog installation has just started to ignore a bunch of previously working syslog sources, with lots of messages like these ones in /path/to/graylog-server/server.log:
WARN [Messages] Failed to index message: index=<> id=<> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [application_name]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"prism_gateway\""}}>
ERROR [Messages] Failed to index [499] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.
and it seems that all messages from the affected source(s) are now lost (not showing up in Graylog), even the well-formatted ones that had always shown up before.
I suspect this started after a restart of Graylog (automatic upgrade from version 2.4.3 → 2.4.4).
But there were many things happening at the same time. The Disk Journal also got filled up due to not coping with more logs from new and old sources.
We had to add more RAM to Graylog to handle the increased volume of incoming messages (and restart it again).
After enough hair pulling, simply rotating the index that Graylog complained about made the errors go away, and processing of the affected sources started working again.
Unless you’ve created a custom index template with the correct mapping for that field, Elasticsearch will try to “guess” the type of each field from the first message it receives in the index.
In that case, it seems that the contents of the “application_name” message field contained something resembling a date in the first message which was written into that Elasticsearch index.
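To illustrate the effect described above, here is a toy simulation in Python. It is not how Elasticsearch is implemented: the `ToyIndex` class and the `DATE_LIKE` pattern are made up for this sketch, and the pattern only roughly approximates Elasticsearch's default date detection. It just shows why the first value seen for a field decides whether later values are accepted.

```python
import re

# Hypothetical, simplified stand-in for Elasticsearch's date detection.
DATE_LIKE = re.compile(r"^\d{4}[-/]\d{2}[-/]\d{2}([T ]\d{2}:\d{2}:\d{2})?$")


class ToyIndex:
    """Toy model of one index with dynamic mapping (strings only)."""

    def __init__(self):
        self.mapping = {}  # field name -> "date" or "keyword"

    def index(self, doc):
        for field, value in doc.items():
            looks_like_date = bool(DATE_LIKE.match(value))
            if field not in self.mapping:
                # First value seen for this field fixes its type for the index.
                self.mapping[field] = "date" if looks_like_date else "keyword"
            elif self.mapping[field] == "date" and not looks_like_date:
                raise ValueError(
                    f'failed to parse [{field}]: Invalid format: "{value}"'
                )


# A date-looking application_name arrives first and fixes the type as "date".
index = ToyIndex()
index.index({"application_name": "2018-05-01"})

try:
    index.index({"application_name": "prism_gateway"})
    rejected = False
except ValueError as err:
    rejected = True
    print(err)

# After a rotation the new index starts with an empty mapping again.
rotated = ToyIndex()
rotated.index({"application_name": "prism_gateway"})
print(rotated.mapping)
```

This would also explain why rotating the index helped: the fresh index guesses the field type again from whatever message arrives first.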
Hmm… that makes sense as someone added new (spammy) syslog sources.
Given that Graylog has some predefined fields by default (like application_name), wouldn’t it make sense for Graylog itself to enforce at least those by adding a custom index mapping for them? I can log a GitHub issue for that.
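For reference, a custom mapping of the kind discussed here might look roughly like the sketch below, which only builds and prints the template body. It assumes an Elasticsearch 5.x style template (a `template` pattern key, a `message` mapping type, and the `keyword` field type) and the default `graylog` index prefix; the helper name `build_template` is hypothetical. On Elasticsearch 2.x the field definition would have to differ, since `keyword` does not exist there.

```python
import json


def build_template(index_prefix="graylog", fields=("application_name",)):
    """Build an index template body forcing the given fields to be strings.

    Assumes an Elasticsearch 5.x template layout; adjust for other versions.
    """
    return {
        # Apply to every index whose name starts with the prefix.
        "template": f"{index_prefix}_*",
        "mappings": {
            "message": {
                "properties": {name: {"type": "keyword"} for name in fields}
            }
        },
    }


body = build_template()
print(json.dumps(body, indent=2))
```

The printed JSON would be uploaded to Elasticsearch's `_template` endpoint under a name of your choosing. Templates only apply to indices created afterwards, so the active index still needs to be rotated once for the mapping to take effect.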
btw. “graylog_deflector” is missing from the latest docs too.
ok, my bad, but how is application_name generated then? Because this is a pretty vanilla graylog installation, and we never installed an extractor for that field AFAICT.