Graylog: ES Log: Limit of total fields [1000] in index [graylog_519] has been exceeded and no longer processing


#1

I have seen this a few times now, and restarting generally fixes the issue. On a couple of occasions my system has kept taking in messages but stopped outputting them.

Output from curl -XGET localhost:9200/_cluster/health?pretty=true

{
  "cluster_name" : "graylog",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 465,
  "active_shards" : 465,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 465,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 50.0
}

Output of graylog-ctl status

[image: graylog-ctl status output]

Everything appears good, but about once a week my processing appears to just stop. Any ideas where to look?


#2

2017-08-08_16:50:30.56996 [DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [koB3YKZ] failed to put mappings on indices [[[graylog_519/ET4HI3gYQpOs$
2017-08-08_16:50:30.57306 java.lang.IllegalArgumentException: Limit of total fields [1000] in index [graylog_519] has been exceeded

I think this is the pertinent log. How do I increase the ES field limit, which I believe is 1000 by default based on this conversation?

Here are my index settings.


(Jochen) #3

The thread on the Elastic discussion boards you’ve linked to already shows how to increase the maximum number of fields per index/mapping, also see https://www.elastic.co/guide/en/elasticsearch/reference/5.5/mapping.html#mapping-limit-settings for details.
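For reference, the setting described in that documentation is index.mapping.total_fields.limit. Below is a minimal Python sketch of raising it on a single index. It assumes Elasticsearch is reachable at localhost:9200 (as in the curl call above) and uses 2000 purely as an example value. The helper names build_limit_request and apply_limit are made up for illustration.

```python
import json
import urllib.request


def build_limit_request(index, limit, base_url="http://localhost:9200"):
    """Build a PUT request that raises the total-fields limit of one index.

    base_url and the chosen limit are assumptions; adjust them to your setup.
    """
    body = json.dumps({"index.mapping.total_fields.limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def apply_limit(index, limit):
    """Send the request and return the raw response body from Elasticsearch."""
    with urllib.request.urlopen(build_limit_request(index, limit)) as resp:
        return resp.read().decode("utf-8")


# Example (needs a running Elasticsearch node):
# print(apply_limit("graylog_519", 2000))
```

This only changes the named index. Indices created by later rotations would need the same setting applied separately, for example through an index template, which is not shown here.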

This being said, I think having 1000 different fields in one index is a bit excessive and you should think about splitting up your messages into different index sets, so that each index has fewer than 1000 different fields: http://docs.graylog.org/en/2.3/pages/configuration/index_model.html


#4

That seems very reasonable. Do you have a recommended rotation time? I am currently rotating my index once a day, with an average of about 150 messages a second.

I am thinking of just splitting it in half, what are your thoughts on that?


(Jochen) #5

Try splitting your one-size-fits-all index set into multiple index sets, e. g. one for your application logs and one for your network appliances logs (or similar).

It’s NOT about changing the index rotation/retention settings of an index set.


#6

I figured it out, in case anyone else runs into this, here are the instructions.

Streams -> Create Stream -> Name and Choose Index Set -> Create Stream Rules that Catch Messages You Want in New Index.


#7

Jochen,

I made 9 new index sets for various types of messages.

My network index is still hitting 1000 fields in a 24-hour period according to my logs. Should I further split these logs into 6-hour chunks to alleviate this?


(Jochen) #8

You can try, but having 1000 fields in a single index still sounds wrong to me.

What type of log messages do you record in that index set? Maybe normalizing these logs at an earlier stage (e. g. using extractors or pipeline processing rules) would make sense.


#9

These are just messages from my Fortigate firewalls. I was not having this issue before 2.3.0.

I actually have this error on two different indices: my network index and my default Graylog index.


#10

Jochen,

I figured out what is inflating my field count, but not why.

Here is the pertinent information I found.

It appears Graylog is turning parts of certain DLP events into their own fields.

Here are all of the listed fields for this search criteria.

Here is the raw message that got processed.

date=2017-08-10 time=09:48:14 devname=Firewall devid=ID logid="0954024577" type="utm" subtype="dlp" eventtype="dlp" level="notice" vd="root" filteridx=0 filtertype="none" filtercat="none" severity="medium" policyid=8 sessionid=62954284 epoch=1946636875 eventid=0 srcip=10.3.20.70 srcport=57508 srcintf="port4" dstip=23.196.127.39 dstport=80 dstintf="port2" proto=6 service="HTTP" filetype="unknown" direction="incoming" action="log-only" hostname="images.outbrain.com" url="/v1/QWV2b096U1paNHhpdFExLy8rZUlWZz09/eyJpdSI6Ijk1NmVmNmRiNTAxNTAxNmY3MTAyZjE2MGE2NWRjYTRkNmJjNDcwY2Q3ZGI4NzgyZDNlZTc2NjQzYTlkMDUzOGEiLCJ3IjoyMTUsImgiOjE0MCwiZCI6MS4wLCJjcyI6MCwiZiI6MH0%3D.webp" agent="Chrome/60.0.3112.90" filename="eyJpdSI6Ijk1NmVmNmRiNTAxNTAxNmY3MTAyZjE2MGE2NWRjYTRkNmJjNDcwY2Q3ZGI4NzgyZDNlZTc2NjQzYTlkMDUzOGEiLCJ3IjoyMTUsImgiOjE0MCwiZCI6MS4wLCJjcyI6MCwiZiI6MH0=.webp" filesize=9754 profile="default"


(Jochen) #11

Try using a Raw/Plaintext TCP or UDP input for your Fortigate logs instead of a Syslog TCP or UDP input.


#12

May I ask why you think that will solve the problem?


(Jochen) #13

The Syslog input tries to be smart about the non-standard Fortigate syslog messages, but seems to fail on the URL in your logs because of special characters (like =). The Raw/Plaintext input doesn’t do any parsing.
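To illustrate the problem with = inside quoted values, here is a small Python sketch of quote-aware key=value parsing. It is not how the Graylog Syslog input works internally. It only shows that treating a double-quoted value as one unit keeps the URL and filename intact instead of spilling their contents into extra fields. The function name parse_kv and the shortened sample message are made up for illustration.

```python
import re

# A pair is key=value, where the value is either a double-quoted string
# (which may itself contain '=') or a bare token without whitespace.
PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_kv(message):
    """Return a dict of fields, keeping quoted values (URLs included) intact."""
    fields = {}
    for key, quoted, bare in PAIR.findall(message):
        # Exactly one of the two value groups is non-empty for a given pair
        # (both are empty only for an empty quoted value such as key="").
        fields[key] = quoted or bare
    return fields


# Shortened, hypothetical sample in the style of the Fortigate message above.
sample = 'srcip=10.3.20.70 url="/v1/QWV2/eyJpdSI6%3D.webp" filename="eyJpdSI6=.webp" filesize=9754'
print(sorted(parse_kv(sample)))
```

With this approach the sample yields only the four expected field names, however many = characters appear inside the quoted values.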


#14

So my best bet would be to create a new input, on a new port, set as “RAW” and then point my Fortigates at the new port?


(Jochen) #15

Yes, you can try this.


#16

First I updated the extractor I got from the Graylog Marketplace so that it extracts the URL with the pattern below. So far it appears to be working correctly, but I won't know for sure for a few hours.

Here is my updated extractor.

^.*url=\"(\/[\w\/\-\.\+@#%_\(\)\{\}\|\?\=&;]+)\"\s+
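A quick way to sanity-check a pattern like this before waiting a few hours is to run it against a sample message. The Python sketch below does that. Note that Graylog extractors use Java regular expressions, so this is only an approximation, although this particular pattern uses no syntax that differs between the two. The function name extract_url and the shortened sample are made up for illustration.

```python
import re

# The extractor pattern from above, unchanged.
URL_PATTERN = re.compile(r'^.*url=\"(\/[\w\/\-\.\+@#%_\(\)\{\}\|\?\=&;]+)\"\s+')


def extract_url(message):
    """Return the captured URL path, or None if the pattern does not match."""
    match = URL_PATTERN.search(message)
    return match.group(1) if match else None


# Shortened, hypothetical sample in the style of the Fortigate message above.
sample = (
    'service="HTTP" '
    'url="/v1/QWV2b096U1paNHhpdFExLy8rZUlWZz09/eyJpdSI6%3D.webp" '
    'agent="Chrome/60.0.3112.90"'
)
print(extract_url(sample))
```

A URL containing a character outside the bracketed set (a comma, for example) will not match at all, so unmatched messages are worth checking too.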


#17

I am no longer getting errors from my network index, but my default index still shows more than 1000 fields and is throwing errors. Is there a way to see which fields are tripping up this index?

Here is the error I am seeing.

Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'SGAWKZkziGhZA:': Encountered "<EOF>" at line 1, column 14.
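On the question of which fields are inflating an index, one option is to list the field names in its mapping and look for ones that do not belong. The Python sketch below does this under some assumptions: Elasticsearch 5.x at localhost:9200, and the 5.x response layout in which mappings are grouped by type under the index name. The helper names field_names and fetch_field_names are made up, and the count it produces may not match exactly how Elasticsearch counts fields against the limit.

```python
import json
import urllib.request


def field_names(properties, prefix=""):
    """Recursively collect dotted field names from a mapping 'properties' dict."""
    names = []
    for name, spec in properties.items():
        full = prefix + name
        names.append(full)
        # Object fields nest under "properties", multi-fields under "fields".
        names.extend(field_names(spec.get("properties", {}), full + "."))
        names.extend(field_names(spec.get("fields", {}), full + "."))
    return names


def fetch_field_names(index, base_url="http://localhost:9200"):
    """Fetch the mapping of one index and return its field names, sorted."""
    with urllib.request.urlopen(f"{base_url}/{index}/_mapping") as resp:
        mapping = json.load(resp)
    names = []
    # Assumed 5.x layout: {index: {"mappings": {type: {"properties": {...}}}}}
    for type_mapping in mapping[index]["mappings"].values():
        names.extend(field_names(type_mapping.get("properties", {})))
    return sorted(names)


# Example (needs a running Elasticsearch node; index name is a placeholder):
# for name in fetch_field_names("graylog_519"):
#     print(name)
```

Field names that look like fragments of encoded data, similar to the string in the parse error above, would stand out in the sorted list.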
