ElasticsearchException ... Limit of total fields [1000] has been exceeded

Description of your problem

Hello,
I'm getting "ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Limit of total fields [1000] has been exceeded]]".

Description of steps you’ve taken to attempt to solve the issue

I managed to solve this temporarily by moving all Winlogbeat logs into their own index. But for the last couple of months I've had to increase the limit from 1000 to 1300 by issuing the following command:

```
curl -XPUT http://logserver:9200/graylog_14/_settings -H 'Content-Type: application/json' -d '{ "index.mapping.total_fields.limit": 1300 }'
```

I know it's not the best solution, but it's the best that I know of. Since today's upgrade to graylog-server-4.1.5-1.noarch on RedHat 8, it no longer works. I don't know if it's because of the upgrade or if it just happens to be a bad day…
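My guess is that the limit keeps coming back because every rotated index starts over at the default of 1000, so the per-index setting is lost on rotation. If that is right, maybe the higher limit could go into an index template that new graylog_* indices inherit - an untested sketch (the template name is made up, and the order has to be higher than Graylog's own template):

```
curl -XPUT 'http://logserver:9200/_template/graylog-field-limit' \
  -H 'Content-Type: application/json' -d '{
  "index_patterns": ["graylog_*"],
  "order": 1,
  "settings": {
    "index.mapping.total_fields.limit": 1300
  }
}'
```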

Environmental information

Operating system information

RedHat 8.4 with latest patches

Package versions

elasticsearch-oss-7.10.2-1.x86_64
mongodb-org-server-4.2.16-1.el8.x86_64
graylog-server-4.1.5-1.noarch

If I look at the two most recent indices, there is some difference:

```
curl 'http://IP:9200/graylog_14/_settings'
{
  "graylog_14": {
    "settings": {
      "index": {
        "mapping": { "total_fields": { "limit": "1300" } },
        "number_of_shards": "4",
        "blocks": { "write": "true", "metadata": "false", "read": "false" },
        "provided_name": "graylog_14",
        "creation_date": "1629676803648",
        "analysis": { "analyzer": { "analyzer_keyword": { "filter": "lowercase", "tokenizer": "keyword" } } },
        "number_of_replicas": "0",
        "uuid": "sEiychjxTC2xtI3dTW027w",
        "version": { "created": "7100299" }
      }
    }
  }
}

curl 'http://IP:9200/graylog_15/_settings'
{
  "graylog_15": {
    "settings": {
      "index": {
        "mapping": { "total_fields": { "limit": "1300" } },
        "number_of_shards": "4",
        "provided_name": "graylog_15",
        "creation_date": "1630886401586",
        "analysis": { "analyzer": { "analyzer_keyword": { "filter": "lowercase", "tokenizer": "keyword" } } },
        "number_of_replicas": "0",
        "uuid": "6ClHO1OjTemcUTriq7g1_A",
        "version": { "created": "7100299" }
      }
    }
  }
}
```
I don't know how to solve this… Thanks!!

1000-1300 field names… that's a lot of fields - chances are something is parsing wrong and you are picking up ever-changing data as field names. Check any extractors, GROK patterns, set_fields() functions, etc. on your Winlogbeat input, and look at some of your messages for clues on field names that are actually data… Once you get that under control you can rotate your index or create a new one to clear out the unused fields.
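If it helps, you can also list the extractors attached to an input through the Graylog REST API (a sketch - swap in your own credentials and the input ID shown under System → Inputs):

```
curl -u admin:password "http://graylog-server:9000/api/system/inputs/<input-id>/extractors?pretty=true"
```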


Thanks @tmacgbay, will try removing some extractors.

So I had installed some dashboards that I downloaded from the community store, and some came with a lot of extractors. I have removed everything and rotated the index. It has been running fine for three days now, but I will wait a few more days, hoping this was the fix.

Thanks!


Well, I have removed all extractors and rotated the index, but I still get the same error after a few days.
I can't see that I have any extra GROK commands beyond what comes with the installation.
I have no idea how to find out what is causing this. Should I rotate the index more often, maybe once per week?

This is my default index set:

Index prefix: graylog
Shards: 4
Replicas: 0
Field type refresh interval: 5 seconds

Index rotation strategy: Index Time
Rotation period: P14D (14 days)

Index retention strategy: Delete
Max number of indices: 27

Thanks!

Are there messages that have random fields you don't need? That's where you need to clean up - wherever an extraction or pipeline process creates random fields. How many fields does a typical message have? Can you post some examples of messages received and how they break out? Nicely formatted with the forum tools, of course!

Hope my formatting is OK…
We want to store everything for all servers in Graylog and then filter the data or do forensic searches there.
On my inputs for syslog UDP/TCP and Winlogbeat I have chosen "Store full message"; perhaps that isn't the best.

  • Here are some syslog examples:
full_message
    <30>1 2021-09-24T12:58:09.517151+02:00 uppsok server - - - Fri Sep 24 12:58:09 CEST 2021 search: query='select id,rec,oaid,otag,set from uppsok where ((DCLA:ENG AND (CULTURAL AND DISTANCE\.))) order by -none limit 36 5 with result 000171ba-51933ab2;', hits: 206, init: 0, dbsearch: 8, transform: 0, send: 0, total: 8
message
    Fri Sep 24 12:58:09 CEST 2021 search: query='select id,rec,oaid,otag,set from uppsok where ((DCLA:ENG AND (CULTURAL AND DISTANCE\.))) order by -none limit 36 5 with result 000171ba-51933ab2;', hits: 206, init: 0, dbsearch: 8, transform: 0, send: 0, total: 8

Next

full_message
    <30>1 2021-09-24T12:58:09.109158+02:00 api-prod server - - - 2021-09-24T12:58:09,108 [http-bio-8080-exec-1311] INFO  whelk.component.ElasticSearch - ES query took 25 (23 server-side)
message
    2021-09-24T12:58:09,108 [http-bio-8080-exec-1311] INFO  whelk.component.ElasticSearch - ES query took 25 (23 server-side)

Next


full_message
    <27>1 2021-09-24T12:58:08.524835+02:00 git sssd ldap_child[69412 - - Failed to initialize credentials using keytab [MEMORY:/etc/krb5.keytab]: Preauthentication failed. Unable to create GSSAPI-encrypted LDAP connection.

message
    Failed to initialize credentials using keytab [MEMORY:/etc/krb5.keytab]: Preauthentication failed. Unable to create GSSAPI-encrypted LDAP connection.
  • Winlogbeat - maybe this is way too much to store via Beats:
message
    An account failed to log on.

    Subject:
    	Security ID:		S-1-0-0
    	Account Name:		-
    	Account Domain:		-
    	Logon ID:		0x0

    Logon Type:			3

    Account For Which Logon Failed:
    	Security ID:		S-1-0-0
    	Account Name:		xyz
    	Account Domain:		xyz

    Failure Information:
    	Failure Reason:		Unknown user name or bad password.
    	Status:			0xC000006D
    	Sub Status:		0xC0000064

    Process Information:
    	Caller Process ID:	0x0
    	Caller Process Name:	-

    Network Information:
    	Workstation Name:	XYZ
    	Source Network Address:	IP
    	Source Port:		58143

    Detailed Authentication Information:
    	Logon Process:		NtLmSsp 
    	Authentication Package:	NTLM
    	Transited Services:	-
    	Package Name (NTLM only):	-
    	Key Length:		0

    This event is generated when a logon request fails. It is generated on the computer where access was attempted.

    The Subject fields indicate the account on the local system which requested the logon. This is most commonly a service such as the Server service, or a local process such as Winlogon.exe or Services.exe.

    The Logon Type field indicates the kind of logon that was requested. The most common types are 2 (interactive) and 3 (network).

    The Process Information fields indicate which account and process on the system requested the logon.

    The Network Information fields indicate where a remote logon request originated. Workstation name is not always available and may be left blank in some cases.

    The authentication information fields provide detailed information about this specific logon request.
    	- Transited services indicate which intermediate services have participated in this logon request.
    	- Package name indicates which sub-protocol was used among the NTLM protocols.
    	- Key length indicates the length of the generated session key. This will be 0 if no session key was requested.

Thanks!
//Mattias

What is the resulting message that is stored in Graylog? For instance, with this line:

```
<30>1 2021-09-24T12:58:09.517151+02:00 uppsok server - - - Fri Sep 24 12:58:09 CEST 2021 search: query='select id,rec,oaid,otag,set from uppsok where ((DCLA:ENG AND (CULTURAL AND DISTANCE\.))) order by -none limit 36 5 with result 000171ba-51933ab2;', hits: 206, init: 0, dbsearch: 8, transform: 0, send: 0, total: 8
```

picking out the last couple of fields, you would want to see something like:

```
...
hits:
206

init:
0

dbsearch:
8
...
```

But if you are seeing fields like:

```
...
hits206:
init

0dbsearch:
8
...
```

then they are being captured wrong, and each time the number in the field name changes it's a whole new field for Elasticsearch to save data against. The field name hits should have a value of 206, rather than the field name being hits206.
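For example, a grok pattern along these lines (illustrative only - test it against your actual message format) would capture those trailing counters into properly named numeric fields:

```
hits: %{NUMBER:hits:int}, init: %{NUMBER:init:int}, dbsearch: %{NUMBER:dbsearch:int}
```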

@tmacgbay I'm not skilled enough to know how to verify this.
How do I actually verify what is stored in Graylog?

Thanks!

In the Graylog web interface, when you look at the message?

Ah! :slight_smile:
Here is an example from the same host. I can't see anything strange.

You can go right to Elasticsearch and list all your indices with:

curl -X GET "elastic-server:9200/_cat/indices/*?v&s=index&pretty"

Which would get results like this:

```
health status index                                     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   winevents_15                              In4T4Q8dS0z4n-9O1oE9hA   4   0       2576            0    545.2kb        545.2kb
green  open   winevents_16                              d-7Bln2PRMWD-A6SQIXTWw   4   0        102            0     81.5kb         81.5kb
green  open   winevents_17                              iwcqamvMRZSY32sDUN5z3w   4   0      18870            0      3.7mb          3.7mb
...
```

Then for any particular index (e.g. winevents_15) you can look at all the fields with:

```
curl --netrc -X GET "elastic-server:9200/winevents_15/_mapping?pretty"
```

which would give you output for the winevents_15 index like:

```
...
   },
        "streams" : {
          "type" : "keyword"
        },
        "tags" : {
          "type" : "keyword"
        },
        "timestamp" : {
          "type" : "date",
          "format" : "uuuu-MM-dd HH:mm:ss.SSS"
        },
        "havana_dues" : {
          "type" : "long"
        },
        "havana_cost" : {
          "type" : "long"
        }
```

There is likely an easier way to do this, but essentially you are looking for an index that has far more fields than you want… in particular, nonsensical fields that you don't want to keep.
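If you have jq installed, a quick way to count the top-level fields in an index mapping is something like this (a sketch - adjust the host and index name to yours):

```
curl -s "elastic-server:9200/winevents_15/_mapping" | jq '.winevents_15.mappings.properties | length'
```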


@tmacgbay I must say this is difficult to understand :slight_smile:
Maybe it's too much to post, but I don't know how to leave out what is not important.
Should I look at docs.count for a hint on where the problem lies?
Is it possible to close all indexes and start over?
Sorry for the big post…

```
curl -X GET "log01:9200/_cat/indices/*?v&s=index&pretty"
health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .tasks              -9oct7xpRPiXLYMHun6mdA   1   1          3            0     42.1kb           21kb
green  open   beats_index_0       uzkN0BQgR0ia_BCZfR5_NA   4   0    1486645            0      1.1gb          1.1gb
green  open   gl-events_0         Mk3KS8WhQ7-__GOgh0XP3A   4   0        179            0     87.7kb         87.7kb
green  open   gl-events_1         wb8vpwyRRymL9utMEgzrmg   4   0        173            0     86.2kb         86.2kb
green  open   gl-events_10        tB5DSgIGRBiZx-5QKX6Mew   4   0       2655            0    788.9kb        788.9kb
green  open   gl-events_2         icQH53CjSAijGb_VSEYVPQ   4   0    2359166            0    381.7mb        381.7mb
green  open   gl-events_3         MiKvhJS9QMasbgk7Wfpv4w   4   0      11009            0      1.9mb          1.9mb
green  open   gl-events_4         xDQktZ_bRCe1eV0u1dlEhg   4   0        262            0    106.7kb        106.7kb
green  open   gl-events_5         X18B9T8TR3uRjXQTyaLmjQ   4   0        697            0    192.5kb        192.5kb
green  open   gl-events_6         9zWALHrFToe1-9H1vx93Tg   4   0        442            0    146.7kb        146.7kb
green  open   gl-events_7         HWrWQy9RRo2dvA004Ou4YQ   4   0        231            0    103.6kb        103.6kb
green  open   gl-events_8         nEzNLs1vQiu9YbuJHvu2ew   4   0        142            0     83.1kb         83.1kb
green  open   gl-events_9         sVN4ADzCQSGV2EBsPyutjA   4   0        315            0    126.4kb        126.4kb
green  open   gl-system-events_0  xm08zDEjSrmgdblo0IVs8Q   4   0          0            0        1kb            1kb
green  open   gl-system-events_1  V-tewU6fQZySpLdBeye5AQ   4   0          0            0       832b           832b
green  open   gl-system-events_10 712WpSl0SPytLrSiA9wWtg   4   0          0            0       832b           832b
green  open   gl-system-events_2  4lObtjNjT16RcdzityVs_w   4   0          0            0       832b           832b
green  open   gl-system-events_3  DuaUty4gQuaW-quQW0Q8qw   4   0          0            0       832b           832b
green  open   gl-system-events_4  01jtPKhJRUmv9maVKxZZSQ   4   0          0            0       832b           832b
green  open   gl-system-events_5  j8J4GjQZRie9WSCCIoYg3Q   4   0          0            0       832b           832b
green  open   gl-system-events_6  tu32eMeaRvqP-EEqqflrTg   4   0          0            0       832b           832b
green  open   gl-system-events_7  QIy4Md_SSPK3z_bSNA_5aQ   4   0          0            0       832b           832b
green  open   gl-system-events_8  VA5VvyiWTsSVnoXgzCXUMQ   4   0          0            0       832b           832b
green  open   gl-system-events_9  VdE1tdLeQZ-RQWPhx-byxw   4   0          0            0       832b           832b
green  open   graylog_0           NiEp2VVsSeCbJbFgX_KBEQ   4   0   72274404            0     22.6gb         22.6gb
green  open   graylog_1           e4oCHGx0QsK2-mgEb_nZ3w   4   0  263747092            0     75.4gb         75.4gb
green  open   graylog_10          gbTzPiU4ThedH3C73qPxgA   4   0   75593930            0     23.1gb         23.1gb
green  open   graylog_11          xNbl1nGaRk-OJrwpebpKHQ   4   0   76359946            0     23.1gb         23.1gb
green  open   graylog_12          _qz9TSP4QOej6TnqM6uP6g   4   0   77574764            0     23.7gb         23.7gb
green  open   graylog_13          UZ3Zd3PnSGGmiT1TBolH0w   4   0   66986570            0     20.9gb         20.9gb
green  open   graylog_14          sEiychjxTC2xtI3dTW027w   4   0   44073994            0     12.9gb         12.9gb
green  open   graylog_15          6ClHO1OjTemcUTriq7g1_A   4   0   35325711            0     10.3gb         10.3gb
green  open   graylog_16          K4sO2MVySjW7NXHEd5N4gQ   4   0    7961839            0      2.2gb          2.2gb
green  open   graylog_17          e4uEiK_ESYCzAYtC1tWWCQ   4   0   24488079            0      6.8gb          6.8gb
green  open   graylog_18          0fzOQFeGSAKp3eku2lmkfw   4   0    2509392            0      870mb          870mb
green  open   graylog_2           61D-0c2jRQygJ-bT_ViV0g   4   0   60665027            0     17.9gb         17.9gb
green  open   graylog_3           TkrafyN4QSKEgFM1VEjn1A   4   0   72468335            0     21.6gb         21.6gb
green  open   graylog_4           uB6Kw-loQfqlXtoFaSqISw   4   0   69568197            0     20.8gb         20.8gb
green  open   graylog_5           wF2vBLUqS8utMq1gvkZEWw   4   0   75445288            0     22.8gb         22.8gb
green  open   graylog_6           O5hU6FKpQaiE39pxAN6CcQ   4   0   75206338            0     22.8gb         22.8gb
green  open   graylog_7           t2nhE36eSUeSrMDGa9W8Tg   4   0   45635448            0     13.8gb         13.8gb
green  open   graylog_8           i_gPA05xTdycAVitwYWw0g   4   0   34314140            0     10.5gb         10.5gb
green  open   graylog_9           P5KyDktqTYyQYMdWzxsZtQ   4   0   82860321            0       25gb           25gb
```

I took the second command you posted and piped it to wc to get a hint of how many lines there were.
Perhaps that is the wrong approach. Every index has a huge number of fields; some examples are at the bottom.

```
root@log01:~# curl --netrc -X GET "log01:9200/graylog_0/_mapping?pretty" | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 26039  100 26039    0     0  12.4M      0 --:--:-- --:--:-- --:--:-- 12.4M
919
root@log01:~# curl --netrc -X GET "log01:9200/graylog_18/_mapping?pretty" | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 34941  100 34941    0     0  16.6M      0 --:--:-- --:--:-- --:--:-- 16.6M
1597
root@log01:~# curl --netrc -X GET "log01:9200/beats_index_0/_mapping?pretty" | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 37186  100 37186    0     0  17.7M      0 --:--:-- --:--:-- --:--:-- 17.7M
1210
root@log01:~# curl --netrc -X GET "log01:9200/graylog_1/_mapping?pretty" | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 14166  100 14166    0     0  13.5M      0 --:--:-- --:--:-- --:--:-- 13.5M
532
```

From Beats:

```
        "winlogbeat_winlog_user_domain" : {
          "type" : "keyword"
        },
        "winlogbeat_winlog_user_identifier" : {
          "type" : "keyword"
        },
        "winlogbeat_winlog_user_name" : {
          "type" : "keyword"
        },
        "winlogbeat_winlog_user_type" : {
          "type" : "keyword"
        },
```

From normal syslog:

```
    },
    "floser" : {
      "type" : "keyword"
    },
    "fogelvik" : {
      "type" : "keyword"
    },
    "fors" : {
      "type" : "keyword"
    },
    "forsberg" : {
      "type" : "keyword"
    },
```

I have mapping entries like this for everything: usernames, searches, machine names…
No clue what to look for.

Thanks!!!

What are the fields floser, fogelvik, fors, forsberg… are they fields you are creating from a syslog message? What do they mean? (I am American, so I have an unfortunately narrow view of language… :slight_smile: )

Yes! The fields are created from syslog messages. They can be usernames or search words from a web search; "forsberg" is a last name… :slight_smile:

Have I set up my syslog clients wrong, and is that what's creating all these fields?
All clients have this rsyslog.conf entry:

```
*.emerg;*.=warn;\
  auth,authpriv.*;\
  cron,daemon.*;                             @log-server:514;RSYSLOG_SyslogProtocol23Format
```

That is definitely what it is. All those things (usernames, search strings, etc.) should be values contained in a field, not field names themselves. Use Graylog to parse each kind of incoming syslog message with an extractor or in the pipeline; then you get fields that describe the information coming in. Something more along the lines of:

```
},
"username" : {
  "type" : "keyword"
},
"search_word" : {
  "type" : "keyword"
},
"last_name" : {
  "type" : "keyword"
},
"src_ip" : {
  "type" : "keyword"
},
```
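As a rough sketch, a pipeline rule for the uppsok search messages you posted earlier could look something like this (the rule name, condition, and grok pattern are illustrative - match them to your actual messages):

```
rule "parse uppsok search stats"
when
    // only touch messages that look like the uppsok search log line
    contains(to_string($message.message), "dbsearch:")
then
    // pull the trailing counters into named, typed fields
    let parsed = grok(
        pattern: "hits: %{NUMBER:hits:int}, init: %{NUMBER:init:int}, dbsearch: %{NUMBER:dbsearch:int}",
        value: to_string($message.message),
        only_named_captures: true
    );
    set_fields(parsed);
end
```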

OK!
The thing is, I need to keep all the data coming in from the syslog and Winlogbeat clients. I have them on different inputs/ports. Is there a good guide on how to split everything into different indices/extractors, or some other solution that will prevent these problems?

Thanks for the great support!!

Here is an excellent video to help you start (Lawrence Systems - Graylog). It gives a good introductory explanation of extractors, indexes, and the pipeline. It doesn't go into depth on how to use the extractors/pipeline, but it is an excellent framework to build from and will make it easier as you delve into the documentation (Extractors and Pipelines).

Good luck!


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.