Hello everyone
I have done a bad thing: I pushed all the logs from Windows and Linux to the same index and hit the 1000 fields limit in ES.
So I made separate indices for Windows, Linux and Metricbeat and set up routing. The original index remains for Windows. New messages work fine, and I tried reindexing older Linux messages into their own index. ES processed it just fine, but they don’t show up in Graylog. I have tried recalculating index ranges, but it seems to do nothing. Index detail shows 222k messages, but results show only 14k messages (those stored since the reindex).
Graylog 4.0.8 on Debian 10 (Graylog itself shows Debian 11, but that’s not true), ES 6.8.16, mongo 4.2.14
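For readers following along, the manual reindex described above can be sketched roughly like this. It is only a sketch under assumptions: the index names are taken from the listing later in the thread and their roles are a guess, and the `os_type` selector is a hypothetical placeholder for whatever actually marks a Linux message. Note that `_reindex` copies each document's `_source` unchanged unless a script is supplied.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: ES listens on localhost


def build_reindex_body(source_index, dest_index, query):
    """Build the JSON body for POST /_reindex."""
    return {
        "source": {"index": source_index, "query": query},
        "dest": {"index": dest_index},
    }


def post_json(path, body, base_url=ES_URL):
    """POST a JSON body to Elasticsearch and return the decoded reply."""
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical selector: replace with whatever marks a Linux message.
LINUX_QUERY = {"term": {"os_type": "linux"}}

# Assumed roles: "sgt__0" as the original index, "sgt-linux_0" as the new one.
REINDEX_BODY = build_reindex_body("sgt__0", "sgt-linux_0", LINUX_QUERY)
```

Calling `post_json("/_reindex", REINDEX_BODY)` would start the copy; the reply reports how many documents were created or updated.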
I bet you won't do that again
There might be something here that will help you about reindexing/Elasticsearch
I'm not sure exactly what you did or how your configuration is set up, but I know it's about the communication between Graylog and Elasticsearch. Do you see anything in the log files about this issue?
Hope that helps
I created the index via Graylog itself, so it knows about it. Messages indexed into it via Graylog are found and shown, but older messages (re)indexed manually via the ES API are not.
I don’t see any log entries when searching the index or recalculating ranges.
Can you try a couple of commands on your Elasticsearch?
The following commands are for localhost; if your ES config is different, replace localhost with the correct address.
Not sure if you have done this already, but it might help identify what's going on.
ES Health Check
curl -XGET http://localhost:9200/_cluster/health?pretty=true
gl-events_19 1 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-events_19 3 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-events_19 2 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-events_19 0 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_21 1 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_21 2 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_21 3 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_21 0 p STARTED 0 261b 127.0.0.1 Scmw-HV
sgt-metrics_0 1 p STARTED 236716 144.9mb 127.0.0.1 Scmw-HV
sgt-metrics_0 3 p STARTED 236413 144.6mb 127.0.0.1 Scmw-HV
sgt-metrics_0 2 p STARTED 238112 145.5mb 127.0.0.1 Scmw-HV
sgt-metrics_0 0 p STARTED 238510 146.1mb 127.0.0.1 Scmw-HV
webbox_1 1 p STARTED 4999724 2.2gb 127.0.0.1 Scmw-HV
webbox_1 3 p STARTED 4999812 2.2gb 127.0.0.1 Scmw-HV
webbox_1 2 p STARTED 4999899 2.2gb 127.0.0.1 Scmw-HV
webbox_1 0 p STARTED 5001010 2.2gb 127.0.0.1 Scmw-HV
sgt-linux_0 1 p STARTED 62611 36.9mb 127.0.0.1 Scmw-HV
sgt-linux_0 2 p STARTED 62615 35.4mb 127.0.0.1 Scmw-HV
sgt-linux_0 3 p STARTED 62340 35.3mb 127.0.0.1 Scmw-HV
sgt-linux_0 0 p STARTED 62811 35.3mb 127.0.0.1 Scmw-HV
sgt__1 1 p STARTED 132286 77.8mb 127.0.0.1 Scmw-HV
sgt__1 2 p STARTED 132394 77.5mb 127.0.0.1 Scmw-HV
sgt__1 3 p STARTED 132794 77.6mb 127.0.0.1 Scmw-HV
sgt__1 0 p STARTED 132926 78mb 127.0.0.1 Scmw-HV
gl-system-events_20 1 p STARTED 0 230b 127.0.0.1 Scmw-HV
gl-system-events_20 3 p STARTED 0 230b 127.0.0.1 Scmw-HV
gl-system-events_20 2 p STARTED 0 230b 127.0.0.1 Scmw-HV
gl-system-events_20 0 p STARTED 0 230b 127.0.0.1 Scmw-HV
webbox__0 1 p STARTED 4998551 1.6gb 127.0.0.1 Scmw-HV
webbox__0 3 p STARTED 4998956 1.6gb 127.0.0.1 Scmw-HV
webbox__0 2 p STARTED 5002882 1.6gb 127.0.0.1 Scmw-HV
webbox__0 0 p STARTED 4999638 1.6gb 127.0.0.1 Scmw-HV
wb2__0 1 p STARTED 48 122.9kb 127.0.0.1 Scmw-HV
wb2__0 3 p STARTED 59 209.5kb 127.0.0.1 Scmw-HV
wb2__0 2 p STARTED 41 120.5kb 127.0.0.1 Scmw-HV
wb2__0 0 p STARTED 52 183.2kb 127.0.0.1 Scmw-HV
gl-events_18 1 p STARTED 3 9.3kb 127.0.0.1 Scmw-HV
gl-events_18 3 p STARTED 2 8.5kb 127.0.0.1 Scmw-HV
gl-events_18 2 p STARTED 5 10.9kb 127.0.0.1 Scmw-HV
gl-events_18 0 p STARTED 5 11.2kb 127.0.0.1 Scmw-HV
sgt__0 1 p STARTED 1843105 1.6gb 127.0.0.1 Scmw-HV
sgt__0 2 p STARTED 1839851 1.6gb 127.0.0.1 Scmw-HV
sgt__0 3 p STARTED 1840025 1.6gb 127.0.0.1 Scmw-HV
sgt__0 0 p STARTED 1840398 1.6gb 127.0.0.1 Scmw-HV
gl-events_21 1 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-events_21 2 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-events_21 3 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-events_21 0 p STARTED 0 261b 127.0.0.1 Scmw-HV
webbox_0 1 p STARTED 5000119 2gb 127.0.0.1 Scmw-HV
webbox_0 3 p STARTED 5000415 2gb 127.0.0.1 Scmw-HV
webbox_0 2 p STARTED 4997033 2gb 127.0.0.1 Scmw-HV
webbox_0 0 p STARTED 5002636 2gb 127.0.0.1 Scmw-HV
gl-system-events_18 1 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_18 2 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_18 3 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_18 0 p STARTED 0 261b 127.0.0.1 Scmw-HV
webbox__1 1 p STARTED 1203937 445.9mb 127.0.0.1 Scmw-HV
webbox__1 2 p STARTED 1202448 445.3mb 127.0.0.1 Scmw-HV
webbox__1 3 p STARTED 1204902 448.2mb 127.0.0.1 Scmw-HV
webbox__1 0 p STARTED 1203242 446.4mb 127.0.0.1 Scmw-HV
graylog_0 1 p STARTED 1341830 1.4gb 127.0.0.1 Scmw-HV
graylog_0 3 p STARTED 1342102 1.4gb 127.0.0.1 Scmw-HV
graylog_0 2 p STARTED 1341283 1.4gb 127.0.0.1 Scmw-HV
graylog_0 0 p STARTED 1342823 1.4gb 127.0.0.1 Scmw-HV
gl-events_20 1 p STARTED 0 230b 127.0.0.1 Scmw-HV
gl-events_20 2 p STARTED 0 230b 127.0.0.1 Scmw-HV
gl-events_20 3 p STARTED 0 230b 127.0.0.1 Scmw-HV
gl-events_20 0 p STARTED 0 230b 127.0.0.1 Scmw-HV
gl-system-events_19 1 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_19 3 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_19 2 p STARTED 0 261b 127.0.0.1 Scmw-HV
gl-system-events_19 0 p STARTED 0 261b 127.0.0.1 Scmw-HV
ES Shard Info
output
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=false]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=false]"
  },
  "status": 400
}
ES List Indices
output
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open gl-events_11 JsuasY0oT4KA4mC50QAsMQ 4 0 2 0 17.9kb 17.9kb
green open gl-system-events_19 T5QM2zNsRd2ww_ktItN2Hw 4 0 0 0 1kb 1kb
green open gl-system-events_14 OTNr9pz-SguqCozjSo7m9A 4 0 0 0 1kb 1kb
green open webbox_1 RJ7_3D8PQgKUwL4pYMv_7w 4 0 20000445 0 8.8gb 8.8gb
green open gl-events_12 kahttUWyRzeRJX1_SgC4Dg 4 0 0 0 1kb 1kb
green open gl-events_18 BK-WFN_FQ7qpOJuhS1ahag 4 0 15 0 40.1kb 40.1kb
green open gl-events_16 UvUbCIxdS9acO14Do4kKhw 4 0 0 0 1kb 1kb
green open gl-events_21 -IVVw7ZER1q1UYMrmDxskQ 4 0 0 0 1kb 1kb
green open webbox__1 MQSJRrBcSC6UXlrLYbzp3w 4 0 4814389 0 1.7gb 1.7gb
green open gl-events_10 nJVDx0kkQbawl1Do25E4TQ 4 0 1 0 9.4kb 9.4kb
green open sgt-linux_0 Cx84pdHhR5O2RzIzrEIk9w 4 0 250377 6 143.1mb 143.1mb
green open gl-system-events_15 B-G57PUWSbCEqsJ-DRoR-g 4 0 0 0 1kb 1kb
green open graylog_0 oRJ-90o1TIaFY3p_DfaDEQ 4 0 5368038 0 5.6gb 5.6gb
green open gl-system-events_17 PWcuhS4CQGq4hO5VFXa6uQ 4 0 0 0 1kb 1kb
green open gl-system-events_10 DA2lydduRDC8cw4ZTyRpYA 4 0 0 0 1kb 1kb
green open gl-events_19 2ZQZZHWVScipu_bw_gRiGQ 4 0 0 0 1kb 1kb
green open gl-events_20 7NuoC1lxQeCEjzNc_D88Bw 4 0 0 0 920b 920b
green open gl-system-events_16 BfFSc-d8S0OvCp76LPwWqA 4 0 0 0 1kb 1kb
green open gl-events_14 djgmkXzCRfmQn_vhR4X1SA 4 0 0 0 1kb 1kb
green open gl-events_15 k4ZMDR2MQwScYDvxSdup5A 4 0 0 0 1kb 1kb
green open gl-system-events_18 LRU6IAfoSuC2J8OLWV1znQ 4 0 0 0 1kb 1kb
green open sgt__1 zoHPZTdbQsWDs4opag1A_Q 4 0 530227 0 310.9mb 310.9mb
green open webbox_0 x1yBam6JQ7SQNsT6JuRlxg 4 0 20000203 0 8.3gb 8.3gb
green open gl-events_17 wi-C1sv0Q9mY1cfBgygIHQ 4 0 0 0 1kb 1kb
green open gl-system-events_13 0jk_N9nlTZ-ux8B-PXsbQA 4 0 0 0 1kb 1kb
green open gl-system-events_11 8W8vT9vLQwaUqCasWRMFrA 4 0 0 0 1kb 1kb
green open wb2__0 YQ000ligQqGBqSkQETCpew 4 0 200 0 636.2kb 636.2kb
green open gl-system-events_21 XYNzEHnwTEiS1VV5jUcQtw 4 0 0 0 1kb 1kb
green open sgt__0 IYqFODXrTl67JGAskNXUMA 4 0 7363379 0 6.6gb 6.6gb
green open gl-system-events_20 eelxa_fZR0SG1j_Yv6zRkA 4 0 0 0 920b 920b
green open gl-system-events_12 L1pHpMPCRzifriSSHAFuxQ 4 0 0 0 1kb 1kb
green open gl-events_13 V9Nd0EdTTP-pV_I6QOm-LQ 4 0 35 0 44.6kb 44.6kb
green open webbox__0 IjbuABNQRZOzRvcDCcvmTg 4 0 20000027 0 6.7gb 6.7gb
green open sgt-metrics_0 7AYFY1_jSE26OwBQX6sRgQ 4 0 949696 0 581.4mb 581.4mb
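The checks above can also be collected from a small script. This is a minimal sketch: the health URL is the one given earlier in the thread, while `_cat/shards?v` and `_cat/indices?v` are my assumption about where the shard and index listings came from, since the exact commands are not shown. The helper names are made up for illustration.

```python
import urllib.request

ES_URL = "http://localhost:9200"  # same address as used in the thread

# Name of each check mapped to the ES endpoint that answers it.
CHECKS = {
    "health": "/_cluster/health?pretty=true",
    "shards": "/_cat/shards?v",
    "indices": "/_cat/indices?v",
}


def check_url(name, base_url=ES_URL):
    """Return the full URL for one of the named checks."""
    return base_url.rstrip("/") + CHECKS[name]


def run_check(name, base_url=ES_URL):
    """Fetch one check from Elasticsearch and return the reply as text."""
    with urllib.request.urlopen(check_url(name, base_url), timeout=10) as response:
        return response.read().decode("utf-8")
```

For example, `print(run_check("indices"))` would print the index listing with its header row.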
Hello,
Thanks for the added info. I really didn't notice anything that stuck out as wrong. It actually looks good.
As for you
I have never had your issue before, so it's a little unclear to me how to recover old messages that were combined in the same index and then separated. I really don't know what to tell you. Maybe someone else here has an idea how to retrieve those older messages.
What is your index retention configuration set to? If it is set to delete/close, maybe this happened when you recalculated the index ranges. Have you also tried rotating the active write index?
I think the recalculate index ranges function/button does not work.
According to the source code, there should be something in the log (I have logging set to “info”), but there is nothing.
Edit:
The index ranges of most indices (including this one) are begin:0, end:0. It seems to me that the active write index does not use index ranges (I think the mentioned code also skips them).
The index is set to hold 20M messages (default setting), max 20 indices, and then delete. It has not rotated yet, so there is only one index with 4 shards.
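One way to see what time span the documents in an index really cover, independent of the begin/end values Graylog has stored, is to ask ES for the oldest and newest timestamp. This is a sketch under the assumption that the messages carry a date field named `timestamp`; the function names are made up for illustration.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: ES listens on localhost


def build_timestamp_range_body(field="timestamp"):
    """Search body that returns no hits, only min/max of the date field."""
    return {
        "size": 0,
        "aggs": {
            "oldest": {"min": {"field": field}},
            "newest": {"max": {"field": field}},
        },
    }


def extract_range(search_reply):
    """Pull the formatted min/max values out of a _search reply."""
    aggs = search_reply["aggregations"]
    return (
        aggs["oldest"].get("value_as_string"),
        aggs["newest"].get("value_as_string"),
    )


def timestamp_range(index, base_url=ES_URL):
    """Run the aggregation against one index and return (oldest, newest)."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/" + index + "/_search",
        data=json.dumps(build_timestamp_range_body()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_range(json.loads(response.read().decode("utf-8")))
```

Something like `timestamp_range("sgt-linux_0")` would then show whether the reindexed older messages are in the index with the timestamps you expect.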
Okay, so I have solved the “mystery”.
Graylog uses the stream ID in the search query, and since the messages from the old index carried a different stream ID, ES did not return them.
So I updated all messages in the index to have the new stream ID, and Graylog now shows them.
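The exact update is not shown here, so the following is only a sketch of how such a fix could look with `_update_by_query`. It assumes each message stores its stream IDs as a list in a field named `streams`; the stream IDs and helper names are hypothetical placeholders.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: ES listens on localhost

# Painless script: drop the old stream ID from the list, add the new one once.
SWAP_SCRIPT = (
    "if (ctx._source.streams.contains(params.old_id)) { "
    "ctx._source.streams.remove(ctx._source.streams.indexOf(params.old_id)); } "
    "if (!ctx._source.streams.contains(params.new_id)) { "
    "ctx._source.streams.add(params.new_id); }"
)


def build_stream_swap_body(old_stream_id, new_stream_id):
    """Body for POST /<index>/_update_by_query that swaps one stream ID."""
    return {
        # Only touch documents that still carry the old stream ID.
        "query": {"term": {"streams": old_stream_id}},
        "script": {
            "lang": "painless",
            "source": SWAP_SCRIPT,
            "params": {"old_id": old_stream_id, "new_id": new_stream_id},
        },
    }


def swap_stream_id(index, old_stream_id, new_stream_id, base_url=ES_URL):
    """Send the update to Elasticsearch and return the decoded reply."""
    url = base_url.rstrip("/") + "/" + index + "/_update_by_query?conflicts=proceed"
    body = build_stream_swap_body(old_stream_id, new_stream_id)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `swap_stream_id("sgt-linux_0", "OLD_STREAM_ID", "NEW_STREAM_ID")` would rewrite only the documents that still reference the old stream, so it is safe to run again if it gets interrupted.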