Hi all,
We have an old Graylog 3.3.15 setup running on AWS ECS/EC2 with an AWS OpenSearch (ES 5.6) cluster behind it (yes, I know, very old versions; we plan to update this year, but I’d like to fix the current issue first).
About 2 weeks ago the ES cluster went into read-only mode because it ran out of storage. Until we noticed and fixed it, the cluster status was green but it was not accepting any new messages. After space was freed it started working again, but it only caught up to about 4 hours behind the current time and has stayed there for more than a week.
Notes:
- There was no change in how messages are being sent, timestamps look correct
- I’ve recalculated index ranges, but it changed nothing
- CPU/memory usage on the containers and the ES cluster is pretty low, so it should be able to catch up to the current time
- Graylog is constantly showing 2 notifications:
Uncommited messages deleted from journal
and
Journal utilization is too high
- I suspect that Graylog is working through the journal backlog at the same rate that new messages are being written to it, so the lag never shrinks. I’m ok with emptying the journal, even if it means losing those messages.
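
To put the suspicion from the last note in numbers, here is a rough sketch of the arithmetic. The function name and the example figures are hypothetical; the backlog size and the journal's append/read rates would have to be read off the node's journal statistics. It only assumes that the backlog shrinks by (read rate - append rate) messages per second.

```python
import math


def catch_up_seconds(backlog_messages, read_rate, append_rate):
    """Estimate how long the journal needs to drain.

    backlog_messages: unprocessed messages currently in the journal
    read_rate:        messages/second processed out of the journal
    append_rate:      messages/second written into the journal
    Returns seconds until the backlog is empty, or math.inf if the
    journal is not draining at all.
    """
    if backlog_messages <= 0:
        return 0.0
    net_drain = read_rate - append_rate
    if net_drain <= 0:
        # Reading no faster than writing: the lag stays constant or grows.
        return math.inf
    return backlog_messages / net_drain


# Hypothetical figures, not measured on our cluster.
print(catch_up_seconds(1_000_000, 500, 500))  # equal rates: never catches up
print(catch_up_seconds(1_000_000, 700, 500))  # 200 msg/s net drain
```

If the two rates really are equal, the result is infinite, which would match a lag that sits at a constant 4 hours.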
Any pointers/help would be appreciated.