I am occasionally seeing duplicated messages, in roughly 1% of cases. I have read several discussions on this, but I am still not clear on the final conclusion. Most of them attribute it to multiple streams, each with its own index, matching the same message. In my setup that explanation does not always hold. Here are some details about the setup:
- I am using Graylog 4.0, running on Kubernetes.
- I have 3 pods running the Graylog server as a cluster.
- The input is GELF Kafka (legacy mode).
- I have multiple streams, and each stream has its own index.
- Messages reach Graylog from non-Kubernetes nodes, so I don't expect any issue caused by a client and a Graylog server pod running on the same node.
I am sending all kinds of logs to Graylog, and messages are expected to match multiple streams. The problem is that sometimes a few messages get duplicated, that is, the same message is present in different indices. I have verified my message source: it does not contain duplicate messages, and every message has a unique ID, so I can tell whether a message was duplicated in Graylog.
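To make the check concrete, here is a minimal sketch of how duplicates could be identified from search results by grouping on the unique ID. It assumes the hits are Elasticsearch-style dictionaries with `_index` and `_source` keys; the field name `msg_id`, the index names, and the function `find_duplicates` are hypothetical placeholders, not taken from my actual setup.

```python
from collections import defaultdict


def find_duplicates(hits, id_field="msg_id"):
    """Return {unique_id: [indices]} for every ID stored more than once.

    `hits` is assumed to be a list of Elasticsearch-style hit dicts.
    `id_field` is a hypothetical name for the sender's unique ID field.
    """
    seen = defaultdict(list)
    for hit in hits:
        uid = hit.get("_source", {}).get(id_field)
        if uid is None:
            continue  # skip messages that carry no unique ID
        seen[uid].append(hit["_index"])
    # Keep only IDs that occur more than once, across or within indices.
    return {uid: sorted(indices) for uid, indices in seen.items() if len(indices) > 1}


# Hypothetical sample: "a1" is stored in two indices, "b2" only once.
sample_hits = [
    {"_index": "app_0", "_source": {"msg_id": "a1"}},
    {"_index": "audit_0", "_source": {"msg_id": "a1"}},
    {"_index": "app_0", "_source": {"msg_id": "b2"}},
]

print(find_duplicates(sample_hits))  # {'a1': ['app_0', 'audit_0']}
```

The list of indices per ID also shows whether a message was stored twice in the same index or once each in different indices, which may help narrow down where the duplication happens.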
My question is: if having multiple streams with multiple indices causes message duplication, why does it not happen for all, or at least the majority, of messages? And if that is not the cause, why do only a few messages get duplicated? I am not able to reproduce this; it happens randomly.
Please give me some insight on this, and let me know if you need more information.
Thanks in advance.