Graylog inputs stopped when k8s pod re-spawns after failure

bcrowell · June 20, 2019, 7:47pm

I have a rare but annoying problem with GL running in k8s and am wondering if anyone else has solved it.

My specific problem is this…

I have 2 GL server pods (GL v2.5): one running as master/web UI, the other dedicated to data ingestion. On rare occasions the data ingestion pod has crashed and restarted, but the inputs associated with that pod (all my inputs go only to that one pod) end up in a stopped state after the pod has recovered, because k8s changes the pod name dynamically when the new pod is spawned.

One possible solution I have considered is enabling both pods to ingest data; that way, if one pod crashes, I can still ingest data on the other pod/server. (But I’d rather not change that piece of my design if I can avoid it.)

Or would it be sufficient to enable the “Global” option on the input? Would that mean that any new server that appears in the cluster would automatically be assigned to that input (essentially replacing the old node name, which will have disappeared)?

Alternatively, maybe I can finagle k8s into presenting a static name to the cluster, but I’d rather not mess with my k8s config if at all possible.

Thoughts?

jan · June 21, 2019, 9:33am

Or would it be sufficient to enable the “Global” option on the input? Would that mean that any new server that appears in the cluster would automatically be assigned to that input (essentially replacing the old node name, which will have disappeared)?

Yes, selecting “Global” means exactly that.
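For anyone who would rather flip the “Global” option from a script than from the web UI, here is a minimal sketch against the Graylog REST API. It assumes that `GET` and `PUT` on `/system/inputs/{inputId}` behave as the API browser describes for your version, that the fetched input exposes its configuration under `attributes`, and that the update body takes `title`, `type`, `configuration` and `global`. The helper names, the `X-Requested-By` value and the credentials are hypothetical, so check everything against your own server before relying on it.

```python
import base64
import json
import urllib.request


def build_global_update(existing: dict) -> dict:
    """Build an update body that keeps the input's settings but sets global=True.

    Assumption: the fetched input carries its configuration under "attributes"
    and the update endpoint expects it under "configuration".
    """
    return {
        "title": existing["title"],
        "type": existing["type"],
        "configuration": existing.get("attributes", {}),
        "global": True,  # no "node" key: the input is not pinned to one node
    }


def _request(method, url, user, password, body=None):
    """Send one JSON request with HTTP basic auth and return the decoded reply."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        "Accept": "application/json",
        "Content-Type": "application/json",
        # Graylog may reject modifying requests that lack this header.
        "X-Requested-By": "make-input-global",
    }
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
        return json.loads(raw) if raw else None


def make_input_global(api_url, input_id, user, password):
    """Fetch an input, then write it back with the global flag enabled."""
    url = f"{api_url.rstrip('/')}/system/inputs/{input_id}"
    existing = _request("GET", url, user, password)
    return _request("PUT", url, user, password, build_global_update(existing))
```

Calling `make_input_global("http://graylog.example:9000/api", "<input id>", "admin", "<password>")` would apply it to one input; the URL and ID here are placeholders. Note that with both of your pods in the cluster, a global input would also run on the master/web UI pod, not just the ingestion pod.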

system · July 5, 2019, 9:33am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
