Fawad Khaliq

03/22/2023, 5:30 PM
Excellent write-up by Reddit engineering on the Pi Day outage. This team has a strong availability culture, but even so, upgrading Kubernetes versions and components (the trigger for the 03/14 outage) remains a big challenge. I'd love to know: how has your experience been managing upgrades?

Hugo Pinheiro

03/22/2023, 5:34 PM
I have found that blue-green cluster deploys are a better experience; in-cluster upgrades are super dangerous. Unfortunately, it looks like they have some tech debt and can't use that method. We use a method similar to this post, without the CI (would love to do that though, but small steps 😂).

Andre Marcelo-Tanner

03/22/2023, 7:16 PM
Blue-green and Argo CD, using Terraform. We do not trust in-place upgrades, and even so, there are cluster-wide components that have the same blast radius.