# platform-culture
o
We have tried to make our migration-related changes such that they run on both platforms, behind a flag if needed. The idea is that the same commit is running in production on both the old stack and the new stack as we shift traffic incrementally. This allows feature teams to continue making changes and delivering features to both platforms during the migration. Building and supporting these kinds of abstractions certainly slows things down for us, but it keeps the feature teams delivering at about the same rate they were before we started the migration. We try to take on one migration at a time, because you never really know what you're getting into until you get into the weeds.
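A minimal sketch of what "behind a flag" could look like, so that one commit behaves correctly on either stack. The flag name `PLATFORM_STACK` and both hostnames are made up for illustration, not what we actually use:

```python
import os

# Hypothetical endpoints: one per platform the same commit may be running on.
ENDPOINTS = {
    "old": "http://storage.legacy.example",
    "new": "http://storage.new.example",
}


def current_platform(env=None):
    """Return "new" only when the (hypothetical) PLATFORM_STACK flag says so.

    Anything else, including a missing flag, falls back to the old stack,
    so an unconfigured deployment keeps its pre-migration behavior.
    """
    env = os.environ if env is None else env
    return "new" if env.get("PLATFORM_STACK", "old") == "new" else "old"


def storage_endpoint(env=None):
    """Resolve the platform-specific dependency for whichever stack we are on."""
    return ENDPOINTS[current_platform(env)]
```

Feature code would only ever call `storage_endpoint()`, so it never needs to know which platform it is running on.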
n
That's a good practice. It seems to imply that you also have an explicit abstraction layer that hides all the things you're switching? How would you go about moving an entire stack, e.g. from ELK to PromGrafMimir? Specifically, what would you do about all the integrations/automations your feature team engineers already rely on in their current ELK stack as you try to move them to the Prom stack?
o
I don't know what those are. We try to run parallel CI/CD pipelines triggered by the same actions (code merge, etc.). As the new stack becomes more stable, we switch over more things, like lower-environment CNAMEs and testing tools. For a while, two deployments have to be validated and two releases have to be done in parallel. This is the risky, stressful part, so we try to make it as short as possible. We don't really create an overarching abstraction layer; rather, we look at dependencies on the infrastructure one case at a time and try to build an abstraction or flag around each. After the migration, we'll review those flags and abstractions for possible removal.
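As a rough sketch of the "two deployments have to be validated" step: the release gate only opens when both pipelines deployed the same commit and both passed validation. The `DeploymentResult` type and its fields are hypothetical, just to show the shape of the check:

```python
from dataclasses import dataclass


@dataclass
class DeploymentResult:
    """Outcome of one pipeline run (hypothetical structure for illustration)."""
    stack: str           # e.g. "old" or "new"
    commit: str          # commit the pipeline deployed
    checks_passed: bool  # whether validation succeeded on that stack


def ready_to_release(old: DeploymentResult, new: DeploymentResult) -> bool:
    """Allow a release only if both stacks run the same commit and both validated."""
    same_commit = old.commit == new.commit
    return same_commit and old.checks_passed and new.checks_passed
```

For example, `ready_to_release(DeploymentResult("old", "abc123", True), DeploymentResult("new", "abc123", False))` is `False`, so a failure on the new stack blocks the release on both.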
r
Are you talking about VM deployments in data centers or in the cloud? Either way, an incremental or gradual migration of each stack, testing all scenarios as you go, is an option. As Oakley said, for some time there will be two parallel deployments and validations, and we switch off the legacy environment once the new one is set up with HA/DR, autoscaling, networking (egress, ingress), and multiple endpoints across geo locations, and has been tested end to end.