Platform engineering is the discipline of designing and building toolchains and workflows that enable self-service capabilities for software engineering organizations in the cloud-native era.

Platform Engineering

Sharing our blog post on The New Stack: <https://thenewstack.io/why-staging-is-a-bottleneck-for-microservice-testing/|"Why Staging Is a Bottleneck for Microservice Testing">
We've found that as microservice architectures grow, staging environments become a major bottleneck. I've talked with engineering teams who've created Slack bots just to manage staging environment access queues, with wait times stretching to hours. This article explores why traditional staging approaches break down at scale and how modern testing strategies using lightweight sandboxes can help teams test earlier without duplicating infrastructure.
If your team is experiencing staging bottlenecks or microservice testing challenges, I'd love to hear your experiences in the comments.

Great Read!
One suggestion, which I think is used by a lot of big and medium-sized organizations, is to have dedicated team environments (not production grade, but something in between) that have almost the same deployment architecture as staging or prod. Devs can carry out their testing there, and staging can be dedicated to QA or UAT teams for final testing and the final go-ahead to production. Thoughts?

Yes, that’s a common approach that comes with pros and cons. It definitely helps mitigate contention on shared environments, but it also adds a lot of operational overhead to maintain multiple envs. Also, cross-team interactions can only be tested in the UAT env, which could delay fixing integration issues.

Yes, I agree about the overhead, but then at least devs have their own environment and don't have to wait to run their own unit or functional tests before passing the build along. The operational overhead can be handled by dedicated lab or deployment teams, but that is only viable for medium and large organizations.

One approach that I think is ideal is runtime environments. Devs should be given a platform through which they can spin up temporary environments with the application deployed; once their dev work is done, the environment should be destroyed, saving both infra and operational overhead.
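
A rough sketch of that spin-up/destroy lifecycle, assuming an in-memory stand-in for real infrastructure. The names `provision`, `destroy`, `temporary_environment`, and the `my-service:1.2.3` version string are all hypothetical; a real platform would call its IaC or cluster tooling instead.

```python
from contextlib import contextmanager
import uuid

# Stand-in for real infrastructure state (assumption: a real platform
# would track this in its IaC tooling or cluster API, not in a dict).
ACTIVE_ENVIRONMENTS = {}


def provision(app_version):
    """Hypothetical helper: create an environment and deploy the app into it."""
    env_id = f"env-{uuid.uuid4().hex[:8]}"
    ACTIVE_ENVIRONMENTS[env_id] = app_version
    return env_id


def destroy(env_id):
    """Hypothetical helper: tear the environment down and release its resources."""
    ACTIVE_ENVIRONMENTS.pop(env_id, None)


@contextmanager
def temporary_environment(app_version):
    """Yield a fresh environment and always destroy it afterwards."""
    env_id = provision(app_version)
    try:
        yield env_id
    finally:
        # Runs even if the dev's tests raise, so nothing is left running.
        destroy(env_id)


with temporary_environment("my-service:1.2.3") as env_id:
    # Dev work and tests would run here against the temporary environment.
    assert env_id in ACTIVE_ENVIRONMENTS
```

The `finally` block is the important part: teardown happens whether the work succeeds or fails, which is what keeps the infra overhead from piling up.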

I agree with <@U08G1TKF3FB>.

Ideally, the ability to spin up & down testing and staging (partial or full) environments should be available to developers (within reason and with cost accountability) through the platform, or at least alongside it.

If the infrastructure is truly built completely through IaC, then enhancing it to support multiple new environments may make this easier.

Cost accountability for the developers who can invoke these is an important cultural and technical shift. But failing to establish it while enabling developers to spin up costly environments on a whim is a recipe for severe budget overruns.

Developers are generally unaware of how quickly those $5/hour elastic charges add up, and in many places they aren't really accountable for the costs. Even just informing them of the approximate cost (in $/month or fractions of a developer, as well as $/hour) before spinning up is a good step towards taking responsibility.
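
To put numbers on that, here is a minimal sketch of the estimate a platform could show before spin-up. The $15,000/month fully loaded developer cost and the 730 hours/month figure are assumptions for illustration, and `estimate_environment_cost` is a hypothetical helper, not part of any real tool.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def estimate_environment_cost(hourly_rate_usd, hours_running,
                              developer_monthly_cost_usd=15_000.0):
    """Return the total cost and that cost as a fraction of one developer.

    developer_monthly_cost_usd is an assumed fully loaded cost; replace it
    with your organization's own figure.
    """
    total_usd = hourly_rate_usd * hours_running
    return {
        "total_usd": total_usd,
        "developer_fraction": total_usd / developer_monthly_cost_usd,
    }


# A $5/hour environment left running for a whole month.
estimate = estimate_environment_cost(5.0, HOURS_PER_MONTH)
print(
    f"${estimate['total_usd']:,.0f}/month, "
    f"about {estimate['developer_fraction']:.0%} of a developer"
)
# prints "$3,650/month, about 24% of a developer"
```

Under those assumptions, four such environments left running cost roughly as much as one developer.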

When the shared staging environment has a train wreck, being able to spin up an alternate environment for higher priority changes may make sense.

But the nature of microservices means that much of the complexity has moved into the interactions between services. You need to test this somewhere or you will be moving towards my least favorite t-shirt:
> I rarely test my code
> and when I do it is in production.