This message was deleted Platform Engineering #platform-leadership


# platform-leadership

Slackbot

02/27/2024, 10:59 AM

This message was deleted.

Steve Fenton

02/27/2024, 11:57 AM

The research (DORA, SlashData, et al.) suggests that good technical practices let throughput and stability improve together. There's no trade-off. If you sacrifice stability to get throughput, it suggests a puzzle piece is missing. It's okay to "break things" in the sense of trying an idea and finding it doesn't help you achieve your goal. It's not so much about bringing services to their knees with every deployment 😄 Throughput and stability both enhance your ability to run more experiments with your software. That's the kind of "move fast" that's really desirable.

Scott Hiland

02/27/2024, 1:53 PM

Toil reduction through automation is one of the best technical practices to follow as it aids consistency.

Jordan Chernev

02/27/2024, 2:07 PM

one of the strategies i’ve seen used at scale is a production readiness scorecard / checklist that needs to be satisfied before a new service gets deployed to production. you pay for the reliability upfront, so expect an initial “slowdown” as a one-time startup cost that will pay dividends in the long run

Shubham Srivastava

02/27/2024, 2:08 PM

That's a good starting point for sure @Jordan Chernev, do you have an example checklist or a reference that I could use?

Jordan Chernev

02/27/2024, 2:10 PM

you can search google for actual examples. this one is always a great starting point - https://sre.google/sre-book/launch-checklist/

Jordan Chernev

02/27/2024, 2:11 PM

i also found a lot of results just now by searching for “production readiness checklist template” on google

Jordan Chernev

02/27/2024, 2:11 PM

some interesting results, including cached .docx files

Jordan Chernev

02/27/2024, 2:11 PM

look around…

Scott Hiland

02/27/2024, 2:43 PM

Be mindful that checklists can quickly become an antipattern. They're easily gamed, ignored, or pushed to the side by teams and overcome by events when things go sideways. @Jordan Chernev when you say at scale, are you talking about scale for a specific service, number of services brought up in a period of time, or total number of teams supported by a platform? I'm trying to think of instances from my clients where I've seen checklists provide greater value than heartache over the long term and I can't remember any. Teams tend to see checklists as impediments rather than value-adding outputs. So building automation and templated paths to production that test go-live capabilities at agreed upon service level objectives are more effective, efficient, and reduce latency in feedback loops. If you're talking real platform engineering, building or improving paths to production alongside your developers is the way to go. Checklists just put a wall between platform teams and developers.
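To make the idea of testing go-live capability against an agreed-upon service level objective concrete, here is a minimal Python sketch of a pipeline gate. It assumes availability is measured as the ratio of successful to total requests during a pre-production test run; the function names and the 99.9% default target are hypothetical and not taken from any tool mentioned in this thread.

```python
def availability(successful: int, total: int) -> float:
    """Fraction of requests that succeeded during the test window."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successful / total


def passes_slo_gate(successful: int, total: int, slo_target: float = 0.999) -> bool:
    """Return True when measured availability meets the agreed SLO target.

    slo_target is a hypothetical default; use whatever the teams agreed on.
    """
    return availability(successful, total) >= slo_target


if __name__ == "__main__":
    # Example: 9,995 of 10,000 synthetic requests succeeded.
    print(passes_slo_gate(9_995, 10_000))
```

A gate like this would typically run as an automated step on the path to production, so the developer gets the result in the pipeline rather than from a manual review.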

Jordan Chernev

02/27/2024, 3:01 PM

"number of services brought up in a period of time, or total number of teams supported by a platform"

these two

Jordan Chernev

02/27/2024, 3:01 PM

"So building automation and templated paths to production that test go-live capabilities at agreed upon service level objectives are more effective, efficient, and reduce latency in feedback loops."

the two aren’t orthogonal in my mind

Jordan Chernev

02/27/2024, 3:02 PM

create your checklist, start templatizing / automating against it
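As a rough illustration of automating against a checklist, each item can be expressed as a named check that runs on its own. The sketch below is a minimal Python version under stated assumptions: the `ServiceMetadata` fields and the three checks are hypothetical examples, not items from any checklist referenced in this thread.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceMetadata:
    # Hypothetical fields; a real platform might read these from a service catalog.
    name: str
    has_runbook: bool = False
    has_alerts: bool = False
    oncall_team: str = ""


# Each checklist item becomes a named, automated check.
CHECKS: Dict[str, Callable[[ServiceMetadata], bool]] = {
    "runbook exists": lambda s: s.has_runbook,
    "alerting configured": lambda s: s.has_alerts,
    "on-call owner assigned": lambda s: bool(s.oncall_team),
}


def failed_checks(service: ServiceMetadata) -> List[str]:
    """Return the names of checklist items the service does not yet satisfy."""
    return [name for name, check in CHECKS.items() if not check(service)]


def is_production_ready(service: ServiceMetadata) -> bool:
    """A service is ready only when every automated check passes."""
    return not failed_checks(service)
```

Returning the names of the failed items, rather than a bare pass/fail, gives the owning team something specific to act on.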

Jordan Chernev

02/27/2024, 3:02 PM

"If you’re talking real platform engineering, building or improving paths to production alongside your developers is the way to go. Checklists just put a wall between platform teams and developers."

somewhat agree. we are discussing SRE practices in this thread, less so platform engineering

Jordan Chernev

02/27/2024, 3:03 PM

there is a difference in terms of missions and concerns between PEs and SREs

Jordan Chernev

02/27/2024, 3:03 PM

https://www.infoq.com/articles/platform-sre-evolving-devops/

Jordan Chernev

02/27/2024, 3:03 PM

the chart in the middle of the article is great

Scott Hiland

02/27/2024, 5:54 PM

Platform Engineers at their highest level of maturity should care about all of those things. It's one of the reasons why a strong platform engineering team is not easy to build. DevOps and SRE encompass practices that platform engineering teams must also become proficient in. It's a little bit like Maslow's hierarchy of needs: Platform Engineers start at technical proficiency, grow into operational efficiency, and eventually move on to a fully realized developer experience. I don't agree that mature SRE, DevOps, and Platform Engineering teams should be kept from overlapping fully. There's no need for that kind of separation of duties in a performant organization. The divisions should be along business lines, not technical lines. Anyhow, if you're talking about a checklist that you build more automation against over time, that makes more sense. Something that automatically validates the feature on its way to prod, in an observable way, as a dev pushes code is a Good Thing. I understand that you can't do it all at once. Checklists that become a static wall that developers must bang their heads against are not a Good Thing. It's a little too easy for the artifact to become a monument if the platform team isn't focused on user-centricity and feedback loops.

Jordan Chernev

02/27/2024, 7:26 PM

i feel like we are closer to an agreement here, as opposed to not. in case it helps, most of my experience and lens here is colored by being a member of a multi-thousand technologist community, e.g. 4k+, comprised of 620+ product teams across 5 timezones

Scott Hiland

02/27/2024, 10:34 PM

Probably. And I won't get out a 📏 .

Andrew Fong

02/28/2024, 12:05 AM

managed delivery is the answer. Checklists are awful :)

Steve Fenton

02/28/2024, 12:25 PM

Whenever I see a checklist, I get a premonition of automation! You sometimes have to create the checklist (temporarily) to map out what's happening... but most checkbox exercises can and should be automated.

Jordan Chernev

02/28/2024, 12:30 PM

steve gets it

Steve Fenton

02/28/2024, 12:32 PM

Usually 5 minutes after everyone else though 😄
