# general
p
Hi all. I'm wondering how people who run platform products that span a large number of internal customers go about making major system changes:
• How do you communicate changes?
• How do you align on changes?
• How do you get sign-off from management/stakeholders?
• Is there some system of escalation/enforcement (probably relevant only to large engineering orgs)?
s
I'd look to product management for this - product leadership has been doing this for decades, but it's very new for platforms
r
Some of the things we did in my previous org:
• Depending on who your consumers are, we used to send a newsletter and monthly updates covering what's coming and any important dates.
• For users of the platform, more real-time communication is important, with something like Slack.
• I've seen organizations with change request windows, where you need to submit what changes you'll be making, especially in production, and the level of impact.
• You can also have a platform product backlog to give more visibility into what's coming.
• Monthly cadence or office hours with your end users
a
It depends on whether you can make your changes non-breaking, with no outage, and how big that span is. In a big enterprise I've seen that change management is still very much a slow, tightly governed process for this very reason. If a change is backwards compatible and only impacts in a way a client would be expected to auto retry/recover, change should be quite loosely coupled.
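As a rough illustration of the "client would be expected to auto retry/recover" case, here is a minimal sketch of a client-side retry with exponential backoff. The function name `call_with_retry`, its parameters, and the choice of `ConnectionError` as the transient failure are all assumptions made for the example, not anything from a specific platform.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation`, retrying on ConnectionError with exponential backoff.

    Hypothetical helper: `operation` is any zero-argument callable, and
    ConnectionError stands in for whatever transient error a brief platform
    change might cause.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as exc:
            last_error = exc
            # Back off before the next try, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

A client wrapped like this rides through a short blip during a rollout, which is what makes it reasonable to ship that kind of change without coordinating a window with every consumer.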
p
Thanks for the thoughts, they all make sense
> If a change is backwards compatible and only impacts in a way a client would be expected to auto retry/recover change should be quite loosely coupled.
Yeah, this is usually the case (when we've designed it well...) but there still seems to be a need to release comms. When things go wrong, people need to know what's going on.
> Monthly cadence or office hours with your end users
I like this idea a lot
> I'd look to product management for this
Indeed... I guess that's the gap I'm finding myself trying to fill 😅
j
I'm the Product Owner for an AWS Landing Zone in a large enterprise. We manage major system changes in a few ways:
• We have a Slack channel that all consuming dev teams are part of. I use emojis in that channel to draw attention to "important" notices, and they use it to highlight Pull Requests that need our attention, etc.
• We have a central change management system (ServiceNow) that I have to update if we expect an outage. This informs the business leaders and teams.
• I run a backlog that shows what we're working on at any given point in time, and what we have coming up next.
• Getting alignment is tricky at certain times of the year, but thankfully we have a GitOps approach and are able to make almost all changes without an outage, so we're lucky that we don't often have to try to find a slot that everyone agrees to.
p
This is what we do at Weave (Communications):
• Communicating changes:
  ◦ a widely followed Slack channel for platform engineering announcements, coupled with a very active platform engineering support Slack channel
  ◦ we use our pervasively used IDP CLI/UI tool to automate some of the changes (including code changes and YAML config changes), and automatically run checks for deprecated configs/codebases and alert devs as they’re using the CLI
• For alignment / sign-off, we use a combination of topic-specific Slack channels to allow for async discussion with devs and/or Eng leadership, shared docs to share details and solicit detailed feedback, and synchronous meetings (Eng All Hands, Eng Leadership) where needed
• For enforcement: we mostly try to create a “happy path of least resistance” and only mandate/enforce where absolutely necessary. Even though our teams are very empowered, it’d be a lot of work to “go it alone” and they’re pretty happy with the platform they’ve helped us build (through their feedback and code contributions).
• Where we need more “aggressive” enforcement, that’s typically done through:
  ◦ our IDP CLI/UI (forbidding use of deprecated / obsolete things)
  ◦ CI/CD pipeline checks, prohibiting PR approvals unless an engineering leader overrides
  ◦ deployment checks, preventing deployment of deprecated / obsolete configurations, or things that would cause conflicts
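To make the warn-first, enforce-later idea concrete, here is a minimal sketch of a deprecated-config check with a leadership override. This is not Weave's actual tooling: the `DEPRECATED_KEYS` registry, the key names in it, and the `check_config` / `can_deploy` functions are all hypothetical, and a real check would likely walk nested YAML rather than only top-level keys.

```python
from dataclasses import dataclass

# Hypothetical registry of deprecated top-level config keys and the guidance
# to show developers. These key names are invented for the example.
DEPRECATED_KEYS = {
    "legacy_ingress": "use 'ingress_v2' instead",
    "old_metrics_port": "metrics are now exposed on the default port",
}


@dataclass
class Finding:
    key: str
    message: str
    blocking: bool  # False = warn only, True = should block the pipeline


def check_config(config: dict, enforce: bool = False) -> list[Finding]:
    """Return one finding per deprecated key present in `config`.

    With enforce=False the findings are warnings (the "happy path" nudge);
    with enforce=True they are marked as blocking.
    """
    return [
        Finding(key, DEPRECATED_KEYS[key], enforce)
        for key in sorted(config)
        if key in DEPRECATED_KEYS
    ]


def can_deploy(findings: list[Finding], leader_override: bool = False) -> bool:
    """Allow deployment unless a blocking finding exists and nobody overrode it."""
    has_blocking = any(f.blocking for f in findings)
    return (not has_blocking) or leader_override
```

The same `check_config` call could run in a CLI as a warning, then later in CI/CD or at deploy time with `enforce=True`, so teams see the deprecation notice well before it starts blocking anything.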