# platform-leadership
j
Good morning, hopefully this is the right channel to post this. I've set up a platform team to handle the foundational work towards an IDP. They work embedded in the product streams to shorten the feedback loop. Their responsibilities are infrastructure design and automation, CI/CD, and container base image maintenance. That has worked well, and now I'm looking to integrate them with the Operations team to reduce the knowledge gap and increase efficiency. Currently Operations deals with customer support (2nd line), monitoring, and patching, and part of the team is skilled in scripting. They are also focused on improving either cost or monitoring. From my point of view there is a clear boundary between customer support and platform/operations. However, how can I make this more of a responsibility boundary rather than a fence that makes knowledge sharing difficult? Or should customer support be part of the platform team, to really understand what is working for the user and what's not? Any experience or suggestions from this group? Thank you.
s
Are you using an internal tool like Slack for comms? What is your support and ops schedule? One solution I've seen work is to have a platform person whose "other duties" for the week include supporting the customer-facing support team and monitoring the support/ops Slack channel, within boundaries: e.g. a user story or ticket must be created for the problem, the question from the support team must include the detailed steps they followed with the customer or when troubleshooting, and they must stay assigned as the owner of the story until it is resolved. Rotate the platform person supporting them weekly so everyone sees the common problems. It will help inform future priorities. Another big help is when everyone uses the same status dashboard, but the magic is when everyone has agreed on what is observed, monitored, and alerted on in that dashboard. Slack or some other non-email, non-ticket form of asynchronous communication is the best way to make sure teams are talking. It requires all of the teams' leadership to be willing to engage in that way, though.
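To make the weekly rotation and the ticket boundaries concrete, here is a minimal sketch. The engineer names, the start date, and the ticket fields (`id`, `steps_followed`, `owner`) are all hypothetical placeholders, not tied to any real ticketing system.

```python
from datetime import date, timedelta
from itertools import cycle, islice


def weekly_rotation(engineers, start_monday, weeks):
    """Return (week_start, engineer) pairs, one engineer per week, round-robin."""
    if not engineers:
        raise ValueError("need at least one engineer")
    names = islice(cycle(engineers), weeks)
    return [(start_monday + timedelta(weeks=i), name) for i, name in enumerate(names)]


def ticket_is_ready(ticket):
    """Check the boundaries described above: a ticket exists, the
    troubleshooting steps are recorded, and a support owner is assigned."""
    return all(ticket.get(field) for field in ("id", "steps_followed", "owner"))


# Hypothetical team of three platform engineers, rotating for five weeks.
schedule = weekly_rotation(["ana", "ben", "chi"], date(2024, 1, 1), 5)
for week_start, name in schedule:
    print(week_start.isoformat(), name)
```

A check like `ticket_is_ready` could sit in front of the shared channel so the platform person on rotation only picks up questions that already meet the agreed boundaries.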
d
We are currently in a somewhat similar situation actually, but our team is all bundled into one (because of a lack of resources and a small team). Our approach was to create a semi-automated service desk in JIRA, which is somewhat of a ticketing system yet opens/integrates with a comms channel specific to a given ticket/problem. The advantage of this is that we slowly gather and analyze the data/incidents opened by our customers to better understand and visualize where the pain points are. So +1 on what Scott mentioned for sure. I can't say whether the result will be super effective just yet (we are in the middle of the new process), but the rotations do help spread some of the knowledge and hands-on experience in the team.
The problem we are mostly facing right now is high cognitive load in the team: we are trying to create a platform/IDP while supporting the current state, which is proving to be almost impossible with the team size we have.
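As a rough illustration of the "gather and analyze the incidents" idea above, here is a small sketch that tallies tickets by category to surface the most common pain points. The ticket keys and the `category` field are made up for the example; a real service desk export would have its own fields.

```python
from collections import Counter


def top_pain_points(tickets, n=3):
    """Return the n most frequent ticket categories with their counts."""
    counts = Counter(ticket["category"] for ticket in tickets)
    return counts.most_common(n)


# Hypothetical export of service desk tickets.
tickets = [
    {"key": "SD-1", "category": "ci-cd"},
    {"key": "SD-2", "category": "base-image"},
    {"key": "SD-3", "category": "ci-cd"},
    {"key": "SD-4", "category": "monitoring"},
    {"key": "SD-5", "category": "ci-cd"},
    {"key": "SD-6", "category": "base-image"},
]

for category, count in top_pain_points(tickets):
    print(category, count)
```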
b
@Jorge Cruz Lambert I faced a similar challenge integrating product engineering with existing operations teams as part of a common platform engineering effort. This approach to bridging the gap between operations and engineering has proven to be effective 👇
Building a platform engineering mindset in the operations team: it takes time to shift away from thinking around traditional service monitoring, patching and resolving 2nd line tickets towards more developer-centric thinking. We should expect more proactiveness and ownership, which ofc requires some level of empowerment.
Establishing an operations handover process: one that encourages an incremental approach, building the necessary skills to operate in each space (knowledge sharing sessions).
Following a clear responsibility model between the teams and gradually expanding it as the operations team builds more confidence. The ops team should start by handling support activities.
Revisiting operations team productivity metrics: the number of tickets resolved is not an indicative success criterion for a multi-disciplinary platform engineering team. The operations team should also contribute to the documentation, alerting, automation runbooks, etc.
Finally, introducing operational triage aimed at identifying, assessing, and prioritising issues, incidents, or tasks that require immediate attention in an operational setting. Each week, ask the following questions:
• Are there any processes that are currently inefficient or could be streamlined? Have we encountered any recurring issues? How might we improve them?
• Do our team members have all the necessary skills and tools to perform their tasks effectively? If not, what training or tools do we need to provide?
• What are our key objectives for the team, and what steps are we taking to achieve them?
e
To complement this approach, I'd say:
- Pick a champion role in the Operations team: having a dedicated person, a name behind the role, builds a sense of ownership and accountability. They often act as the main point of contact for the rest of the team (until a closer relationship between engineering and operations is established).
- Involve the Operations team early in the project (if possible): this ensures that operational perspectives are considered from the beginning, and leads to better knowledge distribution and a smoother transition in the future.
Overall, I think it's important to put an emphasis on a proactive, ownership-driven culture.