Platform engineering is the discipline of designing and building toolchains and workflows that enable self-service capabilities for software engineering organizations in the cloud-native era.

Platform Engineering

Do you mean automating the recovery itself, or collecting the data?

<@U03FASGNB38> this is all your wheelhouse :slightly_smiling_face:

Collecting the data. We're looking at using some data from PagerDuty, but wanted to get other folks' views on it.

I built a tool a long time ago that exported their data and then built graphs off it. At this point I would just dump their data into whatever BI tools you use. I haven’t tried their analytics interface. The problem was always that we wanted to group events, so we built rules around certain pages being cascading/dependent (some explicit, some heuristic).
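A minimal sketch of that kind of grouping, under stated assumptions: pages have already been exported as records with a service name plus trigger and resolve times, the explicit rules are a plain dependency map, and the heuristic is a fixed time window. `Page`, `DEPENDS_ON`, `WINDOW`, and the function names are all hypothetical and are not part of the PagerDuty API or of the tool described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Page:
    service: str
    triggered: datetime
    resolved: datetime


# Hypothetical explicit rules: service -> upstream service its pages cascade from.
DEPENDS_ON = {"api": "db", "web": "api"}
# Hypothetical heuristic: pages within this window of an incident's first page may belong to it.
WINDOW = timedelta(minutes=10)


def root_of(service):
    """Follow the dependency map upstream until a service with no parent is reached."""
    seen = set()
    while service in DEPENDS_ON and service not in seen:
        seen.add(service)  # guards against accidental cycles in the rules
        service = DEPENDS_ON[service]
    return service


def group_pages(pages):
    """Fold pages into incidents: same upstream root and triggered within WINDOW."""
    incidents = []
    for page in sorted(pages, key=lambda p: p.triggered):
        for incident in incidents:
            first = incident[0]  # earliest page, since input is sorted
            same_root = root_of(first.service) == root_of(page.service)
            close_in_time = page.triggered - first.triggered <= WINDOW
            if same_root and close_in_time:
                incident.append(page)
                break
        else:
            incidents.append([page])
    return incidents


def mean_time_to_recovery(incidents):
    """Average of (last resolve - first trigger) per incident, or None if there are none."""
    if not incidents:
        return None
    durations = [
        max(p.resolved for p in inc) - min(p.triggered for p in inc)
        for inc in incidents
    ]
    return sum(durations, timedelta()) / len(durations)
```

Grouping before averaging matters because a cascade of three pages would otherwise count as three separate recoveries and skew the MTTR figure.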

<http://rootly.com|rootly.com> or the other incident management tools probably make it easier. 

<@U04QFDLEGLX> looks like PD just acquired Jeli, so your problem is probably going to get way easier.

Really appreciate the engagement <@U033HAUVBBJ>!

<@U04QFDLEGLX>, I would highly recommend considering the Grafana and Prometheus stack for automating MTTR (Mean Time To Recovery) tracking in your projects. This combination has proven effective in various projects, and it comes with a wealth of community contributions and strong support. Grafana and Prometheus are flexible and reliable for monitoring, alerting, and incident response automation, which helps streamline the recovery process and reduce downtime.

Thanks for the plug <@U033HAUVBBJ>!

Hey <@U04QFDLEGLX> — this is a perfect use case for us at <http://Rootly.com|Rootly.com>. We are helping Figma, LinkedIn, and 100s more do this.

Want to drop me a line at <mailto:jj@rootly.com|jj@rootly.com> and I’ll take you through it personally? (I am the CEO here!)

_And yes I realize how late I was here!_

Nice to meet you JJ - and congrats on your Forbes 30 under 30 selection!