Incidents happen! According to Laura Nolan, who was an SRE at Google and Slack, healthy organizations should take proper time to analyze and learn from them. This will improve future incident response as well as overall system resiliency. Tune in to this episode to hear Laura’s tips and tricks on what makes a good SRE organization. It starts with doing good write-ups of incidents and researching the incident reports of software and services that you are considering using. We also spent a good amount of time discussing root cause analysis, where she highlighted an incident that happened during her time at Google and what she learned about outdated alerting. Thanks, Laura, for a great discussion and lots of insights.

Here are the additional links we discussed during the podcast:
Laura on LinkedIn: https://www.linkedin.com/in/laura-nolan-bb7429/
Laura on Twitter: https://twitter.com/lauralifts
Incident Template talk @ SRECon: https://www.usenix.org/conference/srecon22emea/presentation/nolan-break
What SRE could be talk @ SRECon: https://www.usenix.org/conference/srecon22emea/presentation/nolan-sre
Howie Post-Incident Guide: https://www.jeli.io/howie/welcome
My philosophy on Alerting article: https://docs.google.com/document/d/199PqyG3UsyXlwieHaqbGiWVa8eMWi8zzAn0YfcApr8Q/edit

11 months ago #alerting, #dynatrace, #incident, #pureperformance, #sre