Great Circle Waypoints Blog
Items of interest to Great Circle clients and friends
“How do we keep senior managers from disrupting incident responses?” That audience question generated the strongest response last week at my workshop on Incident Command for IT at the fantastic USENIX SREcon18 Americas.
Senior management definitely has a critical role to play in incident response, but as soon as somebody asked that question, the room lit up; it seemed like all 200 people had tales to share about active incident responses that were inadvertently derailed by directors, executives, and other senior managers. It was clear that this was a significant source of frustration for incident responders and incident leaders in the room.
Incident management is about controlling chaos, and senior management can be a significant source of chaos during an incident, usually without meaning to be. Why is this so, and how can senior managers, incident leadership, and responders all work together to avoid this? […]
Come join me at next week's BayLISA meeting in San Jose, as I'll be speaking about Learning from the Fire Department: Experiences with Incident Command for IT. I'll be sharing key lessons, along with a few war stories, from companies such as Google, Heroku, and...
Was your last outage a triumph, or a tragedy? Was it like Apollo 13, or more like Titanic? Are you ready for your next outage? Are you prepared to respond quickly, smoothly, effectively, and efficiently? Users expect uptime, all the time, and it's critical to your...
I'm excited to share that I just signed the venue contract for a full-day "Mastering Outages" public class on incident management, to be held here in the SF Bay Area on Friday 18 May 2018. I'll be posting full details shortly, as I get everything else set up, but you should...
Effective incident management matters because it reduces both the downtime of your service and the impact that dealing with that downtime has on your staff. Downtime obviously affects users, but outages also have a huge impact on your staff, and particularly on the ongoing work of the organization. If you’re not careful, they can lead to a toxic cycle. […]
Everybody wants to be a hero when there’s an outage or similar service problem, swooping in to save the day with their knowledge, skill, and wisdom, and then reaping their reward of praise and adulation from management and peers. However, if heroics are what you reward, then you’ll get more heroics, and that’s not good for your team or your service. Instead, you should prepare to respond *without* heroics, by using good incident management practices, and reward folks for avoiding the need for heroics. […]