Chances are you know a person at work who is good at cleaning up messes and handling critical situations. In the information technology (IT) community, such people are often referred to as “janitors,” “cleaners,” or my personal favorite, “fire fighters.” They are commonly thought of as the first people to go to when technology goes really wrong. I’ve learned through the years that these people may be subject matter experts in a particular tech field, but they are also experts in managing a crisis. Here’s how:
When I’ve entered an IT fire, there has always been a plethora of emotions from talented, frustrated people caught up in the blaze. I’ve learned that keeping yourself calm helps calm everyone else. Usually, technical teams are stressed and management is panicked. Getting people to focus on the problem and not on emotions is always critical when entering the situation. I once watched an employee walk out the door during an outage because a manager started insulting him. He was the key to solving the problem, but his manager let emotions get in the way. I worked with the employee and got him focused on the problem, not the stress of the situation. Once the situation calmed, this same employee got the website back up. I’ve also watched teams instantly focus when a calm person takes control of the situation, and I’ve seen long-lasting work relationships and friendships end because of poorly managed critical situations where emotion trumped finding the solution.
Organize and Communicate
Get people focused on solving the problem. Solving a problem almost always involves multiple people and many different types of skills. There could be an error in a piece of code, a network error, or something going wrong with your server instances. Putting the puzzle pieces together requires communication and organization among everyone involved. If you don’t have the right people to solve the problem, get them. I’ve learned that crisis situations are about getting the right people involved and getting everyone organized correctly to solve the problem.
In the television series “House,” Dr. House used his team of experts to gather information and help propose theories. Dr. House would challenge these theories or tie the different theories together to come up with a hypothesis. Every fire fighting team needs fire fighters, but it also needs a fire marshal to direct the team. In my career, I was called in from the outside to be the fire marshal as many times as I was called in for my expertise in a subject. Don’t underestimate the fire marshal’s ability to see the bigger picture, communicate between stakeholders and the team, and put a buffer between the people solving the problem and management.
There is always pressure to fix something quickly, and there is a false assumption that the first fix needs to be the right fix. Many times, the fix is more complex than restarting a server or rolling out a patch, and sometimes you need to work from an educated guess. I’ve found that being honest about an approach, and communicating its risks and why you think it is right, helps generate ideas, build confidence in the team and build trust among all the stakeholders during the crisis. Troubleshooting often works off a hypothesis based on the information you have at hand. It may take a few tries to find the needle in the haystack.
Fixing the Real Problem
Finally, I’ve found that many troubled projects, constant outages or other critical situations started months before, and there needs to be a strategic plan that addresses problems in communication, processes and people to ensure long-term success. In my experience handling critical situations, IT fire fighters are always called back if they don’t fix some of the underlying problems. Problems on projects occur when people overcommit, scope creeps, poor technology decisions are made or the people assigned can’t perform the job. Troubled projects and other critical situations are often caused by a mixture of these problems. I’ve used the classic 30/60/90 plan as a get-well plan to address them: “What can I fix to get things running properly in 90 days so I don’t have to come back?” Understanding that fighting a fire is both tactical and strategic helps put prevention before reaction, perhaps even giving IT fire fighters a well-deserved day off.
This first appeared in The Havok Journal on December 23, 2018.
© 2023 The Havok Journal