- November 22, 2016
- Posted by: Troy White
- Category: Process Improvement
ITSM principles provide guidelines to assist IT organizations in creating processes that give customers the highest quality experience. The processes need to be interconnected to ensure consistent interactions with all aspects of IT. If the processes exist in individual silos, they will break, and consistency of support will be sacrificed. One industry that has learned the significance of a continuous, full life cycle process is the hospital emergency room (ER). IT is not usually saving lives, but the IT industry can learn from the way an ER functions.
Several components need to be in place in order to interconnect processes:
- Prioritization and Classification
- Management and Coordination
- Escalation Criteria
- Communication Protocols
Prioritization and Classification
The key to the success of the emergency room starts with the hospital’s triage nurses. They are responsible for determining the severity of each case that comes into the ER. They do this for every patient, regardless of whether the patient walks in, is transported by ambulance, or is airlifted. The service desk is the triage mechanism for the IT organization. It is the initial point of contact for all incidents, whether they come in by phone, chat, email, or other channels. Therefore, no incident should be entered through a channel outside of the service desk. Can you imagine if a patient bypassed the triage nurse? How many people would die because they were not treated based on severity? The person who came in with a stomach virus could be seen before a person who was having a heart attack. The same applies to IT. If incidents bypass the service desk, the customers who need to be assisted first, such as someone with a computer virus, could be left waiting while support staff help a customer who cannot print or who is having problems changing a password. The delay could turn the urgent incident into a larger problem.
The triage nurse follows protocols on which emergencies need to be handled first. They check each patient’s vital signs and ask questions such as “What is your pain level from 1 to 10?” and “What are your symptoms?” From this assessment, the triage nurse can determine if the emergency is routine or life threatening and prioritize accordingly. The service desk needs the questions and tools required to prioritize every incident. Priority can be determined by several factors, such as how many customers are affected or whether the affected systems are business critical. The answers to the questions will determine if the incident is routine, high impact, or critical. Each classification should be defined with examples, and the process should state what will be done for each classification. For instance, a high impact incident can be defined as an event that impacts multiple users (but not the whole company), affects no more than two systems or services, or cuts across more than one service delivery team. Some examples of high impact incidents include:
- Data center outages
- Loss of physical/virtual environment
- Service degradation on the company’s network
A critical incident can be defined as an event that impacts the IT department’s ability to provide services to customers across and outside the company. Some examples of critical incidents include:
- Widespread information security event (virus, worm, hacking event, etc.)
- External events
  - A large scale geological or weather event
  - A political or terrorist event (evacuation, office destruction, etc.)
  - A data center related event (electrical failure, flood, fire, etc.)
  - A service provider failure (global links, internet, etc.)
- A full-scale system or network outage
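The classification rules above can be written down so that every agent applies them the same way. The following is a minimal sketch in Python, not a definitive implementation. The `Intake` fields and the thresholds are assumptions made for illustration, and the "no more than two systems or services" criterion is left out for simplicity. Each organization would substitute its own definitions.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    ROUTINE = "routine"
    HIGH_IMPACT = "high impact"
    CRITICAL = "critical"


@dataclass
class Intake:
    """Answers collected by the service desk agent (hypothetical fields)."""
    users_affected: int
    teams_involved: int
    company_wide: bool = False
    affects_external_customers: bool = False


def classify(intake: Intake) -> Priority:
    """Map triage answers to a classification (assumed rules, for illustration)."""
    # Critical: IT cannot provide services across or outside the company.
    if intake.company_wide or intake.affects_external_customers:
        return Priority.CRITICAL
    # High impact: multiple users, or more than one service delivery team.
    if intake.users_affected > 1 or intake.teams_involved > 1:
        return Priority.HIGH_IMPACT
    # Everything else is handled as a routine incident.
    return Priority.ROUTINE


if __name__ == "__main__":
    password_reset = Intake(users_affected=1, teams_involved=1)
    print(classify(password_reset).value)
```

Like the triage nurse's questions, the value of a rule set like this is that two agents given the same answers reach the same classification.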
In the ER, timing is critical to saving lives, so speed is essential. The triage nurse can assess the criticality of a patient in minutes. The triage nurse knows that if a patient does not get a certain medication or procedure within a certain amount of time, the person could die. There are time limits set for each of the steps in the process. The process for the IT department should likewise have time limits based on the classification of the incident. For example, the service desk agent should classify the incident within 15 minutes, and tier 2 should have 30 minutes to complete an initial assessment and determine if the incident should be classified as high impact or critical.
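Time limits like these are easy to check automatically. Here is a small sketch, assuming the 15 and 30 minute limits from the example above. The stage names and the `is_overdue` helper are hypothetical.

```python
from datetime import datetime, timedelta

# Limits taken from the example in the text; the stage names are assumptions.
TIME_LIMITS = {
    "service_desk_classification": timedelta(minutes=15),
    "tier2_initial_assessment": timedelta(minutes=30),
}


def is_overdue(stage: str, started_at: datetime, now: datetime) -> bool:
    """Return True when a stage has run longer than its time limit."""
    return now - started_at > TIME_LIMITS[stage]


if __name__ == "__main__":
    opened = datetime(2016, 11, 22, 9, 0)
    checked = datetime(2016, 11, 22, 9, 20)
    # 20 minutes have passed, which exceeds the 15 minute classification limit.
    print(is_overdue("service_desk_classification", opened, checked))
```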
Management and Coordination
The relationships between the paramedics, nurses, pharmacists, and doctors are extremely important. They are like a football team: to be successful, and in the case of the ER, to save lives, each person needs to know where everyone else is and what they are doing. Every single person in the ER knows their role, and they trust in each other’s ability to do their job. This should be the same for the IT department. When a high impact or critical incident has been identified, the members of the IT department should trust each other to do their jobs to resolve the incident or problem.
The service desk should be the center of the IT department. It should manage each incident from the beginning until resolution. When the service desk raises a high impact or critical incident, tier 2 should trust that classification and immediately start working on the incident until it is resolved.
Everyone who is registered at the ER has a chart that follows them throughout their stay at the hospital and tells the hospital staff what has been done since the person was admitted. Every patient’s chart notes what medication was given, vital signs, allergies, etc. The ticket created by the service desk should be the equivalent of the patient’s chart. Every ticket needs to record what was done in every interaction with the customer, including the steps taken in an attempt to resolve the incident.
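As an illustration of the ticket acting as a chart, here is a minimal sketch of a ticket that keeps a running history of every interaction. The class and field names are hypothetical and do not come from any particular ticketing tool.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Interaction:
    at: datetime
    agent: str
    steps_taken: str


@dataclass
class Ticket:
    ticket_id: str
    customer: str
    summary: str
    # Like a patient's chart, the history is only ever added to.
    history: List[Interaction] = field(default_factory=list)

    def log(self, agent: str, steps_taken: str, at: Optional[datetime] = None) -> None:
        """Record what was done during one interaction with the customer."""
        self.history.append(Interaction(at or datetime.now(), agent, steps_taken))


if __name__ == "__main__":
    ticket = Ticket("INC-1", "A. Customer", "Cannot print")
    ticket.log("agent1", "Restarted the print spooler; issue persists")
    ticket.log("agent2", "Reinstalled the printer driver; printing restored")
    for entry in ticket.history:
        print(entry.agent, "-", entry.steps_taken)
```

Anyone who picks up the ticket later can read the history and see what has already been tried, just as a doctor reads the chart before treating a patient.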
Escalation Criteria
The triage nurse knows which patient needs to be escalated first from the information the patient provides, their symptoms, and their vital signs. The service desk knows which ticket needs to be escalated from the definitions and examples for routine, high impact, and critical incidents.
The information from the patient also determines what type of doctor is required to help the patient. You don’t call a cardiologist to treat a rash. The same is true for the service desk. The agent needs to know where each ticket should be referred, and which group can resolve it immediately. This eliminates the “ping pong” effect, in which a ticket bounces from one tier 2 group to another and the mean time to resolve increases.
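One simple way to avoid the ping pong effect is a routing table that maps each incident category to a single resolver group. The sketch below is illustrative only; the categories and group names are invented, and unknown categories stay with the service desk rather than being guessed.

```python
# Hypothetical categories and resolver groups, for illustration only.
ROUTING = {
    "network": "Network Operations",
    "email": "Messaging Team",
    "database": "Database Administration",
}


def resolver_group(category: str) -> str:
    """Return the group that should receive a ticket in this category."""
    # Unknown categories stay with the service desk for further triage.
    return ROUTING.get(category.strip().lower(), "Service Desk")


if __name__ == "__main__":
    print(resolver_group("Email"))
    print(resolver_group("badge reader"))
```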
If multiple people with the same symptoms come into the ER within a period of time, the hospital treats this as an epidemic. The ER has detailed processes on how to escalate an emergency to an epidemic and how to handle an epidemic, such as isolating the affected patients from the rest of the hospital so they don’t infect anyone else. The same applies to IT support when multiple customers are having the same IT problem. The incidents need to be escalated to a problem, and the IT department needs to follow the problem management process.
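Spotting an IT "epidemic" can be sketched as counting incidents that share a symptom within a time window. The window of one hour and the threshold of three reports below are assumptions chosen for the example, as is the `problem_candidates` function itself.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def problem_candidates(incidents, window=timedelta(hours=1), threshold=3):
    """Return symptoms reported at least `threshold` times within `window`.

    `incidents` is an iterable of (symptom, reported_at) pairs.
    """
    by_symptom = defaultdict(list)
    for symptom, reported_at in incidents:
        by_symptom[symptom].append(reported_at)

    flagged = set()
    for symptom, times in by_symptom.items():
        times.sort()
        # Look for `threshold` consecutive reports that fit inside the window.
        for i in range(len(times) - threshold + 1):
            if times[i + threshold - 1] - times[i] <= window:
                flagged.add(symptom)
                break
    return flagged


if __name__ == "__main__":
    start = datetime(2016, 11, 22, 9, 0)
    reports = [
        ("vpn down", start + timedelta(minutes=5)),
        ("vpn down", start + timedelta(minutes=10)),
        ("vpn down", start + timedelta(minutes=20)),
        ("cannot print", start),
    ]
    print(problem_candidates(reports))
```

A flagged symptom is only a candidate. The decision to open a problem record would still follow the organization's problem management process.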
Communication Protocols
In the hospital, you frequently hear someone on the loudspeaker barking out codes that are understood by those who work in the hospital. The codes describe what emergency is happening or about to happen and who needs to be involved. There is constant communication among the hospital’s doctors, nurses, pharmacists, etc. This should be the same for the IT organization. IT needs a communication protocol that depends on the categorization of the incident. For example, when a high impact or critical incident is identified, certain IT departments, depending on the severity, need to be notified so they can help resolve the incident, reduce risk and downtime, and keep business operations running. One common problem in IT is a lack of continuous communication during a high impact or critical incident. Tiers 2 and 3 are often content to just fix the problem without communicating the status of the resolution. The following groups need to be informed throughout the resolution of the incident:
- Tier 2 needs to be notified so it can work on resolving the incident
- The service desk needs to be notified so it can maintain communication with customers
- The communications team needs to be notified so it can communicate properly with customers
- Senior management needs to be notified to make business decisions
The process should dictate how often and by what means (email, phone call, etc.) tier 2 should update each of these groups.
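A communication protocol of this kind can be captured as a simple plan keyed by classification. In the sketch below, the audiences come from the list above, but which audience applies to which classification, the channels, and the update intervals are all hypothetical values that each organization would set for itself.

```python
from datetime import datetime, timedelta

# Hypothetical plan: who is updated, how, and how often, per classification.
COMMUNICATION_PLAN = {
    "high impact": {
        "audiences": ["tier 2", "service desk", "communications team"],
        "channel": "email",
        "update_every": timedelta(minutes=60),
    },
    "critical": {
        "audiences": ["tier 2", "service desk", "communications team",
                      "senior management"],
        "channel": "phone call",
        "update_every": timedelta(minutes=30),
    },
}


def update_due(classification: str, last_update: datetime, now: datetime) -> bool:
    """Return True when the next status update should be sent."""
    plan = COMMUNICATION_PLAN[classification]
    return now - last_update >= plan["update_every"]


if __name__ == "__main__":
    last = datetime(2016, 11, 22, 9, 0)
    now = datetime(2016, 11, 22, 9, 35)
    if update_due("critical", last, now):
        plan = COMMUNICATION_PLAN["critical"]
        print("Notify by", plan["channel"] + ":", ", ".join(plan["audiences"]))
```

Writing the plan down removes the guesswork during an incident: tier 2 does not have to decide whom to tell or when, only to follow the schedule.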
Follow the Process
In times of chaos, there is calm in the ER because of the process. The key to success for an ER is following, trusting, and believing in the process. The IT support industry can adopt this approach in its everyday operations. The process should be so clear and precise that no one needs to stop and work out what to do next; they just need to follow the process. If the process is followed correctly, everyone will make informed decisions. This is true for both the ER and IT. Following the process saves lives in the ER, and in IT, following the process reduces the mean time to resolve incidents, which in turn restores services to normal in a timely manner.