Incident Management is one of the most popular practices in Service Management.
Its purpose is to minimize the negative impact of incidents by restoring normal service operation as quickly as possible.
So, what is an incident? We can define an incident as an unplanned interruption to a service, or a reduction in the quality of a service.
About Incident Management
Incident Management can have a huge impact on customer and user satisfaction, and on how they perceive the service provider, because it works closely with the people who rely on the services to achieve their own results.
A few considerations are worth mentioning for a better understanding of this practice.
All incidents must be logged and managed to ensure resolution within a timeframe that meets customer and user expectations.
In addition, target resolution times must be agreed, documented, and communicated, so that expectations are realistic and everyone is aware of them.
Incidents are prioritized according to an agreed classification, so that incidents with a greater impact on the business are resolved before lower-impact ones.
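To make the idea of agreed targets and impact-based prioritization concrete, here is a minimal sketch in Python. The `Impact` classes, the target times in `RESOLUTION_TARGETS`, and the incident data are all hypothetical; in practice these values come from the agreement between the service provider and its customers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum
from typing import List


class Impact(IntEnum):
    # Hypothetical classification: a lower value means a higher business impact.
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Assumed target resolution times; real values must be agreed and documented.
RESOLUTION_TARGETS = {
    Impact.HIGH: timedelta(hours=4),
    Impact.MEDIUM: timedelta(hours=24),
    Impact.LOW: timedelta(hours=72),
}


@dataclass
class Incident:
    incident_id: str
    summary: str
    impact: Impact
    logged_at: datetime

    @property
    def due_at(self) -> datetime:
        # Deadline derived from the agreed target for this impact class.
        return self.logged_at + RESOLUTION_TARGETS[self.impact]


def prioritize(incidents: List[Incident]) -> List[Incident]:
    """Order incidents by impact first, then by the time they were logged."""
    return sorted(incidents, key=lambda i: (i.impact, i.logged_at))


queue = [
    Incident("INC-1", "Printer offline on floor 2", Impact.LOW, datetime(2024, 1, 8, 9, 0)),
    Incident("INC-2", "Payment gateway down", Impact.HIGH, datetime(2024, 1, 8, 9, 30)),
    Incident("INC-3", "Slow intranet search", Impact.MEDIUM, datetime(2024, 1, 8, 9, 15)),
]

ordered = prioritize(queue)
for incident in ordered:
    print(incident.incident_id, incident.impact.name, incident.due_at.isoformat())
```

Even though the payment gateway incident was logged last, it moves to the front of the queue because of its impact class.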
Organizations must design their Incident Management practice so that resources are coordinated and allocated according to the type of incident.
Low-impact incidents should be managed efficiently, without consuming too many resources. High-impact incidents, on the other hand, may require more resources and more careful, complex management.
It is common for separate procedures to exist to deal with major incidents and information security incidents.
Incident Management Tools
Information about incidents should be stored in records, supported by appropriate tools.
Ideally, technology should allow for relationships between incidents, configuration items, changes, problems, known errors, and other relevant information to support quick and efficient diagnosis and recovery.
Modern tools can automatically match incidents to problems and known errors, and some apply intelligent data analysis to generate support recommendations for future incidents.
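As a rough illustration of how a tool might link an incident record to known errors, the sketch below matches on shared configuration items and symptom keywords. This is a deliberately naive rule invented for the example, not how any particular product works; the record fields and sample data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class KnownError:
    error_id: str
    problem_id: str                 # the problem this known error belongs to
    configuration_items: Set[str]
    symptom_keywords: Set[str]
    workaround: str


@dataclass
class IncidentRecord:
    incident_id: str
    description: str
    configuration_items: Set[str]
    related_known_errors: List[str] = field(default_factory=list)


def match_known_errors(incident: IncidentRecord, known_errors: List[KnownError]) -> List[KnownError]:
    """Return known errors sharing a configuration item and a symptom keyword."""
    words = set(incident.description.lower().split())
    matches = [
        ke for ke in known_errors
        if ke.configuration_items & incident.configuration_items
        and ke.symptom_keywords & words
    ]
    # Store the relationship on the incident record itself.
    incident.related_known_errors = [ke.error_id for ke in matches]
    return matches


known_errors = [
    KnownError("KE-1", "PRB-1", {"mail-server-01"}, {"timeout", "bounce"},
               "Restart the SMTP relay service"),
    KnownError("KE-2", "PRB-2", {"vpn-gateway"}, {"timeout"},
               "Reconnect using the backup gateway"),
]

incident = IncidentRecord("INC-7", "Outgoing mail fails with timeout", {"mail-server-01"})
matches = match_known_errors(incident, known_errors)
for ke in matches:
    print(ke.error_id, "->", ke.workaround)
```

Only the known error on the same configuration item is suggested, so the support agent gets the relevant workaround without searching for it.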
People working on an incident should provide good-quality information in a timely manner, such as symptoms, affected configuration items, and business impact, so that communication is effective.
Collaboration tools therefore play a key role in this practice, because they allow the people working on an incident to contribute effectively.
About Problem Management
Every service has errors, faults, or vulnerabilities that can cause incidents, in any of the four dimensions of service management.
Many of these errors are identified and resolved before the service goes into production, but some are not, and they threaten services in operation.
In ITIL, these errors are called “problems” and are addressed by the Problem Management practice.
The goal of Problem Management is to reduce the probability and impact of incidents by identifying the actual and potential causes of incidents and by managing workarounds and known errors.
It is important to know the definitions of the terms below for a better understanding:
- Problem: the cause, or potential cause, of one or more incidents;
- Known error: a problem that has been analyzed but has not yet been resolved;
- Workaround: a solution that reduces or eliminates the impact of an incident or problem for which a complete solution is not yet available.
Some workarounds serve the purpose of reducing the probability of incidents occurring.
Problem Identification
Problem Management involves three distinct phases: problem identification, problem control, and error control.
Problem identification also covers problem logging. Problem control includes activities such as problem analysis and the documentation of workarounds and known errors.
Error control manages known errors, that is, problems whose initial analysis is complete, which usually means the faulty components have been identified.
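The three phases can be sketched as a small state model in Python. This is a simplified illustration with assumed names (`Phase`, `Problem`, and its methods are hypothetical) and a strictly linear flow chosen for brevity. Note that the related incidents are only referenced by ID: they remain separate records.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Phase(Enum):
    IDENTIFICATION = "problem identification"
    PROBLEM_CONTROL = "problem control"
    ERROR_CONTROL = "error control"


@dataclass
class Problem:
    problem_id: str
    description: str
    related_incident_ids: List[str] = field(default_factory=list)
    phase: Phase = Phase.IDENTIFICATION   # a newly logged problem starts here
    workaround: Optional[str] = None
    is_known_error: bool = False

    def start_problem_control(self) -> None:
        if self.phase is not Phase.IDENTIFICATION:
            raise ValueError("problem control starts from problem identification")
        self.phase = Phase.PROBLEM_CONTROL

    def record_analysis(self, workaround: Optional[str] = None) -> None:
        """Document the analysis; the problem is now a known error."""
        if self.phase is not Phase.PROBLEM_CONTROL:
            raise ValueError("analysis is recorded during problem control")
        self.workaround = workaround
        self.is_known_error = True

    def start_error_control(self) -> None:
        if not self.is_known_error:
            raise ValueError("error control only manages known errors")
        self.phase = Phase.ERROR_CONTROL


problem = Problem("PRB-1", "Mail relay drops connections under load", ["INC-7", "INC-9"])
problem.start_problem_control()
problem.record_analysis(workaround="Restart the SMTP relay service")
problem.start_error_control()
print(problem.phase.value, problem.is_known_error)
```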
Incidents versus Problems
The difference between an incident and a problem needs to be clear!
Problems are related to incidents, but an incident “is born as an incident and dies as an incident”. An incident never turns into a problem.
On the other hand, one or more incidents can give rise to or identify a problem.
Therefore, we must manage problems differently from incidents!
Incidents have an impact on users or business processes, and we must resolve them as quickly as possible to restore normal business activity.
Problems, on the other hand, are the causes of incidents. They require investigation and analysis to identify the cause, develop workarounds, and recommend a complete, long-term solution.
To wrap up, a simple analogy makes the difference clear: incident management is the firefighter, while problem management is the expert who investigates the cause of the fire in order to prevent recurrence or minimize future damage.