Major incident management best practices are best understood when you zoom out and look at the whole picture. The digitalization of the modern world has forced companies to reevaluate their security posture and how they respond to major incidents like network outages.
Between 1980 and 2000, the IT Infrastructure Library (ITIL) was developed and released. The series of more than 30 volumes covered best practices for managing information systems. Many industry experts saw it as the go-to source for responding to an incident, which explains why its best practices were incorporated into ISO 20000.
In 2019, the fourth version, ITIL 4, was released, providing a more flexible, agile, and configurable approach that’s geared toward modern businesses.
So, what are the major incident management best practices? And how can you apply them to your business? Let’s review.
What is ITIL?
Over the years, ITIL has undergone several revisions, including condensing volumes and updating guidance for new technologies. Some of its lasting focuses have included:
- Automating processes
- Improving service management
- Integrating the IT department with the business
The most recent version has updated the framework to accommodate modern technologies, tools, and software.
Its purpose is to help further solidify the integration between your IT team and the rest of the business, while managing risks and maintaining infrastructure. According to CIO:
The newest version of ITIL focuses on company culture and integrating IT into the overall business structure. It encourages collaboration between IT and other departments, especially as other business units increasingly rely on technology to get work done. ITIL 4 also emphasizes customer feedback, since it’s easier than ever for businesses to understand their public perception, customer satisfaction and dissatisfaction.
Aside from compliance, there are several incident management benefits you can expect from applying ITIL to your organization. They include:
- Improved productivity
- Reduced IT costs
- Improved IT services
- Better customer satisfaction and relationships
- Improved delivery of third-party services
- Reduced risk, disruption, and failure
What Is An Incident?
It’s important to note that incidents aren’t the same thing as a problem or a request. So, what are they?
According to the ITIL, an incident is “an unplanned interruption that causes, may cause or reduces the quality of an IT service.” Because of this, it’s vital that you categorize an issue as a major incident if it causes a service outage that forces your organization to stray from its prevailing incident management processes.
A problem is an underlying IT condition that’s identified when several incidents crop up with the same symptoms. A request, on the other hand, is a formal request for something like a new piece of hardware or credentials, which requires approval before IT can fulfill it.
Common incidents include:
- Printer not functioning
- Hardware not working
- Error message when trying to launch an application
- Active Directory password reset
- Monitor flickering
It’s worth noting that not all incidents are created equal. There are three distinct levels of incidents, separated by level of impact on your organization and its clients:
- Low-priority – Incidents that don’t interrupt users or the business and can be worked around. Services to both users and customers remain operational.
- Medium-priority – Incidents that affect some staff and can interrupt work to one degree or another. Customers may be inconvenienced.
- High-priority – Incidents that impact a large number of users or customers, interrupt the business, and affect the delivery of service.
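The three levels above can be expressed as a simple triage rule. What follows is a minimal Python sketch rather than an ITIL-defined algorithm: the `classify_priority` function, its parameters, and the 25% cutoff for "a large number of users" are all illustrative assumptions you would tune to your own organization.

```python
from enum import Enum


class Priority(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def classify_priority(users_affected: int, total_users: int,
                      business_interrupted: bool,
                      customers_affected: bool,
                      large_share: float = 0.25) -> Priority:
    """Map an incident's impact onto the three priority levels.

    large_share is a hypothetical cutoff for "a large number of users".
    """
    share = users_affected / total_users if total_users else 0.0
    # High: the business is interrupted and many users or any customers are hit.
    if business_interrupted and (share >= large_share or customers_affected):
        return Priority.HIGH
    # Medium: some staff are affected, work is interrupted, or customers
    # are inconvenienced.
    if users_affected > 0 or customers_affected or business_interrupted:
        return Priority.MEDIUM
    # Low: nobody is interrupted and services remain operational.
    return Priority.LOW
```

Under these assumptions, `classify_priority(120, 200, True, False)` returns `Priority.HIGH`, while an incident that affects five of 200 users without interrupting the business comes back as `Priority.MEDIUM`.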
ITIL Incident Management Best Practices
Incident management is amongst the most critical IT support processes any business can focus on.
Its goal is to offer quick fixes that resolve an issue and restore service to full capacity. The focus isn’t on discovering the root problem but on taking the user incident from a reported status to a closed status.
Because incidents can be commonplace, incident management is something that must be constantly applied and improved across all levels of your organization. But adopting the ITIL framework is easier said than done. It must be a concerted effort from upper management on down.
So, how can you use incident management to transform your work environment? ITIL 4 recommends that you apply the following best practices:
Utilizing the Service Desk
Incident management has several different functions, the most important of which involves the service desk, or help desk.
The service desk acts as the single point of contact to which users report incidents, which is why it’s so critical. Without it, there’s no easy way to prioritize incidents and ensure proper workflows. As a result, low-priority incidents may get fixed immediately while high-priority incidents fall by the wayside, causing significant damage to your business in the process.
The service desk must be structured in such a way as to optimize incident management. Ideally, this is accomplished by dividing the help desk into tiers of support:
- First-tier – Basic issues like computer troubleshooting or problems with passwords. Because these types of issues pop up frequently, it’s simple to templatize and streamline support. Since it’s often possible to work around these incidents, they don’t require immediate attention from the service desk.
- Second-tier – Issues that may require more training, skill, or access in order to be resolved. Because these incidents often fall into the medium-priority category, they’ll typically require a speedier response from the desk.
- Third-tier – Major incidents. Though rare, they necessitate instant escalation and are always the utmost priority, since they could cause significant disruption to your business.
Building Robust Workflows to Manage the Incident Lifecycle
The best way to quickly get services back in order is to implement a dynamic workflow built on standardized processes. This involves separating major incidents from the rest and then streamlining incident responses.
How? By finding areas to simplify and automate the following processes:
- Identifying the incident
- Alerting the impacted parties of the incident
- Assigning the right people to resolve the incident
- Monitoring the incident through the entirety of its life cycle
- Escalating an incident if a service level agreement is breached
- Resolving and closing incident cases
- Generating analyses from incident documentation
By preparing for any and all incidents ahead of time and stipulating workflows, your entire team will know what to do and how to respond with alacrity, especially to major incidents.
Identifying and Defining the Incident
IT issues can negatively impact your business and its relationship with your customers.
Ideally, incidents are identified at an early stage via automatic monitoring, before they can ever impact a user. But realistically, that’s not always possible. Sometimes incidents are only identified by the user who’s been impacted. When that occurs, the user notifies the IT service desk so that it can begin incident management.
To prevent confusion, it’s helpful to categorize significant incidents based on key elements like urgency, severity, and impact. Best practices you can apply include:
- Configure data fields and event tags to automate classification and save time in the triage and mitigation process.
- Create deduplication rules to categorize similar alerts and prevent the on-call team from receiving redundant notifications.
- After alerting the team to the incident, manually input all important data.
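A deduplication rule of the kind described above might look like the following Python sketch. It does not reflect the behavior of any particular alerting product: the `Alert` fields, the choice of (service, tag) as the deduplication key, and the five-minute window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    service: str            # hypothetical data field
    tag: str                # hypothetical event tag, e.g. "disk_full"
    received_at: datetime


class Deduplicator:
    """Suppress alerts that repeat the same (service, tag) within a window."""

    def __init__(self, window: timedelta = timedelta(minutes=5)):
        self.window = window
        # Maps (service, tag) to the time the on-call team was last notified.
        self._last_notified = {}

    def should_notify(self, alert: Alert) -> bool:
        key = (alert.service, alert.tag)
        last = self._last_notified.get(key)
        if last is not None and alert.received_at - last < self.window:
            return False  # duplicate: the team has already been notified
        self._last_notified[key] = alert.received_at
        return True
```

The window here is measured from the last notification that was actually sent, so a steady stream of identical alerts still produces one reminder per window instead of going silent indefinitely. Whether that is the right trade-off depends on your on-call policy.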
Classification and Prioritization
Rigorous documentation allows you to respond to threats in a timely fashion and chart your historical record.
Any incident that is reported to the service desk—no matter how big or small—needs to be logged as a ticket. The ticket must include crucial details, like:
- User information
- Description of the incident
- Resolution details
- Configuration items
- Closure details
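As a rough illustration, the ticket details listed above could be captured in a record like the one below. This is a sketch under assumptions: the class name, field names, the added `opened_at` timestamp, and the `missing_for_closure` helper are invented for the example and are not prescribed by ITIL.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class IncidentTicket:
    # Field names are illustrative; they mirror the details listed above.
    user: str
    description: str
    configuration_items: List[str] = field(default_factory=list)
    resolution_details: Optional[str] = None
    closure_details: Optional[str] = None
    # Assumed extra field: when the ticket was logged (UTC).
    opened_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def missing_for_closure(self) -> List[str]:
        """Return the names of fields still empty before the ticket can close."""
        missing = []
        if not self.resolution_details:
            missing.append("resolution_details")
        if not self.closure_details:
            missing.append("closure_details")
        return missing
```

A check like `missing_for_closure` is one simple way to keep documentation rigorous: the service desk tool can refuse to close a ticket until the list it returns is empty.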
After the incident has been logged, you’ll need to categorize it in order to “assign, escalate, and monitor” incident trends. Additionally, you’ll be required to assign priority (based on urgency level) to determine who will handle the issue, how it will be addressed, and when it will be fixed.
By prioritizing incidents, the on-call team can understand the severity of the problem at first glance. You can also streamline the process by configuring rules that assign priority automatically.
Automation, Escalation, and Assigning Status to an Incident
The larger your organization, the greater the likelihood your IT team will have to juggle a variety of different incidents on a daily basis. Your team needs to be able to escalate an incident to the right people as soon as possible. For this, automated escalation is critical.
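One way to automate escalation is to tie it to service level agreement (SLA) targets, so that an incident moves up the chain each time a target window passes without resolution. The Python sketch below assumes hypothetical SLA targets and a hypothetical three-step escalation chain; real values come from your own agreements and org chart.

```python
from datetime import datetime, timedelta

# Hypothetical response targets per priority; real SLAs vary by contract.
SLA_TARGETS = {
    "high": timedelta(minutes=15),
    "medium": timedelta(hours=4),
    "low": timedelta(hours=24),
}

# Hypothetical escalation chain, ordered from first responder upward.
ESCALATION_CHAIN = ["service_desk", "second_tier", "incident_manager"]


def escalation_level(priority: str, opened_at: datetime, now: datetime) -> int:
    """Count fully elapsed SLA windows, capped at the top of the chain.

    0 means the incident is still within its first target window.
    """
    elapsed = now - opened_at
    windows = elapsed // SLA_TARGETS[priority]  # timedelta // timedelta -> int
    return max(0, min(windows, len(ESCALATION_CHAIN) - 1))


def current_owner(priority: str, opened_at: datetime, now: datetime) -> str:
    """Return who should own the incident at time `now`."""
    return ESCALATION_CHAIN[escalation_level(priority, opened_at, now)]
```

With these illustrative targets, a high-priority incident that is still open after 20 minutes would move from `service_desk` to `second_tier`. A scheduler could call `current_owner` periodically and notify the new owner whenever the result changes.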
To help maintain a channel of communication and ensure that every ticket is being handled, it’s important to assign each incident a specific status in your system. This allows the incident management team to instantly see where a specific incident is in the process, making it easier to prioritize their focus.
There are six different incident statuses:
- New – A new status alerts the service desk that it has received an incident report but has not yet assigned an expert to handle it.
- Assigned – The incident has been assigned to an individual service desk expert.
- In progress – The incident has been assigned but is not yet resolved. The agent is in the middle of collaborating with the user to diagnose the issue and resolve it.
- On-hold – The incident needs a response or further information, whether from the user or a third party. On-hold is meant to ensure service level agreement deadlines aren’t missed while awaiting a response.
- Resolved – The service desk has fixed the incident and the user’s service is restored to the level specified in the service level agreement.
- Closed – The incident is resolved and there are no further actions required.
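The six statuses lend themselves to a small state machine that rejects invalid jumps, such as closing a ticket that was never assigned. In the Python sketch below, the status names come from the list above, but the allowed transitions are an assumption made for illustration; the list itself does not specify them.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    ASSIGNED = "Assigned"
    IN_PROGRESS = "In progress"
    ON_HOLD = "On-hold"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Assumed transitions; adjust to match your own service desk's process.
ALLOWED = {
    Status.NEW: {Status.ASSIGNED},
    Status.ASSIGNED: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.ON_HOLD, Status.RESOLVED},
    Status.ON_HOLD: {Status.IN_PROGRESS},
    # A resolved incident can be reopened if the fix does not hold.
    Status.RESOLVED: {Status.CLOSED, Status.IN_PROGRESS},
    Status.CLOSED: set(),
}


def transition(current: Status, new: Status) -> Status:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"Cannot move from {current.value} to {new.value}")
    return new
```

Keeping the transition table as data rather than burying it in conditionals makes it easy to review with the service desk team and to change as the process evolves.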
Communicating the Incident
Key stakeholders need to be kept informed about the incident throughout its entire life cycle.
If an incident does occur, particularly a major incident, it’s important that both your customers and the internal teams that interact with customers are aware of the issue and aware of the mitigation taking place. Ideally, this will be done via automated updates sent from a centrally managed hub.
Internally, all relevant teams should be included in the notification. This ensures that they have visibility on the incident and are aware of the mitigating actions taking place. Additional details can be logged on a private status page so the relevant parties can have an even better idea of what’s happening.
Externally, particularly for B2C companies, it’s important to have a public status page that’s regularly maintained and updated. That way, if a user runs into a service issue, their first step will be to check the status page. Doing so will help keep your service desk from being inundated with redundant incident reports and reduce time wasted answering basic questions.
Training Your Employees
An effective incident management plan starts with training. Preparation is the first thing you can do to make sure that all incidents are properly handled.
ITIL has certification programs that can improve your employees’ ability to manage any incident. Better-trained employees can help prepare your business to manage risk, instill cost-effective practices, and build a stable IT environment.
Per CIO, there are several levels of certification, including:
- ITIL Foundation – The basic ITIL module. It’s an introductory course that covers the core concepts and principles of ITIL 4.
- ITIL Managing Professional (MP) – This type of program is ideal for IT practitioners who oversee service teams in IT operations. It comprises four exams:
  - ITIL Specialist Create Deliver & Support
  - ITIL Specialist Drive Stakeholder Value
  - ITIL Specialist High Velocity IT
  - ITIL Strategist Direct Plan & Improve
- ITIL Strategic Leader (SL) – Covers all digitally enabled services in the organization, including those outside of IT operations. To gain this certification you must pass two tests:
  - ITIL Strategist Direct Plan & Improve
  - ITIL Leader Digital & IT Strategy
- ITIL Master – If you complete both the MP and SL, you can qualify for an ITIL Master designation. But first you must have at least five years of experience working in IT service management in a leadership, management, or higher management advisory role.
The more your team knows, the greater the likelihood that they’ll be able to deliver quality IT services that align with your overall strategy.
RSI Security—Incident Management Experts
ITIL was created as a framework organizations could adopt to properly manage and respond to incidents both great and small.
Abiding by ITIL isn’t easy, but it becomes more manageable when you apply best practices like:
- Utilizing the service desk
- Building robust workflows to help manage an incident throughout its lifecycle
- Identifying and defining the incident
- Classifying and prioritizing the incident
- Automation, escalation, and assigning status to an incident
- Communicating the incident
- Training your employees
But what if you need help with incident management? What if you lack the internal resources to properly follow ITIL?
RSI Security is standing by.
We’re a team of experts that can assist whenever an incident occurs. We provide incident management services that include:
- 24/7 incident response and recovery
- Forensic analysis
- Breach assessment
If you want an immediate, custom response to all incidents, one that comes with a personal on-site touch, RSI Security is the solution.