Fight fires with focus: how to automate incident resolution with IT process orchestration

05-02-2014 03:41 PM - edited 01-31-2015 09:48 PM

By Nimish Shelat, Product Marketing Manager, HP Automation and Cloud Management

 

Inside many IT organizations, incident resolution is often a surprisingly manual and complex process. Even when an organization implements event consoles like HP Operations Manager i (OMi) to compile events across multiple domains and weed out irrelevant or duplicate data, Tier 1 and Tier 2 operations still spend much of their workdays responding to alarms and putting out fires.

 

 

But what if most of those firefighting exercises could be eliminated with IT Process Automation? Let’s take a look at how incident remediation works when HP OM operates in concert with IT Process Automation and IT Process Orchestration.

 

FREE: The new HP Operations Orchestration Community Edition

 

 

How incident resolution works with OM

 

Let’s say that your IT environment experiences 68 million raw events per day (as one HP customer did). HP OM automates the collection, correlation and deduplication of these events, prioritizes them by business impact, and then applies automatic actions to fix common problems. This is an excellent start: as the HP customer found out, it can cut the number of alerts you need to address to 5,000.
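
To make the deduplicate-then-prioritize idea concrete, here is a minimal Python sketch. It is not HP OM code or its API: the Event class, the to_alerts function and the numeric impact scale are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical raw event; not an HP OM data structure."""
    source: str   # host or application that raised the event
    key: str      # what went wrong, e.g. "disk_full"
    impact: int   # assumed business-impact score, higher is more urgent


def to_alerts(events):
    """Collapse duplicate events and order the survivors by business impact."""
    counts = {}
    for ev in events:
        # Identical events hash to the same key, so repeats share one entry.
        counts[ev] = counts.get(ev, 0) + 1
    # Highest impact first; ties fall back to the repeat count.
    return sorted(counts.items(), key=lambda item: (-item[0].impact, -item[1]))


raw = [
    Event("web01", "disk_full", 3),
    Event("web01", "disk_full", 3),
    Event("db01", "service_down", 9),
    Event("web01", "disk_full", 3),
]
alerts = to_alerts(raw)
for ev, seen in alerts:
    print(f"{ev.source}: {ev.key} (impact {ev.impact}, seen {seen}x)")
```

Four raw events become two alerts here, with the high-impact outage listed first. A real event console also correlates related events across domains, which this sketch leaves out.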

 

However, resolving 5,000 alerts by hand still adds up to a lot of work. Here’s why: when the OM enterprise console presents an alert to the Tier 1 Operations team, the operators manually turn to reference documentation such as runbooks and knowledge bases, or to their own tribal knowledge (or maybe just a note tacked up on a cubicle wall; don’t kid yourself, it happens).

 

 

Fig. 1: How manual incident remediation processes work with HP Operations Management.

 

But what if first responders can’t resolve the event? Then Tier 1 must escalate to Tier 2 subject matter experts for manual troubleshooting, triage and (ideally) repair (Figure 1, above). Even then, some alerts will not get resolved, at which point Tier 2 administrators create an incident that is routed to an Infrastructure or Applications team to investigate further.

 

Clearly this can be a long, manual process of investigations, trial-and-error fixes and hand-offs by one or several IT personnel.

 

How OM and Operations Orchestration fully automate incident resolution

 

Operations Orchestration (OO) can replace many of the most repetitive processes that Tier 1 and Tier 2 administrators use for investigation and repair (Figure 2).

 

 

Fig. 2: How process automation remediates incidents with HP Operations Management and HP Operations Orchestration

 

When OM registers an event, it uses policies with criteria you set to trigger automated OO processes for incident resolution. Depending on the event and the policies, OO launches step-by-step logical flows for diagnosis and self-healing repair, and sends acknowledge and annotate messages on the alert with detailed information that operators can review (Figure 3). OO records all flow execution activity for auditing and reporting and, when necessary, automatically creates enriched incident tickets in the Service Desk.
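
The policy-to-flow hand-off described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the OM or OO interface: POLICIES, handle, restart_service and the event dictionary layout are hypothetical, and the repair step only pretends to restart anything.

```python
audit_log = []  # every step outcome is recorded for later review


def restart_service(event):
    """Hypothetical repair step; pretends a restart only works on 'web' hosts."""
    return event["host"].startswith("web")


# Hypothetical policy table: event type -> ordered repair steps to try.
POLICIES = {"service_down": [restart_service]}


def handle(event):
    """Run the flow matching the event and report how the event ended up."""
    steps = POLICIES.get(event["type"])
    if steps is None:
        return "no_policy"       # nothing matched; left for an operator
    for step in steps:
        ok = step(event)
        audit_log.append((event["host"], step.__name__, ok))
        if ok:
            return "resolved"    # the alert can be acknowledged and annotated
    return "ticket_created"      # escalate, with the audit trail as context


fixed = handle({"type": "service_down", "host": "web01"})
escalated = handle({"type": "service_down", "host": "db01"})
unmatched = handle({"type": "fan_failure", "host": "web02"})
print(fixed, escalated, unmatched)
```

The three calls cover the three outcomes in the paragraph: a self-healed event, an event that falls through to a ticket, and an event no policy covers. Keeping the audit log separate from the return value mirrors the idea that flow activity is recorded regardless of how the flow ends.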

 

 

Fig. 3: Example of an HP Operations Orchestration flow 

 

Operator-Assisted Incident Resolution

 

One variation on this fully automated model incorporates operator assistance. In this scenario, the OM event alert goes to Tier 1 Operations, who may choose to launch “guided” HP OO flows from the enterprise console menu and make decisions interactively.

 

Of course, not every event will be resolved through OO incident remediation flows, but the flows can address the vast majority of events in a consistent, standardized way. For example, the HP customer I mentioned above was able to reduce its 5,000 alerts to a much more manageable 1,500. Integrating OM and OO allows Tier 1 and Tier 2 personnel to focus their efforts on the alerts that remain.
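
For a sense of scale, the customer figures quoted in this post can be worked through directly. The short Python calculation below assumes all three numbers describe the same period (the post gives the raw event count per day); the variable names are just for illustration.

```python
raw_events = 68_000_000    # raw events per day, from the customer example
alerts_after_om = 5_000    # alerts left after OM correlation and deduplication
alerts_after_oo = 1_500    # alerts left after OO remediation flows

# Share of raw events that still surface as alerts after OM.
share_surfaced = alerts_after_om / raw_events

# Further reduction in alerts once OO flows are added.
handled_by_oo = alerts_after_om - alerts_after_oo
cut_by_oo = 1 - alerts_after_oo / alerts_after_om

print(f"Alerts as a share of raw events: {share_surfaced:.4%}")
print(f"Alerts handled by OO flows: {handled_by_oo} ({cut_by_oo:.0%})")
```

In other words, fewer than one raw event in ten thousand reaches an operator after OM, and the OO flows then take care of 3,500 of the remaining 5,000 alerts.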

 

Experience HP Operations Orchestration for free

The new HP Operations Orchestration Community Edition is a free download of the OO platform with out-of-the-box content packs for automating incident remediation. It is designed for easy self-installation, so within two hours you can begin experiencing the power of IT process automation and IT operations orchestration.

 

 
