Business Service Management

Fight fires with focus: how to automate incident resolution with IT process orchestration

Guest Blogger (HPE-SW-Guest) 05-02-2014 03:41 PM - edited 01-31-2015 09:48 PM

By Nimish Shelat, Product Marketing Manager, HP Automation and Cloud Management


Inside many IT organizations, incident resolution is a surprisingly manual and complex process. Even when an organization implements an event console such as HP Operations Manager i (OMi) to compile events across multiple domains and weed out irrelevant or duplicate data, Tier 1 and Tier 2 operations teams still spend much of their workdays responding to alarms and putting out fires.


(Source: Flickr/NY National Guard)


But what if most of those firefighting exercises could be eliminated with IT Process Automation? Let’s take a look at how incident remediation works when HP OM operates in concert with IT Process Automation and IT Process Orchestration.


FREE: The new HP Operations Orchestration Community Edition



How incident resolution works with OM


Let’s say that your IT environment experiences 68 million raw events per day (as one HP customer's did). HP OM automates the collection, correlation and deduplication of these events, prioritizes them based on their business impact and then applies automatic actions to fix common problems. This is an excellent start: as the HP customer found out, it can cut the number of alerts you need to address to 5,000.
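To make the deduplicate-then-prioritize idea concrete, here is a minimal Python sketch. It is not HP OM code or its API: the `Event` fields, the numeric `impact` score and the function names are all assumptions made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str   # host or application that raised the event (hypothetical field)
    symptom: str  # e.g. "disk_full" (hypothetical field)
    impact: int   # assumed business-impact score; higher means more urgent


def deduplicate(events):
    """Collapse events sharing (source, symptom) into one alert with a count."""
    counts = Counter()
    worst = {}
    for e in events:
        key = (e.source, e.symptom)
        counts[key] += 1
        # Keep the highest impact seen for this group of duplicates.
        worst[key] = max(worst.get(key, 0), e.impact)
    return [
        {"source": s, "symptom": sym, "count": counts[(s, sym)], "impact": worst[(s, sym)]}
        for (s, sym) in counts
    ]


def prioritize(alerts):
    """Order alerts so the highest business impact comes first."""
    return sorted(alerts, key=lambda a: a["impact"], reverse=True)


raw = [
    Event("web01", "disk_full", 3),
    Event("web01", "disk_full", 3),
    Event("db01", "service_down", 9),
    Event("web01", "disk_full", 4),
]
alerts = prioritize(deduplicate(raw))
for alert in alerts:
    print(alert)
```

Four raw events become two alerts here, with the database outage listed first. A real event console correlates on far richer criteria, but the shape of the reduction is the same.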


However, resolving 5,000 alerts still takes considerable time. Here’s why: when the OM enterprise console presents an alert to the Tier 1 Operations team, operators manually turn to reference documentation such as runbooks, knowledge bases, or their own tribal knowledge (or maybe just a note tacked up on a cubicle wall; don’t kid yourself, it happens).



Fig. 1: How manual incident remediation processes work with HP Operations Management.


But what if first responders can’t resolve the event? Then Tier 1 must escalate to Tier 2 subject matter experts for manual troubleshooting, triage and (ideally) repair (Figure 1, above). Even then, some alerts will not get resolved, at which point Tier 2 administrators create an incident that is routed to an Infrastructure or Applications team to investigate further.


Clearly, this can be a long, manual process of investigations, trial-and-error fixes and hand-offs involving one or several IT staff members.


How OM and Operations Orchestration fully automate incident resolution


Operations Orchestration (OO) can replace many of the most repetitive processes that Tier 1 and Tier 2 administrators use for investigation and repair (Figure 2).



Fig. 2: How process automation remediates incidents with HP Operations Management and HP Operations Orchestration


When OM registers an event, it uses policies with criteria you set to trigger OO automated processes for incident resolution. Depending on the event and the policies, OO launches step-by-step logical flows for diagnosis and self-healing repair, then acknowledges or annotates the alert with detailed information that operators can review (Figure 3). OO records all flow execution activity for auditing and reporting and, when necessary, automatically creates enriched incident tickets for the Service Desk.
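The policy-to-flow handoff described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the OM or OO API: the `POLICIES` table, the two flow functions, and the `audit_log` and `tickets` lists are hypothetical stand-ins for the real policy engine, flow library, audit store and Service Desk.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    source: str
    symptom: str
    annotations: list = field(default_factory=list)
    acknowledged: bool = False


# Each hypothetical "flow" diagnoses, attempts a repair, annotates the alert,
# and returns True if the problem was fixed.
def restart_service_flow(alert):
    alert.annotations.append("diagnosis: service process not running")
    alert.annotations.append("repair: service restarted")
    return True


def clean_disk_flow(alert):
    alert.annotations.append("diagnosis: disk nearly full")
    alert.annotations.append("repair: temp files removed, still above threshold")
    return False


# Assumed policy table: which flow to trigger for which symptom.
POLICIES = {"service_down": restart_service_flow, "disk_full": clean_disk_flow}

audit_log = []  # every flow run is recorded for auditing and reporting
tickets = []    # stand-in for incident tickets sent to the Service Desk


def handle(alert):
    flow = POLICIES.get(alert.symptom)
    if flow is None:
        return "manual"  # no policy matched; an operator has to look at it
    resolved = flow(alert)
    audit_log.append((alert.source, alert.symptom, flow.__name__, resolved))
    if resolved:
        alert.acknowledged = True
        return "resolved"
    # Repair failed: open a ticket enriched with what the flow already found.
    tickets.append({"source": alert.source, "symptom": alert.symptom,
                    "details": list(alert.annotations)})
    return "ticketed"


results = [
    handle(Alert("db01", "service_down")),
    handle(Alert("web01", "disk_full")),
    handle(Alert("app02", "unknown_error")),
]
print(results)
```

The point of the sketch is the branching: a matched policy runs a flow, a successful flow acknowledges the alert, and a failed flow still leaves behind a ticket that carries the diagnosis with it.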




Fig. 3: Example of an HP Operations Orchestration flow 


Operator-Assisted Incident Resolution


One variation on this fully automated model is to incorporate operator assistance. In this scenario, the OM event alert goes to Tier 1 Operations, which may choose to launch “guided” HP OO flows from the enterprise console menu and make decisions interactively.


Of course, not every event will be resolved through OO incident remediation flows, but the flows can address the vast majority of events in a consistent, standardized way. For example, the HP customer I mentioned above was able to reduce its 5,000 alerts to a much more manageable 1,500. Integrating OM and OO allows Tier 1 and Tier 2 personnel to focus their efforts on the alerts that remain.
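For a sense of scale, the customer figures quoted in this post can be worked through with simple arithmetic. The numbers below come from the post itself; only the variable names are mine.

```python
raw_events_per_day = 68_000_000  # raw events before OM processing
alerts_after_om = 5_000          # alerts left after OM correlation and deduplication
alerts_after_oo = 1_500          # alerts left after OO remediation flows

# On average, how many raw events does OM fold into each remaining alert?
raw_per_alert = raw_events_per_day // alerts_after_om

# What share of OM's alerts do the OO flows take off the operators' plate?
oo_reduction_pct = (alerts_after_om - alerts_after_oo) * 100 // alerts_after_om

print(raw_per_alert)      # 13600
print(oo_reduction_pct)   # 70
```

In other words, OM condenses roughly 13,600 raw events into each alert, and the OO flows then remove about 70 percent of what is left.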


Experience HP Operations Orchestration for free

The new HP Operations Orchestration Community Edition is a free download of the OO platform with out-of-the-box content packs for automating incident remediation. It is designed for easy self-installation, so you can begin experiencing the power of IT process automation and IT operations orchestration within two hours.


