Shine a light on operations blind spots with Big Data



By Gerben Verstraete

I was recently working with an IT team at a global transportation enterprise to discuss their Operations practice strategy. They are doing everything right in terms of creating a big-picture, long-term strategy for how to prevent outages and become more predictive and agile in protecting business-critical applications. But in the meantime, they’ve got all these fires to put out.

This is the case at so many enterprises. They have put monitoring tools in place but still face outages, and when an outage occurs, it can take a long time for IT to pinpoint the cause. Gaining visibility into the services IT provides is only going to get harder, to the point where traditional management tools no longer provide the full picture. IT environments are more complex than ever, and applications are constantly changing. With initiatives such as DevOps, apps are released more frequently: we have gone from quarterly releases to monthly releases, and from there to weekly releases. We expect daily (if not more frequent) releases in the future. Given this complexity, what can you do to protect your business-critical applications?

An operations analytics solution can be the answer, but you have to set it up right. In HPE Software Services, we’ve seen tremendous ROI from our operations analytics solutions. But we’ve also gone into organizations and seen teams that have a Big Data tool for their operations but don’t get value out of it. Collecting data is one thing, but if you don’t know what questions to ask of it, you end up one tool richer with no value to show for it. Here are three keys to success.


1. Become application-centric

In general, IT operations teams are still very infrastructure-focused. This has to change: Ops teams and App teams have to collaborate closely. Today’s composite applications are so complex that you will certainly miss something if you don’t focus your monitoring and analysis around your applications. To become application-centric, you need to develop your applications with monitoring in mind. Ensure instrumentation is in place and develop (Big Data) maps and pattern analysis around your applications. Lastly, use trend analysis to create a feedback loop to development teams about how applications, as well as the people using them, behave in the real world, so that your apps continually improve.
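To make "develop your applications with monitoring in mind" concrete, here is a minimal sketch of application-level instrumentation in Python. It is an illustration only, not part of any HPE product: the `instrumented` decorator, the in-memory `METRICS` list, and the `lookup_shipment` function are all hypothetical names, and a real deployment would ship these records to an analytics platform instead of a list.

```python
import time
from functools import wraps

# Hypothetical in-memory sink; a real system would forward these records
# to whatever analytics platform is in use.
METRICS = []

def instrumented(operation):
    """Record duration and outcome of every call to the wrapped function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "ok"
            try:
                return func(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # Runs on both success and failure, so no call goes unrecorded.
                METRICS.append({
                    "operation": operation,
                    "status": status,
                    "duration_s": time.perf_counter() - start,
                })
        return wrapper
    return decorator

@instrumented("lookup_shipment")
def lookup_shipment(shipment_id):
    # Hypothetical business function standing in for real application code.
    return {"id": shipment_id, "state": "in transit"}
```

Because the record is written per business operation and not per server, the data is already grouped the way an application-centric analysis needs it.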


2. Capture all relevant data 

You want to capture as much relevant operations data as possible. In today’s complex environments, there are things going on that you simply don’t know about and won’t find through monitoring. You might not know that one application affects another application because they appear to have nothing to do with one another. But there might be some point in the underlying infrastructure where their traffic patterns collide, along with other “noise” that systems can generate. You want to capture operations events, metric data, transaction data, your systems and applications logs, your network traffic, data from your network devices, and so on. Once you have the data, correlate it using pattern analysis. That is how you start making this dark data visible.

We saw one case, for example, where a virtual machine was deployed but nobody knew it was configured to be in debug mode. This particular system was causing applications to perform poorly. When looking at traditional operations monitoring tools, everything still looked fine as things started to fail. It was only through capturing that system log that they were able to see that the machine was not properly configured for production. You can’t tell that through traditional monitoring. (If you want to learn more about how this works, check out this white paper: Analyzing machine data—the best way forward.)
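The debug-mode case can be illustrated with a small Python sketch that scans captured log lines and flags hosts that emit an unusually high share of DEBUG entries. This is an assumption-laden example, not how any particular analytics product works: the log format, the host names, the sample lines, and the 50% threshold are all made up for illustration.

```python
from collections import Counter

# Hypothetical log format: "<timestamp> <host> <LEVEL> <message>"
SAMPLE_LOGS = [
    "2016-03-01T10:00:00 vm-01 INFO request served",
    "2016-03-01T10:00:01 vm-02 DEBUG entering handler",
    "2016-03-01T10:00:01 vm-02 DEBUG payload dump follows",
    "2016-03-01T10:00:02 vm-02 INFO request served",
    "2016-03-01T10:00:03 vm-01 INFO request served",
    "2016-03-01T10:00:04 vm-02 DEBUG leaving handler",
]

def hosts_in_debug_mode(lines, threshold=0.5):
    """Return hosts whose share of DEBUG lines is at or above the threshold."""
    total = Counter()
    debug = Counter()
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) < 3:
            continue  # skip lines that do not match the assumed format
        host, level = parts[1], parts[2]
        total[host] += 1
        if level == "DEBUG":
            debug[host] += 1
    return sorted(h for h in total if debug[h] / total[h] >= threshold)

print(hosts_in_debug_mode(SAMPLE_LOGS))
```

A production host logging mostly at DEBUG level is exactly the kind of signal that availability and threshold checks do not surface but that the raw system log does.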


3. Use maps and models to visualize trends

The more data you can gather, the more intelligent and the more predictive you can be. But proper visualization is key to achieving these benefits.

You also need to create a model. This means grouping data in ways that matter to you, which for most organizations is around the services they provide. You can get very creative here, but don’t lose sight of the use cases you are trying to address. The bottom line is that in order to get value out of a Big Data system, you need to understand how to use it. How will you be grouping and visualizing your data? What are the patterns you want to start with so that you can then ask questions to quickly find root causes, spot trends, and predict and prevent issues? For example, in the HPE Operations Analytics solution, we use patterns that we’ve developed with HPE Labs.
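As a rough illustration of "grouping data in ways that matter to you," the following Python sketch rolls host-level measurements up to the services those hosts support. The service names, host names, and response times are invented for the example, and the flat dictionary is only a stand-in for whatever service model your organization maintains.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical service model: which hosts back which business service.
SERVICE_MODEL = {
    "booking": ["web-01", "web-02", "db-01"],
    "tracking": ["web-03", "db-01"],
}

# Hypothetical samples: (host, response time in milliseconds).
SAMPLES = [
    ("web-01", 120),
    ("web-02", 80),
    ("web-03", 100),
    ("db-01", 400),
]

def rollup_by_service(model, samples):
    """Average each service's response time across the hosts it depends on."""
    host_values = defaultdict(list)
    for host, value in samples:
        host_values[host].append(value)
    result = {}
    for service, hosts in model.items():
        values = [v for h in hosts for v in host_values.get(h, [])]
        if values:  # leave out services with no data instead of guessing
            result[service] = mean(values)
    return result

print(rollup_by_service(SERVICE_MODEL, SAMPLES))
```

In this made-up data the shared host `db-01` drags down both services, which is the kind of hidden dependency between seemingly unrelated applications that a service-oriented grouping makes visible.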

That’s what we help our customers with. It has produced significant outcomes for them and has repaid their investment several times over within a short timeframe. Where I’ve seen adoption result in savings is where the right foundations are in place and the solution is simple enough that level 1-2 operators and application support staff can drill down without having to engage SMEs.


How analytics corrects your blind spots

Many enterprises I meet with are very much focused on improving their current monitoring. Which is good. But as important as monitoring is, you can often make bigger strides, and achieve faster ROI, by investing in an analytics solution.

I’ll give you an example of how analytics helps. We experienced a problem with a large Microsoft Exchange deployment. End users were affected, so it was clear something was going on. Yet the monitoring tools said everything was fine. In other cases the monitoring tools light up with alerts instead. Either way, the result is typically an expensive war room call with multiple teams trying to find out who is responsible. That is costly in itself, and it can also cost the business revenue, because an impaired critical system limits the company’s ability to operate effectively.

In this particular case, a single operator was able to quickly drill down to the root cause, a starved port, by using Operations Analytics to correlate metric, event, and log data. Finding that starved port through traditional methods would have taken hours or even days and consumed many resources; it would have been like searching for a needle in a haystack. With an analytics solution, the search took less than 30 minutes. This is a great example of how reducing your MTTR and increasing service availability can lead to a direct cost reduction in your operations, as fewer staff are required for firefighting. Leveraging predictive analytics patterns then allows organizations to take additional action to increase their service levels.
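The idea of correlating metric, event, and log data can be sketched in a few lines of Python: pull every record from every source that falls within a time window around the incident and put them on a single timeline. This is a deliberately simple illustration and not the method used by HPE Operations Analytics. The record layout, timestamps, and messages are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical records from three different sources, already parsed.
INCIDENT = datetime(2016, 3, 1, 10, 30)
RECORDS = [
    {"source": "metric", "time": datetime(2016, 3, 1, 10, 27),
     "detail": "port queue depth rising"},
    {"source": "log", "time": datetime(2016, 3, 1, 10, 29),
     "detail": "connection timeout"},
    {"source": "event", "time": datetime(2016, 3, 1, 9, 0),
     "detail": "nightly backup finished"},
    {"source": "event", "time": datetime(2016, 3, 1, 10, 31),
     "detail": "mail delivery delayed"},
]

def correlate(incident_time, records, window_minutes=5):
    """Return records within the window around the incident, oldest first."""
    window = timedelta(minutes=window_minutes)
    hits = [r for r in records if abs(r["time"] - incident_time) <= window]
    return sorted(hits, key=lambda r: r["time"])

for record in correlate(INCIDENT, RECORDS):
    print(record["time"].isoformat(), record["source"], record["detail"])
```

Seeing the metric, log line, and event side by side on one timeline is what lets a single operator follow the chain of symptoms without pulling each team's tool into a war room.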

Contact HPE Software Services to learn more about how we can help you with operations analytics. Or download this white paper: Analyzing machine data—the best way forward.



Gerben Verstraete works in the CTO office of HPE Software Services, with a focus on BSM, security, and the transformation of IT operations. Follow him on Twitter at @GerbenVerstraet or connect with Gerben on LinkedIn.




