Protect Your Assets

Big Data Security Analytics Part 3: Data science & Putting Structure to the Problem

Kerry_Matre ‎05-08-2014 09:30 AM - edited ‎06-09-2015 11:28 AM

Part 1 and Part 2 of this series discussed the possibilities and realities of big data security analytics. With discussion come questions, so how do we answer them? Different types of security questions can be answered with different disciplines of data science:


  • Classification: Groups events into like sets to provide context.
  • Correlation: Recognizes real-time (HP ArcSight) and historical associations, providing context and relational understanding.
  • Clustering: Detects similarity between data points across large collections, a straightforward way to make sense of many events with confidence.
  • Affinity Grouping: Similar to clustering, but it takes into account the context of each data point as it pertains to users, systems, attacks and their interactions. It provides excellent context between multiple, seemingly disparate, data points.
  • Aggregation: Gives a high-level view of large amounts of data, distilling often complex sets into simple numerical quantities, e.g., did this bad event happen often enough in an hour to be of concern?
  • Statistical Analysis: Provides methods for dealing with uncertainty within the data sets, yielding a level of confidence in the conclusions drawn.
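To make the aggregation question concrete ("did this bad event happen often enough in an hour to be of concern?"), here is a minimal Python sketch. The event names, timestamps and the threshold of 3 are made up for illustration, and this is not how HP ArcSight or any particular product implements it; it only shows the idea of distilling many events into a simple count per hour.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(events):
    """Count events per (event_type, hour) bucket.

    `events` is an iterable of (timestamp, event_type) pairs.
    """
    counts = Counter()
    for ts, event_type in events:
        # Truncate the timestamp to the start of its hour.
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        counts[(event_type, bucket)] += 1
    return counts


def buckets_of_concern(events, threshold):
    """Return the (event_type, hour) buckets whose count meets the threshold."""
    return {key: n for key, n in hourly_counts(events).items() if n >= threshold}


# Hypothetical sample data: failed logins concentrated in one hour.
sample_events = [
    (datetime(2014, 5, 8, 9, 5), "failed_login"),
    (datetime(2014, 5, 8, 9, 17), "failed_login"),
    (datetime(2014, 5, 8, 9, 42), "failed_login"),
    (datetime(2014, 5, 8, 10, 3), "failed_login"),
    (datetime(2014, 5, 8, 9, 30), "port_scan"),
]

flagged = buckets_of_concern(sample_events, threshold=3)
for (event_type, hour), n in flagged.items():
    print(f"{event_type}: {n} events in the hour starting {hour:%Y-%m-%d %H:%M}")
```

With this sample data, only the three failed logins between 09:00 and 10:00 are flagged. Fixed clock-hour buckets are the simplest choice; a sliding window would also catch bursts that straddle an hour boundary.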


A “Why” versus “What” mapping will help organize the approach to security analytics. The “why” half of the mapping lays out the purpose of the inquiry. These purposes typically fall under detection, operations & analytics, and compliance. The “what” half of the mapping describes the data sources used in the analytics. These can include business systems, applications & databases, servers & desktops, network security appliances and various other sources.


[Figure: “Why” vs. “What” mapping]
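One way to picture the mapping is as a grid with purposes on one axis and data sources on the other. The Python sketch below uses the categories named in the text; the questions placed in the cells are hypothetical examples and are not taken from the original figure.

```python
# The "why" axis: purposes of the inquiry (from the text).
WHY = ["detection", "operations & analytics", "compliance"]

# The "what" axis: data sources (from the text).
WHAT = [
    "business systems",
    "applications & databases",
    "servers & desktops",
    "network security appliances",
]

# Hypothetical cell contents: one security question per (why, what) pair.
mapping = {
    ("detection", "servers & desktops"):
        "Which hosts show repeated failed logins?",
    ("compliance", "applications & databases"):
        "Who accessed regulated records this quarter?",
    ("operations & analytics", "network security appliances"):
        "Which firewall rules fire most often?",
}


def questions_for(why):
    """Return the questions recorded for one purpose, keyed by data source."""
    return {what: mapping[(why, what)] for what in WHAT if (why, what) in mapping}


def unmapped_cells():
    """List the (why, what) pairs that have no question yet."""
    return [(why, what) for why in WHY for what in WHAT if (why, what) not in mapping]


print(questions_for("detection"))
print(len(unmapped_cells()), "cells still need a question")
```

Listing the empty cells is a quick check on coverage: each one is either a question nobody has asked yet or a data source that is not needed for that purpose.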


This “Why” vs. “What” mapping is then turned into a use case taxonomy.  Below is a sample taxonomy for real-time correlation within a SIEM. It flows from purpose to deployment method, incorporating event context and an event threshold. The result is a defined action to be taken by the security analyst or an automated system.


[Figure: sample use case taxonomy for real-time correlation within a SIEM]
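The flow described above (purpose, deployment method, event context, event threshold, resulting action) can be sketched as a small data structure. This is an illustration only: the `UseCase` class, its field names and the sample values are assumptions, not the taxonomy from the original figure or any SIEM's rule format.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    purpose: str            # the "why", e.g. detection or compliance
    deployment_method: str  # how the use case is deployed, e.g. a correlation rule
    context: dict           # field values an event must carry to count
    threshold: int          # number of matching events that triggers the action
    action: str             # what the analyst or automated system should do

    def matches(self, event):
        """True if the event carries every field value listed in the context."""
        return all(event.get(key) == value for key, value in self.context.items())

    def evaluate(self, events):
        """Return the action if enough events match the context, else None."""
        hits = sum(1 for event in events if self.matches(event))
        return self.action if hits >= self.threshold else None


# Hypothetical use case: repeated failed logins against critical assets.
brute_force = UseCase(
    purpose="detection",
    deployment_method="real-time correlation rule",
    context={"type": "failed_login", "asset_class": "critical"},
    threshold=3,
    action="open a case for the security analyst",
)

# Hypothetical events; the third one does not match the context.
events = [
    {"type": "failed_login", "asset_class": "critical", "user": "alice"},
    {"type": "failed_login", "asset_class": "critical", "user": "alice"},
    {"type": "failed_login", "asset_class": "standard", "user": "bob"},
    {"type": "failed_login", "asset_class": "critical", "user": "alice"},
]

print(brute_force.evaluate(events))
```

For brevity the threshold here counts matches across whatever events are passed in. A real-time rule would normally also bound the count by a time window.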


This method of breaking questions down into categories, then mapping the “Why” vs. “What”, and finally determining use cases helps ensure that the results produced by the security analytics solution are fully used by the business and by its existing processes and procedures.


See how HP HAVEn can help answer your data security questions.


Check out part 4 of this series: Big Data Analytics Part 4: Visualization is Key
