AI can be scary. But choosing the wrong partners can be mortifying!

Choosing the right partners is critical so your AI solution won’t look like a Frankenstein! (To help make AI less frightening, it’s good to know that HPE and Intel have a long track record of success.)

As you continue to dive deeper into AI, you will discover it is more than just deep learning. AI spans an extremely complex set of machine learning, deep learning, reinforcement learning, and analytics algorithms with varying compute, storage, memory, and communications needs. AI models are shifting in complexity, and real-world deployments require not only training but also inference.

You can future-proof your data center with modernization investments that address the diverse requirements of a broad range of analytics workloads, including AI. A modern infrastructure built on industry-standard hardware can also help maximize utilization to achieve your TCO objectives and eliminate the complexity introduced by new architectures.

With this in mind, HPE and Intel have partnered to broaden the solution portfolio, because we know this shift is occurring and the need for a range of choices and solutions is pressing.

Accelerate AI applications

Just as you consider which superhero costume you want to wear on Halloween, you have to explore the right hardware for your deep learning training. You must consider how often you need to train, what type of data you have (structured, unstructured, type of image, voice, text, etc.), and how much time you can tolerate between each run.

For example, some accelerators work well for tasks like image recognition and were once the only option for accelerating deep learning training. However, for memory-intensive models (including massive amounts of unstructured data), sparse data, and infrequent (for example, annual) training runs, CPUs perform well.

When supported on the HPE Apollo and HPE ProLiant families of servers, Intel® Xeon® Scalable processors benefit from substantial improvements in software optimizations and hardware instructions, so more complex, hybrid applications can be accelerated, including larger, memory-intensive models. Deep learning applications can run alongside other applications on the same analytics infrastructure for higher overall utilization. On premises and/or in the cloud, AI can be done well on the architecture you already know. Upgrading to Intel® Xeon® Scalable processors in your data center lets you maximize utilization of existing, familiar infrastructure by running high-performance data center and AI applications side by side.
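To illustrate what running side by side can look like in practice, here is a minimal sketch that caps a TensorFlow job's CPU thread usage so it shares a Xeon server with other analytics workloads. It assumes a TensorFlow build that includes Intel's MKL-DNN (oneDNN) optimizations; the thread counts are illustrative placeholders, not tuned recommendations.

```python
# Minimal sketch: cap TensorFlow's CPU thread usage so a deep learning job can
# share a Xeon server with other analytics workloads. Assumes a TensorFlow
# build with Intel MKL-DNN (oneDNN) optimizations; thread counts are examples.
import os

# OpenMP/MKL-DNN knobs commonly tuned on Xeon: pin threads and limit the pool.
os.environ.setdefault("OMP_NUM_THREADS", "8")        # cores given to this job
os.environ.setdefault("KMP_AFFINITY", "granularity=fine,compact,1,0")
os.environ.setdefault("KMP_BLOCKTIME", "1")

import tensorflow as tf

# Keep TensorFlow's own thread pools within the same budget so the remaining
# cores stay available to the other applications on the box.
tf.config.threading.set_intra_op_parallelism_threads(8)
tf.config.threading.set_inter_op_parallelism_threads(2)

# A small matmul just to confirm the configured CPU device executes work.
with tf.device("/CPU:0"):
    x = tf.random.normal([2048, 2048])
    y = tf.linalg.matmul(x, x)
print("Result norm:", float(tf.norm(y)))
```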

In addition to CPUs, HPE will support Field Programmable Gate Arrays (FPGAs). Complementary to CPUs, FPGAs enable acceleration of specific workloads, such as database acceleration, financial back-testing of trading algorithms, and Big Data processing. FPGAs offer significantly reduced power usage, increased speed, lower materials cost, minimal implementation real estate, and the ability to be reconfigured on the fly to run different algorithms in real time. Be on the lookout for HPE’s announcement at SC18 and the availability of next-generation Intel Arria FPGAs supported on select HPE ProLiant Gen10 servers.

Scale with a high-speed interconnect

With trick-or-treating, the bigger the bag, the more candy you can carry. The same applies to the high-speed interconnect: it is critical for scaling and for pushing data to the servers, CPUs, and FPGAs that crunch data for deep learning algorithms. When AI systems grow and scale up, as they often do, the fabric that stitches the system together must be able to grow seamlessly too, maintaining its speed, security, agility, versatility, and robustness throughout.

Intel® Omni-Path Architecture (Intel® OPA) is a high-speed interconnect originally developed for high-performance computing (HPC) clusters; the efficiency and speed it brings to that domain also improve scalability and increase density, while reducing latency, cost, and power, on the frontiers of AI. Moreover, clusters built with OPA can occupy a versatile niche, running HPC workloads during the day and compute-intensive deep learning training workloads at night.
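As a concrete illustration of the communication pattern a fabric like Intel OPA accelerates, here is a minimal sketch of an MPI allreduce of gradient buffers across training nodes, the step that dominates network traffic in distributed deep learning. It assumes mpi4py and NumPy are installed and an MPI launcher is available; the buffer size and "gradient" contents are placeholders.

```python
# Minimal sketch of the collective that a high-speed fabric accelerates:
# an MPI allreduce of gradient buffers across training ranks.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Each rank computes local "gradients"; here we simply fabricate them.
local_grads = np.full(1_000_000, float(rank), dtype=np.float32)

# Sum across all ranks, then average. This step is bandwidth- and
# latency-bound, which is where the interconnect matters most.
global_grads = np.empty_like(local_grads)
comm.Allreduce(local_grads, global_grads, op=MPI.SUM)
global_grads /= size

if rank == 0:
    print(f"Averaged gradients across {size} ranks; first value = {global_grads[0]}")
```

Launched with an MPI runner (for example, mpirun -np 4 python allreduce_sketch.py), each rank contributes its local buffer and receives the averaged result, the same pattern used when scaling training across many servers over a fabric.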

Because the HPE Apollo Systems, HPE SGI 8600, and HPE ProLiant servers interface seamlessly with the Intel OPA fabric, you now have an interconnect solution that spans entry-level clusters all the way to supercomputers. With highly power-efficient and price-optimized solutions, HPE and Intel OPA can meet the needs of HPC and AI customers. Whether you are seeking an entry-level, air-cooled rack-scale system or a high-end liquid-cooled system, HPE has the HPC and AI solution for you, with Intel OPA optimized across the portfolio.

Open software, libraries, and tools to speed deployment and optimize performance

Frameworks and libraries are of the utmost importance in moving AI forward. Application developers need software tools that are easy to use, speed up the workflow, and come with ecosystem support that helps them through the rough patches. Intel’s open-source deep learning optimization library, MKL-DNN, is integrated with popular deep learning frameworks like TensorFlow* and MXNet* and continues to deliver more optimizations and performance as the software evolves along with the AI landscape. Optimizations across hardware and software have dramatically extended the capabilities of Intel® Xeon® Scalable platforms for deep learning, already resulting in more than 240x performance gains for training and nearly 280x for inference across many popular frameworks[i].
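For a sense of how gains like these are measured, below is a rough sketch of timing CPU inference throughput in images per second using TensorFlow with Keras. The model (MobileNetV2), batch size, and run count are arbitrary choices for illustration, not the configurations behind the figures cited above.

```python
# Minimal sketch: measure CPU inference throughput (images/sec), the metric
# behind framework-level performance comparisons. Model and batch size are
# arbitrary illustrations, not a reproduction of the cited configurations.
import time
import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights=None)  # random weights suffice for timing
batch = np.random.rand(32, 224, 224, 3).astype("float32")

model.predict(batch, verbose=0)            # warm-up run to exclude graph build time

runs = 10
start = time.time()
for _ in range(runs):
    model.predict(batch, verbose=0)
elapsed = time.time() - start

print(f"Throughput: {runs * batch.shape[0] / elapsed:.1f} images/sec")
```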

HPE has validated Intel’s MKL-DNN-optimized open-source frameworks (TensorFlow, MXNet, and OpenVINO) on HPE Apollo and ProLiant systems, and they can be downloaded directly from Intel’s sites.

HPE and Intel: Better together

No one wants the surprise of a yucky apple in their trick-or-treat bag! That is why HPE and Intel have partnered to help remove the surprises from your next AI project with the release of new Intel CPU-based AI Inference Bundles from HPE[ii]. These AI solutions are based on HPE ProLiant DL360 Gen10 compute platforms, Intel® Xeon® Scalable processors, and the Intel® OPA fabric, along with Intel’s optional, downloadable MKL-DNN-optimized open-source frameworks (TensorFlow, MXNet, and OpenVINO).
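As a small taste of the inference workloads these bundles target, here is a minimal sketch of synchronous CPU inference with OpenVINO's Python runtime. Note the API shown is the current openvino package, which postdates the toolkit generation discussed here, and "model.xml" is a placeholder for a model already converted to OpenVINO IR format.

```python
# Minimal sketch of synchronous CPU inference with the OpenVINO Python runtime.
# "model.xml" is a placeholder for a model converted to OpenVINO IR format.
import numpy as np
import openvino as ov

core = ov.Core()
compiled = core.compile_model("model.xml", "CPU")   # load the IR model onto the Xeon CPU

# Build a dummy input matching the model's first (static) input shape.
input_port = compiled.input(0)
dummy = np.random.rand(*input_port.shape).astype(np.float32)

results = compiled([dummy])                         # run one synchronous inference
output = results[compiled.output(0)]
print("Top class index:", int(np.argmax(output)))
```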

So whether you are just getting started or are looking to scale, HPE and Intel can help ensure your AI project is not bewitched. Let us take the fear out of your next AI project! We have something that will fit your needs.

For additional info, please reach out to your HPC/AI specialist or contact AI_MadeEasy@hpe.com and someone will be in touch with you.

Learn more at Supercomputing 18

If you happen to be in Dallas, Texas on November 12-15, be sure to stop by HPE booth #2429 at SC18, where we will be showcasing all of these Intel technologies and how HPE and Intel can help you on your next AI or HPC project.


[i] Intel performance results are based on testing as of 06/15/2018 (v3 baseline), 05/29/2018 (241x), and 6/07/2018 (277x) and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit: http://www.intel.com/performance.

Configuration for inference throughput:

Tested by Intel as of 6/7/2018: Platform: 2-socket Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz / 28 cores, HT ON, Turbo ON, total memory 376.28 GB (12 slots / 32 GB / 2666 MHz), 4 instances of the framework, CentOS Linux-7.3.1611-Core, SSD sda RS3WC080 HDD 744.1GB, sdb RS3WC080 HDD 1.5TB, sdc RS3WC080 HDD 5.5TB. Deep learning framework: Caffe version a3d5b022fe026e9092fc7abc7654b1162ab9940d. Topology: GoogleNet v1. BIOS: SE5C620.86B.00.01.0004.071220170215. MKL-DNN version: 464c268e544bae26f9b85a2acb9122c766a4c396. NoDataLayer. Measured: 1449 imgs/sec. vs. Tested by Intel as of 06/15/2018: Platform: 2S Intel® Xeon® CPU E5-2699 v3 @ 2.30GHz (18 cores), HT enabled, Turbo disabled, scaling governor set to “performance” via intel_pstate driver, 64GB DDR4-2133 ECC RAM. BIOS: SE5C610.86B.01.01.0024.021320181901, CentOS Linux-7.5.1804 (Core), kernel 3.10.0-862.3.2.el7.x86_64, SSD sdb INTEL SSDSC2BW24 SSD 223.6GB. Framework: BVLC Caffe (https://github.com/BVLC/caffe), revision 2a1c552b66f026c7508d390b526f2495ed3be594. Inference and training measured with the “caffe time” command. For “ConvNet” topologies, a dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training.

Configuration for training throughput:

Tested by Intel as of 05/29/2018: Platform: 2-socket Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz / 28 cores, HT ON, Turbo ON, total memory 376.28 GB (12 slots / 32 GB / 2666 MHz), 4 instances of the framework, CentOS Linux-7.3.1611-Core, SSD sda RS3WC080 HDD 744.1GB, sdb RS3WC080 HDD 1.5TB, sdc RS3WC080 HDD 5.5TB. Deep learning framework: Caffe version a3d5b022fe026e9092fc7abc7654b1162ab9940d. Topology: AlexNet. BIOS: SE5C620.86B.00.01.0004.071220170215. MKL-DNN version: 464c268e544bae26f9b85a2acb9122c766a4c396. NoDataLayer. Measured: 1257 imgs/sec. vs. Tested by Intel as of 06/15/2018: Platform: 2S Intel® Xeon® CPU E5-2699 v3 @ 2.30GHz (18 cores), HT enabled, Turbo disabled, scaling governor set to “performance” via intel_pstate driver, 64GB DDR4-2133 ECC RAM. BIOS: SE5C610.86B.01.01.0024.021320181901, CentOS Linux-7.5.1804 (Core), kernel 3.10.0-862.3.2.el7.x86_64, SSD sdb INTEL SSDSC2BW24 SSD 223.6GB. Framework: BVLC Caffe (https://github.com/BVLC/caffe), revision 2a1c552b66f026c7508d390b526f2495ed3be594. Inference and training measured with the “caffe time” command. For “ConvNet” topologies, a dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training.

[ii] Note: AI Inference promotional bundles are currently available in North America only.



Pankaj Goyal
VP Artificial Intelligence & Strategy/Operations,
Hewlett Packard Enterprise

Twitter: @pango
LinkedIn: goyalpankaj

About the Author


Pankaj is building HPE’s Artificial Intelligence business. He is excited by the potential of AI to improve our lives, and believes HPE has a huge role to play. In his past life, he has been a computer science engineer, an entrepreneur, and a strategy consultant. Reach out to him to discuss everything AI @HPE.