Behind the scenes at Labs

Is Apache Spark the best candidate for a distributed deep learning platform?



By Curt Hopkins, Managing Editor, Hewlett Packard Labs

A tutorial created by Labs senior research engineer Alexander Ulanov is now available as part of O’Reilly’s Data Tools webcast series.

“Distributed deep learning on Spark” addresses the popular area of machine learning, but with a twist.

“Deep learning models that are used in practice for image classification and speech recognition contain a huge number of weights, require a lot of computations, and are trained with large datasets,” said Ulanov.

Training models with such complexity can take days – months even – on a single machine. Ulanov’s tutorial explores how to scale out the training using distributed computations and data processing.

Specifically, Ulanov looks at Apache Spark as a contender for such a distributed training platform. He gives an overview of the tools and frameworks that have been proposed for deep learning on Spark, compares them, and explores the limitations of distributed training itself.
