
Labs researcher wins ICDE Ten-Year Influential Paper Award


 By Curt Hopkins, Managing Editor, Hewlett Packard Labs

The 35th IEEE International Conference on Data Engineering (ICDE) has given its Ten-Year Influential Paper Award to Labs principal research scientist Harumi Kuno and her co-authors for their 2009 paper, “Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning.”

The award is meant to highlight milestones in computer research whose influence has proven both prescient and enduring.

Kuno and her co-authors – Archana Ganapathi, Armando Fox, Michael Jordan, and David Patterson of UC Berkeley's RAD Lab, and Umeshwar Dayal and Janet L. Wiener of Hewlett-Packard Labs – developed a process that used machine learning to predict the performance characteristics (resource usage and runtimes) of queries in mixed workloads (both transactional and decision support) on large-scale parallel databases.

“We have developed a system that uses machine learning to accurately predict the performance metrics of database queries whose execution times range from milliseconds to hours,” wrote the authors in their abstract.

Every database vendor struggles with managing unexpectedly long-running queries, and knowing a query's cost in advance helps in two ways. First, when long-running queries can be identified before they start, they can be rejected or scheduled for a time when they will not cause extreme resource contention for the other queries in the system. Second, deciding whether a system can complete a given workload in a given time period (or whether a bigger system is necessary) depends on knowing the resource requirements of the queries in that workload.
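To make the first of these uses concrete, here is a minimal sketch of prediction-driven admission control. It is not the method from the paper: the predictor is a plain nearest-neighbor average chosen only for illustration, and the feature tuple, the `PastQuery` type, the sample history, and the 600-second limit are all hypothetical.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple


@dataclass
class PastQuery:
    # Hypothetical plan features, e.g. (estimated rows scanned, number of joins).
    features: Tuple[float, ...]
    # Runtime measured when the query actually ran, in seconds.
    runtime_s: float


def predict_runtime(history: List[PastQuery], features: Tuple[float, ...], k: int = 3) -> float:
    """Average the runtimes of the k past queries whose features are closest."""
    nearest = sorted(history, key=lambda q: dist(q.features, features))[:k]
    return sum(q.runtime_s for q in nearest) / len(nearest)


def admit(history: List[PastQuery], features: Tuple[float, ...], limit_s: float = 600.0) -> str:
    """Return 'run' if the predicted runtime is within the limit, else 'defer'."""
    return "run" if predict_runtime(history, features) <= limit_s else "defer"


# Made-up history: three short queries and three long-running ones.
HISTORY = [
    PastQuery((1e3, 1), 0.2),
    PastQuery((2e3, 1), 0.5),
    PastQuery((5e3, 2), 1.1),
    PastQuery((1e7, 6), 3600.0),
    PastQuery((2e7, 7), 5400.0),
    PastQuery((3e7, 8), 9000.0),
]

print(admit(HISTORY, (1.5e3, 1)))   # resembles the short queries
print(admit(HISTORY, (2.5e7, 7)))   # resembles the long-running queries
```

The features here are left unscaled, so the row estimate dominates the distance; a real predictor would need normalized features and far more training data.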

“What was novel,” says Kuno, “was that this was one of the first attempts to use machine learning to make a complicated system work better. It was a very early effort to leverage machine learning to model how to set the knobs on a complex software system running on special hardware!”

Today, machine learning is one of the most publicly discussed and popular innovations in computing, used for classification, prediction, and optimization across subject areas and device types. But when the system was developed, none of this was obvious. The researchers identified a likely development and walked that development forward systematically.

The initial reaction in 2009 to the paper’s presentation was, as Kuno says, full of “hot questioning and scepticism.” But time has proven the authors’ approach not just valid but essential.

The main change Kuno says she would make, were time travel an option, is to make the data engineering behind the system – from the generation of training data to the development and application of the models – more accessible. It was, she says, “not easy for others to pick up our system and run with it.”

Given that building training data remains among the most problematic aspects of machine learning to this day, that hardly dims the light the IEEE International Conference on Data Engineering has recognized in this paper, the system it describes, and the minds behind it.

Photo by Isis França on Unsplash

