Behind the scenes at Labs

An Oral History of The Machine—Chapter Six: Moving from imperative to declarative

Joan Varvenne and Herry Herry discussing Loom

Chapter Six: Moving from imperative to declarative

By Curt Hopkins, Managing Editor, Hewlett Packard Labs

The Machine, Hewlett Packard Labs’ capstone project for a new computing architecture, was not cut from whole cloth. It was the pulling together of a loom’s worth of thread. More than that, that loom itself had been assembled years earlier.

We spoke to several dozen researchers – programmers, architects, open source advocates, optical scientists, and others – to construct an oral history of the years-long process of innovating the first real change in computing in 70 years.

These men and women are not only scientists, they are also compelling story tellers with an exciting history to relate. If you’re interested in how differently we will be gathering, storing, processing, retrieving, and applying information in the near future, or you just enjoy good stories about science and discovery, read on.

This chapter delves into security and manageability and their interdependence, and provides a blueprint for further development.

If you would like to read other entries in the series, click here

JULIO GUIJARRO

Research Director, Manageability and Cloud

How do you tame the complexity of a system so you need less human effort?

The Machine is very complex. Our original target was to be able to scale to 1 million nodes. The number of people you would need to manage The Machine at that size would be more than you could handle; it wouldn’t be cost effective. The number of things that need to be managed in the world is growing, but the number of people who can deal with them is shrinking. We realized we were going to need a fully automated system to help those people achieve their tasks.

We focused on this before the industry focused on it. We saw more and more hardware and more complex systems coming to the fore, and recognized we were seeing too many things for one person to deal with. So we asked, how do we make it simple and resilient?

MARTIN SADLER

VP, Director, Security and Manageability Lab

When Martin Fink came on board at Labs, he had this problem in mind and asked us, as hardware came into focus, to solve it, which we did with the change from the imperative to the declarative model.

JULIO GUIJARRO

Research Director, Manageability and Cloud

In declarative management, you tell the device what you want, not how to do it. Normal systems are imperative: you tell them what to do, what steps to follow. Most systems are also not stateful and idempotent (where you can ask for something 20 times and always get the same result, not 20 different results).

Think about the remote control on your stereo: in most cases, if you click it once, it turns on; click it twice, it turns off. That’s imperative: you tell the machine what to do at each step. Most people focus on workflow to simplify things. In contrast, with declarative management, we can tell the system, “I want 40 nodes up and running.” That’s all you say, and the system figures out how to do it. You don’t have to lead it by the hand at each stage. The Machine self-corrects after any failure.
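To make the contrast concrete, here is a minimal sketch in Python of the reconciliation idea Guijarro describes: you declare a goal such as “40 nodes up and running,” and a loop keeps converging the observed state toward it. The names here (Node, observe, reconcile) are hypothetical illustrations, not The Machine’s actual management interfaces.

```python
# A minimal sketch (not The Machine's management code) of declarative
# management: state the desired outcome, and a reconciliation loop
# converges toward it. All names here are hypothetical.
import random
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    running: bool = False


def observe(nodes):
    """Return the nodes currently running (the observed state)."""
    return [n for n in nodes if n.running]


def reconcile(nodes, desired_running):
    """Drive the system toward the declared state, whatever the current state is."""
    running = observe(nodes)
    for node in nodes:
        if len(running) >= desired_running:
            break
        if not node.running:
            node.running = True          # "boot" the node
            running.append(node)
    # Idempotent: running this again with the same goal changes nothing
    # once the goal is met, and it repairs the system after any failure.


# Declarative request: "I want 40 nodes up and running."
cluster = [Node(i) for i in range(100)]
reconcile(cluster, desired_running=40)
assert len(observe(cluster)) == 40

# Simulate failures; the same declaration self-corrects the system.
for node in random.sample(observe(cluster), 5):
    node.running = False
reconcile(cluster, desired_running=40)
assert len(observe(cluster)) == 40
```

Repeating the same declaration after a failure is what makes the request idempotent: the same goal always lands the system in the same state.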

RYCHARDE HAWKES

Distinguished Technologist, Visualization Lead

Because there is so much data with a system like The Machine, you need a new way of interfacing with it. We focused on visualization techniques. Behind the scenes, you have big data, but if we find the right visualization, you don’t need a PhD in data science to run it. We developed Loom, which creates a graph of the system’s interconnected elements. The human mind can deal with aggregated elements, and Loom uses that to let users organize information in a way that makes sense to them.

Rycharde Hawkes on Loom
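The aggregation idea Hawkes describes can be sketched in a few lines: model the system as elements with attributes, then roll them up into groups a person can take in at a glance. This is an illustrative sketch only; the element names and the aggregate function are invented here and are not Loom’s API.

```python
# A sketch of the aggregation idea behind Loom (not Loom's actual code):
# represent the system's elements, then collapse them into aggregates
# that a person can reason about instead of a million individual dots.
from collections import defaultdict

# Hypothetical inventory of managed elements and their attributes.
elements = {
    "node-001": {"type": "node",   "rack": "rack-1", "alerts": 0},
    "node-002": {"type": "node",   "rack": "rack-1", "alerts": 2},
    "node-003": {"type": "node",   "rack": "rack-2", "alerts": 0},
    "fabric-a": {"type": "fabric", "rack": "rack-1", "alerts": 1},
}


def aggregate(elements, group_by):
    """Collapse individual elements into per-group summaries (e.g. per rack)."""
    groups = defaultdict(lambda: {"count": 0, "alerts": 0})
    for attrs in elements.values():
        group = groups[attrs[group_by]]
        group["count"] += 1
        group["alerts"] += attrs["alerts"]
    return dict(groups)


# The operator sees a handful of aggregates and drills into the
# interesting ones, rather than every element at once.
print(aggregate(elements, group_by="rack"))
# {'rack-1': {'count': 3, 'alerts': 3}, 'rack-2': {'count': 1, 'alerts': 0}}
```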

MARTIN SADLER

VP, Director, Security and Manageability Lab

In detective shows on television, you see the detectives organizing information on a case – evidence, locations, suspects – with photos and string, essentially pointing out which objects are interesting. Loom is like an automated assistant for that kind of process.

JULIO GUIJARRO

Research Director, Manageability and Cloud

Labs has a long history of large-scale management. We started working with telecoms before moving into utility computing in 2001, then into cloud before OpenStack, and we had to manage it all along the way. We had to maintain deployments efficiently, as we were a small team. Our third attempt produced the Loom design language. As I mentioned, we had a one million node system as a goal, and nothing could manage that density. You can’t put a million things on a screen. We knew that Memory-Driven Computing was good for graphs, so we applied the same techniques, using the problem to solve the problem. Now we could process the data in near real time.

Protect, detect, recover

FRASER DICKIN

Senior Research Engineer, Security and Manageability

In an increasingly hostile world, Martin Fink recognized an opportunity to design security in from the outset. For most people, security is largely an afterthought.

In the early stages of The Machine program, we thought we would create our own operating system. We soon realized that was too big a task for an organization our size to bite into, and that we could adapt Linux to our needs. So we had to ask ourselves how we would change Linux to create, among other things, the recovery elements we planned. We’ve gotten a lot of traction in splitting the kernel and in understanding how you provide controls to identify bits that need replacing.

RICHARD BROWN

Senior Research Manager, Security and Manageability

The system we developed envisaged the following major elements: hardware-validated boot flow, runtime kernel integrity checking, recovery, and data protection.

FRASER DICKIN

Senior Research Engineer, Security and Manageability

The Machine security system detects deviations from a “known good state” and protects data, both in non-volatile memory (NVM) and while it is communicated across the infrastructure. Error signals propagate back to the node manager, which communicates them to the manageability console. The node is flagged as not having loaded properly, which triggers a recovery mechanism. Once booted, runtime kernel integrity checking constantly monitors the code executing on a node, triggering an alert if malicious code is detected. On today’s systems, sophisticated malware can cheat the OS, but on The Machine we run kernel integrity checking on a separate, secure part of the chip, which is never at risk.
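A conceptual sketch of the runtime check Dickin describes follows. The real mechanism runs on a separate, secure part of the chip; this Python fragment only illustrates comparing a fresh measurement of a monitored region against a stored “known good” value and raising an alert on mismatch. All names are hypothetical.

```python
# Conceptual sketch only: the real check runs on dedicated secure hardware,
# not in user-space Python. It illustrates measuring a monitored region,
# comparing it to a "known good state", and signaling a deviation.
import hashlib

# Hypothetical known-good measurement, e.g. recorded during a
# hardware-validated boot flow.
KNOWN_GOOD = hashlib.sha256(b"kernel text segment contents").hexdigest()


def measure(region: bytes) -> str:
    """Hash the monitored region (a stand-in for measuring kernel code pages)."""
    return hashlib.sha256(region).hexdigest()


def monitor(region: bytes) -> None:
    """Raise an alert if the region has deviated from the known good state."""
    if measure(region) != KNOWN_GOOD:
        # In the description above, this signal goes to the node manager,
        # which reports to the manageability console and triggers recovery.
        raise RuntimeError("integrity violation: deviation from known good state")


monitor(b"kernel text segment contents")       # passes silently
try:
    monitor(b"kernel text segment + implant")  # tampered region
except RuntimeError as err:
    print(err)
```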

Because The Machine uses NVM, the data is effectively always on, so we built a bridge firewall that connects one device to another and knows who is allowed to access what. We also had to implement encryption and decryption at memory speed, so that if someone pulls a memory module out of the rack, they will not be able to read it.
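On The Machine this encryption happens in hardware at memory speed. As a purely software illustration of the idea, the sketch below uses the third-party cryptography package and AES-GCM (an assumption for the sake of the example, not The Machine’s documented cipher) to encrypt data before it is “written” to persistent memory, so a module removed from the rack is unreadable without the key.

```python
# Software illustration only: The Machine does this in hardware at memory
# speed, and this sketch says nothing about its actual cipher or key
# handling. It shows data being encrypted before it lands in persistent
# memory, so a pulled module is unreadable without the key.
# Requires the third-party "cryptography" package (pip install cryptography).
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # held by the controller, never stored in NVM
aesgcm = AESGCM(key)

plaintext = b"application data destined for non-volatile memory"
nonce = os.urandom(12)

# What actually gets written to the NVM module: nonce plus ciphertext only.
stored = nonce + aesgcm.encrypt(nonce, plaintext, associated_data=None)

# With the key (i.e., inside the system), the data decrypts transparently.
recovered = aesgcm.decrypt(stored[:12], stored[12:], associated_data=None)
assert recovered == plaintext
```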

RICHARD BROWN

Senior Research Manager, Security and Manageability

A lot of these techniques can be reapplied to other types of systems. Mechanisms like kernel inspection could be done on today’s computers, but may require some kind of hardware support. The problem with most of today’s computers is that their architecture is 20 or 30 years old. Nobody was talking about security at that time, so it wasn’t built in.

To read other entries in the series, click here

About the Author

Curt Hopkins

Managing Editor, Hewlett Packard Labs

Comments
Mahmoud Ibrahim

 Where can I download the software?

martina_trucco

Hi @Mahmoud Ibrahim - thanks for reading! You can view all the currently available open source packages here: https://www.labs.hpe.com/the-machine/the-machine-distribution

We'll continue to add more as it becomes available.

Thanks!

Martina from Hewlett Packard Labs