How my supercomputer journey led me into the COVID-19 fight
Dr. Joe Landman shares his research history—and how HPE and the University of Alabama in Huntsville are now using supercomputers to shorten the time required to perform drug discovery research.
A few months ago, several of us at HPE started thinking about ways to help scientists who are researching COVID-19. It wasn't initially apparent that we could help, as HPE is a technology solutions company and hands-on scientific research in biochemistry is not something we usually do. But we know computational science and are good at building and helping our users make effective use of supercomputers. Could this help researchers? In short, yes, it could. To explain this, I want to share my research history.
In the early 1990s, I was working with a research group simulating defect formation and migration in gallium arsenide-based (GaAs) semiconductors. We were limited in the range of answers we could explore due to the technology available. We had workstations and supercomputers. The workstations were prevalent but slower for calculations. The supercomputers were faster but a highly contended-for resource, with high demand and long user queues.
We wanted to simulate a defect migrating over an energy barrier and to measure the height of that barrier. This was an important problem in building gallium arsenide-based CPUs.
Unfortunately, one calculation that we were interested in was simply out of reach from a computational standpoint when we started. Our code was too slow, the workstations too slow, and the systems were unable to model a big enough system due to memory constraints.
So, I worked on it.
With my code optimizations, including parallelization and vectorization, the available supercomputers allowed us to perform our calculations in about 1 CPU year.
The tools we used—Cray supercomputers and SGI multi-CPU workstations, combined with some smart physics and initial conditions—enabled us to reach further than we had been able to before. We computationally simulated something complicated and expanded our understanding of defect formation and migration in gallium arsenide.
We learned that a supercomputer is a tool that enables you to reach further—to explore far beyond what you can do on a desktop machine.
Of course, a lack of GaAs CPUs is not a crisis, but the COVID-19 pandemic certainly has been. Given the combination of virality and lethality of this disease, this is an urgent problem, and finding some way to mitigate what this virus does is needed—quickly.
So, what can we do with a supercomputer to assist with this?
Thanks to the partnership between Microsoft Azure and Cray (and now HPE), which provides real, significant supercomputing power to Azure customers, a Cray XC50 supercomputer named Sentinel is sitting in a Microsoft Azure data center. We have set up the ancillary systems and support infrastructure to enable researchers to access the system for COVID-19 research.
For COVID-19, Drs. Baudry and Byler at the University of Alabama in Huntsville are working with Oak Ridge National Lab to search for a set of small molecules that will bind to active sites in the SARS-CoV-2 virus proteins. These proteins are part of the molecular machinery that the virus uses to enter the cell and produce more copies of itself. The search for a set of small molecules that could interrupt this process is an important step in developing mitigations and therapies for COVID-19.
The research team had been able to process about 20,000 molecular dockings of ligands (small molecules) against a single protein target in a day (24 hours, or 86,400 seconds) on their existing system. Initial estimates suggested that many weeks to months would elapse before they would have actionable results to send to medicinal chemists for analysis and synthesis.
So, at HPE, we started working with Drs. Baudry and Byler.
I worked with Dr. Byler on getting their Autodock Vina program, input cases, data sets, and a run script. I added instrumentation to watch the Autodock Vina run and see where it spent the most time. I also explored its scalability, that is, how its run-time varied as a function of the number of CPUs that you use.
We started by optimizing the program to run as efficiently as possible on a small number of CPUs. On a single CPU, Autodock Vina completed in about 90 seconds. With ten CPUs, it was down to about 20 seconds. With some effort, I was able to update the source code to build with newer versions of its libraries and compilers.
These changes resulted in run times of about seven seconds on ten CPUs for the same program that took about 20 seconds with the older executable.
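The timings quoted above can be turned into speedup and parallel-efficiency figures with a few lines of arithmetic. The following is a minimal sketch using only the approximate numbers from the text; the function and variable names are illustrative and are not part of Autodock Vina or of the instrumentation described here.

```python
def speedup(t_before: float, t_after: float) -> float:
    """How many times faster the 'after' run is than the 'before' run."""
    return t_before / t_after


def parallel_efficiency(t_serial: float, t_parallel: float, n_cpus: int) -> float:
    """Fraction of ideal linear scaling achieved on n_cpus."""
    return speedup(t_serial, t_parallel) / n_cpus


# Approximate wall-clock times quoted in the text, in seconds.
T_1CPU_OLD = 90.0    # original executable, 1 CPU
T_10CPU_OLD = 20.0   # original executable, 10 CPUs
T_10CPU_NEW = 7.0    # rebuilt executable, 10 CPUs

scaling = speedup(T_1CPU_OLD, T_10CPU_OLD)
efficiency = parallel_efficiency(T_1CPU_OLD, T_10CPU_OLD, 10)
rebuild_gain = speedup(T_10CPU_OLD, T_10CPU_NEW)

print(f"1 -> 10 CPUs (old build): {scaling:.1f}x, efficiency {efficiency:.0%}")
print(f"old -> new build on 10 CPUs: {rebuild_gain:.1f}x")
```

The text does not give a single-CPU time for the rebuilt executable, so the sketch does not attempt an efficiency figure for the new build.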
Now that we had each instance of the program running efficiently, it was time to leverage Sentinel’s full power. Instead of running a single job per node in the supercomputer, we needed to run several and effectively fill up the supercomputer.
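The idea of packing several instances onto each node can be sketched as follows. This is only an illustration: the core count per node and the number of nodes are hypothetical placeholders (the text does not state Sentinel's configuration), and the round-robin split is one simple way to spread ligands across slots, not necessarily the scheme that was used.

```python
def instances_per_node(cores_per_node: int, cpus_per_instance: int) -> int:
    """How many program instances fit on one node without sharing cores."""
    return cores_per_node // cpus_per_instance


def split_round_robin(items, n_slots):
    """Deal items out to n_slots work lists, one at a time."""
    slots = [[] for _ in range(n_slots)]
    for i, item in enumerate(items):
        slots[i % n_slots].append(item)
    return slots


CORES_PER_NODE = 40     # hypothetical value, for illustration only
CPUS_PER_INSTANCE = 10  # each docking run used ten CPUs in the text
N_NODES = 4             # hypothetical value, for illustration only

per_node = instances_per_node(CORES_PER_NODE, CPUS_PER_INSTANCE)
total_slots = per_node * N_NODES

# Placeholder ligand identifiers standing in for the real input files.
ligands = [f"ligand_{i:05d}" for i in range(20_000)]
work = split_round_robin(ligands, total_slots)

print(f"{per_node} instances per node, {total_slots} slots in total")
print(f"{len(work[0])} ligands in the first slot")
```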
My initial attempt was unsatisfactory. On Sentinel, I was able to roughly double the throughput of the researchers' own machine, bringing the time to process 20,000 ligands down to about 11 hours (40,000 seconds), but I was running into configuration issues.
The team, led by Dr. Shenghua Li of HPE, reworked the queuing system and job scheduler, taking the time down from 40,000 seconds to around 4,000 seconds for 20,000 ligands docking against a protein.
I thought this was good. We had reduced the time to process 20,000 ligands to a little over an hour, but a few things were concerning me. First, each individual run appeared to be slow. I watched a job running on a node, and it was slow!
As it turns out, supercomputers offer a few different mechanisms to start jobs, and we needed to use a different one. Even though I had told the job to grab ten processors, the job start mechanism had restricted the program to run on a single processor! With a few final fixes, we were able to enqueue a full run of 20,000 ligand dockings against a protein.
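A mismatch like this, where a job asks for ten processors but is confined to one, can be detected from inside the process. The sketch below is one possible check, not the diagnosis used here: it relies on `os.sched_getaffinity`, which is available on Linux, and falls back to the machine's total CPU count elsewhere (which would not reveal an affinity restriction). The helper names are illustrative.

```python
import os


def usable_cpus() -> int:
    """Number of CPUs this process is actually allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: the affinity mask reflects restrictions set by a launcher.
        return len(os.sched_getaffinity(0))
    # Other platforms: total CPU count only, not the affinity mask.
    return os.cpu_count() or 1


def allocation_ok(requested: int) -> bool:
    """True if the process can use at least the requested number of CPUs."""
    return usable_cpus() >= requested


REQUESTED_CPUS = 10  # the number of processors requested in the text

if allocation_ok(REQUESTED_CPUS):
    print(f"OK: {usable_cpus()} CPUs usable")
else:
    print(f"Warning: requested {REQUESTED_CPUS} CPUs, "
          f"but only {usable_cpus()} usable")
```

Running a check like this at the start of each job makes a restricted launch visible immediately rather than through slow run times.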
The result: roughly 470 seconds. A bit better than the original 86,400 seconds (one day).
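To put the stages in perspective, the improvement over the one-day baseline can be computed directly from the approximate times quoted in the text for a batch of 20,000 ligands. The stage labels below are shorthand for this sketch only.

```python
BASELINE_S = 86_400  # one day on the researchers' existing system

# Approximate times to dock 20,000 ligands at each stage, in seconds.
stages = {
    "first attempt on Sentinel": 40_000,
    "reworked queuing and scheduling": 4_000,
    "corrected job launch": 470,
}

# Speedup of each stage relative to the one-day baseline.
speedups = {name: BASELINE_S / seconds for name, seconds in stages.items()}

for name, seconds in stages.items():
    print(f"{name}: {seconds:,} s, {speedups[name]:.1f}x faster than baseline")
```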
With these and several other changes, the research group can now perform roughly 1.2 million molecular dockings per day. They are leveraging Sentinel, the Cray XC50 supercomputer, to help find small molecules that might one day be turned into therapies for COVID-19.
In this way, our team at Cray/HPE helps researchers reach further and achieve better results faster on the urgent problems that face us all. And that is how my supercomputer journey led me into the COVID-19 fight.
For more information about the research contributing to the fight against COVID-19, follow Dr. Baudry’s journey.