Toward Distributed Quantum Simulation

Labs_Editorial · 11-18-2024

Guest Authors: Grace Johnson, Aniello Esposito, Xin Zhan, and Masoud Mohseni

Quantum computing has attracted significant interest in the HPC community as an important tool in the future of supercomputing. Quantum computers could provide significant speedups compared to their classical counterparts for a range of applications, with the largest potential impacts in materials design, drug discovery, and cryptography. Instead of replacing classical computers as general-purpose systems, quantum computers can be better understood as accelerators that efficiently carry out specialized tasks within an HPC framework. Hybrid quantum-classical algorithms and frameworks will be crucial not only in the near term—the noisy intermediate-scale quantum, or NISQ, era—but also for future fault-tolerant quantum computers. In the NISQ era, distributing computation across multiple quantum processing units (QPUs) will be necessary to study systems larger than the ~100-qubit processors available today. In the fault-tolerant era, parallelizing quantum error correction over several QPUs will also be necessary and perhaps even more efficient than using a single larger QPU.

To scale quantum computers and achieve practical high-performance quantum computing, efficiently partitioning and distributing quantum tasks is not only essential in the short term but may even be optimal in the long term. At Hewlett Packard Labs, we are tackling the problem by developing a family of methods for quantum workload distribution called adaptive circuit knitting. We are using the NVIDIA CUDA-Q platform to develop and test our methods, and we plan to scale quantum simulations to hundreds of qubits.

Distributing quantum computation is a highly non-trivial task. Quantum entanglement—long-range correlation between qubits that are not necessarily close in physical space—exists and is in fact fundamental to the power of quantum computing. But these quantum correlations can be difficult to characterize, and for a given problem it may not be known beforehand which quantum correlations are important to keep (i.e., which qubits should be located on the same QPU) and which are safe to ignore (a good place to partition). We need a method for partitioning quantum circuits that can adaptively adjust to changing entanglement patterns as the circuit evolves. A technique for partitioning quantum circuits called circuit knitting has recently emerged [1-3], and is of particular interest in HPC because it lends itself to classical parallelization. Figure 1 shows an example of cutting quantum gates between qubits. In circuit knitting, a measured observable is reconstructed by sampling sub-circuits of the original circuit multiple times. This result gives us insight into the behavior of the quantum system—the observables can be chosen to reveal, for example, properties of a molecule, chemical reaction, or quantum matter near a phase transition. Unfortunately, the observable reconstruction comes at an exponential post-processing cost, potentially robbing us of the advantages of the quantum computation. This overhead can be reduced by minimizing the number of cuts that produce the sub-circuits [3-6]. However, it would likely be even more efficient to learn a compact representation of the quantum circuit that has a built-in capability of capturing quantum entanglement, then learn optimal partitions on the fly in an adaptive fashion.
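To make the exponential cost concrete, here is a minimal sketch of how the number of required samples grows with the number of cuts. It assumes the simplest setting discussed in [3]: each cut gate is a CNOT-type gate, cuts are handled independently without classical communication, and each cut then multiplies the required shots by γ² = 9. The function names and the default γ = 3 are illustrative choices, not part of the method described in this post.

```python
import math


def sampling_overhead(num_cuts: int, gamma: float = 3.0) -> float:
    """Multiplicative increase in shots for `num_cuts` independent gate cuts.

    Assumption: each cut contributes a factor gamma**2 (gamma = 3 for a
    CNOT-type gate cut without classical communication).
    """
    if num_cuts < 0:
        raise ValueError("num_cuts must be non-negative")
    return gamma ** (2 * num_cuts)


def shots_required(base_shots: int, num_cuts: int, gamma: float = 3.0) -> int:
    """Shots needed to keep the accuracy that `base_shots` gives with no cuts."""
    return math.ceil(base_shots * sampling_overhead(num_cuts, gamma))


if __name__ == "__main__":
    for k in range(5):
        print(f"{k} cuts -> {shots_required(1000, k):,} shots")
```

Under these assumptions, three cuts already turn a 1,000-shot experiment into a 729,000-shot one, which is why both the number of cuts and their placement matter.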

FIGURE 1: Cutting gates in circuit knitting. After cutting, sub-circuits must be sampled many times to reconstruct (knit) the observable.

To address these challenges of building a distributed quantum computer, we introduce adaptive circuit knitting, a new variant of circuit knitting that decreases the sampling overhead by cutting gates in locations that minimize entanglement between partitions. Here we illustrate our adaptive circuit knitting strategy by constructing a compact representation from tensor network (TN) approaches developed in the quantum physics and quantum chemistry communities [7-8]. Tensor networks represent quantum states in a particular compressed form that can provide a useful structure for characterizing entanglement patterns. In the context of quantum circuits, a TN can be efficiently expressed as a circuit whose depth is linear in the bond dimension, the parameter that sets how much quantum correlation the TN retains. This linear scaling was shown by Lin et al. [9] for matrix product states (MPS), a type of TN widely used to study 1D quantum systems. In this context, QPUs can be seen as custom hardware that can accelerate approximate classical TN operations, in analogy to GPUs accelerating linear algebra operations. Combining the structure of TNs with linear-depth quantum circuits is the basis for our method. Figure 2 provides a schematic for adaptive circuit knitting.

FIGURE 2: Schematic of the adaptive circuit knitting method. In the inner loop, a variational optimizer finds circuit parameters U(θ) for partitions of a quantum system (based on a tensor network) in parallel. In the outer loop, an adaptive procedure finds cuts which minimize entanglement between partitions. After the best cuts are found, observables are reconstructed via circuit knitting.

We demonstrate the utility of our adaptive circuit knitting method by simulating the dynamics of quantum spin systems. Simulating quantum systems for materials science or quantum chemistry is perhaps the most promising application for quantum computers. Spin-lattice systems are well-studied in materials science, and despite their simplicity can exhibit complex quantum phenomena that are difficult to simulate classically [10-11]. As a prototype system, we apply our adaptive circuit knitting method to simulating the non-equilibrium dynamics of a strongly-disordered spin chain evolving under an Ising model with transverse and longitudinal fields, given by the Hamiltonian

$$H = -\sum_{i=1}^{N-1} J_{i,i+1}\, \sigma^z_i \sigma^z_{i+1} - \sum_{i=1}^{N} g_i\, \sigma^x_i - \sum_{i=1}^{N} h_i\, \sigma^z_i$$

where $\sigma^z$ and $\sigma^x$ are the Pauli Z and X matrices, respectively, $i$ indexes the lattice site, and the $J_{i,i+1}$, $g_i$, and $h_i$ are real-valued parameters. We study strongly-disordered systems (where parameters are varied at each lattice site) for several reasons: they arise naturally when representing discrete optimization and machine learning applications, they are important for understanding exotic states of matter, they can be difficult to study, and they lead to many-body localization effects that could be exploited for more efficient simulation.
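As an illustration of the Hamiltonian above, the following sketch builds it as a dense matrix for a small chain using NumPy. This is not the CUDA-Q implementation used for the results in this post: the helper names, the chain length, and the uniform disorder distributions are all assumptions chosen for the example, and a dense matrix is only practical for a handful of qubits.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])   # Pauli X
Z = np.array([[1.0, 0.0], [0.0, -1.0]])  # Pauli Z


def op_on_site(op, site, n):
    """Embed a single-site operator at 0-indexed `site` in an n-site chain."""
    out = np.array([[1.0]])
    for k in range(n):
        out = np.kron(out, op if k == site else I2)
    return out


def disordered_ising(J, g, h):
    """H = -sum_i J_i Z_i Z_{i+1} - sum_i g_i X_i - sum_i h_i Z_i (dense)."""
    n = len(g)
    assert len(h) == n and len(J) == n - 1
    H = np.zeros((2**n, 2**n))
    for i in range(n - 1):  # nearest-neighbour couplings
        H -= J[i] * (op_on_site(Z, i, n) @ op_on_site(Z, i + 1, n))
    for i in range(n):      # transverse and longitudinal fields
        H -= g[i] * op_on_site(X, i, n) + h[i] * op_on_site(Z, i, n)
    return H


# Hypothetical disorder: every parameter drawn independently per site.
rng = np.random.default_rng(0)
n = 6
J = rng.uniform(-1.0, 1.0, n - 1)
g = rng.uniform(-1.0, 1.0, n)
h = rng.uniform(-1.0, 1.0, n)
H = disordered_ising(J, g, h)
print(H.shape)
```

Drawing a fresh set of J, g, and h values for each chain is what produces an ensemble of disordered instances like the one studied below.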

FIGURE 3: (a) Example of a 20-qubit spin chain where the adaptive cut is chosen at the minimum entanglement entropy. (b) Histogram of sampling overheads resulting from adaptive and load-balanced baseline cuts for an ensemble of 32-qubit strongly disordered spin chains.

Figure 3 provides a summary of results for a disordered system: an ensemble of 32-qubit spin chains, each time-evolving under a Hamiltonian with different parameters. We partition each system into two sub-circuits and compare the overhead of circuit knitting for a cut randomly chosen near the middle of the chain (our 'load-balanced' baseline choice) vs. a cut recommended by the entropy heatmap from the adaptive algorithm. Figure 3a gives a schematic for a single instance on a smaller 20-qubit case. Figure 3b shows the distribution of overheads for reconstructing an observable for the adaptive and load-balanced (baseline) cuts. Both cuts are similarly accurate, but in most cases the adaptive cut results in a much lower overhead: the green distribution is clearly shifted to the left. On a case-by-case basis, the median reduction in cost was 15x, while the 75th and 95th percentiles were 59x and 450x, respectively.
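The idea of choosing the cut at the minimum entanglement entropy can be sketched for a small pure state as follows. This is a simplified stand-in, not the adaptive algorithm itself: it assumes the full state vector is available (feasible only for small chains, whereas the method in this post works from a tensor network representation), it uses the exact chain midpoint as the load-balanced baseline, and the function names are hypothetical.

```python
import numpy as np


def cut_entropies(state, n):
    """Von Neumann entropy (in bits) across each of the n-1 bonds of a pure state."""
    psi = np.asarray(state, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    entropies = []
    for cut in range(1, n):
        # Left block holds `cut` qubits, right block holds the remaining n - cut.
        s = np.linalg.svd(psi.reshape(2**cut, 2**(n - cut)), compute_uv=False)
        p = s**2                # Schmidt coefficients squared
        p = p[p > 1e-12]        # drop numerical zeros before taking the log
        entropies.append(float(-np.sum(p * np.log2(p))))
    return entropies


def adaptive_cut(state, n):
    """Return (number of qubits left of the lowest-entropy cut, all entropies)."""
    ent = cut_entropies(state, n)
    return int(np.argmin(ent)) + 1, ent


# Toy example: two Bell pairs, (q0, q1) and (q2, q3), with nothing linking them.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
state = np.kron(bell, bell)
best_cut, entropies = adaptive_cut(state, 4)
baseline_cut = 4 // 2
print(best_cut, baseline_cut, np.round(entropies, 6))
```

In this toy state the midpoint happens to coincide with the zero-entropy bond, so both choices agree. In a disordered chain the low-entropy bond generally sits elsewhere, which is where an adaptive choice can reduce the knitting overhead.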

To carry out these simulations on supercomputing systems at scale, we used the GPU-accelerated quantum circuit simulators available on the CUDA-Q platform. The cuStateVec backend enabled our highly computationally intensive simulations to be distributed across multiple NVIDIA GH200 Grace Hopper superchips. These superchips employ a breakthrough design for a high-bandwidth connection between the NVIDIA Grace CPU and NVIDIA Hopper GPU to help scientists and researchers scale complex applications. The resulting noiseless simulations are useful for prototyping, especially while access to high-quality QPUs remains limited. While the scale of these simulations is impressive, we still need to demonstrate the practical usefulness of these methods. Future work will include employing AI techniques for more sophisticated optimization, investigating fast entanglement measure evaluations, and studying 2D systems with higher-order TN techniques. The simulation of 2D quantum systems is where many classical methods underperform and quantum computers could likely provide the largest impact. With these advances, a high-performance implementation in CUDA-Q, access to supercomputers like Perlmutter [11], and higher-quality near-term quantum devices from experimental partners, it may be possible to simulate new physics beyond the reach of current classical tensor network methods.

While our work demonstrates the power of using CUDA-Q and classical HPC to simulate quantum systems—especially in the near-term as QPUs are limited in size and fidelity—it is only the first step. As we progress toward using quantum computers to address real-world problems, HPC will play an integral role in hybrid algorithms, quantum circuit synthesis, device control, and error correction in addition to distributing quantum workloads with methods like adaptive circuit knitting. As the systems integrator that has delivered the world’s highest performing (and most sustainable) classical supercomputers, HPE is eager to bring our expertise to the challenge of quantum-HPC integration at scale.

[1] S. Bravyi, G. Smith, and J. A. Smolin, “Trading classical and quantum computational resources,” Physical Review X, vol. 6, no. 2, p. 021043, 2016.

[2] T. Peng, A. W. Harrow, M. Ozols, and X. Wu, “Simulating large quantum circuits on a small quantum computer,” Physical Review Letters, vol. 125, no. 15, p. 150504, 2020.

[3] C. Piveteau and D. Sutter, “Circuit knitting with classical communication,” IEEE Transactions on Information Theory, 2023.

[4] W. Tang, T. Tomesh, M. Suchara, J. Larson, and M. Martonosi, “CutQC: Using small quantum computers for large quantum circuit evaluations,” in Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021, pp. 473–486.

[5] W. Tang and M. Martonosi, “ScaleQC: A scalable framework for hybrid computation on quantum and classical processors,” arXiv preprint arXiv:2207.00933, 2022.

[6] S. Basu, A. Das, A. Saha, A. Chakrabarti, and S. Sur-Kolay, “FragQC: An efficient quantum error reduction technique using quantum circuit fragmentation,” Journal of Systems and Software, vol. 214, p. 112085, 2024.

[7] S. R. White, “Density-matrix algorithms for quantum renormalization groups,” Physical Review B, vol. 48, no. 14, p. 10345, 1993.

[8] H.-D. Meyer, U. Manthe, and L. S. Cederbaum, “The multi-configurational time-dependent Hartree approach,” Chemical Physics Letters, vol. 165, no. 1, pp. 73–78, 1990.

[9] S.-H. Lin, R. Dilip, A. G. Green, A. Smith, and F. Pollmann, “Real- and imaginary-time evolution with compressed quantum circuits,” PRX Quantum, vol. 2, no. 1, p. 010342, 2021.

[10] M. Heyl, “Dynamical quantum phase transitions: a review,” Reports on Progress in Physics, vol. 81, no. 5, p. 054001, 2018.

[11] “HPC for simulating quantum circuits,” 2023. [Online]. Available: https://community.hpe.com/t5/advancing-life-work/hpc-for-simulating-quantum-circuits/ba-p/7200536
