Path: blob/main/chemistry-simulations/qmmm.ipynb
1145 views
QM/MM Hybrid Workflows - Combining VQE with a Polarizable Embedding Framework
This lab presents a hybrid workflow called QM/MM commonly used by computational chemists to tackle challenging multi-scale problems such as chemical reaction within solution or drug-protein binding. You will learn how CUDA-Q can be used to combine quantum computing methods like VQE with classical polarizable forcefields into a single flexible workflow for QM/MM problems.
What you will do:
Learn the concepts behind QM/MM
Learn how to compute one-particle reduced density matrices with CUDA-Q
Implement a QM/MM workflow using VQE for the QM part.
Run VQE QM/MM with an active space to study a larger molecule.
Combine ADAPT-VQE with QM/MM to improve the workflow.
CUDA-Q syntax you will use
cudaq.spinopcudaq.observeexpectation()
Prerequisites: This lab assumes learners have a basic understanding of chemical terminology and quantum computing concepts like VQE, expectation values, and qubit Hamiltonians as well as some experience working with CUDA-Q. More advanced concepts will be introduced along the way and learners will be directed to additional resources where they can dive deeper. The final exercise assumes users have completed the ADAPT-VQE CUDA-Q academic lab. If you have not done this, you can come back and finish that part later.
Run the cell below to load the packages you will be using. You may need to restart the kernel after running. Note that you may need to run apt-get update followed by apt-get install python3-dev and apt-get install build-essential to install the prerequistes needed for the CPPE library.
An overview of QM/MM
QM/MM is a multi-scale technique most useful in situations where the key properties of a chemical reaction are dominated by the interactions of a small number of atoms within an environment of many atoms. Ideally, one would simulate the entire environment with high accuracy methods. However, this requires simulation of highly complex quantum systems which quickly become too complex for even the best supercomputers.
One could choose to ignore the environment, but it does impact the results and can lead to qualitatively wrong predictions. A reasonable approximation is to split the problem into a high accuracy QM (quantum mechanics) region within a low accuracy MM (molecular mechanics/classical) region. This compromise can often result in more reliable predictions with a fraction of the compute resources.
The QM part is often simulated with something like DFT, Hartree-Fock, Coupled Cluster, etc, while the MM part is described by classical molecular mechanics using force fields. However, this depends on the specific scales in question.
Common situations for QM/MM are a drug binding to a large protein or a chemical reaction occurring in solution. The image below shows an example from "A Projector-Embedding Approach for Multiscale Coupled-Cluster Calculations Applied to Citrate Synthase" where the authors studied the first step of the citric acid cycle using high-accuracy QM simulations within the environment of an enzyme modeled with lower quality methods.

The trickiest part of QM/MM is determining how the two regions interact. All schemes introduce some sort of error. For example, one approach is called subtractive QM/MM where the entire system is computed with MM methods, then the small region alone is computed with MM methods and subtracted, finally the small region is computed with QM and added back in.
The problem here is that both the MM and QM computations on the small region ignore interactions with the large region so error is introduced. There are a number of similar ways, but all hinge on the nature of the interaction between the regions.
A more sophisticated approach is to set up the problem such that the QM and MM regions interact through a polarizable embedding scheme. The steps for this are:
Define the geometry of all molecules and specify the molecules in the QM and MM regions.
Using an electronic structure method, compute the ground state of the QM region using an adjusted Hamiltonian that depends on the density of the MM region.
Use the resulting wavefunction to compute the one particle reduced density matrix (1-RDM).
Compute the induced dipoles in the MM region based on the 1-RDM.
Recompute the energy of the QM region using the new MM dipoles.
Repeat until self-consistent field (SCF) convergence of the energy and density.
This procedure allows the two regions to interact and provides a more realistic approximation of the physics within the QM region. To get a sense of how this works with a visualization, try the widget linked here.
Some Technical Prerequisites (Adjusted Hamiltonians and RDMs)
Construction of the QM/MM workflow requires a few technical prerequisites worth reviewing. The starting point is the standard molecular Hamiltonian for the a molecule in a vacuum (), which contains the typical electron and nuclear interaction terms. However, to compute the total energy of the QM region, an additional term is needed to capture the interaction between the electrons in the QM region and the induced dipoles of the MM region.
is the bridge which allows the SCF procedure to optimize the QM ground state and the MM region dipoles simultaneously.
Most of the technical details related to computing the induced dipoles and the coefficients are beyond the scope for this lab and will be handled by external tools like the CPPE library. Curious readers can see the full derivation in "The variational quantum eigensolver self-consistent field method within a polarizable embedded framework".
One particular item that does warrant additional discussion is the one-particle reduced density matrix (1-RDM). This matrix contains information about all single electron properties of the QM wavefunction and is the key quantity needed from the QM wavefunction to update . Given a wavefunction composed of spin orbitals , the 1-RDM is the matrix .
The matrix elements can be computed using a quantum computer and evaluating the following observables with a prepared state.
If , then
If , then
If , then
can be decomposed into where each is computed based on the index corresponding to the and spins.
Exercise 1
Before you build the entire workflow, you need a couple of functions to compute the 1-RDM. First, define a function which returns the spin operator necessary to compute . Then, define a function which takes a from the preceding VQE procedure and returns the full 1-RDM in the molecular orbital basis.
Use the "TODO" flags to guide your work.
Building the QM/MM workflow
A minimal QM/MM example is a LiH molecule in the presence of two water molecules. You can initialize a molecule using the auxiliary get_mol_pe_hamiltonian function which takes the geometry of a molecule, and a file containing the positions of the water, their multipoles, and polarizabilities. These quantities describe the distribution of charge around the atoms and the proclivity for the electron distribution to be perturbed in an electric field.
There are two files in the cell below which only differ in the locations of the two water molecules. The cell below classically computes the Hartree Fock and CCSD energies for LiH in the presence of the solvent with the energy contribution from the solvent printed as well.
Exercise 2
Try computing the polarizable embedding energies for LiH in using the two different water potential files and comment on the magnitudes of the polarizable embedding energies. Does the result make sense?

The previous cell also saves a number of key molecular quantities obtained from the classical preprocessing. This includes the standard integrals, and the integrals from the PE model. These can be used with the function below to build a spin operator corresponding to the standard Hamiltonian (spin_mol_ham) and the additional term (spin_pe_ham) which describes excitations caused by interactions with the water molecules.
We also need the Hartree Fock coefficients from an SCF run of the molecule in the solvent. This is computed below where mf_pe produces the HF coefficients of the molecule within the polarized-embedding environment of the water molecules.
You are almost ready to construct the main QM/MM loop which corresponds to the figure below.

The loop will continue until the final VQE energy no longer changes between iterations, i.e. the orbitals (within the filed of the induced dipoles) is converged. One slight caveat is that the final VQE energy is not the final energy of the procedure. First, the interaction term needs to be subtracted from the VQE energy, leaving only . Then, the term is added directly from the CPPE library output. This ensures the final result properly sums the energy of the QM region, the MM region, and the interaction between the two without double counting.
the cell below defines the convergence criteria for the procedure and the initial parameters for the UCCSD ansatz, obtained from the auxiliary uccsd_parameter_size function. The parameters conv_tol and vqe_tol determine when the outer SCF loop and the inner VQE procedure converges, respectively.
Exercise 3
Now, you are ready to run the full QM/MM loop started for you below. You will be filling in the sections labeled TODO. The first step is to run VQE using the modified (PE included) Hamiltonian. The code below then saves that state which is used as an input for your `compute_1_rdm` function that returns the appropriate 1-RDM computed in the molecular orbitals basis. The following steps are needed:
(TODO - 1) Convert the 1-RDM from the molecular orbital (MO) basis to the atomic orbital (AO) basis. Recall this is done by performing a similarity transform where 'mf_pe' form the matrix . Thus, .
(Done for you) The function called takes the 1-RDM in the AO basis and, using CPPE, computes a new set of induced dipoles and updates the Hamiltonian term , resulting in a new
spin_pe_hamvariable. It also computes a new energy of the classical MM region .(TODO - 2) Compute the energy for the current step. That is and update the change in energy.
(TODO - 3) Compute the final energy after the main QM/MM loop is finished. That is, .
After you complete these steps, run the code and make sure it works. Note that occasionally, VQE will get stuck trying to converge. If this happens, usually restarting fixes the problem.
Flexibility and Scalability
One of the key benefits of QM/MM methods is flexibility. The QM region can be expanded as needed to more faithfully capture the chemistry of the system and, in principle, the QM and the MM parts of the code can swap with many different methods to optimize tradeoffs for accuracy and speed as needed.
This section will explore two such approaches for modifying your current workflow to tackle larger problems. The first demonstration will simulate the NH molecule within solution (modeled by 46 adjacent water molecules in the MM region.). The classical cost to include the extra water molecules is minimal but can provide a better model for a solution phase molecule compared to the LiH example above with only two water molecules.
The costly change here is moving from LiH to NH which would require 30 qubits to simulate with a 6-31G basis set. One solution is to use an active space where only a subset of the highest energy occupied orbitals and lowest energy virtual orbitals are included. This approximation greatly reduces the number of qubits while, hopefully, still capturing enough of the physics to produce a "good enough" result.
In practice, active space selection is a non-trivial task and should be undertaken with great care, especially if high accuracy is required. It can usually provide a significant reduction in cost with minimal loss in accuracy if done properly.
To visualize active space selection for the NH molecule with a 6-31G basis set, try this widget.
Exercise 4
Look through the code below, and find the comments labeled "Look Here" and note where all of the adjustments are made to the code to account for the active space. Notice how the changes are rather minimal and do not change the overall workflow.
Try rewriting the problem with the following cases.
4 electrons in 4 orbitals
2 electrons in 4 orbitals
How much does the energy change between each scenario? Is this evidence that we are using a good approximation in this case if we are targeting chemical accuracy (0.001 hartree).
The second example demonstrating the flexibility of QM/MM considers replacement of the VQE part of the code with a more resource efficient method like ADAPT-VQE or the Generative Quantum Eigensolver. As long as a method can produce an approximate ground state of , it can swap places with VQE with no significant impact on the rest of the QM/MM code loop, while providing potentially greater accuracy and speed.

Exercise 5
If you have already completed the ADAPT-VQE lab, try to insert your code into the QM/MM workflow. This will allow the QM/MM procedure to leverage all of the convergence benefits of ADAPT-VQE.
The code cell below is blocked where you need to input the main ADAPT loop and the ADAPT preliminary functions. Make sure you make all necessary edits to the QM/MM loop, so states and energies are handled correctly.
Summary
You have just leanred how to use CUDA-Q to construct a QM/MM workflow that takes advantage of the strengths of quantum and classical computing. Advanced users are encouraged to build on this example and explore different combinations of methods based on this QM/MM workflow.