Algorithmic fault-tolerance for low overhead quantum computing

WO2026084738A3 · PCT designated stage · Publication Date: 2026-06-18 · PRESIDENT & FELLOWS OF HARVARD COLLEGE +1

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
PRESIDENT & FELLOWS OF HARVARD COLLEGE
Filing Date
2025-03-28
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

Current quantum computing architectures face significant challenges in constructing large-scale systems due to high error correction overheads, requiring resources that exceed the scale of currently available systems, and there is a need for novel approaches to reduce these overheads and enable utility-scale quantum computation.

Method used

The use of transversal gates and correlated decoding techniques in quantum error correction, combined with dynamically reconfigurable neutral atom arrays, reduces error correction overhead from O(d³) to O(d²) and enables scalable quantum computation by allowing fewer rounds of error correction per gate, utilizing non-local connectivity and efficient hardware implementations.

Benefits of technology

This approach significantly reduces the space-time volume required for fault-tolerant quantum computation, enabling the construction of utility-scale quantum processors with reduced resource requirements and improved computational efficiency.

Generated by Eureka AI based on patent content.
Patent Text Reader

Abstract

Error-corrected quantum computation using transversal gates and correlated decoding is provided. A first and a second logical qubit are encoded into physical qubits according to a quantum error correcting code. Based on the quantum error correcting code, a bipartite decoding graph is constructed corresponding to the first and the second logical qubits, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism. A transversal gate is applied to the first and the second logical qubits. Syndrome measurement of the first and the second logical qubits is performed. For each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, an edge is generated on the bipartite decoding graph therebetween. A physical error configuration is determined from the bipartite decoding graph.

Description

ALGORITHMIC FAULT-TOLERANCE FOR LOW OVERHEAD QUANTUM COMPUTING

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 63/572,229, filed March 30, 2024, which is hereby incorporated by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] This invention was made with government support under 1745303 and 2012023 and 2313084 awarded by National Science Foundation (NSF) and under HR0011-23-3-0030 awarded by U.S. Department of Defense/Defense Advanced Research Projects Agency (DOD/DARPA) and under W911NF-23-2-0219 and W911NF-20-1-0082 and W911NF-20-1-0021 awarded by U.S. Army Research Office (ARO). The government has certain rights in this invention.

BACKGROUND

[0003] Embodiments of the present disclosure relate to quantum computation, and more specifically, to error-corrected quantum computation using transversal gates and correlated decoding.

BRIEF SUMMARY

[0004] According to embodiments of the present disclosure, methods of performing quantum computation are provided. At least a first plurality of physical qubits is provided. A first logical qubit is encoded into the at least first plurality of physical qubits according to a quantum error correcting code. A second logical qubit is encoded into the at least first plurality of physical qubits according to the quantum error correcting code. Based on the quantum error correcting code, a bipartite decoding graph is constructed corresponding to the first and the second logical qubits, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism. A transversal gate is applied to the first and the second logical qubits. A first round of syndrome measurement of the first and the second logical qubits is performed. For each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, an edge is generated on the bipartite decoding graph therebetween. A physical error configuration is determined from the bipartite decoding graph.
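By way of non-limiting illustration, the construction of a bipartite decoding graph can be sketched for a toy case. The sketch below assumes two distance-3 bit-flip repetition-code blocks (not the surface code of later embodiments), a transversal CNOT from block A to block B, and detectors equal to the parity checks measured after the gate. The function names, the error labels, and the probability value are hypothetical and chosen only for the example.

```python
from collections import defaultdict

def repetition_cnot_mechanisms(d, p):
    """Bit-flip error mechanisms just before a transversal CNOT (block A controls block B)."""
    def checks(block, i):
        # Check c compares qubits c and c+1, so qubit i sits in checks i-1 and i.
        return {(block, c) for c in (i - 1, i) if 0 <= c < d - 1}

    mechanisms = {}
    for i in range(d):
        # A flip on control qubit i is copied to target qubit i by the CNOT,
        # so it is seen by checks in both blocks.
        mechanisms[f"X_A{i}"] = (p, checks("A", i) | checks("B", i))
        # A flip on target qubit i stays in block B.
        mechanisms[f"X_B{i}"] = (p, checks("B", i))
    return mechanisms

def build_decoding_graph(mechanisms):
    """Return adjacency maps of the bipartite graph: error -> detectors, detector -> errors."""
    error_to_detectors = {}
    detector_to_errors = defaultdict(set)
    for error, (_prob, detectors) in mechanisms.items():
        error_to_detectors[error] = set(detectors)
        for det in detectors:
            detector_to_errors[det].add(error)  # one edge per (error, detector) pair
    return error_to_detectors, dict(detector_to_errors)

# Hypothetical example: distance 3, error probability 1% per mechanism.
mechanisms = repetition_cnot_mechanisms(d=3, p=0.01)
error_to_detectors, detector_to_errors = build_decoding_graph(mechanisms)
```

In this toy model, an error node for a control-block flip is connected to detector nodes in both blocks, which is the correlation that a joint decoder can exploit.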

[0005] In some embodiments, at least one gate is applied to the first plurality of physical qubits and the second plurality of physical qubits to correct the physical error configuration.

[0006] In some embodiments, the quantum error correcting code is a surface code.

[0007] In some embodiments, the transversal gate is a Clifford gate. In some embodiments, the Clifford gate is a CNOT gate.

[0008] In some embodiments, a third logical qubit is encoded into the at least first plurality of physical qubits according to the quantum error correcting code. In some embodiments, the transversal gate is a non-Clifford gate. In some embodiments, the non-Clifford gate is a CCZ gate.

[0009] In some embodiments, determining the physical error configuration comprises maximizing an error probability on the bipartite decoding graph. In some embodiments, maximizing the error probability comprises solving a mixed-integer programming problem corresponding to the error probability.
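By way of non-limiting illustration, the most likely error search can be sketched on a very small decoding graph. Maximizing the probability of an error configuration that reproduces the observed syndrome is equivalent to minimizing the sum of log-likelihood weights log((1-p)/p) over the selected error nodes. The sketch below replaces a mixed-integer programming solver with exhaustive enumeration, which is only practical for a handful of error nodes; the function name and the example mechanisms are hypothetical.

```python
from itertools import combinations
from math import log

def most_likely_error(mechanisms, syndrome):
    """mechanisms: error -> (probability, set of detectors); syndrome: set of fired detectors.

    Returns the most probable set of errors reproducing the syndrome, or None.
    """
    names = sorted(mechanisms)
    best, best_cost = None, float("inf")
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            fired = set()
            for name in subset:
                fired ^= mechanisms[name][1]  # detectors flip modulo 2
            if fired != syndrome:
                continue  # this configuration is inconsistent with the observation
            # Lower cost means higher probability.
            cost = sum(log((1 - mechanisms[n][0]) / mechanisms[n][0]) for n in subset)
            if cost < best_cost:
                best, best_cost = set(subset), cost
    return best

# Hypothetical three-mechanism example.
example = {
    "e1": (0.01, {"D0"}),
    "e2": (0.01, {"D0", "D1"}),
    "e3": (0.01, {"D1"}),
}
best = most_likely_error(example, {"D0", "D1"})
```

Here the single mechanism e2 explains both fired detectors and is preferred over the two-error explanation e1 and e3.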

[0010] In some embodiments, determining the physical error configuration comprises determining one or more subgraphs of the bipartite decoding graph corresponding to the physical error configuration. In some embodiments, determining the one or more subgraphs comprises defining a subgraph for each detector node having a detected error and expanding each such subgraph until it encompasses error nodes, which, if they had occurred, would result in syndrome measurements consistent with the observed syndrome.
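By way of non-limiting illustration, the subgraph expansion can be sketched as follows. Each detector node with a detected error seeds a subgraph; a subgraph that cannot yet explain its own fired detectors is expanded by one layer of adjacent error nodes and their detector nodes, and overlapping subgraphs are merged. This is a simplified sketch: the consistency test is an exhaustive search that stands in for whatever solver an implementation would use, and all names and the chain-shaped example graph are hypothetical.

```python
from itertools import combinations

def explains(errors, mechanisms, target):
    """True if some subset of `errors` flips exactly the detectors in `target`."""
    errors = sorted(errors)
    for k in range(len(errors) + 1):
        for subset in combinations(errors, k):
            fired = set()
            for e in subset:
                fired ^= mechanisms[e]
            if fired == target:
                return True
    return False

def merge_overlapping(clusters):
    """Merge clusters that share a detector node or an error node."""
    merged = []
    for dets, errs in clusters:
        dets, errs = set(dets), set(errs)
        rest = []
        for mdets, merrs in merged:
            if mdets & dets or merrs & errs:
                dets |= mdets
                errs |= merrs
            else:
                rest.append((mdets, merrs))
        merged = rest + [(dets, errs)]
    return merged

def grow_clusters(mechanisms, syndrome):
    """mechanisms: error -> set of detectors. Returns a list of (detectors, errors)."""
    clusters = [({d}, set()) for d in sorted(syndrome)]
    while True:
        pending = [c for c in clusters
                   if not explains(c[1], mechanisms, c[0] & syndrome)]
        if not pending:
            return clusters
        for dets, errs in pending:
            touching = {e for e, ds in mechanisms.items() if ds & dets}
            if touching <= errs:
                raise ValueError("syndrome cannot be explained")
            errs |= touching              # add adjacent error nodes
            for e in touching:
                dets |= mechanisms[e]     # and every detector they affect
        clusters = merge_overlapping(clusters)

# Hypothetical chain of four detectors with two boundary mechanisms.
chain = {"b0": {"D0"}, "e01": {"D0", "D1"}, "e12": {"D1", "D2"},
         "e23": {"D2", "D3"}, "b3": {"D3"}}
clusters = grow_clusters(chain, {"D1", "D2"})
```

For the syndrome {D1, D2}, the two seeds merge after one expansion step because the shared mechanism e12 explains both fired detectors.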

[0011] In some embodiments, determining the physical error configuration comprises determining one or more subgraphs of the bipartite decoding graph by backpropagation from a plurality of logical measurements. In some embodiments, the plurality of logical measurements is chosen such that their product commutes with all logical Pauli stabilizers of the quantum error correcting code.

[0012] In some embodiments, determining the physical error configuration further comprises applying a decoder to the one or more subgraphs iteratively. In some embodiments, determining the physical error configuration further comprises applying a decoder to the one or more subgraphs in parallel. In some embodiments, the decoder comprises a matching decoder. In some embodiments, the matching decoder comprises minimum weight perfect matching on the one or more subgraphs. In some embodiments, said backpropagation is performed prior to applying the decoder.
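By way of non-limiting illustration, minimum weight perfect matching on one small subgraph can be sketched as follows. The sketch uses exhaustive recursion rather than a blossom-type algorithm, so it is suitable only for a few fired detectors. The defect positions on a line and the distance-based weight are hypothetical; in practice the weights would be derived from the error probabilities of the decoding graph.

```python
def min_weight_perfect_matching(nodes, weight):
    """Exhaustive minimum-weight perfect matching; returns (total_weight, pairs)."""
    if len(nodes) % 2:
        raise ValueError("a perfect matching needs an even number of nodes")
    if not nodes:
        return 0, []
    first, rest = nodes[0], nodes[1:]
    best_total, best_pairs = float("inf"), None
    for i, partner in enumerate(rest):
        # Pair `first` with `partner`, then match the remaining nodes recursively.
        sub_total, sub_pairs = min_weight_perfect_matching(rest[:i] + rest[i + 1:], weight)
        total = weight(first, partner) + sub_total
        if total < best_total:
            best_total, best_pairs = total, [(first, partner)] + sub_pairs
    return best_total, best_pairs

# Hypothetical fired detectors at positions along a line.
defects = [1, 2, 7, 9]
total, pairs = min_weight_perfect_matching(defects, lambda a, b: abs(a - b))
```

Because each subgraph is matched independently, calls of this kind could be issued one after another or in parallel across subgraphs.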

[0013] In some embodiments, each qubit of the first plurality of physical qubits and the second plurality of physical qubits is a neutral atom.

[0014] In some embodiments, the at least first plurality of physical qubits comprises a second plurality of physical qubits, and the first logical qubit is encoded in the first plurality of physical qubits and the second logical qubit is encoded in the second plurality of physical qubits.

[0015] In some embodiments, the at least first plurality of physical qubits comprises a second plurality of physical qubits and a third plurality of physical qubits, and the first logical qubit is encoded in the first plurality of physical qubits, the second logical qubit is encoded in the second plurality of physical qubits, and the third logical qubit is encoded in the third plurality of physical qubits.

[0016] In some embodiments, applying the transversal gate comprises placing the first and second pluralities of physical qubits such that each physical qubit of the first plurality of physical qubits is within a blockade radius of exactly one corresponding physical qubit of the second plurality of physical qubits and illuminating the first and second plurality of physical qubits with a first laser.
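By way of non-limiting illustration, the placement condition can be checked with a short geometric test. The sketch below verifies that every atom of the first block lies within the blockade radius of exactly one atom of the second block. The additional requirement that the pairing be one-to-one, as well as the coordinates, the radius, and the function name, are assumptions made for the example.

```python
from math import dist

def transversal_pairing(block_a, block_b, blockade_radius):
    """Map each atom index of block_a to its unique partner index in block_b."""
    pairing = {}
    for i, pos_a in enumerate(block_a):
        partners = [j for j, pos_b in enumerate(block_b)
                    if dist(pos_a, pos_b) <= blockade_radius]
        if len(partners) != 1:
            raise ValueError(f"atom {i} of the first block has {len(partners)} partners")
        pairing[i] = partners[0]
    # Assumed extra condition: no atom of block_b is shared or left unpaired.
    if len(set(pairing.values())) != len(block_b):
        raise ValueError("pairing is not one-to-one")
    return pairing

# Hypothetical positions in micrometers: two rows of three atoms, 2 apart.
block_a = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
block_b = [(0.0, 2.0), (10.0, 2.0), (20.0, 2.0)]
pairing = transversal_pairing(block_a, block_b, blockade_radius=4.0)
```

With a radius large enough to reach a neighboring pair, the check fails, which corresponds to a layout in which a single global pulse would no longer act pairwise.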

[0017] In some embodiments, one or more additional transversal gates are applied to the first and the second logical qubits alternately with applying one or more additional rounds of syndrome measurement of the first and the second logical qubits.

[0018] According to embodiments of the present disclosure, quantum processors are provided. A first array of optical traps is disposed in an active zone. A second array of optical traps is disposed in a readout zone. A first laser is configured to illuminate the active zone and to drive a transition to a Rydberg state. A second laser is configured to illuminate the active zone and to drive a transition between hyperfine states. A third laser is configured to illuminate the readout zone. A fourth laser is configured to adiabatically move neutral atoms between the optical traps of the active zone and the readout zone. A camera is configured to capture an image of the readout zone. The quantum processor is configured to: provide at least a first plurality of neutral atoms in the active zone, each in a respective optical trap of the first array; encode a first logical qubit into the at least first plurality of neutral atoms according to a quantum error correcting code by the first and second lasers; encode a second logical qubit into the at least first plurality of neutral atoms according to the quantum error correcting code by the first and second lasers; based on the quantum error correcting code, construct a bipartite decoding graph corresponding to the first and the second logical qubits, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism; place the first and second pluralities of neutral atoms in the active zone such that each neutral atom of the first plurality of neutral atoms is within a blockade radius of exactly one corresponding neutral atom of the second plurality of neutral atoms; illuminate the first plurality of neutral atoms and the second plurality of neutral atoms while in the active zone by at least the first or second laser, thereby applying a transversal gate to the first and second logical qubits; perform a first round of syndrome measurement of the first and the second logical qubits; for each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, generate an edge on the bipartite decoding graph therebetween; and determine a physical error configuration from the bipartite decoding graph.

[0019] According to embodiments of the present disclosure, methods of performing a quantum computation are provided. At least a first plurality of physical qubits is provided. A first logical qubit and a second logical qubit are encoded into the at least first plurality of physical qubits according to a quantum error correcting code. A first transversal gate is applied to one or more of the first and second logical qubits. A first round of syndrome measurement of the first and/or second logical qubit is performed. A first measurement of the second logical qubit is obtained. Based on the quantum error correcting code, a first bipartite decoding graph is constructed corresponding to the first and second logical qubits and the first transversal gate, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism. For each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, an edge is generated on the first bipartite decoding graph therebetween. A first physical error configuration is determined from the first bipartite decoding graph and the first round of syndrome measurements. A second transversal gate is applied to at least one of the first and second logical qubits. A second round of syndrome measurement of the first and/or second logical qubits is performed. A second measurement of the first and second logical qubits is obtained. Based on the quantum error correcting code, a second bipartite decoding graph is constructed corresponding to the first and second logical qubits and the first and second transversal gates, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism. For each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, an edge is generated on the second bipartite decoding graph therebetween. A second physical error configuration is determined from the second bipartite decoding graph and the second round of syndrome measurements.

[0020] In some embodiments, the second transversal gate is conditional on the first measurement.

[0021] In some embodiments, a first value of the first logical qubit is decoded from the first measurement based on the first bipartite decoding graph; a second value of the first logical qubit is decoded from the second measurement based on the second bipartite decoding graph; and one or more logical operators are applied to the first and/or second logical qubits according to a disparity between the first and second values.

[0022] In some embodiments, at least one gate is applied to the first plurality of physical qubits to correct the first and/or second physical error configuration.

[0023] According to embodiments of the present disclosure, methods of performing a quantum computation are provided. At least a first plurality of physical qubits is provided. A first logical qubit and a second logical qubit are encoded into the at least first plurality of physical qubits according to a quantum error correcting code. A first transversal gate is applied to one or more of the first and second logical qubits. A first measurement of the second logical qubit is obtained. Based on the quantum error correcting code, a first bipartite decoding graph is constructed corresponding to the first and second logical qubits, the first transversal gate, and the first measurement, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism. For each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, an edge is generated on the first bipartite decoding graph therebetween. A first physical error configuration is determined from the first bipartite decoding graph. A second transversal gate is applied to at least one of the first and second logical qubits. A second round of syndrome measurement of the first and/or second logical qubits is performed. A second measurement of the first and second logical qubits is obtained. Based on the quantum error correcting code, a second bipartite decoding graph is constructed corresponding to the first and second logical qubits and the first and second transversal gates, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism. For each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, an edge is generated on the second bipartite decoding graph therebetween. A second physical error configuration is determined from the second bipartite decoding graph and the second round of syndrome measurements.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

[0024] Fig. 1 is a schematic view of a quantum information architecture according to embodiments of the present disclosure.

[0025] Fig. 2 is a level diagram showing key ⁸⁷Rb atomic levels according to embodiments of the present disclosure.

[0026] Fig. 3 is a schematic view of the implementation of the toric code according to embodiments of the present disclosure.

[0027] Fig. 4 is a schematic view of a quantum processing unit (QPU) according to embodiments of the present disclosure.

[0028] Fig. 5 is a schematic view of logical qubits, illustrating efficient control according to embodiments of the present disclosure.

[0029] Fig. 6 is a schematic view of logical qubits, illustrating the application of a transversal CNOT gate according to embodiments of the present disclosure.

[0030] Fig. 7 is a schematic view of a portion of a processor core according to embodiments of the present disclosure.

[0031] Fig. 8 is a schematic view of a portion of a processor core suitable for use in implementing a repetition code according to embodiments of the present disclosure.

[0032] Fig. 9 is a schematic view of a portion of a processor core suitable for use in implementing a surface code according to embodiments of the present disclosure.

[0033] Fig. 10 is a schematic view of a method of active, feedforward QEC according to embodiments of the present disclosure.

[0034] Fig. 11 is a schematic view of an apparatus for quantum computation according to embodiments of the present disclosure.

[0035] Fig. 12 is an illustration of logical operation time for lattice surgery and transversal CNOTs according to embodiments of the present disclosure.

[0036] Figs. 13A-C are illustrations of error propagation in a transversal CNOT and the resulting detector error model according to embodiments of the present disclosure.

[0037] Figs. 14A-F illustrate the reduction in spacetime cost of logical algorithms according to embodiments of the present disclosure.

[0038] Figs. 15A-B illustrate a logical S gate according to embodiments of the present disclosure.

[0039] Fig. 16 illustrates a magic state distillation circuit according to embodiments of the present disclosure.

[0040] Figs. 17A-F illustrate an exemplary neutral atom layout for magic state distillation according to embodiments of the present disclosure.

[0041] Figs. 18A-B illustrate circuits for the ripple-carry adder and QROM according to embodiments of the present disclosure.

[0042] Figs. 19A-B illustrate transversal algorithmic fault-tolerance according to embodiments of the present disclosure.

[0043] Figs. 20A-D illustrate decoding strategy according to embodiments of the present disclosure, (a-c) Illustration of

[0044] Figs. 21A-B compare transversal algorithmic fault-tolerance according to embodiments of the present disclosure and lattice surgery.

[0045] Fig. 22 shows graphs of physical error rate versus heralded failure rates for repeated Bell measurements with injected initialization states.

[0046] Fig. 23 illustrates a circuit for distillation of |Y⟩ magic states according to embodiments of the present disclosure.

[0047] Fig. 24 shows an illustration of the unrotated surface code according to embodiments of the present disclosure.

[0048] Fig. 25 illustrates decoding transversal algorithm(s) according to embodiments of the present disclosure.

[0049] Figs. 26A-C illustrate matchable decoding graphs and iterative decoding according to embodiments of the present disclosure.

[0050] Figs. 27A-C illustrate benchmarking decoding strategies according to embodiments of the present disclosure.

[0051] Figs. 28A-C illustrate magic state distillation according to embodiments of the present disclosure.

DETAILED DESCRIPTION

[0052] Large-scale quantum computers have the potential to solve problems that are intractable for classical processors. While exciting progress has been made on developing small- and medium-scale quantum processors and applying them to study physical phenomena in complex quantum systems in a regime that is difficult to simulate classically, it is unclear if and how truly large-scale quantum processors can be constructed and applied to solving general purpose, computationally hard problems, whose value exceeds the cost of construction. For example, current estimates for the resources required to realize one of the most prominent high value applications, Shor’s factoring algorithm, require around five thousand logical qubits, and a few billion non-Clifford gates with error probability below 10⁻¹² to break 2048-bit RSA encryption. Alternatively, using conventional error correction methods, 20 million superconducting qubits with realistic gate errors (0.1%) could be used, which exceeds the scale of currently available, well-controlled systems by nearly six orders of magnitude.

[0053] This necessitates the development of novel, unconventional approaches to utility-scale quantum computation, likely involving a synergistic combination of new hardware architectures that significantly reduce the costs of error correction and computation, new resource-efficient approaches to algorithm development co-designed with the hardware, as well as large scale engineering efforts to construct practical systems. Furthermore, any such large-scale development will also require extensive effort to verify and validate.

[0054] To address these and other shortcomings of alternative approaches, the present disclosure provides systems and methods for reducing overhead in error-corrected quantum computing using transversal gates and correlated decoding.

[0055] In particular, the present disclosure describes how an error correction architecture based on transversal gates can lead to substantial reductions in the space-time volume of fault-tolerant quantum computation, from O(d³) in alternative lattice-surgery-based schemes, to potentially O(d²) in a transversal-gate-based architecture. Algorithmic building blocks are described, with a particular focus on joint, correlated decoding across multiple logical qubits that allows the use of fewer than d rounds of quantum error correction per transversal gate. An exemplary implementation of this in neutral atom array systems, one of the leading fault-tolerant quantum computing architectures, is also provided.
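By way of non-limiting illustration, the scaling comparison can be made concrete with simple arithmetic. The sketch below counts space-time volume per logical operation as roughly d² physical qubits multiplied by the number of syndrome measurement rounds per gate, ignoring all constant factors. Taking d rounds for the lattice-surgery case and a single round for the transversal case are assumptions used only to show the O(d³) versus O(d²) behavior; the function name is hypothetical.

```python
def spacetime_volume(d, rounds_per_gate):
    """Rough volume: about d*d physical qubits per patch times rounds per gate."""
    return d * d * rounds_per_gate

for d in (5, 15, 25):
    lattice = spacetime_volume(d, rounds_per_gate=d)      # about d rounds: O(d^3)
    transversal = spacetime_volume(d, rounds_per_gate=1)  # O(1) rounds: O(d^2)
    print(d, lattice, transversal, lattice // transversal)
```

Under these assumptions the saving per logical operation grows linearly with the code distance d.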

[0056] A quantum bit (qubit) is the fundamental building block for a quantum computer. By analogy to classical bits which are used to store information in traditional computers (each bit is 0 or 1), qubits can occupy two distinct states labeled |0⟩ and |1⟩, or any quantum superposition of the two states. In various applications, multiple qubits are entangled in order to build multiqubit quantum gates.

[0057] Bits and qubits are each encoded in the state of real physical systems. For example, a classical bit (0 or 1) may be encoded in whether a capacitor is charged or discharged, or whether a switch is ‘on’ or ‘off’.

[0058] The term qudit (quantum digit) denotes the unit of quantum information that can be realized in suitable d-level quantum systems. A collection of qubits that can be measured to N states can implement an N-level qudit.

[0059] Quantum bits are encoded in quantum systems with two (or more) distinct quantum states. There are many physical realizations that may be employed. One example is based on individual particles such as atoms, ions, or molecules which are isolated in vacuum. These isolated atoms, ions, and molecules have many distinct quantum states that correspond to different orientations of electron spins, nuclear spins, electron orbits, and molecular rotations/vibrations.

[0060] In principle, a qubit may be encoded in any pair of quantum states of the atom/ion/molecule. In practice, a key parameter of qubits is described by their quantum coherence properties. Coherence measures the lifetime of the qubit before its information is lost. It has a close analogy with classical bits: if you prepare a classical bit in the 0 state, then after some time it may randomly be flipped to 1 due to environmental noise. Quantum mechanically, the same error may occur: |0⟩ may randomly flip to |1⟩ after some characteristic timescale. However, qubits may suffer from additional errors: for example, a superposition state (|0⟩+|1⟩)/√2 may randomly flip to (|0⟩-|1⟩)/√2. In real quantum computers, the qubits must be encoded in quantum states which have long coherence properties.

[0061] Quantum computers generally can contain many qubits, each encoded in its own atom/molecule/ion/etc. Beyond simply containing the qubits, the quantum computer should be able to (1) initialize the qubits, (2) manipulate the state of the qubits in a controlled way, and (3) read out the final states of the qubits. When it comes to manipulation of the qubits, this is usually broken down into two types: one type of qubit manipulation is a so-called single-qubit gate, which means an operation that is applied individually to a qubit. This may, for example, flip the state of the qubit from |0⟩ to |1⟩, or it may take |0⟩ to a superposition state (|0⟩+|1⟩)/√2. The second necessary type of qubit manipulation is a multi-qubit gate, which acts collectively on two or more qubits, including those that are entangled. A multi-qubit gate is realized through some form of interaction between the qubits. The various quantum computing platforms (having various physical encodings of qubits) rely on different physical mechanisms both for single-qubit gates as well as multi-qubit gates according to the physical system that is storing the qubit.
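By way of non-limiting illustration, the gate actions described above can be written as small matrices acting on state vectors. The sketch below uses the standard X, Hadamard (H), Z, and CNOT matrices: X flips |0⟩ to |1⟩, H takes |0⟩ to (|0⟩+|1⟩)/√2, Z models the sign-flip error on a superposition mentioned in the discussion of coherence, and CNOT is one example of a multi-qubit gate. The helper names `apply` and `kron` are hypothetical.

```python
from math import sqrt

def apply(gate, state):
    """Multiply a gate matrix (list of rows) by a state vector."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]

def kron(a, b):
    """Tensor product of two state vectors."""
    return [x * y for x in a for y in b]

r = 1 / sqrt(2)
X = [[0, 1], [1, 0]]        # bit flip
H = [[r, r], [r, -r]]       # Hadamard
Z = [[1, 0], [0, -1]]       # sign (phase) flip
CNOT = [[1, 0, 0, 0],       # basis order: |00>, |01>, |10>, |11>
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

ket0 = [1, 0]
flipped = apply(X, ket0)              # |1>
plus = apply(H, ket0)                 # (|0>+|1>)/sqrt(2)
minus = apply(Z, plus)                # (|0>-|1>)/sqrt(2)
bell = apply(CNOT, kron(plus, ket0))  # (|00>+|11>)/sqrt(2), an entangled state
```

The last line shows how a single-qubit gate followed by a two-qubit gate produces entanglement between the qubits.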

[0062] In various embodiments of a quantum computer, a qubit is encoded in two near-ground-state energy levels of an atom, ion, or molecule. An example of this is a hyperfine qubit. Such a qubit is encoded in two electronic ground states that differ by the relative orientation of the nuclear spin with respect to the outer electron spin. Pairs of such states can be chosen so that they are particularly robust/insensitive to environmental perturbations, leading to long coherence times. These states are split in energy by the hyperfine interaction energy of the atom/ion/molecule, which is the interaction energy between the nuclear spin and the electron spin. The robustness of the qubit can be understood as the energy splitting between the two states being particularly stable. For this reason, such states are called clock states because the stable energy splitting can form an excellent frequency-reference and as such forms the basis for atomic clocks. Typical hyperfine splitting between these qubit states is in the 1-13 GHz frequency range.

[0063] To perform single-qubit gates on such a hyperfine qubit, it is possible to apply coherent microwave radiation at the exact frequency of the energy splitting between states. However, there are two drawbacks to this approach. First, microwaves cannot be applied to just one qubit without affecting adjacent qubits. This is because qubits are encoded in particles that are typically just a few microns apart from one another, and microwaves cannot be focused to such a small scale due to their large wavelength. Second, the microwave intensity is fairly limited and as such the maximum speed of single-qubit gates is correspondingly limited.

[0064] An alternative approach is based on stimulated Raman transitions. In this case, a laser field is applied to the atoms/ions/molecules. The laser field is nearly (but not exactly) resonant with an optical transition from one of the ground states to an optically excited state. The laser contains multiple frequency components separated in frequency by exactly the amount equal to the hyperfine splitting of the qubit. The atom/ion/molecule can absorb a photon from one frequency component and coherently emit into a different frequency component, and in doing so it changes its state. This approach benefits from the capability of focusing the laser field onto individual particles or subsets of particles in the quantum computer. The laser field can also be applied with high intensity, allowing much faster gate operations.

[0065] Neutral atom quantum computers encode qubits in individual neutral atoms. The neutral atoms are trapped in a vacuum chamber and levitated by trapping lasers. Most commonly, the trapping lasers are individual optical tweezers, which are individual tightly focused laser beams that trap an individual atom at the focus. Alternatively, individual atoms may be trapped in an optical lattice, which is formed from standing waves of laser light which produce a periodic structure of nodes/antinodes.

[0066] A typical approach for encoding a qubit in neutral atoms is the hyperfine qubit approach, in which two ground states split by several GHz form the qubit. Multi-qubit gates in neutral atom quantum computers are realized using a third atomic state, which is a highly-excited Rydberg state. When one atom is excited to a Rydberg state, neighboring atoms are prevented from being excited to the Rydberg state. This conditional behavior forms the basis for multi-qubit gates, such as a controlled-NOT gate. The Rydberg state is used temporarily to mediate the multi-qubit gate, and then the atoms are returned from the Rydberg state to the ground state levels to preserve their coherence.

[0067] Trapped ion quantum computers use atomic species that are ionized, meaning they have a net charge. In most cases, many ions are trapped in one large trapping potential formed by electrodes in a vacuum chamber. The ions are pulled to the minimum of the trapping potential, but inter-ion Coulomb repulsion causes them to form a crystal structure centered in the middle of the trapping potential. Most commonly, the ions arrange into a linear chain. Other ways to trap ions are also possible, such as using optical tweezers, or trapping ions individually with local electric fields with a more complex on-chip electrode structure.

[0068] Qubits are encoded in trapped ions in multiple ways. One common approach is to use ground-state hyperfine levels, as described for neutral atoms. In trapped ions with hyperfine-qubit encoding, as with neutral atoms, single-qubit gates may use microwave radiation or stimulated Raman transitions.

[0069] Unlike in neutral atoms, trapped ion hyperfine qubits rely heavily on stimulated Raman transitions for performing multi-qubit gates. Stimulated Raman transitions may be used not only to control the hyperfine state of the ion but also to change the motional state of the ion (i.e., add momentum). This can be understood as absorbing a photon moving in one direction and emitting a photon in a different direction, such that the difference in photon momentum is absorbed by the ion. Since many ions are often trapped in one collective trapping potential and are mutually repelling one another, changing the motional state of one ion affects other ions in the system, and this mechanism forms the basis for multi-qubit gates.

[0070] According to various embodiments of a quantum computer, individual particles (atoms/ions/molecules) can first be trapped in an array and arranged into particular configurations. Next, one or more particles are prepared in a desired quantum state. Quantum circuits can then be implemented by a sequence of qubit operations acting on individual qubits (single-qubit gates) or on groups of two or more qubits (multi-qubit gates). Finally, the state of the particles can be read out in order to observe the result of the quantum circuit. The readout can be accomplished using an observation system that typically includes an electron-multiplying CCD (EMCCD) camera, with one camera image used to detect the particles’ loaded positions and a second camera image used to read out the particles’ final states by, for example, detecting fluorescence emitted by the particles in their final states.

[0071] Quantum information platforms rely on interactions between qubits, either for performing quantum gates or for performing analog many-body simulation. Qubits often interact in a local way, however, which limits the connectivity of the circuit or the analog simulation and constrains the possible computations. While some platforms can communicate in a nonlocal way through the use of a shared bus (e.g., trapped ions), these shared-bus approaches are limited to small systems and thus still require a way to dynamically move qubits around in order to truly scale up the platform.

[0072] Neutral atom arrays can be dynamically reconfigured while preserving quantum coherence and entanglement between qubits, by storing quantum information in hyperfine states and shuttling atoms in optical tweezers. This approach offers a scalable way to realize a quantum information system with large numbers of qubits and arbitrary programmability, where any qubit can perform an entangling gate with any other qubit in the array. Using high-fidelity two-qubit Rydberg gates, various quantum information circuits are described herein that leverage the programmability and nonlocal connectivity achievable with these approaches. An example of high fidelity Rydberg gates is described in Levine, et al., Parallel Implementation of High-Fidelity Multiqubit Gates with Neutral Atoms, Phys. Rev. Lett., vol. 123, issue 17, https://link.aps.org/doi/10.1103/PhysRevLett.123.170503, which is hereby incorporated by reference.

[0073] As set out in more detail below, the methods provided herein enable a variety of computational scenarios. In some scenarios, a plurality of neutral atoms are moved in parallel between multiple regions in space. For example, a source of illumination may be directed to a first region, and atoms are moved in and out of that region between the application of pulses by the source of illumination. Similarly, a camera may be directed to an imaging region, and atoms are moved in and out of that imaging region for imaging. Similarly, atoms may be moved in and out of the blockade radius of other atoms, thereby allowing the application of gates to the different groups of atoms at different stages of an algorithm or layers of a quantum circuit.

[0074] It will be appreciated that various stabilizer codes entail the readout of ancilla qubits, and the present disclosure allows the physical relocation of ancilla qubits to an imaging region separate from the data qubits. In this way, readout of ancilla qubits may be provided without destruction of the data qubits.

[0075] More generally, an array of atoms may be moved between multiple arrangements to facilitate both digital gates between different selections of atoms and analog evolution of the array as a whole. As used herein, an arrangement of an array of atoms or a plurality of atoms refers to the positioning of those atoms relative to each other. It will be appreciated that certain arrangements provide connectivity between qubits that enable particular gates or analog evolution according to a particular Hamiltonian. One advantage of the methods provided herein is that atoms may be moved into proximity of atoms that were not adjacent within an array. A non-adjacent atom is one that is not within a unit cell in a regular lattice or that is not a nearest neighbor in an irregular array. For example, in a rectangular lattice, each atom has eight atoms that are within a unit cell thereof, and thus has eight adjacent atoms (disregarding edges).

[0076] As defined further below, atoms are moved adiabatically in order to preserve entanglement. As used herein, the term adiabatic movement refers to movement that avoids a transition of the subject atom within its trap. For example, where the first time-derivative of the acceleration of the subject atom is not greater than a predetermined value, the movement is considered adiabatic. Typically, adiabatic movement occurs when jerk < (size of atom) × (trap frequency)³. In physics, jerk or jolt is the term given to the rate at which an object's acceleration changes with respect to time.
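The jerk criterion above can be turned into a rough number. The following is a minimal sketch under stated assumptions, not a parameter of the disclosed system: it interprets "size of atom" as the harmonic-oscillator zero-point length sqrt(ħ/(2mω)) of a ⁸⁷Rb atom and "trap frequency" as the angular frequency ω. The function names and the example trap frequencies are hypothetical.

```python
import math

HBAR = 1.054571817e-34             # reduced Planck constant, J*s
M_RB87 = 86.909 * 1.66053907e-27   # approximate mass of one 87Rb atom, kg

def zero_point_size(trap_freq_hz, mass=M_RB87):
    """Assumed 'size of atom': zero-point length of the trapped atom, in meters."""
    omega = 2 * math.pi * trap_freq_hz
    return math.sqrt(HBAR / (2 * mass * omega))

def jerk_bound(trap_freq_hz, mass=M_RB87):
    """Jerk scale (m/s^3) from the criterion jerk < size * (trap frequency)^3."""
    omega = 2 * math.pi * trap_freq_hz
    return zero_point_size(trap_freq_hz, mass) * omega ** 3

def is_adiabatic(jerk, trap_freq_hz):
    """True if the magnitude of the given jerk is below the bound."""
    return abs(jerk) < jerk_bound(trap_freq_hz)

# Hypothetical trap frequencies, chosen only to show the scaling with frequency.
for f_hz in (20e3, 50e3, 100e3):
    print(f"{f_hz / 1e3:.0f} kHz trap: jerk bound ~ {jerk_bound(f_hz):.2e} m/s^3")
```

Under this reading the bound grows as ω^2.5, so tighter traps tolerate faster changes in acceleration.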

[0077] In addition to adiabatic movement, in some embodiments dynamical decoupling is applied during the movement. As set out further below, a π-pulse during movement cancels out dephasing induced by the trap differential light shift. The trap differential light shift changes when the atom is moving (depending on its acceleration) because it will move in the trap, and so sample a different portion of the light intensity and hence have a different differential light shift.

[0078] Generally speaking, the more pulses applied, the more decoupling from fluctuations. For example, fluctuations may come from laser intensity fluctuations at different displacement positions of the atom, or different magnetic fields in space.

[0079] In embodiments where acceleration and deceleration are symmetric, both change the differential light shift in the same way. Accordingly, in such embodiments it is advantageous to apply a π-pulse at the midpoint of the motion. In this way, the changes in differential light shift induced by acceleration and deceleration cancel each other out.
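The cancellation at the midpoint can be illustrated with a toy numerical model. This is a sketch under assumptions, not the disclosed pulse sequence: the light-shift profile, the 200 µs move time, and the 500 Hz peak shift are hypothetical, and the π-pulse is treated as ideal and instantaneous, so it simply flips the sign of all phase accumulated after it.

```python
import numpy as np

def accumulated_phase(delta, dt, echo_index=None):
    """Net qubit phase (rad) from a sampled differential light shift delta (rad/s).

    If echo_index is given, an ideal instantaneous pi-pulse is assumed at that
    sample; phase accumulated from that sample onward enters with opposite sign.
    """
    sign = np.ones_like(delta)
    if echo_index is not None:
        sign[echo_index:] = -1.0
    return float(np.sum(sign * delta) * dt)

move_time = 200e-6                 # hypothetical duration of one move, s
n = 2000                           # number of time samples
dt = move_time / n
t = np.arange(n) * dt

# Toy profile: accelerate during the first half, decelerate symmetrically in the second.
accel = np.sin(2 * np.pi * t / move_time)
# Assumed light shift that depends only on the magnitude of the acceleration.
delta = 2 * np.pi * 500.0 * accel ** 2

phase_no_echo = accumulated_phase(delta, dt)
phase_echo = accumulated_phase(delta, dt, echo_index=n // 2)
print(f"phase without pi-pulse: {phase_no_echo:.4f} rad")
print(f"phase with midpoint pi-pulse: {phase_echo:.2e} rad")
```

Moving `echo_index` away from `n // 2`, or making the profile asymmetric, leaves a residual phase, which is why the symmetric case favors a pulse at the midpoint.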

[0080] Current fault-tolerant quantum computing architectures are primarily based on surface codes and lattice surgery, because such operations are convenient to lay out in a 2D-local planar architecture. However, neutral atom arrays, trapped ions, silicon quantum dots, photonics, and other systems may allow non-local connectivity, opening up new architectural possibilities. This application describes how transversal gates in such an architecture can lead to substantial reductions in the space-time volume of fault-tolerant quantum computation.

[0081] Conventional lattice surgery schemes with 2D topological codes require O(d²) space per logical qubit, and each logical operation typically requires O(d) time to be fault-tolerant. Here, d is the code distance, characterizing the error correcting capability of a given code. For example, both a CNOT gate and an S gate in the surface code require this O(d) time cost when implemented with lattice surgery. On the other hand, transversal gates, such as the transversal CNOT in CSS codes, only require a single time step yet are still fault-tolerant. This suggests a reduction of the time cost of running a logical operation from O(d) to O(1), and a corresponding space-time overhead reduction from O(d³) to O(d²). In the deep fault-tolerant regime, code distances on the order of d ~ 30 are expected to be needed for sufficient error suppression, which means that this method can give rise to a 30x reduction in resources required.

In the present application, various building blocks are provided that support and enable this architecture. This includes:
• Reduction of error-correction overhead from O(d³) to O(d²) by using transversal gates.
• Correlated decoding techniques to enable this in the context of transversal algorithm execution.
• The use of such techniques in key algorithmic gadgets such as magic state factories and quantum arithmetic.
• The direct logical qubit connectivity enabled by this approach, which can reduce routing overhead.
• The hardware-efficient implementation of this architecture in neutral atoms.
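The scaling argument can be made concrete with a short calculation. This is only an order-of-magnitude sketch: it assumes d² physical qubits per logical qubit, d syndrome-extraction rounds per lattice-surgery operation, a single time step per transversal operation, and all constant factors set to 1. The function name is illustrative.

```python
def spacetime_volume(d, rounds_per_logical_gate):
    """Space-time volume of one logical operation, with unit constant factors."""
    physical_qubits = d * d            # assumed O(d^2) footprint of one code patch
    return physical_qubits * rounds_per_logical_gate

d = 30                                 # code distance discussed for the deep fault-tolerant regime
lattice_surgery = spacetime_volume(d, rounds_per_logical_gate=d)   # O(d^3)
transversal = spacetime_volume(d, rounds_per_logical_gate=1)       # O(d^2)
reduction = lattice_surgery / transversal

print(lattice_surgery, transversal, reduction)
```

The ratio is simply d, which is where the factor of roughly 30 at d ~ 30 comes from.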

[0082] Referring to Fig. 1, a quantum information architecture enabled by coherent transport of neutral atoms is illustrated. Qubits are transported to perform entangling gates with distant qubits, enabling programmable and nonlocal connectivity. Atom shuttling is performed using optical tweezers, with high parallelism in two dimensions and between multiple zones allowing selective manipulations. The inset shows the atomic levels used: the |0⟩, |1⟩ qubit states refer to the m_F = 0 clock states of ⁸⁷Rb, and |r⟩ is a Rydberg state used for generating entanglement between qubits, which are further described with regard to Fig. 2.

[0083] Fig. 2 is a level diagram showing key ⁸⁷Rb atomic levels used. The Rydberg excitation scheme from |1⟩ to |r⟩ is composed of a two-photon transition driven by a 420-nm laser and a 1013-nm laser. A DC magnetic field of B = 8.5 G is applied throughout this work.

[0084] As noted above, quantum information systems derive their power from controllable interactions that generate quantum entanglement. However, the natural, local character of interactions limits the connectivity of quantum circuits and simulations. Nonlocal connectivity can be engineered via a global shared quantum data bus, but these approaches are limited in either control or size.

[0085] According to various embodiments of the present disclosure, this long-standing challenge is addressed through dynamically reconfigurable arrays of entangled neutral atoms, shuttled by optical tweezers in two spatial dimensions. Hyperfine states are used for storing and transporting quantum information in between quantum operations, and excitation into Rydberg states is used for generating entanglement. Highly parallel operations are enabled via selective qubit operations in distinct zones that qubits are dynamically shuttled between. Taken together, these ingredients enable a powerful quantum information architecture, which is employed to realize applications including entangled state generation, creation of topological surface and toric code states, and hybrid analog-digital quantum simulations.

[0086] Within this architecture, programming a specific quantum circuit entails control over only a few optical degrees of freedom. Arbitrary tweezer positions in space are controlled by a computer-generated hologram, hundreds of atoms are dynamically reconfigured in parallel by two waveforms in a 2D acousto-optic deflector (AOD), and qubit operations are realized by pulsing optical beams. This flexible optical control enables sophisticated quantum circuits with only a few classical controls. This architecture enables an inherently scalable approach: larger codes require no increase in the number of classical controls.

[0087] Various quantum circuits are realizable with this approach, including quantum error correction (QEC) codes such as the surface and Steane codes, with fidelities in this disclosure already comparable to state-of-the-art experiments in other platforms. Moreover, the parallelized, nonlocal connectivity is used to create the toric code state on a torus.

[0088] Fig. 3 illustrates implementation of the toric code state encoding two protected qubits obtained using mobile ancilla qubit arrays. The top illustrates a graph state realizing the two logical-qubit product state |+⟩₁|+⟩₂ of the toric code upon projective measurement of the ancilla qubits in the X-basis. The bottom includes images showing the movement steps implemented in creating and measuring the toric code state. Shading in the final image represents a local rotation on the data qubit zone.

[0089] Referring to Fig. 4, a quantum processing unit (QPU) according to the present disclosure is illustrated. This design is centered around efficient classical control over many logical qubits in parallel using optical beams. Single-qubit logical gates can be realized transversally, for example, by illuminating all physical qubits within the same logical qubit block with an optical beam. Two-qubit logical gates can also be realized transversally, by interlacing two logical arrays of qubits and applying a global optical pulse that entangles each pair of corresponding physical qubits.

[0090] Neutral atom systems have the potential for utility scale computing: for example, millions of identical neutral atom qubits may be trapped in mm-scale regions of space. The key challenge is the classical control required to assemble these qubits into a large-scale quantum processor. Full programmability of single physical qubits generally requires highly complicated classical control techniques in order to operate on millions of qubits. In contrast, the architectures provided herein allow for full programmability of single logical qubits while only requiring a few classical controls per logical qubit. This enables reaching utility-scale by encoding logical qubits into blocks that can be efficiently controlled in parallel. Using advanced optical microscopy systems (such as those utilized for modern industrial-scale lithography) with high numerical aperture and large field of view exceeding several millimeters, and appropriately scaled trapping laser power, direct trapping and manipulation of over a million qubits is possible. Further scaling is possible by creating 10-100 such processing units, each under its own microscope objective, and then connecting these units together utilizing photonic links and / or optical lattice transport. This allows for sufficient space, resolution, and power density for enacting high-fidelity control over 10M qubits and beyond.

[0091] QPU 400 is segmented into several key zones: a storage zone 411, entangling zone 412, readout zone 413, atom loading zone 404, and remote entangling zone 405. Storage zone 411, entangling zone 412, and readout zone 413 form processor core 401, which in some embodiments contains 10⁴ to 10⁶ qubits in a footprint of 0.5-5 mm. Fresh atoms are continuously reloaded from distant atom loading zone 404, and a distant remote entangling zone 405 (using optical interconnects and/or lattice transport) delivers remote Bell pair entanglement resources.

[0092] In storage zone 411, idle logical qubits are stored for long times, utilizing the long qubit coherence times and high fidelity single-qubit gates, such that an error-correction cycle is only required before a logical two-qubit gate. For coherence times of 10-100 seconds, and assuming performance 10x below threshold, roughly 1% single-qubit dephasing errors can be tolerated before a round of d cycles of error correction. This corresponds to approximately 0.1-1 second of allowed storage time before correction is required. Due to the all-to-all connectivity provided by the presently described architectures, idle logical qubits can simply be kept in the storage zone, safe from additional errors. Logical qubits are thus stored in dense blocks, shuttled out when they are needed in the algorithm, and only error-corrected before a two-qubit gate, greatly reducing the error correction overhead. In various exemplary devices, atoms are stored at densities of approximately 1/(2 μm)² in the dense storage zone, and densities of approximately 1/(10 μm)² in the active zone.
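The storage-time estimate above follows from simple arithmetic, sketched here under the assumption that the dephasing error probability grows linearly with idle time (error ≈ t / T_coherence for t much shorter than the coherence time). The function name and the linear error model are illustrative assumptions.

```python
def allowed_storage_time(coherence_time_s, tolerable_error=0.01):
    """Idle time (s) after which the assumed linear dephasing error reaches the budget."""
    return tolerable_error * coherence_time_s

# Coherence times of 10 s and 100 s with a 1% dephasing budget.
for coherence_time in (10.0, 100.0):
    t_store = allowed_storage_time(coherence_time)
    print(f"T_coherence = {coherence_time:5.0f} s -> storage time ~ {t_store:.1f} s")
```

A different decay law (for example Gaussian dephasing) would change the prefactor but not the order of magnitude for a 1% budget.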

[0093] The active logical qubits are manipulated in active zone 412. By utilizing qubit transport, all combinations of two-qubit gates can be performed in a fixed region of space. This significantly reduces the classical control complexity. For example, all two-qubit gates can be performed using a single, global optical beam, which is dramatically simpler than calibrating each individual qubit. This exceptional degree of parallelism for logical qubit control is a significant advantage of the present architecture relative to alternatives such as those involving individual control of atomic qubits.

[0094] Readout zone 413 allows selectively reading out a subset of qubits mid-circuit without disturbing the other qubits. This readout happens in parallel with a global beam and a camera, again requiring only one set of classical controls.

[0095] Outside of the core processor 401, atoms are constantly reloaded from loading zone 404 and transported into the core processor for running arbitrarily long circuits. Remote Bell pairs with other processing units are generated using optical links and / or optical lattice transport 405, and are shuttled into the core processor 401 for creating remote logical entanglement. This allows interconnection of 10-100 single processing units into one error-corrected, utility-scale quantum computer.

[0096] The architecture provided above allows for mid-circuit readout. In particular, this architecture may be paired with fast imaging in the readout zone and a classical control loop. In addition, various methods may be used to suppress crosstalk errors and detect / correct for loss. Arbitrarily long circuit depths may be achieved with continuous reloading of atoms and further crosstalk suppression.

[0097] To connect multiple units, many high-fidelity, long-distance Bell pairs may be generated in parallel, using lattice transport and / or photonic links.

[0098] It will be appreciated that the present architecture is suitable for logical state preservation by repetitive mid-circuit measurement and correction. In addition, a surface code logical qubit may be implemented, for example by moving ancillas from a storage zone reservoir, entangling with data qubits for syndrome extraction, and moving to the readout zone. This allows fast mid-circuit readout and feedback while preserving coherence on data qubits. In various embodiments, the data qubits are protected by placing the imaging zone ~50 microns away, thereby suppressing crosstalk from the readout beam and scattered light by the ancilla atoms.

[0099] In various embodiments, a fast classical control loop uses ancilla measurements to determine errors on the data qubits, and to detect and correct qubit loss. Lost qubits may then be replaced with reservoir atoms. In order to reach surface code distances several times larger than the largest codes created in alternative systems, local detuning patterns may be utilized for space-efficient use of the entangling zone.

[0100] The presently described architectures may also be used to perform algorithms with logical qubits. The zoned approach combined with efficient optical control over many logical qubits in parallel allows construction of large-scale processors. In an exemplary use case, ~10 logical qubits are encoded in the active zone and moved to the storage zone. After encoding all logical qubits, the algorithm is run with appropriate logical single-qubit and logical two-qubit gates. The flexible, local single-qubit control required for logical single-qubit gates is implemented with Raman light from a 2D AOD illuminating the grid of a single code block. Logical two-qubit gates are realized transversally in the entangling zone. Mid-circuit readout is used for the non-Clifford gate-teleportation sequence, followed by fast feedback for logical single-qubit rotation.

[0101] It will be appreciated that while certain operating parameters are provided below by way of example, lower two-qubit gate errors may be achieved through various further optimizations. For example, increasing Rydberg laser power and detuning will reduce laser scattering errors and also suppress other errors by increasing gate speed. Cooling atoms to the motional ground state (thereby suppressing Doppler dephasing errors), and utilizing 10x higher laser power, theoretically results in >99.8% gate fidelities. Further improvements can be made with continued increases in laser power, but alternative routes, such as single-photon excitation to Rydberg P states or alkaline-earth-based systems, are also available. Processor speed can be increased to a ~10 microsecond logical qubit cycle time by increasing collection efficiency or utilizing cavity-based or ensemble-based readout schemes, or by increasing movement speed with deeper optical tweezers.

[0102] To reach arbitrarily deep circuits, atoms may be continuously reloaded. Accordingly, some embodiments employ loading into a distant magneto-optical trap (MOT) and transporting atoms in an optical lattice conveyor belt.

[0103] In various embodiments, cross-talk during readout is suppressed by moving the ancilla atoms away from the data qubits.

[0104] Further scaling of the quantum processors can be achieved by connecting more than one microscope objective, either through atom transport or optical communication links. In various embodiments, the first approach utilizes the novel capabilities of atom rearrangement, combined with the use of optical lattice conveyor belts to coherently transport qubits between multiple active optical control regions and distribute entanglement. In various embodiments, the second approach utilizes photon-mediated entanglement between distinct atom array nodes with >10⁴ qubits. High entanglement rates can be achieved through parallel nanophotonic or bulk optical cavities, and the large sizes of atom arrays can provide further parallelism. This approach also enables modular construction of quantum processor units, flexibly rewired and linked together.

[0105] Referring to Fig. 5, a schematic view is provided of logical qubits, illustrating efficient control of single logical qubits by parallelized optical control of the physical qubit blocks that constitute the logical qubits. Logical qubits 501, 502, 503, 504 are each made up of 13 atomic qubits (shown as circles). It will be appreciated that the number and arrangement of qubits is purely exemplary. A variety of qubit blocks are known in the art and are suitable for use as described herein. For example, the 2D surface code and the 2D color code are particularly suitable due to their high thresholds and simple 2D structure. However, a variety of other codes are available that include the transversal CNOT, such as the 3D color code and 3D toric code. In various embodiments, a single laser beam is configured to illuminate a given logical qubit when positioned in an active zone of a processor (e.g., active zone 412). This is illustrated by beam 511 illuminating logical qubit 501. In some embodiments, a single laser beam is configured to illuminate a plurality of logical qubits when positioned in an active zone of a processor (e.g., active zone 412). This is illustrated by beam 513 illuminating logical qubits 503, 504, 505.

[0106] This structure allows for the application of transversal logical gates. To apply a transversal logical gate on one logical qubit, the corresponding physical qubit gate is performed on each physical qubit in the block making up the logical qubit.

[0107] For example, to perform a transversal single-qubit gate, the same single-qubit rotation is applied to each physical qubit in the block by illuminating that entire spatial block (e.g., 501) with one beam that covers all physical qubits (e.g., 511). In various embodiments, this is realized by creating a grid of Raman beams using a crossed AOD device in order to illuminate a grid of one surface code. This single-qubit example is shown with beams 511, 512, where surface code blocks 501, 502 (the connected grid of 13 atoms) are illuminated with beams that come out of a microscope objective and into the plane of the atoms. In this example, two logical blocks 501, 502 are illuminated in parallel. This is advantageous in various use cases, but one code block may also be illuminated at a time.

[0108] This structure allows for the application of transversal logical multi-qubit gates between two or more logical qubits. A transversal logical multi-qubit gate is a logical gate wherein each atomic (i.e., physical) qubit of one logical qubit is coupled to only one atomic qubit of another logical qubit, and therefore errors do not spread to other atomic qubits by propagation. Referring to Fig. 6, a schematic view is provided of logical qubits, illustrating the application of a transversal controlled-NOT (CNOT) gate. As in Fig. 5, each of a plurality of logical qubits is made up of 13 atomic qubits (shown as circles). To perform a transversal CNOT on two logical qubits (e.g., 601, 602) a physical qubit CNOT is performed on each pair of the two logical blocks. The architectures provided herein enable moving groups of atoms in parallel in order to efficiently perform logical transversal CNOTs between any two logical qubit blocks.

[0109] In particular, a logical qubit block is picked up with a crossed AOD and moved to interlace with another logical qubit within the same 2D plane, which is stored in a different set of optical tweezers (e.g., a backbone SLM grid). When interlaced, each atomic qubit of one logical qubit is within a blockade radius of exactly one corresponding atomic qubit of the other logical qubit. A single pulse of a global Rydberg laser is applied (e.g., beam 611). This realizes a transversal CNOT between the two logical qubits in a single, parallel step. Transversal CNOTs may be performed in parallel on multiple logical qubits at the same time, as is shown in Fig. 6.
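The interlacing condition, that each atom has exactly one partner within the blockade radius, can be checked geometrically. The sketch below uses hypothetical numbers that are not taken from the disclosure: a 3 × 3 block on a 10 μm pitch, a 2 μm offset for the moved block, and a 4 μm blockade radius. The helper names are illustrative.

```python
import numpy as np

def grid(nx, ny, pitch, origin=(0.0, 0.0)):
    """Positions (in micrometers) of an nx-by-ny block of atoms on a square grid."""
    xs = origin[0] + pitch * np.arange(nx)
    ys = origin[1] + pitch * np.arange(ny)
    return np.array([(x, y) for x in xs for y in ys])

def within_blockade(block_a, block_b, radius):
    """Boolean matrix: entry [i, j] is True if atom i of A is within radius of atom j of B."""
    diff = block_a[:, None, :] - block_b[None, :, :]
    return np.linalg.norm(diff, axis=-1) < radius

pitch_um = 10.0      # hypothetical spacing inside each logical block
offset_um = 2.0      # hypothetical displacement of the moved block after interlacing
blockade_um = 4.0    # hypothetical blockade radius

static_block = grid(3, 3, pitch_um)                          # e.g. held in the static tweezer grid
moved_block = grid(3, 3, pitch_um, origin=(offset_um, 0.0))  # e.g. carried by the crossed AOD

close = within_blockade(static_block, moved_block, blockade_um)
one_partner_each = bool(np.all(close.sum(axis=1) == 1) and np.all(close.sum(axis=0) == 1))
print("every atom has exactly one partner:", one_partner_each)
```

With these numbers the partner of each atom is its same-index counterpart at 2 μm, while every other atom is at least 8 μm away, so a single global pulse would act on disjoint pairs only.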

[0110] Transversal CNOTs are allowed, fault-tolerant operations between any two Calderbank-Shor-Steane (CSS) codes, a broad class of codes encompassing surface codes, color codes (e.g., the Steane code), hypergraph product low density parity check (LDPC) codes, etc. The key intuition is that a CNOT propagates X on the control qubit to X on the target qubit, and Z on the target qubit to Z on the control qubit. For a CSS code, the logical qubit operators are products of X and Z, and the logical CNOT is formed of products of physical qubit CNOTs. Accordingly, logical X on the control logical qubit will propagate to logical X on the target logical qubit, and logical Z on the target logical qubit will propagate to logical Z on the control logical qubit. This follows the rules of a CNOT on the logical qubit level. Accordingly, this implements a transversal logical CNOT.
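The single-qubit propagation rules behind this intuition can be verified directly with 4 × 4 matrices. The sketch below assumes the first qubit is the control and the second is the target; in that convention X flows from control to target and Z flows from target to control. The helper name `conjugate` is illustrative.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

# CNOT with the first qubit as control and the second qubit as target.
CNOT = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0, 1.0],
                 [0.0, 0.0, 1.0, 0.0]])

def conjugate(pauli):
    """Return CNOT * pauli * CNOT^dagger (CNOT is real and self-inverse)."""
    return CNOT @ pauli @ CNOT.T

rules = {
    "X on control -> X on both": (np.kron(X, I2), np.kron(X, X)),
    "Z on control -> unchanged": (np.kron(Z, I2), np.kron(Z, I2)),
    "X on target  -> unchanged": (np.kron(I2, X), np.kron(I2, X)),
    "Z on target  -> Z on both": (np.kron(I2, Z), np.kron(Z, Z)),
}
checks = {name: bool(np.allclose(conjugate(before), after))
          for name, (before, after) in rules.items()}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Because each physical CNOT in a transversal gate touches one qubit per block, applying these rules qubit by qubit gives the corresponding rules for products of X and products of Z across a block.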

[0111] More particularly, a controlled-NOT can be performed bitwise on any CSS code. Consider the operations on M⊗I and I⊗M, where M is a stabilizer generator. In the first case, if M is an X generator, M⊗I becomes M⊗M. Since both the first and second blocks have the same stabilizer S, this is an element of S × S. If M is a Z generator, M⊗I becomes M⊗I again. Similarly, if M is an X generator, I⊗M becomes I⊗M, and if M is a Z generator, I⊗M becomes M⊗M, which is again in S × S. For an arbitrary CSS code, the X̄_i operators are formed from products of Xs, and the Z̄_i operators are formed from products of Zs. Therefore:
$$\bar{X}_i \otimes I \to \bar{X}_i \otimes \bar{X}_i, \qquad \bar{Z}_i \otimes I \to \bar{Z}_i \otimes I, \qquad I \otimes \bar{X}_i \to I \otimes \bar{X}_i, \qquad I \otimes \bar{Z}_i \to \bar{Z}_i \otimes \bar{Z}_i$$
Equation 1

[0112] Thus, the bitwise CNOT produces an encoded CNOT for every encoded qubit in the block.
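As a concrete instance, the bitwise CNOT can be checked on two blocks of the Steane code using the binary (symplectic) representation of Pauli operators. This is a sketch with assumptions: signs are ignored, the first block is the control, the Steane code is chosen only as a small example of a CSS code, and all helper names are illustrative.

```python
import numpy as np

# Parity checks of the [7,4] Hamming code; each row defines one X-type and one
# Z-type stabilizer generator of the Steane code.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)
n = H.shape[1]
zero = np.zeros(n, dtype=np.uint8)
ones = np.ones(n, dtype=np.uint8)

def pauli(xa=zero, xb=zero, za=zero, zb=zero):
    """Two-block Pauli as bits [x on A | x on B | z on A | z on B]."""
    return np.concatenate([xa, xb, za, zb])

def transversal_cnot(v):
    """Bitwise CNOT, block A control and block B target: X copies A->B, Z copies B->A."""
    xa, xb, za, zb = np.split(v, 4)
    return np.concatenate([xa, xb ^ xa, za ^ zb, zb])

def gf2_rank(rows):
    """Rank over GF(2) by Gaussian elimination."""
    m = np.array(rows, dtype=np.uint8) % 2
    rank = 0
    for col in range(m.shape[1]):
        pivots = [r for r in range(rank, m.shape[0]) if m[r, col]]
        if not pivots:
            continue
        m[[rank, pivots[0]]] = m[[pivots[0], rank]]
        for r in range(m.shape[0]):
            if r != rank and m[r, col]:
                m[r] ^= m[rank]
        rank += 1
        if rank == m.shape[0]:
            break
    return rank

# Generators of S x S: each generator acting on block A alone or block B alone.
generators = np.array([pauli(xa=h) for h in H] + [pauli(xb=h) for h in H] +
                      [pauli(za=h) for h in H] + [pauli(zb=h) for h in H])

def in_stabilizer_group(v):
    """True if v lies in the GF(2) span of the generators of S x S."""
    return gf2_rank(np.vstack([generators, v])) == gf2_rank(generators)

all_preserved = all(in_stabilizer_group(transversal_cnot(g)) for g in generators)
x_maps = np.array_equal(transversal_cnot(pauli(xa=ones)), pauli(xa=ones, xb=ones))
z_maps = np.array_equal(transversal_cnot(pauli(zb=ones)), pauli(za=ones, zb=ones))
print(all_preserved, x_maps, z_maps)
```

The first flag checks that S × S is mapped into itself; the other two check that logical X on the control block and logical Z on the target block spread as an encoded CNOT requires.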

[0113] Without a transversal CNOT, entangling operations between logical qubits often have to be done with braiding or lattice surgery. These are significantly less efficient than the transversal CNOT. For example, both braiding and lattice surgery require d rounds of stabilizer measurement in order to be fault-tolerant, whereas no rounds of stabilizer measurement are required to make the transversal CNOT fault-tolerant: its fault-tolerance is already guaranteed by the fact that the CNOT is transversal. Transversal gates are inherently fault-tolerant because, as described above, no error can spread from one qubit in the block to another qubit in the same block. Braiding and lattice surgery are thus much more resource intensive, requiring multiple rounds of stabilizer measurement, and are also slower and have a lower threshold.

[0114] The threshold of a two-qubit (2Q) gate when performing a transversal CNOT is given by the threshold for perfect syndrome extraction, and as such should be roughly 10% (surface code), whereas the 2Q gate threshold with repeated syndrome extraction is roughly 1% (surface code). The ability to perform this transversal CNOT efficiently using only a few classical controls, in a way that is independent of code size, is key to simplifying the classical controls required for building a large-scale quantum computer.

[0115] The atom movement and parallel optical control of logical qubit blocks greatly simplifies the controls required for realizing logical quantum computation. In various embodiments, logical qubits are multiplexed into grids, and each logical qubit block behaves much like one large atom. To perform a single-qubit rotation on a logical qubit block, it is illuminated with one beam. To perform a CNOT between two logical qubit blocks, they are moved together and pulsed with one Rydberg beam. With this highly efficient parallelized control, logical qubit algorithms can be performed on logical qubits.

[0116] In exemplary embodiments, a two-qubit CZ gate is implemented by two global Rydberg pulses, with each pulse at detuning Δ and length τ, and with a phase jump ξ between the two pulses. The pulse parameters are chosen such that qubit pairs, adjacent and under the Rydberg blockade constraint, will return from the Rydberg state back to the hyperfine qubit manifold with a phase depending on the state of the other qubit.

[0117] Referring to Fig. 7, a schematic view of a portion of a processor core, such as processor core 401, is provided with exemplary measurements and qubit arrangements. In this example, storage zone portion 701 measures 145 × 40 μm, active zone portion 702 measures 145 × 40 μm, and readout zone portion 703 measures 145 × 20 μm. Storage zone portion 701 is separated from active zone portion 702 by a 20 μm buffer. Active zone portion 702 is separated from readout zone portion 703 by a 20 μm buffer.

[0118] In this example, active zone portion 702 has 50 positions, separated by 16 μm in one dimension and 10 μm in the other dimension. Each dot represents an atomic qubit, and so, in this example, qubits are located proximate to each other when interlaced to perform a bitwise operation. Storage zone portion 701 has 250 positions, separated by 8 μm in one dimension and 4 μm in the other dimension. Accordingly, storage zone portion 701 has a higher density than active zone portion 702.

[0119] Alternative arrangements are provided in Figs. 8-9. For example, the arrangement in Fig. 8 is suitable for use in implementing a repetition code. In another example, the arrangement in Fig. 9 is suitable for use in implementing a surface code.

[0120] Referring to Fig. 10, a schematic view of a method of active, feedforward QEC is provided. In this example, a portion of a quantum processor (such as depicted in Fig. 4) is depicted, including a reservoir 1001 (such as loading zone 404), an active zone 1002 (such as active zone 412), and a readout zone 1003 (such as readout zone 413). Ancilla qubits are continually replenished 1004 from reservoir 1001 to replace ancilla qubits that are moved 1005 from active zone 1002 to readout zone 1003 to be measured. Using the zoned architecture provided herein enables a complete QEC round within 1 ms.

[0121] Transversal Logical Gates

[0122] Transversal logical gates, which act by applying the same gate on each individual physical qubit of a single code patch or between multiple code patches, are widely regarded as the simplest and best-performing technique to achieve fault-tolerant entangling gates. This is because by acting only on individual physical qubits within a code patch, they do not spread errors within each patch and are inherently fault tolerant. Moreover, unlike lattice surgery, which often requires d rounds of repetition in time to achieve fault-tolerance, transversal CNOTs only require a single time step (see Fig. 12). This reduces the number of physical gates applied, potentially reducing the number of errors incurred. Thus, the use of transversal CNOTs has the potential to reduce both the space-time resources required and the logical error rate for FTQC.

[0123] Referring to Fig. 12, an illustration is provided of logical operation time for lattice surgery and transversal CNOTs. The latter does not require d rounds of repetition to be fault-tolerant.

[0124] There is little detailed analysis of transversal CNOT gate performance, primarily because of implementation challenges in conventional 2D planar architectures. However, breakthroughs in neutral atom array and trapped ion technologies have enabled the possibility of long-range connectivity via dynamic reconfigurability, making it possible to directly bring two code patches together and execute a transversal CNOT gate.

[0125] Of particular interest are dynamically-reconfigurable neutral atom arrays, where qubits are encoded in long-lived hyperfine states with second-long coherence times, and entangling gate operations are mediated by transient excitation into strongly-interacting Rydberg states. Parallel shuttling of atoms using acousto-optic deflectors (AODs) then allows the efficient reconfiguration of atom locations. Qubit numbers as large as 289 have been experimentally demonstrated, and can be readily scaled to the thousands by increasing laser power. Using the dynamical reconfigurability of neutral atoms, a variety of quantum error correction codes are possible, including the toric code on a torus.

[0126] A key feature of this platform is its high degree of parallelism and low control overhead, which is naturally suited to quantum error correction: two control wires are sufficient to program a pair of AODs and generate a large grid of traps, allowing the manipulation of an entire logical qubit consisting of hundreds of physical qubits with only a handful of control channels. With this approach, performing a transversal CNOT is as simple as interleaving two code patches using parallel AOD control and shining a global Rydberg beam that performs a physical CNOT between each pair of qubits in the two code patches. This highly resource efficient implementation of the transversal CNOT may significantly simplify the realization of FTQC, and reduce the overhead of quantum algorithms.

[0127] These considerations enable reducing the unit space-time cost of logical operations from O(d³) in lattice surgery to O(d²) with transversal gates, a factor that can be on the order of 30 for typical code distances d ≈ 30 assumed in the fault-tolerant regime.

[0128] However, a key question is how to appropriately decode such a circuit to maintain the full code distance and have good logical error performance. The correlated decoding methods described below, which jointly decode multiple logical qubits in a quantum algorithm, allow minimal degradation in the threshold and obtain promising logical error rates even when there are a relatively small number of rounds of error correction (much less than d) between each pair of transversal operations, thus fulfilling the promise of large overhead reductions.

[0129] Heuristic Scaling Analysis

[0130] In this section, a heuristic estimate is provided of the expected logical error rate as a function of the number of CNOTs per syndrome extraction round. This heuristic estimate depends on several assumptions, but provides some intuition for the expected performance and overhead reduction of the architectures provided herein.

[0131] For a regular surface code memory, the logical error rate per round of syndrome extraction can be well-approximated by the scaling formula
$$P_L = C \left( \frac{p}{p_{th}} \right)^{\frac{d+1}{2}} \qquad \text{(Equation 2)}$$
where d is the code distance (assumed to be odd for simplicity), p_th is the error threshold, p is the physical error rate, and C is a constant.

[0132] Separating this expression in terms of the underlying CNOT gate error rate p_CNOT and the error rate of other operations p_0 allows rewriting the above expression as
$$P_L = C \left( \frac{4 p_{CNOT} + \beta p_0}{\gamma p_{th}} \right)^{\frac{d+1}{2}} \qquad \text{(Equation 3)}$$

[0133] The number 4 reflects the fact that 4 CNOTs are performed on each data qubit in one round of syndrome extraction, and the prefactors β and γ are scaling factors that can be obtained from numerical simulations.

[0134] This analysis is now generalized to the case of transversal CNOTs with interleaved rounds of syndrome measurement. Assuming x CNOTs are performed per syndrome measurement round, the following ansatz is used for the logical error per logical qubit per CNOT:
$$P_L = \frac{C}{x} \left( \frac{(\alpha x + 4)\, p_{CNOT} + \beta p_0}{\gamma p_{th}} \right)^{\frac{d+1}{2}} \qquad \text{(Equation 4)}$$

[0135] The factor α is added to account for the extra error due to the decoding problem being somewhat more complex. Note that the constants may be modified from the memory setting, and some of the coefficient dependencies may be more complicated than are described here. However, as a heuristic estimate, this is a good starting point. This formula can be understood as follows: the rate of syndrome measurements sets how often one is extracting entropy out of the system, so one may expect to have a comparable threshold but an elevated effective physical error rate, as captured by the αx term. As one performs x CNOTs per round of syndrome measurement, one divides by x outside the parentheses to estimate the logical error per logical qubit per CNOT. The spacetime cost per CNOT can then be estimated to be
$$V = \left( 1 + \frac{4}{x} \right) d^2 \qquad \text{(Equation 5)}$$

[0136] The alternative method of fault-tolerance, which involves a full d rounds of syndrome measurement for each transversal gate, corresponds to the case where x = 1/d.
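As a rough numerical illustration of this comparison, the following sketch evaluates a spacetime cost of the form V(x) = (1 + 4/x) d² for x = 1/d (d rounds per CNOT) and for x = 1 (one round per CNOT). The function name and the choice of measuring one syndrome extraction round as 4 CNOT-layer durations are assumptions of the sketch, not part of the disclosure.

```python
def spacetime_cost_per_cnot(d: int, x: float, round_cost: float = 4.0) -> float:
    """Heuristic spacetime cost per transversal CNOT.

    d          -- code distance (d*d data qubits per patch)
    x          -- logical CNOTs performed per syndrome extraction round
    round_cost -- assumed duration of one round, in units of a CNOT layer
    """
    return (1.0 + round_cost / x) * d * d


if __name__ == "__main__":
    d = 31
    slow = spacetime_cost_per_cnot(d, x=1.0 / d)  # d rounds per CNOT
    fast = spacetime_cost_per_cnot(d, x=1.0)      # one round per CNOT
    print(f"d rounds per CNOT : {slow:.0f}")
    print(f"1 round per CNOT  : {fast:.0f}")
    print(f"ratio             : {slow / fast:.1f}")
```

Under these assumptions the ratio grows linearly with d, which is the O(d³) versus O(d²) separation discussed above.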

[0137] Having established a heuristic scaling formula, the following questions may be posed: for a given target logical error rate, what is the smallest spacetime cost per CNOT? How many rounds of QEC does this correspond to? If the optimal number of rounds of QEC is less than d, then this would provide an indication that one can indeed substantially reduce the resource overhead required.

[0138] One can perform the above optimization by solving for the code distance that provides sufficient error suppression for a given set of parameters, and then calculating the spacetime cost. For a particular choice of β = 1 and γ = 12, motivated by counting of error channels in some error models, one finds the following condition for the optimal number of CNOTs per round:
$$\log^2\!\left( \frac{p_0 + 4 p_{CNOT} + \alpha p_{CNOT} x}{12 p_{th}} \right) = \frac{\alpha p_{CNOT}\, x (4 + x)}{p_0 + p_{CNOT} (4 + \alpha x)} \log\!\left( \frac{P_L x}{C} \right) - \left( 4 + x - \log\!\left( \frac{P_L x}{C} \right) \right) \log\!\left( \frac{p_0 + 4 p_{CNOT} + \alpha p_{CNOT} x}{12 p_{th}} \right) \qquad \text{(Equation 6)}$$

[0139] This analysis will also hold for other, more accurate parameter choices. To simplify the above expression, the following definitions are made:
$$z = \log\!\left( \frac{P_L x}{C} \right) \qquad \text{(Equation 7)}$$
$$w = \log\!\left( \frac{p_0 + 4 p_{CNOT} + \alpha p_{CNOT} x}{12 p_{th}} \right) \qquad \text{(Equation 8)}$$

[0140] Although these are in principle functions of x, since they are inside a log, they may be treated as roughly constant. One then has
$$w^2 = z\, \frac{\alpha p_{CNOT}\, x (4 + x)}{p_0 + p_{CNOT} (4 + \alpha x)} - (4 + x - z)\, w \qquad \text{(Equation 9)}$$

[0141] To understand the asymptotics, it is noted that tightening the logical error requirement scales z, so one wants to match the coefficients of the terms linear in z. Thus, the condition is observed that asymptotically,
$$\frac{\alpha p_{CNOT}\, x (4 + x)}{p_0 + p_{CNOT} (4 + \alpha x)} + w = 0 \qquad \text{(Equation 10)}$$

[0142] For fixed error rates, this is just a fixed equation for x that does not scale with the code distance or target logical error rate, indicating that performing a constant number of CNOTs per round of error correction minimizes the spacetime volume. Thus, this heuristic analysis provides evidence for O(d²) spacetime overhead with the use of transversal gates. It also suggests that as one lowers the physical error rates, one can do more transversal CNOTs per round of syndrome extraction.
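The optimization described in this section can also be explored numerically. The sketch below assumes the heuristic ansatz takes the form P_L = (C/x) [((αx + 4) p_CNOT + β p_0) / (γ p_th)]^((d+1)/2) and that the spacetime cost per CNOT is (1 + 4/x) d². The values β = 1 and γ = 12 follow the text; α, p_th, C, the physical error rates, and all function names are placeholder assumptions chosen only for illustration.

```python
def logical_error_per_cnot(d, x, p_cnot, p_0, alpha=1.0, beta=1.0,
                           gamma=12.0, p_th=0.01, c=0.1):
    """Heuristic logical error per logical qubit per CNOT (assumed ansatz)."""
    ratio = ((alpha * x + 4.0) * p_cnot + beta * p_0) / (gamma * p_th)
    return (c / x) * ratio ** ((d + 1) / 2)


def min_distance(target, x, max_d=201, **noise):
    """Smallest odd code distance meeting the target, or None if not reached."""
    for d in range(3, max_d + 1, 2):
        if logical_error_per_cnot(d, x, **noise) <= target:
            return d
    return None


def best_cnots_per_round(target, candidates, **noise):
    """Scan candidate x values; return (cost, x, d) with the lowest cost."""
    best = None
    for x in candidates:
        d = min_distance(target, x, **noise)
        if d is None:
            continue  # this x never reaches the target within max_d
        cost = (1.0 + 4.0 / x) * d * d
        if best is None or cost < best[0]:
            best = (cost, x, d)
    return best


if __name__ == "__main__":
    noise = dict(p_cnot=1e-3, p_0=1e-3)  # hypothetical physical error rates
    for target in (1e-9, 1e-12, 1e-15):
        print(target, best_cnots_per_round(target, [0.25, 0.5, 1, 2, 4, 8], **noise))
```

Running the scan for several targets gives a quick way to check whether the cost-minimizing x stays roughly fixed while d grows, which is the behavior the heuristic analysis predicts.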

[0143] Referring to Fig. 13, an illustration is provided of error propagation in a transversal CNOT, and the resulting detector error model, which has weight-3 hyperedges that make decoding more challenging. Fig. 13C illustrates the expansion process in the generalized union-find decoder described herein.

[0144] Correlated Decoding to Enable Low-Overhead Algorithm Execution

[0145] One challenge of applying and decoding transversal gates is error propagation between code patches. As illustrated in Fig. 13, an X error on the control code patch is propagated to a pair of X errors on both patches through the CNOT, while a Z error on the target patch propagates to a pair of Z errors on both patches. Consequently, this leads to additional hyperedges in the decoding graph (also known as the detector error model), which can be challenging to decode. Indeed, a naive decoding strategy that independently performs matching on the two code patches is not expected to yield a threshold, because many accumulated errors on one side may be copied over, causing the decoder to fail.
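The propagation rule can be made concrete with a small sketch that tracks which physical qubits of each patch carry X or Z error components. It relies only on the standard CNOT conjugation rules (X on control becomes X on control and target; Z on target becomes Z on control and target); the function name and the set-based representation are illustrative choices.

```python
def propagate_through_transversal_cnot(ctrl_x, ctrl_z, tgt_x, tgt_z):
    """Propagate Pauli error supports through a transversal CNOT.

    Each argument is a set of physical-qubit indices (same indexing on both
    patches) where that patch carries an X or Z error component.
    Returns the updated (ctrl_x, ctrl_z, tgt_x, tgt_z).
    """
    new_tgt_x = tgt_x ^ ctrl_x    # X errors are copied control -> target
    new_ctrl_z = ctrl_z ^ tgt_z   # Z errors are copied target -> control
    return set(ctrl_x), new_ctrl_z, new_tgt_x, set(tgt_z)


if __name__ == "__main__":
    # Single X error on control qubit 3 becomes X errors on both patches.
    print(propagate_through_transversal_cnot({3}, set(), set(), set()))
    # Single Z error on target qubit 5 becomes Z errors on both patches.
    print(propagate_through_transversal_cnot(set(), set(), set(), {5}))
```

The symmetric difference (`^`) accounts for cancellation when the same Pauli is already present on the receiving patch.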

[0146] Although this error propagation increases the density of errors, crucially, it happens in a deterministic fashion set by the logical gates applied. Thus, by decoding qubits in a way that accounts for physical error propagation in the specific implemented algorithm, the effects of such spreading could possibly be reduced or even utilized, as physical errors on a given logical qubit contain information about which physical errors occurred on other logical qubits.

[0147] To understand the challenges of decoding in the presence of transversal gates better, consider the detector error model in more detail, which describes how individual error events trigger detectors (checks), which are products of stabilizer measurement results that are deterministic in the absence of errors. For example, Fig. 13B illustrates an X error before the transversal CNOT, which creates two X errors across the two code patches and triggers 4 detectors. As another example, Fig. 13C illustrates the pattern of detectors triggered by a single ancilla measurement error on the control patch, which triggers weight-3 errors. The weight-3 errors are particularly challenging to deal with, as they are odd weight and cause the error decomposition heuristics commonly employed by software packages such as Stim to fail. This is because the heuristics try to decompose errors into lower-weight existing errors, but in the bulk of the detector error model, there are no weight-1 errors (all errors in the bulk produce pairs of syndromes), so there is no clear way to perform the decomposition.

[0148] To address this challenge, several custom decoders are provided that differ substantially from alternative minimum-weight perfect-matching decoders employed in various surface code simulations and experiments. This discussion focuses on two decoders, one based on phrasing the decoding problem as an optimization problem, thereby allowing use of mature integer programming solvers such as Gurobi to perform decoding, and the other based on a generalization of union-find. A few additional decoders are also considered, including a decoding strategy that first performs matching of X errors on the control patch and then updates the detectors on the target patch, as well as a strategy based on belief propagation + ordered-statistics decoding, a decoder for quantum low-density parity-check (qLDPC) codes. However, these additional decoders each have downsides: the first strategy only works when there are no loops of CNOTs with at most d gates in between, while the latter strategy ran too slowly with the particular open-source implementation used.

[0149] The decoders are now described in more detail. The first algorithm uses a state-of-the-art mixed-integer programming solver to exactly or approximately solve for the most-likely error consistent with the measured syndrome. The goal of a mixed-integer programming problem is to maximize an objective function, subject to constraints. The objective function being optimized encodes the total probability of a candidate set of physical error mechanisms occurring in the logical circuit, and the constraints ensure that the candidate error set is consistent with the measured syndrome. Concretely, each physical error source E_j in the circuit is associated with a binary variable that is equal to one if that error occurred, and zero otherwise. Each error source E_j occurs with some known probability p_j determined by the circuit noise model. The goal is to find an assignment of error variables such that the resulting error probability is maximized, subject to the constraint that the error is consistent with the measured syndrome. To be consistent with the measured syndrome, the parity of each detector must match the parity of the errors connected to that detector by a hyperedge in the decoding graph. Concretely, letting f be a map from each detector D_i to the subset of error mechanisms that flip its parity, the most-likely error is given by the solution to the following mixed-integer program:
$$\begin{aligned} \text{maximize} \quad & \sum_{j=1}^{M} \left[ \log(p_j)\, E_j + \log(1 - p_j)\,(1 - E_j) \right] \\ \text{subject to} \quad & \sum_{E_j \in f(D_i)} E_j - 2 K_i = D_i \quad \forall i = 1, \dots, N \\ & E_j \in \{0, 1\} \quad \forall j = 1, \dots, M \\ & K_i \in \mathbb{Z}_{\ge 0} \quad \forall i = 1, \dots, N \end{aligned}$$

[0150] Each variable K_i ensures that the parity of the error variables associated with D_i matches D_i. One can verify that the objective function evaluates to the logarithm of the probability of the assigned error configuration. One may then solve the mixed-integer program to optimality using Gurobi, a state-of-the-art solver, and apply the correction string associated with the error indices j for which E_j = 1 in the optimal assignment. For worst-case circuits, this algorithm can have an exponential runtime. This is consistent with the fact that finding the most likely correlated error is NP-hard and thus unlikely to admit a polynomial time algorithm. Nevertheless, in many practical cases, the algorithm still has a very fast run time.
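To illustrate the optimization problem on a toy instance, the following sketch finds the most-likely error by exhaustive search rather than with a mixed-integer programming solver. The exhaustive loop is a stand-in that is only practical for a handful of error mechanisms; the function name, the example hypergraph, and the probabilities are hypothetical.

```python
import itertools
import math


def most_likely_error(error_detectors, probs, syndrome):
    """Brute-force most-likely error consistent with a syndrome.

    error_detectors[j] -- set of detector indices flipped by error j
    probs[j]           -- prior probability of error j
    syndrome           -- list of 0/1 detector outcomes
    Returns (assignment tuple, log-probability), or (None, -inf) if infeasible.
    """
    best, best_ll = None, -math.inf
    for bits in itertools.product((0, 1), repeat=len(probs)):
        parity = [0] * len(syndrome)
        for j, e in enumerate(bits):
            if e:
                for i in error_detectors[j]:
                    parity[i] ^= 1
        if parity != list(syndrome):
            continue  # violates the parity constraints
        # Same objective as the program above: sum of log p_j or log(1 - p_j).
        ll = sum(math.log(p) if e else math.log1p(-p)
                 for e, p in zip(bits, probs))
        if ll > best_ll:
            best, best_ll = bits, ll
    return best, best_ll


if __name__ == "__main__":
    # Three detectors; error 2 is a weight-3 hyperedge touching all of them.
    errors = [{0, 1}, {1, 2}, {0, 1, 2}, {0}, {2}]
    print(most_likely_error(errors, [0.01] * 5, [1, 1, 1]))
```

In this example the single weight-3 error explains the syndrome with higher probability than any pair of lower-weight errors, so it is selected.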

[0151] For the generalized union-find (UF) decoder, the termination condition of the standard UF decoder is replaced by a condition on a linear system of equations. The UF decoder proceeds in two steps: first, it starts from existing syndromes and attempts to grow an envelope of errors that contains an error configuration consistent with the observed syndromes; ideally, the resulting envelope is separated into multiple disjoint clusters. Second, it applies the peeling decoder to each of the clusters and finds a correction that is consistent with the observed syndrome. A key component of this decoder is the termination condition; for the UF decoder on the surface code, this is typically chosen to be that the total error weight is even, since errors in the surface code generate pairs of detector events. However, for transversal CNOTs, the presence of high-weight errors means that this condition must be generalized. In the approach provided herein, this condition is replaced by the satisfiability of a linear system of equations, which allows direct checking of whether a valid solution exists for the current cluster. An additional benefit of this approach is that solving the linear system of equations automatically provides the corrections to be applied in the second step of the algorithm.

[0152] The generalized union-find decoder expands clusters on decoding graphs. The decoding graph is a bipartite graph whose vertices are divided into two types: detector nodes and error nodes. Each detector event corresponds one-to-one to a detector node in the decoding graph, and each weight-k error event corresponds one-to-one to a degree-k error node in the decoding graph. Each edge in the decoding graph indicates that the corresponding detector event can be triggered by the corresponding error event. Each edge is weighted according to the error rate of its connecting error event. Clusters are defined as subsets of vertices in the decoding graph. Given a sample of detector events, a cluster is called satisfiable if the inner error nodes in this cluster give an error configuration consistent with the given detector events in the cluster. Otherwise, it is called unsatisfiable. The satisfiability of a cluster can be checked by a linear system solver over the binary field.
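A satisfiability check of this kind can be sketched with Gauss-Jordan elimination over the binary field. In the sketch below each equation corresponds to one detector in the cluster, each variable to one inner error node, and rows are stored as integer bitmasks; the function name and this encoding are assumptions made for illustration.

```python
def solve_gf2(rows, rhs, n_vars):
    """Solve A e = s over GF(2).

    rows[i] -- int bitmask; bit k is set if variable k appears in equation i
    rhs[i]  -- 0/1 right-hand side (the detector outcome)
    Returns one solution as a list of 0/1 (free variables set to 0),
    or None if the system is unsatisfiable.
    """
    rows, rhs = list(rows), list(rhs)
    pivots, r = [], 0
    for col in range(n_vars):
        pivot = next((i for i in range(r, len(rows))
                      if (rows[i] >> col) & 1), None)
        if pivot is None:
            continue  # no remaining equation uses this variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rhs[r], rhs[pivot] = rhs[pivot], rhs[r]
        for i in range(len(rows)):
            if i != r and (rows[i] >> col) & 1:
                rows[i] ^= rows[r]   # eliminate the variable from row i
                rhs[i] ^= rhs[r]
        pivots.append((r, col))
        r += 1
    # A zero row with right-hand side 1 means no error assignment works.
    if any(row == 0 and s for row, s in zip(rows, rhs)):
        return None
    solution = [0] * n_vars
    for row, col in pivots:
        solution[col] = rhs[row]
    return solution


if __name__ == "__main__":
    # Three triggered detectors explained by one weight-3 error node.
    print(solve_gf2([0b1, 0b1, 0b1], [1, 1, 1], 1))
    # One triggered and one quiet detector sharing a single error: unsatisfiable.
    print(solve_gf2([0b1, 0b1], [1, 0], 1))
```

When a solution exists, the returned assignment doubles as the correction for that cluster.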

[0153] The generalized union-find decoder initializes clusters to individual subsets of vertices, each of which contains only one triggered detector event. While any unsatisfiable cluster exists, one of minimal size is chosen and expanded by adding one boundary node into the cluster according to the weight of the boundary edge. If the boundary node belongs to another cluster, these two clusters are merged into a larger cluster. Once all clusters are satisfiable, a decoded error configuration is obtained.
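The growth-and-merge loop can be sketched as follows. This is a simplified illustration rather than an optimized decoder: it assumes that lower-weight boundary edges are expanded first, re-checks every cluster on each pass, and uses hypothetical names throughout. Nodes are tagged tuples `("D", i)` for detectors and `("E", j)` for errors.

```python
def gf2_solve(equations, n_vars):
    """Gauss-Jordan over GF(2); equations are (bitmask, rhs) pairs."""
    eqs = [list(e) for e in equations]
    pivots, r = [], 0
    for col in range(n_vars):
        p = next((i for i in range(r, len(eqs)) if (eqs[i][0] >> col) & 1), None)
        if p is None:
            continue
        eqs[r], eqs[p] = eqs[p], eqs[r]
        for i in range(len(eqs)):
            if i != r and (eqs[i][0] >> col) & 1:
                eqs[i][0] ^= eqs[r][0]
                eqs[i][1] ^= eqs[r][1]
        pivots.append((r, col))
        r += 1
    if any(mask == 0 and rhs for mask, rhs in eqs):
        return None
    sol = [0] * n_vars
    for row, col in pivots:
        sol[col] = eqs[row][1]
    return sol


def cluster_solution(cluster, error_detectors, syndrome):
    """Return {error index: 0/1} over inner errors, or None if unsatisfiable."""
    dets = sorted(i for kind, i in cluster if kind == "D")
    # Inner errors: error nodes whose detectors all lie inside the cluster.
    inner = sorted(j for kind, j in cluster if kind == "E"
                   and all(("D", i) in cluster for i in error_detectors[j]))
    eqs = [(sum(1 << k for k, j in enumerate(inner) if i in error_detectors[j]),
            syndrome[i]) for i in dets]
    sol = gf2_solve(eqs, len(inner))
    return None if sol is None else dict(zip(inner, sol))


def decode(error_detectors, weights, syndrome):
    """Grow clusters from triggered detectors until all are satisfiable."""
    clusters = [{("D", i)} for i, s in enumerate(syndrome) if s]
    while True:
        bad = [c for c in clusters
               if cluster_solution(c, error_detectors, syndrome) is None]
        if not bad:
            break
        c = min(bad, key=len)  # expand a smallest unsatisfiable cluster
        boundary = []          # (edge weight, node just outside the cluster)
        for j, dets in enumerate(error_detectors):
            for i in dets:
                e_in, d_in = ("E", j) in c, ("D", i) in c
                if e_in != d_in:
                    boundary.append((weights[j], ("D", i) if e_in else ("E", j)))
        if not boundary:
            raise ValueError("syndrome is inconsistent with the decoding graph")
        _, node = min(boundary)
        c.add(node)
        for other in clusters:
            if other is not c and node in other:
                c |= other  # merge the two clusters
                clusters = [k for k in clusters if k is not other]
                break
    correction = set()
    for c in clusters:
        sol = cluster_solution(c, error_detectors, syndrome)
        correction |= {j for j, v in sol.items() if v}
    return correction


if __name__ == "__main__":
    errors = [{0}, {0, 1}, {1, 2}, {2, 3}, {3}, {0, 1, 2}]
    print(decode(errors, [3, 1, 1, 1, 3, 0.5], [1, 1, 1, 0]))
```

In the example, the three triggered detectors grow toward the shared weight-3 error node, merge into one cluster, and the linear solve returns that single error as the correction.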

[0154] Numerical experiments are conducted to demonstrate the performance of the decoders on the transversal CNOT decoding problem. The logical error rate is estimated at different physical error rates in the circuit. First, the generalized union-find decoder is simulated. The circuit consists of only one transversal CNOT, along with rounds of syndrome measurement before and after it. The weight of an edge is set to be w = log / r<?, where p is the error rate of the connecting error and e is a hyperparameter of the decoder. Different e values give different thresholds: when e = −1 (0), the threshold is around 0.51% (0.76%). This difference originates from the different behavior of the cluster expansion. A cluster has a higher priority to expand through high-rank errors at lower e, making it easier to become larger. A larger cluster makes it more likely for the decoding to fail because a global solution is less likely than a local solution to provide a high-weight solution in the decoding graph.

[0155] The performance of these decoders is also studied in the full context of a quantum circuit involving multiple operations, as shown in Fig. 14. By studying the number of rounds of error correction that are performed following each CNOT gate, one finds that the logical error rate appears to be minimized for the integer programming decoder with a constant number of rounds following each CNOT, suggesting that lower spacetime volume is indeed attainable. The generalized union-find decoder also has an optimal number of rounds that is smaller than the code distance d, although further optimizations and use of belief-propagation pre-processing are needed to further improve the performance.

[0156] The present disclosure demonstrates that these decoders substantially improve the performance of algorithm decoding, and serve as a key enabling piece to the low-overhead transversal architecture.

[0157] Referring to Fig. 14, the reduction in spacetime cost of logical algorithms in various embodiments is illustrated. Fig. 14A shows that when a transversal CNOT is performed between rounds of noisy syndrome extraction, measurement errors on X (Z) stabilizers directly before the CNOT generate order-three hyperedges on the decoding hypergraph of the control (target) logical qubit. Fig. 14B shows that because these order-three hyperedges cannot be decomposed into existing edges, naively applying MWPM leads to a reduced threshold (p_th ≈ 0.49%) compared to hypergraph MLE (p_th ≈ 1.0%), whose threshold is unaffected by the transversal CNOT. Because the runtime of hypergraph MLE is exponential in the worst case, a modified hypergraph UF algorithm is also implemented that runs in polynomial time but maintains a similar threshold (p_th ≈ 0.81%). Fig. 14C explores deep logical Clifford circuits with 32 layers of random transversal Pauli and CNOT operations interlaced with n rounds of noisy syndrome extraction between adjacent layers. Fig. 14D shows that, because transversal operations are inherently fault-tolerant, n = 1 round of noisy syndrome extraction is sufficient to obtain thresholds of p_th ≈ 0.45% and 0.80% in hypergraph MLE and UF, respectively. Fig. 14E shows that when optimizing the logical fidelity over the number of rounds of syndrome extraction per CNOT, n = 1–2 rounds are optimal for hypergraph MLE. Fig. 14F shows that, similarly, n ≪ d rounds are optimal for hypergraph UF.

[0158] Basic Building Blocks: Gates and Gadgets

[0159] The following considers the basic building blocks employed in fast transversal gate architectures according to the present disclosure. These building blocks and capabilities are directly inspired by the hardware capabilities of neutral atom platforms.

[0160] This discussion considers logical qubits that are encoded in individual rotated surface codes, although other quantum error-correcting codes can also be employed. The square layout of such logical qubits makes them very easy to manipulate with optical tools, such as tweezer arrays generated with crossed acousto-optic deflectors (AODs).

[0161] To perform syndrome extraction, one can bring in a set of ancilla qubits, perform the desired physical CNOT gates for syndrome extraction (either via local Rydberg addressing or atom moving), and then move the ancilla qubits out to a separate readout zone. With appropriate choice of gate ordering, this procedure can be made fault-tolerant and realizes the full circuit-level code distance.

[0162] For logical gate operations, the following basic operations are realized: Hadamard gate, S gate, CNOT gate, and state injection. This will be further supplemented by the operation of resizing a logical code patch. With suitable decoding techniques, and by making use of perfect syndrome information when performing transversal readouts, one can reduce the number of rounds of quantum error correction following each logical gate. To maintain code distance at the algorithmic level, any undetectable error that anti-commutes with any logical operator must have weight at least d. Since errors in the bulk always extend the syndrome, the only places where errors can terminate are on spatial boundaries and time boundaries. Thus, one may expect to require time boundaries to be separated by at least distance d, although for specific circuits, these undetectable operations may commute with all logical operators, thereby further relaxing the separation requirements between boundaries.

[0163] The logical Hadamard gate can be implemented transversally for the surface code with physical Hadamards and a 90-degree rotation of the code patch. While rotations themselves are not a native operation with existing fast optical tools, by using a recursive decomposition into 2-by-2 blocks at each layer, one can perform a rotation in logarithmic depth.
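One way to picture such a recursive decomposition is the following sketch, which rotates a square grid by 90 degrees by cyclically permuting its four quadrants and then recursing into each quadrant. All blocks at a given scale can be moved at the same time, so the number of layers is logarithmic in the grid size. The sketch assumes a grid side that is a power of two, uses hypothetical function names, and is not claimed to be the specific decomposition used in any embodiment.

```python
def rotate_cw(grid):
    """Rotate a 2^k x 2^k grid 90 degrees clockwise via quadrant recursion."""
    n = len(grid)
    if n == 1:
        return [row[:] for row in grid]
    h = n // 2
    a = [row[:h] for row in grid[:h]]   # top-left quadrant
    b = [row[h:] for row in grid[:h]]   # top-right quadrant
    c = [row[:h] for row in grid[h:]]   # bottom-left quadrant
    d = [row[h:] for row in grid[h:]]   # bottom-right quadrant
    ra, rb, rc, rd = (rotate_cw(q) for q in (a, b, c, d))
    # Quadrants cycle clockwise: C -> top-left, A -> top-right,
    # B -> bottom-right, D -> bottom-left.
    top = [x + y for x, y in zip(rc, ra)]
    bottom = [x + y for x, y in zip(rd, rb)]
    return top + bottom


def layers_needed(n):
    """Number of recursion layers for an n x n grid (n a power of two)."""
    return n.bit_length() - 1


if __name__ == "__main__":
    grid = [[r * 4 + c for c in range(4)] for r in range(4)]
    for row in rotate_cw(grid):
        print(row)
    print("layers:", layers_needed(4))
```

Each layer consists only of block translations, which is the kind of move that row-and-column tweezer control can perform in parallel.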

[0164] The logical S gate is usually implemented via lattice surgery, requiring O(d³) spacetime volume. However, by folding the code patch on top of itself, as shown in Fig. 15, executing a CZ gate between the marked qubit pairs, and S gates on the d qubits along the fold axis, one can implement a logical S gate on the rotated surface code. This approach has the advantage of nearly preserving the distance (during the operation, the code distance is reduced by 1) and not requiring any deformation of the code patch, thereby avoiding initialization of new stabilizer values during deformation that may take multiple logical cycles.
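The gate layout implied by the fold can be enumerated with a short sketch. It assumes the d × d data qubits are indexed by (row, column) and that the fold axis is the main diagonal, so that qubit (i, j) is paired with its mirror image (j, i); the choice of diagonal and whether individual diagonal qubits receive S or its inverse are not specified here and are left out.

```python
def fold_transversal_s_layout(d):
    """Return (diagonal qubits, CZ pairs) for a d x d patch folded on its diagonal."""
    diagonal = [(i, i) for i in range(d)]              # single-qubit gates
    cz_pairs = [((i, j), (j, i))                        # mirrored pairs
                for i in range(d) for j in range(i + 1, d)]
    return diagonal, cz_pairs


if __name__ == "__main__":
    diag, pairs = fold_transversal_s_layout(5)
    print(len(diag), "diagonal qubits,", len(pairs), "CZ pairs")
```

Every data qubit appears exactly once, either on the fold axis or in one CZ pair, so the whole operation is a single layer of physical gates.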

[0165] The logical CNOT gate can be readily implemented by interlacing two logical qubits and pulsing an entangling operation between them with a global laser pulse.

[0166] State injection can be performed, which can then be used to build magic state factories, as discussed in the following section.

[0167] One can also create bridging Bell pairs to connect multiple logical subcircuits, thereby allowing connections between different circuit fragments that can now be executed out of time order. The Bell pair creation and Bell basis measurements can in principle be done in a single-shot fashion, although similar to the above, more detailed analysis is required to understand the lowest weight logical error behavior.

[0168] Referring to Fig. 15, an illustration is provided of a fold-transversal S gate for the rotated surface code. This implementation does not require any patch deformation to be applied, which makes it much faster. In Fig. 15A, the data qubits in the bottom-right (dashed box) are mirrored over the top-right qubits to execute CZ gates. This is easier to perform with a diagonally-oriented AOD. In Fig. 15B a few S gates along the diagonal complete the logical gate.

[0169] Algorithmic Building Block: Magic State Distillation Factory

[0170] One of the most important sub-routines of FTQC is magic state distillation (MSD). This key subroutine allows the preparation of high-fidelity T states or CCZ states, which enable the implementation of non-Clifford gates. It is also one of the most expensive subroutines in quantum computation, as distillation requires a much larger space-time footprint than Clifford gates.

[0171] Referring to Fig. 16, a magic state distillation circuit is illustrated.

[0172] Referring to Fig. 17, an exemplary neutral atom layout for magic state distillation is illustrated, showing the high degree of parallelism available.

[0173] Consider the 15-to-1 magic state distillation circuit shown in Fig. 16. As shown, the circuit is highly structured, involving 4 layers of transversal CNOTs, each layer connecting logical qubits at a fixed distance from each other, and finally an additional T gate layer on a subset of the qubits that can be executed via gate teleportation. As mentioned above, each transversal gate only involves a single circuit layer, rather than the d layers required for lattice surgery, significantly reducing the time cost.

[0174] This circuit has a particularly natural implementation with neutral atom array systems, as illustrated in Fig. 17. This panel shows how the logical qubits can be laid out in a 4-by-4 grid, and how movements can be executed to perform the desired circuit. The movement is highly parallel, naturally allowing optical multiplexing with very few control channels. The longest distance that any codeblock needs to travel within the magic state distillation factory is only the span of two logical qubits, indicating that the movement time required will still remain very short.

[0175] In the circuit shown, to guarantee fault-tolerance of the Clifford codeword preparation portion by itself, each logical qubit requires d rounds of syndrome measurement for fault-tolerant state preparation. Each transversal layer involves a single CNOT layer that is executed in parallel between all logical qubits. Since one round of syndrome measurement involves 4 entangling gates and 1 measurement and reset, its time cost will be substantially larger than that of the CNOT operation. Thus, the MSD factory based on transversal gates requires a time cost on the order of d + 1, substantially lower than the 6d or 13d required for magic state distillation based on lattice surgery.

[0176] The time cost may be further reduced to d / 2 or less: to ensure that the smallest error chain has weight at least d, it is sufficient to ensure that the distance between two boundaries where error strings can terminate is at least d. Since time-like error strings that cause a logical error can't terminate on a transversal measurement, which provides perfect measurement results, they can only join two of the state prep boundaries, thereby reducing the distance requirement to d / 2. If such error chains do not cause a logical flip, the spacetime cost can be further reduced and only a constant distance may be necessary.

[0177] In terms of spatial overhead, a feature of the neutral atom architecture is that unlike conventional lattice surgery, which requires additional rows and columns for logical qubit access, parallel atom movement via AODs can mediate transversal gates without the need of any additional ancilla qubits, thus saving space overhead by around 2x compared to existing schemes.

[0178] Algorithmic Building Block: Quantum Adder and Quantum Read-Only Memory

[0179] The preceding analysis can also be generalized to various algorithmic gadgets, such as the ripple-carry adder and quantum read-only memory (QROM).

[0180] The circuits for the ripple-carry adder and QROM are shown in Fig. 18. By rewiring the circuit in spacetime using bridge qubits, one can separate it into smaller gadgets that act on a few qubits; for example, one can separate out the first 3 qubits of the adder and implement the subcircuit, with some additional bridge qubits on the parts that connect to other qubits, so that different circuit gadgets can be stitched together. Similar to the discussions in the previous sections, by replacing the circuit components with transversal gates, one can significantly speed up their execution.

[0181] For the single control, multi-target CNOT shown in the QROM circuit in Fig. 18, one can employ constant-time Pauli product measurement methods, where an appropriate ancilla is prepared and then directly split up and interacted with each of the target blocks of the CNOT. Although this discussion has focused on particular well-structured gadgets, the ability to move different codeblocks without needing to worry about their position and possible intersection of paths can also significantly simplify routing challenges that can occur in planar architectures.

[0182] Physical Implementation

[0183] The flexible connectivity required to implement transversal gates is in principle accessible to a variety of physical platforms, including neutral atoms, trapped ions, silicon quantum dots via qubit shuttling, or photon-based schemes via switch networks. However, this discussion focuses on neutral atom array systems, which have experimentally realized system sizes in the hundreds to thousands and high-fidelity single-qubit and two-qubit gate operations. The flexibility and power of qubit shuttling in neutral atom arrays enables complex logical qubit algorithms consisting of at least 48 logical qubits, 228 logical two-qubit gates and 48 logical CCZ gates.

[0184] Particularly powerful in this approach is the high degree of parallelism: despite the complexity of the quantum circuit being executed, the number of independent control channels used in this experiment was around 5, thanks to the high degree of parallelism afforded by optical multiplexing and gate operation via global laser excitation.

[0185] This naturally extends to the execution of large-scale fault-tolerant algorithms, significantly ameliorating the control challenges. For example, controlling individual logical code patches consisting of hundreds to thousands of physical qubits will still only require one pair of crossed AODs. Moreover, as illustrated in Fig. 17, at the algorithmic level there is also significant structure that allows exploitation of the parallelism of optical tools. In that example, the logical qubit moves are all performed in groups of rows and columns, and therefore again can be carried out using a single pair of crossed AODs.

[0186] The time required to move a logical code patch across a certain distance scales as the square root of the move distance, similar to a bounded acceleration profile. Thus, longer moves do have higher cost, but the cost grows relatively slowly. The architecture provided herein makes extensive use of moves of intermediate distance, spanning a few logical qubits, thus making use of the slightly more favorable movement times at intermediate distances while avoiding bottlenecks related to longer-range shuttling and tweezer handoff if a large number of very long range moves are required. Moreover, making use of well-structured circuits as shown in the previous section can also help address challenges associated with the qubit routing time overhead when performing fast transversal gates, ensuring that qubit routing can keep pace with the computer clock speed.
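The square-root scaling can be illustrated with the textbook bounded-acceleration case: accelerating at a fixed magnitude a over the first half of the distance and decelerating over the second half gives a move time t = 2·sqrt(D/a). The sketch below uses this idealized profile, with a placeholder acceleration bound that is not taken from the disclosure.

```python
import math


def move_time(distance_m: float, max_accel_m_s2: float) -> float:
    """Time for a rest-to-rest move under a constant acceleration bound.

    Half the distance is covered while accelerating, half while decelerating:
    D/2 = (1/2) a (t/2)^2, which gives t = 2 * sqrt(D / a).
    """
    return 2.0 * math.sqrt(distance_m / max_accel_m_s2)


if __name__ == "__main__":
    accel = 3000.0  # m/s^2, hypothetical bound chosen only for illustration
    for microns in (10, 40, 160):
        t = move_time(microns * 1e-6, accel)
        print(f"{microns:4d} um -> {t * 1e6:6.1f} us")
```

Quadrupling the distance only doubles the move time, which is why moves spanning a few logical qubits remain comparatively cheap.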

[0187] Dynamic reconfiguration in 2D tweezer arrays

[0188] Exemplary experiments utilize the apparatus described below. Inside the vacuum cell, ⁸⁷Rb atoms are loaded from a magneto-optical trap into a backbone array of programmable optical tweezers generated by a spatial light modulator (SLM). Atoms are rearranged in parallel into defect-free target positions in this SLM backbone by additional optical tweezers generated from a crossed 2D acousto-optic deflector (AOD). Following the rearrangement procedure, selected atoms are transferred from the static SLM traps back into the mobile AOD traps, and then these mobile atoms are moved to their starting positions in the quantum circuit. During this entire process, the atoms are cooled with polarization gradient cooling. Before running the quantum circuit, a camera image of the atoms in their initial starting positions is taken. Following the circuit, a final camera image is taken to detect qubit states |0⟩ (atom presence) and |1⟩ (atom loss, following resonant pushout). All data are postselected on finding perfect rearrangement of the AOD and SLM atoms before running the circuit. In some embodiments, each atom remains in a single static or single mobile trap throughout the duration of the quantum circuit.

[0189] The crossed AOD system is composed of two independently controlled AODs (AA Opto Electronic DTSX-400) for x and y control of the beam positions. Both AODs are driven by independent arbitrary waveforms which are generated by a dual-channel arbitrary waveform generator (AWG) (M4i.6631-x8 by Spectrum Instrumentation) and then amplified through independent MW amplifiers (Mini-Circuits ZHL-5W-1). The time-domain arbitrary waveforms are composed of multiple frequency tones corresponding to the x and y positions of columns and rows, which are independently changed as a function of time to steer the AOD-trapped atoms dynamically; the full x and y waveforms are calculated by adding together the time-domain profile of all frequency components with a given amplitude and phase for each component. For running quantum circuits, the positions of the AOD atoms at each gate location are programmed, and the AOD frequencies are then smoothly interpolated (with a cubic profile) as a function of time between gate positions. The cubic profile enacts a constant jerk onto the atoms, which allows movement roughly 5–10× faster (without heating and loss) than if moving at a constant velocity (linear profile). In the movement protocol, stretches, compressions, and translations of the AOD trap array are applied: i.e., the AOD rows and columns never cross each other, in order to avoid atom loss and heating associated with two frequency components crossing each other.
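The waveform construction can be sketched as a sum of tones whose frequencies follow a cubic interpolation between two gate positions. The sketch assumes the cubic is the smoothstep polynomial 3s² − 2s³ (zero slope at both ends and a constant third derivative), integrates each tone's phase sample by sample, and uses illustrative function names and parameter values; it is not the waveform code used in the experiments.

```python
import math


def cubic_interp(f0: float, f1: float, s: float) -> float:
    """Cubic (smoothstep) interpolation from f0 to f1 for s in [0, 1]."""
    return f0 + (f1 - f0) * (3.0 * s * s - 2.0 * s ** 3)


def aod_waveform(start_freqs, end_freqs, amplitudes, phases,
                 duration, sample_rate):
    """Sum of tones whose frequencies move smoothly between two settings.

    start_freqs, end_freqs -- per-tone frequencies (Hz) at the two gate positions
    amplitudes, phases     -- per-tone amplitude and initial phase (rad)
    duration, sample_rate  -- move duration (s) and AWG sample rate (Hz)
    """
    n = int(round(duration * sample_rate))
    dt = 1.0 / sample_rate
    phase = list(phases)  # running phase of each tone
    samples = []
    for k in range(n):
        s = k / (n - 1) if n > 1 else 0.0
        total = 0.0
        for t in range(len(start_freqs)):
            total += amplitudes[t] * math.sin(phase[t])
            # Advance the phase by 2*pi*f*dt using the instantaneous frequency.
            phase[t] += 2.0 * math.pi * cubic_interp(start_freqs[t], end_freqs[t], s) * dt
        samples.append(total)
    return samples


if __name__ == "__main__":
    # Two tones translated by the same offset, so the columns never cross.
    wave = aod_waveform([80e6, 81e6], [82e6, 83e6], [0.5, 0.5], [0.0, 1.0],
                        duration=1e-5, sample_rate=625e6)
    print(len(wave), "samples")
```

Accumulating phase rather than evaluating sin(2πft) directly keeps each tone phase-continuous while its frequency changes.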

[0190] The AOD tweezer intensity is homogenized throughout the whole atom trajectory in order to minimize dephasing induced by a time-varying magnitude of differential light shifts. To this end, a reference camera is used in the image plane to gauge the intensity of each AOD tweezer at each gate location and homogenize by varying the amplitude of each frequency component; during motion between two locations the amplitude of each individual frequency component is interpolated.

[0191] The SLM tweezer light (830 nm) and the AOD tweezer light (828 nm) are generated by two separate, free-running Ti:sapphire lasers (M Squared, 18-W pump). Projected through a 0.5 NA objective, the SLM tweezers have a waist of roughly 900 nm (roughly 1000 nm for the AOD tweezers). When loading the atoms, the trap depths are ~2π × 16 MHz, with radial trap frequencies of ~2π × 80 kHz, and when running quantum circuits the trap depths are ~2π × 4 MHz, with radial trap frequencies of ~2π × 40 kHz.

[0192] Raman laser system

[0193] Fast, high-fidelity single-qubit manipulations are critical ingredients of the quantum circuits demonstrated in this work. To this end, a high-power 795-nm Raman laser system is used for driving global single-qubit rotations between mF = 0 clock states. This Raman laser system is based on dispersive optics. 795-nm light (Toptica TA pro, 1.8 W) is phase-modulated by an electro-optic modulator (Qubig), which is driven by microwaves at 3.4 GHz (Stanford Research Systems SRS SG384) that are doubled to 6.8 GHz and amplified. The laser phase modulation is converted to amplitude modulation for driving Raman transitions through use of a Chirped Bragg Grating (Optigrate). IQ control of the SG384 is used for frequency and phase control of the microwaves, which are imprinted onto the laser amplitude modulation and thus give direct frequency and phase control over the hyperfine qubit drive.

[0194] The Raman laser illuminates the atom plane from the side in a circularly polarized elliptical beam with waists of 40 μm and 560 μm on the thin axis and the tall axis, respectively, with a total average optical power of 150 mW on the atoms. The large vertical extent ensures < 1% inhomogeneity across the atoms, and shot-to-shot fluctuations in the laser intensity are also < 1%. The Raman laser is operated at a blue-detuned intermediate-state detuning of 180 GHz, resulting in two-photon Rabi frequencies of 1 MHz and an estimated scattering error per π pulse of 7 × 10⁻⁵ (i.e., 1 scattering event per 15,000 π pulses).

[0195] Qubit coherence and dynamical decoupling

[0196] In the 830-nm traps, hyperfine qubit coherence is characterized by T2* = 4 ms (not plotted here), T2 = 1.5 s (XY16 with 128 total π pulses), and T1 = 4 s (including atom loss). The experiments described herein are performed in a DC magnetic field of 8.5 Gauss. Coherence can be further improved by using further-detuned optical tweezers (with trap depth held constant, the tweezer differential light shifts decrease as 1/Δ and 1/T1 decreases as 1/Δ³) and shielding against magnetic field fluctuations. For practical QEC operation, atom loss can be detected in a hardware-efficient manner and the atom then replaced from a reservoir, which could in principle be continuously reloaded by a MOT for reaching arbitrarily deep circuits.

[0197] The transport sequences are accompanied by dynamical decoupling sequences. The number of pulses used is a tradeoff between preserving qubit coherence and minimizing pulse errors. In various embodiments, there is an interchange between two types of dynamical decoupling sequences: XY8 / XY16 sequences, composed of phase-alternated individual π pulses which are self-correcting for amplitude and detuning errors, and CPMG-type dynamical decoupling sequences composed of robust BB1 pulses. The CPMG-BB1 sequence is more robust to amplitude errors but incurs more scattering error. The sequence may be empirically optimized for any given experiment by choosing between these different sequences and a variable number of decoupling π pulses, optimizing on either single-qubit coherence (including the movement) or the final signal. Typically, decoupling sequences are composed of a total of 12-18 π pulses.
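For illustration, the phase pattern of the XY8 and XY16 sequences mentioned above can be written down as in the following Python sketch. It uses the commonly cited textbook definitions (XY8 as X-Y-X-Y-Y-X-Y-X, and XY16 as an XY8 block followed by the same block with inverted phases) and assumes evenly spaced, CPMG-style pulse timing; the function names are hypothetical, and neither the BB1 composite pulses nor the empirical optimization described above are modeled.

```python
XY8_PHASES_DEG = [0, 90, 0, 90, 90, 0, 90, 0]  # X Y X Y Y X Y X


def xy16_phases_deg():
    """XY16: an XY8 block followed by the same block with all phases inverted."""
    return XY8_PHASES_DEG + [(p + 180) % 360 for p in XY8_PHASES_DEG]


def pulse_schedule(total_time, n_pulses, phases_deg):
    """Evenly spaced pi pulses cycling through `phases_deg`.

    Returns a list of (time, phase_in_degrees) pairs; pulse k is centred at
    (2k + 1) * total_time / (2 * n_pulses), as in CPMG-style timing.
    """
    spacing = total_time / n_pulses
    return [((k + 0.5) * spacing, phases_deg[k % len(phases_deg)])
            for k in range(n_pulses)]


# Example (hypothetical duration): 16 pulses spread over a 1 ms sequence.
schedule = pulse_schedule(1e-3, 16, xy16_phases_deg())
```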

[0198] Movement effects on atom heating and loss

[0199] The following discusses the effects of movement on atom loss and heating in the harmonic oscillator potential given by the tweezer trap. Motion of the trap potential is equivalent to the non-inertial frame of reference where the harmonic oscillator potential is stationary, but the atom experiences a fictitious force given by F(t) = −m a(t), where m is the mass of the particle and a(t) is the acceleration of the trap as a function of time. The average vibrational quantum number increase ΔN is given by

$$\Delta N = \frac{|\tilde{a}(\omega_0)|^2}{(2 x_{\mathrm{zpf}}\,\omega_0)^2} \qquad \text{(Equation 11)}$$

where ã(ω0) is the Fourier transform of a(t) evaluated at the trap frequency ω0, and the zero-point size of the particle is xzpf ≡ √(ℏ / (2mω0)). ΔN is the same for all initial levels of the oscillator. Experimentally, an acceleration profile a(t) = jt is applied to the atom from time −T/2 to +T/2 to move a distance D with constant jerk j. Calculating |ã(ω0)|², simplifying using ω0T ≫ 1, and assuming a small range of trap frequencies to average the oscillatory terms results in

$$\Delta N = \frac{1}{2}\left(\frac{6D}{x_{\mathrm{zpf}}\,\omega_0^2\,T^2}\right)^2 \qquad \text{(Equation 12)}$$

[0200] Several relevant insights can be gleaned from this formula. First, this expression indicates the ability to move large distances D with comparably small increases in time T. Furthermore, to maintain a constant ΔN, the movement time T ∝ ω0^(−3/4). Moreover, to perform a large number of moves k for a deep circuit, ΔN ∝ k / T⁴ can be estimated, suggesting that the number of moves can be increased from, e.g., 5 to 80 by slowing each move from 200 μs to 400 μs. Move speed could be further improved with different a(t) profiles, but inevitably with finite resources such as trap depth, quantum speed limits will eventually prevent arbitrarily fast motion of qubits across the array.

[0201] Equation 12 is now compared to experimental observations. Atom loss is observed with movement of 55 μm in 200 μs under a constant negative jerk. This speed limit is consistent with the above estimates: using ω0 = 2π × 40 kHz and xzpf = 38 nm, it is predicted that ΔN ≈ 6 for this move, corresponding to the onset of tangible heating at this move speed. More quantitatively, a Poisson distribution with mean N and variance N is assumed, and the population above some critical Nmax, beyond which the atom will leave the trap, is integrated. From this analysis, atom retention is given by ½(1 + erf[(Nmax − N) / √(2N)]).
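The estimate above can be reproduced numerically. The following Python sketch takes Equation 12 to read ΔN = ½ [6D / (xzpf ω0² T²)]², which is the form consistent with the quoted values, and evaluates it together with the retention expression. The function names are hypothetical, and the escape threshold Nmax used in the example is an arbitrary illustrative value that is not given in the text.

```python
import math


def delta_n(distance, duration, omega0, x_zpf):
    """Mean vibrational excitation from one constant-jerk move (Equation 12 form)."""
    return 0.5 * (6.0 * distance / (x_zpf * omega0**2 * duration**2)) ** 2


def retention(n_mean, n_max):
    """Retention probability 1/2 * (1 + erf[(Nmax - N) / sqrt(2 N)])."""
    return 0.5 * (1.0 + math.erf((n_max - n_mean) / math.sqrt(2.0 * n_mean)))


omega0 = 2.0 * math.pi * 40e3                      # radial trap frequency (rad/s)
dn_fast = delta_n(55e-6, 200e-6, omega0, 38e-9)    # 55 um in 200 us: about 6
dn_slow = delta_n(55e-6, 400e-6, omega0, 38e-9)    # same move in 400 us
N_MAX = 30.0                                       # hypothetical escape threshold
print(dn_fast, dn_fast / dn_slow, retention(dn_fast, N_MAX))
```

Because ΔN scales as 1/T⁴, doubling the move time lowers the heating per move by a factor of 16, which matches the increase from 5 to 80 moves quoted above.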

[0202] Additional heating and loss during the circuit can also be caused by repeated short drops for performing two-qubit gates, where the tweezers are briefly turned off to avoid anti-trapping of the Rydberg state and light shifts of the ground-Rydberg transition. However, drop-recapture measurements suggest the 500-ns drops used experimentally have a negligible effect until hundreds of drops per atom (corresponding to hundreds of CZ gates). Atom loss and heating as a function of number of drops are well described by a diffusion model, which would then predict that reducing atom temperature by a factor of 2× (reducing thermal velocity by √2×) and reducing drop time tdrop by 2× would together increase the number of possible CZ gates per atom to thousands.

[0203] Two-qubit CZ gates implementation

[0204] Two-qubit gates and calibrations may be implemented using the techniques provided herein. Specifically, the two-qubit CZ gate is implemented by two global Rydberg pulses, with each pulse at detuning Δ and length τ, and with a phase jump ξ between the two pulses. The pulse parameters are chosen such that qubit pairs, adjacent and under the Rydberg blockade constraint, will return from the Rydberg state back to the hyperfine qubit manifold with a phase depending on the state of the other qubit. The numerical values for these pulse parameters are: Δ = −0.377371 Ω, ξ = −0.621089 × (2π), and τ = 0.683201 / [Ω / (2π)].

[0205] Exemplary experiments are operated with a two-photon Rydberg Rabi frequency of Ω / 2π = 3.6 MHz, giving a theoretical τ = 190 ns and a theoretical Δ / (2π) = −1.36 MHz. The negative detuning sign is chosen to help minimize excitation into the mJ = +1/2 Rydberg state, which is detuned by about 24 MHz under the field of 8.5 G (and experiences a 3× lower coupling to the Rydberg laser than the desired mJ = −1/2 state due to reduced Clebsch-Gordan coefficients). In this work strong blockade between adjacent qubits is provided, with Rydberg-Rydberg interactions V0 / 2π ranging from 200 MHz to 1 GHz.
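The quoted pulse length and detuning follow directly from the dimensionless pulse parameters Δ = −0.377371 Ω, ξ = −0.621089 × (2π), and τ = 0.683201 / [Ω / (2π)]. The short Python sketch below carries out this arithmetic; the constant and function names are hypothetical and chosen only for this illustration.

```python
import math

# Dimensionless pulse parameters quoted in the text.
DETUNING_OVER_RABI = -0.377371       # Delta / Omega
PHASE_JUMP_OVER_2PI = -0.621089      # xi / (2 pi)
DURATION_TIMES_RABI = 0.683201       # tau * Omega / (2 pi)


def cz_pulse_parameters(rabi_hz):
    """Return (tau in s, Delta/(2 pi) in Hz, xi in rad) for Omega/(2 pi) = rabi_hz."""
    tau_s = DURATION_TIMES_RABI / rabi_hz
    detuning_hz = DETUNING_OVER_RABI * rabi_hz
    phase_jump_rad = PHASE_JUMP_OVER_2PI * 2.0 * math.pi
    return tau_s, detuning_hz, phase_jump_rad


# Omega / (2 pi) = 3.6 MHz gives tau close to 190 ns and
# Delta / (2 pi) close to -1.36 MHz.
tau_s, detuning_hz, phase_jump_rad = cz_pulse_parameters(3.6e6)
```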

[0206] Managing spurious phases during CZ gates

[0207] The two-qubit gate induces both an intrinsic single-qubit phase, as well as spurious phases which are primarily induced by the differential light shift from the 420-nm laser. Under certain configurations, the 420-nm-induced differential light shift on the hyperfine qubit can be exceedingly large (> 8 MHz), yielding phase accumulations on the hyperfine qubit of ≈ 6π. Small, percent-level variations of the 420-nm intensity can thus lead to significant qubit dephasing.

[0208] This 420-induced-phase issue may be addressed by performing an echo sequence: after the CZ gate, the 1013-nm Rydberg laser is turned off, a Raman π pulse is applied, and then the 420-nm laser is pulsed again to cancel the phase induced by the 420 light during the CZ gate. This method echoes out the 420-induced phase, but comes at a cost of a factor of two increase in the 420-induced scattering error, which is the dominant source of error in two-qubit CZ gates.

[0209] Echo between CZ gates. To address these various issues, a Raman π pulse is performed between each CZ gate to echo out spurious gate-induced phases on the hyperfine qubit. This approach has several advantages. The 420-induced phase is now cancelled by pairs of CZ gates, without explicitly applying additional 420-nm pulses to echo each individual CZ gate, thereby reducing the scattering error of the CZ gate in this work by a factor of approximately two. This echo technique, having reduced the scattering error incurred during each gate, roughly compensates the increased scattering rate incurred by spreading optical power over more space in 2D, thereby giving gate fidelities comparable to the two-qubit CZ gate fidelities of > 97.4(2)%. Further, the echo between CZ gates also cancels the intrinsic single-qubit phase of the CZ gate, removing errors in the calibration of this parameter, as well as canceling any other gate-induced spurious single-qubit phases such as a ~0.01 rad phase induced by pulsing the traps off for 500 ns for the two-qubit gate. In instances where the number of CZ gates is odd, the echo for the final CZ gate is performed.

[0210] Sign of intermediate-state detuning. To further suppress the effect of the spurious, 420-induced phase, the 420-nm laser is operated to be red-detuned (by 2 GHz) from the 6P3/2 transition. For red detunings, the light shifts on the |0⟩ state and the |1⟩ state are of the same sign, minimizing the differential light shift, while for blue detunings < 6.8 GHz, the light shifts on the |0⟩ state and the |1⟩ state have opposite signs and amplify the differential light shift.

[0211] Sensitivity to axial trap oscillations

[0212] In typical Rydberg excitation timescales with optical tweezers, the axial trap oscillation frequencies of several kHz are inconsequential. Here, with circuits running as long as 1.2 ms and with Rydberg pulses throughout, the axial trap oscillations can have important effects. In particular, the axial oscillations cause the atoms to oscillate in and out of the Rydberg beams: at an estimated axial temperature of ~25 μK and axial oscillation frequency of 6 kHz, an axial spread √⟨z²⟩ ≈ 1.3 μm is estimated. For 20-micron-waist beams, the effect of this positional spread is relatively small on the pulse parameters of the CZ gate, but can be significant on the sensitive 420-induced phase that should be canceled by echoing out the phase induced by CZ gates separated by ~200 μs. When using 20-micron-waist beams and a 2.5-GHz blue detuning of the 420-nm laser, the dephasing due to the axial trap oscillations is significant. To remedy this deleterious effect, the beam waist of the 420-nm laser is increased to 35 microns (while maintaining constant intensity) and the laser frequency is changed to be 2-GHz red-detuned, together resulting in a significant reduction in the dephasing associated with improper echoing of the 420-nm pulse.
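The axial spread quoted above is consistent with a simple thermal estimate. The following Python sketch assumes a classical harmonic trap in thermal equilibrium, for which √⟨z²⟩ = √(kB·T / (m·ω²)); this equipartition formula and the approximate ⁸⁷Rb mass are assumptions of the sketch rather than statements from the text, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23            # Boltzmann constant (J/K)
AMU = 1.66053906660e-27       # atomic mass unit (kg)
M_RB87 = 86.909 * AMU         # approximate mass of a 87Rb atom (kg)


def thermal_rms_position(temperature_k, trap_freq_hz, mass_kg=M_RB87):
    """RMS position spread of a classical thermal atom in a harmonic trap."""
    omega = 2.0 * math.pi * trap_freq_hz
    return math.sqrt(K_B * temperature_k / (mass_kg * omega**2))


# 25 uK axial temperature and 6 kHz axial frequency: roughly 1.3 um.
axial_spread_m = thermal_rms_position(25e-6, 6e3)
```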

[0213] Rydberg beam shaping and homogeneity

[0214] The Rydberg beams are shaped into tophats of variable size through wavefront control using the phase profile on a spatial light modulator (SLM). This ability allows matching the height of the beam profile to the experiment zone size of any given experiment, thereby maximizing the 1013-nm light intensity and CZ gate fidelities. The Rydberg beam homogeneity is optimized until peak-to-peak inhomogeneities are below 1%. To this end, all aberrations are first corrected up to the window of the vacuum chamber, which yields an inhomogeneity on the atoms of several percent that is attributed to imperfections of the final window. To further optimize the homogeneity, aberration corrections are tuned on the tophat through Zernike polynomial corrections to the phase profile in the SLM plane (Fourier plane). With this procedure, peak-to-peak inhomogeneities are reduced to <1% over a range of 40-50 μm in the atom plane.

[0215] Coherent mapping protocol

[0216] A coherent mapping protocol is provided to transfer a generic many-body state in the {|1⟩, |r⟩} basis to the long-lived and non-interacting {|0⟩, |1⟩} basis. To achieve this mapping, immediately following the Rydberg dynamics, a Raman π pulse is applied to map |1⟩ → |0⟩, followed by a Rydberg π pulse to map |r⟩ → |1⟩.

[0217] Even for perfect Raman and Rydberg π pulses (on isolated atoms), there are three key sources of infidelity associated with this mapping process: (1) Any population in blockade-violating states (i.e., two adjacent atoms both in |r⟩) will be strongly shifted off-resonance for the final Rydberg π pulse. As such, this atomic population will be left in the Rydberg state and lost. (2) Long-range interactions, e.g., from next-nearest neighbors, will detune the final Rydberg π pulse from resonance and thus reduce pulse fidelity. Since the long-range interactions are not the same for all many-body microstates, this effect cannot be mitigated by a simple shift of the detuning. (3) Dephasing of the state occurs throughout the duration of the Raman π pulse, predominantly from Doppler shifts between the ground states |0⟩, |1⟩ and the Rydberg state |r⟩. Although these random on-site detunings are also present during the many-body dynamics, turning the Rydberg drive Ω off allows the system to freely accumulate phase and makes the protocol particularly sensitive to dephasing errors.

[0218] The above error mechanisms are mitigated as follows. To minimize errors from (1), many-body dynamics are performed with Ω² / (2V0) ≈ 0.01. This keeps the probability of an atom violating blockade to order 1%. To help minimize errors from (2), the amplitude of the 420-nm laser is increased for the final π pulse by a factor of 2×, such that (VNNN / Ω) = 0.005 (where VNNN are the interactions with next-nearest neighbors), reducing pulse errors from long-range interactions to order 1%. Finally, to reduce errors from (3), a fast Raman π pulse is performed, leaving only 150 ns between ending the many-body Rydberg dynamics and beginning the Rydberg π pulse. The 150-ns gap is comparably short relative to the ~3-4 μs of the (|Ω⟩, |r⟩) basis, leading to a random phase accumulation of order ~0.02 × 2π rad per particle, but is further compounded by having entangled states of N particles in one copy accumulating a random phase relative to entangled states of N particles in the second copy.

[0219] The global Raman beam induces a light-shift-induced phase shift of ≈ π on |0⟩, |1⟩ relative to |r⟩ during the Raman π pulse. Similarly, the global 420-nm laser also induces a light-shift-induced phase shift of ~π between |0⟩ and |1⟩ during the Rydberg π pulse. While the measurements performed here are interferometric (in other words, the singlet state measured is invariant under global rotations) and thus not affected by these global phase shifts, these phase shifts can be measured and accounted for where relevant.

[0220] Formation of Array of Particles Using Optical Tweezers

[0221] Optical trapping of neutral atoms is a powerful technique for isolating atoms in vacuum. Atoms are polarizable, and the oscillating electric field of a light beam induces an oscillating electric dipole moment in the atom. The associated energy shift in an atom from the induced dipole, averaged over a light oscillation period, is called the AC Stark shift. Based on the AC Stark shift induced by light that is detuned (i.e., offset in wavelength) from atomic resonance transitions, atoms are trapped at local intensity maxima (for red-detuned, that is, longer-wavelength trap light), because the atoms are attracted to light below the resonance frequency. The AC Stark shift is proportional to the intensity of the light. Thus, the shape of the intensity field is the shape of an associated atom trap. Optical tweezers utilize this principle by focusing a laser to a micron-scale waist, where individual atoms are trapped at the focus. Two-dimensional (2D) arrays of optical tweezers are generated by, for example, illuminating a spatial light modulator (SLM), which imprints a computer-generated hologram on the wavefront of the laser field. The 2D array of optical tweezers is overlapped with a cloud of laser-cooled atoms in a magneto-optical trap (MOT). The tightly focused optical tweezers operate in a “collisional blockade” regime, in which single atoms are loaded from the MOT while pairs of atoms are ejected due to light-assisted collisions. This ensures that each tweezer holds at most a single atom, but the loading is probabilistic: each trap is loaded with a single atom with a probability of about 50-60%.

[0222] To prepare deterministic atom arrays, a real-time feedback procedure identifies the randomly loaded atoms and rearranges them into pre-programmed geometries. Atom rearrangement requires moving atoms in tweezers which can be smoothly steered to minimize heating, by using, for example, acousto-optic deflectors (AODs) to deflect a laser beam by a tunable angle which is controlled by the frequency of an acoustic waveform applied to the AOD crystal. Dynamic tuning of the acoustic frequency translates into smooth motion of an optical tweezer. A multi-frequency acoustic wave creates an array of laser deflections, which, after focusing through a microscope objective, forms an array of optical tweezers with tunable position and amplitude that are both controlled by the acoustic waveform. Atoms are rearranged by using an additional set of dynamically moving tweezers that are overlaid on top of the SLM tweezer array.

[0223] Exemplary Hardware

[0224] Optical tweezer arrays constitute a powerful and flexible way to construct large scale systems composed of individual particles. Each optical tweezer traps a single particle, including, but not limited to, individual neutral atoms and molecules for applications in quantum technology. Loading individual particles into such tweezer arrays is a stochastic process, where each tweezer in the system is filled with a single particle with a finite probability p < 1, for example p ~ 0.5 in the case of many neutral atom tweezer implementations. To compensate for this random loading, real-time feedback may be obtained by measuring which tweezers are loaded and then sorting the loaded particles into a programmable geometry. This may be performed by moving one particle at a time, or in parallel.

[0225] Parallel sorting may be achieved by using two acousto-optic deflectors (AODs) to generate multiple tweezers that can pick up particles from an existing particle-trapping structure, move them simultaneously, and release them somewhere else. This can include moving particles around within a single trapping structure (e.g., tweezer array) or transporting and sorting particles from one trapping system to another (e.g., between one tweezer array and another type of optical / magnetic trap). This sorting is flexible and allows programmed positioning of each particle. Each movable trap is formed by the AODs and its position is dynamically controlled by the frequency components of the radiofrequency (RF) drive field for the AODs. Since the RF drive of the AODs can be controlled in real time and can include any combination of frequency components, it is possible to generate any grid of traps (such as a line of arbitrarily positioned traps), move the rows or columns of the grid, and add or remove rows and columns of the grid, by changing the number, magnitude, and distribution of the frequency components in the RF drive fields of the AODs.
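The relationship between frequency components and traps can be illustrated with a short Python sketch that builds a multi-tone drive waveform for one AOD axis by summing sinusoids, one per row or column of traps. All numerical values (tone frequencies, amplitudes, phases, and sample rate) are hypothetical placeholders, as is the function name; a practical waveform would additionally ramp the tone frequencies in time to move the traps.

```python
import math


def multitone_waveform(tones, sample_rate_hz, n_samples):
    """Sum of sinusoids; `tones` is a list of (frequency_hz, amplitude, phase_rad)."""
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        samples.append(sum(amp * math.sin(2.0 * math.pi * freq * t + phase)
                           for freq, amp, phase in tones))
    return samples


# Hypothetical example: three traps along one axis, i.e. three tones on one AOD.
x_tones = [(95e6, 0.3, 0.0), (100e6, 0.3, 1.0), (105e6, 0.3, 2.0)]
x_waveform = multitone_waveform(x_tones, sample_rate_hz=625e6, n_samples=1024)
```

Adding or removing an entry in the tone list adds or removes a row or column of traps, and changing an amplitude changes the depth of the corresponding traps.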

[0226] In an exemplary embodiment, an optical tweezer array is created using a liquid crystal on silicon spatial light modulator (SLM), which can programmatically create flexible arrangements of tweezers. These tweezers are fixed in space for a given experimental sequence and loaded stochastically with individual atoms, such that each tweezer is loaded with probability p ~ 0.5. A fluorescence image of the loaded atoms is taken, to identify in real time which tweezers are loaded and which are empty.

[0227] After detecting which tweezers are loaded, movable tweezers overlapping the optical tweezer array can dynamically reposition atoms from their starting locations to fill a target arrangement of traps with near-unity filling. The movable tweezers are created with a pair of crossed AODs. These AODs can be used to create a single moveable trap which moves one atom at a time to fill the target arrangement or to move many atoms in parallel.
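One simple way to plan such a rearrangement along a single row is sketched below in Python. This is an illustrative, order-preserving assignment and is not asserted to be the routine used in any embodiment: loaded sites and target sites are matched in sorted order so that no two moves cross, and surplus atoms are reported separately on the assumption that they are released before the moves begin. The function name and the example occupancy are hypothetical.

```python
def plan_row_moves(loaded_sites, target_sites):
    """Pair loaded sites with target sites along one row, preserving order.

    Both inputs are iterables of integer site indices. Because sources and
    destinations are matched in sorted order, no two moves cross each other.
    Returns (moves, surplus): `moves` is a list of (source, destination)
    pairs for atoms that must move, and `surplus` lists loaded sites that
    are not used and are assumed to be emptied beforehand.
    """
    loaded = sorted(loaded_sites)
    targets = sorted(target_sites)
    if len(loaded) < len(targets):
        raise ValueError("not enough loaded atoms to fill the target sites")
    used, surplus = loaded[:len(targets)], loaded[len(targets):]
    moves = [(src, dst) for src, dst in zip(used, targets) if src != dst]
    return moves, surplus


# Hypothetical example: five loaded sites, three contiguous target sites.
moves, surplus = plan_row_moves([0, 2, 3, 7, 9], [2, 3, 4])
```

All moves in the returned list can be executed in parallel by one set of movable tweezers, since their ordering along the row is the same before and after the move.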

[0228] Referring to Fig. 11, a schematic view is provided of an apparatus 1100 for quantum computation according to embodiments of the present disclosure. As shown in Fig. 11, using a beam generated by a light source 1102 (for example, a coherent light source; in some example embodiments - a monochromatic light source), SLM 1104 forms an array of trapping beams (i.e., a tweezer array) which is imaged onto trapping plane 1108 in vacuum chamber 1110 by an optical train that, in the example embodiment shown in Fig. 11, comprises elements 1106a, 1106c, 1106d, and a high numerical aperture (NA) objective 1106e. Other suitable optical trains can be employed, as would be easily recognized by a person of ordinary skill in the art. Using a beam generated by light source 1112 (for example, a coherent light source; in some example embodiments - a monochromatic light source), a pair of AODs 1114 and 1116, having non-parallel directions of acoustic wave propagation (for example, orthogonal directions), creates dynamically movable sorting beams. By using an optical train such as the one depicted in Fig. 11 (elements 1117, 1106b, 1106c, 1106d, and 1106e), the sorting beams are overlapped with the trapping beams. It is understood that other optical trains can be used to achieve the same result. For example, sources 1102 and 1112 can be a single source, with the trapping beam and the sorting beam generated by a beam splitter.

[0229] The dynamic movement of the steering beams is accomplished by employing two non-parallel AODs 1114, 1116, arranged in series. In the example embodiment depicted in Fig. 11, one AOD defines the direction of “rows” (“horizontal” - the ‘X’ AOD) and the other AOD defines the direction of “columns” (“vertical” - the ‘Y’ AOD). Each AOD is driven with an arbitrary RF waveform from an arbitrary waveform generator 1120, which is generated in real time by a computer 1122 which processes the feedback routine after analyzing the image of where atoms are loaded. If each AOD is driven with a single frequency component, then a single steering beam (“AOD trap”) is created in the same plane 1108 as the SLM trap array. The frequency of the X AOD drive determines the horizontal position of the AOD trap, and the frequency of the Y AOD drive determines the vertical position; in this way, a single AOD trap can be steered to overlap with any SLM trap.

[0230] In Fig. 11, laser 1102 projects a beam of light onto SLM 1104. SLM 1104 can be controlled by computer 1122 in order to generate a pattern of beams (“trapping beams” or “tweezer array”). The pattern of beams is focused by lens 1106a, passes through mirror 1106b, and is collimated by lens 1106c onto mirror 1106d. The reflected light passes through objective 1106e to focus an optical tweezer array in vacuum chamber 1110 on trapping plane 1108. The laser light of the optical tweezer array continues through objective 1124a, and passes through dichroic mirror 1124b to be detected by charge-coupled device (CCD) camera 1124c.

[0231] Vacuum chamber 1110 may be illuminated by an additional light source (not pictured). Fluorescence from atoms trapped on the trapping plane also passes through objective 1124a, but is reflected by dichroic mirror 1124b to electron-multiplying CCD (EMCCD) camera 1124d. In this example, laser 1112 directs a beam of light to AODs 1114, 1116. AODs 1114, 1116 are driven by arbitrary waveform generator (AWG) 1120, which is in turn controlled by computer 1122. Crossed AODs 1114, 1116 emit one or more beams as set forth above, which are directed to focusing lens 1117. The beams then enter the same optical train 1106b...1106e as described above with regard to the optical tweezer array, focusing on trapping plane 1108.

[0232] It will be appreciated that alternative optical trains may be employed to produce an optical tweezer array suitable for use as set out herein.

[0233] Partial Decoding and Transversal Algorithmic Fault Tolerance

[0234] Fast logical operations are critical for fault-tolerant quantum computers, as they can improve the speed of quantum computer operation. However, it has long been believed that d rounds of syndrome extraction are required for state initialization and gate operations, d being the code distance, in a broad class of quantum error correcting codes including the paradigmatic surface code. In this example, it is shown that contrary to this common belief, only a constant number of rounds of syndrome extraction are required per operation, even for the surface code, using a new fault-tolerance strategy that is referred to herein as “transversal algorithmic fault tolerance.” By combining transversal gate operations and transversal measurements, and by developing strategies for correlated decoding despite only having access to partial syndrome information, it is proven that the deviation from the ideal measurement result distribution can be made exponentially small in the code distance. This is supplemented with circuit-level simulations of the protocol, including a circuit-level simulation of |T⟩ magic state factories, demonstrating competitive thresholds for the decoding approach. This work sheds new light on the theory of fault-tolerance, and has the potential to reduce the practical cost of fault-tolerant quantum computation by over an order of magnitude.

[0235] Quantum computers have the potential to solve computational problems that are classically intractable. However, many known applications require quantum computers with extremely low error rates, many qubits, and fast operation speeds. Quantum error correction (QEC) provides a possible solution to this challenge: By redundantly encoding logical qubits into many physical qubits, one can suppress errors exponentially in the system size, thereby achieving very low logical error rates. In practice, however, the space and time overhead of QEC is still significant, motivating research into better QEC schemes with lower overhead.

[0236] A key requirement for any QEC scheme is fault-tolerance, such that errors happening anywhere in the quantum circuit can be accounted for. Traditional strategies for fault-tolerance analyze each circuit operation individually, and typically require on the order of d rounds of syndrome extraction, d being the code distance, in order to gain confidence about the measured syndromes and achieve fault-tolerance. This is the case, for example, for the surface code, one of the leading candidates for practical QEC due to its simple 2D layout and competitive thresholds. In typical compilations based on lattice surgery or braiding, even in the case where non-local connectivity is allowed, each logical operation requires d rounds of syndrome extraction, necessitating a time overhead that is on the order of d ~ 30 in typically-analyzed fault-tolerant settings.

[0237] To address the challenge of time overhead, various alternative strategies may be used. In single-shot quantum error correction, check redundancies allow the repair of noisy stabilizer measurement results, thereby allowing the use of only a constant number of syndrome extraction rounds. However, the redundancy also introduces additional spatial overhead and layout complexities, often requiring a number of qubits scaling as d³ and significantly reducing the threshold.

[0238] It may be shown numerically that by performing correlated decoding between multiple logical qubits, it is possible to reduce the number of syndrome extraction rounds following each gate operation in a transversal circuit to O(1), while still maintaining a threshold and competitive space-time overhead. However, in exemplary embodiments of such an approach, all qubits were measured, only classically-simulable Clifford circuits were considered, and no feed-forward operations were performed. This implies that the decoder had access to the full syndrome information, and there is no point at which an outgoing qubit has less than d rounds of syndrome information. In practice, these restrictions can severely limit the achievable space-time volume saving, as many key circuits involve non-Clifford inputs and ancilla qubits, and often have small separation between their initialization and measurement.

[0239] In the present example, a strategy is provided for fault-tolerant quantum computing that is referred to as “transversal algorithmic fault-tolerance,” and it is shown to allow logical operations in the surface code to be executed fault-tolerantly in a single-shot fashion. Contrary to the common belief, it enables the use of only one round of syndrome extraction following each gate, even in the absence of any single-shot QEC property of the underlying code, assuming the existence of efficient decoders. The construction makes use of transversal gates and transversal measurements, supplemented with state injection to allow universal quantum computation. It relies on a decoding strategy that builds on top of correlated decoding, but with additional modifications to ensure consistency between multiple rounds of decoding and conditional feedforward operations.

[0240] Referring to Fig. 19, transversal algorithmic fault-tolerance is illustrated. As shown in Fig. 19A, conventional fault-tolerance analysis separately examines each gadget in the circuit and ensures they are individually fault-tolerant. As shown in Fig. 19B, transversal algorithmic fault-tolerance directly uses all accessible syndrome information up to a logical measurement, and guarantees fault-tolerance of the measurement result, even if the gadgets are not individually fault-tolerant.

[0241] An important intuition is that although state initialization and gate operations with only a single round of syndrome measurement may not be fault-tolerant when viewed individually (Fig. 19A), they become so when viewed holistically in the full algorithmic context (Fig. 19B). In particular, it is only necessary to perform decoding when a logical measurement is performed, and that measurement result is used to infer a subsequent correction that needs to be applied, as in non-Clifford gate teleportation. Upon transversal measurement, the relevant pattern of stabilizer initialization products needed to decode this measurement result is determined, even if this information is not yet available on other logical qubits that have not yet been measured. By then imposing a consistency condition with future measurements in subsequent decoding, one can ensure that the measurement results are close to the ideal measurement distribution, thus guaranteeing correct execution of the quantum circuit.

[0242] To verify the fault-tolerance of this strategy, it is proven that the protocol has a threshold for an arbitrary surface code Clifford quantum circuit with feedforward. Numerical simulations of the protocol are performed, demonstrating that the full code distance is recovered with this strategy, and that it maintains a competitive threshold close to that of the regular surface code.

[0243] These results have significant implications for large-scale fault-tolerant quantum computing. On the practical side, they suggest an immediate depth reduction of O(d) through the use of transversal gates and transversal measurements with fast decoders, or a large constant-factor improvement when decoding speed becomes the bottleneck, and highlight possible architectural advantages of systems that natively support transversal gates over those where operations must be done through lattice surgery. On the fundamental side, they provide a new strategy for achieving fault-tolerance. Recent work has proposed the space-time view of quantum error correction as a general and powerful perspective, but the analysis of fault-tolerance has hitherto been focused on constructing composable building blocks that are individually fault-tolerant. In contrast, the present work demonstrates that there is much to gain from a holistic view of the fault-tolerance of the full circuit, with all operations included.

[0244] Referring now to Fig. 20, a high-level description is provided of an exemplary protocol, together with a high-level sketch of the fault-tolerance proof. Figs. 20A-C illustrate a logical quantum circuit where a measurement result must be interpreted with only partial syndrome information. Circles illustrate a single round of syndrome extraction, and the logical |0⟩ / |+⟩ state preparation is performed by preparing all physical qubits in the corresponding state and measuring the syndromes once. Logical gates and measurements are performed transversally. When the last logical qubit in Fig. 20A is measured and its result decoded, the top two logical qubits have not yet been measured, and complete syndrome information on them is not available. To ensure correctness of the measurement result, it is required to show that when more information becomes available in Fig. 20C, the assigned measurement results are consistent between the multiple decoding attempts of the same result, and reproduce the same correlations as the ideal circuit. This is achieved by exploiting the fact, illustrated in Fig. 20B, that certain Pauli operations act trivially on the logical initial state but can change logical measurement results.

[0245] Fig. 20D illustrates different possible error chains and proof intuition for the surface code, in a cross-sectional view. Boundaries (2001, 2002, 2003, 2004, 2005) are where X errors can terminate, while boundaries (2006, 2007) are where Z errors can terminate. Only X error chains are drawn, since the logical measurement is in the Z basis. Strings (2008, 2009, 2010) do not cause a logical error, either because their net effect when projected to the logical measurement is equivalent to a stabilizer, or because they terminate on a |+⟩_L initialization boundary. The dots (2011, 2012) indicate string branching points due to the transversal CNOT. The string (2013) can cause a logical error, but must have weight at least d.

[0246] To illustrate the protocol, consider the quantum circuit shown in Fig. 20C. Here, |+/0⟩ logical state preparation involves preparing all physical qubits in |+/0⟩, followed by a single round of syndrome measurement, as indicated by the circles. For |+⟩ initialization, this randomly initializes the Z stabilizers, and vice versa. Each logical CNOT is a transversal CNOT between corresponding physical qubits, followed by one round of syndrome measurement, and each logical measurement reads out all physical qubits transversally in the corresponding basis. After each logical ancilla measurement, a measurement result is committed to based on only the existing measurement data, as illustrated by the dashed lines in Fig. 20A and Fig. 20C. Although this is not strictly necessary in this circuit, it will become necessary in circuits with feed-forward operations, so this requirement is imposed. Crucially, at no point are there d rounds of syndrome extraction separating the initialization, measurement, or out-going qubits. Thus, conventional fault-tolerance theory would argue that there is not enough repetition for the initialization values of the random stabilizers to be learned fault-tolerantly. However, the protocol circumvents this issue.
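The string branching at a transversal CNOT follows from the standard propagation rule for Pauli errors through a physical CNOT: an X error on the control is copied onto the target, and a Z error on the target is copied onto the control. The following is a minimal sketch of that rule applied qubit-by-qubit across two code patches. The function name and the bit-list representation of errors are assumptions made for illustration and do not appear in the source.

```python
def propagate_through_transversal_cnot(x_ctrl, z_ctrl, x_tgt, z_tgt):
    """Propagate Pauli error flags through a transversal CNOT.

    Each argument is a list of 0/1 flags, one per physical qubit of a patch
    (hypothetical representation). Qubit i of the control patch is paired
    with qubit i of the target patch.
    """
    n = len(x_ctrl)
    if not (len(z_ctrl) == len(x_tgt) == len(z_tgt) == n):
        raise ValueError("all error vectors must have the same length")
    # X on the control spreads to the target; X on the target stays put.
    new_x_tgt = [t ^ c for t, c in zip(x_tgt, x_ctrl)]
    # Z on the target spreads to the control; Z on the control stays put.
    new_z_ctrl = [c ^ t for c, t in zip(z_ctrl, z_tgt)]
    return list(x_ctrl), new_z_ctrl, new_x_tgt, list(z_tgt)
```

For example, a single X error on control qubit 1 appears on both patches after the gate, which is the branching of the X error string drawn in the cross-sectional picture.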

[0247] The full protocol for performing single-shot fault-tolerant logical operations with the transversal algorithmic fault-tolerant scheme is now described. The logical measurements are grouped into layers based on their feed-forward dependencies. If a logical measurement is influenced by some conditional operation that depends on previous measurement results, the previous measurements must be fully decoded and their results must be committed before proceeding, even in the absence of future syndrome information on the unmeasured qubits. Crucially, it is only necessary to commit to the one bit of information corresponding to the logical measurement result, without committing to specific corrections. This setting is referred to as “partial decoding,” as only partial syndrome information is accessible, but only part of the results needs to be committed to. Although not strictly required due to the lack of Clifford feedforward, the same partial decoding setting is also considered for the Bell state preparation circuit in Fig. 20A and Fig. 20C.
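The grouping of logical measurements into layers can be sketched as a longest-path layering of a dependency graph. The sketch below assumes the dependencies are supplied as a dictionary from each measurement label to the set of earlier measurements whose decoded results condition an operation that influences it. The function name and input format are hypothetical.

```python
def layer_measurements(depends_on):
    """Group measurements into layers by feed-forward dependency.

    depends_on: dict mapping a measurement label to the set of measurement
    labels that must be decoded and committed first (assumed input format).
    Returns a list of layers; every measurement in layer k depends only on
    measurements in layers < k.
    """
    layer_of = {}

    def depth(m, stack=()):
        if m in layer_of:
            return layer_of[m]
        if m in stack:
            raise ValueError("cyclic feed-forward dependency at %r" % (m,))
        parents = depends_on.get(m, ())
        d = 0 if not parents else 1 + max(depth(q, stack + (m,)) for q in parents)
        layer_of[m] = d
        return d

    for m in depends_on:
        depth(m)

    n_layers = 1 + max(layer_of.values(), default=-1)
    layers = [[] for _ in range(n_layers)]
    for m, d in layer_of.items():
        layers[d].append(m)
    return [sorted(layer) for layer in layers]
```

Each returned layer corresponds to one partial decoding step: all measurements in a layer are committed together before any operation conditioned on them is applied.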

[0248] The decoding strategy will proceed as follows. For each layer, the existing logical measurement results are decoded using all currently accessible syndrome data, including transversal logical measurements as well as syndrome measurements up to this time point (e.g., circles and measurement of the last logical qubit in Fig. 20A). It will be shown that the deviations of the measurement results from the ideal measurement distribution are exponentially suppressed in the code distance. Intuitively, this is because the transversal measurements give more reliable syndrome information that can be used to fault-tolerantly determine the relevant stabilizer values on the measured logical qubit and perform the required corrections. Note that there is no guarantee of the correctness of the correction on the unmeasured qubits, but this is irrelevant because there is not yet a need to commit to their measurement results. The strategy thus performs each round of decoding from scratch, not making use of any information about past corrections, thereby avoiding any possibly erroneous inferences from previous measurement rounds that lacked the relevant syndrome history.

[0249] Generically, it is also possible that logical measurement results that have been previously committed to are now assigned a different value (arrow between Fig. 20A and Fig. 20C). In order to maintain consistency, operations are applied on the initial state that do not change the logical state, but anti-commute with the measurements. For example, if a logical qubit is initialized in state |+⟩_L, then one can freely apply an X logical operator without changing the logical state (Fig. 20B). Propagating this operation through the circuit, however, can still flip the measurement result downstream when the measurement result is random, e.g., a Z measurement on the state |+⟩_L is flipped by an X operator. In the decoding strategy, these operators are applied on initialization boundaries, until the measurement results are consistent with previous rounds or it is determined that this is impossible. The required pattern can be computed efficiently by solving a linear system of equations.
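One way to picture the linear system is over GF(2): let A[i][j] = 1 if the j-th initialization-boundary logical operator flips the i-th committed measurement, and let b[i] = 1 if the newly decoded value of measurement i disagrees with the committed value. A solution x of A x = b selects which operators to apply, and the absence of a solution corresponds to the case where consistency is impossible. The sketch below uses plain Gaussian elimination. The matrix encoding and the function name are assumptions for illustration, not the decoder described in the source.

```python
def solve_gf2(A, b):
    """Solve A x = b over GF(2) by Gaussian elimination.

    A: list of rows of 0/1 entries (hypothetical encoding, see lead-in).
    b: list of 0/1 entries, one per row of A.
    Returns one solution as a list of 0/1 (free variables set to 0),
    or None if the system is inconsistent.
    """
    rows = [list(r) + [bi] for r, bi in zip(A, b)]
    n_cols = len(A[0]) if A else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pivot is None:
            continue  # no pivot in this column: free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [u ^ v for u, v in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # A row reading 0 = 1 means no assignment restores consistency.
    if any(row[-1] and not any(row[:-1]) for row in rows):
        return None
    x = [0] * n_cols
    for i, c in enumerate(pivots):
        x[c] = rows[i][-1]
    return x
```

Returning None here plays the role of the heralded failure: no choice of initial logical operators reproduces the previously committed results.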

[0250] There are thus two possible ways in which the protocol could fail. First, one may not be able to find a set of initial logical operators to apply, such that the measurement results are consistent with previous measurement results. Second, even if such an assignment can be made, it is still possible for a logical error to happen and result in different measurement statistics. A rigorous proof is provided below for the proposition that for either of these types of logical errors to occur, at least d / 1 physical errors (including corrections) must have happened in a connected cluster. The intuition is shown in Fig. 20D: error chains that terminate on initialization boundaries can be made trivial by the application of operators that do not change the logical code state, so a logical error must span across the system and have length d. This allows well-known strategies for the proof of fault-tolerance to be adapted, which show that if a logical error requires a connected cluster of a minimal size d, then the logical error probability is suppressed exponentially in d.

[0251] The following threshold theorem may be derived:

[0252] Theorem 1. Consider a surface code error-corrected Clifford quantum circuit with feed-forward, subject to data-syndrome noise, involving at most n logical qubits, t layers of gate operations, and m measurements. There exists a threshold p₀, such that for p < p₀, the above decoding strategy with the most likely error (MLE) as the inner decoder has a logical error rate of at most C(p/p₀)^(d/4), where C is some polynomial factor in m, d, n, t.
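To get a feel for the scaling in Theorem 1, the bound C(p/p₀)^(d/4) can be evaluated numerically. The sketch below treats p₀ and C as free inputs: the source does not give numerical values for either, so any numbers passed in are placeholders chosen only to show the trend with d.

```python
def logical_error_bound(p, p0, d, C=1.0):
    """Evaluate the upper bound C * (p / p0) ** (d / 4) from Theorem 1.

    p:  physical error rate, required to satisfy 0 <= p < p0
    p0: threshold (placeholder value; not specified in the source)
    d:  code distance
    C:  polynomial prefactor in m, d, n, t (placeholder value)
    """
    if not 0 <= p < p0:
        raise ValueError("the bound applies only below threshold, 0 <= p < p0")
    return C * (p / p0) ** (d / 4)
```

Below threshold the ratio p/p₀ is less than one, so each increase of d by 4 multiplies the bound by another factor of p/p₀.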

[0253] A detailed proof is provided below. The proof is under the data-syndrome noise model (also known as the phenomenological noise model), in which each data qubit is subject to error rate p per round, and each syndrome measurement result is flipped with probability p. However, it can be readily generalized to a local stochastic circuit-level noise model, as the syndrome graph has bounded weight. One can assume the use of a most likely error decoder here, which returns the most likely individual error given the syndrome data, but does not consider equivalence classes. However, the numerical results suggest that certain polynomial time decoders such as belief-hyper-union-find can still have a threshold.
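The data-syndrome noise model can be sketched as independent coin flips: in every round, each data qubit suffers an error with probability p, and each syndrome measurement result is flipped with probability p. The sampler below is a simplified illustration that tracks a single error type per data qubit. The function name, arguments, and return layout are hypothetical.

```python
import random


def sample_data_syndrome_noise(n_data, n_checks, n_rounds, p, seed=None):
    """Sample phenomenological (data-syndrome) noise.

    Returns (data_flips, meas_flips), where data_flips[r][q] == 1 means data
    qubit q suffers an error in round r, and meas_flips[r][c] == 1 means the
    outcome of check c is reported incorrectly in round r.
    Simplifying assumption: one Pauli error type is tracked per data qubit.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    rng = random.Random(seed)
    data_flips = [[int(rng.random() < p) for _ in range(n_data)]
                  for _ in range(n_rounds)]
    meas_flips = [[int(rng.random() < p) for _ in range(n_checks)]
                  for _ in range(n_rounds)]
    return data_flips, meas_flips
```

A decoder under test would combine these flags with the check structure of the code to produce the noisy syndrome record for each round.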

[0254] These results are further generalized below to the setting of magic state injection, showing that if a magic state input with a smaller distance d₁ is provided, then this can be grown to the full code distance d in a single step, while still maintaining an effective fault distance d₁. Combining these results, the protocol can implement transversal Clifford circuits with feed-forward and magic state inputs with only a constant number of syndrome extraction rounds following each gate, while still achieving a logical error rate that exponentially decreases in the code distance. This suggests the possibility of universal logical quantum computation in the surface code with single-shot logical operations, despite the lack of single-shot quantum error correction.

[0255] Referring to Fig. 21, numerical simulations of an exemplary protocol are provided to evaluate its performance. In Fig. 21A, a comparison is provided between two different single-shot methods for logical state preparation between three rotated surface codes using transversal gates (left) or lattice surgery (right), followed by logical teleportation and measurement. Fig. 21B is a graph of code distance versus logical error rate. With transversal gates and only a single round of syndrome extraction during state preparation, the error rate decreases exponentially with the code distance. With a single round of lattice surgery, the error rate increases with code distance.

[0256] The circuit shown in Fig. 21A is first considered, which highlights the differences between the transversal-gate-based approach and lattice surgery. The circuit prepares a GHZ state between three rotated surface codes, then teleports these qubits into an additional register, and finally measures the teleported qubits in the Z basis. For GHZ state preparation with transversal gates, the three data logical qubits are prepared in |+⟩ using a single round of syndrome extraction, and two ancilla logical qubits and transversal gates are used to measure the ZZ correlation between neighboring pairs of logical qubits. Each operation only involves a single round of syndrome extraction. This approach is compared against performing lattice surgery between three logical |+⟩ qubits, which can alternatively be viewed as performing a single round of syndrome extraction after state preparation of a larger surface code logical qubit in |+⟩. Circuit-level noise is added to each physical gate in both circuits with probability p = 0.003, and an MLE decoder is used to compute the associated correction. Unlike usual lattice surgery, where d rounds of repetition are employed to fault-tolerantly learn the ZZ product, the single-shot lattice surgery is expected to have an error rate that does not decrease with increasing code distance. As shown in Fig. 21B, this behavior is observed: the logical error rate for the single-shot lattice surgery setting grows with code distance. On the other hand, the protocol using the transversal algorithmic fault tolerance construction has a logical error rate that decays exponentially with the code distance.

[0257] Next, an extension of the circuit shown in Fig. 20C is considered, where instead of checking the consistency between two measurements of the ZZ correlation, k repetitions are performed and it is examined how the probabilities of the two types of logical failure scale. When two ZZ measurements in the same decoding step are decoded to have different values, there will be a heralded failure (Fig. 20A and Fig. 20C), as they cannot be fixed consistently by applying the logical operators illustrated in Fig. 20B. After transversally measuring the logical Bell pair and decoding the full logical circuit, the product of the logical measurement results of the two Bell qubits should match all previous ZZ correlation measurements. If there is a disagreement, one says there is an unheralded logical error. At the logical level, k = 8 is set. At the physical level, the code distance d of the rotated surface code is varied over d = 3, 5, ..., 15. A circuit-level noise model is used, and an MLE decoder is used to perform each round of partial decoding. Numerical experiments show that both the heralded and unheralded logical error rates are suppressed exponentially as d is increased. Furthermore, replacing the transversal state initialization with state injection may be considered. The surface code state injection protocol first expands a physical data qubit into a distance d₁ surface code patch and then expands it again to distance d₂. If one assumes the first step is noiseless (e.g., the output of some previous distillation process), the logical error rate should be determined by d₁, even when the patch expansion requires only a single step. As shown in Fig. 22, the heralded failure rate is indeed suppressed exponentially by increasing d₁.

[0258] Referring to Fig. 22, graphs of physical error rate versus heralded failure rates for repeated Bell measurements with injected initialization states are provided. Instead of transversal state preparation, ideal state injection is performed at distance d₁ followed by local stochastic noise, and then the code patch is grown to the full distance d₂ in a single step. The heralded failure rates are compared at each of the different steps of partial decoding, and it is found that the logical error scaling is consistent with the distance of the smaller patch, indicating the fault-tolerance of single-step patch growth in transversal algorithmic fault-tolerance.

[0259] Finally, the |Y⟩ state (also known as S state) magic state distillation circuit in Fig. 23 is considered. At the logical level, the S gate is implemented by injecting an ancilla |Y⟩ state, followed by an S gate teleportation, measuring out the ancilla qubit in the Z basis. There are three partial decoding steps:
1. Interpret Z logical measurements on the injected |Y⟩ ancilla qubits.
2. Interpret X logical measurements on qubits on which the S gates have been applied.
3. Interpret X logical measurement on the output qubit and the last reference qubit.

[0260] Since the circuit only involves Pauli feed-forward, these classical controlled gates can be applied in software by flipping the measurement outcomes on the target qubits according to the decoding outcomes on the control qubits, as long as only the syndrome information accessible at a given measurement is used. A logical error in this example is identified by the logical measurement outcome on the final qubit after the software application of the classical controlled gates yielding -1. The performance of this circuit is numerically investigated, showing that it follows the expected error scaling of a magic state factory.
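Applying Pauli feed-forward in software amounts to classical bit flips on recorded outcomes. In the sketch below, decoded outcomes are stored as bits (0 for +1, 1 for -1), and each classically controlled Pauli is represented as a (control, target) pair, meaning that the controlled Pauli anti-commutes with the measurement basis of the target. This encoding, the function name, and the assumption that pairs are listed in circuit order are illustrative choices, not taken from the source.

```python
def apply_pauli_feedforward_in_software(decoded, controlled_flips):
    """Apply classically controlled Pauli gates as outcome flips.

    decoded: dict mapping a logical measurement label to its decoded bit
             (0 for the +1 outcome, 1 for the -1 outcome).
    controlled_flips: list of (control, target) label pairs in circuit
             order (hypothetical encoding); if the control outcome is 1,
             the target outcome is flipped.
    Returns a new dict; the input is left unchanged.
    """
    result = dict(decoded)
    for control, target in controlled_flips:
        if result[control]:
            result[target] ^= 1
    return result
```

Because the pairs are processed in order, a target that was flipped earlier can itself act as a control later with its corrected value.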

[0261] In this discussion, it has been shown that single-shot logical operations can be performed without relying on single-shot error correction. Crucial to the construction is the use of a new decoding strategy that takes into account the full circuit context, and directly performs correlated decoding on the whole space-time graph.

[0262] This work highlights the advantages of transversal gate operations and transversal measurements over lattice-surgery-based alternatives. Transversal measurements provide more reliable syndrome information, allowing fault-tolerance in a space-time process that would otherwise lack it. In the quest for a unified theory of fault-tolerance, this work highlights the necessity of directly considering the entire circuit as a whole, rather than breaking things up into composable blocks, as is done in alternative approaches.

[0263] This approach provides a reduction in space-time overhead of O(d) in the abstract circuit model.
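The origin of the O(d) factor can be seen from a rough volume count: a distance-d surface code patch uses on the order of d² physical qubits, and a logical operation takes d rounds of syndrome extraction conventionally versus one round in the single-shot setting. The sketch below keeps only this leading scaling and drops all constant factors, which is a simplifying assumption. The function names are hypothetical.

```python
def spacetime_volume(d, rounds_per_gate, n_gates=1):
    """Qubit-rounds for n_gates logical operations on one distance-d patch.

    Only the d**2 scaling of the patch size is kept (constants dropped).
    """
    return d * d * rounds_per_gate * n_gates


def volume_reduction(d):
    """Ratio of conventional (d rounds) to single-shot (1 round) volume."""
    conventional = spacetime_volume(d, rounds_per_gate=d)
    single_shot = spacetime_volume(d, rounds_per_gate=1)
    return conventional / single_shot
```

Under these assumptions the ratio equals d, matching the stated reduction from a d³-like volume to a d²-like volume per logical operation.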

[0264] Comparison with alternative methods

[0265] In single-shot quantum error correction, redundancies are present in the syndrome measurement results, allowing one to robustly infer the actual stabilizer values up to small residual errors, in a fashion similar to classical error correction on the syndrome readings.

[0266] These ideas may be extended to certain families of quantum low-density parity-check (qLDPC) codes, where expansion and the so-called confinement property lead to single-shot QEC for quantum memories. In this case, however, there are usually no stabilizer redundancies, and so the randomly initialized stabilizer values cannot be reliably inferred in the conventional fault-tolerance strategies. Here, one only guarantees that the output error after a round of error correction is controlled if both the input error and added noise are controlled, and one may still require d rounds of repetition to learn the initialized values of the stabilizers with sufficient confidence. Moreover, the most general methods for performing logical operations on them make use of lattice surgery, which also requires d rounds of syndrome extraction to maintain fault-tolerance, similar to the lattice surgery example for the surface code analyzed above.

[0267] In both cases, single-shot QEC focuses on the fault-tolerance and error-reducing effect of individual error correction gadgets, rather than the complete end-to-end algorithmic context. This is in contrast to the fault-tolerance strategy described herein, which uses all accessible information throughout the algorithm.

[0268] In the following discussion, the importance of shallow-depth algorithmic gadgets in many practical compilations of quantum algorithms is discussed. This highlights the need for fault-tolerance strategies that do not require an O(d) separation between initialization and measurement, as developed above.

[0269] In general, circuit components that involve an ancilla logical qubit often have a shallow depth between initialization and measurement, whether this ancilla is used for algorithmic reasons or compilation reasons. For instance, temporary ancilla registers are used in algorithmic gadgets such as adders or quantum read-only memories, where the bottom rail of a ripple carry structure is initialized, 2 or 3 operations are performed on it, and then the ancilla qubit is measured. A useful technique for performing multiple circuit operations in parallel is time-optimal quantum computation, which is also related to gate teleportation and Knill error correction. In this case, a pair of logical qubits is initialized in a Bell state. One qubit is then sent as the input into a circuit fragment A, while the other qubit executes a Bell basis measurement with the output of another circuit fragment B. The combined circuit is equivalent to the sequential execution of B and A. This allows the two circuit fragments to be executed in parallel, despite them originally being sequential, thereby reducing the total circuit depth and idling volume. However, to fully capitalize on this advantage, it is desirable to only have a constant number of syndrome measurement rounds separating the Bell state initialization and Bell basis measurement, in order to minimize the extra circuit volume incurred by the space-time trade-off. Thus, a depth O(1) separation between state initialization and measurement is again highly desirable.

[0270] Another common situation in which there is a low-depth separation between initialization and measurement is magic state distillation and auto-corrected magic state teleportation. Many magic state factories involve a constant-depth Clifford circuit (e.g., depth 4 for the 15-to-1 distillation factory), followed by the application of non-Clifford rotations. The non-Clifford rotations are often implemented via noisy magic states and gate teleportation, which therefore require logical measurements. If the Clifford circuit depth has to be at least d to maintain fault-tolerance, the time cost of the magic state factory will be much larger than the case in which the circuit can be executed fault-tolerantly in constant depth, as demonstrated herein.

[0271] In practice, classical syndrome decoding will take a finite amount of time. In order to perform conditional feed-forward operations, it is often necessary to fully decode the previous measurement results and commit to them. However, the finite runtime of the classical decoding procedure may cause delays and even growing backlogs. The following discussion covers both the theoretical (asymptotic) and practical implications of these considerations on the schemes provided herein. Most of this discussion is not specific to the present fault-tolerance strategy, and may also apply to existing discussions of single-shot fault-tolerance.

[0272] Many approaches to fault-tolerant quantum computation rely on mid-circuit measurements and feed-forward operations. For example, in gate teleportation of |T⟩ states, a measurement followed by conditional S gate feed-forward is usually performed, requiring the measurement result to be decoded with sufficient confidence before the feed-forward gate is executed. Alternatively, in the implementation of transversal non-Clifford gates, one needs to first perform decoding to fix residual X errors before performing the transversal gate. Quantum circuits where gates are performed in a single-shot fashion could thus impose a stringent requirement on the decoding speed: since the separation between sequential measurements or decoding rounds may be O(1), e.g., when synthesizing an arbitrary angle rotation, one may also require the decoder to finish within a similar amount of time, in order to not introduce time or error bottlenecks. Thus, in order to support single-shot quantum error correction, O(1)-time classical decoding may be necessary.

[0273] There are various possibilities for decoders with classical constant runtime. For the surface code, parallel versions of minimum-weight perfect matching may be employed, with an average runtime that is argued to be O(1) per round of error detection. However, this runtime is amortized across multiple rounds of decoding, and still induces an O(d) latency when trying to decode any given logical measurement result. For LDPC codes, a parallel small-set-flip decoder has been argued to have constant parallel runtime if some amount of residual noise is allowed, which can be handled by future rounds of QEC. However, it again does not apply to the setting of decoding a measurement, where one must return to the codespace, and the decoder requires a runtime that is logarithmic in the initial syndrome weight in that case. Generically, one may even expect there to exist obstructions against constant-depth classical decoding when it is required to return to the codespace, as the decoder may need to see Ω(d) sites in order to make the correct inference.

[0274] In the case where constant time parallel decoding is not achievable, the scheme no longer supports O(1) rounds of syndrome extraction per logical operation, as more rounds of syndrome extraction must be inserted while waiting for the classical decoding to complete. However, it can still support any constant improvement over the typical Θ(d) cost per logical operation, based on the ratio between quantum and classical operation speeds: at most d / C rounds of syndrome measurement are needed, where the constant C can be made arbitrarily large as the ratio between classical and quantum operation speeds increases. Note that this situation is in stark contrast to existing fault-tolerance constructs, where the strategy fundamentally does not support a complete set of constant-time fault-tolerant logical operations, regardless of the speed of the classical computer.

[0275] In practice, there are many additional considerations for how to optimize decoding and achieve sufficiently fast runtimes. Although the relevant decoding graph for a given measurement may now be much larger, involving possibly d layers of syndrome extraction or more and many (possibly all) logical qubits, it is expected that using decoding algorithms such as hypergraph union-find or minimum-weight parity-factor decoding, and using parallelization techniques, the decoding problem can be divided into many smaller clusters of size O(d) and handled with limited communication. Decoding algorithms based on belief propagation may also allow for a high degree of parallelization. This may allow sufficient parallelization of the decoding problem and a decoding complexity that in practice becomes similar to techniques for gadget-level modular decoding. Because the decoding problems have substantial overlap, it may be possible to make partial use of past decoding results, particularly when the clusters do not merge during decoding. Although the relevant decoding graph for any given measurement is now larger, for a given rate of syndrome measurements on the hardware, the amount of incoming data is comparable to the usual fault-tolerance setting, and thus one may expect there to not be a substantial increase in classical decoding resources required in practice.

[0276] It is in principle possible to perform the decoding and feedback coherently within the quantum computer, and completely eliminate the need for measurements. A separate analysis of the overhead of the decoding process would need to be performed to understand the impact of finite decoding time in that setting and beyond the concatenated fault-tolerance considered there. However, such approaches are typically believed to be less effective in removing entropy from the system due to the lack of measurements, resulting in worse thresholds, so they are not the focus here.

[0277] The following describes the protocol for turning a target quantum circuit into a fault-tolerant circuit. This is specified by two pieces of information: the resulting physical quantum circuit that must be executed, and the decoding strategy.

[0278] For now, the discussion will focus on Clifford circuits for simplicity. However, the application of nontrivial Clifford feedforward operations is allowed, thereby requiring that some measurement results must be fully decoded before other measurement results become available. This setting captures the key challenge of non-Clifford circuits, namely that they often require conditional feedforward operations, without necessarily building up a full syndrome history on some of the outgoing qubits. In later sections, this proof is extended to a setting motivated by state injection, thereby providing evidence that the conclusions are also relevant to universal quantum computation.

[0279] Definition 2 (Clifford quantum circuit with feedforward). Define 𝒞, a Clifford quantum circuit with feedforward, to be a quantum circuit that consists of layers of the following operations:
1. Qubit initialization in state |0⟩.
2. Single-qubit Z gates.
3. Single-qubit H gates.
4. Single-qubit S gates.
5. CNOT gate between any pair of qubits.
6. Identity gate, if no other operation is specified in this layer.
7. Measurement of a subset of qubits in the Z basis.
8. Feedforward Clifford operations of the above types, conditional on certain measurement results of the measured qubits, on some subset of the remaining qubits.

[0280] To facilitate the construction of an error-corrected version of these circuits, the Clifford circuit has been compiled into a particular set of operations. X or Y basis operations can be obtained from the Z basis via H and / or S gates.

[0281] Having defined the circuit, it is now prescribed how it is translated into an error-corrected quantum circuit based on the surface code:

[0282] Definition 3 (Surface code error-corrected Clifford quantum circuit with feedforward). Given a Clifford quantum circuit with feedforward (Def. 2), the logical Clifford quantum circuit of distance d is defined by replacing each of the operations as follows:
1. Qubit initialization in the Z basis is replaced by initialization of all physical qubits in the Z basis with eigenvalue +1, followed by one round of syndrome measurement.
2. Single-qubit Z gates do not lead to any physical action, but are tracked in the logical Pauli frame.
3. Single-qubit H gates are replaced by a transversal H gate, in which an H gate is applied on each physical qubit of the code patch, followed by a reflection across the diagonal, and one round of syndrome measurement.
4. Single-qubit S gates are replaced by a fold-transversal S gate, in which physical S gates are applied on qubits on the diagonal, and CZ gates are applied on pairs of qubits that are matched together when folding across a diagonal. This is followed by one round of syndrome measurement.
5. CNOT gates are replaced by transversal CNOTs between pairs of logical qubits, followed by one round of syndrome measurement.
6. Identity gates are replaced by one round of syndrome measurement.
7. Measurements in the Z basis are replaced by a transversal measurement of all corresponding physical qubits in the Z basis.
8. Feedforward Clifford operations are executed in the same way as above, based on the decoded logical measurement results.
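The replacement rules of Definition 3 can be summarized as a lookup from each logical operation to the number of syndrome measurement rounds that follow it. The sketch below counts the syndrome rounds of a layered logical circuit under the assumption that operations within one layer act on different logical qubits and share a single round. The operation labels and function name are hypothetical.

```python
# Syndrome rounds following each logical operation, read off Definition 3.
# The string labels are hypothetical names chosen for this sketch.
SYNDROME_ROUNDS = {
    "init_z": 1,     # physical Z-basis initialization + one round
    "Z": 0,          # tracked in the logical Pauli frame, no physical action
    "H": 1,          # transversal H + reflection + one round
    "S": 1,          # fold-transversal S + one round
    "CNOT": 1,       # transversal CNOT + one round
    "I": 1,          # identity: one round
    "measure_z": 0,  # transversal readout of all physical qubits
}


def count_syndrome_rounds(layers):
    """Total syndrome rounds for a circuit given as a list of layers.

    Each layer is a list of operation labels acting in parallel; a layer
    costs one round if any of its operations is followed by a round.
    """
    total = 0
    for layer in layers:
        total += max((SYNDROME_ROUNDS[op] for op in layer), default=0)
    return total
```

Under this counting the number of rounds grows with the number of circuit layers and is independent of the code distance d, in contrast to a construction that inserts d rounds after every operation.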

[0283] Here, all logical qubits (code patches) are non-rotated surface codes of the same code distance d.

[0284] The syndrome measurement for the surface code can be performed simultaneously in both bases. When initializing the logical qubit, the values in one basis are already deterministic, and therefore one only needs to measure the complementary basis. However, for simplicity of analysis, both bases are included here.

[0285] Referring to Fig. 24, the unrotated surface code is illustrated. White qubits are data qubits. The logical Z (X) operator runs vertically (horizontally), and the convention for fixing Z (X) stabilizers is chosen to be performing a chain of X (Z) flips to the left (bottom) boundary. Rows (columns) that have data qubits on the outer edge are referred to as major rows (columns).

[0286] All surface code patches are chosen to have the same distance d unless otherwise noted, and follow the orientation shown in Fig. 24. Furthermore, the logical operator representatives are chosen as follows: the logical X operator is chosen to be the product of X operators on the topmost row of qubits, and the logical Z operator to be the product of Z operators on the right-most column of qubits. While any equivalent choice of logical representative is valid, this particular convention is chosen so that fixing the stabilizer values will not change the logical qubit readout.

[0287] The non-rotated surface code is used because it has a simpler implementation of the fold-transversal S gate. The results are readily generalizable to the rotated surface code, with a slight increase in proof complexity and a slight sacrifice in code distance for the S gate.

[0288] Each nontrivial physical operation is followed by one round of stabilizer syndrome measurement in this construction. This is primarily for simplicity of analysis, and the number of rounds should be optimized in practice depending on the given target circuit and target logical error rate, possibly even performing multiple gate operations before one round of syndrome measurement. Notice also that d rounds of syndrome measurement are not performed following a given operation.

[0289] The physical quantum circuit to be executed has now been specified. To complete the specification a decoding strategy is also specified, which will then be used to prove the fault-tolerance of the protocol.

[0290] Due to the random initial projection when measuring X stabilizers during |0⟩_L initialization, the physical state is not initially in the code space, where all stabilizers should have eigenvalue +1. Therefore, Z strings are applied to fix the stabilizers back into the code space. To capture the randomness introduced when measuring a physical qubit initialized in |0⟩ in the X basis, a Z operator on that site is multiplied into the state with 50% probability. Starting from a reference sample of measurement results, the full measurement result distribution can then be obtained by considering the distribution over these random Z operators and error events.

[0291] For the purposes of this discussion, a particular basis is defined for these extra random variables, so that stabilizer equivalences are automatically accounted for. Other choices of the basis, however, will lead to the same results, as long as they are chosen consistently across logical qubits.

[0292] First, recall the choice of logical representatives for the unrotated surface codes, as illustrated in Fig. 24. While any equivalent choice of logical qubit is valid, this particular convention is chosen so that fixing the stabilizer values will not change the logical qubit readout.

[0293] Gauge variables and operators are now defined, which specify the extra degrees of freedom that are randomly initialized when the logical state |0⟩_L is prepared. For each X stabilizer s, the gauge operator G_s is defined to be the product of the Z operator chain that pairs to the bottom boundary, i.e., the product of Z operators along the column that the X stabilizer is located in, starting from the bottom data qubit of the stabilizer to the bottom boundary. For each logical qubit initialization and each random X stabilizer s, a binary random variable g_s is associated with probability p = 0.5, denoting whether the gauge operator G_s is applied. If the stabilizer was initialized as -1, then G_s is applied.

[0294] Note that with this choice of logical operators, when applying G_s, only the single stabilizer s is flipped, while all other stabilizers and logical operators remain unchanged. The random variables g_s capture the fact that the X stabilizer values are initialized randomly when preparing |0⟩_L.
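The claim that a gauge operator flips only its own X stabilizer and leaves the logical X representative untouched can be checked directly on a small patch. The sketch below assumes a particular coordinate layout for the unrotated surface code (a (2d-1) x (2d-1) grid with row 0 at the top, data qubits where row + column is even, and X stabilizers at odd row, even column); this layout and all function names are assumptions made for illustration and are not taken from Fig. 24.

```python
def unrotated_patch(d):
    """Return data qubits and X-stabilizer supports on a (2d-1) x (2d-1) grid.

    Assumed layout: data qubits at (row, col) with row + col even,
    X stabilizers at (odd row, even col); row 0 is the top boundary.
    """
    n = 2 * d - 1
    data = {(r, c) for r in range(n) for c in range(n) if (r + c) % 2 == 0}
    x_stabs = {}
    for r in range(1, n, 2):
        for c in range(0, n, 2):
            nbrs = {(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)}
            x_stabs[(r, c)] = nbrs & data  # boundary checks have weight 3
    return data, x_stabs


def gauge_operator(d, s):
    """Z chain in the column of X stabilizer s, from the data qubit just
    below s down to the bottom boundary."""
    r, c = s
    return {(rr, c) for rr in range(r + 1, 2 * d - 1, 2)}


def flipped_x_stabilizers(d, z_support):
    """X stabilizers that anticommute with a Z operator on z_support."""
    _, x_stabs = unrotated_patch(d)
    return {s for s, supp in x_stabs.items() if len(supp & z_support) % 2 == 1}


def flips_logical_x(d, z_support):
    """Logical X is taken as X on the topmost row of data qubits."""
    top_row = {(0, c) for c in range(0, 2 * d - 1, 2)}
    return len(top_row & z_support) % 2 == 1
```

Under these assumptions, for every X stabilizer s the set `flipped_x_stabilizers(d, gauge_operator(d, s))` contains only s, and `flips_logical_x` returns False, matching the statement above.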

[0295] Similarly, one can define gauge variables that correspond to flipping logical operators during state initialization. For the logical Z operator l, a binary random variable g_l is associated with probability p = 0.5, denoting whether the logical operator l is applied. Since the logical qubit is initialized in |0⟩_L, the logical operator Z acts trivially on the initial state, i.e., it can be viewed as a stabilizer in space-time.

[0296] The vector g is used to denote the vector of all gauge variables, and G is used to denote the set of operators they correspond to, including both the ones that correspond to stabilizers and those that correspond to logical operators.

[0297] With these concepts in hand, one may return to the description of the detector error model. The phenomenological error model is specified below.

[0298] Definition 4 (Phenomenological error model). A phenomenological noise model with probability p replaces each operation as follows:
1. Each data qubit initialization in |0⟩ is followed by an X error with probability p.
2. Similarly, each data qubit measurement is preceded by an X error with probability p.
3. Each gate, including the identity gate, is followed by a depolarizing error of strength p on each of the data qubits involved in the operation.
4. The error correction is performed perfectly, but each measurement result is flipped with probability p.
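As an illustration of items 3 and 4 of Definition 4, the following sketch samples the faults of one gate layer followed by one round of syndrome measurement. The function name and return format are hypothetical, and the convention that a depolarizing error of strength p applies X, Y or Z with probability p/3 each is an assumption of this sketch; the source does not fix that convention.

```python
import random


def sample_gate_layer_faults(num_data, num_checks, p, rng):
    """Sample faults for one gate layer plus one syndrome round.

    Assumed convention: depolarizing noise of strength p applies X, Y or Z
    with probability p/3 each on every data qubit.
    Returns (paulis, flips): a dict qubit -> 'X' | 'Y' | 'Z', and the list
    of syndrome bits whose reported value is flipped.
    """
    paulis = {}
    for q in range(num_data):
        if rng.random() < p:
            paulis[q] = rng.choice("XYZ")
    flips = [c for c in range(num_checks) if rng.random() < p]
    return paulis, flips
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps the samples reproducible.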

[0299] Although this focuses on phenomenological noise for simplicity of the proof, the analysis should readily generalize to local stochastic noise models and circuit-level noise models.

[0300] Define the set of possible elementary errors (faults) as ε. A given error realization is denoted by the vector e ∈ {0,1}^|ε|, where the i-th entry of the vector is 1 iff the i-th error in ε occurred. The same notation is used to denote the Pauli operator the fault configuration corresponds to, where the meaning should be clear from the context. These errors will trigger certain detectors (checks) from the set of all possible detectors D, and will flip certain logical operators from the set of all logical operators L. Here, detectors are products of stabilizer measurement outcomes that are deterministic in the absence of errors. Therefore, their values depend only on e and not g. The set of detectors that a given error triggers is denoted as D(e).

[0301] For the purposes of this discussion, the most likely error (MLE) decoder is used, also known as the minimum weight decoder. Given a detector shot d = D(e), the MLE decoder returns the most likely error κ ∈ {0,1}^|ε| that is consistent with the observed detectors. The total action of error and recovery is then given by f = e ⊕ κ, where addition is understood to be mod 2. In slight abuse of terminology, this joint action is referred to as the fault configuration. The decoder should return a correction that triggers the same set of detectors as the error, D(κ) = D(e), so that, by linearity, the combination triggers no detectors, D(κ ⊕ e) = 0. Note that this decoder solves the most likely error problem instead of the maximum likelihood problem, i.e., it does not consider the entropy factor associated with the number of cosets. Additionally, for generic decoding problems, identifying the most likely error may be computationally challenging. Similar to the errors, the decoding result also corresponds to a certain assignment of gauge operators, which is denoted λ. For each logical qubit that is measured, one will correct back into the codespace. Therefore, the joint action of gauge variable and gauge assignment h = g ⊕ λ, which is referred to as the gauge configuration, must be equivalent to a logical operator when restricted to the logical qubits that were measured.
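The defining property of the decoder, that the returned correction reproduces the observed detectors so that error plus correction triggers none, can be illustrated with a brute-force search on a toy model. The sketch below is not the decoder of the source: it assumes that all faults are equally likely with probability below 1/2 (so the most likely error is the one with the fewest faults), it scales exponentially, and the matrix `H` and function names are hypothetical.

```python
from itertools import combinations


def detectors(H, e):
    """D(e): detectors triggered by fault vector e, where H[i][j] = 1 iff
    fault j flips detector i (arithmetic mod 2)."""
    return tuple(sum(h * x for h, x in zip(row, e)) % 2 for row in H)


def mle_decode(H, shot):
    """Return a minimum-weight fault vector consistent with the shot.

    Brute force over supports of increasing size; assumes equally likely
    faults, so fewest faults means most likely.
    """
    n = len(H[0])
    for w in range(n + 1):
        for support in combinations(range(n), w):
            kappa = [1 if j in support else 0 for j in range(n)]
            if detectors(H, kappa) == tuple(shot):
                return kappa
    return None


def fault_configuration(e, kappa):
    """f = e XOR kappa, the joint action of error and correction."""
    return [a ^ b for a, b in zip(e, kappa)]
```

For the three-bit chain `H = [[1, 1, 0], [0, 1, 1]]`, the single fault `e = [1, 0, 0]` is decoded exactly and f is all zeros, whereas the two-fault error `e = [1, 1, 0]` is decoded as `[0, 0, 1]`, giving f = `[1, 1, 1]`: no detector is triggered, but the full chain has been flipped.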

[0302] The logical readout result depends on the measurement results of the physical qubits, the fault configuration f = e ⊕ κ, and the gauge configuration h = g ⊕ λ. More specifically, the physical measurement results that would have occurred in the absence of any physical errors are written as m, and the ideal observable M_id(m) is defined as the product of the ideal physical measurement results. Denote by S(e) the propagation of all data qubit faults e through the circuit to the measurements, and similarly S(g) for gauge operators propagated to measurements. Let P be the set of operations obtained when propagating to a logical qubit measurement, and denote by C(P) the resulting change in the logical observable. Note that C(P) is linear in P. The raw observable is then given by
M_raw(m, e, g) = M_id(m) ⊕ C(S(e) ⊕ S(g)).

[0303] Upon decoding, an inference of the errors and gauge operators is produced, resulting in the correction C(S(κ) ⊕ S(λ)) to the logical observable. The full measurement result is then given by
M(m, e, g, κ, λ) = M_id(m) ⊕ C(S(e) ⊕ S(g)) ⊕ C(S(κ) ⊕ S(λ)) = M_id(m) ⊕ C(S(e ⊕ κ) ⊕ S(g ⊕ λ)). (1)

[0304] The relations between faults, gauge variables, detectors and logical measurement results can be expressed as a decoding graph (detector error model), with an edge between each fault or gauge variable node and each detector or logical measurement result that they flip.

[0305] The correction displays an important property: the measurement result distribution is identical regardless of the value of the gauge variable. Intuitively, this is because the particular random stabilizer pattern into which the state is projected should not affect the logical measurement results.

[0306] Lemma 5 (Gauge variables do not affect measurement distribution). Given a fixed, arbitrary fault configuration f, for any choice of gauge variable g ∈ G, the logical measurement distribution M(m, f, g) is identical regardless of the choice of g.

[0307] A choice of gauge variable corresponds to applying strings of Z operators that pair to the bottom boundary, and applying the logical Z operator on the |0⟩_L initial state. The former does not intersect the X logical representative chosen, and therefore does not flip the measurement result. The latter acts trivially on the initial state, so it also does not change the measurement distribution.

[0308] An additional useful concept is that of a decoding subgraph. When performing a transversal logical measurement on some subset of logical qubits part way through the logical circuit, only a subset of the syndrome information will be accessible. One therefore needs to work with a reduced set of syndrome information and detectors. Crucially, however, one has access to the results of the transversal measurement, which allows one to obtain sufficient information to decode the logical measurement under question.

[0309] More concretely, given a set of logical measurements M, the decoding subgraph Γ_M associated with it is the subgraph of errors ε, gauge variables G, detectors D and logical measurements L of the original detector error model, which only makes use of stabilizer measurement and logical measurement results up to the last logical measurement in M. The restriction is denoted with a subscript, e.g., D|_M.

[0310] The measurements in the quantum circuit will have some causal dependency due to the conditional feedforward. For example, if the measurement result m1 produces a feedforward operation that can change the measurement basis of another measurement result m2, then m2 depends on m1, or notationally, m1 < m2. This defines a partial order on the full set of measurement results M, and one can choose a concrete ordering of the measurement results that is consistent with the partial order, {m1, m2, ..., m_o}. The decoding subgraph corresponding to the first j measurements in M is given a similar notation as above, e.g., the decoding subgraph is Γ|_j, the set of measurements is M|_j, the set of detectors is D|_j.
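Choosing a concrete ordering consistent with the feedforward partial order amounts to a topological sort. A minimal sketch using the Python standard library is shown below; the dictionary format and the function names are assumptions for illustration only.

```python
from graphlib import TopologicalSorter


def measurement_order(depends_on):
    """Return one ordering of measurements consistent with the partial order.

    depends_on maps each measurement label to the set of measurements whose
    results it is conditioned on, so m1 < m2 is written as
    'm1' in depends_on['m2'].
    """
    return list(TopologicalSorter(depends_on).static_order())


def prefixes(order):
    """The nested subsets M|_1, M|_2, ... used for successive decoding."""
    return [order[:j] for j in range(1, len(order) + 1)]
```

For example, `measurement_order({"m1": set(), "m2": {"m1"}, "m3": {"m1", "m2"}})` returns `["m1", "m2", "m3"]`, and `prefixes` of that list gives the measurement sets of the successive decoding subgraphs.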

[0311] Definition 6 (Decoding strategy). Given a measurement order {m1, m2, ..., m_o}, for each subset of logical measurements M|_j = {m1, m2, ..., m_j}, the decoding subgraph is constructed and the MLE decoder is applied. Gauge logical variables are then flipped to make the inferred measurement results consistent with the set of previous measurement results. If this is possible, all measurement results in M|_j are committed after applying the appropriate flip, and execution of the circuit continues. Otherwise, a heralded failure has occurred and FAIL is output. This process is repeated until all measurements have been processed. If the algorithm has not output FAIL, the list of all measurement results is output.

[0312] In this definition, the logical measurement results are processed and fed-forward one by one. The technique, however, also readily applies to the case where one instead partitions the logical measurements based on layers of Clifford feedforward operations, resulting in fewer rounds of decoding. The inference of how gauge logical variables flip logical measurement results is already performed at the stage of building the detector error model, and has an efficient polynomial time algorithm based on stabilizer tracking.

[0313] To show that this decoding strategy has a high probability of success, one can show two things: First, the probability of outputting FAIL should be low; Second, the measurement distribution should be close (in total variation distance, TVD) to the measurement distribution of the ideal circuit.

[0314] The following examines how physical errors propagate under transversal gate operation. The transversality guarantees that a given error cannot cause too many errors on a given code patch when propagated to the qubit measurements.

[0315] Lemma 7 (Transversal gates limit error propagation). For any elementary error E, consider its forward propagation S(E) = U†EU in a deterministic circuit, to right before a logical qubit measurement. Then U†EU will cause at most 2 errors in a given X or Z basis.

[0316] For the phenomenological noise model (Def. 4), each elementary error acts only on a single qubit. Measurement errors do not apply any operation to the data qubits, so they will not cause any effect on the data qubits at the logical measurement. The syndrome measurements themselves also do not directly flip any physical qubit within the phenomenological noise model.

[0317] Suppose the elementary error E acts on qubit q. Define q̄ to be the qubit related to it by a reflection across the diagonal; q̄ = q for qubits on the diagonal. Z, H and I gates act locally, so they can only change the basis of the error, but do not change the weight within a given code patch. Fold-transversal S gates can propagate errors between q and q̄, but not anywhere else. Finally, a transversal CNOT gate keeps errors on the same relative site between different code patches. Therefore, U†EU can only have support on q and q̄, thereby having weight at most 2 in any basis.
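The propagation argument can be exercised numerically by tracking a single Pauli error through random sequences of transversal H, fold-transversal S and transversal CNOT gates. The sketch below is a simplified model under stated assumptions: each patch is abstracted as a d x d grid of sites (row, col) whose diagonal reflection swaps row and column, Pauli signs are ignored, and all function names are hypothetical.

```python
import random


def mirror(site):
    """Reflection of a site (row, col) across the main diagonal."""
    return (site[1], site[0])


def apply_h(frame, patch):
    """Transversal H: swap X and Z components, then reflect the patch."""
    new = {}
    for (p, site), (x, z) in frame.items():
        new[(p, mirror(site)) if p == patch else (p, site)] = (z, x) if p == patch else (x, z)
    return new


def apply_s(frame, patch):
    """Fold-transversal S: an X component on q adds Z on its mirror site
    (the mirror of a diagonal site is the site itself)."""
    new = dict(frame)
    for (p, site), (x, _z) in frame.items():
        if p == patch and x:
            tx, tz = new.get((p, mirror(site)), (0, 0))
            new[(p, mirror(site))] = (tx, tz ^ 1)
    return new


def apply_cnot(frame, ctrl, tgt):
    """Transversal CNOT: X copies control -> target, Z copies target -> control."""
    new = dict(frame)
    for (p, site), (x, z) in frame.items():
        if p == ctrl and x:
            tx, tz = new.get((tgt, site), (0, 0))
            new[(tgt, site)] = (tx ^ 1, tz)
        if p == tgt and z:
            cx, cz = new.get((ctrl, site), (0, 0))
            new[(ctrl, site)] = (cx, cz ^ 1)
    return new


def weights(frame, patch):
    """Number of sites in a patch carrying an X component and a Z component."""
    wx = sum(1 for (p, _), (x, _z) in frame.items() if p == patch and x)
    wz = sum(1 for (p, _), (_x, z) in frame.items() if p == patch and z)
    return wx, wz


def max_weight_after_random_circuit(seed, num_patches=3, d=3, depth=30):
    """Propagate one random single-qubit Pauli through a random gate sequence."""
    rng = random.Random(seed)
    site = (rng.randrange(d), rng.randrange(d))
    frame = {(rng.randrange(num_patches), site): rng.choice([(1, 0), (0, 1), (1, 1)])}
    for _ in range(depth):
        gate = rng.choice(["H", "S", "CNOT"])
        if gate == "CNOT":
            ctrl, tgt = rng.sample(range(num_patches), 2)
            frame = apply_cnot(frame, ctrl, tgt)
        elif gate == "H":
            frame = apply_h(frame, rng.randrange(num_patches))
        else:
            frame = apply_s(frame, rng.randrange(num_patches))
    return max(max(weights(frame, p)) for p in range(num_patches))
```

In this model the support in every patch stays within the original site and its mirror image, so the returned weight never exceeds 2 for any seed.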

[0318] This lemma requires the circuit to be deterministic up to the logical measurement under consideration. However, it is still useful in this setting: it can be applied to the jth measurement, after processing and committing the first j-1 measurement results, since the circuit up to the jth measurement is now deterministic, conditional on the previous results.

[0319] A technical lemma is now introduced, characterizing the effect of the fault configuration and gauge configuration after applying decoding and error correction.

[0320] Lemma 8 (Correction on codespace for low weight faults). Consider the decoding subgraph Γ|_j for a surface code error-corrected quantum circuit, corresponding to the first j measurements of the circuit. For any fault configuration f|_j = (e ⊕ κ)|_j in Γ|_j where the largest weight of any connected cluster is less than d/2, there exists a choice of gauge variable correction λ|_j, such that the combined effect of fault configuration and gauge configuration S(e|_j ⊕ κ|_j) ⊕ S(g|_j ⊕ λ|_j) acts trivially on all logical qubits that have been measured. In other words, for the Z measurements, it acts as a combination of stabilizers and the logical Z operator, which therefore do not change the logical measurement result.

[0321] As described in Def. 3, the measurement is performed in the Z basis (X basis measurements can be performed by an H gate followed by a Z measurement). Therefore, it suffices to study logical X errors on the measurement.

[0322] After decoding and performing the corrections, the data qubits are in the codespace. Recall the notation f = e ⊕ κ, h = g ⊕ λ; then S(f) ⊕ S(h) must be equivalent to a product of stabilizers and logical operators. In order for S(f) ⊕ S(h) to have nontrivial logical action, there must be an error cluster that spans from one boundary to another; more precisely, for the non-rotated surface code, S(f) ⊕ S(h) must have an odd number of X errors in each major column to cause a logical flip. Denote the errors corresponding to connected clusters in f as f_i, i = 1, 2, ..., k, where k is the number of connected clusters. Because S(f) is linear in the input error, one can analyze each connected component independently. By Lemma 7 and the fact that wt(f_i) < d/2 for all i, wt(S(f_i)) < d. By the connectedness of the component, S(f_i) cannot simultaneously have support in the left-most and right-most columns.

[0323] Suppose, without loss of generality, that S(f_i) does not have support in the right-most column (the other case can be shown analogously). It is now shown that there exists a choice of gauge variables h_i, such that S(f_i) ⊕ S(h_i) does not trigger any detectors and does not apply any logical operator on this measurement result. Suppose for the sake of contradiction that this is not the case; then there must be some gauge operator g_i that is chosen to return things to the codespace, whose forward-propagated effect S(g_i) acts non-trivially on this column. By the definition of gauge variables above, there will also be a gauge logical operator g_l that can be applied, which negates the effect without changing the measurement distribution (Lemma 5). The combined correction S(h_i) = S(g_i ⊕ g_l) acts trivially on this column. Therefore, S(f_i) ⊕ S(h_i) acts trivially on this column, and since it does not change the codespace, it must be a stabilizer and has trivial logical action for all logical measurements that have been performed.

[0324] One can combine the fault configurations f = ⊕_i f_i and gauge configurations h = ⊕_i h_i. Since each component S(f_i) ⊕ S(h_i) has trivial logical action on the measurement results, by linearity, so does the combined effect S(f) ⊕ S(h).

[0325] The logical action is only guaranteed to be trivial on logical qubits that have been measured, and only in the basis that was measured. There could be residual errors on the remaining qubits, or a Z flip on a logical qubit measured in the Z basis. However, the former will get fixed in later rounds of decoding, so long as one can maintain consistency on the logical measurement results, while the latter does not influence any measurement results. Thus, they do not cause any effects on the logical measurement distribution. This idea is formalized in the following lemma, which characterizes the structure of logical errors. It shows that small clusters of errors cannot give rise to logical errors on logical qubits that have been measured.

[0326] Lemma 9 (Logical errors must be composed of at least d/2 faults). Consider the decoding subgraph Γ|_j for a surface code error-corrected quantum circuit, corresponding to the first j measurements of the circuit. For any fault configuration f|_j = (e ⊕ κ)|_j in Γ|_j where the largest weight of any connected cluster is less than d/2, there exists a choice of gauge variable correction λ|_j, such that (1) the first j-1 measurement results are consistent with the previous round of decoding, if the previous round of decoding also satisfies the same condition and (2) the distribution of the jth measurement, conditioned on the outcome of the first j-1 measurement results from the previous round of decoding, is identical to the ideal distribution.

[0327] Start with j = 1. In this case, the first condition is irrelevant, and it is only necessary to show that the inferred logical measurement distribution is consistent with the ideal distribution. Similar to above, it suffices to study logical X errors on the measurement.

[0328] By Lemma 8, there exists some gauge configuration A, such that the combined effect of fault configuration and gauge configuration does not change the measurement result. Therefore, this configuration is such that the measurement distribution is identical to the ideal distribution. By Lemma 5, all choices of gauge configuration are equivalent. Therefore, any choice of gauge configuration in fact gives rise to a measurement distribution identical to that of the ideal distribution.

[0329] This argument is now generalized to the inductive case. By assumption, the decoding problems of both the first j-1 measurement results and the first j measurement results satisfy the condition that the largest weight of any connected cluster in the fault configuration is less than d/2. By Lemma 8, there exists a gauge configuration h|_{j-1}, such that S(f|_{j-1}) ⊕ S(h|_{j-1}) acts trivially for the first j-1 logical measurement results, and a gauge configuration h|_j, such that S(f|_j) ⊕ S(h|_j) acts trivially for the first j logical measurement results. Denote the inferred gauge correction that h|_j corresponds to as λ|_j. In practice, the gauge configuration chosen for decoding the first j-1 measurements, h'|_{j-1}, may differ from h|_{j-1}, leading to different measurement outcomes for this specific shot (note that the distribution still remains the same). For joint decoding of the first j measurements, choose the following inferred gauge assignment:
λ'|_j = h'|_{j-1} ⊕ h|_{j-1} ⊕ h|_j. (2)

[0330] With this assignment and by linearity, the action on the codespace is
S(f|_j) ⊕ S(h'|_{j-1}) ⊕ S(h|_{j-1}) ⊕ S(h|_j) = [S(f|_j) ⊕ S(h|_j)] ⊕ [S(h'|_{j-1}) ⊕ S(h|_{j-1})]. (3)

[0331] Since S(f|_j) ⊕ S(h|_j) has trivial logical action, the combined action is identical to that of S(h'|_{j-1}) ⊕ S(h|_{j-1}), which is exactly the same as in the decoding problem for the first j-1 measurements. Therefore, for this choice of λ'|_j, the first j-1 measurement results are consistent with the previous round of decoding, proving property (1).

[0332] To prove property (2), note that conditioned on the outcome of the first j-1 measurement results, the circuit up to the jth measurement is now a deterministic circuit C. From the above discussion, the gauge configuration h|_j produces the same measurement distribution as the ideal circuit on C. Therefore, it also produces the same conditional distribution as the ideal circuit when conditioning on the first j-1 measurement results. By Lemma 5, this also holds for the choice of inferred gauge assignment λ'|_j, thereby completing the proof.

[0333] A lemma is now produced that bounds the number of connected clusters of a given size, and forms a core component of a number of fault-tolerance proofs.

[0334] Lemma 10 (Counting lemma). Consider a specific set S consisting of r vertices in a graph for which every vertex has degree at most z. Let M_z(s, S) be the number of sets containing S and a total of s vertices (i.e., s - r vertices beyond those in S), and which are a union of connected clusters, each of which contains a vertex in S. Then M_z(s, S) ≤ e^{r-1} (ze)^{s-r}, with e the usual base of the natural logarithm.
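For the case r = 1 used later in the proof, the bound of Lemma 10 reads M_z(s, S) ≤ (ze)^{s-1} and can be compared against an exhaustive count on a small graph. The sketch below is only a sanity check under assumptions of this example: it uses a periodic square grid of degree z = 4 as the test graph (not the fault adjacency graph of the source), and the function names are hypothetical.

```python
import math


def torus_grid(n):
    """Adjacency of an n x n square grid with periodic boundaries (degree 4)."""
    return {
        (r, c): {((r + 1) % n, c), ((r - 1) % n, c), (r, (c + 1) % n), (r, (c - 1) % n)}
        for r in range(n)
        for c in range(n)
    }


def connected_sets_containing(adj, v, s):
    """Enumerate all connected vertex sets of size s that contain v."""
    found = set()

    def grow(current):
        if len(current) == s:
            found.add(current)
            return
        # Any vertex adjacent to the current set keeps it connected.
        frontier = {u for w in current for u in adj[w]} - current
        for u in frontier:
            grow(current | {u})

    grow(frozenset([v]))
    return found


def counting_bound(z, s, r=1):
    """Right-hand side of the counting lemma, e^(r-1) (z e)^(s-r)."""
    return math.e ** (r - 1) * (z * math.e) ** (s - r)
```

On a 5 x 5 periodic grid the exhaustive counts for s up to 5 stay below `counting_bound(4, s)`, with equality at s = 1.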

[0335] Using the counting lemma, one can now complete the proof of the main theorem.

[0336] Theorem 11 (Fault-tolerance of decoding strategy in Def. 6). Consider a surface code error-corrected Clifford quantum circuit with feedforward (Def. 3) subject to the phenomenological noise model with probability p, with the circuit involving at most n logical qubits, t layers of gate operations, and m measurements. Then there exists a threshold p0, such that for p < p0, the probability P_err of either outputting FAIL or having a logical error for the entire circuit, when using the decoding strategy in Def. 6, is at most C (p/p0)^{d/4}. Here, p0 = 1/(12544 e^2) and C = m d^2 n t / (7e (1 - 112e√p)).

[0337] Each non-rotated surface code of distance d has d^2 + (d-1)^2 < 2d^2 physical qubits. Each logical qubit has < 2d^2 stabilizers. Therefore, each layer involves less than 2d^2 n detectors. Since there are t layers of gate operations, there are at most 2d^2 nt detectors in the full circuit. In each layer, each data qubit experiences at most 3 types of errors (X, Y, Z), and the syndrome measurement can have an error. Therefore, there are at most 8d^2 nt possible faults.

[0338] Consider the fault adjacency graph, defined as a graph where the vertices are possible faults, and two vertices share an edge if the corresponding faults trigger the same detector. Each error triggers at most 8 detectors (there are at most 2 logical qubits involved in an operation, and in each of the X, Z bases, the error can trigger at most 2 detectors). Therefore, each vertex has degree at most z ≤ 8 × (8 - 1) = 56.

[0339] Suppose a given fault configuration involves s faults. Using the MLE decoder, this implies that the error e must involve at least ⌈s/2⌉ faults. Since each fault has probability at most p, the probability that this fault configuration appears is at most
Σ_{i=⌈s/2⌉}^{s} (s choose i) p^i ≤ [Σ_{i=0}^{s} (s choose i)] p^{s/2} = 2^s p^{s/2}. (4)
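The inequality in Eq. (4) uses only that p^i ≤ p^{s/2} for i ≥ s/2 and that a partial sum of binomial coefficients is at most 2^s. A short numerical check, with a hypothetical function name, is sketched below.

```python
from math import ceil, comb


def fault_configuration_bound(s, p):
    """Return (lhs, rhs) of Eq. (4) for a fault configuration of s faults.

    lhs sums over the number i >= ceil(s/2) of faults attributed to the
    error e; rhs is the closed-form upper bound 2^s p^(s/2).
    """
    lhs = sum(comb(s, i) * p**i for i in range(ceil(s / 2), s + 1))
    rhs = 2**s * p ** (s / 2)
    return lhs, rhs
```

For any 0 < p < 1 and s ≥ 1 the first returned value does not exceed the second.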

[0340] For each logical measurement, by Lemma 9, the fault configuration must involve a connected cluster of at least d/2 faults. Therefore, applying Lemma 10 to the fault adjacency graph consisting of 8d^2 nt vertices, the number of connected clusters N_s of size s is upper bounded by the sum of clusters which contain any given vertex:
N_s ≤ 8d^2 nt (ze)^{s-1}, (5)
where, since a single given vertex is considered at a time, r = 1 in the above lemma.

[0341] By Lemma 9, if none of the first j rounds of decoding involve a connected cluster of size at least d/2, then the decoding strategy (Def. 6) will not output FAIL, and the output measurement distribution of the first j measurement results will be the same as the ideal distribution. Since there are m measurements in total, the total probability P_err of outputting FAIL or having a logical error is at most
P_err ≤ m Σ_{s=d/2}^{∞} N_s 2^s p^{s/2}
≤ m Σ_{s=d/2}^{∞} 8d^2 nt (ze)^{s-1} (2√p)^s
= (8m d^2 nt / (ze)) (2ze√p)^{d/2} / (1 - 2ze√p)
≤ [m d^2 nt / (7e (1 - 112e√p))] (p / p0)^{d/4}, with p0 = 1/(12544 e^2). (6)

[0342] This theorem demonstrates that below a certain physical error threshold, the logical error rate is exponentially suppressed in the code distance. Thus, despite never requiring d rounds of syndrome measurement anywhere in the circuit, one can still maintain fault-tolerance.

[0343] The numerical value of the threshold proven for the phenomenological noise model is around 10^-5. This value is low, as very loose bounds are used in a number of places: z is bounded only by a very rough estimate, the fault configuration counting and fault probability counting are rather loose, and the union bound on failure probability is used when in reality, the failure events will likely be highly correlated.
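The quoted threshold can be reproduced from the constants appearing in Theorem 11, namely p0 = 1/(2ze)^2 = 1/(12544 e^2) with z = 56, together with the prefactor C = m d^2 n t / (7e (1 - 112e√p)). The sketch below evaluates both; the function names are hypothetical, and the numbers it produces are the loose proven bound, not an estimate of practical performance.

```python
import math

Z_DEGREE = 56  # degree bound of the fault adjacency graph used in the proof


def threshold():
    """p0 = 1 / (2 z e)^2 = 1 / (12544 e^2)."""
    return 1.0 / (2 * Z_DEGREE * math.e) ** 2


def logical_error_bound(p, d, n, t, m):
    """Upper bound C (p / p0)^(d/4) on the probability of FAIL or a logical error."""
    x = 2 * Z_DEGREE * math.e * math.sqrt(p)  # equals 112 e sqrt(p)
    if x >= 1:
        raise ValueError("the bound only applies for p below the threshold")
    prefactor = m * d**2 * n * t / (7 * math.e * (1 - x))
    return prefactor * (p / threshold()) ** (d / 4)
```

Here `threshold()` evaluates to roughly 1.08e-5, and for fixed p below that value the bound decreases exponentially as d grows.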

[0344] These results are now extended to the setting of Clifford quantum circuits with magic state inputs, thereby showing that these methods apply to a setting that is important for universal quantum computation.

[0345] To begin, define a more general model of quantum circuits that now includes non-Clifford resource states on some inputs.

[0346] Definition 12 (Clifford quantum circuits with magic state inputs). Define C_m, a Clifford quantum circuit with magic state input, to be a quantum circuit that consists of layers of the operations defined in Def. 2, together with an operation that initializes some set of qubits in an arbitrary, known state |ψ⟩. These extra states are referred to as magic state inputs.

[0347] One can also define the error-corrected version of this. The main difference is that a second parameter d_1 is introduced, to characterize the size of the magic state input.

[0348] Definition 13 (Magic state model of surface code error-corrected quantum circuits). Given a circuit C_m defined in Def. 12, define the logical Clifford quantum circuit with magic state inputs, with distances (d, d_1), as the logical Clifford quantum circuit of distance d defined in Def. 3, together with the following operations:
1. Initialization of some sets of logical qubits of distance d_1 < d in an arbitrary, known state |ψ⟩, and with all stabilizer values fixed to +1, up to local stochastic noise on each physical qubit of strength p.
2. Logical qubit patch growth from distance d_1 to d, by performing the initialization in a diagonal pattern and performing one round of syndrome measurement.

[0349] Note that this definition makes use of magic state inputs whose preparation is not yet specified. A distinction is made between the potentially smaller code distance d_1 of the magic state input and the full code distance d, and again only a single step is used to grow the code patch. As shown later, this is still sufficient to have effective distance d_1 in executing the full quantum circuit, by making use of extra information provided by transversal measurements.

[0350] The lemma characterizing the corrections on measurements for low weight errors, Lemma 8, is now extended to this setting with magic state inputs. Here, the same decoding strategy defined in Def. 6 is used. The statement and proof are essentially the same as before, except the distance d is replaced by d_1 < d, the size of the magic state input. This still goes beyond previous work, as the code deformation is performed in a single round rather than over d rounds.

[0351] Lemma 14 (Correction on codespace for low weight faults, when there are magic state inputs). Consider the decoding subgraph Γ|_j for a surface code error-corrected quantum circuit with magic state inputs, corresponding to the first j measurements of the circuit. For any fault configuration f|_j = (e ⊕ κ)|_j in Γ|_j where the largest weight of any connected cluster is less than d_1/2, there exists a choice of gauge variable correction λ|_j such that the combined effect of fault configuration and gauge configuration S(e|_j ⊕ κ|_j) ⊕ S(g|_j ⊕ λ|_j) acts trivially on all logical qubits that have been measured.

[0352] The proof is analogous to that of Lemma 8. The only modification is that there are additional gauge variables on the logical qubits with magic state input, corresponding to the randomly initialized stabilizers, but these do not have an associated gauge logical variable, as the magic state can be in an arbitrary input state.

[0353] The gauge variables that these randomly initialized stabilizers correspond to are spatially localized. Therefore, in the basis of Z stabilizers / X errors that is relevant to initialization and measurement, they can only produce operations in the shaded spatial region, and cover d - d_1 rows. Meanwhile, all Z stabilizers in the highlighted region are deterministic, and a chain of errors that spans this region must have weight at least d_1. By Lemma 7, each fault throughout the circuit can propagate to at most 2 errors on the final measurement. Therefore, any fault configuration of weight less than d_1/2 must have trivial logical action on the measured logical qubits.

[0354] A simple illustrative example of the fault-tolerance approach is now provided. For illustration purposes, this focuses on a repetition code example, although the lessons readily generalize to the surface code.

[0355] Consider two repetition code logical qubits. The first is prepared in an unknown logical quantum state while the second logical qubit is prepared in a single-shot manner in the |+⟩_L state, by preparing all physical qubits in |+⟩ and measuring neighboring ZZ stabilizers once. At this stage, the physical state of the second logical qubit is in fact a mixture of product states if the stabilizer values are directly fixed back to the codespace. This is because the single faulty measurement does not give sufficient confidence about the ZZ stabilizer values, so one may incorrectly pair up excitations, potentially causing a larger string of X errors.

[0356] A transversal CNOT is then performed, with the second logical qubit as control and first logical qubit as target. Following this, the first logical qubit is measured in the Z basis. With Pauli feedforward, this circuit can teleport the unknown state |ψ⟩_L to the second logical qubit.

[0357] At this point, one may naively be concerned about the correctness of the first measurement result, since the string of X errors can lead to a probability linear in the physical error rate p of flipping this measurement result. However, the situation is a bit more subtle: when defining correctness of a quantum computer execution in a model of classical inputs and outputs, the focus is on reproducing the ideal measurement distribution, rather than on a given shot being interpreted in a particular way. In the circuit that is executed here, the logical CNOT propagates the randomness to the first logical qubit, and thus the measurement result will be a 50-50 random number. By itself, flipping the logical readout result therefore does not change the measurement distribution of the first logical qubit measurement, and so, in a sense, the long X error string does not yet cause a logical error at this stage.

[0358] Although the measurement distribution for the first logical qubit is unchanged, this does not yet mean the whole circuit is executed correctly: it is required to guarantee that the joint distribution between all logical measurements is the same as the ideal circuit. To understand this, more specification is required on the unknown state |ψ⟩_L. In particular, it is necessary to specify whether it is already prepared in a fault-tolerant fashion, such that the residual noise on it is local stochastic, or whether some of the stabilizers have not yet been fault-tolerantly assigned.

[0359] First, consider the former case, where the unknown magic state |ψ⟩_L has been fault-tolerantly prepared through some method. For the surface code, this may come from another circuit that involved, e.g., magic state distillation. The transversal measurement of the first logical qubit reveals information about the product of stabilizers at the same location on the two logical qubits, up to local stochastic errors, since one directly measures the physical qubits and therefore errors can be regarded as data errors rather than syndrome errors. Since the stabilizers of the first logical qubit are known with only local stochastic errors, inferences about the stabilizer initialization values of the second logical qubit have also effectively been made.

[0360] In the second case, where the first logical qubit also has unknown stabilizer initialization values, its preparation must trace back to some Pauli basis input state. For example, consider the case where the first logical qubit was also initialized in a single step in |+⟩_L. The transversal logical measurement still reveals information about the product of stabilizers, but now the initialization values of each of the stabilizers are no longer learned. Fortunately, this is not a concern, as only the product of stabilizers is relevant to interpreting the logical measurement result. Later logical measurements will provide additional information that will allow one to learn the individual values of stabilizers when they are necessary.

[0361] When the second logical qubit is now measured, in the decoding strategy, the existing portion of the circuit is re-decoded. This may cause a different assignment of the first logical measurement result. However, an X operation can be applied at initialization on the second logical qubit, which does not change the |+⟩ state. Propagating this X flip through, this will flip both logical measurement results, flipping the first measurement back to being consistent with the previous measurement, while also flipping the second measurement result. The second measurement result is thus interpreted as having taken the flipped value, to maintain consistency with the first measurement. With this method, the theorem shows that the measurement distribution of the noisy circuit can be made arbitrarily close to the ideal circuit, as the code distance is increased.

[0362] Although the initialization by itself does not provide enough information to infer the randomly-initialized stabilizers with high confidence, it does in combination with the transversal measurements. Indeed, transversal measurements are often thought of as resembling d rounds of syndrome measurement, because by reading out the individual data qubits, the results are equivalent to data qubit errors, followed by perfect syndrome extraction and logical measurement. Thus, by using the transversal measurement results, one can fault-tolerantly determine the relevant stabilizer values on the measured logical qubit and perform the required corrections. This is in stark contrast to the setting of lattice surgery, which never transversally measures the data qubits and therefore does not gain access to this extra information. In that case, d rounds of syndrome extraction would be necessary to maintain fault-tolerance.

[0363] Note also that only certain products of the random stabilizer initialization are learned, corresponding to the ones that directly propagate to the measured logical qubit. In the example of Fig. 20C, some knowledge about the product of random Z stabilizer initialization on the first two qubits is gained, which directly determines the stabilizer values of the logical ancilla, but does not fix the individual stabilizer initialization values. Indeed, at this point, there is not enough syndrome information to make a confident determination of the individual stabilizers on the top two logical qubits. This does not cause any issues, however, because the relevant information will become accessible whenever a logical measurement on those qubits is performed.

[0364] Each single-round logical state preparation is initialized in a known state, and therefore a logical operator can be applied without changing the logical state or measurement distribution. For example, when initializing a logical qubit in |0⟩_L, applying a Z_L operation right after initialization will act trivially on this initial state. However, if this Z_L operation is propagated through the circuit, it can still flip the measurement result of a subsequent random measurement, e.g., if the logical qubit is measured in the X basis right after initialization. In this case, there is a different interpretation of the measurement result, but the measurement distribution will remain unchanged.

[0365] Fast Correlated decoding of transversal logical algorithms

[0366] Quantum error correction (QEC) is required for large-scale computation, but comes with significant overheads on both the quantum hardware and the classical decoding infrastructure used to infer and correct errors. By jointly decoding logical algorithms with transversal gates, the number of syndrome extractions can be reduced by a factor of the code distance d, though at the expense of a significantly increased classical decoding complexity. In the following discussion, a strategy is provided to substantially accelerate this correlated decoding by directly decoding relevant products of logical qubit measurements. This simplifies the decoding problem and reduces the problem size, while maintaining high performance. Specializing to the surface code, such a procedure results in a decoding problem that can be efficiently solved with minimum-weight perfect matching. The fault-tolerance of the efficient decoding strategy of one or more embodiments of the present disclosure is proved, and its performance is benchmarked on example circuits including magic state distillation, quantitatively finding thresholds and decoding runtimes competitive with the setting of a single-qubit memory. These results establish procedures for efficient decoding of transversal algorithms and provide a framework for adapting QEC techniques from the memory setting to the setting of logical algorithms.

[0367] Quantum error correction (QEC) is widely believed to be necessary for large-scale quantum computation. QEC may be realized across different systems, demonstrating the hallmark exponential error suppression upon increasing code distance d, as well as finding that tailored algorithms can be performed at higher fidelity through QEC. Such systems mark a transition where practical considerations in QEC are paramount to modern quantum hardware being able to successfully execute algorithms. Such experiments have highlighted the central role of the QEC decoder, which is the classical algorithm that uses measurement results to infer and correct errors. The decoder can have a significant impact on the performance of QEC: its accuracy affects whether a system is below the threshold, while its speed directly enters into the execution speed of a logical computation.

[0368] Correlated decoding, in which multiple logical qubits are jointly decoded, can significantly reduce the number of syndrome extraction (SE) rounds per logical operation. However, alternative decoders for correlated decoding either face rapidly increasing runtimes (most-likely error decoders based on integer programming), or rely on heuristic approximations that reduce performance (decoders based on hypergraph union find, belief propagation and ordered statistics decoding), rendering them insufficient for large-scale quantum computers. At the core of this challenge is the presence of error hyperedges (errors that affect more than two checks in an irreducible way), which prevent the use of fast, high-performance decoders based on minimum-weight perfect matching (MWPM). As elaborated herein, these hyperedges appear due to syndrome measurement errors, and are a key reason behind the limitations of existing correlated decoders.

[0369] Various embodiments described herein comprise strategies for efficient and accurate correlated decoding of transversal algorithms. Various embodiments described herein are based on the observation that the decoding problem can be reduced to reliably evaluating certain logical operator products. Crucially, evaluating one product at a time, instead of simultaneously evaluating multiple results, produces a reduced decoding problem that only requires decoding a subset of the syndrome measurements together and, importantly, removes the hyperedges originating from measurement errors. Specializing to the surface code, in which data qubit errors also lead to simple edges of weight 2, it is found that the entire decoding problem can be solved with MWPM, similar to a single logical qubit memory. The fault tolerance of the efficient decoding strategy is proven, and its performance is benchmarked on example circuits including magic state distillation, quantitatively finding thresholds and decoding runtimes competitive with the setting of single-qubit memory. These results establish procedures for efficient decoding of transversal algorithms, and provide a framework for adapting QEC techniques from the memory setting to the setting of logical algorithms.

[0370] Efficient Decoding Of Transversal Algorithms

[0371] Decoding Logical Products

[0372] Fig. 25 depicts decoding of transversal algorithms, and in particular how logical measurements in universal computation can be correctly predicted, in accordance with one or more embodiments of the present disclosure. Various embodiments comprise decoding logical Pauli products which, when back-propagated through circuit 102, commute with all logical Pauli initializations. Measurement products which anti-commute with a logical Pauli initialization are 50 / 50 random and are assigned randomly. Process 104 depicts that commuting logical Pauli products may be decoded by tracing back their evolution through the circuit and decoding the stabilizers along the resulting path, in accordance with one or more embodiments of the present disclosure. The decoding problem for these operators may bear similarity to decoding a single logical qubit, as stabilizer measurement errors appear as simple edges flipping two checks. In the case of the surface code, the entire algorithm may be decoded with MWPM.

[0373] Universal quantum computation can be performed via an adaptive transversal Clifford circuit acting on logical Pauli states and magic state inputs |T̄⟩ = 1 / √2 (|0̄⟩ + e^{iπ / 4} |1̄⟩). As depicted in Fig. 25, the overline indicates logical qubits or operators. Recent work has shown that such universal circuits can be implemented with O(1) SE rounds per transversal logical operation, even in the presence of mid-circuit measurements and conditional gates, assuming fault-tolerant magic state inputs. The essence of this can be understood by tracking how the logical Pauli operators propagate through the circuit. Because the actual circuit executed in any given run is a transversal Clifford circuit (acting on Pauli and non-Pauli inputs), such operators can be tracked deterministically. For Pauli initial states, there will be an associated logical Pauli stabilizer. Measurements which anti-commute with any of the propagated logical Pauli stabilizers are guaranteed to be 50 / 50 random and thereby provide no information: individually, they do not even need to be decoded. Measurements, or more generally products of measurements, which commute with all logical Pauli stabilizers need to be decoded correctly.

[0374] Critically, it is found that directly decoding each logical product of interest guarantees the fault-tolerance of this approach and provides efficient decoding with memory-like speed and performance. Process 104 depicts an embodiment of the present disclosure. Upon executing a noisy transversal circuit, the decoding procedure seeks to correctly infer logical measurement results, such that they reproduce the measurement distribution of the ideal circuit. To do so, a system of linear equations may be solved to identify a basis of logical operator products which commute with all logical Pauli stabilizers. The decoding problem for each logical Pauli product may then be solved by propagating the logical operator of interest backward through the fixed Clifford circuit that was executed, using only the stabilizer measurements along the propagation path to correct errors and infer the logical Pauli product.

[0375] For a circuit involving feed-forward operations, each new measurement may need to be interpreted such that it is assigned consistently with the previous measurements and follows the ideal distribution. The procedure may comprise finding a logical Pauli product including the current measurement (and potentially past measurements) which commutes with all logical Pauli initializations. If one does not exist, the measurement may be assigned randomly and / or the remaining steps may be skipped. The procedure may comprise back-propagating the logical Pauli product through the circuit. Said back-propagating may comprise recording all stabilizer measurements in the same basis along the propagation path. The procedure may comprise decoding the logical Pauli product using these relevant stabilizer measurements. For example, in the case of the surface code, matching may be applied. The procedure may comprise applying the correction in software. The procedure may comprise returning all relevant stabilizers to +1. The procedure may comprise reducing the size of the decoding problem for future logical Pauli product measurements. The procedure may comprise assigning the current logical measurement result using the decoded value of the logical Pauli product and any prior measurements involved in the product.
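The first two steps of this procedure (back-propagating a logical Pauli product and testing whether it commutes with all logical Pauli initializations) can be sketched at the logical level as follows. This is an illustrative sketch only: the function names, the gate tuples, and the "0" / "+" / "T" labels are assumptions introduced here, signs of Pauli operators are dropped, and no stabilizer-level decoding is performed.

```python
def back_propagate(x, z, circuit):
    """Conjugate a Pauli backward through a Clifford circuit, ignoring signs.

    x, z    : lists of 0/1, one bit per logical qubit (X and Z components).
    circuit : list of ("CNOT", control, target), ("H", q) or ("S", q),
              in the order the gates were executed.
    """
    x, z = list(x), list(z)
    for gate, *qubits in reversed(circuit):
        if gate == "CNOT":
            c, t = qubits
            x[t] ^= x[c]  # X on the control spreads to the target
            z[c] ^= z[t]  # Z on the target spreads to the control
        elif gate == "H":
            q = qubits[0]
            x[q], z[q] = z[q], x[q]
        elif gate == "S":
            q = qubits[0]
            z[q] ^= x[q]  # X maps to Y up to a sign
        else:
            raise ValueError(f"unsupported gate {gate!r}")
    return x, z


def commutes_with_inits(x, z, inits):
    """inits[q] is "0", "+" or "T" (hypothetical labels).

    A |0> input is stabilized by Z, so any X component anti-commutes.
    A |+> input is stabilized by X, so any Z component anti-commutes.
    "T" inputs are assumed fault-tolerantly prepared: no constraint.
    """
    for q, init in enumerate(inits):
        if init == "0" and x[q]:
            return False
        if init == "+" and z[q]:
            return False
    return True


if __name__ == "__main__":
    # Qubit 0 starts in |+>, qubit 1 in |0>, then CNOT(0 -> 1); both measured in Z.
    circuit = [("CNOT", 0, 1)]
    inits = ["+", "0"]
    for name, zbits in [("Z0", [1, 0]), ("Z1", [0, 1]), ("Z0*Z1", [1, 1])]:
        xb, zb = back_propagate([0, 0], zbits, circuit)
        print(name, "deterministic" if commutes_with_inits(xb, zb, inits) else "random")
```

In this example each individual Z measurement is random, while the product of the two measurements back-propagates onto the |0⟩ input alone and is therefore a product that would need to be decoded.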

[0376] The performance for various embodiments described herein is benchmarked numerically. A formal description of a proof for the fault tolerance of one or more embodiments described herein is provided. This proof may provide a theoretical foundation for fast and accurate decoding.

[0377] Figs. 26A-C depict matchable decoding graphs and iterative decoding, in accordance with one or more embodiments of the present disclosure. Fig. 26A depicts decoding of logical Pauli products of interest during transversal algorithms, in accordance with one or more embodiments of the present disclosure. Fig. 26B depicts, in accordance with one or more embodiments of the present disclosure, how measurement errors (hyperedges) may flip three checks (vertices) during transversal gates when all checks in the circuit are considered. This may make the decoding problem significantly more challenging. By only decoding the relevant checks along the back-propagation of the logical operator, stabilizer measurement errors flip only two checks, thereby allowing efficient decoding with matching. Fig. 26C depicts that because logical Pauli products commuting with logical Pauli initializations are intrinsically fault-tolerant against stabilizer measurement errors, upon decoding (i), their corrections can be applied in software, resetting the relevant stabilizers to +1 (ii). This can reduce the size of subsequent decoding problems (iii).

[0378] Constructing A Matchable Decoding Problem

[0379] A logical Pauli product may be decoded using the stabilizer measurements along its backwards-propagated path through the circuit. Only these stabilizer measurements may be necessary, as any error which can corrupt the logical operator will be detected by stabilizers in the same basis (see Lemma 4). When decoding, one first takes the stabilizer measurements and a description of the circuit error model to form an input to the decoder called a decoding hypergraph. The vertices of the hypergraph represent checks, which compare corresponding stabilizers in adjacent time steps and flip from +1 to -1 if an error is detected. The hyperedges may be errors connecting the checks they flip. In practice, the weight of the hyperedges (how many vertices they flip) has significant implications for the complexity of the decoding problem. Hyperedges of weight two (simple edges) can be efficiently decoded using algorithms such as MWPM and union find. Higher-weight hyperedges may require more complex algorithms and / or may incur significantly longer runtimes in practice.
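As a concrete illustration of checks that compare stabilizers in adjacent time steps, the following minimal sketch uses a repetition code memory with no logic gates. The function name `fired_checks` and its arguments are hypothetical; faults are injected by hand rather than sampled, and initialization in the all-zero state is assumed so that the first round is compared against +1 (stored as 0).

```python
def fired_checks(n_data, n_rounds, data_flips=(), meas_flips=()):
    """Return the list of (round, stabilizer) checks that flip.

    data_flips : iterable of (round, qubit); an X error on that data qubit
                 just before that round of ZZ measurements.
    meas_flips : iterable of (round, stabilizer); a readout error on that
                 ZZ stabilizer measurement in that round.
    """
    data = [0] * n_data
    prev = [0] * (n_data - 1)  # all ZZ stabilizers start at +1 (stored as 0)
    fired = []
    for r in range(n_rounds):
        for rr, q in data_flips:
            if rr == r:
                data[q] ^= 1
        syndrome = [data[i] ^ data[i + 1] for i in range(n_data - 1)]
        for rr, s in meas_flips:
            if rr == r:
                syndrome[s] ^= 1
        # A check compares the same stabilizer in adjacent time steps.
        fired += [(r, s) for s in range(n_data - 1) if syndrome[s] != prev[s]]
        prev = syndrome
    return fired


if __name__ == "__main__":
    # A bulk data error flips two checks that are neighbors in space.
    print(fired_checks(5, 4, data_flips=[(1, 2)]))  # [(1, 1), (1, 2)]
    # A measurement error flips two checks that are neighbors in time.
    print(fired_checks(5, 4, meas_flips=[(1, 2)]))  # [(1, 2), (2, 2)]
```

Both fault types flip exactly two checks in this memory setting, which is the weight-two (simple edge) structure that matching-based decoders rely on.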

[0380] When the entire logical circuit is decoded, transversal gates can generate hyperedges of weight three due to stabilizer measurement errors, greatly increasing the complexity of the decoding problem. Here it is shown that by decoding only the logical Pauli products, such hyperedges can be entirely avoided. The basic idea is illustrated in Fig. 26A, which considers a transversal CNOT with three rounds of surrounding Z stabilizer measurements, in accordance with one or more embodiments of the present disclosure. Checks may be formed from the product of a stabilizer measurement with the measurement of its backwards-propagated operator through the circuit. In this example, the control qubit has checks Z_1^{t} Z_1^{t+1} and Z_1^{t+1} Z_1^{t+2}, and the target qubit has checks Z_2^{t} Z_2^{t+1} and Z_1^{t+1} Z_2^{t+1} Z_2^{t+2} (using Z_i^{t} to denote a generic Z stabilizer operator on logical qubit i at time t, as all Z stabilizers transform identically). If one uses all of the checks when decoding, the measurement error on Z_1^{t+1} will flip three checks (e.g., as depicted at the top of Fig. 26B). However, crucially, at least one of these checks is always not needed to decode the operator of interest. For example, only three checks propagate alongside logical Z̄_2 (Z_2^{t} Z_2^{t+1}, Z_1^{t+1} Z_2^{t+1} Z_2^{t+2}, and Z_1^{t} Z_1^{t+1}), and the same measurement error on Z_1^{t+1} only flips two of them. In general, by removing the vertices corresponding to the check(s) which are irrelevant to the propagation of the logical Pauli product, the weight of the hyperedge is always reduced to two (e.g., as depicted at the bottom of Fig. 26B).
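The counting in this example can be verified mechanically. In the sketch below, each check is written as the set of stabilizer measurements it multiplies, labeled (logical qubit, time offset) with offsets 0, 1, 2 standing for t, t+1, t+2; the check names and the helper `flipped` are illustrative assumptions introduced here.

```python
# Each check is the set of stabilizer measurements (logical qubit, time offset)
# that it multiplies together. Qubit 1 is the control, qubit 2 the target.
ALL_CHECKS = {
    "control_a": {(1, 0), (1, 1)},          # Z_1^t     Z_1^{t+1}
    "control_b": {(1, 1), (1, 2)},          # Z_1^{t+1} Z_1^{t+2}
    "target_a": {(2, 0), (2, 1)},           # Z_2^t     Z_2^{t+1}
    "target_b": {(1, 1), (2, 1), (2, 2)},   # Z_1^{t+1} Z_2^{t+1} Z_2^{t+2}
}

# Checks that propagate alongside the logical Z operator of the target qubit.
RELEVANT_TO_Z2 = {name: ALL_CHECKS[name] for name in ("target_a", "target_b", "control_a")}


def flipped(checks, measurement):
    """Names of the checks flipped by an error on one stabilizer measurement."""
    return sorted(name for name, support in checks.items() if measurement in support)


if __name__ == "__main__":
    error = (1, 1)  # measurement error on Z_1^{t+1}
    print(flipped(ALL_CHECKS, error))      # ['control_a', 'control_b', 'target_b']
    print(flipped(RELEVANT_TO_Z2, error))  # ['control_a', 'target_b']
```

With all four checks the measurement error is a weight-three hyperedge; restricted to the three checks relevant to the logical Z operator of the target, every stabilizer measurement error flips at most two checks.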

[0381] This simplification arises from the fact that transversal Clifford gates transform the logical and stabilizer bases in analogous ways. Assuming syndrome extraction occurs before and after each gate, the check can simply compare stabilizers in the same basis as the logical operator before and after the gate. Each stabilizer is then only involved in two checks: one tracking the logical between times t - 1 and t, and another tracking the logical between times t and t + 1. As a result, if the stabilizer measurement is incorrect, it will flip only these two checks, corresponding to a simple edge. This pattern is shown explicitly below for the remaining transversal Clifford gates in the surface code, including the Hadamard H and phase S gates.

[0382] Although the above discussion focused on the surface code, the observations readily generalize to other Calderbank-Shor-Steane (CSS) codes. Again, the decoding hypergraph will track the logical Pauli product through space and time. The space-like hyperedges, due to physical errors on the data qubits, will have the same structure as data qubit errors on a single copy of the original code without any logic gates. The time-like edges, corresponding to stabilizer measurement errors, will always be simple edges. Thus, it is expected that in many cases, standard decoders used to decode a single logical qubit of a particular code can be promoted to decode transversal algorithms.

[0383] Finally, although at least one round of syndrome extraction is assumed herein between gates, these observations also generalize to the case of multiple gates between layers of syndrome extraction. However, for certain structures and larger depths between syndrome extraction, checks can involve more stabilizer measurements and therefore become more sensitive to noise. In practice, the syndrome extraction should be optimized for the particular circuit of interest.

[0384] Intrinsic Fault-Tolerance Against Stabilizer Measurement Errors

[0385] In general, only logical Pauli products which commute with all logical Pauli initializations need to be decoded, as anti-commuting operators are 50 / 50 random and do not need to be decoded. Crucially, by decoding only these commuting logical Pauli products, reliable syndrome information throughout the circuit is also obtained. When initializing logical qubits, all stabilizers in the basis of initialization have value +1. In some embodiments, because the logical Pauli state is classical, these stabilizers are all that is needed to protect its information, as the logical operator in the other basis is 50 / 50 random and therefore does not need to be protected. It is additionally assumed that the magic |T⟩ states have been reliably prepared such that they have well-defined stabilizers, as can be realized via a variety of approaches ranging from distillation to cultivation. As the operator then evolves through the transversal Clifford circuit, the stabilizers evolve identically, with noisy stabilizer measurements providing protection in the right basis against errors during transversal gates. At the final transversal measurement, the stabilizers in the logical Pauli product may be learned reliably via the transversal measurement.

[0386] As a result, the decoding problem for the logical Pauli product bears key similarities to decoding just a single logical qubit initialized in |0⟩ and measured in the Z basis. The X stabilizers may not be necessary to protect the logical Z operator and may never need to be measured.

[0387] Reduced Decoding Volume And Iterative Decoding

[0388] Various embodiments comprise decoding only commuting logical Pauli products. This may have the benefit of reducing the size of the decoding problem. Each new measurement only requires decoding a single logical Pauli product including the current measurement and potentially past measurements. The new measurement result can then be inferred from the decoded value of the logical Pauli product and the values of any prior measurements. In practice, this can greatly reduce the size of the decoding problem, or decoding volume. Rather than jointly re-decoding the entire algorithm at each step (or a light-cone of depth d), only the parts that the logical Pauli product traces through may need to be decoded.

[0389] Upon decoding, error assignments to the hyperedges of the original full decoding hypergraph may be made. Furthermore, due to the aforementioned robustness against measurement errors, it may be expected that the error assignments made are somewhat reliable. This motivates an iterative strategy for decoding (e.g., as depicted in Fig. 26C). This may further reduce the decoding volume. After a logical Pauli product is decoded, the error predicted by the decoder is applied in software and the corresponding detectors may be updated. Since the assignment of error edges that have been decoded is fixed, the error edge can be removed from the problem and subsequent decoding problems may be decoupled from the previous one: only new detectors need to be decoded. Decoding logical Pauli products independently and iteratively is benchmarked as part of the numerical results, exploring their relative tradeoffs in performance and runtime.

[0390] Decoding only commuting logical Pauli products has the additional benefit of reducing the size of the decoding problem. Each new measurement only requires decoding a single logical Pauli product which includes the current measurement and potentially past measurements. The new measurement result can then be inferred from the decoded value of the logical Pauli product and the values of any prior measurements. In practice, this can greatly reduce the size of the decoding problem, or decoding volume; rather than jointly re-decoding the entire algorithm at each step (or a light-cone of depth d), one only needs to decode the parts that the logical Pauli product traces through. In the worst case, the logical Pauli product can become exponentially large in the circuit depth due to branching from transversal CNOT gates. However, it is expected that in many circuits of practical interest (e.g., magic state distillation), the decoding volume is modest.

[0391] In certain settings, the decoding volume can be further reduced by leveraging the fact that, due to its intrinsic robustness against stabilizer measurement errors, the error assignments within a commuting logical Pauli product are expected to be mostly reliable. This motivates an iterative strategy for decoding, depicted in Fig. 26C. After a logical Pauli product is decoded, the error predicted by the decoder is applied in software and the corresponding checks are updated. Since the assignment of error edges that have been decoded is fixed, those error edges are removed from the problem and subsequent decoding problems are decoupled from the previous one: only new checks need to be decoded. Such a procedure is valid as long as the remaining error edges in subsequent decoding problems enable them to be decoded fault-tolerantly. This is the case as long as neighboring decoding problems do not have certain structures involving short cycles of transversal gates. Many practical circuits of interest do not have such a structure, enabling their decoding volume to be reduced through this iterative procedure. For example, in the case of magic state distillation, the corrections in the factory can be frozen and committed, such that the stabilizers in the factory never need to be re-decoded.

[0392] Such a possibility provides a variety of trade-offs that can be explored. The specific choice of the basis of logical Pauli products can be optimized to minimize the number of detectors involved in the problem. Note that even though different decoding problems give consistent logical measurement results, their physical error assignments may in general differ by space-time stabilizers. For a given circuit, each commuting logical Pauli product may be decoded independently and in parallel. This results in a fast parallel runtime, but some of the decoding work may be duplicated. In some embodiments, an iterative strategy may be used, in which some products are decoded after earlier products have been assigned. This reduces the total amount of decoding work, since a detector may not need to be decoded multiple times, but may come with an elevated amount of noise due to residual errors on earlier decoding problems. Furthermore, it may increase the latency of decoding, as later decoding problems need to wait for the completion of earlier ones.

[0393] Numerical Results

[0394] Numerical simulations with the above strategy were performed, and the competitive performance and fast runtimes of various embodiments of the present disclosure were verified. The simulations utilized the Stim package to perform error sampling, and the pymatching package to perform MWPM decoding.

[0395] Figs. 27A-C depict benchmarking of decoding strategies, in accordance with one or more embodiments of the present disclosure. Fig. 27A depicts consideration of a logical Clifford circuit of depth 10 and width 10, in accordance with one or more embodiments of the present disclosure. The logical Clifford circuit may comprise layers of CNOTs between random pairs of qubits, followed by a single SE round per layer. Fig. 27B depicts logical error rate as a function of physical error rate, for different code distances. It is found that the decoding strategy described above achieves the full code distance, as expected. Fig. 27C depicts that the runtime as a function of code distance scales as approximately d^2, in accordance with one or more embodiments of the present disclosure. The findings depicted in Fig. 27C are consistent with expectations for one SE round per layer.

[0396] In Figs. 27A-27C, the results of a benchmark involving a random logical Clifford circuit are shown. The circuit involves 10 logical qubits and 10 layers, where each layer consists of CNOT gates between random pairs of qubits, followed by a single round of syndrome extraction. A random set of commuting logical Pauli products was chosen and the matchable decoding subgraph was constructed. The resulting logical error rate as a function of physical error rate for a range of different code distances is shown in Fig. 27B. It was verified that the scaling follows the expected power based on the code distance. Fig. 27C shows that competitive decoding runtimes are achieved for this circuit.

[0397] Figs. 28A-28C depict magic state distillation, in accordance with one or more embodiments of the present disclosure. Fig. 28A depicts that the logical magic state distillation circuit converts 15 noisy |T⟩ states to a higher quality |T⟩ state. To simulate the circuit efficiently, the |T⟩ states are replaced with |S⟩ states and the S feedforward gates with Z gates. The factory may be decoded in three stages, with two layers of intermediate feed-forward gates applied. At each layer, only logical Pauli products commuting with the initialized Pauli states are decoded (an example is shown). Fig. 28B depicts that the threshold using matching and circuit-level noise is p_th ≈ 0.6%, comparable to the memory threshold with the matching decoder for this error model. The effective code distance is ⌊(d + 1) / 2⌋, corresponding to the maximum possible value under the noise model. Fig. 28C depicts that the total time to decode each stage on a laptop at a physical error rate of p = 0.001 grows approximately quadratically in the code distance.

[0398] As a demonstration of one or more approaches described herein in a relevant algorithm setting, its performance within a magic state distillation factory is investigated in Fig. 28A. It is shown in Fig. 28B that the logical error rate again displays the expected scaling with code distance, and that the runtime scales approximately quadratically in the code distance, which is consistent with the increase in the number of detection events. Strikingly, even for this relatively large algorithmic gadget and decoding problem, each stage can be decoded for up to d = 25 in less than 100 μs, well within the time budget of current neutral atom quantum processors.

[0399] Proof of Fault Tolerance

[0400] A formal description of the protocol is provided herein, and its fault tolerance when using an efficient MWPM decoder is established.

[0401] For the analysis, logical qubits encoded in unrotated surface codes are considered. Clifford operations are implemented (fold-)transversally, each followed by one SE round: Pauli and CNOT gates are implemented transversally; the logical H gate is implemented with physical H and a patch reflection; the logical S gate is implemented with S gates on the diagonal and CZ gates between pairs of qubits reflected across the diagonal. Initialization of |0⟩ (|+⟩) is performed by initializing data qubits in |0⟩ (|+⟩) and applying one SE round, while logical measurement in the X (Z) basis involves measuring all data qubits in the X (Z) basis. An important assumption in the analysis is access to |T⟩ magic states that are fault-tolerantly initialized with known stabilizers, for example, as the output of magic state cultivation or magic state distillation. It is expected that the methods can also be applied in practice to magic state distillation procedures, providing analogous reductions in the number of SE rounds.

[0402] A local stochastic noise model is assumed, in which the probability of m elementary errors occurring decays as $p^m$. The set of elementary errors is chosen to be single-qubit Pauli X and Z errors before each SE round and on the input logical magic states, as well as flips of syndrome measurement results, similar to a phenomenological noise model. Checks (detectors) are constructed from local stabilizer products that are deterministic (+1) in the absence of errors, and a decoding hypergraph is constructed based on how errors flip detectors and logical operators.
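As a minimal illustration of how detectors behave under this kind of noise model, the following Python sketch samples detection events for a distance-d repetition code rather than for the surface code of the analysis. The function name `sample_detectors`, the choice of a repetition code, and the use of independent errors with probability p (a special case of local stochastic noise) are assumptions made for illustration only.

```python
import random


def sample_detectors(d, rounds, p, rng):
    """Sample detection events for a distance-d repetition code (illustrative only)."""
    data = [0] * d            # accumulated bit-flip errors on the data qubits
    previous = [0] * (d - 1)  # stabilizers start at the known value (+1, stored as 0)
    detectors = []
    for _ in range(rounds):
        # Elementary data errors before each syndrome extraction (SE) round.
        for q in range(d):
            if rng.random() < p:
                data[q] ^= 1
        # Ideal syndrome of the neighboring-pair parity checks.
        syndrome = [data[c] ^ data[c + 1] for c in range(d - 1)]
        # Elementary measurement errors flip the reported syndrome bits.
        measured = [s ^ int(rng.random() < p) for s in syndrome]
        # A detector compares consecutive measurements of the same check.
        detectors.append([m ^ q for m, q in zip(measured, previous)])
        previous = measured
    return detectors


rng = random.Random(7)  # arbitrary seed
print(sample_detectors(d=5, rounds=3, p=0.0, rng=rng))  # all zeros without errors
```

With p = 0 every detector is 0, which reflects the requirement that checks be deterministic in the absence of errors.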

[0403] Having described the setting for the analysis, how to choose the appropriate logical Pauli products for decoding is now formalized. A measured logical Pauli product is associated with a binary vector $v \in \mathbb{F}_2^{n_m}$, where $n_m$ is the number of logical measurements so far, and $v_i = 1$ if and only if the logical Pauli product has support on the ith measurement. A binary matrix $M \in \mathbb{F}_2^{2n \times n_m}$ is then constructed, where n is the total number of logical qubits, describing how these logical measurements back-propagate through the executed Clifford circuit. M has the form

$$M = \begin{pmatrix} z_{1,1} & \cdots & z_{1,n_m} \\ \vdots & \ddots & \vdots \\ z_{n,1} & \cdots & z_{n,n_m} \\ x_{1,1} & \cdots & x_{1,n_m} \\ \vdots & \ddots & \vdots \\ x_{n,1} & \cdots & x_{n,n_m} \end{pmatrix}$$

[0404] where $x_{i,j}$ ($z_{i,j}$) is equal to one if and only if the back-propagation of measurement j has support on the initialization of logical qubit i in the X (Z) basis.
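The construction of M may be sketched in Python as follows. This is an illustrative sketch only: the circuit encoding (tuples such as `("CNOT", c, t)` and `("MZ", q)`), the function name `backpropagate`, the 0-based qubit indices, and the restriction to H, S, and CNOT gates with Pauli signs ignored are all assumptions and not part of the disclosure. Rows are ordered with the Z block first and the X block second.

```python
def backpropagate(n, ops):
    """Build M (2n rows: Z block, then X block) with one column per measurement.

    ops is a time-ordered list of ("H", q), ("S", q), ("CNOT", c, t),
    ("MZ", q) or ("MX", q). Signs of Pauli operators are ignored.
    """
    columns = []
    for time, op in enumerate(ops):
        if op[0] not in ("MZ", "MX"):
            continue
        z = [0] * n
        x = [0] * n
        (z if op[0] == "MZ" else x)[op[1]] = 1
        # Walk backwards through every gate that precedes this measurement.
        for gate in reversed(ops[:time]):
            name = gate[0]
            if name == "H":      # H exchanges X and Z
                q = gate[1]
                z[q], x[q] = x[q], z[q]
            elif name == "S":    # S maps X to Y (= X and Z up to phase), Z to Z
                q = gate[1]
                z[q] ^= x[q]
            elif name == "CNOT":  # X spreads control->target, Z target->control
                c, t = gate[1], gate[2]
                x[t] ^= x[c]
                z[c] ^= z[t]
            # Earlier measurements are skipped in this simplified sketch.
        columns.append(z + x)
    return [[col[r] for col in columns] for r in range(2 * n)]


# Two logical qubits, one transversal CNOT, then Z and X measurements.
M = backpropagate(2, [("CNOT", 0, 1), ("MZ", 1), ("MX", 0)])
for row in M:
    print(row)
```

In this example the Z measurement of qubit 1 back-propagates to Z on both qubits, and the X measurement of qubit 0 back-propagates to X on both qubits.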

[0405] Next, a set B characterizing the basis of reliable stabilizer initialization is formed. The Z (X) initialization basis of logical qubit i is associated with a length-2n binary vector $e_{z,i}$ ($e_{x,i}$), which has only the ith ((n + i)th) entry equal to one. If the ith qubit is initialized in |T⟩, B includes both $e_{z,i}$ and $e_{x,i}$. If the ith qubit is initialized in a Pauli state, B only includes the initialization basis which is deterministic (e.g., $e_{z,i} \in B$ if the ith qubit is initialized in |0⟩).

[0406] Note that all vectors from reliable logical Pauli products form a linear space. A logical Pauli product is called reliable if its corresponding vector v satisfies $Mv \in \mathrm{span}\, B$ (Definition 1). It is seen below why the terminology "reliable" is used: the decoding subgraph that it corresponds to has known stabilizer initialization (Lemma 3), and the logical error rate on inferring the product is exponentially suppressed with the code distance (Theorem 8).
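Because every element of B is a unit vector, the condition of Definition 1 reduces to checking that Mv is zero outside the coordinates covered by B. A minimal Python sketch of this test follows; the function names, the string labels `"0"`, `"+"`, and `"T"` for initial states, and the 0-based indexing are assumptions introduced for illustration.

```python
def allowed_rows(inits):
    """Row indices of M covered by B, for initial states given as '0', '+' or 'T'."""
    n = len(inits)
    rows = set()
    for i, state in enumerate(inits):
        if state in ("0", "T"):
            rows.add(i)        # e_{z,i}: Z stabilizers are reliably initialized
        if state in ("+", "T"):
            rows.add(n + i)    # e_{x,i}: X stabilizers are reliably initialized
    return rows


def is_reliable(M, v, inits):
    """Return True if M v (over GF(2)) lies in span B."""
    rows = allowed_rows(inits)
    for r, row in enumerate(M):
        entry = sum(a & b for a, b in zip(row, v)) % 2
        if entry and r not in rows:
            return False
    return True


# Qubit 0 starts in |+>, qubit 1 in |0>, a CNOT(0, 1) is applied, and both
# qubits are measured in Z. Columns of M: back-propagated Z_0 and Z_1.
M = [[1, 1],   # Z on qubit 0
     [0, 1],   # Z on qubit 1
     [0, 0],   # X on qubit 0
     [0, 0]]   # X on qubit 1
inits = ["+", "0"]
print(is_reliable(M, [1, 0], inits))  # single Z measurement of qubit 0
print(is_reliable(M, [1, 1], inits))  # product of the two Z measurements
```

Only the joint product is reliable in this example, since it back-propagates to Z on the qubit initialized in |0⟩, whereas each single measurement back-propagates to Z on the qubit initialized in |+⟩.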

[0407] To interpret measurement results, a complete basis of assignments for all Pauli measurement results is maintained. This complete basis is characterized in Lemma 2. According to Lemma 2 (Complete basis of measurements): for a set of m logical measurements, there exists a full-rank basis $V \in \mathbb{F}_2^{m \times m}$, where each column $v_i$ of V corresponds to one logical measurement product, such that either $v_i$ is a reliable logical product, or the result of $v_i$ is conditionally independent from other columns and always 50/50 random.

[0408] The proof for Lemma 2 is shown herein. The lemma is shown by inductively constructing the basis V. It is shown that any column $v_i$ that is not a reliable logical product must anti-commute with some logical Pauli stabilizer $s_i$, and that for different such columns, the set of anti-commuting logical Pauli stabilizers $\{s_i\}$ is linearly independent. If this is true, then a logical Pauli stabilizer can be found that only flips the ith unreliable logical product: since this should not change the measurement distribution, it can be concluded that its result is conditionally independent from other columns and must always be 50/50 random.

[0409] For the base case, consider the first logical Pauli measurement $P_1$: if $P_1$ is not reliable, then it must anti-commute with some logical Pauli stabilizer, satisfying the condition above with V = (1). For the inductive case, suppose a full-rank basis $V_m$ of the first m measurements is available, satisfying the conditions. With the (m + 1)th Pauli measurement $P_{m+1}$, the basis is updated as follows:
• Extend the first m columns of $V_m$ to length m + 1, with 0 in the new row.
• If there exists a subset S of earlier measurements (possibly empty) such that $P_{m+1} \times \prod_{s \in S} P_s$ is a reliable logical Pauli product, then include this product in the basis by setting the last column $v_{m+1}$ to be 1 at rows in S and the (m + 1)th row, and 0 otherwise.
• Otherwise, include $P_{m+1}$ in the basis by setting the last column $v_{m+1}$ to be 1 at the (m + 1)th row, and 0 otherwise.

[0410] The matrix $V_{m+1}$ constructed recursively in this way will be upper triangular, with all diagonal elements being 1, so it is clearly full rank. Now it is shown that it also satisfies the condition on unreliable Pauli products. Suppose this is not the case; then there exists a set of anti-commuting logical Pauli stabilizers $\{s_i\}$ that is linearly dependent. This set must involve the newly added measurement, since otherwise it violates the induction hypothesis. However, taking the product of these logical measurements results in a logical Pauli product involving $P_{m+1}$ that commutes with all logical Pauli stabilizers and is therefore reliable, contradicting the construction of $v_{m+1}$. Therefore, the condition on unreliable Pauli products is satisfied, completing the proof.
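The inductive construction in this proof can be sketched with Gaussian elimination over GF(2). The sketch below is one possible realization under stated assumptions, not the disclosed implementation: the back-propagation matrix is passed as a list of rows `M`, the rows covered by the reliable initialization basis are passed as a set `allowed`, and a product of measurements is treated as reliable exactly when the rows outside `allowed` cancel. The function name `complete_basis` and the labels `"reliable"` and `"random"` are hypothetical.

```python
def complete_basis(M, allowed):
    """Return (V, kinds): an upper-triangular basis and a label per column."""
    m = len(M[0]) if M else 0
    unreliable_rows = [r for r in range(len(M)) if r not in allowed]
    pivots = {}  # leading bit -> (reduced vector, bitmask of combined measurements)
    V = [[0] * m for _ in range(m)]
    kinds = []
    for j in range(m):
        # Pack the unreliable part of column j into an integer bit vector.
        vec = 0
        for k, r in enumerate(unreliable_rows):
            vec |= M[r][j] << k
        combo = 1 << j  # the new measurement is always part of the product
        # Try to cancel the unreliable part using earlier measurements.
        while vec:
            top = vec.bit_length() - 1
            if top not in pivots:
                pivots[top] = (vec, combo)
                break
            pivot_vec, pivot_combo = pivots[top]
            vec ^= pivot_vec
            combo ^= pivot_combo
        if vec == 0:
            # A subset S of earlier measurements makes the product reliable.
            for i in range(m):
                V[i][j] = (combo >> i) & 1
            kinds.append("reliable")
        else:
            # No such subset exists: keep the bare measurement (50/50 random).
            V[j][j] = 1
            kinds.append("random")
    return V, kinds


# Qubit 0 in |+>, qubit 1 in |0>, CNOT(0, 1), then Z measurements of both.
M = [[1, 1], [0, 1], [0, 0], [0, 0]]
V, kinds = complete_basis(M, allowed={1, 2})
print(V, kinds)
```

Each column only combines the new measurement with earlier ones, so the resulting matrix is upper triangular with a unit diagonal, as the proof requires.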

[0411] This lemma establishes a convenient basis for interpreting logical measurement results, in which each product is either reliable, or 50/50 random and can therefore be chosen randomly. After interpreting the logical measurements in this basis, the inverse of the full-rank matrix V can be applied to obtain the logical measurement results of each logical qubit. It is now shown that reliable logical Pauli products are indeed "reliable": they can be inferred with logical error rate exponentially suppressed with the code distance.
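Applying the inverse of V can be sketched as a substitution over GF(2). The sketch assumes V is upper triangular with a unit diagonal, that column i of V lists the individual measurements whose outcomes are XORed to give the ith product outcome, and that outcomes are stored as bits (0 for +1, 1 for −1). The function name `recover_outcomes` is hypothetical.

```python
def recover_outcomes(V, products):
    """Solve for individual outcomes x from product outcomes r over GF(2).

    Column i of V defines r_i = XOR of x_j over rows j with V[j][i] = 1.
    """
    m = len(products)
    for i in range(m):
        if V[i][i] != 1 or any(V[j][i] for j in range(i + 1, m)):
            raise ValueError("V must be upper triangular with unit diagonal")
    x = [0] * m
    for i in range(m):
        acc = products[i]
        for j in range(i):
            acc ^= V[j][i] & x[j]  # remove contributions of earlier outcomes
        x[i] = acc
    return x


# Column 0: measurement 0 alone (random); column 1: product of both (reliable).
V = [[1, 1],
     [0, 1]]
print(recover_outcomes(V, [1, 0]))
```

Here the random outcome of the first measurement is 1 and the decoded joint product is 0, so the second measurement outcome must also be 1.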

[0412] A few key properties of the decoding subgraph for each reliable logical Pauli product are shown. This subgraph is constructed by propagating the Pauli product back through the Clifford circuit, and including stabilizer measurement results on the same logical qubits and basis as the logical operator. The decoding subgraph is then formed from detectors that only involve those stabilizer measurements, together with all physical errors that can flip these detectors. First, it is shown that the decoding subgraph only involves stabilizers that are reliably initialized in +1 (up to local stochastic noise), thereby avoiding the basis of stabilizers that are newly initialized in a random value. As described above, it is assumed here that logical magic states are provided with +1 stabilizer values, up to local stochastic noise.

[0413] According to Lemma 3 (Reliable stabilizer initialization): all initial stabilizers in the decoding subgraph of a reliable logical Pauli product have the known value of +1. The proof for Lemma 3 is shown herein. The fact that Pauli stabilizers propagate in the same way as Pauli logical operators in the same basis is used. The logical Pauli product P under consideration is propagated back through the Clifford circuit, until it becomes a Pauli product $\tilde{P}$ supported only on logical state preparation. By the definition of a reliable logical Pauli product, $\tilde{P}$ must commute with all Pauli initial states. If a logical qubit is initialized in |0⟩, then $\tilde{P}$ must act as I or Z on it, so only Z basis stabilizers can be part of the decoding subgraph. Since |0⟩ initialization is performed by preparing all physical qubits in |0⟩ and measuring stabilizers, the Z stabilizers are initialized deterministically to +1. A similar analysis readily applies to logical qubits initialized in |+⟩. All logical qubits initialized in magic states, by assumption, are initialized with +1 stabilizers up to local stochastic noise. Therefore, all initial stabilizers in the decoding subgraph have value +1. Next, it is shown that the decoding subgraph contains all physical errors that can affect the logical measurement result, thereby providing sufficient information for decoding.

[0414] According to Lemma 4 (Completeness of decoding subgraph), any elementary physical error that can affect the reliable logical Pauli product will be detected in the resulting decoding subgraph. The proof for Lemma 4 is shown herein. Consider any elementary Pauli error e, and denote its propagation through the rest of the circuit as the Pauli operator $e' = U^\dagger e U$. In order for the elementary error to affect the logical Pauli product P, e' and P must anti-commute, $\{e', P\} = 0$, which in turn means that the error must anti-commute with the back-propagation of the logical Pauli product, $\{e, U P U^\dagger\} = 0$. Since the stabilizer measurements are in the same basis as the logical Pauli product, the error e must flip stabilizer measurement results and therefore be detected by the decoding subgraph.
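The key step of this proof, that a forward-propagated error anti-commutes with P exactly when the original error anti-commutes with the back-propagated P, can be checked exhaustively for a single CNOT. The Python sketch below does so in the binary symplectic representation, ignoring signs. The function names and the restriction to one CNOT gate are assumptions chosen to keep the example small.

```python
from itertools import product


def anticommute(p, q):
    """Paulis are (x_bits, z_bits); they anti-commute iff the symplectic product is 1."""
    (x1, z1), (x2, z2) = p, q
    overlap = sum(a & b for a, b in zip(x1, z2)) + sum(a & b for a, b in zip(z1, x2))
    return overlap % 2 == 1


def conjugate_by_cnot(p, c, t):
    """Conjugate a Pauli by CNOT(c, t), ignoring signs. CNOT is its own inverse."""
    x, z = list(p[0]), list(p[1])
    x[t] ^= x[c]  # X on the control spreads to the target
    z[c] ^= z[t]  # Z on the target spreads to the control
    return tuple(x), tuple(z)


def lemma_step_holds(n=2, c=0, t=1):
    """Check the equivalence for every pair of n-qubit Paulis (error e, product P)."""
    bits = list(product((0, 1), repeat=n))
    paulis = [(x, z) for x in bits for z in bits]
    return all(
        anticommute(conjugate_by_cnot(e, c, t), P)
        == anticommute(e, conjugate_by_cnot(P, c, t))
        for e in paulis
        for P in paulis
    )


print(lemma_step_holds())
```

Because CNOT is its own inverse, the same conjugation function serves for both forward propagation of the error and back-propagation of the logical product.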

[0415] The next important property of the decoding subgraph pertains to its ability to be matched. According to Lemma 5 (Matchable decoding subgraph): the decoding subgraph for any reliable logical Pauli product has edge degree at most 2, and bounded vertex degree O(1). The proof for Lemma 5 is shown herein. For the surface code, data qubit X or Z errors trigger 1 or 2 syndromes at their endpoints, thereby affecting 1 or 2 detectors. Each syndrome measurement is involved in at most 2 detectors, corresponding to propagating the logical operator forward and backward. Therefore, time-like edges also have degree at most 2. Because all detectors are formed from local products of measurement results, the number of elementary errors that can flip them will be bounded, leading to the bounded vertex degree. The fact that the decoding subgraph is composed of simple edges of degree at most 2 implies that the MWPM decoder can be applied, which can efficiently identify the most likely error in this case. This allows standard cluster counting arguments to be used to bound the logical error rate of each reliable logical Pauli product.
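The two degree conditions of Lemma 5 can be checked mechanically on any map from error mechanisms to the detectors they flip. The Python sketch below does this for a small repetition-code memory, used here as a stand-in for a surface code decoding subgraph. The helper names, the repetition-code example, and the detector labels `(check, round)` are assumptions for illustration.

```python
def repetition_code_errors(d, rounds):
    """Map each elementary error to the detectors (check, round) it flips."""
    errors = {}
    for r in range(rounds):
        for q in range(d):
            # A data error before round r flips the adjacent checks at round r.
            errors[("data", q, r)] = [(c, r) for c in (q - 1, q) if 0 <= c < d - 1]
        for c in range(d - 1):
            # A measurement error flips this round's detector and the next one.
            dets = [(c, r)]
            if r + 1 < rounds:
                dets.append((c, r + 1))
            errors[("meas", c, r)] = dets
    return errors


def matchability_report(error_to_detectors):
    """Return (is_matchable, max detectors per error, max errors per detector)."""
    max_edge = max(len(dets) for dets in error_to_detectors.values())
    degree = {}
    for dets in error_to_detectors.values():
        for det in dets:
            degree[det] = degree.get(det, 0) + 1
    return max_edge <= 2, max_edge, max(degree.values())


errors = repetition_code_errors(d=3, rounds=2)
print(matchability_report(errors))

# A hypothetical error flipping three detectors breaks matchability.
errors[("hyper", 0, 0)] = [(0, 0), (1, 0), (0, 1)]
print(matchability_report(errors)[0])
```

When every error flips at most two detectors, each error can be drawn as an ordinary edge (or a boundary edge), which is the precondition for applying a matching decoder.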

[0416] Theorem 6 (Exponential error suppression for single reliable Pauli product) is considered. Consider a [[n, k, d]] surface code transversal Clifford circuit with reliable magic state inputs (stabilizers are +1, up to local stochastic noise), and consider a single reliable logical Pauli product. Then there exists a threshold $p_0$, such that if the code experiences local stochastic noise with error probability $p < (1 - \epsilon) p_0$, then decoding the result of the reliable logical Pauli product by applying MWPM to the corresponding subgraph has a logical error rate that approaches 0 proportionally to $O(Vn (p/p_0)^{d/4})$, where V is the logical space-time volume of the subgraph and $0 < \epsilon < 1$ is any nonzero constant.

[0417] The proof for Theorem 6 is shown herein. In the proof for Theorem 6, the logical error rate is bounded. Consider the syndrome adjacency graph (line graph of the decoding hypergraph), where vertices $e_i$ are error events included in the decoding subgraph, and there is an edge $(e_i, e_k)$ between vertices if they both trigger the same detector. By Lemma 5, each error is involved in at most 2 detectors, each of which has a bounded vertex degree, so the vertices in the syndrome adjacency graph also have degree bounded by some constant z. Errors and the inferred correction form undetectable clusters on the syndrome adjacency graph, and properties of connected clusters of errors are analyzed to bound the logical error rate.

[0418] The number of vertices in a connected cluster required for a logical error to occur may be lower bounded. For a fault cluster f and logical Pauli product P, both of them may be propagated back through the circuit until they are only supported on state initialization, resulting in operators $\tilde{f}$ and $\tilde{P}$. Since both faults and logical Pauli products are propagated through the same circuit, f will lead to a logical error if and only if $\{\tilde{f}, \tilde{P}\} = 0$. By Lemma 3, all initial stabilizers in the same basis as P have known initial values. Any error cluster that can cause a logical error on P must be in the opposite basis, and by the distance of the code, there is a lower bound on the weight $|\tilde{f}| \ge d$. By the transversal structure of the circuit, in which a given error can only spread to at most 2 other locations, this implies that $|f| \ge d/2$.

[0419] The number of clusters with weight w that contain any given error is upper bounded by $(ze)^{w-1}$. There are a total of V logical space-time locations, resulting in O(Vn) physical error locations. Since the MWPM decoder is able to identify the minimum-weight error, the weight of the correction is upper bounded by the number of physical errors in each cluster. Therefore, at least w/2 physical errors must have occurred, with a probability upper bounded by $p^{w/2}$. For a cluster of size w, there are at most $2^w$ ways to choose subsets that correspond to the physical error configuration. This allows the logical error rate on the Pauli product to be bounded as

$$P_L \le O(Vn) \sum_{w \ge d/2} (ze)^{w-1} 2^w p^{w/2} = O\!\left(\frac{Vn}{ze} \cdot \frac{(2ze\sqrt{p})^{d/2}}{1 - 2ze\sqrt{p}}\right) = O\!\left(Vn \left(\frac{p}{p_0}\right)^{d/4}\right),$$

where $p_0 = 1/(2ze)^2$, as desired.
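The geometric series behind this bound can be evaluated numerically. The Python sketch below compares a truncated version of the sum over cluster sizes with its closed form, leaving out the O(Vn) prefactor. It assumes that e in the cluster-counting factor denotes Euler's number and uses a hypothetical degree bound z = 4; the function names are likewise illustrative.

```python
import math


def threshold(z):
    """p_0 = 1 / (2 z e)^2, with e taken to be Euler's number (an assumption)."""
    return 1.0 / (2 * z * math.e) ** 2


def tail_sum(z, p, d, terms=500):
    """Truncated sum over w >= d/2 of (ze)^(w-1) * 2^w * p^(w/2).

    Each term is written as (2 z e sqrt(p))^w / (z e), which is algebraically
    the same expression and avoids floating-point overflow.
    """
    ratio = 2 * z * math.e * math.sqrt(p)
    start = math.ceil(d / 2)
    return sum(ratio ** w for w in range(start, start + terms)) / (z * math.e)


def closed_form(z, p, d):
    """Geometric-series closed form; requires p < p_0 for convergence."""
    ratio = math.sqrt(p / threshold(z))
    if ratio >= 1:
        raise ValueError("series diverges at or above the threshold p_0")
    return ratio ** math.ceil(d / 2) / (z * math.e * (1 - ratio))


z = 4                      # hypothetical degree bound of the adjacency graph
p = threshold(z) / 4       # a physical error rate below threshold
for d in (8, 12, 16):
    print(d, tail_sum(z, p, d), closed_form(z, p, d))
```

For even d the closed form is proportional to $(p/p_0)^{d/4}$, so each increase of d by 4 multiplies the bound by $p/p_0$.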

[0420] It is noted here that the factor of 2 reduction in the distance is due to the loose bound on error propagation of the (fold-)transversal H and S gates, and can probably be improved.

[0421] Having established a bound on the error rate of each individual reliable logical Pauli product, the union bound may be used to combine them and obtain a bound on the total logical error rate of the full computation.

[0422] According to Lemma 7 (Union bound on logical errors in transformed basis), consider a probability distribution Q(x) over N binary random variables $x = (x_1, x_2, \ldots, x_N)$, and a full-rank matrix $M \in \mathbb{F}_2^{N \times N}$ representing a basis transformation of the random variables. Suppose an error (possibly correlated) of probability at most p is applied to each element of Mx, resulting in a new distribution Q'(x). Then the total variation distance between the original and erroneous distribution is upper bounded by $\|Q(x) - Q'(x)\|_{TV} \le Np$.

[0423] A proof for Lemma 7 is shown herein. Denote the vector y = Mx, and the vector with noise applied $y' = y \oplus e$. Each element of e is nonzero with probability at most p. Applying the union bound, it is found that $P(e \ne 0) \le Np$. Since the matrix M is full rank, the vector y' can be inverted to obtain a vector x' in the original basis. With probability at least 1 − Np, no error vector was applied, so the same vector x' = x is recovered, implying that deviations in the distribution can only occur in the remaining Np probability. This implies the upper bound $\|Q(x) - Q'(x)\|_{TV} \le Np$.
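The bound of Lemma 7 can be checked by exact enumeration for small N. The Python sketch below applies independent bit flips of probability p to each element of Mx, which is one special case of the possibly correlated errors allowed by the lemma, maps the result back to the original basis, and computes the total variation distance. The distribution Q, the matrix M, and the function names are hypothetical.

```python
from itertools import product


def matvec(M, x):
    """Multiply a binary matrix by a binary vector over GF(2)."""
    return tuple(sum(a & b for a, b in zip(row, x)) % 2 for row in M)


def noisy_distribution(Q, M, p):
    """Distribution of x' = M^{-1}(Mx + e) for x ~ Q and i.i.d. bit flips e."""
    N = len(M)
    # Tabulate the inverse map y -> x; it is a bijection iff M is full rank.
    inverse = {matvec(M, x): x for x in product((0, 1), repeat=N)}
    if len(inverse) != 2 ** N:
        raise ValueError("M must be full rank over GF(2)")
    Q_noisy = {x: 0.0 for x in inverse.values()}
    for x, weight in Q.items():
        y = matvec(M, x)
        for e in product((0, 1), repeat=N):
            prob_e = 1.0
            for bit in e:
                prob_e *= p if bit else (1 - p)
            y_noisy = tuple(a ^ b for a, b in zip(y, e))
            Q_noisy[inverse[y_noisy]] += weight * prob_e
    return Q_noisy


def tv_distance(Q, Q_noisy):
    """Total variation distance between two distributions given as dicts."""
    keys = set(Q) | set(Q_noisy)
    return 0.5 * sum(abs(Q.get(k, 0.0) - Q_noisy.get(k, 0.0)) for k in keys)


Q = {(0, 0, 0): 0.5, (1, 1, 0): 0.3, (0, 1, 1): 0.2}
M = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
p = 0.01
distance = tv_distance(Q, noisy_distribution(Q, M, p))
print(distance, "<=", len(M) * p)
```

The computed distance cannot exceed the probability that any flip occurs, $1 - (1 - p)^N$, which is itself at most Np.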

[0424] According to Theorem 8 (Exponential error suppression for universal quantum computation), consider a [[n, k, d]] surface code transversal Clifford circuit with reliable magic state inputs (stabilizers are +1, up to local stochastic noise). Then there exists a threshold $p_0$, such that if the code experiences local stochastic noise with error probability $p < (1 - \epsilon) p_0$, there exists a decoding strategy based on MWPM with a logical error rate that approaches 0 proportionally to $O(Vn (p/p_0)^{d/4})$, where V is the logical space-time volume of the subgraph and $0 < \epsilon < 1$ is any nonzero constant.

[0425] According to various embodiments described herein, a decoding framework tailored to transversal logical algorithms is provided, demonstrating that direct decoding of multi-logical Pauli products can preserve fault tolerance while greatly simplifying the classical decoding task. Various procedures described herein result in decoding graphs which are matchable and thus efficiently decodable, and substantially reduce the decoding volume. The fault tolerance of the approach is proved, and in numerical benchmarks it is found that it maintains the high performance of most-likely-error hypergraph correlated decoding (which has exponential runtime in the worst case), with runtime speeds comparable to single-qubit memory. Broadly, by tracking how logical operators propagate through a transversal circuit, one or more techniques described herein promote a memory decoder to an algorithm decoder while maintaining memory-like performance with O(1) syndrome extraction rounds per transversal logical operation.

[0426] The reduction in decoding complexity has significant practical implications for runtime and performance. Moreover, the broader ability to promote memory decoders to algorithm decoders yields a framework for decoding transversal algorithms in general. For example, the efficient transversal distillation circuit considered here may be combined with magic state cultivation.

[0427] The descriptions of the various embodiments of the present disclosure have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

What is claimed is:

1. A method of performing a quantum computation, the method comprising:
providing at least a first plurality of physical qubits;
encoding a first logical qubit into the at least first plurality of physical qubits according to a quantum error correcting code;
encoding a second logical qubit into the at least first plurality of physical qubits according to the quantum error correcting code;
based on the quantum error correcting code, constructing a bipartite decoding graph corresponding to the first and the second logical qubits, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism;
applying a transversal gate to the first and the second logical qubits;
performing a first round of syndrome measurement of the first and the second logical qubits;
for each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, generating an edge on the bipartite decoding graph therebetween; and
determining a physical error configuration from the bipartite decoding graph.

2. The method of claim 1, further comprising:
applying at least one gate to the first plurality of physical qubits to correct the physical error configuration.

3. The method of any one of claims 1-2, wherein the quantum error correcting code is a surface code.

4. The method of any one of claims 1-3, wherein the transversal gate is a Clifford gate.

5. The method of claim 4, wherein the Clifford gate is a CNOT gate.

6. The method of any one of claims 1-3, further comprising:
encoding a third logical qubit into the at least first plurality of physical qubits according to the quantum error correcting code.

7. The method of claim 6, wherein the transversal gate is a non-Clifford gate.

8. The method of claim 7, wherein the non-Clifford gate is a CCZ gate.

9. The method of any one of claims 1-8, wherein determining the physical error configuration comprises maximizing an error probability on the bipartite decoding graph.

10. The method of claim 9, wherein maximizing the error probability comprises solving a mixed-integer programming problem corresponding to the error probability.

11. The method of any one of claims 1-8, wherein determining the physical error configuration comprises determining one or more subgraphs of the bipartite decoding graph corresponding to the physical error configuration.

12. The method of claim 11, wherein determining the one or more subgraphs comprises defining a subgraph for each detector node having a detected error and expanding each such subgraph until it encompasses error nodes, which, if they had occurred, would result in syndrome measurements consistent with the observed syndrome.

13. The method of any one of claims 1-8, wherein determining the physical error configuration comprises determining one or more subgraphs of the bipartite decoding graph by backpropagation from a plurality of logical measurements.

14. The method of claim 13, wherein the plurality of logical measurements is chosen such that their product commutes with all logical Pauli stabilizers of the quantum error correcting code.

15. The method of claim 13, wherein determining the physical error configuration further comprises applying a decoder to the one or more subgraphs iteratively.

16. The method of claim 13, wherein determining the physical error configuration further comprises applying a decoder to the one or more subgraphs in parallel.

17. The method of claim 15 or 16, wherein the decoder comprises a matching decoder.

18. The method of claim 15 or 16, wherein applying the decoder comprises performing the method of claim 9 or claim 11.

19. The method of claim 15 or 16, wherein the matching decoder comprises minimum weight perfect matching on the one or more subgraphs.

20. The method of claim 19, wherein said backpropagation is performed prior to applying the decoder to the one or more subgraphs.

21. The method of any one of claims 1 to 20, wherein each qubit of the first plurality of physical qubits is a neutral atom.

22. The method of any one of claims 1-21, wherein the at least first plurality of physical qubits comprises a second plurality of physical qubits, and wherein
the first logical qubit is encoded in the first plurality of physical qubits and
the second logical qubit is encoded in the second plurality of physical qubits.

23. The method of any one of claims 6-8, wherein the at least first plurality of physical qubits comprises a second plurality of physical qubits and a third plurality of physical qubits, and wherein
the first logical qubit is encoded in the first plurality of physical qubits,
the second logical qubit is encoded in the second plurality of physical qubits, and
the third logical qubit is encoded in the third plurality of physical qubits.

24. The method of claim 22, wherein applying the transversal gate comprises placing the first and second pluralities of physical qubits such that each physical qubit of the first plurality of physical qubits is within a blockade radius of exactly one corresponding physical qubit of the second plurality of physical qubits and illuminating the first and second plurality of physical qubits with a first laser.

25. The method of any one of claims 1-24, further comprising:
alternately applying one or more additional transversal gates to the first and the second logical qubits and one or more additional rounds of syndrome measurement of the first and the second logical qubits.

26. A quantum processor, comprising:
a first array of optical traps disposed in an active zone;
a second array of optical traps disposed in a readout zone;
a first laser configured to illuminate the active zone and to drive a transition to a Rydberg state;
a second laser configured to illuminate the active zone and to drive a transition between hyperfine states;
a third laser configured to illuminate the readout zone;
a fourth laser configured to adiabatically move neutral atoms between the optical traps of the active zone and the readout zone; and
a camera configured to capture an image of the readout zone, wherein the quantum processor is configured to:
provide at least a first plurality of neutral atoms in the active zone, each in a respective optical trap of the first array;
encode a first logical qubit into the at least first plurality of neutral atoms according to a quantum error correcting code by the first and second lasers;
encode a second logical qubit into the at least first plurality of neutral atoms according to the quantum error correcting code by the first and second lasers;
based on the quantum error correcting code, constructing a bipartite decoding graph corresponding to the first and the second logical qubits, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism;
place the first and second pluralities of neutral atoms in the active zone such that each neutral atom of the first plurality of neutral atoms is within a blockade radius of exactly one corresponding neutral atom of the second plurality of neutral atoms;
illuminate the first plurality of neutral atoms and the second plurality of neutral atoms while in the active zone by at least the first or second laser, thereby applying a transversal gate to the first and second logical qubits;
performing a first round of syndrome measurement of the first and the second logical qubits;
for each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, generating an edge on the bipartite decoding graph therebetween; and
determining a physical error configuration from the bipartite decoding graph.

27. A method of performing a quantum computation, the method comprising:
providing at least a first plurality of physical qubits;
encoding a first logical qubit and a second logical qubit into the at least first plurality of physical qubits according to a quantum error correcting code;
applying a first transversal gate to one or more of the first and second logical qubits;
performing a first round of syndrome measurement of the first and/or second logical qubit;
obtaining a first measurement of the second logical qubit;
based on the quantum error correcting code, constructing a first bipartite decoding graph corresponding to the first and second logical qubits and the first transversal gate, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism;
for each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, generating an edge on the first bipartite decoding graph therebetween;
determining a first physical error configuration from the first bipartite decoding graph and the first round of syndrome measurements;
applying a second transversal gate to at least one of the first and second logical qubits;
performing a second round of syndrome measurement of the first and/or second logical qubits;
obtaining a second measurement of the first and second logical qubits;
based on the quantum error correcting code, constructing a second bipartite decoding graph corresponding to the first and second logical qubits and the first and second transversal gates, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism;
for each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, generating an edge on the second bipartite decoding graph therebetween; and
determining a second physical error configuration from the second bipartite decoding graph and the second round of syndrome measurements.

28. The method of claim 27, wherein the second transversal gate is conditional on the first measurement.

29. The method of claim 27, further comprising:
decoding a first value of the first logical qubit from the first measurement based on the first bipartite decoding graph;
decoding a second value of the first logical qubit from the second measurement based on the second bipartite decoding graph;
applying one or more logical operators to the first and/or second logical qubits according to a disparity between the first and second values.

30. The method of claim 27, further comprising:
applying at least one gate to the first plurality of physical qubits to correct the first and/or second physical error configuration.

31. The method of any one of claims 27-30, wherein the quantum error correcting code is a surface code.

32. The method of any one of claims 27-30, wherein the first and/or second transversal gate is a Clifford gate.

33. The method of claim 32, wherein the Clifford gate is a CNOT gate.

34. The method of any one of claims 27-30, wherein the first and/or second transversal gate is a non-Clifford gate.

35. The method of claim 34, wherein the non-Clifford gate is a CCZ gate.

36. The method of any one of claims 27-35, wherein determining the first physical error configuration comprises maximizing an error probability on the first bipartite decoding graph.

37. The method of any one of claims 27-35, wherein determining the second physical error configuration comprises maximizing an error probability on the second bipartite decoding graph.

38. The method of claim 36 or 37, wherein maximizing the error probability comprises solving a mixed-integer programming problem corresponding to the error probability.

39. The method of any one of claims 27-35, wherein determining the first physical error configuration comprises determining one or more subgraphs of the first bipartite decoding graph corresponding to the first physical error configuration.

40. The method of any one of claims 27-35, wherein determining the second physical error configuration comprises determining one or more subgraphs of the second bipartite decoding graph corresponding to the second physical error configuration.

41. The method of claim 39 or 40, wherein determining the one or more subgraphs comprises defining a subgraph for each detector node having a detected error and expanding each such subgraph until it encompasses error nodes, which, if they had occurred, would result in syndrome measurements consistent with the observed syndrome.

42. The method of any one of claims 27 to 41, wherein each qubit of the first plurality of physical qubits is a neutral atom.

43. The method of any one of claims 27-42, wherein the at least first plurality of physical qubits comprises a second plurality of physical qubits, and wherein
the first logical qubit is encoded in the first plurality of physical qubits,
the second logical qubit is encoded in the second plurality of physical qubits.

44. A method of performing a quantum computation, the method comprising:
providing at least a first plurality of physical qubits;
encoding a first logical qubit and a second logical qubit into the at least first plurality of physical qubits according to a quantum error correcting code;
applying a first transversal gate to one or more of the first and second logical qubits;
obtaining a first measurement of the second logical qubit;
based on the quantum error correcting code, constructing a first bipartite decoding graph corresponding to the first and second logical qubits, the first transversal gate, and the first measurement, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism;
for each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, generating an edge on the first bipartite decoding graph therebetween;
determining a first physical error configuration from the first bipartite decoding graph;
applying a second transversal gate to at least one of the first and second logical qubits;
performing a second round of syndrome measurement of the first and/or second logical qubits;
obtaining a second measurement of the first and second logical qubits;
based on the quantum error correcting code, constructing a second bipartite decoding graph corresponding to the first and second logical qubits and the first and second transversal gates, the bipartite decoding graph comprising a plurality of detector nodes and a plurality of error nodes, each error node corresponding to an error mechanism;
for each of the plurality of detector nodes affected by the corresponding error mechanism of one of the plurality of error nodes, generating an edge on the second bipartite decoding graph therebetween; and
determining a second physical error configuration from the second bipartite decoding graph and the second round of syndrome measurements.