Relaxation oscillator-based probabilistic combinatorial optimization engine for soft decoding of LDPC codes

The relaxation oscillator-based system addresses the limitations of conventional architectures by directly mapping LDPC decoding to a dynamical system with native CT sixth-order spin interactions and soft initialization, improving error-rate performance and throughput.

WO2026122160A2 · PCT designated stage · Publication Date: 2026-06-11 · THE RGT UNIV OF MICHIGAN

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
THE RGT UNIV OF MICHIGAN
Filing Date
2025-08-28
Publication Date
2026-06-11

Patent Text Reader

Abstract

Physics-inspired computing harnesses continuous time operation, massive parallelism, and direct compute load mapping to coupled CMOS-based spins to accelerate solving complex optimization problems. This disclosure advances the field by introducing relaxation oscillator (RXO) low-density parity check (LDPC), a combinatorial optimization problem (COP) engine that natively supports six-body spin interactions for efficient, robust, and one-shot oscillator-based soft decoding of LDPC codes. The proposed RXO spins feature a capacitor DAC-based initialization structure, allowing precise mapping of soft information to initial spin phases for high-performance decoding. A crossbar-based feedback system facilitates six-body spin interactions by directly coupling spins based on the COP graph. Evaluated with more than 100 million decoding cycles, the system demonstrates reliable performance across a wide range of SNRs, supply voltages, temperatures, and for different dies. These measurement results highlight the RXO-based architecture's potential as an accelerator for directly solving COPs with multi-body spin interactions.

Description

Attorney Docket No. 2115-008420-WO-POA

RELAXATION OSCILLATOR-BASED PROBABILISTIC COMBINATORIAL OPTIMIZATION ENGINE FOR SOFT DECODING OF LDPC CODES

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 63/689228 filed on August 30, 2024. The entire disclosure of the above application is incorporated herein by reference.

FIELD

[0002] The present disclosure relates to a relaxation oscillator-based system for performing error correction.

BACKGROUND

[0003] The current trend in computation hardware focuses on developing custom accelerators to tackle the growing energy-efficiency and throughput demands of modern workloads such as combinatorial optimization problems (COPs), artificial intelligence (AI), and advanced communication systems. While traditional digital architectures, including von Neumann-based processors and digital application-specific integrated circuits (ASICs), constitute the main paradigm of conventional computing, they impose significant limitations. The inherent constraints of digital architectures stem from their reliance on clocked synchronous operation, restricted parallelism, and excessive data movement, leading to increased latency, high energy costs, and limited adaptability for specialized tasks. These factors significantly constrain system throughput, scalability, and energy efficiency, particularly in memory-intensive and high-performance applications. Alternative paradigms such as analog computing have been explored to address such challenges.

[0004] Recently, novel computing paradigms, such as quantum computing, have emerged as potential solutions to these challenges. Quantum computers use superposition, entanglement, and interference to process data in ways classical computers cannot. Quantum annealers, such as the D-Wave system, solve combinatorial optimization problems by mapping them onto a quantum system’s energy landscape. However, quantum computers and annealers face fundamental limitations that hinder their practical application. These include limited coherence times, which restrict the duration of quantum effects, hardware constraints like sparse and local qubit connectivity that limit scalability, and limitations in qubit error rates, which limit their ability to solve complex optimization problems efficiently. Moreover, limited qubit coupling strength, the need for cryogenic temperatures, and substantial energy (kW-level) overhead further hinder the deployment of quantum computers and annealers.

[0005] Physics-inspired computers offer a promising alternative, emulating key features of quantum systems: 1) direct mapping of compute loads to physical compute elements; 2) spin interactions that implement a dynamical system with an engineered energy landscape; 3) massive parallelism; 4) continuous-time (CT) asynchronous operation; and 5) flexibility in representing complex interactions, such as higher order and non-linear couplings. The potential of physics-inspired solvers has been demonstrated, where a coherent Ising machine outperforms D-Wave’s quantum annealer time-to-solution (TTS) by several orders of magnitude for Sherrington-Kirkpatrick and dense Max-Cut problems.

[0006] Coherent Ising machines show promise for solving combinatorial optimization problems but face several limitations. The scalability and design of coherent Ising machines become increasingly challenging in larger systems due to the complexity of implementing optical components, such as optical parametric oscillators. Additionally, coherent Ising machines are sensitive to thermal and quantum noise, which affect stability and performance consistency. In contrast, CMOS technology offers a more robust and scalable alternative for implementing physics-inspired solvers, given its technological maturity, resilience to process-voltage-temperature (PVT) variations, and well-established supply chain.

[0007] CMOS-based solvers provide a practical realization of physics-inspired computing by substituting quantum spin with the phase of oscillators, typically set at 0° and 180° to represent binary 0 and 1. These solvers map complex optimization problems directly onto physical hardware, where interactions between oscillators define the problem’s objective function and energy landscape. Once initialized, the oscillator compute elements interact and evolve dynamically toward a global extremum, ultimately converging on the problem solution. The operational principle of physics-inspired solvers contrasts with that of digital accelerators for combinatorial optimization, which have shown promising results by using techniques such as in-memory computation or controlled annealing to accelerate solving combinatorial optimization problems.

[0008] This section provides background information related to the present disclosure which is not necessarily prior art.

SUMMARY

[0009] This section provides a general summary of the disclosure, and is not a comprehensive disclosure of its full scope or all of its features.

[0010] A system is presented for performing error correction of a message. The system includes: a parity check matrix for an error correction code; an array of oscillator circuits, where each oscillator circuit outputs a signal and maps to a different bit in the message, such that relative phase of the signal output by a given oscillator circuit indicates a value of the bit in the message; a series of digital logic gates interconnected between inputs of the oscillator circuits in the array of oscillator circuits and outputs of the oscillator circuits in the array of oscillator circuits, where each digital logic gate in the series of digital logic gates corresponds to a row in the parity check matrix; and a digital controller interfaced with the array of oscillator circuits.

[0011] During operation, the digital controller operates to initialize state of each oscillator circuit in the array of oscillator circuits to a corresponding bit value in a received message.

[0012] In an example embodiment, the digital controller further operates to initialize state of each oscillator circuit in the array of oscillator circuits using likelihood information (e.g., log-likelihood ratio) associated with the corresponding bit value in the message. More specifically, each oscillator circuit includes at least one capacitor and the digital controller initializes the state of a given oscillator circuit by applying a voltage across the at least one capacitor, where the magnitude of the voltage correlates to the likelihood information for the bit mapped to the given oscillator circuit. The digital controller may apply the voltage across the at least one capacitor via a digital to analog converter.

[0013] In some embodiments, each oscillator circuit in the array of oscillator circuits is further defined as a relaxation oscillator and includes a set-reset latch and at least one capacitor, wherein the digital controller initializes the state of the set-reset latch and applies a voltage via a digital to analog converter to the at least one capacitor.

[0014] The system may further include an input crossbar array and an output crossbar array. The input crossbar array couples outputs of the oscillator circuits in the array of oscillator circuits to inputs of the digital logic gates in the series of digital logic gates. On the other hand, the output crossbar array couples outputs of the digital logic gates in the series of digital logic gates to inputs of the oscillator circuits in the array of oscillator circuits, such that output of the digital logic gates affects threshold voltage of a comparator in an oscillator circuit in the array of oscillator circuits. Each oscillator circuit in the array of oscillator circuits may also include a digital to analog converter configured to receive output from the digital logic gates.

[0015] In one embodiment, the digital logic gates are further defined as exclusive OR gates, where each input of a given exclusive OR gate couples to an output of a corresponding oscillator circuit as specified by the parity check matrix.

[0016] The system may also include a readout circuit interfaced with the array of oscillator circuits. The readout circuit operates to read out state of each oscillator circuit in the array of oscillator circuits in response to a trigger signal, where states of oscillator circuits represent corrected bits of the received message. The readout circuit preferably samples state of each oscillator circuit multiple times during an oscillation period. An error sense circuit is configured to compare states of each oscillator circuit in the array of oscillator circuits to the parity check matrix and output an error sense signal based on the comparison, where the trigger signal is generated after detecting a predefined number of transitions in the error sense signal.

[0017] Further areas of applicability will become apparent from the description provided herein. The description and specific examples in this summary are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.

DRAWINGS

[0018] The drawings described herein are for illustrative purposes only of selected embodiments and not all possible implementations, and are not intended to limit the scope of the present disclosure.

[0019] Figure 1 is a diagram showing a workflow for a proposed physics-inspired solver with direct mapping of higher order spin interactions.

[0020] Figure 2 depicts an example of an LDPC H-matrix.

[0021] Figure 3 is a diagram showing how the proposed solver natively supports six-body spin couplings, enabling higher order interactions with fewer hardware resources for a given graph.

[0022] Figure 4 is a graph showing how the energy function exhibits equal global minima, each corresponding to one valid LDPC codeword.

[0023] Figure 5 is a graph showing how precise solver initialization with soft information improves the likelihood of a correct solution.

[0024] Figure 6 is a schematic of an example embodiment of a relaxation oscillator circuit.

[0025] Figure 7 is a diagram illustrating how LDPC codes are mapped to an oscillator array.

[0026] Figure 8 is a diagram depicting a system architecture for the decoder.

[0027] Figure 9 is a diagram of an asynchronous compute framework.

[0028] Figure 10 is a schematic of the relaxation oscillator circuit with feedback.

[0029] Figure 11 is a graph showing oscillator comparator input voltages for an example compute run.

[0030] Figure 12 depicts details of the SDAC and feedback DAC integrated with each relaxation oscillator circuit.

[0031] Figure 13 is a diagram showing a feedback system with twin crossbars facilitating six-body spin coupling.

[0032] Figure 14 illustrates the layout of a conventional crossbar (left) and the layout of the proposed crossbar with alternating metal layers and equal wire lengths (right).

[0033] Figure 15 is a schematic for an integrated sampling system.

[0034] Figure 16 is a diagram for an example digital controller.

[0035] Figure 17 is a graph showing error rates for the measured prototype.

[0036] Figure 18 is a graph showing error rates for the measured prototype with soft and hard initialization.

[0037] Figure 19 is a graph showing convergence rates for the measured prototype with soft and hard initialization.

[0038] Figure 20 is a graph showing time-to-solution for the measured prototype with soft and hard initialization.

[0039] Corresponding reference numerals indicate corresponding parts throughout the several views of the drawings.

DETAILED DESCRIPTION

[0040] Example embodiments will now be described more fully with reference to the accompanying drawings.

[0041] LDPC codes are widely used for efficient error correction in noisy communication channels such as in Ethernet and 5G. LDPC decoding involves finding the most likely transmitted codeword, x, given a noisy received signal, r. In this disclosure, binary phase shift keying (BPSK) or PAM-2 modulation is considered, where a binary codeword, w, is converted to symbols by: x = -1 + 2w. LDPC codes are defined by a sparse parity-check H-matrix, where each row corresponds to a parity-check equation, and each column relates to a bit in the codeword. The H-matrix encodes the relationships (i.e., parity checks) between different codeword bits, ensuring that certain combinations of bits must satisfy the parity-check constraints. Fig. 2 shows an example of a small 4 x 8 H-matrix that defines a code of length 8. This H-matrix specifies four parity-check constraints: C0, C1, C2, and C3.

[0042] Given a received signal, r, the decoding problem is essentially to solve for the most likely w such that

$$H w^{T} = 0 \qquad (1)$$

where w is the sent binary codeword and x = -1 + 2w is the corresponding sent symbol vector with $x_i \in \{-1, +1\}$. The received signal r is

$$r_i = x_i + n_i \qquad (2)$$

where $n_i$ is the noise term added to a transmitted bit, $x_i$. The objective in LDPC decoding is to estimate w based on r, leveraging parity-check constraints to correct errors due to noise.
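The relationship between equations (1) and (2) can be illustrated with a minimal Python sketch. The 4 x 8 matrix `H` below is a hypothetical example chosen for illustration (it is not the matrix of Fig. 2), and the function names and the fixed noise vector are likewise assumptions made only for this sketch.

```python
# Hypothetical 4 x 8 parity-check matrix (NOT the matrix shown in Fig. 2).
H = [
    [1, 1, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1, 1],
]

def syndrome(H, w):
    """Return H w^T over GF(2); all zeros means every parity check holds."""
    return [sum(h * b for h, b in zip(row, w)) % 2 for row in H]

def bpsk(w):
    """Map bits to symbols with x = -1 + 2w, as in the text."""
    return [-1 + 2 * b for b in w]

def hard_decision(r):
    """Recover bits from the sign of each received sample."""
    return [1 if ri > 0 else 0 for ri in r]

w = [1, 1, 1, 0, 0, 0, 0, 0]            # a valid codeword of the toy H
x = bpsk(w)
noise = [0.2, -0.3, -1.4, 0.1, 0.3, -0.2, 0.4, -0.1]  # fixed, illustrative noise
r = [xi + ni for xi, ni in zip(x, noise)]             # equation (2)

print(syndrome(H, w))                   # [0, 0, 0, 0]: equation (1) holds
print(syndrome(H, hard_decision(r)))    # [0, 1, 1, 0]: bit 2 was flipped by noise
```

In this toy run the noise flips only the third bit, so exactly the two checks that include that bit report a violation.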

[0043] Belief propagation (BP) is the dominant decoding algorithm for LDPC codes and operates by exchanging messages between received symbols (variable nodes) and parity checks (check nodes) in the Tanner graph of the LDPC code. Each variable node is initialized with the log-likelihood ratio (LLR) of the received signal, indicating its likelihood of being +1 or -1. In an iterative loop, variable nodes send updated likelihoods to check nodes, which process them to enforce parity-check constraints and return their refined likelihoods. The process terminates once all parity checks are satisfied or a maximum iteration count is reached.
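As a small illustration of the LLR initialization mentioned above, the sketch below assumes an additive white Gaussian noise channel with noise standard deviation `sigma` and equiprobable BPSK symbols. The disclosure only refers to a generic noise term, so the channel model, the function name, and the sample values are assumptions of this example.

```python
def llr_bpsk_awgn(r, sigma):
    """LLR ln(P(x=+1 | r) / P(x=-1 | r)) for equiprobable BPSK symbols in AWGN.

    Under these assumptions the LLR reduces to 2 * r / sigma**2.
    """
    return [2.0 * ri / sigma ** 2 for ri in r]

llrs = llr_bpsk_awgn([1.2, -0.4, 0.05], sigma=0.8)
signs = [1 if value > 0 else 0 for value in llrs]   # hard information
confidence = [abs(value) for value in llrs]         # soft information
```

The sign of each LLR carries the hard decision, while its magnitude carries the confidence that hard decoding discards.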

[0044] While belief propagation in conventional digital hardware benefits from technology scaling, it has challenges meeting high throughput and low error rates. Achieving high decoding parallelism and large LLR precisions incurs significant hardware overheads. Additionally, belief propagation’s iterative nature hampers throughput due to the need for multiple iterations to converge.

[0045] Recent efforts map LDPC decoding and other higher-k problems to dynamical systems by first reducing them into Ising QUBO formulations. The LDPC decoding problem is reduced to an Ising model with two-body interactions using the Chimera graph topology, enabling mapping to the D-Wave quantum annealer. However, this reduction from higher order to lower order graphs introduces inefficiencies, requiring nearly five physical spins per LDPC variable in existing works, compared to just one physical spin per variable in this work. A promising approach is seen in other dynamical system based solvers, where an all-to-all Ising machine tackles 20-variable 3-SAT problems. Nevertheless, since the Ising machine (with 50 physical spins) can only directly solve problems with two-body interactions, it requires the 3-SAT problem, which includes three-body interactions, to be decomposed into smaller QUBO-based sub-problems and solved iteratively.

[0046] Ising machines that rely on two-body interactions were originally inspired by spin interactions in ferromagnetism (i.e., spin glass). The relevant energy function, the Ising Hamiltonian, describes how spins interact with their neighbors. In its standard two-body form it reads

$$\mathcal{H}(\sigma) = -\sum_{i<j} J_{i,j}\, s_i s_j$$

where $\sigma = [s_1, s_2, \ldots, s_N]$ denotes the vector of N spins and $s_i \in \{-1, +1\}$ represents the spin of the ith element, corresponding to the discrete variables in an optimization problem. The coupling strength $J_{i,j}$ dictates whether two spins $s_i$ and $s_j$ tend to align (positive $J_{i,j}$) or adopt opposite orientations (negative $J_{i,j}$) in equilibrium.

[0047] The proposed combinatorial optimization problem engine is not derived from the Ising Hamiltonian but instead utilizes an energy function specifically designed for LDPC decoding. The system directly maps the COP graph (H-matrix) to implement native CT sixth-order spin interactions, improving upon the three-body interactions demonstrated in both continuous time and discrete-time solvers. Neither preprocessing nor decomposition is required for solving higher order combinatorial optimization problems with this architecture. The order of the spin interactions, k, is defined by the row weight (i.e., the number of ones in each row) of the H-matrix, which is usually much larger than two for LDPC decoding (k = 6 in this work, Fig. 3).

[0048] A new LDPC-based system energy function is proposed to convert LDPC decoding to a combinatorial optimization problem that can be directly and efficiently mapped to a dynamical system of spins. The number of parity checks associated with each spin encodes the key metric of how far the dynamical system’s state is from a correct solution for an error-corrupted data frame. This system state is captured by the proposed energy function, $E_{LDPC}$, for an LDPC code with M parity checks and a block length of N, where $s_i \in \{-1, +1\}$ represents the spin of the ith element that directly maps one LDPC variable. Note that the energy function exhibits equal global minima for all valid LDPC codewords (see Fig. 4).
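The exact expression for the energy function is not reproduced in this text. As a hedged illustration of the stated property (equal global minima at every valid codeword), the sketch below assumes one simple candidate: an energy that counts violated parity checks, computed from the product of the spins in each check. The matrix `H`, the mapping s = 1 - 2w, and all function names are assumptions of this example and should not be read as the disclosed formula.

```python
from math import prod

# Hypothetical 4 x 8 parity-check matrix with even row weight (not from the disclosure).
H = [
    [1, 1, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1, 1],
]

def spins(w):
    """Map bits to spins with s = 1 - 2w, so bit 0 -> +1 and bit 1 -> -1."""
    return [1 - 2 * b for b in w]

def check_products(H, s):
    """Product of the spins taking part in each check: +1 satisfied, -1 violated."""
    return [prod(sj for h, sj in zip(row, s) if h) for row in H]

def energy(H, s):
    """Assumed energy: the number of violated parity checks (0 at any codeword)."""
    return sum((1 - p) // 2 for p in check_products(H, s))

print(energy(H, spins([0, 0, 0, 0, 0, 0, 0, 0])))  # 0: valid codeword
print(energy(H, spins([1, 1, 1, 0, 0, 0, 0, 0])))  # 0: another valid codeword
print(energy(H, spins([1, 1, 0, 0, 0, 0, 0, 0])))  # 2: one bit in error
```

Each check contributes a k-body term, where k is the row weight (4 in this toy matrix, 6 in the disclosed prototype), which is why no reduction to two-body couplings is needed under this formulation.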

[0049] While the system supports hard spin initialization, precise soft spin initialization significantly improves the solver’s performance, as confirmed by measurement results. The system aims to identify the most likely transmitted binary codeword from a received initial state, r. Initializing spins with the best available prior information significantly enhances convergence accuracy. In this disclosure, the solver spins initialize with soft information (i.e., the analog value of r) rather than hard information (i.e., the sign of r). The hardware’s unique ability to precisely set initial spin phases is leveraged by mapping the analog information from the error-corrupted data frame to the initial oscillator spin phases for soft decoding. This initialization specifies the likelihood of spins being either a +1 or -1. As shown below, initializing the solver with soft information, compared to hard binary initialization, improves error-rate performance and throughput significantly as it initializes the solver closer to the ground state that corresponds to the transmitted codeword.

[0050] Figure 5 illustrates a compute run example comparing soft and hard information initialization in the solver. When initialized with soft information, the solver assigns analog-valued initial spin phases, which accurately represent the likelihood of each bit being 0 or 1. This soft information enables the system to begin evolution closer to the correct solution, significantly increasing the probability of successful convergence to the transmitted LDPC codeword. In contrast, hard information initialization assigns binary values, without fully preserving the confidence level of the received signal. Using only hard information positions the solver further from the correct solution within the energy landscape, increasing the likelihood of convergence to an incorrect codeword that does not match the transmitted one.
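The difference between the two initialization modes can be sketched as follows. This is only an illustration: the 4-bit DAC resolution, the clipping range `r_max`, the linear magnitude-to-code mapping, and the sign convention for the latch bit are all assumptions of this example rather than details taken from the disclosure.

```python
def soft_init(r, dac_bits=4, r_max=2.0):
    """Return (latch_bit, dac_code) for one received sample.

    latch_bit holds the sign of r (hypothetical convention: r > 0 -> 1).
    dac_code quantizes |r|, clipped to r_max, onto the DAC range.
    """
    latch_bit = 1 if r > 0 else 0
    levels = (1 << dac_bits) - 1
    magnitude = min(abs(r), r_max) / r_max
    return latch_bit, round(magnitude * levels)

def hard_init(r, dac_bits=4):
    """Hard initialization keeps only the sign; every spin gets the same code."""
    latch_bit = 1 if r > 0 else 0
    return latch_bit, (1 << dac_bits) - 1

received = [1.2, -0.4, 3.0]
print([soft_init(r) for r in received])  # [(1, 9), (0, 3), (1, 15)]
print([hard_init(r) for r in received])  # [(1, 15), (0, 15), (1, 15)]
```

With hard initialization the weakly received sample (-0.4) is indistinguishable from a strongly received one, which is the loss of confidence information the paragraph above describes.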

[0051] Figure 6 depicts an example embodiment of a relaxation oscillator (RXO) circuit 60. The oscillator circuit 60 is comprised of two capacitors C0, C1; two comparators 62, 63; a current source 64; and a set-reset latch 65. This implementation is merely illustrative and other circuit arrangements for the oscillator circuit are contemplated by this disclosure.

[0052] During operation, the N RXO spin nodes each have binary states $Q_j \in \{0, 1\}$ and internal phases corresponding to the capacitor voltages $V_j \in [0, V_{REF}]$. Each oscillator circuit uses the same fixed charging current, $I_{RXO}$. The oscillator reference voltage, $V_{REF,j}$, determines an oscillator’s spin frequency by setting the threshold voltages for the two capacitors. Soft decoding is enabled by representing both the signs and the analog values of the received signals in the initial spin states. One can initialize the oscillator SR latches to store these signs. Capacitor voltages (i.e., $V_0$ or $V_1$) are initialized to map the initial spin phases that correspond to the received signal’s analog values. Once released, the oscillator spins rapidly evolve from the initial state toward the nearest minimum of the LDPC energy function. This evolution occurs through the modulation of the instantaneous spin frequencies, $f_j$, according to the energy function $E_{LDPC,j}$.

[0053] Each spin’s state $s_j$ corresponds to the SR latch value: $s_j = 1 - 2Q_j$. The frequency $f_j$ changes proportionally to the sum of the associated parity-check violations, as described by $E_{LDPC,j}$. For no parity-check violations, spins oscillate at a nominal frequency corresponding to the nominal reference voltage $V_{nom}$. The spin injection gain, m, is tunable.
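A behavioral sketch of this frequency modulation is given below. It assumes a linear law, f_j = f_nom * (1 + gain * v_j), where v_j is the number of violated checks that include spin j. The linear form, the parameter names `f_nom` and `gain`, their values, and the toy matrix `H` are assumptions chosen for illustration; the text only states that the frequency changes in proportion to the violation count.

```python
# Hypothetical 4 x 8 parity-check matrix (not from the disclosure).
H = [
    [1, 1, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1, 1],
]

def violations_per_spin(H, q):
    """Count, for each spin, the violated checks in which it takes part."""
    # One XOR per row of H: 1 means the check is violated.
    clause = [sum(h * b for h, b in zip(row, q)) % 2 for row in H]
    return [sum(H[m][j] * clause[m] for m in range(len(H)))
            for j in range(len(H[0]))]

def spin_frequencies(H, q, f_nom=1.0, gain=0.1):
    """Assumed linear law: nominal frequency plus gain per violated check."""
    return [f_nom * (1 + gain * v) for v in violations_per_spin(H, q)]

q = [1, 1, 0, 0, 0, 0, 0, 0]            # latch states with one bit in error
print(violations_per_spin(H, q))        # [1, 1, 2, 0, 1, 2, 1, 0]
freqs = spin_frequencies(H, q)          # spins with more violations run faster
```

Spins that sit in no violated check stay at the nominal frequency, while the others speed up and shift phase relative to them.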

[0054] With reference to Figure 7, directly mapping LDPC decoding as a combinatorial optimization problem to the proposed dynamical system requires three elements. First, N oscillator spins with the capability of setting arbitrary initial phases and facilitating higher-k CT spin injections. Second, M digital logic-based clauses for directly computing satisfiability of LDPC check nodes in CT. Third, a crossbar-based feedback system enabling k-body spin couplings. The crossbar connection points between spins and clauses directly reflect the locations of ones in the sparse H-matrix.

[0055] The dynamical system of coupled spins directly maps the proposed energy function $E_{LDPC}$. This function exhibits an energy landscape with multiple equal global minima, each representing a valid LDPC codeword. The initialization of spins maps the received LLR signs to binary spin states, while the analog LLR values directly map to initial spin phases $\varphi \in [0, \pi]$. Soft initialization positions the system closer to the correct LDPC codeword’s ground state. After initialization, spins interact and rapidly converge to a ground state, leveraging natural system dynamics without requiring annealing. Spins increase frequency proportionally to the number of violated parity checks. For example, a spin with more parity-check violations will oscillate at a higher frequency at that moment. This interaction ultimately results in spins resolving their relative phases to either 0 or $\pi$, representing 0 and 1. Annealing is unnecessary, as spins continuously adjust their frequency and phase if clauses are violated, ensuring that the system converges to a ground state.

[0056] Figure 8 depicts an example implementation for the proposed solver 80. The solver 80 is comprised generally of an array of oscillator circuits 81, a series of digital logic gates 82, and a digital controller 83. Each oscillator circuit outputs a signal and maps to a different bit in the message, such that relative phase of the signal output by a given oscillator circuit indicates a value of the bit in the message. The series of digital logic gates form part of a feedback system as will be further described below.

[0057] Briefly, the spins of the array of oscillator circuits 81 directly map combinatorial optimization problem variables, while the feedback system implements six-body interactions among groups of spins with shared LDPC clauses. Each spin directly represents an LDPC variable, and each XOR-6 gate represents an LDPC clause. The feedback system directly maps the H-matrix. An error-sense circuit flags the completion of a compute run once all parity checks are satisfied. Run completion initiates the sampling of the spin phases to read out the decoded message. The digital controller 83 handles only configuration and interfacing, while the entire compute runs within the analog core.

[0058] More specifically, the series of digital logic gates 82 are interconnected between inputs of the oscillator circuits in the array of oscillator circuits 81 and outputs of the oscillator circuits in the array of oscillator circuits 81. Each digital logic gate in the series of digital logic gates 82 corresponds to a row in the parity check matrix. In the example embodiment, the digital logic gates 82 are further defined as exclusive OR gates, where each input of a given exclusive OR gate couples to an output of a corresponding oscillator circuit as specified by the parity check matrix. While reference is made in this disclosure to exclusive OR gates, other types of digital logic gates fall within the broader aspects of this disclosure.

[0059] The digital controller 83 is interfaced with the array of oscillator circuits 81. During operation, the digital controller 83 initializes state of each oscillator circuit in the array of oscillator circuits 81 to a corresponding bit value in a received message. That is, the digital controller 83 initializes state of each oscillator circuit in the array of oscillator circuits 81 using likelihood information associated with the corresponding bit value in the received message. Taking advantage of the solver’s unique ability to set arbitrary initial phases, the received, error-corrupted data frame is mapped to the initial oscillator phases (soft decoding). The reliability of a received bit (i.e., soft information) specifies its likelihood of being either a 0 or a 1. By initializing the spins with the best possible information, they are more likely to converge on the correct solution. Encoding soft information as the initial solver state improves both BER/FER performance and throughput significantly.

[0060] Each oscillator’s instantaneous frequency changes based on the sum of its associated parity check violations. This allows the system to concurrently search for the ground state (lowest energy). Convergence is achieved when all the parity checks are satisfied and a corrected codeword is found. It is envisioned that this architecture is easily adaptable to a wide range of other combinatorial optimization problems like Max-Cut or k-SAT. The relaxation oscillator phases are preferably initialized with soft inputs, instead of binary inputs as in hard decoding, to improve decoding performance. Once released, the feedback system computes parity checks in continuous time and adjusts the phases to minimize the system’s energy. The oscillator phases evolve until all parity checks are satisfied, at which point the phase difference between all compute nodes is either 0 or 180 degrees. The final oscillator phases represent the decoded data frame.

[0061] The solver architecture is demonstrated by implementing a decoder for a regular (96, 48) code. The (96, 48) LDPC code is selected as a representative rate-1/2 code for an initial hardware demonstration. Rate-1/2 codes are widely used in practical applications, including Wi-Fi (IEEE 802.11n/ac/ax). Additionally, behavioral simulations confirm that the solver operates correctly across various LDPC codes (e.g., Wi-Fi and 10GBASE-T) and code rates, even at higher code rates, where constraints become more tightly coupled.

[0062] The feedback system is a twin-crossbar architecture that facilitates bidirectional connectivity between spins and clauses for six-body spin couplings, and enables one-to-one variable-to-spin mapping. The crossbar architecture includes an input crossbar array 86 coupling outputs of the oscillator circuits in the array of oscillator circuits 81 to inputs of the digital logic gates in the series of digital logic gates 82, as well as an output crossbar array 87 coupling outputs of the digital logic gates in the series of digital logic gates 82 to inputs of the oscillator circuits in the array of oscillator circuits 81, such that output of the digital logic gates affects threshold voltage of the oscillator circuits in the array of oscillator circuits. In one embodiment, the interconnects in the input crossbar array and the output crossbar array are fixed. In other embodiments, the interconnects may be programmable to handle other codes and different applications.

[0063] The input crossbar array 86 connects spin outputs to the 48 six-input XOR gates, which are responsible for executing the parity checks. The output crossbar array 87 routes the clause outputs back to the corresponding spins. This architecture fundamentally differs from systems supporting two-body or three-body spin interactions, as it enables native solving of higher order combinatorial optimization problems (k = 6). Further extending the interaction order requires increasing XOR gate inputs and modifying the twin-crossbar structure to accommodate additional interconnects between spins and clauses.
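The way both crossbars follow from the H-matrix can be sketched with two connection lists. The example below uses a hypothetical 4 x 8 matrix (row weight 4) rather than the (96, 48) code of the prototype, and the function and variable names are assumptions made for illustration.

```python
# Hypothetical 4 x 8 parity-check matrix (not the prototype's (96, 48) code).
H = [
    [1, 1, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1, 1],
]

def crossbar_maps(H):
    """Derive both crossbar connection lists from the ones in H.

    clause_inputs[m] lists the spins wired to the XOR gate of check m
    (input crossbar); spin_feedback[j] lists the XOR gates whose outputs
    are routed back to spin j (output crossbar).
    """
    clause_inputs = [[j for j, h in enumerate(row) if h] for row in H]
    spin_feedback = [[m for m, row in enumerate(H) if row[j]]
                     for j in range(len(H[0]))]
    return clause_inputs, spin_feedback

clause_inputs, spin_feedback = crossbar_maps(H)
orders = {len(inputs) for inputs in clause_inputs}   # interaction order k per check
connections = sum(len(inputs) for inputs in clause_inputs)
```

Both crossbars contain the same number of connection points, one per nonzero entry of H. For the prototype's numbers, 48 checks with six inputs each give 48 x 6 = 288 connection points per crossbar, an average of three feedback connections for each of the 96 spins.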

[0064] An advantage of the proposed RXO-based spins is that they allow for digital signal coupling, which does not load the oscillators, enabling long-range and high-order coupling. In contrast, resistive coupling directly loads sensitive oscillation nodes, limiting both the range and the order of coupling. In addition, digital feedback is resilient to noise since it operates with well-defined digital signal levels.

[0065] The feedback system monitors clauses in continuous time and provides feedback signals that push the oscillators to the most likely state. The architecture is highly adaptable, allowing reconfiguration of the input and output crossbars, as well as scaling of the spin and clause arrays, to support different codes and H-matrices. Additionally, features such as precise spin phase initialization (i.e., with soft initialization), optimal mapping (i.e., one COP variable corresponds to one physical spin), full parallelism, and robust feedback enhance scalability, enabling support for longer codes.

[0066] Figure 9 shows a custom compute framework that manages the compute process. At initialization, after the oscillators are reset, their spin states are precisely set based on the received LLRs. Subsequently, once the spins are released, they interact and evolve rapidly toward a global energy minimum (solver convergence). Once convergence is reached, all parity checks are satisfied, and the sampling system registers the final solution. The final solution is encoded in the locked oscillator phases, which have relative phases of either 0° or 180°. In this compute framework, the digital controller 83 handles configuration and interfacing during the “Compute setup” and “Solution readout” steps, while the entire LDPC compute runs within the solver core, without pre- or post-processing.

[0067] One of the key design considerations is the use of a twin-capacitor relaxation oscillator (RXO) circuit as the core of the proposed CT spin as shown in Figure 10. The twin-cap relaxation oscillator alternately integrates current on one capacitor while resetting the other (see Fig. 11). A CT comparator detects when the charging capacitor exceeds a reference voltage and sets (or resets) a latch, which stores the oscillator’s binary state. The relaxation oscillator provides crucial advantages for physics-inspired computing, especially when compared to ring oscillators used in Ising machines with two-body interactions.

[0068] Spin injection based on the manipulation of comparator threshold voltages enables accurate coupling with higher order interactions. ROSC-based spins typically rely on resistive interconnects, which allow for configurable coupling coefficients, yet remain constrained to two-body interactions.

[0069] The ability to set the latch state and capacitor voltages (soft information) offers unprecedented capabilities. In most combinatorial optimization problems, the solution is fully encoded in spin couplings, whereas in the proposed solver, it is determined by both soft initial conditions and the energy function, leveraging precise control over the initial state.

[0070] Current source charging in the relaxation oscillator significantly reduces sensitivity to PVT and supply variation, a common issue with ROSCs.

[0071] A digital spin feedback interface, incorporating a feedback DAC (FDAC), enables CT spin injection with six-body spin interactions and purely digital spin coupling that is robust and does not load the oscillator. Adjusting the relaxation oscillator reference allows precise control of spin dynamics, distinguishing it from ROSC-based spins. While ROSC-based spins traditionally do not require D/A conversion, they remain limited to two-body spin interactions.

[0072] The symmetric oscillator structure delivers a stable frequency and duty cycle, which enhances the accuracy of the system. A large frequency or duty cycle error can impede convergence to solutions and phase readout.

[0073] The proposed spin implements the three key variables essential to the dynamical system: the spin’s internal phase with the capacitor voltage, the binary spin state with the SR latch state, and the spin frequency with the comparator threshold voltage. The soft DACs (SDACs) set the initial spin phases to match the received channel LLRs. The feedback system utilizes the binary spin states to generate CT spin injection signals. For each spin, this feedback drives its feedback interface (FDAC), which integrates the summation of feedback signals to modulate the oscillator frequency in continuous time. Modulating the comparator threshold with the FDAC facilitates six-body interactions. The CDAC and the FDAC occupy 23% and 5% of the spin’s area, respectively. Moreover, this design’s primary custom analog component is the oscillator spin core, which requires only a one-time design, as all spins are identical.

[0074] During initialization, the SDAC connects to the RXO capacitors to set the initial phase of each spin (see Fig. 12). The 3-bit CDAC implementation allows efficient loading of the LLR values to the oscillator. The CDAC’s binary-weighted capacitors are initially reset to VSS and then connected to VDD or VSS based on the quantized LLR value. Finally, the CDAC and RXO capacitors charge-share to establish the initial phase.
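
The initialization sequence above can be approximated with an ideal charge-conservation model. The Python sketch below is only an illustration under stated assumptions: the LLR clipping range, supply voltage, unit capacitance, RXO capacitance, the sign convention (positive LLR favors bit 0), and the assumption that the RXO capacitor starts at VSS are all hypothetical, and parasitics are ignored.

```python
def quantize_llr(llr, max_abs_llr=8.0, bits=3):
    """Split an LLR into a hard-decision bit and an unsigned soft magnitude code."""
    levels = 2 ** bits - 1
    magnitude = min(abs(llr), max_abs_llr) / max_abs_llr
    code = round(magnitude * levels)
    hard_bit = 0 if llr >= 0 else 1  # assumed convention: positive LLR favors bit 0
    return hard_bit, code


def charge_share_voltage(code, vdd=0.9, c_unit=1.0, c_rxo=8.0, bits=3):
    """Ideal charge sharing between the CDAC capacitors and one RXO capacitor.

    Capacitor b of the CDAC is charged to vdd when bit b of `code` is set and
    stays at VSS (0 V) otherwise; the RXO capacitor is assumed to start at 0 V.
    All component values are hypothetical.
    """
    cdac_caps = [c_unit * 2 ** b for b in range(bits)]
    charge = sum(c * vdd for b, c in enumerate(cdac_caps) if (code >> b) & 1)
    return charge / (sum(cdac_caps) + c_rxo)


if __name__ == "__main__":
    for llr in (-6.0, 0.5, 8.0):
        hard_bit, code = quantize_llr(llr)
        print(llr, hard_bit, code, charge_share_voltage(code))
```

In this model the hard-decision bit would set the latch state and the 3-bit code would set the capacitor voltage, so a more reliable bit starts with a larger initial voltage. Whether a larger magnitude maps to a higher or lower voltage in the actual circuit is not stated in the text.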

[0075] The FDACs sum digital spin injection signals to enable six-body coupling and generate corresponding relaxation thresholds. After the spins are released, the continuous time spin injection signals from the feedback system drive the integrated FDACs for each spin to correct bit errors. Avoiding full adders, which incur a larger footprint, the feedback signals and their complements directly drive switches to connect the comparator threshold with voltages corresponding to the sum of the parity-check violations of each spin. In this way, the FDACs modulate the spin frequencies. The nominal frequency corresponds to zero parity-check violations, and each additional parity-check violation results in a positive increment in the frequency. The proposed FDAC is resource-efficient and allows for precise tuning of the analog spin injection weights.
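
To make the relationship between parity-check violations and spin frequency concrete, the sketch below uses an idealized relaxation-oscillator model in which each half period is the time needed to charge one capacitor from 0 V to the comparator threshold. The linear threshold step, the choice of lowering the threshold to raise the frequency, and every numeric value are assumptions for illustration; the text only states that each additional violation produces a positive frequency increment.

```python
def fdac_threshold(violations, vth_nominal=0.45, step=0.03):
    """Comparator threshold selected by the violation count (assumed linear)."""
    return vth_nominal - violations * step


def rxo_frequency(vth, i_charge=10e-6, cap=50e-15):
    """Ideal twin-capacitor RXO frequency for a given comparator threshold.

    Each half period charges one capacitor from 0 V to vth with a constant
    current, so half_period = cap * vth / i_charge. Comparator and latch
    delays are ignored; component values are hypothetical.
    """
    half_period = cap * vth / i_charge
    return 1.0 / (2.0 * half_period)


if __name__ == "__main__":
    for n in range(4):  # assumes at most three checks per spin
        vth = fdac_threshold(n)
        print(n, round(vth, 3), round(rxo_frequency(vth) / 1e6, 1), "MHz")
```

Because the thresholds are selected directly by the feedback signals, no arithmetic on the violation count is needed in hardware; the function above only mirrors the resulting mapping.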

[0076] The primary nonidealities affecting the RXO spin relate to oscillation-frequency mismatch. Behavioral simulations indicate an upper limit to inter-spin frequency and duty-cycle mismatch, beyond which spins fail to lock into a stable equilibrium state. To mitigate this performance degradation, one can identify the primary sources of spin frequency mismatch: the current source and the comparator. The transistors in both spin circuits are carefully sized with sufficient matching margins to minimize inter-spin frequency mismatch. One can also size the RXO twin-capacitors with sufficient matching margin to alleviate duty-cycle mismatch. The bias circuitry for the entire oscillator array was also carefully sized to minimize mismatches in the oscillator charging current and comparator biases.

[0077] Referring to Figure 13, the feedback system generates CT feedback to the RXO spins, directly implementing the LDPC energy function as described above. The fully digital design ensures compactness and scalability, making it suitable for larger codes. The system directly maps the sparse H-matrix onto a pair of crossbars, where each vertical and horizontal intersection represents an H-matrix cell. The system effectively encodes the six-body spin interactions by placing vias at intersections corresponding to matrix entries with ones.

[0078] The crossbar design is tailored for scalability. Note that the input crossbar has three times more horizontal wires than vertical, while the output crossbar has six times more vertical wires than horizontal. This configuration allows each clause to receive six distinct inputs from different spins and route its output back to those same spins. For this solver, a fixed crossbar architecture is implemented as an initial prototype. However, the system can adapt to variable-rate LDPC codes for different channel conditions by using programmable switches in place of fixed connections.

[0079] To streamline the design process, a custom SKILL script automates the layout generation. The script accepts an arbitrary H-matrix as input, instantiates the twin crossbars and clause structures, and places vias at the corresponding crossbar intersections. This allows full automation of the feedback system design, reducing design effort while ensuring accuracy, robustness, and scalability for longer LDPC codes. To scale the design to longer codes, the number of oscillator spins and XOR gate inputs must be increased, while the twin crossbars need to be expanded to accommodate additional spin-clause interconnects. Furthermore, since the feedback system is purely digital, it should scale well to advanced technology nodes like FinFET technologies, where propagation delay is significantly reduced.
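
The via-placement step can be sketched in a few lines. The example below is written in Python rather than SKILL and is not the script described above; the wire-numbering scheme, the orientation of the wires, and the small 4 x 8 H-matrix (row weight six, column weight three) are hypothetical choices made so that the wire-count ratios of the twin crossbars are easy to check.

```python
def place_vias(h_matrix):
    """Return via coordinates for the input and output crossbars of an H-matrix.

    Assumed orientation:
      input crossbar  - one vertical wire per spin, one horizontal wire per
                        clause input;
      output crossbar - one horizontal wire per clause, one vertical wire per
                        spin feedback input.
    Each via is a (horizontal_wire, vertical_wire) pair.
    """
    n_spins = len(h_matrix[0])
    col_weights = [sum(row[j] for row in h_matrix) for j in range(n_spins)]
    # Index of the first feedback wire reserved for each spin.
    fb_base = [sum(col_weights[:j]) for j in range(n_spins)]
    fb_used = [0] * n_spins

    input_vias, output_vias = [], []
    for clause, row in enumerate(h_matrix):
        for spin, entry in enumerate(row):
            if entry:
                input_vias.append((len(input_vias), spin))
                output_vias.append((clause, fb_base[spin] + fb_used[spin]))
                fb_used[spin] += 1
    return input_vias, output_vias


# Hypothetical toy H-matrix: every row has six ones, every column has three.
TOY_H = [
    [0, 0, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0],
]

if __name__ == "__main__":
    in_vias, out_vias = place_vias(TOY_H)
    print(len({h for h, _ in in_vias}), "horizontal /",
          len({v for _, v in in_vias}), "vertical wires (input crossbar)")
    print(len({v for _, v in out_vias}), "vertical /",
          len({h for h, _ in out_vias}), "horizontal wires (output crossbar)")
```

With this toy matrix the input crossbar ends up with three times as many horizontal wires as vertical ones, and the output crossbar with six times as many vertical wires as horizontal ones, matching the ratios stated for the prototype.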

[0080] In the current implementation, increasing the number of variables (N) causes the feedback system to scale as N², making it the primary bottleneck in scaling, whereas clauses and feedback DACs scale as N. The SDAC resolution does not need to increase for larger decoders, as the 3-bit soft information resolution provides sufficient performance improvement. To enhance the scalability of future iterations of this design, another approach would be to instantiate two adjacent spin arrays, each with its dedicated crossbar system placed below them. The clauses can be implemented at their vertical boundary. This structure improves scalability, as each crossbar and the corresponding feedback delay now scale on the order of N²/2.

[0081] The propagation delay of the feedback system is the main limitation of throughput. Above a certain amount, delay has a detrimental effect on the ability of the system to solve input problems correctly. A large feedback delay prohibits the spins from locking to correct states and results in chaotic oscillations. From behavioral simulations, one can estimate the upper limit of feedback delay to be approximately 10% of the nominal oscillation period. Feedback delay variation also prevents the system from reliably converging to solutions. Large delay variations cause some spins to receive feedback much faster than others, leading to chaotic behavior.

[0082] In Figure 14, a layout technique is introduced, specifically designed for the twin crossbars, to optimize the delay and delay variation of the feedback system. First, to improve the delay of the crossbars, instead of using just two metal layers for the crossbars (one for the vertical and one for the horizontal wires), one can use alternating metal layers for the horizontal lines to reduce the coupling capacitance between adjacent horizontal wires. This reduction is verified by extracting the crossbar parasitics. Since multiple parallel vias are placed at intersection points, there is no significant change in the effective resistance when comparing the two approaches. Furthermore, all crossbar wires are extended to balance propagation delays. The via placement within the crossbars introduces mismatch in each path’s RC load. However, extending metal wires reduces RC load variation and, thus, delay variation compared to designs with variable wire lengths. These layout techniques effectively reduce the maximum delay of the data paths in the feedback system by 22% and the maximum delay variation between different paths by 40%.

[0083] Figure 15 depicts an integrated sampling system that provides a robust, noise-free readout of the oscillator array’s state at the end of each compute cycle. Due to non-idealities such as frequency variation and loop delay, even at equilibrium, the spin phases exhibit slight deviations from ideal 0° or 180° differences between them, causing variations in edge transition times. To prevent sampling at noisy edge transition points, redundancy is introduced by capturing six samples in one oscillation period. The sampling system comprises a delay-locked loop (DLL), a sampling controller, and output data registers. The DLL generates six clock phases to enable sampling at six distinct points in an oscillation period. The DLL reference is derived from a dummy nominal-frequency RXO so that the DLL frequency closely matches the nominal array frequency. The DLL accommodates a broad frequency range from 0.1 to 300 MHz. A custom clock tree evenly distributes the clock phases to the 96 6-bit samplers (one for each spin).

[0084] The error-sense circuit determines whether all 48 clauses are satisfied by performing an OR operation on their outputs, generating a 1-bit error flag. After initial convergence, it takes several cycles for the spin phases to fully align to 0° or 180° differences. During this transitory process, the error-sense flag is asserted but with a decreasing duty cycle (i.e., the more aligned the spin phases are, the shorter the duty cycle of the error-sense flag). A pulse counter integrated with the sampling controller is used to prevent early sampling before spin phases align to 0° or 180° with sufficient margin. The counter initiates the sampling process only after detecting a programmable number of rising error-sense transitions. After sampling is triggered, the controller waits for all six DLL rising edges before locking the 96 registers for data readout.
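
The error-sense and pulse-counter behavior can be mimicked with a small discrete-time model. This is only a behavioral sketch: the real flag is a continuous-time signal, and the class name, method names, and the example threshold of three rising edges are hypothetical.

```python
def error_flag(clause_outs):
    """1-bit error flag: OR of all clause (XOR) outputs."""
    return int(any(clause_outs))


class PulseCounter:
    """Counts rising edges of the error flag and fires after `threshold` of them."""

    def __init__(self, threshold):
        self.threshold = threshold  # programmable number of rising transitions
        self.count = 0
        self.previous = 0

    def step(self, flag):
        """Feed one sample of the flag; return True once sampling may start."""
        if flag and not self.previous:
            self.count += 1
        self.previous = flag
        return self.count >= self.threshold


if __name__ == "__main__":
    counter = PulseCounter(threshold=3)
    flag_samples = [0, 1, 0, 1, 1, 0, 1, 0]  # hypothetical error-flag waveform
    print([counter.step(f) for f in flag_samples])
```

Counting rising edges rather than sampling as soon as the flag first clears delays the readout until the flag has pulsed a set number of times, which is the margin the paragraph above describes.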

[0085] In Figure 16, the digital controller enables a fully autonomous solver by managing the oscillator array configuration, loading initial soft values, controlling operations, retrieving results from the sampling system, and measuring the solution time. The controller is only responsible for configuring and interfacing with the solver, while all LDPC decoding computations occur entirely within the solver. An instruction-based architecture is designed to optimize memory usage and reduce the circuit footprint. The controller’s instruction set is designed for flexible flow control, accommodating both chip testing and application needs. The controller accurately measures compute time and supports loop iterations, allowing the solver to re-run if convergence fails before timeout. For a compute run, the controller generates waveforms for launching the decoding process, managing loop iterations, pausing when required, and terminating once decoding is complete. A sequence of up to 256 instructions is used to execute the decoding process. A Quad-SPI (QSPI) interface links the system to a host Teensy microcontroller for efficient communication.

[0086] As proof of concept, extensive evaluations of a prototype were performed to assess its performance accurately across a wide range of SNRs, temperatures, and supply voltages, and for different dies. The chip was evaluated with more than 100 million frame decoding cycles. The measured performance metrics are compared with state-of-the-art BP-based LDPC decoders and recent solvers. As mentioned earlier, LDPC is formulated as a COP and solved natively in the prototype, without the need for pre- or post-processing. The prototype was fabricated in 28-nm CMOS technology and consumes 7.38 mW.

[0087] The prototype’s decoding performance was extensively evaluated to quantify bit error rate (BER) and frame error rate (FER) metrics. The solver’s converged solution during compute cycles is compared with the ideal sent codeword to calculate both metrics. Mismatches in specific bits correspond to bit errors, while mismatches in one or more bits of a frame correspond to frame errors. As is typical for testing decoder performance, one can use a zero-codeword and add noise from an AWGN source, whose noise power reflects a given channel SNR level, to generate the solver inputs. The solver was also tested with randomly generated non-zero codewords, and it exhibits identical performance. The dataset is generated in MATLAB, and the whole measurement process is fully automated. A Teensy microcontroller orchestrates the measurement process by iteratively configuring test runs on the on-chip digital controller and then retrieving the solver’s solution and the precise solution time. FER and BER were evaluated across varying channel SNRs, observing higher error rates at lower SNRs, as expected, due to increased noise in the channel. Additionally, higher SNR measurements require more decoding cycles since decoding error occurrences decrease as SNR increases.
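
The test-vector generation and error counting described above can be sketched as follows. The dataset in the text is generated in MATLAB; this Python version is a stand-in, and it assumes BPSK signaling (bit 0 mapped to +1), that the quoted SNR is Eb/N0, and a code rate of 48/96 = 0.5. The function names are hypothetical.

```python
import math
import random


def zero_codeword_llrs(n_bits, snr_db, code_rate=0.5, rng=random):
    """LLRs for an all-zero codeword sent with BPSK (bit 0 -> +1) over AWGN.

    Assumes snr_db is Eb/N0 in dB, so the noise variance per sample is
    1 / (2 * code_rate * Eb/N0).
    """
    ebn0 = 10.0 ** (snr_db / 10.0)
    sigma2 = 1.0 / (2.0 * code_rate * ebn0)
    sigma = math.sqrt(sigma2)
    received = [1.0 + rng.gauss(0.0, sigma) for _ in range(n_bits)]
    return [2.0 * y / sigma2 for y in received]


def error_rates(decoded_frames, sent_frames):
    """Return (BER, FER) from decoded frames and the codewords that were sent."""
    bit_errors = frame_errors = total_bits = 0
    for decoded, sent in zip(decoded_frames, sent_frames):
        wrong = sum(d != s for d, s in zip(decoded, sent))
        bit_errors += wrong
        frame_errors += wrong > 0  # any wrong bit makes the frame wrong
        total_bits += len(sent)
    return bit_errors / total_bits, frame_errors / len(sent_frames)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for repeatable test vectors
    llrs = zero_codeword_llrs(96, snr_db=4.0, rng=rng)
    hard = [0 if llr >= 0 else 1 for llr in llrs]  # hard decision only, no decoding
    print(error_rates([hard], [[0] * 96]))
```

The hard-decision line only shows how the error counters are fed; in the measurement flow the decoded frame would come from the solver readout.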

[0088] Belief propagation is simulated for the same (96, 48) LDPC code used in the prototype chip to benchmark the proposed oscillator-based decoder heuristic against the state-of-the-art LDPC decoding algorithm. Since error-rate performance depends on both the code and decoding algorithm, comparing the measured BER / FER of the proposed solver against the simulated BP decoder provides a direct benchmark against the leading digital decoding method. The maximum number of BP iterations is set to 10, a commonly used threshold. The measured FER and BER results are illustrated in Fig. 17 and compared with the belief propagation decoding metrics. The prototype consistently outperforms belief propagation, especially in the low-SNR region. For SNRs of 2-5 dB, the solver demonstrates more than three orders of magnitude improved BER compared to belief propagation.

[0089] To demonstrate the advantage of soft decoding, results were measured using a hard decoder configuration of the prototype. For the hard-decision solver initialization, the soft-information bits loaded to the RXO spins were set to zero, effectively disabling the SDACs. The RXO latches were initialized to zero or one based on a binary hard decision of the received LLR. Soft decoding significantly outperforms hard decoding across the entire SNR range for both FER and BER metrics (see Fig. 18). This comparison underlines the benefit of initializing the solver with soft information. The chosen 3-bit LLR resolution provides substantial performance improvement. Further increasing LLR resolution beyond 3 bits could further reduce error rates.

[0090] For COP solvers, two key performance metrics are the convergence rate (CR) and time-to-solution (TTS). CR and TTS were also measured across a range of SNRs. At lower SNRs, TTS increases slightly as the solver requires more time to correct the growing number of channel-induced parity-check violations, while CR degrades for the same reason. The CR is defined as the percentage of times the solver successfully finds a solution that satisfies the clauses defined by the COP, such as the H-matrix in LDPC decoding. In the measurements, 117 ns of compute time (Tcomp) are allocated to the solver. If the system fails to converge to a solution within this time, the solver is re-initialized with the same codeword, and the compute cycle restarts.

[0091] Figure 19 illustrates the CR, comparing the soft-decision and hard-decision solver configurations. In both cases, the CR improves as the SNR increases, eventually saturating at high SNR levels. As with the BER / FER measurements, soft decoding significantly improves performance, with the soft CR being more than eight times higher than the hard CR at an SNR of 2 dB. Soft CR saturates near one at an SNR of 5 dB, while hard CR requires an SNR of 8 dB for saturation.

[0092] The solver’s TTS is derived from extensive measurement results. While the system’s accuracy is represented by the error-rate metrics, its TTS reflects the time it takes to reach a valid LDPC solution. To calculate TTS, one can account for both scenarios: when the solver converges to a solution with an average solution time, Tconv, and when the solver fails to converge, leading to wasted compute time equivalent to the allocated solver compute time Tcomp. To calculate the total time required to reach a valid LDPC solution, add the time spent on restarts, given by Tcomp((1/CR) - 1), where (1/CR) - 1 is the expected number of restarts, to the average solution time Tconv when the solver converges. This relationship can be expressed by the following equation:

TTS = Tconv + Tcomp((1/CR) - 1)
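
As a worked illustration of the relationship described in this paragraph, the short Python sketch below adds the restart time Tcomp((1/CR) - 1) to the average solution time Tconv. The 117 ns compute window is taken from the measurements described above; the Tconv value and the convergence rates are hypothetical placeholders, not measured results.

```python
T_COMP_NS = 117.0  # allocated compute time per attempt, from the measurements


def time_to_solution(t_conv, t_comp, convergence_rate):
    """TTS = t_conv + t_comp * (1 / CR - 1), with CR given as a fraction in (0, 1]."""
    if not 0.0 < convergence_rate <= 1.0:
        raise ValueError("convergence_rate must be in (0, 1]")
    expected_restarts = 1.0 / convergence_rate - 1.0
    return t_conv + t_comp * expected_restarts


if __name__ == "__main__":
    t_conv_ns = 30.0  # hypothetical average solution time
    for cr in (1.0, 0.5, 0.1):  # hypothetical convergence rates
        print(cr, time_to_solution(t_conv_ns, T_COMP_NS, cr), "ns")
```

When CR equals one no restarts are expected and TTS reduces to Tconv; as CR drops, the restart term dominates.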

[0093] The measured TTS is shown in Figure 20. While soft initialization enables stable TTS across the entire SNR range, hard solver initialization results in significantly worse TTS for lower SNR values. The comparison between soft and hard decoder configurations highlights the important role of soft initialization. TTS includes the compute setup (three clock cycles), the asynchronous compute, and the readout time (see Fig. 9).

[0094] The foregoing description of the embodiments has been provided for purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure. Individual elements or features of a particular embodiment are generally not limited to that particular embodiment, but, where applicable, are interchangeable and can be used in a selected embodiment, even if not specifically shown or described. The same may also be varied in many ways. Such variations are not to be regarded as a departure from the disclosure, and all such modifications are intended to be included within the scope of the disclosure.

Claims

Attorney Docket No. 2115-008420-WO-POA

What is claimed is:

1. A system for performing error correction of a message, comprising: a parity check matrix for an error correction code; an array of oscillator circuits, where each oscillator circuit outputs a signal and maps to a different bit in the message, such that relative phase of the signal output by a given oscillator circuit indicates a value of the bit in the message; a series of digital logic gates interconnected between inputs of the oscillator circuits in the array of oscillator circuits and outputs of the oscillator circuits in the array of oscillator circuits, where each digital logic gate in the series of digital logic gates corresponds to a row in the parity check matrix; and a digital controller interfaced with the array of oscillator circuits.

2. The system of claim 1 wherein the digital controller operates to initialize state of each oscillator circuit in the array of oscillator circuits to a corresponding bit value in a received message.

3. The system of claim 1 wherein the initialized state of each oscillator circuit in the array of oscillator circuits is based on log-likelihood ratio of the corresponding bit value in the received message.

4. The system of claim 2 wherein the digital controller further operates to initialize state of each oscillator circuit in the array of oscillator circuits using likelihood information associated with the corresponding bit value in the message.

5. The system of claim 4 wherein each oscillator circuit includes at least one capacitor and the digital controller initializes the state of a given oscillator circuit by applying a voltage across the at least one capacitor, where the magnitude of the voltage correlates to the likelihood information for the bit mapped to the given oscillator circuit.

6. The system of claim 5 wherein the digital controller applies the voltage across the at least one capacitor via a digital to analog converter.

7. The system of claim 2 wherein each oscillator circuit in the array of oscillator circuits is further defined as a relaxation oscillator and includes a set-reset latch and at least one capacitor, wherein the digital controller initializes the state of the set-reset latch and applies a voltage via a digital to analog converter to the at least one capacitor.

8. The system of claim 1 further comprises an input crossbar array coupling outputs of the oscillator circuits in the array of oscillator circuits to inputs of the digital logic gates in the series of digital logic gates.

9. The system of claim 1 further comprises an output crossbar array coupling outputs of the digital logic gates in the series of digital logic gates to inputs of the oscillator circuits in the array of oscillator circuits, such that output of the digital logic gates affects threshold voltage of a comparator in the oscillator circuits in the array of oscillator circuits.

10. The system of claim 9 wherein each oscillator circuit in the array of oscillator circuits includes a digital to analog converter configured to receive output from the digital logic gates.

11. The system of claim 1 wherein the digital logic gates are further defined as exclusive OR gates, where each input of a given exclusive OR gate couples to an output of a corresponding oscillator circuit as specified by the parity check matrix.

12. The system of claim 2 further comprises a readout circuit interfaced with the array of oscillator circuits and operates to read out state of each oscillator circuit in the array of oscillator circuits in response to a trigger signal, where states of oscillator circuits represent corrected bits of the received message.

13. The system of claim 12 further comprises an error sense circuit configured to compare states of each oscillator circuit in the array of oscillator circuits to the parity check matrix and output an error sense signal based on the comparison, wherein the trigger signal is generated after detecting a predefined number of transitions in the error sense signal.

14. The system of claim 12 wherein the readout circuit samples state of each oscillator circuit multiple times during an oscillation period.