Molecular dynamics time series prediction method and system based on equivariant graph neural network

By using an autoregressive time series model based on an equivariant graph neural network, the state of a molecular dynamics system can be predicted directly, resolving the trade-off between efficiency and accuracy in traditional methods. This enables efficient and accurate dynamic simulation, applicable to complex physical processes on large spatiotemporal scales.

CN122245458A · Pending · Publication Date: 2026-06-19 · TONGJI UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: TONGJI UNIV
Filing Date: 2026-02-28
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing molecular dynamics methods present a trade-off between accuracy and efficiency. High-cost ab initio molecular dynamics (AIMD) cannot be used for complex physical processes on large spatiotemporal scales, while classical empirical potentials have limited accuracy and poor transferability, and machine learning potentials require a large amount of training data and are computationally expensive.

Method used

By employing an autoregressive time series model based on an equivariant graph neural network, the dynamic evolution of an atomic system is recast as an end-to-end time series prediction task. The system state is predicted directly under built-in physical constraints, bypassing the traditional force calculation steps, and long-term dynamic trajectories are generated autoregressively.

Benefits of technology

It achieves efficient and accurate dynamic simulation, reduces dependence on expensive training data, supports larger time steps, generates trajectories that are highly consistent in statistical physical properties, improves computational efficiency and has high physical fidelity.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a molecular dynamics time series prediction method based on equivariant graph neural networks. It reconstructs the dynamic evolution of an atomic system into an end-to-end time series prediction task, employing an autoregressive time series model based on equivariant graph neural networks as the dynamic evolution operator. It receives the complete phase space information of the atomic system at the current moment as input and, through built-in physical constraints, directly predicts the system's state at future moments. This invention bypasses the time-consuming energy and force calculations of traditional methods and significantly reduces the amount of training data required. It can generate trajectories with high computational efficiency and high physical fidelity, demonstrating superior efficiency compared to existing ab initio and machine learning potential-assisted molecular dynamics methods. This provides a novel and rapid approach for time-domain statistical physics simulations and the study of complex problems.

Description

Technical Field

[0001] This invention belongs to the fields of computational physics and materials science, specifically relating to a molecular dynamics time series prediction method and system (abbreviated as ChronoCast) based on equivariant graph neural networks.

Background Technology

[0002] Molecular dynamics (MD) simulations are the cornerstone connecting microscopic atomic motion with macroscopic physical phenomena. However, existing MD methods face a fundamental contradiction between accuracy and efficiency. On the one hand, ab initio molecular dynamics (AIMD), based on first principles, obtains the interatomic forces by solving quantum mechanical equations, achieving extremely high accuracy. However, its enormous computational cost (such as complex self-consistent electronic structure calculations) limits its application to systems of a few hundred atoms and timescales of picoseconds, making it unsuitable for studying complex physical processes that require large spatiotemporal scales (such as phase transitions and crystal growth).

[0003] On the other hand, classical empirical potentials are fast to evaluate, but they rely on predefined functional forms, resulting in limited accuracy and poor transferability. Machine learning potentials, which have emerged in recent years, predict forces by learning high-dimensional potential energy surfaces, achieving a better balance between accuracy and efficiency. However, machine learning potentials typically require large and diverse training datasets (which still need to be obtained through costly AIMD) to ensure generalization, and their simulation process still follows the traditional paradigm of "computing forces – integrating the equations of motion," which itself still incurs considerable computational cost.

[0004] Therefore, there is an urgent need in this field for a new simulation paradigm that can overcome the computational bottleneck of AIMD while maintaining high physical fidelity, so as to achieve accurate and efficient dynamic simulation of complex systems.

Summary of the Invention

[0005] The technical problem to be solved by the present invention is to provide a molecular dynamics time series prediction method and system based on equivariant graph neural networks, which solves the problems of poor efficiency and fidelity caused by the traditional and high-cost force calculation steps in the prior art.

[0006] To solve the above-mentioned technical problems, the present invention adopts the following technical solution:

[0007] A molecular dynamics time series prediction method based on equivariant graph neural networks recasts the dynamic evolution of an atomic system as an end-to-end time series prediction task. It employs an autoregressive time series model based on equivariant graph neural networks as the dynamic evolution operator, which receives the complete phase-space state information of the atomic system at the current time as input and, through built-in physical constraints, directly predicts the state of the system at a future time.

[0008] The state information includes the position information and velocity information of each atom in the system. The atomic system is constructed as a graph structure in the autoregressive time series model, where atoms are treated as nodes of the graph and the velocity information of the atoms is explicitly input as one of the node features to capture complete phase-space information.

[0009] The autoregressive time series model incorporates E(3)-equivariant physical constraints to ensure that the predicted atomic positions and velocities respond correctly to rigid transformations of the system.

[0010] The autoregressive time series model includes a momentum conservation correction step after predicting the velocity, which ensures the conservation of the total momentum of the system by subtracting the overall velocity of the system's center of mass.

[0011] The autoregressive time series model uses the state predicted at each moment as the input for the next round of prediction, forming a closed loop that continuously generates dynamic trajectories over long periods.

[0012] By sampling and calculating from the generated dynamic trajectory data, the statistical physical properties of the atomic system are obtained, including the radial distribution function, mean square displacement, and vibrational density of states.

[0013] The autoregressive time series model is trained from a reference dynamic trajectory through supervised learning, and the predicted time step can be larger than the original time step of the reference dynamic trajectory.

[0014] The autoregressive time series model framework is scalable, allowing additional physical features to be used as new node feature inputs to simulate richer physical dynamic processes.

[0015] A molecular dynamics time series prediction system based on equivariant graph neural networks includes a processor that, when running, invokes the method to predict the continuous dynamic trajectory of an atomic system.

[0016] A computer-readable storage medium storing computer-readable instructions that, when executed by a processor, invoke the steps of the method.

[0017] Compared with the prior art, the present invention has the following beneficial effects:

[0018] 1. High computational efficiency: This invention bypasses the high-cost self-consistent electronic structure calculation in traditional AIMD and the relatively time-consuming potential energy surface and force calculation in machine learning potentials (MLPs), transforming the simulation process into a highly efficient neural network forward propagation task, which greatly improves the simulation speed.

[0019] 2. High data efficiency: Traditional machine learning potentials (such as the NEP neural network potential) require a large amount of diverse training data (e.g., sampled at multiple different temperatures). By contrast, the method of this invention requires only one continuous dynamic trajectory for training, which significantly reduces the dependence on expensive AIMD training data.

[0020] 3. Supports larger time steps: The method of this invention has been shown to give stable and accurate predictions with time steps larger than the original AIMD time step. This further reduces the number of computational steps required to obtain a trajectory of the same total duration, thus achieving further efficiency improvements.

[0021] 4. High physical fidelity: Although the predicted trajectory diverges from the actual AIMD trajectory in long-term predictions due to chaotic effects (a common characteristic of all dynamic models), the method of this invention, by coupling "position-velocity" features and embedding physical constraints such as equivariance and momentum conservation, ensures that the generated trajectory is highly accurate in a statistical physical sense. Calculations show that the generated trajectory is highly consistent with the AIMD benchmark in key physical properties such as the radial distribution function, mean square displacement, and vibrational density of states, and even outperforms some machine learning potentials.

Description of the Drawings

[0022] Figure 1 is a schematic diagram of the algorithm (ChronoCast model) used in this invention, showing the encoder-processor-decoder workflow.

[0023] Figure 2 compares the algorithm workflow of the present invention (left) with the traditional AIMD workflow (right), highlighting that the present invention bypasses force and energy calculation.

[0024] Figure 3 shows the convergence of the loss function for the single-crystal silicon system during training (a) and the accuracy of the model's single-step prediction of position and velocity (b, c).

[0025] Figure 4 shows the long-term (0-20 ps) autoregressive prediction results for the dynamics of the single-crystal silicon system, comparing the predicted position and velocity of a randomly selected single atom with the actual AIMD values (a, b) and showing the evolution of the corresponding errors over time (c, d); the two magnified insets in (a) give a detailed comparison between the actual and predicted positions at 0-5 ps and 15-20 ps.

[0026] Figure 5 compares key physical properties (a: radial distribution function (RDF), b: mean square displacement (MSD), c: displacement probability distribution, d: vibrational density of states (VDOS)) calculated from trajectories predicted by different methods (AIMD and the method of this invention) in the single-crystal silicon system.

[0027] Figure 6 shows the training convergence (a), position evolution (b), and a comparison of key physical properties (c: displacement probability distribution, d: vibrational density of states (VDOS)) for the complex NbSe3 system.

[0028] Figure 7 compares the computational time costs of different simulation methods applied to the NbSe3 system; the inset compares the time taken by a representative MLP and by the ChronoCast model at different time steps to predict a 30 ps trajectory.

Detailed Implementation

[0029] The structure and working process of the present invention will be further described below with reference to the accompanying drawings.

[0030] The purpose of this invention is to overcome the shortcomings of the prior art and provide a novel small-scale dynamic simulation method based on time series prediction. This method treats dynamic evolution as a prediction task, thereby bypassing the traditional, high-cost force calculation steps and achieving efficient and high-fidelity simulation.

[0031] This invention proposes a new paradigm for dynamic simulation based on time series prediction. The algorithm uses an advanced, physically constrained equivariant graph neural network (EGNN) as the dynamic evolution operator and recasts the dynamic evolution of the system (i.e., the process of integrating Newton's equations of motion) as an end-to-end autoregressive prediction task.

[0032] The algorithm is executed as follows:

[0033] 1) Data Input and Graph Construction: The model first receives the state of the atomic system at the current time, including the positions, velocities, and atom types (implicitly encoded) of all atoms. The system is then constructed as a graph, with each atom treated as a node. Using a fast nearest-neighbor search, the algorithm constructs a directed edge $e_{ij}$ between two atoms $i$ and $j$ whenever the distance between them is less than a preset cutoff radius. The edges encode the relative positions and local geometric information of the atoms. In this way, the three-dimensional data of the atomic system is transformed into initial input features that the graph neural network can process, including scalar features representing atom types and vector features representing positions and velocities. Notably, velocity is explicitly input as a node feature to capture complete phase-space information. To ensure lossless transmission of the underlying physical information, the model input retains the original data directly; for example, velocity features keep their original units (e.g., Å / fs) without mandatory unit conversion.
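The graph construction in step 1) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_edges` is hypothetical, a brute-force O(N²) distance search stands in for the fast nearest-neighbor search, and periodic boundary conditions are ignored.

```python
import numpy as np

def build_edges(positions, cutoff):
    """Return directed edges (i, j) for all atom pairs closer than `cutoff`.

    positions: (N, 3) array of Cartesian coordinates.
    Assumption: brute-force search, no periodic images. A production code
    would use a cell list or k-d tree plus the minimum-image convention.
    """
    positions = np.asarray(positions, dtype=float)
    diff = positions[:, None, :] - positions[None, :, :]   # (N, N, 3): r_i - r_j
    dist = np.linalg.norm(diff, axis=-1)                   # (N, N) pair distances
    mask = (dist < cutoff) & ~np.eye(len(positions), dtype=bool)  # drop self-loops
    src, dst = np.nonzero(mask)
    # Edge list plus the relative vectors and distances carried by each edge.
    return np.stack([src, dst], axis=0), diff[src, dst], dist[src, dst]

# Three atoms on a line; only the first two lie within the 2.0 cutoff.
positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [5.0, 0.0, 0.0]])
edge_index, rel_vec, rel_dist = build_edges(positions, cutoff=2.0)
```

Each undirected neighbor pair appears twice, once per direction, which matches the directed-edge description in the text.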

[0034] 2) Underlying core architecture and information transmission:

[0035] The model employs an encoder-processor-decoder architecture. The initial input features are fed into a stack of L equivariant graph convolutional layers (EGNN) that acts as the core processor. Within each layer, information transfer between nodes is not a simple numerical aggregation; instead, three kinds of aggregated information are constructed: scalar information (rotation-invariant relative distance and velocity features, extracted and aggregated), vector information (relative position and velocity vectors), and higher-order tensor information (to capture complex local environments, the algorithm uses the tensor product of features to construct second-order tensor information, which is incorporated into the coupled position-velocity network). Finally, a dedicated equivariant update module couples this multi-order information to update the atomic position and velocity vectors synchronously.
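To make the three kinds of information concrete, here is a hedged sketch of per-edge scalar, vector, and second-order tensor features. The text does not state which features enter the tensor product, so the outer product of relative position and relative velocity used here is an illustrative assumption, as is the function name `edge_features`.

```python
import numpy as np

def edge_features(rel_pos, rel_vel):
    """Build illustrative scalar, vector and second-order tensor edge features.

    rel_pos, rel_vel: (E, 3) relative position and velocity vectors per edge.
    Assumption: the tensor is the outer product r (x) v; the source elides
    the exact definition.
    """
    rel_pos = np.asarray(rel_pos, dtype=float)
    rel_vel = np.asarray(rel_vel, dtype=float)
    # Rotation-invariant scalars: distance and relative speed.
    scalars = np.stack([np.linalg.norm(rel_pos, axis=-1),
                        np.linalg.norm(rel_vel, axis=-1)], axis=-1)   # (E, 2)
    # Equivariant vectors: the relative position and velocity themselves.
    vectors = np.stack([rel_pos, rel_vel], axis=1)                    # (E, 2, 3)
    # Second-order tensor from the outer (tensor) product.
    tensors = np.einsum("ei,ej->eij", rel_pos, rel_vel)               # (E, 3, 3)
    return scalars, vectors, tensors

# Rotate the inputs by 90 degrees about z and compare the features.
theta = np.pi / 2
Q = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
rel_pos = np.array([[1.0, 2.0, 0.5]])
rel_vel = np.array([[0.1, -0.3, 0.2]])
s, v, T = edge_features(rel_pos, rel_vel)
s_rot, v_rot, T_rot = edge_features(rel_pos @ Q.T, rel_vel @ Q.T)
```

Under the rotation the scalars are unchanged, the vectors rotate with `Q`, and the tensor transforms as `Q @ T @ Q.T`, which is the behavior an equivariant update module can rely on.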

[0036] 3) Time Series Prediction Mapping and Physical Constraint Output: After information transfer and updating through the multi-layer equivariant graph neural network (EGNN), the final node features are fed into an independent decoder module, allowing the model to directly learn the mapping from the current state to the future state, where each state consists of the set of atomic positions and the set of atomic velocities.

[0037] During the model output phase, physical symmetry-based equivariance is enforced, ensuring that the model output has the correct physical response to rotational and translational operations performed on the system corresponding to the input data. Simultaneously, at the output end, the algorithm includes a momentum conservation correction step. In the momentum conservation layer, the predicted atomic velocities are corrected by subtracting the center-of-mass velocity, ensuring that the total momentum of the system remains conserved during evolution.
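The momentum conservation correction described above amounts to removing the mass-weighted mean velocity. A minimal sketch, assuming per-atom masses are available; the function name `remove_com_velocity` and the example masses and velocities are illustrative only.

```python
import numpy as np

def remove_com_velocity(velocities, masses):
    """Subtract the center-of-mass velocity so the total momentum is zero.

    velocities: (N, 3) predicted velocities; masses: (N,) atomic masses.
    """
    velocities = np.asarray(velocities, dtype=float)
    masses = np.asarray(masses, dtype=float)
    # Center-of-mass velocity = total momentum / total mass.
    v_com = (masses[:, None] * velocities).sum(axis=0) / masses.sum()
    return velocities - v_com

# Illustrative masses (amu) and raw predicted velocities (arbitrary units).
masses = np.array([28.0855, 28.0855, 92.906])
v_pred = np.array([[0.01, 0.00, -0.02],
                   [0.03, 0.01,  0.00],
                   [-0.01, 0.02, 0.01]])
v_corr = remove_com_velocity(v_pred, masses)
total_momentum = (masses[:, None] * v_corr).sum(axis=0)
```

This drives the total momentum to zero at every step, which conserves it exactly when the reference trajectory itself has zero net momentum (an assumption of this sketch).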

[0038] 4) Autoregressive Time Series Evolution: In practical applications, when the model needs to generate continuous dynamic trajectories, the algorithm operates in an end-to-end autoregressive manner. The output of the previous prediction is used as the input for the next prediction, successively generating the states at subsequent time steps, and so on, to produce a long-term trajectory.

[0039] This process is repeated continuously, generating statistically valid dynamic trajectories with extremely high efficiency, without the need for real data calibration or human intervention.
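The closed loop in step 4) can be sketched with a generic rollout function. The trained model is not available here, so `toy_step` (an exact harmonic-oscillator propagator) is a stand-in, and both function names are hypothetical.

```python
import numpy as np

def rollout(step_fn, r0, v0, n_steps):
    """Autoregressive rollout: each prediction is fed back as the next input.

    step_fn maps (positions, velocities) -> (positions, velocities).
    Returns arrays of shape (n_steps + 1, N, 3), including the initial state.
    """
    r, v = np.asarray(r0, dtype=float), np.asarray(v0, dtype=float)
    traj_r, traj_v = [r], [v]
    for _ in range(n_steps):
        r, v = step_fn(r, v)          # prediction becomes the next input
        traj_r.append(r)
        traj_v.append(v)
    return np.stack(traj_r), np.stack(traj_v)

def toy_step(r, v, dt=0.1):
    """Stand-in for the trained model: harmonic oscillator with unit frequency."""
    c, s = np.cos(dt), np.sin(dt)
    return c * r + s * v, -s * r + c * v

r0 = np.array([[1.0, 0.0, 0.0]])
v0 = np.array([[0.0, 0.0, 0.0]])
traj_r, traj_v = rollout(toy_step, r0, v0, n_steps=50)
```

With a learned model in place of `toy_step`, no reference data enters the loop after the initial state, which is why single-step errors can accumulate and why the text evaluates long rollouts statistically.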

[0040] 5) Specific performance: This prediction model not only improves the computation speed, but also shows excellent performance in physical fidelity. First, it has high prediction accuracy and physical consistency. In the single-step prediction task, the predicted atomic positions and velocities are highly consistent with the true AIMD values. In the long-term evolution, the mean absolute error (MAE) of position and velocity will not diverge indefinitely, effectively constraining the system within the phase space that conforms to the physical laws.

[0041] Secondly, the model accurately reproduces the statistical mechanical properties. The trajectory output by the model accurately reproduces complex physical properties, including not only the radial distribution function (RDF) but also the mean square displacement (MSD) and the vibrational density of states (VDOS).

[0042] Thirdly, computational efficiency is improved. Taking the generation of a 30 ps trajectory of NbSe3 as an example, the ChronoCast model predicting with a 5 fs step takes only 31% of the time required by the traditional NEP model.

[0043] A key contribution of this invention is that it changes the simulation paradigm. The core of traditional methods (AIMD, MLPs) is to calculate the potential energy surface $E$, obtain the forces through $\mathbf{F} = -\nabla E$, and then update the positions and velocities through numerical integration (such as the velocity Verlet algorithm). The algorithm of this invention completely bypasses the calculation of $E$ and $\mathbf{F}$ and directly learns the dynamic evolution itself.
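For comparison, the two paradigms can be written side by side. The notation below is an assumption introduced for illustration (the source's own symbols are not legible): $\delta t$ is the integrator step, $\Delta t$ the prediction step, and $\mathcal{G}_\theta$ the learned evolution operator.

```latex
% Conventional force-based step (velocity Verlet with integration step \delta t)
\[
\begin{aligned}
\mathbf{F}_i(t) &= -\nabla_{\mathbf{r}_i} E\bigl(\mathbf{r}_1(t), \dots, \mathbf{r}_N(t)\bigr) \\
\mathbf{r}_i(t+\delta t) &= \mathbf{r}_i(t) + \mathbf{v}_i(t)\,\delta t + \frac{\mathbf{F}_i(t)}{2 m_i}\,\delta t^{2} \\
\mathbf{v}_i(t+\delta t) &= \mathbf{v}_i(t) + \frac{\mathbf{F}_i(t) + \mathbf{F}_i(t+\delta t)}{2 m_i}\,\delta t
\end{aligned}
\]

% Learned evolution operator: no energy or force evaluation, prediction step \Delta t
\[
\bigl(\mathbf{R}(t+\Delta t),\, \mathbf{V}(t+\Delta t)\bigr)
  = \mathcal{G}_\theta\bigl(\mathbf{R}(t),\, \mathbf{V}(t)\bigr),
  \qquad \Delta t \ge \delta t
\]
```

The first form needs one energy and force evaluation per step, while the second replaces the whole update with a single forward pass of the network.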

[0044] A significant advantage of this new paradigm is that, with relatively little training trajectory data, researchers can use the trained algorithm of this invention to rapidly generate dynamic trajectories lasting tens of picoseconds, and immediately post-process and analyze this trajectory data to calculate key physical properties (such as the radial distribution function, mean square displacement, and vibrational density of states). This ability to obtain physical properties "on the fly" greatly accelerates the evaluation, comparison, and screening of material properties, which traditional high-cost methods (such as AIMD) cannot match.
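As an example of this post-processing, the mean square displacement (MSD) can be computed directly from a generated trajectory. This sketch assumes unwrapped coordinates and a single time origin (the first frame); the function name is hypothetical, and production analyses usually average over many time origins.

```python
import numpy as np

def mean_square_displacement(traj_r):
    """MSD(t) relative to the first frame, averaged over atoms.

    traj_r: (T, N, 3) unwrapped positions. Returns an array of length T.
    """
    traj_r = np.asarray(traj_r, dtype=float)
    disp = traj_r - traj_r[0]                       # displacement from frame 0
    return (disp ** 2).sum(axis=-1).mean(axis=-1)   # sum over xyz, mean over atoms

# Ballistic check: two atoms moving at constant speeds 1 and 2.
t = np.arange(5)[:, None, None]
v = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
traj = t * v                                        # (5, 2, 3)
msd = mean_square_displacement(traj)
```

For this ballistic test case the result grows as 2.5 t², the atom-averaged squared speed times t², which gives a quick sanity check before applying the function to a real trajectory.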

[0045] A molecular dynamics time series prediction method based on equivariant graph neural networks recasts the dynamic evolution of an atomic system as an end-to-end time series prediction task. It employs an autoregressive time series model based on equivariant graph neural networks as the dynamic evolution operator, which receives the complete phase-space state information of the atomic system at the current time as input and, through built-in physical constraints, directly predicts the state of the system at a future time.

[0046] Specific embodiments are shown in Figures 1 to 7:

[0047] The core of the prediction method proposed in this embodiment is the algorithm architecture shown in Figure 1, namely the ChronoCast model. This algorithm follows an encoder-processor-decoder process.

[0048] First, the atomic configuration of the system (position information and velocity information) is encoded as a graph structure.

[0049] Next, in the processor (i.e., the multi-layer EGNN), the features of nodes and edges are iteratively updated through message passing (aggregation) and update steps. As shown in the upper-right inset of Figure 1, this process handles scalar, vector, and second-order tensor information while preserving physical equivariance.

[0050] Finally, the decoder predicts the atomic displacements and an intermediate velocity from the output features of the last EGNN layer. The displacements give the final positions, and the intermediate velocity, after correction by a momentum conservation layer, gives the final physically consistent velocity.

[0051] The model incorporates E(3)-equivariant physical constraints to ensure that the predicted atomic positions and velocities respond correctly to rigid transformations (including rotation and translation) of the system.
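The E(3) constraint can be checked numerically: transforming the input and then applying the model should give the same result as applying the model and then transforming the output. The update below is a toy equivariant layer written only for this check, not the ChronoCast layer; all names are hypothetical.

```python
import numpy as np

def toy_equivariant_step(r, v, dt=1.0):
    """Toy E(3)-equivariant update used only to demonstrate the check.

    Each atom is accelerated along relative position vectors, weighted by a
    function of the invariant squared distance.
    """
    diff = r[:, None, :] - r[None, :, :]              # (N, N, 3): r_i - r_j
    d2 = (diff ** 2).sum(axis=-1, keepdims=True)      # invariant squared distances
    weight = np.exp(-d2)                              # depends on invariants only
    accel = -(weight * diff).sum(axis=1)              # equivariant vector per atom
    v_new = v + dt * accel
    r_new = r + dt * v_new
    return r_new, v_new

rng = np.random.default_rng(0)
r = rng.normal(size=(4, 3))
v = rng.normal(size=(4, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))          # random orthogonal matrix
shift = rng.normal(size=3)                            # random translation

r_out, v_out = toy_equivariant_step(r, v)
r_out_t, v_out_t = toy_equivariant_step(r @ Q.T + shift, v @ Q.T)

# Positions rotate and translate; velocities only rotate.
pos_ok = np.allclose(r_out_t, r_out @ Q.T + shift)
vel_ok = np.allclose(v_out_t, v_out @ Q.T)
```

The same test harness can wrap any candidate model: only the call to `toy_equivariant_step` needs to change.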

[0052] After predicting the velocity, the model includes a momentum conservation correction step, which ensures the conservation of the system's total momentum by subtracting the overall velocity of the system's center of mass.

[0053] The model is trained from a reference dynamic trajectory (such as an AIMD trajectory) using supervised learning. The supervision signal is the loss function for time series prediction: the mean squared error between the model's predicted atomic positions and velocities at the next time step and the true values of the reference dynamic trajectory (AIMD trajectory), with the position and velocity terms weighted 1:1. The method can achieve high-fidelity simulation results with less training data than traditional machine learning potentials.
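A minimal sketch of such a one-step loss follows. Reading the stated 1:1 ratio as equal weights on the position and velocity terms is an assumption, as are the function name and the `w_pos` / `w_vel` parameters.

```python
import numpy as np

def one_step_loss(r_pred, v_pred, r_true, v_true, w_pos=1.0, w_vel=1.0):
    """Weighted sum of position and velocity mean squared errors.

    Assumption: equal default weights, reflecting the 1:1 ratio in the text.
    """
    pos_mse = np.mean((np.asarray(r_pred) - np.asarray(r_true)) ** 2)
    vel_mse = np.mean((np.asarray(v_pred) - np.asarray(v_true)) ** 2)
    return w_pos * pos_mse + w_vel * vel_mse

# Two atoms, uniform errors of 0.1 in position and 0.2 in velocity.
r_true = np.zeros((2, 3))
v_true = np.zeros((2, 3))
r_pred = np.full((2, 3), 0.1)
v_pred = np.full((2, 3), 0.2)
loss = one_step_loss(r_pred, v_pred, r_true, v_true)
```

Because positions and velocities keep their original units, the two terms can differ in magnitude, so in practice the weights are a tuning choice rather than a fixed constant.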

[0054] The prediction time step can be larger than the original time step of the reference trajectory (such as AIMD), thereby further improving the overall computational efficiency of the simulation.
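One way to train with a larger prediction step is to pair each reference frame with the frame `stride` steps later. The sketch below assumes this pairing scheme (the source does not spell it out), and the function name and placeholder trajectory are hypothetical.

```python
import numpy as np

def make_training_pairs(traj_r, traj_v, stride):
    """Pair frame k with frame k + stride of a reference trajectory.

    traj_r, traj_v: (T, N, 3) arrays sampled at the reference time step.
    A stride of 5 on a 1 fs trajectory gives a 5 fs prediction step.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    traj_r, traj_v = np.asarray(traj_r), np.asarray(traj_v)
    inputs = (traj_r[:-stride], traj_v[:-stride])
    targets = (traj_r[stride:], traj_v[stride:])
    return inputs, targets

# Placeholder trajectory: 3000 frames at 1 fs, each value equal to its frame index.
n_frames, n_atoms = 3000, 4
traj_r = np.arange(n_frames, dtype=float)[:, None, None] * np.ones((n_atoms, 3))
traj_v = np.zeros((n_frames, n_atoms, 3))
(in_r, in_v), (out_r, out_v) = make_training_pairs(traj_r, traj_v, stride=5)

# Model evaluations needed for a 30 ps rollout at 5 fs versus 1 fs per step.
steps_5fs = 30_000 // 5
steps_1fs = 30_000 // 1
```

At a 5 fs step a 30 ps trajectory needs 6000 model evaluations instead of 30000, which is where the reduction in the number of computational steps comes from.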

[0055] The graph model framework of the method is extensible, allowing additional physical features (such as heat flux, charge, or spin) to be input as new node features to simulate richer physical dynamics.

[0056] Example 1

[0057] This embodiment applies the method of the present invention to a single-crystal silicon system containing 216 atoms. A 5000-step AIMD trajectory (1 fs step size) generated with the VASP software is used as training data. As shown in Figure 3(a), the training loss and validation loss of the model decrease smoothly and converge, indicating that the algorithm has successfully learned the dynamic laws of the single-crystal silicon system. As shown in Figures 3(b) and 3(c), in single-step prediction the predicted atomic position and velocity components agree closely with the true AIMD values (all points cluster tightly along the diagonal line of perfect agreement), demonstrating extremely high single-step prediction accuracy. In long-term autoregressive prediction (Figure 4), the trajectory generated by this method (ChronoCast line) is almost identical to the AIMD benchmark in the initial stage (0-5 ps). As time progresses (15-20 ps), the trajectory diverges due to chaotic effects, but the mean absolute error (MAE) in Figures 4(c) and 4(d) shows that the error grows and quickly reaches saturation, indicating that the system remains stable and has not collapsed.

[0058] Most importantly, Figure 5 compares the physical properties. For the radial distribution function (RDF, Figure 5a), the mean square displacement (MSD, Figure 5b), the displacement probability density (Figure 5c), and the vibrational density of states (VDOS, Figure 5d), the results calculated from the trajectories generated by the method of this invention (including the 0-5 ps and 15-20 ps segments) are highly consistent with the high-cost AIMD benchmark. This shows that the algorithm of this invention accurately reproduces the statistical physical properties of the system.

[0059] Example 2

[0060] To verify the generalization ability of the method, this embodiment applies it to a more complex NbSe3 system containing 204 atoms (with a quasi-one-dimensional chain structure and complex covalent / van der Waals interactions). Training is performed using AIMD data with 3000 steps (1 fs step size). As shown in Figure 6(a), the model also converges well on this complex system. Figures 6(c) and 6(d) compare the prediction results of the method of the present invention (ChronoCast line) with the AIMD benchmark and an advanced machine learning potential method (NEP-based MD line). For both the displacement probability density and the vibrational density of states (VDOS), the results of the method of the present invention agree best with the AIMD benchmark and outperform the machine learning potential method at several key peaks. This indicates that the end-to-end time series paradigm of the present invention reproduces physical properties with higher fidelity because it avoids the error accumulation across the intermediate steps of "potential energy surface fitting – force calculation – integration" in the machine learning potential method.

[0061] The algorithm framework described in this invention is highly extensible. In the above embodiments, the input features are mainly the positions and velocities of the atoms. However, depending on the physical problem being studied, new physical inputs can be added to the node features of the graph model. For example, features characterizing local heat flux, charge density, or spin can be introduced as new node inputs, enabling the algorithm of this invention to simulate and predict richer physical phenomena, such as thermal transport, charge dynamics, or the evolution of magnetic systems. Correspondingly, the model's decoder can also be designed to output new physical quantities, such as changes in the local energy or heat flux of each atom, rather than just position and velocity. This flexibility allows the framework to be applied to a wide range of complex dynamical simulation problems, offering broad application prospects.

[0062] This invention bypasses the time-consuming energy and force calculation steps in traditional methods and significantly reduces the amount of training data required. It can generate trajectories with high computational efficiency and high physical fidelity, demonstrating superior efficiency compared to existing ab initio and machine learning potential-assisted molecular dynamics methods. It provides a novel and rapid approach for time-domain-based statistical physics simulations and the study of complex problems.

[0063] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the present invention, and these modifications or substitutions should all be covered within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

[0064] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

Claims

1. A molecular dynamics time series prediction method based on equivariant graph neural networks, characterized in that: The dynamic evolution process of an atomic system is recast as an end-to-end time series prediction task. An autoregressive time series model based on an equivariant graph neural network is used as the dynamic evolution operator; it receives the complete phase-space state information of the atomic system at the current time step as input and, through built-in physical constraints, directly predicts the state of the system at a future time step.

2. The molecular dynamics time series prediction method based on equivariant graph neural networks according to claim 1, characterized in that: The state information includes the position information and velocity information of each atom in the system. The atomic system is constructed as a graph structure in the autoregressive time series model, where atoms are treated as nodes of the graph and the velocity information of the atoms is explicitly input as one of the node features to capture complete phase-space information.

3. The molecular dynamics time series prediction method based on equivariant graph neural networks according to claim 2, characterized in that: The autoregressive time series model incorporates E(3)-equivariant physical constraints to ensure that the predicted atomic positions and velocities respond correctly to rigid transformations of the system.

4. The molecular dynamics time series prediction method based on equivariant graph neural networks according to claim 3, characterized in that: The autoregressive time series model includes a momentum conservation correction step after predicting the velocity, which ensures the conservation of the total momentum of the system by subtracting the overall velocity of the system's center of mass.

5. The molecular dynamics time series prediction method based on equivariant graph neural networks according to claim 1, characterized in that: The autoregressive time series model uses the state predicted at each moment as the input for the next round of prediction, forming a closed loop that continuously generates dynamic trajectories over long periods.

6. The molecular dynamics time series prediction method based on equivariant graph neural networks according to claim 5, characterized in that: By sampling and calculating from the generated dynamic trajectory data, the statistical physical properties of the atomic system are obtained, including the radial distribution function, mean square displacement, and vibrational density of states.

7. The molecular dynamics time series prediction method based on equivariant graph neural networks according to claim 6, characterized in that: The autoregressive time series model is trained from a reference dynamic trajectory through supervised learning, and the predicted time step can be larger than the original time step of the reference dynamic trajectory.

8. The molecular dynamics time series prediction method based on equivariant graph neural networks according to claim 1, characterized in that: The autoregressive time series model framework is scalable, allowing additional physical features to be used as new node feature inputs to simulate richer physical dynamic processes.

9. A molecular dynamics time series prediction system based on equivariant graph neural networks, characterized in that: The system includes a processor that, when running, invokes the method described in any one of claims 1 to 8 to predict the continuous dynamic trajectory of an atomic system.

10. A computer-readable storage medium, characterized in that: The computer-readable storage medium stores computer-readable instructions that, when executed by a processor, invoke the steps of the method according to any one of claims 1 to 8.