Density functional theory calculation method and related apparatus

By designing the main processor and coprocessor in tandem, and utilizing the on-chip core memory of the coprocessor to perform density functional theory calculations, the problem of frequent data movement in traditional architectures is solved, achieving efficient computing cycles and low power consumption.

CN122245463A · Pending · Publication Date: 2026-06-19 · GUANGDONG XINPEISEN TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
GUANGDONG XINPEISEN TECHNOLOGY CO LTD
Filing Date
2026-04-21
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In the traditional von Neumann architecture, density functional theory calculations suffer from problems such as excessively long computation cycles, memory overflow, and a surge in communication overhead, mainly due to the separation of computing units and storage, which leads to frequent data movement.

Method used

A collaborative hardware architecture of main processor and coprocessor is adopted to clearly define task boundaries and perform wave function iterative calculations through the on-chip core memory of the coprocessor, thereby reducing the frequent data movement between computing and storage units.

Benefits of technology

It significantly reduces data transmission overhead, greatly shortens the computation cycle, reduces computing power consumption, and improves computing efficiency.

✦ Generated by Eureka AI based on patent content.


Abstract

This application discloses a density functional theory calculation method and related equipment, belonging to the field of data processing technology. The method includes: acquiring initial parameters of the simulated system; preprocessing the initial parameters and encapsulating them into a data packet; transmitting the data packet to a coprocessor, which then performs an iterative wavefunction calculation based on the data packet and its on-chip core memory, obtaining and returning the wavefunction iteration calculation result; and completing the density functional theory calculation based on the returned calculation result to determine the physical property parameters of the simulated system. This application employs a collaborative hardware architecture design with a main processor and a coprocessor, clearly defining the task boundaries of different processors. By combining the on-chip core memory of the coprocessor to perform iterative wavefunction calculations, it effectively reduces the frequent data transfer between computation and storage units, thereby significantly reducing data transmission overhead, greatly shortening the computation cycle of large-scale systems, and reducing computational power consumption.

Description

Technical Field

[0001] This application relates to the field of data processing technology, and in particular to a density functional theory calculation method and related equipment.

Background Technology

[0002] Currently, density functional theory (DFT) is the core method of first-principles calculations and is widely used to predict physical property parameters such as electronic structure, mechanical properties and catalytic activity of materials.

[0003] In related technologies, density functional theory calculation methods involve a large number of data-intensive operations, and the calculation process typically utilizes traditional von Neumann architecture processors such as central processing units and graphics processing units. However, in practical applications, it has been found that in the traditional von Neumann architecture, the computing unit and storage are separated, and data needs to be frequently moved between the main processor and memory, resulting in problems such as excessively long calculation cycles, memory overflow, and a surge in communication overhead.

[0004] In summary, the technical problems existing in the related technologies remain to be solved.

Summary of the Invention

[0005] This application provides a density functional theory calculation method and related equipment, which can effectively reduce the frequent movement of data between computing and storage units, thereby significantly reducing data transmission overhead, greatly shortening the calculation cycle of large-scale systems, and reducing computing power consumption.

[0006] In one aspect, embodiments of this application provide a density functional theory calculation method applied to a main processor, the method comprising the following steps: in response to a density functional theory calculation task instruction, obtaining the initial parameters of the simulated system; preprocessing the initial parameters of the simulated system, then encoding and encapsulating them to obtain a data packet; transmitting the data packet to the coprocessor, so that the coprocessor performs a wavefunction iterative calculation process based on the data packet and its on-chip core memory, obtains the wavefunction iterative calculation result, and returns that result to the main processor; and, based on the returned wavefunction iterative calculation result, completing the density functional theory calculation to determine the physical property parameters of the simulated system.

[0007] Optionally, preprocessing the initial parameters of the simulated system and obtaining the data packet through encoding and encapsulation includes: calculating the effective potential of the simulated system based on its initial parameters, wherein the initial parameters include the initial charge density and the initial wave function; performing an initial solution of the Kohn-Sham equation based on the effective potential of the simulated system to obtain the wave function and eigenvalues; and encoding the initial parameters, wave function, and eigenvalues of the simulated system and obtaining the data packet through data encapsulation.

[0008] Optionally, calculating the effective potential of the simulated system based on its initial parameters includes: calculating the external potential based on the atomic nucleus coordinates and the number of electrons in the simulated system; calculating the Hartree potential and the exchange-correlation potential based on the initial charge density; and constructing the effective potential of the simulated system by superimposing the external potential, the Hartree potential, and the exchange-correlation potential.

[0009] Optionally, transmitting the data packet to the coprocessor, and having the coprocessor perform a wavefunction iterative calculation process based on the data packet and in conjunction with the coprocessor's on-chip core memory to obtain the wavefunction iterative calculation result, and returning the wavefunction iterative calculation result to the main processor, includes: The data packet is transmitted to the coprocessor's storage resources via a transmission interface; wherein the storage resources include on-chip core memory and / or off-chip main memory; The coprocessor reads the effective potential and the initial wavefunction into storage resources; The coprocessor constructs a Hamiltonian operator based on the effective potential and performs the operation of the Hamiltonian operator on the initial wave function; Based on the results of the calculation, an iterative optimization algorithm is executed to update and orthogonalize the wavefunction until the preset wavefunction convergence condition is met, and the wavefunction iterative calculation result is obtained. Then, the wavefunction iterative calculation result is returned to the main processor through the transmission interface.

[0010] Optionally, the iterative optimization algorithm may employ any one of the following: conjugate gradient method, Davidson algorithm, LOBPCG algorithm, or quasi-Newton method.

[0011] Optionally, the step of performing density functional theory calculations based on the returned wavefunction iterative calculation results to determine the physical property parameters of the simulated system includes: based on the returned wave function iterative calculation results, combined with the eigenvalues obtained by solving the Kohn-Sham equation, updating the charge density of the simulated system; based on the charge density of the simulated system before and after the update, performing a self-consistent field iterative convergence judgment; and, when the convergence state of the simulated system is self-consistent convergence, calculating the physical property parameters of the simulated system.

[0012] Optionally, the step of determining the convergence of the self-consistent field iteration based on the charge density of the simulated system before and after the update includes: calculating the spatial integral difference based on the charge density of the simulated system before and after the update; if the spatial integral difference is less than or equal to a preset convergence threshold, determining that the convergence state of the simulated system is self-consistent convergence; and if the spatial integral difference is greater than the preset convergence threshold, determining that the convergence state of the simulated system is non-self-consistent convergence, in which case the updated charge density and the historical charge density are weighted and mixed according to a preset mixing algorithm to generate the initial charge density for the next round of iteration and trigger the next round of wave function iteration calculation.

[0013] Optionally, when the convergence state of the simulated system is self-consistent convergence, calculating the physical property parameters of the simulated system includes: performing simulation analysis on the returned wave function iterative calculation results; and, based on the analysis results, calculating the physical property parameters of the simulated system using statistical physics methods.

[0014] On the other hand, embodiments of this application provide a density functional theory calculation device applied to a main processor, the device comprising: The data acquisition module is used to acquire the initial parameters of the simulation system in response to the density functional theory calculation task instructions; The data processing module is used to preprocess the initial parameters of the simulation system and obtain data packets through encoding and encapsulation. The data transmission module is used to transmit the data packet to the coprocessor, so that the coprocessor can perform a wave function iterative calculation process based on the data packet and in combination with the on-chip core memory of the coprocessor, obtain the wave function iterative calculation result, and return the wave function iterative calculation result to the main processor; The parameter determination module is used to perform density functional theory calculations based on the returned wave function iterative calculation results, and determine the physical property parameters of the simulated system.

[0015] On the other hand, embodiments of this application provide an electronic device, which includes a memory and a processor. The memory stores a computer program, and the processor executes the computer program to implement the above-described method.

[0016] On the other hand, embodiments of this application provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method.

[0017] The embodiments of this application adopt a collaborative hardware architecture of main processor and coprocessor, clarify the task boundaries of the different processors, and use the coprocessor's on-chip core memory to perform wave function iterative calculations, which effectively reduces the frequent data transfer between computing and storage units. This significantly reduces data transmission overhead, greatly shortens the computing cycle of large-scale systems, and reduces computing power consumption.

Description of the Drawings

[0018] Figure 1 is a schematic flowchart of a density functional theory calculation method provided in an embodiment of this application; Figure 2 is a schematic diagram of a density functional theory calculation process provided in an embodiment of this application; Figure 3 is a schematic diagram of a communication method between the main processor and the coprocessor provided in an embodiment of this application; Figure 4 is a schematic diagram of the structure of a density functional theory calculation device provided in an embodiment of this application; Figure 5 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of this application.

Detailed Description

[0019] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to limit it. In the following description, when referring to the accompanying drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with those of this application; they are merely examples of apparatuses and methods consistent with some aspects of the embodiments of this application as detailed in the appended claims.

[0020] It is understood that the terms "first," "second," etc., used in this application may be used herein to describe various concepts, but unless otherwise stated, these concepts are not limited by these terms. These terms are only used to distinguish one concept from another. For example, without departing from the scope of the embodiments of this application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "at the time of," "when," or "in response to a determination."

[0021] As used in this application, for terms such as "at least one," "multiple," "each," and "any": "at least one" includes one, two, or more; "multiple" includes two or more; "each" refers to every one of the corresponding multiple items; and "any" refers to any one of the multiple items.

[0022] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.

[0023] Currently, density functional theory (DFT) is the core method of first-principles calculations and is widely used to predict physical property parameters such as electronic structure, mechanical properties and catalytic activity of materials.

[0024] In related technologies, density functional theory calculation methods involve a large number of data-intensive operations, and the calculation process typically utilizes traditional von Neumann architecture processors such as central processing units and graphics processing units. However, in practical applications, it has been found that in the traditional von Neumann architecture, the computing unit and storage are separated, and data needs to be frequently moved between the main processor and memory, resulting in problems such as excessively long calculation cycles, memory overflow, and a surge in communication overhead.

[0025] In view of this, this application provides a density functional theory calculation method and related equipment. By adopting a main processor and coprocessor collaborative hardware architecture design, the task boundaries of different processors are clearly defined. By combining the on-chip core storage of the coprocessor to perform wave function iterative calculation, the frequent data transfer between the computing and storage units is effectively reduced, thereby significantly reducing data transfer overhead, greatly shortening the computing cycle of large-scale systems, and reducing computing power consumption.

[0026] It should be noted that in all specific embodiments of this application, when processing data related to user identity or characteristics, such as user information, user behavior data, user historical data, and user location information, user permission or consent is obtained first. Furthermore, the collection, use, and processing of this data comply with relevant laws, regulations, and standards. In addition, when embodiments of this application require access to sensitive personal information of users, separate permission or consent from the user is obtained through pop-ups or redirection to confirmation pages. Only after obtaining the user's separate permission or consent is the necessary user-related data required for the proper functioning of these embodiments acquired.

[0027] The specific implementation methods of the embodiments of this application will be described in detail below with reference to the accompanying drawings. First, a density functional theory calculation method provided in the embodiments of this application will be described with reference to the accompanying drawings.

[0028] Figure 1 is a flowchart illustrating a density functional theory calculation method provided in an embodiment of this application. As shown in Figure 1, the method can be applied to a main processor and specifically includes, but is not limited to, steps 100 to 400.

[0029] Step 100: In response to the density functional theory calculation task instruction, obtain the initial parameters of the simulation system.

[0030] In the embodiments of this application, the heterogeneous computing architecture mainly includes a main processor (Main Processing Unit, MPU), a coprocessor (Slave Processing Unit, SPU), and a high-speed transmission interface bus for data transmission and interaction between the main processor and the coprocessor. The density functional theory calculation method provided in this application can be applied to the main processor.

[0031] Furthermore, the main processor and the coprocessor communicate with each other, and the coprocessor can receive data packets from the main processor as the data source for subsequent wavefunction iterative calculations.

[0032] The main processor can adopt a memory-computing separation architecture, including but not limited to a central processing unit and a graphics processing unit. It is mainly used for overall scheduling of density functional theory calculation tasks, input parameter preprocessing, integration and output of density functional theory calculation results, and non-computationally intensive auxiliary tasks such as user interaction adaptation. The coprocessor adopts a memory-computing integrated architecture, including but not limited to field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and in-memory computing chips. It is mainly used for executing computationally intensive core tasks such as wave function iterative calculation.

[0033] In practical applications, the simulation system can be the atomic or molecular system to be studied, and the physical property parameters can be calculated and determined using density functional theory methods.

[0034] Specifically, please refer to Figure 2, which is a schematic diagram of a density functional theory calculation process provided in an embodiment of this application. In response to the input density functional theory calculation task instruction, the main processor can read the parameters of the simulated system from the input file. These may include molecular system information and calculation parameter settings, such as the number of atoms, atom number, atom type and coordinates, the number of atom types in the simulated system, wave function cutoff energy, self-consistent field energy convergence threshold, force convergence threshold, basis set type, and pseudopotential information.

[0035] Furthermore, the main processor can obtain the initial parameters of the simulation system by constructing the initial charge density through uniform distribution or Gaussian superposition of atomic charges based on the obtained parameters of the simulation system, and by approximating the initial wave function through random states or linear combinations of atomic orbitals under the plane wave basis set.
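The initialization described in [0035] can be sketched as follows. This is a minimal illustration only, assuming a one-dimensional real-space grid rather than the plane wave basis set of the source; the function names, the Gaussian width `sigma`, and the random seed are hypothetical choices, not taken from the application.

```python
import numpy as np

def gaussian_superposition_density(grid, centers, charges, sigma=0.5):
    """Initial charge density on a uniform 1D grid as a sum of Gaussians.

    grid    : (n,) uniformly spaced sample points
    centers : atom positions (hypothetical input)
    charges : number of electrons contributed by each atom
    sigma   : Gaussian width, an arbitrary illustrative value
    """
    grid = np.asarray(grid, dtype=float)
    rho = np.zeros_like(grid)
    for c, q in zip(centers, charges):
        rho += q * np.exp(-((grid - c) ** 2) / (2.0 * sigma**2))
    # Rescale so the density integrates to the total electron count on this grid.
    dx = grid[1] - grid[0]
    rho *= np.sum(charges) / (rho.sum() * dx)
    return rho

def random_initial_wavefunctions(n_grid, n_bands, dx, seed=0):
    """Random orthonormal starting states, normalized so sum(|psi|^2) * dx = 1."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n_grid, n_bands)))
    return q / np.sqrt(dx)
```

A linear combination of atomic orbitals, the other option named in [0035], would replace the random matrix with orbital-shaped columns before the orthonormalization step.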

[0036] Step 200: Preprocess the initial parameters of the simulation system and obtain data packets through encoding and encapsulation.

[0037] In this embodiment, the main processor can preprocess the initial parameters of the simulation system and encapsulate the preprocessed data information into a data packet through information encoding, so that the data packet can be transmitted to the coprocessor for parallel computation through the in-memory computing architecture.

[0038] Therefore, by having the main processor perform non-computationally intensive auxiliary tasks such as overall scheduling of computational tasks and preprocessing of input parameters, and a coprocessor with an in-memory computing architecture to perform computationally intensive tasks such as wavefunction iteration and construction of Hamiltonian and overlapping matrices, the frequent transfer of intermediate matrix and wavefunction data between the main processor and main memory, as is common in traditional architectures, is avoided. By leveraging the customizable pipeline hardware characteristics of the coprocessor, highly regularized operations such as basis function integration and matrix element calculation are mapped to parallel computing units, enabling simultaneous updates of multiple matrix elements or wavefunction components. This overcomes the parallel efficiency limitations of the main processor's serial execution, which is constrained by general thread scheduling, and effectively reduces the frequent movement of data between computing and storage units, thereby significantly reducing data transfer overhead, greatly shortening the computation cycle of large-scale systems, and reducing computing power consumption.

[0039] Specifically, as an optional implementation, preprocessing the initial parameters of the simulated system and obtaining the data packet through encoding and encapsulation includes: calculating the effective potential of the simulated system based on its initial parameters, wherein the initial parameters include the initial charge density and the initial wave function; performing an initial solution of the Kohn-Sham equation based on the effective potential of the simulated system to obtain the wave function and eigenvalues; and encoding the initial parameters, wave function, and eigenvalues of the simulated system and obtaining the data packet through data encapsulation.

[0040] In the embodiments of this application, as shown in Figure 2, the main processor calculates the effective potential of the simulated system based on the initial parameters of the simulated system. The initial parameters include the initial charge density and the initial wave function, both calculated by the main processor from the parameter information read for the simulated system.

[0041] Specifically, the external potential $V_{\mathrm{ext}}$ is calculated to form a spatially distributed local potential function, and the effective potential of the simulated system is then calculated from the initial charge density of the simulated system and the external potential $V_{\mathrm{ext}}$.

[0042] Based on the initial charge density distribution of the simulated system, the classical Coulomb repulsion potential generated between electrons is calculated and used as the Hartree potential $V_{\mathrm{H}}$. Based on the preset exchange-correlation energy functional, its variational derivative with respect to the initial charge density is solved to obtain the exchange-correlation potential $V_{\mathrm{xc}}$. The effective potential $V_{\mathrm{eff}}$ of the simulated system is obtained by summing the external potential, the Hartree potential, and the exchange-correlation potential.
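In standard Kohn-Sham notation, the three contributions and their sum can be written as below. The application itself gives no formulas; Hartree atomic units and the symbols $\rho$ for the charge density and $E_{\mathrm{xc}}$ for the exchange-correlation energy functional are assumptions made here for illustration.

```latex
\[
V_{\mathrm{H}}(\mathbf{r}) = \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}' ,
\qquad
V_{\mathrm{xc}}(\mathbf{r}) = \frac{\delta E_{\mathrm{xc}}[\rho]}{\delta \rho(\mathbf{r})} ,
\qquad
V_{\mathrm{eff}}(\mathbf{r}) = V_{\mathrm{ext}}(\mathbf{r}) + V_{\mathrm{H}}(\mathbf{r}) + V_{\mathrm{xc}}(\mathbf{r}) .
\]
```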

[0043] Furthermore, the main processor encodes the effective potential $V_{\mathrm{eff}}$ and the initial wave function, and encapsulates the encoded data to obtain a computation task data packet, which is then sent to the coprocessor.
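The application does not specify a packet layout, so the following is only a hypothetical sketch of what encoding and encapsulation could look like: a fixed header followed by the raw array bytes. The tag `DFT0`, the field order, and the function names are all assumptions.

```python
import struct
import numpy as np

MAGIC = b"DFT0"      # hypothetical 4-byte packet tag, not from the source
HEADER = "<4sII"     # tag, number of grid points, number of bands (little-endian)

def encode_packet(v_eff, psi0):
    """Pack V_eff (real, shape (n,)) and psi0 (complex, shape (n, bands)) into bytes."""
    v_eff = np.ascontiguousarray(v_eff, dtype="<f8")
    psi0 = np.ascontiguousarray(psi0, dtype="<c16")
    n_grid, n_bands = psi0.shape
    if v_eff.shape != (n_grid,):
        raise ValueError("v_eff must hold one value per grid point")
    header = struct.pack(HEADER, MAGIC, n_grid, n_bands)
    return header + v_eff.tobytes() + psi0.tobytes()

def decode_packet(packet):
    """Inverse of encode_packet; this is the parsing step on the coprocessor side."""
    magic, n_grid, n_bands = struct.unpack_from(HEADER, packet, 0)
    if magic != MAGIC:
        raise ValueError("unexpected packet tag")
    offset = struct.calcsize(HEADER)
    v_eff = np.frombuffer(packet, dtype="<f8", count=n_grid, offset=offset)
    offset += v_eff.nbytes
    psi0 = np.frombuffer(packet, dtype="<c16", count=n_grid * n_bands, offset=offset)
    return v_eff, psi0.reshape(n_grid, n_bands)
```

Carrying the array shapes in the header lets the receiver size its buffers before reading the payload, which is one plausible reason to encapsulate rather than stream raw arrays.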

[0044] Step 300: The data packet is transmitted to the coprocessor, which then performs a wavefunction iterative calculation process based on the data packet and its on-chip core memory to obtain the wavefunction iterative calculation result, and returns the wavefunction iterative calculation result to the main processor.

[0045] The coprocessor is configured to perform the task of solving the Kohn-Sham equations. Specifically, the coprocessor receives the computation task data packet, parses it to obtain the effective potential and initial wavefunction, and constructs a Hamiltonian operator $H$ based on the effective potential. The coprocessor then performs eigenvalue solving or wavefunction iterative optimization on the Hamiltonian operator (e.g., using the conjugate gradient method). During the iterative optimization process, the coprocessor accelerates core operators including the action of the Hamiltonian on the wavefunction (…), residual vector calculation, and wavefunction orthogonalization. When the preset wavefunction convergence conditions are satisfied, the coprocessor encapsulates the current wavefunction and the corresponding eigenvalues into a result data packet and sends it back to the main processor.

[0046] In practical applications, when updating and orthogonalizing the wavefunction by executing iterative optimization algorithms, any one of the following methods can be used: conjugate gradient method, Davidson algorithm, LOBPCG algorithm, or quasi-Newton method.

[0047] In this embodiment, after the main processor transmits the data packet to the coprocessor, the coprocessor, based on the received data packet and in conjunction with its on-chip core storage, maps highly regularized operations such as basis function integration and matrix element calculation into parallel computing units to simultaneously update multiple matrix elements or wave function components, executes the wave function iterative calculation process, obtains the wave function iterative calculation result, and then the coprocessor returns the wave function iterative calculation result to the main processor.

[0048] The coprocessor's on-chip core memory is a customized on-chip storage module that is tightly coupled to the customized computing units through an on-chip high-speed interconnect network. A computing unit can directly access the data in its adjacent storage unit, enabling near-memory computation on large-scale data. This effectively reduces the frequent data movement between the computing and storage units, thereby significantly reducing data transmission overhead, removing the performance bottleneck that this overhead causes, and efficiently accelerating the core computation steps of density functional theory.

[0049] Specifically, the step of transmitting the data packet to the coprocessor, so that the coprocessor can perform a wavefunction iterative calculation process based on the data packet and in conjunction with the coprocessor's on-chip core memory, to obtain the wavefunction iterative calculation result, and returning the wavefunction iterative calculation result to the main processor, includes: The data packet is transmitted to the coprocessor via the transmission interface. The coprocessor parses the data packet to obtain the initial wavefunction and effective potential data. Based on the effective potential data, a Hamiltonian operator (or Hamiltonian matrix) is constructed. Starting from the initial wavefunction, an iterative optimization loop for the wavefunction is executed. In the iterative optimization loop, matrix operations of the Hamiltonian operator on the wavefunction are performed, and gradient calculation, search direction update, and wavefunction orthogonalization are performed based on the operation results. When the iteration result meets the preset convergence requirements, the current wavefunction is taken as the wavefunction iteration calculation result and returned to the main processor via the transmission interface.
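The loop in [0049] (apply the Hamiltonian, compute a gradient, update, orthogonalize, test convergence) can be illustrated on a small dense problem. This is a sketch under stated assumptions, not the coprocessor implementation: it uses a one-dimensional finite-difference Hamiltonian, plain steepest descent instead of the conjugate gradient, Davidson, or LOBPCG methods named in the source, and a residual-norm test as the convergence condition. All function names are hypothetical.

```python
import numpy as np

def build_hamiltonian(v_eff, dx):
    """Dense 1D Hamiltonian H = -1/2 d^2/dx^2 + V_eff using central differences."""
    n = v_eff.size
    off = np.full(n - 1, -0.5 / dx**2)
    kinetic = np.diag(np.full(n, 1.0 / dx**2)) + np.diag(off, 1) + np.diag(off, -1)
    return kinetic + np.diag(v_eff)

def iterate_wavefunctions(h, psi0, tol=1e-6, max_iter=20000):
    """Block steepest descent with QR orthogonalization and Rayleigh-Ritz rotation."""
    # Gershgorin bound on the largest |eigenvalue| gives a safe fixed step size.
    step = 1.0 / np.abs(h).sum(axis=1).max()
    psi, _ = np.linalg.qr(psi0)
    for _ in range(max_iter):
        # Rayleigh-Ritz: rotate the block so that psi^H H psi is diagonal.
        eps, rot = np.linalg.eigh(psi.conj().T @ h @ psi)
        psi = psi @ rot
        h_psi = h @ psi                      # Hamiltonian acting on the wavefunctions
        residual = h_psi - psi * eps         # gradient direction for each band
        if np.linalg.norm(residual, axis=0).max() < tol:
            break
        # Update along the negative gradient, then re-orthogonalize the block.
        psi, _ = np.linalg.qr(psi - step * residual)
    return eps, psi
```

For example, with a harmonic potential `v_eff = 0.5 * x**2` on a grid `x`, the returned `eps` can be compared against `np.linalg.eigvalsh(h)`. The energy-change criterion described in [0051] could be substituted for the residual test by tracking `eps.sum()` between iterations.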

[0050] The coprocessor executes the wavefunction iterative solution process. During this process, the on-chip core memory is used to cache the wavefunction vector and intermediate calculation results of the current iteration step, as well as the data that needs to be repeatedly called for multiple wavefunctions, thereby reducing the number of data operations scheduled from off-chip memory and greatly improving computational efficiency.

[0051] The wave function iterative solution executed by the coprocessor is considered to meet the preset convergence requirements, that is, to have converged, when the change in total energy between two consecutive iterations is less than the preset threshold.

[0052] In the embodiments of this application, please refer to Figure 3, which is a schematic diagram of a communication method between the main processor and the coprocessor provided in an embodiment of this application. The main processor and coprocessor can be connected through a high-speed transmission interface. The transmission interface includes, but is not limited to, industry standard interfaces such as PCIe 5.0, UCIe 1.1, CCIX, and Gen-Z, to ensure the efficiency of data packet transmission and result return.

[0053] Step 400: Based on the returned wave function iterative calculation results, complete the density functional theory calculation and determine the physical property parameters of the simulated system.

[0054] In the embodiments of this application, as shown in Figure 2, the coprocessor encodes the wave function obtained from the iterative calculation and returns it to the main processor through the high-speed transmission interface. Based on the returned wave function iterative calculation results, the main processor completes the density functional theory calculation, determines the physical property parameters of the simulated system, and outputs the results to the user.

[0055] Therefore, this application adopts a hardware architecture design that combines a main processor and a coprocessor, clearly defines the task boundaries of different processors, and combines the on-chip core memory of the coprocessor to perform wave function iterative calculations, effectively reducing the frequent data transfer between computing and storage units, thereby significantly reducing data transfer overhead, greatly shortening the computing cycle of large-scale systems, and reducing computing power consumption.

[0056] Specifically, as an optional implementation, the step of performing density functional theory calculations based on the returned wavefunction iterative calculation results to determine the physical property parameters of the simulated system includes: based on the returned wave function iterative calculation results, combined with the eigenvalues obtained by solving the Kohn-Sham equation, updating the charge density of the simulated system; based on the charge density of the simulated system before and after the update, performing a self-consistent field iterative convergence judgment; and, when the convergence state of the simulated system is self-consistent convergence, calculating the physical property parameters of the simulated system.

[0057] In the embodiments of this application, as shown in Figure 2, the main processor can decode the wave function iterative calculation results returned by the coprocessor. The wave function is the electronic state function used for iterative calculation. It is expanded in a plane-wave basis set, and its coefficients are stored as a complex array. Each atom corresponds to a set of basis functions. The eigenvalues are used to determine the occupancy of each orbital; combined with the Fermi-Dirac distribution, this determines the electron filling.

[0058] Therefore, the charge density of the simulated system can be updated by combining the returned wave function iterative calculation results with the eigenvalues obtained by solving the Kohn-Sham equation.
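As a hedged illustration of the density update, the sketch below uses the standard relation in which the charge density at each grid point is the occupation-weighted sum of the squared moduli of the orbitals. It assumes the orbitals have already been evaluated on a real-space grid and that the occupations have already been derived from the eigenvalues; the function name `update_charge_density` and this data layout are hypothetical, not taken from the application.

```python
def update_charge_density(orbitals, occupations):
    """Return n[k] = sum_i f_i * |psi_i[k]|^2 on a real-space grid.

    orbitals:    list of orbitals, each a list of complex grid values
                 (assumed layout for this sketch)
    occupations: one occupation number f_i per orbital
    """
    if len(orbitals) != len(occupations):
        raise ValueError("need exactly one occupation per orbital")
    n_points = len(orbitals[0])
    density = [0.0] * n_points
    for psi, f in zip(orbitals, occupations):
        if len(psi) != n_points:
            raise ValueError("all orbitals must share the same grid")
        for k, value in enumerate(psi):
            density[k] += f * abs(value) ** 2
    return density

# Two orbitals on a two-point grid, with occupations 2 and 1.
rho = update_charge_density([[1 + 0j, 0j], [0j, 1j]], [2.0, 1.0])
```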

[0059] Furthermore, the charge density of the simulated system before and after the update is used as the criterion for judging the convergence of the self-consistent field iteration, and the physical property parameters of the simulated system are calculated when the convergence state of the simulated system is self-consistent convergence.

[0060] Specifically, the spatial integral difference can be calculated from the charge density before the update and the charge density after the update. For example, it can be taken as the L2 norm of the charge density difference, i.e., the square root of the integral of the squared charge density difference over the whole space.
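A minimal sketch of this measure, assuming the densities are sampled on a uniform real-space grid so that the integral reduces to a sum multiplied by a constant volume element. The name `density_residual` and the parameter `dv` are illustrative assumptions.

```python
import math

def density_residual(rho_old, rho_new, dv):
    """L2 norm of the density change on a uniform grid.

    Approximates sqrt( integral (rho_new - rho_old)^2 dV ) by a sum
    over grid points times the volume element dv (an assumption of
    this sketch; the application does not fix a discretization).
    """
    if len(rho_old) != len(rho_new):
        raise ValueError("densities must be sampled on the same grid")
    squared = sum((b - a) ** 2 for a, b in zip(rho_old, rho_new))
    return math.sqrt(squared * dv)

residual = density_residual([0.0, 0.0], [3.0, 4.0], dv=1.0)
```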

[0061] Furthermore, if the spatial integral difference exceeds a preset convergence threshold, the convergence state of the simulated system is determined to be non-self-consistent convergence. In this case, a mixing algorithm can be used to form a weighted mixture of the updated charge density and the historical charge density, generating the initial charge density for the next iteration. This triggers the next round of wave function iterative calculation, and the process returns to the step of solving the Kohn-Sham equation for an initial solution to obtain the wave function and eigenvalues of the simulated system. The preset convergence threshold can be set according to specific application requirements. The mixing algorithm can be simple mixing, Pulay mixing, Broyden mixing, Kerker mixing, Anderson mixing, etc., which smooth the charge density evolution path and improve convergence stability. The historical charge density is a snapshot of the charge density saved in previous iterations.
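The simplest of the listed schemes, simple (linear) mixing, can be sketched as follows. Here the only history used is the input density of the current iteration, and the mixing weight `alpha` is a hypothetical tuning parameter; the more elaborate schemes named above (Pulay, Broyden, Anderson) use several stored densities and are not shown.

```python
def linear_mix(rho_in, rho_out, alpha):
    """Simple mixing: next input = (1 - alpha) * rho_in + alpha * rho_out.

    rho_in:  density that was fed into the current iteration
    rho_out: updated density produced by the current iteration
    alpha:   mixing weight in (0, 1]; alpha = 1 means no damping
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if len(rho_in) != len(rho_out):
        raise ValueError("densities must be sampled on the same grid")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(rho_in, rho_out)]

mixed = linear_mix([1.0, 2.0], [2.0, 4.0], alpha=0.3)
```

Smaller values of `alpha` damp oscillations between successive densities at the cost of more iterations.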

[0062] Furthermore, if the spatial integral difference is less than or equal to a preset convergence threshold, the convergence state of the simulation system is determined to be self-consistent convergence, which can further trigger the main processor to calculate and determine the physical property parameters of the simulation system. These physical property parameters may include total energy, electronic density of states, band structure, effective mass, dielectric function, and magnetic moment.
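The two branches of the convergence judgment can be put together in a small driver loop. This is only a sketch under stated assumptions: `solve_step` is a hypothetical callable standing in for one full round trip (effective potential, wave function iteration on the coprocessor, density update), the residual is the L2 norm on a uniform grid, and simple mixing is used on the non-converged branch. The toy step at the end is a contraction with a known fixed point and has no physical meaning.

```python
import math

def run_scf(solve_step, rho0, dv, threshold, alpha, max_iter=200):
    """Iterate until the density residual is at or below the threshold.

    Returns (density, iterations_used, converged).
    """
    rho_in = list(rho0)
    for iteration in range(1, max_iter + 1):
        rho_out = solve_step(rho_in)
        residual = math.sqrt(
            sum((o - i) ** 2 for o, i in zip(rho_out, rho_in)) * dv
        )
        if residual <= threshold:
            # Self-consistent: property calculation would start here.
            return rho_out, iteration, True
        # Not self-consistent: mix and start the next round.
        rho_in = [(1.0 - alpha) * i + alpha * o for i, o in zip(rho_in, rho_out)]
    return rho_in, max_iter, False

def toy_step(rho):
    """Stand-in for a real update; its fixed point is 2.0 at every point."""
    return [0.5 * x + 1.0 for x in rho]

density, iterations, converged = run_scf(
    toy_step, [0.0, 0.0], dv=1.0, threshold=1e-8, alpha=0.5
)
```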

[0063] Specifically, when the convergence state of the simulated system is self-consistent convergence, the returned wave function iterative calculation results can be simulated and analyzed to obtain the analysis results of the simulated system. Based on the analysis results, the physical property parameters of the simulated system can be calculated using statistical physics theory methods.

[0064] The statistical physics methods include the Fermi-Dirac distribution, used for calculating electron occupation, and conductivity estimation based on Boltzmann transport theory.
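As an illustration of the electron occupation calculation, the sketch below evaluates the Fermi-Dirac function and finds, by bisection, the chemical potential at which the occupations sum to the required electron count. The function names, the spin degeneracy of 2, and the bracketing interval are assumptions of this sketch; the application does not specify how the Fermi level is determined.

```python
import math

def fermi_dirac(energy, mu, kT):
    """Occupation probability 1 / (1 + exp((E - mu) / kT))."""
    x = (energy - mu) / kT
    if x > 700.0:   # avoid overflow in exp; occupation is effectively 0
        return 0.0
    if x < -700.0:  # occupation is effectively 1
        return 1.0
    return 1.0 / (1.0 + math.exp(x))

def find_fermi_level(eigenvalues, n_electrons, kT, degeneracy=2.0, steps=100):
    """Bisect for mu such that degeneracy * sum_i f(E_i) = n_electrons."""
    lo = min(eigenvalues) - 50.0 * kT
    hi = max(eigenvalues) + 50.0 * kT
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        count = degeneracy * sum(fermi_dirac(e, mid, kT) for e in eigenvalues)
        if count < n_electrons:
            lo = mid  # too few electrons: raise the chemical potential
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [-1.0, 1.0]
mu = find_fermi_level(levels, n_electrons=2.0, kT=0.1)
occupations = [2.0 * fermi_dirac(e, mu, 0.1) for e in levels]
```

For two levels placed symmetrically about zero and two electrons, the chemical potential sits midway between the levels, which gives a simple check on the bisection.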

[0065] Therefore, this application utilizes a collaborative architecture between the main processor and the coprocessor to offload computationally intensive tasks, such as wave function iteration and construction of the Hamiltonian matrix and overlap matrix, to a coprocessor with in-memory computing capabilities. This avoids frequent data transfer between main memory and the processor, effectively alleviating the memory wall bottleneck. Furthermore, by leveraging the coprocessor's customizable, pipelined parallel hardware, it achieves efficient concurrent computation of basis function integrals and matrix elements, overcoming the parallelism limits of the main processor's general-purpose scheduling. In addition, the main processor focuses on irregular logic such as convergence judgment and task scheduling, while the coprocessor handles the highly regular core loops. Through the high-speed interface and pipelined cooperation, resource allocation and overall hardware utilization are optimized, significantly reducing energy consumption for large-scale density functional theory calculations while ensuring computational accuracy.

[0066] Please refer to Figure 4, which is a schematic diagram of a density functional theory calculation device provided in an embodiment of this application. This application also provides a density functional theory calculation device that can implement the above-mentioned density functional theory calculation method and is applied to a main processor. The device includes: a data acquisition module 410, used to acquire the initial parameters of the simulated system in response to a density functional theory calculation task command; a data processing module 420, used to preprocess the initial parameters of the simulated system and obtain a data packet through encoding and encapsulation; a data transmission module 430, used to transmit the data packet to the coprocessor, so that the coprocessor performs a wave function iterative calculation process based on the data packet and in combination with the on-chip core memory of the coprocessor, obtains the wave function iterative calculation result, and returns the wave function iterative calculation result to the main processor; and a parameter determination module 440, used to complete the density functional theory calculation based on the returned wave function iterative calculation result and determine the physical property parameters of the simulated system.

[0067] It is understood that the content of the above method embodiments is applicable to the present device embodiments. The specific functions implemented by the present device embodiments are the same as those of the above method embodiments, and the beneficial effects achieved are also the same as those achieved by the above method embodiments.

[0068] Please refer to Figure 5, which is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of this application. The electronic device includes a processor 501, a memory 502, an input/output interface 503, a communication interface 504, and a bus 505. The processor 501 can be implemented using a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs to implement the technical solutions provided in the embodiments of this application. The memory 502 can be implemented as a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 502 can store the operating system and other applications. When the technical solutions provided in the embodiments of this specification are implemented through software or firmware, the relevant program code is stored in the memory 502 and is called by the processor 501 to execute the methods described in the embodiments of this application. The input/output interface 503 is used to implement information input and output. The communication interface 504 is used to enable communication between this device and other devices; communication can be achieved through wired means (such as USB or network cable) or wireless means (such as mobile network, Wi-Fi, or Bluetooth). The bus 505 transmits information between the components of the device (e.g., the processor 501, memory 502, input/output interface 503, and communication interface 504). The processor 501, memory 502, input/output interface 503, and communication interface 504 are connected to each other within the device via the bus 505.

[0069] This application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method.

[0070] It is understood that the content of the above method embodiments is applicable to this storage medium embodiment. The specific functions implemented in this storage medium embodiment are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments.

[0071] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the above-described method.

[0072] It is understood that the content of the above method embodiments is applicable to the embodiments of this program product. The specific functions implemented by the embodiments of this program product are the same as those of the above method embodiments, and the beneficial effects achieved are also the same as those achieved by the above method embodiments.

[0073] Memory, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs and non-transitory computer-executable programs. Furthermore, memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory may optionally include memory remotely located relative to the processor, and these remote memories can be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

[0074] This application provides a density functional theory calculation method and related equipment. By adopting a main processor and coprocessor collaborative hardware architecture design, the task boundaries of different processors are clearly defined. By combining the on-chip core storage of the coprocessor to perform wave function iterative calculation, the frequent data transfer between the computing and storage units is effectively reduced, thereby significantly reducing data transfer overhead, greatly shortening the computing cycle of large-scale systems, and reducing computing power consumption.

[0075] The embodiments described in this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided by the embodiments of this application. As those skilled in the art will know, with the evolution of technology and the emergence of new application scenarios, the technical solutions provided by the embodiments of this application are also applicable to similar technical problems.

[0076] Those skilled in the art will understand that the technical solutions shown in the figures do not constitute a limitation on the embodiments of this application, and may include more or fewer steps than shown, or combine certain steps, or different steps.

[0077] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.

[0078] Those skilled in the art will understand that all or some of the steps in the methods disclosed above, as well as the functional modules / units in the systems and devices, can be implemented as software, firmware, hardware, or suitable combinations thereof.

[0079] The terms “first,” “second,” “third,” “fourth,” etc. (if present) in the specification and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this application described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms “comprising” and “having,” and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0080] It should be understood that in this application, "at least one (item)" means one or more, and "a plurality of" means two or more. "And/or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and/or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character "/" generally indicates that the preceding and following related objects are in an "or" relationship. "At least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one (item) of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.

[0081] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units described above is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.

[0082] The units described above as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0083] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0084] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes multiple instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing programs, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0085] The preferred embodiments of the present application have been described above with reference to the accompanying drawings, but this does not limit the scope of the claims of the present application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and substance of the embodiments of the present application shall be within the scope of the claims of the present application.

Claims

1. A density functional theory calculation method, characterized in that the method is applied to a main processor and includes the following steps: in response to a density functional theory calculation task command, acquiring the initial parameters of the simulated system; preprocessing the initial parameters of the simulated system and obtaining a data packet through encoding and encapsulation; transmitting the data packet to the coprocessor, so that the coprocessor performs a wave function iterative calculation process based on the data packet and in combination with the on-chip core memory of the coprocessor, obtains the wave function iterative calculation result, and returns the wave function iterative calculation result to the main processor; and, based on the returned wave function iterative calculation result, completing the density functional theory calculation to determine the physical property parameters of the simulated system.

2. The method according to claim 1, characterized in that preprocessing the initial parameters of the simulated system and obtaining a data packet through encoding and encapsulation includes: calculating the effective potential of the simulated system based on the initial parameters of the simulated system, wherein the initial parameters include the initial charge density and the initial wave function; solving the Kohn-Sham equation for an initial solution based on the effective potential of the simulated system, to obtain the wave function and eigenvalues; and encoding the initial parameters, wave function, and eigenvalues of the simulated system, and obtaining the data packet through data encapsulation.

3. The method according to claim 2, characterized in that calculating the effective potential of the simulated system based on the initial parameters of the simulated system includes: calculating the external potential based on the atomic nucleus coordinates and the number of electrons in the simulated system; calculating the Hartree potential and the exchange-correlation potential based on the initial charge density; and constructing the effective potential of the simulated system by superimposing the external potential, the Hartree potential, and the exchange-correlation potential.

4. The method according to claim 1, characterized in that transmitting the data packet to the coprocessor, so that the coprocessor performs a wave function iterative calculation process based on the data packet and in combination with the on-chip core memory of the coprocessor, obtains the wave function iterative calculation result, and returns the wave function iterative calculation result to the main processor, includes: transmitting the data packet to the storage resources of the coprocessor via a transmission interface, wherein the storage resources include on-chip core memory and/or off-chip main memory; the coprocessor reading the effective potential and the initial wave function into the storage resources; the coprocessor constructing a Hamiltonian operator based on the effective potential and applying the Hamiltonian operator to the initial wave function; and, based on the result of that operation, executing an iterative optimization algorithm to update and orthogonalize the wave function until a preset wave function convergence condition is met, obtaining the wave function iterative calculation result, and returning the wave function iterative calculation result to the main processor through the transmission interface.

5. The method according to claim 4, characterized in that the iterative optimization algorithm is any one of the following: the conjugate gradient method, the Davidson algorithm, the LOBPCG algorithm, or a quasi-Newton method.

6. The method according to claim 1, characterized in that completing the density functional theory calculation based on the returned wave function iterative calculation result to determine the physical property parameters of the simulated system includes: updating the charge density of the simulated system based on the returned wave function iterative calculation result, combined with the eigenvalues obtained by solving the Kohn-Sham equation; performing a self-consistent field iterative convergence judgment based on the charge density of the simulated system before and after the update; and, when the convergence state of the simulated system is self-consistent convergence, calculating the physical property parameters of the simulated system.

7. The method according to claim 6, characterized in that performing the self-consistent field iterative convergence judgment based on the charge density of the simulated system before and after the update includes: calculating the spatial integral difference based on the charge density of the simulated system before and after the update; if the spatial integral difference is less than or equal to a preset convergence threshold, determining that the convergence state of the simulated system is self-consistent convergence; and, if the spatial integral difference is greater than the preset convergence threshold, determining that the convergence state of the simulated system is non-self-consistent convergence, and performing a weighted mixture of the updated charge density and the historical charge density according to a preset mixing algorithm to generate the initial charge density for the next round of iteration and trigger the next round of wave function iterative calculation.

8. The method according to claim 6, characterized in that calculating the physical property parameters of the simulated system when the convergence state of the simulated system is self-consistent convergence includes: performing simulation analysis on the returned wave function iterative calculation result to obtain the analysis results of the simulated system; and, based on the analysis results, calculating the physical property parameters of the simulated system using statistical physics methods.

9. A density functional theory calculation device, characterized in that the device is applied to a main processor and includes: a data acquisition module, used to acquire the initial parameters of the simulated system in response to a density functional theory calculation task command; a data processing module, used to preprocess the initial parameters of the simulated system and obtain a data packet through encoding and encapsulation; a data transmission module, used to transmit the data packet to the coprocessor, so that the coprocessor performs a wave function iterative calculation process based on the data packet and in combination with the on-chip core memory of the coprocessor, obtains the wave function iterative calculation result, and returns the wave function iterative calculation result to the main processor; and a parameter determination module, used to complete the density functional theory calculation based on the returned wave function iterative calculation result and determine the physical property parameters of the simulated system.

10. An electronic device, characterized in that the electronic device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to implement the method according to any one of claims 1 to 8.