Organic crystal structure prediction method, device, equipment and storage medium
By using a machine learning framework built with generative adversarial networks and graph convolutional networks, organic crystal structures are generated and screened, solving the problem of low prediction efficiency of organic crystal structures in existing technologies and achieving efficient and accurate prediction of organic crystal structures.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- UNIV OF MACAU
- Filing Date
- 2023-01-04
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies for predicting organic crystal structures are impractical, inefficient, computationally complex, and have long computation cycles.
A machine learning framework is built using generative adversarial networks and graph convolutional networks. Organic crystal structures are generated through generative models, and index parameters of stable organic crystal structures are predicted using discriminative models. The calculated index parameters and predicted index parameters are combined for screening and sorting, thereby reducing computational complexity.
It effectively reduces the computational complexity of predicting organic crystal structures, improves prediction efficiency, ensures prediction accuracy, and has high practicality.
Figure CN116130018B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of organic crystal structure prediction technology, and more specifically, to an organic crystal structure prediction method, apparatus, device, and storage medium.
Background Technology
[0002] Organic crystal structure refers to the microscopic molecular packing form of solid-state organic compounds, and it is one of the most studied solid-state characteristics. Crystal structure has a direct or indirect impact on the physicochemical properties and biological effects of solid substances; therefore, exploring the crystal structure of organic compounds has become a research hotspot in recent years.
[0003] Since experimental methods for crystal screening often fail to consistently yield stable crystal structures, methods that use computer technology for crystal structure prediction have emerged, such as crystal structure prediction based on quantum mechanical calculations. This approach generates initial trial crystal structures through random search, grid search, genetic algorithms, Monte Carlo simulated annealing, or other generative methods, and then predicts stable crystal structures by calculating the lattice energy of these trial structures.
[0004] However, the above methods have extremely high computational complexity and long computation cycles, thus they are impractical and inefficient for predicting organic crystal structures.
Summary of the Invention
[0005] The purpose of this application is to address the shortcomings of the prior art by providing an organic crystal structure prediction method, apparatus, device, and storage medium, so as to solve the problems of poor practicality and low efficiency in the prior art for organic crystal structure prediction.
[0006] To achieve the above objectives, the technical solution adopted in this application is as follows:
[0007] In a first aspect, this application provides a method for predicting the structure of organic crystals, the method comprising:
[0008] Obtaining the molecular structure of the target compound, and determining the condition vector of the target compound based on the feature descriptor of the target compound;
[0009] Inputting the condition vector into a pre-trained generative model to generate at least one organic crystal structure;
[0010] Determining the calculated index parameters of each of the organic crystal structures;
[0011] Inputting the molecular structure of the target compound into a pre-trained discriminant model to obtain the predicted index parameters of the stable organic crystal structure corresponding to the target compound;
[0012] Obtaining the stable organic crystal structure based on the calculated index parameters, the predicted index parameters, the preset index parameter difference threshold, and each organic crystal structure.
[0013] Optionally, the generative model is a generative adversarial network;
[0014] The step of inputting the conditional vector into a pre-trained generative model to generate at least one organic crystal structure includes:
[0015] The condition vector is input into the generative adversarial network, and the generator of the generative adversarial network generates at least one organic crystal structure based on the condition vector.
[0016] Optionally, the discriminant model is a graph convolutional network, which includes multiple graph convolutional layers and a prediction layer connected in sequence;
[0017] The step of inputting the molecular structure of the target compound into a pre-trained discriminant model to obtain prediction index parameters of the stable organic crystal structure corresponding to the target compound includes:
[0018] The molecular structure is input into the first graph convolutional layer, and according to the connection order of each graph convolutional layer, it is processed sequentially by each graph convolutional layer to obtain the fragment features of the molecular structure.
[0019] The fragment features are input into the prediction layer to obtain the prediction index parameters of the stable organic crystal structure.
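The layer-by-layer processing described above can be illustrated with a minimal sketch. It assumes a plain graph convolution of the form ReLU(Â·H·W) with symmetric normalization and a mean-pooling readout followed by a linear prediction layer; the source does not specify the layer formula, and the attention mechanism mentioned elsewhere in this application is not modeled. All function names and weights are hypothetical.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so every atom keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def graph_conv_layer(a_norm, features, weights):
    """One graph convolutional layer: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]

def predict_index(adj, features, layer_weights, readout_weights):
    """Pass atom features through each layer in order, then predict one value."""
    a_norm = normalize_adjacency(adj)
    h = features
    for w in layer_weights:  # layers are applied in their connection order
        h = graph_conv_layer(a_norm, h, w)
    # Assumed readout: mean-pool atom features into one molecule-level vector.
    pooled = [sum(col) / len(h) for col in zip(*h)]
    # Assumed prediction layer: a single linear unit without bias.
    return sum(p * w for p, w in zip(pooled, readout_weights))
```

In a trained model the weights would come from fitting against known stable crystal structures; here they are only placeholders that make the data flow visible.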
[0020] Optionally, obtaining the stable organic crystal structure based on the calculated index parameters, the predicted index parameters, the preset index parameter difference threshold, and each of the organic crystal structures includes:
[0021] Based on the calculated index parameters, the predicted index parameters, and the preset index parameter difference threshold, multiple intermediate organic crystal structures are selected from multiple organic crystal structures.
[0022] The intermediate organic crystal structures are clustered to obtain at least one clustered organic crystal structure. The clustered organic crystal structures are then sorted according to the difference in index parameters of each clustered organic crystal structure. The stable organic crystal structure is obtained based on the sorting results.
[0023] Optionally, the step of selecting multiple intermediate organic crystal structures from multiple organic crystal structures based on the calculated index parameters, the predicted index parameters, and a preset index parameter difference threshold includes:
[0024] The difference in index parameters of the organic crystal structure is determined based on the calculated index parameters and the predicted index parameters.
[0025] Multiple intermediate organic crystal structures are selected from multiple organic crystal structures based on the difference between the index parameters and a preset threshold value for the difference between the index parameters.
[0026] Optionally, determining the condition vector of the target compound based on its feature descriptor includes:
[0027] The feature descriptor of the target compound is obtained, and the feature descriptor is discretized to obtain the condition vector of the target compound.
[0028] Optionally, before inputting the condition vector into the generative adversarial network (GAN), and before the generator of the GAN generates at least one organic crystal structure based on the condition vector, the process includes:
[0029] The sample condition vector and noise vector are input into the initial generative adversarial network (GAN). The generator in the initial GAN generates multiple experimental organic crystal structures. The experimental organic crystal structures and pre-labeled organic crystal structures are input into the discriminator in the initial GAN to determine the difference information between the experimental organic crystal structures and the pre-labeled organic crystal structures. The loss value of the initial GAN is determined based on the difference information. The GAN is iteratively corrected based on the loss value to obtain the final GAN.
[0030] In a second aspect, this application provides an organic crystal structure prediction device, the device comprising:
[0031] The acquisition module is used to: acquire the molecular structure of the target compound and determine the condition vector of the target compound based on the feature descriptor of the target compound;
[0032] The generation module is used to: input the condition vector into a pre-trained generative model to generate at least one organic crystal structure;
[0033] The determination module is used to: determine the calculated index parameters of each of the organic crystal structures;
[0034] The prediction module is used to: input the molecular structure of the target compound into a pre-trained discriminant model to obtain the predicted index parameters of the stable organic crystal structure corresponding to the target compound;
[0035] The filtering module is used to: obtain the stable organic crystal structure based on the calculated index parameters, the predicted index parameters, the preset index parameter difference threshold, and each organic crystal structure.
[0036] Optionally, the generative model is a generative adversarial network;
[0037] Optionally, the generation module is specifically used for:
[0038] The condition vector is input into the generative adversarial network, and the generator of the generative adversarial network generates at least one organic crystal structure based on the condition vector.
[0039] Optionally, the discriminant model is a graph convolutional network, which includes multiple graph convolutional layers and a prediction layer connected in sequence;
[0040] The prediction module is specifically used for:
[0041] The molecular structure is input into the first graph convolutional layer, and according to the connection order of each graph convolutional layer, it is processed sequentially by each graph convolutional layer to obtain the fragment features of the molecular structure.
[0042] The fragment features are input into the prediction layer to obtain the prediction index parameters of the stable organic crystal structure.
[0043] Optionally, the filtering module is specifically used for:
[0044] Based on the calculated index parameters, the predicted index parameters, and the preset index parameter difference threshold, multiple intermediate organic crystal structures are selected from the multiple organic crystal structures.
[0045] The intermediate organic crystal structures are clustered to obtain at least one clustered organic crystal structure. The clustered organic crystal structures are then sorted according to the difference in index parameters of each clustered organic crystal structure. The stable organic crystal structure is obtained based on the sorting results.
[0046] Optionally, the filtering module is further specifically used for:
[0047] The difference in index parameters of the organic crystal structure is determined based on the calculated index parameters and the predicted index parameters.
[0048] Multiple intermediate organic crystal structures are selected from the multiple organic crystal structures based on the difference between the index parameters and a preset threshold value for the difference between the index parameters.
[0049] Optionally, the acquisition module is further specifically used for:
[0050] The feature descriptor of the target compound is obtained, and the feature descriptor is discretized to obtain the condition vector of the target compound.
[0051] In a third aspect, this application provides an electronic device, including: a processor, a storage medium, and a bus, wherein the storage medium stores machine-readable instructions executable by the processor, and when the electronic device is running, the processor communicates with the storage medium via the bus, and the processor executes the machine-readable instructions to perform the steps of the organic crystal structure prediction method described above.
[0052] In a fourth aspect, this application provides a computer-readable storage medium storing a computer program, which, when executed by a processor, performs the steps of the organic crystal structure prediction method described above.
[0053] The beneficial effects of this application are as follows. The generative model can explore more organic crystal structures, and compared with existing methods that match against databases, it greatly reduces computational complexity. Compared with existing methods that directly sort the matched organic crystal structures based on index parameters, this application uses a discriminative model to predict the index parameters of stable organic crystal structures, and then filters and sorts the organic crystal structures based on the predicted and calculated index parameters. This effectively reduces computational load and quickly identifies stable organic crystal structures from among multiple candidates. In summary, the method of this application predicts organic crystal structures by coupling a generative model and a discriminative model. The generation and screening of organic crystal structures are implemented entirely by machine learning algorithms, which effectively reduces computational complexity while ensuring prediction accuracy and gives the method high practicality.
Attached Figure Description
[0054] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.
[0055] Figure 1 is a schematic diagram of the architecture of a machine learning framework provided in an embodiment of this application;
[0056] Figure 2 is a flowchart of an organic crystal structure prediction method provided in an embodiment of this application;
[0057] Figure 3 is a flowchart of a method for determining prediction index parameters provided in an embodiment of this application;
[0058] Figure 4 is a flowchart of determining prediction index parameters in a graph convolutional network provided in an embodiment of this application;
[0059] Figure 5 is a flowchart of a method for determining a stable organic crystal structure provided in an embodiment of this application;
[0060] Figure 6 is a flowchart of a method for determining an intermediate organic crystal structure provided in an embodiment of this application;
[0061] Figure 7 is a schematic diagram of the structure of an organic crystal structure prediction device provided in an embodiment of this application;
[0062] Figure 8 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.
Detailed Implementation
[0063] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. It should be understood that the accompanying drawings in this application are for illustrative and descriptive purposes only and are not intended to limit the scope of protection of this application. Furthermore, it should be understood that the schematic drawings are not drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of this application. It should be understood that the operations in the flowcharts may not be implemented in sequence, and steps without logical contextual relationships may be reversed or implemented simultaneously. In addition, those skilled in the art, guided by the content of this application, may add one or more other operations to the flowcharts, or remove one or more operations from the flowcharts.
[0064] Furthermore, the described embodiments are merely some, not all, of the embodiments of this application. The components of the embodiments of this application described and illustrated herein can typically be arranged and designed in various different configurations. Therefore, the following detailed description of the embodiments of this application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely to illustrate selected embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without inventive effort are within the scope of protection of this application.
[0065] It should be noted that the term "comprising" will be used in the embodiments of this application to indicate the presence of the features declared thereafter, but does not exclude the addition of other features.
[0066] Predicting crystal structures using computer technology is a widely used method for screening experimental crystals. Its ultimate goal is to explore all stable crystal structures starting from molecular structures.
[0067] Based on the assumption that thermodynamically stable crystal structures possess relatively lower energies, the general steps for predicting crystal structures using computer technology are as follows. First, the conformational preferences of the target molecule are explored. Initial trial crystal structures are then generated through random search, grid search, genetic algorithms, Monte Carlo simulated annealing, or other generative methods. Subsequently, structure optimization or energy minimization is performed to obtain a set of stable trial crystal structures. These structures are then ranked using some form of energy-based metric to obtain a polymorph energy landscape. Finally, the polymorphism, or the probability of obtaining a stable crystal structure, is assessed based on the energy landscape. Traditional methods for predicting organic crystal structures not only require lengthy computational cycles, but the optimization of each trial structure also requires expensive computational resources.
[0068] It is evident that current methods for predicting organic crystal structures suffer from long computation cycles and high computational complexity.
[0069] While existing technologies have proposed generating inorganic crystal structures using machine learning models, the prediction of polymorphism in organic compounds differs significantly from the prediction of crystal structures in inorganic materials. For example, the crystal transformation of inorganic materials requires high-temperature and high-pressure environments, such as thousands of degrees Celsius and tens of atmospheres of pressure, while organic materials can undergo crystal transformation at room temperature and pressure. Inorganic materials are highly ordered because the interatomic forces in inorganic crystals are directional, while the stacking of organic compounds involves non-directional intermolecular forces and is influenced by molecular conformation. Furthermore, compared to inorganic materials, organic materials typically have a greater number and variety of atoms and a more complex chemical environment, leading to an exponential increase in the difficulty of predicting their crystal structures. Therefore, methods for predicting the crystal structure of inorganic materials cannot be applied to the prediction of organic crystal structures.
[0070] To address the aforementioned problems, this application proposes a method for predicting the crystal structure of organic compounds. A machine learning framework is constructed using generative and discriminative models, and this framework is then used to predict the crystal structure of organic compounds, effectively improving the efficiency of structure prediction.
[0071] Figure 1 is an architectural schematic of the machine learning framework proposed in this application. As shown in Figure 1, the machine learning framework includes: an organic compound crystal structure generative model, such as an Organic Crystal Structure Conditional Generative Adversarial Network (OCGAN), and a discriminative model with an attention mechanism, such as a Molecular Graph Convolutional Network with Attention Mechanism (MolGAT).
[0072] The generative model, constrained by the structural features of the input compound, generates a space of potential crystal structures (characterized by crystal structure parameters). In parallel, a discriminative model predicts the index parameters of the stable crystal structure of the organic compound at room temperature. The discriminative model takes a two-dimensional molecular graph as input and achieves high prediction accuracy for these index parameters. The predicted index parameters are then used as the basis for screening and ranking the potential crystal structures, which yields a ranking of recommended crystal structures and determines the predicted stable organic crystal structures.
[0073] First, the process by which organic compounds form crystals is explained. Single-component or multi-component compounds can arrange themselves to form organic crystals. A compound typically first crystallizes into a thermodynamically metastable structure and, because of a high nucleation barrier, may not go on to transform into the thermodynamically stable structure. Such metastable structures can transform into stable crystal structures under specific crystallization conditions. This application addresses how to predict the stable crystal structure of a compound.
[0074] Next, the method of this application is further explained with reference to Figure 2. The method is executed by any electronic device with computing power, which predicts organic crystal structures based on the machine learning framework shown in Figure 1. As shown in Figure 2, the method includes:
[0075] S201: Obtain the molecular structure of the target compound and determine the condition vector of the target compound based on its feature descriptor.
[0076] Optionally, the target compound can be a single-component compound or a multi-component compound. A single-component compound can form various organic crystal structures; for example, carbamazepine molecules can form various organic crystals. A multi-component compound can form inclusion complex-type organic crystals.
[0077] Optionally, the feature descriptors of the target compound can include the molecular mass, the three-dimensional structure of the molecule, shape features, and so on. For example, these descriptors can be calculated from the simplified molecular-input line-entry system (SMILES) string of the compound molecule.
[0078] Optionally, the condition vector of the target compound can be a discrete vectorized representation of the feature descriptor.
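As an illustration of how a feature descriptor might be discretized into a condition vector, the sketch below bins each numeric descriptor and one-hot encodes the bin. The binning scheme, the bin edges, and the function names are assumptions made for illustration; the source only states that the descriptor is discretized.

```python
from bisect import bisect_right

def discretize(value, edges):
    """Return a one-hot list marking the bin that `value` falls into.

    `edges` must be sorted; n edges define n + 1 bins.
    """
    index = bisect_right(edges, value)
    one_hot = [0] * (len(edges) + 1)
    one_hot[index] = 1
    return one_hot

def condition_vector(descriptors, bin_edges):
    """Concatenate the one-hot encodings of all descriptors.

    Descriptor names are visited in sorted order so that the layout of
    the vector is the same for every compound.
    """
    vector = []
    for name in sorted(bin_edges):
        vector.extend(discretize(descriptors[name], bin_edges[name]))
    return vector

# Illustrative input: a molecular mass of 236.27 g/mol and hypothetical bin edges.
example = condition_vector(
    {"molecular_mass": 236.27},
    {"molecular_mass": [100.0, 200.0, 300.0]},
)
```

With these hypothetical edges, the mass falls into the third of four bins, so `example` marks position 2 of the vector.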
[0079] Optionally, the molecular structure can be a three-dimensional, two-dimensional, or one-dimensional molecular structure of the compound. Taking a two-dimensional molecular structure as an example, it can be represented as an undirected graph. Nodes can represent atoms of the compound, and edges can represent bonds connecting the atoms.
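The undirected-graph representation of a two-dimensional molecular structure can be sketched as follows. The adjacency-set layout and the function name are illustrative choices, not taken from the source; atoms are identified by their position in the input list.

```python
def build_molecular_graph(atoms, bonds):
    """Build an undirected graph: nodes are atom indices, edges are bonds.

    `atoms` is a list of element symbols and `bonds` is a list of
    (i, j) index pairs. Each bond is stored in both directions.
    """
    adjacency = {index: set() for index in range(len(atoms))}
    for i, j in bonds:
        if i not in adjacency or j not in adjacency:
            raise ValueError(f"bond ({i}, {j}) refers to an unknown atom")
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency

# Heavy atoms of ethanol (C-C-O), hydrogens omitted for brevity.
ethanol = build_molecular_graph(["C", "C", "O"], [(0, 1), (1, 2)])
```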
[0080] S202: Input the condition vector into the pre-trained generative model to generate at least one organic crystal structure.
[0081] Optionally, the generative model can be any model or algorithm capable of predicting organic crystal structures, such as an autoregressive (AR) model, a hidden Markov model (HMM), an expectation-maximization (EM) algorithm, a variational autoencoder (VAE) framework, etc. This application does not limit the specific algorithm or model used to implement this step.
[0082] Optionally, the generative model can generate at least one organic crystal structure, constrained by the condition vector of the input compound. The organic crystal structure can be represented as a crystal structure vector and stored as tabular data.
[0083] Optionally, after the generative model generates multiple organic crystal structures, post-processing can be performed on each generated organic crystal structure to obtain the at least one organic crystal structure that is finally output, so as to exclude generated data that does not conform to the rules of crystal structure.
[0084] For example, the generated organic crystal structures can be filtered according to the condition vector to determine the at least one organic crystal structure that is finally output.
[0085] Optionally, an organic crystal structure can be represented by its crystal structure parameters: the space group, the Z value, the unit cell edge lengths a, b, and c, and the unit cell angles α, β, and γ. The space group describes the lattice symmetry, the Z value gives the number of molecules in the unit cell, and the six lattice parameters describe the degrees of freedom of the smallest repeating unit of the crystal structure. Together, these parameters describe the physical size and geometry of the organic crystal.
[0086] S203: Determine the calculated index parameters of each organic crystal structure.
[0087] Optionally, the calculated index parameters can be any structural or energy parameter of the organic crystal, such as density, lattice energy, free energy, three-dimensional atomic coordinates, bond length, bond angle, dihedral angle, etc. The index parameters need to be related to the stability of the organic crystal structure (e.g., there is a direct or inverse proportional relationship), and this application does not impose any restrictions on this.
[0088] For example, structural parameters such as density, bond length, bond angle, and dihedral angle can be calculated based on the definitions of each parameter, while energy parameters can be calculated based on first principles.
[0089] Taking the density of organic crystals as an example, the density rule has shown that organic crystal structures with higher densities are more stable. Therefore, density can be used as an indicator parameter for screening stable organic crystal structures. After obtaining at least one organic crystal structure from the output of the generative model in step S202 above, the electronic device can calculate the density of each organic crystal structure separately.
[0090] For example, different calculation methods can be used for crystal structures of different crystal systems. Taking the triclinic, monoclinic, and orthorhombic crystal systems as examples, the unit cell volume (Å³) and density (g/cm³) of organic crystal structures generated in these three crystal systems can be calculated using the following formulas:
[0091] Triclinic: $V = abc\sqrt{1 - \cos^2\alpha - \cos^2\beta - \cos^2\gamma + 2\cos\alpha\cos\beta\cos\gamma}$ (1)
[0092] Monoclinic: $V = abc\sin\beta$ (2)
[0093] Orthorhombic: $V = abc$ (3)
[0094] $\text{Calculated density} = \dfrac{nM}{N_A V}$ (4)
[0095] Where V is the unit cell volume, a, b, c are the unit cell edge lengths, α, β, γ are the unit cell angles, n is the number of molecules in the unit cell, M is the molecular weight, and $N_A$ is Avogadro's constant. To obtain the density in g/cm³ from a volume in Å³, V is first converted using 1 Å³ = 10⁻²⁴ cm³.
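The volume and density formulas can be checked with a short sketch. It uses the general triclinic expression for all three crystal systems, since the monoclinic and orthorhombic formulas are its special cases for α = γ = 90° and α = β = γ = 90°. The function names and the example cell are illustrative assumptions.

```python
import math

AVOGADRO = 6.02214076e23   # Avogadro's constant, 1/mol
A3_TO_CM3 = 1e-24          # 1 cubic angstrom expressed in cm^3

def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic angstroms; edges in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    radicand = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    if radicand <= 0.0:
        raise ValueError("angles do not describe a physically possible cell")
    return a * b * c * math.sqrt(radicand)

def crystal_density(n_molecules, molar_mass, volume_a3):
    """Density in g/cm^3 from n molecules of molar mass M (g/mol) per cell."""
    return n_molecules * molar_mass / (AVOGADRO * volume_a3 * A3_TO_CM3)

# Hypothetical orthorhombic cell, 10 x 10 x 10 angstroms, with 4 molecules
# of molar mass 236.27 g/mol: the density comes out at roughly 1.57 g/cm^3.
volume = cell_volume(10.0, 10.0, 10.0)
density = crystal_density(4, 236.27, volume)
```

Using one general function avoids a per-system branch, at the cost of tiny floating-point deviations for right angles because cos(90°) is not exactly zero in floating point.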
[0096] S204: Input the molecular structure of the target compound into the pre-trained discriminant model to obtain the prediction index parameters of the stable organic crystal structure corresponding to the target compound.
[0097] Optionally, the discriminant model can be any model or algorithm that can predict the stable crystal structure of a compound based on its molecular structure.
[0098] For example, the discriminative model can be the graph convolutional network mentioned above, or a deep neural network (recurrent neural network (RNN), convolutional neural network (CNN), graph neural network (GNN)), multilayer perceptron, ensemble learning methods (random forest RF, gradient boosting tree XGBoost, lightweight gradient boosting tree LightGBM, etc.), decision tree, support vector machine, linear model, k-nearest neighbor algorithm, Bayesian algorithm, etc. This application does not limit the specific algorithm or model used to implement this step.
[0099] Optionally, the prediction index parameters can be any structural or energy parameter of an organic crystal, such as density, lattice energy, free energy, three-dimensional atomic coordinates, bond length, bond angle, and dihedral angle. This application does not impose any restrictions on these parameters.
[0100] It is worth noting that the predicted index parameters and the aforementioned calculated index parameters should refer to the same property of the organic crystal, for example both density or both bond angle. There can be one or more calculated index parameters, and they should correspond one-to-one with the organic crystal structures output by the generative model; the predicted index parameter represents the index parameter of the stable organic crystal structure, and there can be just one.
[0101] For example, when the index parameter is density, the discriminant model can predict the density of a stable organic crystal structure based on the molecular structure of the compound.
[0102] S205: Based on the calculated index parameters, predicted index parameters, preset index parameter difference thresholds, and each organic crystal structure, a stable organic crystal structure is obtained.
[0103] Optionally, the preset threshold for the difference between the index parameters can be a preset threshold for the difference between the calculated index parameters and the predicted index parameters. Based on the difference between the calculated index parameters and the predicted index parameters, as well as the threshold for the difference between the index parameters, the generated organic crystal structures can be screened.
[0104] For example, assuming the threshold for the difference of index parameters is 0-0.05, the predicted index parameter for a stable organic crystal structure is 0.5, and the calculated index parameter for organic crystal structure 1 is 0.52, then the difference between the calculated index parameter and the predicted index parameter for organic crystal structure 1 is 0.02. By comparing this with the threshold for the difference of index parameters, it can be determined that organic crystal structure 1 meets the threshold for the difference of index parameters and can be identified as a stable organic crystal structure.
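The worked example above can be reproduced with a minimal sketch. It assumes the difference is taken as an absolute value and that the threshold is an inclusive upper bound; the function name and the second candidate are hypothetical.

```python
def screen_structures(calculated, predicted, max_difference):
    """Keep structures whose calculated index is close to the predicted one.

    `calculated` maps a structure identifier to its calculated index
    parameter. Returns (identifier, difference) pairs for the structures
    whose absolute difference is within `max_difference`.
    """
    selected = []
    for structure_id, value in calculated.items():
        difference = abs(value - predicted)  # assumption: absolute difference
        if difference <= max_difference:
            selected.append((structure_id, difference))
    return selected

# Structure 1 from the text (0.52 against a prediction of 0.5) passes the
# 0.05 threshold; a hypothetical structure 2 at 0.60 does not.
kept = screen_structures(
    {"structure_1": 0.52, "structure_2": 0.60},
    predicted=0.5,
    max_difference=0.05,
)
```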
[0105] It is worth noting that, based on the calculated index parameters, the predicted index parameters, the preset index parameter difference threshold, and each organic crystal structure, the electronic device can first screen the organic crystal structures, then cluster and sort the screened structures, and finally recommend crystal structures based on the sorting results, thereby determining the stable organic crystal structures.
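One possible way to cluster and rank the screened structures is sketched below. The source does not name a clustering algorithm, so this is an assumption: structures whose six lattice parameters all agree within a relative tolerance are treated as the same cluster, and space group and Z value are ignored for brevity. Candidates are visited in order of increasing index parameter difference, so each cluster is represented by its best member and the representatives come out already ranked.

```python
def same_cluster(cell_a, cell_b, tolerance):
    """True if every lattice parameter agrees within a relative tolerance."""
    return all(
        abs(x - y) <= tolerance * max(abs(x), abs(y))
        for x, y in zip(cell_a, cell_b)
    )

def cluster_and_rank(candidates, tolerance=0.05):
    """Return one representative per cluster, ranked by index difference.

    Each candidate is a dict with a "cell" tuple (a, b, c, alpha, beta,
    gamma) and a "difference" between calculated and predicted index.
    """
    representatives = []
    for candidate in sorted(candidates, key=lambda c: c["difference"]):
        if not any(
            same_cluster(candidate["cell"], rep["cell"], tolerance)
            for rep in representatives
        ):
            representatives.append(candidate)
    return representatives

# Hypothetical screened structures: A and B are near-duplicates, C is distinct.
ranked = cluster_and_rank([
    {"name": "A", "cell": (7.5, 11.1, 13.9, 90.0, 92.9, 90.0), "difference": 0.02},
    {"name": "B", "cell": (7.6, 11.0, 14.0, 90.0, 93.0, 90.0), "difference": 0.01},
    {"name": "C", "cell": (5.2, 20.6, 22.2, 90.0, 90.0, 90.0), "difference": 0.03},
])
```

In this example B absorbs A because B has the smaller difference, and C forms its own cluster.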
[0106] Optionally, the crystal structure parameters of a stable organic crystal structure can be used as the characterization output of the stable organic crystal structure.
[0107] In this embodiment, the generative model first generates at least one organic crystal structure based on the condition vector of the compound, and the calculated index parameters of each organic crystal structure are determined. A discriminant model then determines the predicted index parameters of the stable organic crystal structure based on the molecular structure of the compound. Finally, based on the calculated index parameters, the predicted index parameters, the index parameter difference threshold, and each organic crystal structure, the stable organic crystal structure is obtained.
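The flow of steps S201 to S205 can be summarized in a short orchestration sketch. The models are passed in as plain callables, which are placeholders for the pre-trained generative and discriminant models; every name here is hypothetical, and the absolute-difference screening is an assumption.

```python
def predict_stable_structures(molecule, descriptors, to_condition_vector,
                              generate, calculate_index, predict_index,
                              max_difference):
    """Run S201-S205 with caller-supplied models and return ranked candidates."""
    condition = to_condition_vector(descriptors)            # S201
    candidates = generate(condition)                        # S202
    calculated = [calculate_index(c) for c in candidates]   # S203
    predicted = predict_index(molecule)                     # S204
    # S205: keep candidates close to the prediction, best match first.
    scored = [(abs(value - predicted), index)
              for index, value in enumerate(calculated)]
    kept = sorted(pair for pair in scored if pair[0] <= max_difference)
    return [candidates[index] for _, index in kept]

# Toy stand-ins: each "structure" is just its own density value.
result = predict_stable_structures(
    molecule="placeholder molecule",
    descriptors={"molecular_mass": 236.27},
    to_condition_vector=lambda d: [1],
    generate=lambda condition: [1.20, 1.50, 1.31, 1.28],
    calculate_index=lambda structure: structure,
    predict_index=lambda molecule: 1.30,
    max_difference=0.05,
)
```

Passing the models as callables keeps the sketch independent of any particular machine learning library.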
[0108] This application utilizes a generative model that goes beyond the existing templates for organic crystals in databases, exploring a wider range of organic crystal structures. Compared to existing methods that match within databases, this significantly reduces computational complexity. Instead of directly ranking matched organic crystal structures based on index parameters, this application uses a discriminative model to predict the index parameters of stable organic crystal structures. Then, it filters and ranks the organic crystal structures based on both the predicted and calculated index parameters. This effectively reduces computational load and quickly identifies stable organic crystal structures from a pool of options. In summary, this method predicts organic crystal structures by coupling generative and discriminative models. The generation and filtering processes are entirely implemented using machine learning algorithms, effectively reducing computational complexity while ensuring prediction accuracy, thus demonstrating high practicality.
[0109] When using a generative adversarial network as a generative model, the steps described above for inputting conditional vectors into a pre-trained generative model to generate at least one organic crystal structure include:
[0110] The conditional vector is input into the generative adversarial network (GAN), and the generator of the GAN generates at least one organic crystal structure based on the conditional vector.
[0111] Optionally, the generative adversarial network may include a generator and a discriminator. During the use phase of the generative adversarial network (i.e., generating organic crystal structures based on the condition vector of the compound), the generator may use the condition vector as a constraint to generate at least one organic crystal structure.
[0112] For example, the generator can generate multiple organic crystal structures of the target compound. Among these organic crystal structures, there may be organic crystal structures that do not conform to the crystal structure rules. In this case, the multiple generated organic crystal structures can be screened according to the condition vector to filter out the organic crystal structures that do not conform to the crystal structure rules, so as to finally obtain and output at least one organic crystal structure.
[0113] The following section uses graph convolutional networks as an example to explain the steps of the above discriminative model in determining the prediction index parameters based on the molecular structure.
[0114] A graph convolutional network can include multiple graph convolutional layers connected in sequence, as well as a prediction layer. As shown in Figure 3, the above step S204 can be specifically described as follows:
[0115] S301: Input the molecular structure into the first graph convolutional layer, and process it sequentially by each graph convolutional layer according to the connection order of each graph convolutional layer to obtain the fragment features of the molecular structure.
[0116] Figure 4 is a flowchart illustrating how the graph convolutional network described in this application processes a molecular structure to obtain the predicted index parameters. After the molecular structure is input into the graph convolutional layers, multiple fragment features of the molecular structure can be obtained. The sequentially connected graph convolutional layers can be viewed as feature extraction layers that extract multiple fragment features from the input molecular structure. For example, the molecular structure can be the aforementioned two-dimensional molecular structure.
[0117] Optionally, the graph convolutional layer can recursively pass information from neighboring atoms encoded by feature vectors and information from connection bonds encoded by feature vectors to the central atom on the molecular graph via recurrent convolution, aggregating the embedded information of its surrounding chemical environment. Ultimately, a representation of each central atom is learned, followed by state updates and readout operations for the central atom.
[0118] For example, the process of a graph convolutional layer learning the representation and embedding of atoms can be as follows:
[0119] To maintain the order of adjacent atoms, for each atom acting as the center atom, the feature vectors of its adjacent atoms and chemical bonds are concatenated as adjacent vectors. These adjacent vectors are then summed to form a message vector. Different shared convolutional filters are used to extract structural features for predicting crystal index parameters for different input information channels and node levels. This flexibility of attention-based graph convolutional networks allows for optimized automatic extraction of features for crystal index parameters. This message vector then undergoes atom update and output / readout phases. For example, the specific mathematical representation of this process can be:
[0120]
[0121]
[0122] where $m_v^{t+1}$ is the message of node v at time t+1, N(v) is the set of neighbors of v in the graph, $M_t$ is the message function at time t, $h_w^t$ is the hidden state of neighbor w at time t, $e_{v,w}$ represents the edge features, and || denotes the concatenation operation.
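A form of the message step consistent with the description in [0119] (concatenate each neighbor's hidden state with the connecting bond feature, then sum over the neighbors) is sketched below. It is an assumed reconstruction in standard message-passing notation, not necessarily the exact notation used in this application:

```latex
m_v^{t+1} = \sum_{w \in N(v)} M_t\left(h_w^{t},\, e_{v,w}\right),
\qquad
M_t\left(h_w^{t},\, e_{v,w}\right) = h_w^{t} \,\Vert\, e_{v,w}
```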
[0123] The aforementioned message vector can be used to update features derived from the central atom. Both the hidden vector and the message vector are linearly projected to maintain sufficient expressive power, extracting low-level inputs to high-level features. The transformed hidden vector is added to the transformed message vector, and then, utilizing the expressive power of the transition matrix, batch normalization and non-linear activation operations are performed to define the new hidden state of the atom. Here, only the atom features are updated to ensure high computational efficiency. The mathematical representation of this step can be:
[0124]
[0125]
[0126] where $U_t$ denotes the update function, $h_v^t$ represents the hidden state of node v at time t, ReLU is the ReLU activation function, and $W_0$ and $W_1$ are the learned weight matrices.
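A form of the update step consistent with the description in [0123] (linearly project the hidden vector and the message vector, add them, then apply batch normalization and a ReLU activation) is sketched below. The symbol BN for batch normalization and the assignment of $W_0$ to the hidden state and $W_1$ to the message are assumptions:

```latex
h_v^{t+1} = U_t\left(h_v^{t},\, m_v^{t+1}\right)
          = \mathrm{ReLU}\left(\mathrm{BN}\left(W_0\, h_v^{t} + W_1\, m_v^{t+1}\right)\right)
```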
[0127] Optionally, the fragment features of the molecular structure can be the substructure representations of the central atom output by the above operations. The molecular structure is input into the first graph convolutional layer, and each graph convolutional layer updates the data sequentially according to its connection order, yielding the atomic representations output by each layer. These atomic representations represent different substructure features of the central atom and can be used to predict molecular properties.
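The layer-by-layer processing described above can be illustrated with a minimal pure-Python sketch. The function names, the toy two-atom molecule, and the weight values are hypothetical, and batch normalization is left out for brevity, so this is an illustration of the idea rather than the network used in this application:

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def graph_conv_layer(hidden, bonds, w0, w1):
    """One message-passing step over all atoms.

    hidden: {atom: feature list}; bonds: {(v, w): bond feature list},
    stored once per direction. Batch normalization is omitted here.
    """
    updated = {}
    for v, h_v in hidden.items():
        # List concatenation: neighbor state followed by the bond feature.
        neighbor_vectors = [hidden[w] + e for (u, w), e in bonds.items() if u == v]
        # Sum the concatenated vectors into a single message vector.
        message = [sum(column) for column in zip(*neighbor_vectors)]
        # Project state and message, add them, then apply ReLU.
        projected = [a + b for a, b in zip(matvec(w0, h_v), matvec(w1, message))]
        updated[v] = [max(0.0, x) for x in projected]
    return updated


def fragment_features(hidden, bonds, layers):
    """Apply the layers in order and keep every layer's atom representations."""
    outputs = []
    for w0, w1 in layers:
        hidden = graph_conv_layer(hidden, bonds, w0, w1)
        outputs.append(hidden)
    return outputs


# Toy two-atom molecule: 2-dimensional atom features, 1-dimensional bond feature.
atoms = {0: [1.0, 0.0], 1: [0.0, 1.0]}
bonds = {(0, 1): [1.0], (1, 0): [1.0]}
w0 = [[1.0, 0.0], [0.0, 1.0]]
w1 = [[1.0, 0.0, 0.0], [0.0, 1.0, -5.0]]
per_layer = fragment_features(atoms, bonds, layers=[(w0, w1), (w0, w1)])
```

Each entry of `per_layer` holds the atom representations after one layer, so deeper entries summarize a larger neighborhood around each central atom.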
[0128] S302: Input the fragment features into the prediction layer to obtain the prediction index parameters of the stable organic crystal structure.
[0129] Continuing to refer to Figure 4, the multiple fragment features are combined using an attention mechanism, and the combined fragment features are then input into a prediction layer composed of multiple sequentially connected fully connected layers to finally obtain the predicted index parameters.
[0130] Optionally, the prediction layer may include multiple fully connected layers. Before the prediction layer, an attention mechanism can be used to capture the different importance of each fragment feature in determining the crystal index parameters, identify the important substructure features that determine the crystal index parameters, and combine all learned fragment features through weighted summation, thereby making the fragment features globally dependent. The combined fragment features are then input into the sequentially connected fully connected layers for processing, and finally the index parameters of the stable organic crystal structure of the compound are read out.
[0131] For example, the attention mechanism can be implemented as follows: a linear transformation is performed on the message vector, followed by batch normalization before activation, and the output atom representations of each layer are extracted. To distinguish the importance of the output representations of different atoms, attention weighting can be used to obtain the aggregation of the global graph vectors. Specifically, after four layers of graph convolution operations, an attention mechanism is used to extract atom representations, assigning importance weights to each central atom fragment, and then obtaining the global graph embedding through weighted summation. The mathematical representation of this process can be:
[0132]
[0133] Where g is the graph vector, R is the readout function, G is the entire undirected graph, and T is the time step.
[0134]
[0135]
[0136] where softmax is the softmax activation function, $W_2$ is the learned weight matrix, $H^T$ represents the transpose of matrix H, and $d_H$ is the size of matrix H.
[0137] After combining the features of each fragment through an attention mechanism, the combined fragment features are input into a fully connected layer, which can then read out the index parameters. The fully connected layer can output the index parameters of the stable organic crystal structure of the compound based on the combined fragment features.
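The attention-weighted combination of fragment features can be sketched as follows. This is a minimal illustration under stated assumptions: the scoring vector `w` stands in for a learned weight matrix, the division by the square root of the feature size is an assumed scaling, and all names and values are hypothetical:

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_readout(atom_features, w):
    """Return (attention weights, graph vector) for per-atom fragment features.

    atom_features: one feature list per central atom.
    w: scoring vector standing in for a learned weight matrix (assumption).
    """
    dim = len(w)
    # Score each atom, scaled by the square root of the feature size.
    scores = [sum(wi * hi for wi, hi in zip(w, h)) / math.sqrt(dim)
              for h in atom_features]
    weights = softmax(scores)
    # Weighted sum of the atom features gives the global graph embedding.
    graph_vector = [sum(a * h[k] for a, h in zip(weights, atom_features))
                    for k in range(dim)]
    return weights, graph_vector


atom_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, g = attention_readout(atom_features, w=[2.0, 0.0])
```

Atoms whose features align with the scoring vector receive larger weights, so they contribute more to the graph vector passed to the fully connected layers.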
[0138] It is worth noting that, generally speaking, the number of index parameters of the stable organic crystal structure in the final output should be one. However, polymorphic compounds have multiple stable crystal structures. Therefore, the average value of the index parameters of the stable crystal structure of the polymorphic compound can be used as the final output.
[0139] In this embodiment, the radius of the substructure segment of the central atom can be determined by using the output segment features of the graph convolutional layer and identifying the maximum output activation value of the most important central atom through an attention mechanism. Compared with the traditional circular fingerprint method that focuses on substructures with a constant radius, the graph convolutional network offers greater flexibility in extracting substructures. Furthermore, by considering the influence of other atoms on each central atom, it ensures the extraction of distinguishable atomic features, thereby effectively controlling computational complexity.
[0140] After the predicted index parameters and calculated index parameters are obtained as described above, the stable organic crystal structure can be obtained based on the calculated index parameters, predicted index parameters, preset index parameter difference threshold, and each organic crystal structure. As shown in Figure 5, the above step S205 includes:
[0141] S501: Based on the calculated index parameters, predicted index parameters, and preset index parameter difference thresholds, select multiple intermediate organic crystal structures from multiple organic crystal structures.
[0142] Optionally, based on the calculated index parameters and the predicted index parameters, the difference between the index parameters of the organic crystal structure generated by the generative model and the stable organic crystal structure can be obtained. By comparing this difference with the preset index parameter difference threshold, organic crystal structures with excessively large differences (that is, those that are significantly different from the stable organic crystal structure) can be filtered out.
[0143] Optionally, the intermediate organic crystal structure can be an organic crystal structure whose difference satisfies a preset threshold for the difference of an index parameter.
[0144] S502: Cluster the intermediate organic crystal structures to obtain at least one clustered organic crystal structure, sort the clustered organic crystal structures according to the difference in index parameters of each clustered organic crystal structure, and obtain the stable organic crystal structure according to the sorting result.
[0145] It should be noted that during crystal structure generation, a newly generated crystal structure may duplicate one that was generated earlier. This leads to structural redundancy and reduces the diversity of the generated structures. Therefore, similar crystal structures can be clustered so that only unique structures are retained.
[0146] For example, two structures can be determined to be the same structure by directly comparing their crystal structure parameters: they have the same space group; they have the same Z value; each pair of corresponding cell lengths a, b, and c differs by less than 10%; and each pair of corresponding cell angles α, β, and γ differs by less than 10%. If two or more intermediate organic crystal structures meet these conditions, they can be clustered as the same structure, retaining only one unique structure.
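The comparison rule above can be sketched as a simple greedy de-duplication. The `Cell` class, the function names, and the example cell values are hypothetical. Measuring the 10% tolerance against the larger of the two values is also an assumption, since the text does not state the reference value:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    space_group: str
    z: int
    lengths: tuple  # (a, b, c)
    angles: tuple   # (alpha, beta, gamma) in degrees


def within(x, y, tol=0.10):
    """True if x and y differ by less than tol relative to the larger value."""
    return abs(x - y) < tol * max(abs(x), abs(y))


def same_structure(p, q, tol=0.10):
    """Apply the four criteria: space group, Z value, cell lengths, cell angles."""
    return (p.space_group == q.space_group
            and p.z == q.z
            and all(within(x, y, tol) for x, y in zip(p.lengths, q.lengths))
            and all(within(x, y, tol) for x, y in zip(p.angles, q.angles)))


def deduplicate(cells, tol=0.10):
    """Keep the first representative of each group of matching structures."""
    unique = []
    for cell in cells:
        if not any(same_structure(cell, kept, tol) for kept in unique):
            unique.append(cell)
    return unique


# Illustrative values only.
a = Cell("P21/c", 4, (7.0, 12.0, 9.5), (90.0, 101.0, 90.0))
b = Cell("P21/c", 4, (7.2, 11.8, 9.6), (90.0, 99.5, 90.0))
c = Cell("P-1", 2, (7.0, 12.0, 9.5), (90.0, 101.0, 90.0))
unique = deduplicate([a, b, c])
```

Here `b` is merged with `a` because every criterion is met, while `c` is kept because its space group and Z value differ.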
[0147] It is worth noting that when determining the lattice parameters (the length and angle parameters) of a crystal structure, crystal structure augmentation is required for each generated crystal structure. Because the same crystal structure may be classified as dissimilar owing to a different edge labeling order or coordinate system, augmentation is used to account for these alternatives: a single crystal structure is expanded into six equivalent crystal structures with different coordinate systems, improving the reliability of clustering similar crystal structures.
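One way to read the six equivalent structures is as the 3! = 6 possible relabelings of the cell axes. The sketch below enumerates them under that assumption. The function name is hypothetical, and effects of relabeling on handedness and on the space-group setting are ignored:

```python
from itertools import permutations


def augment_cell(lengths, angles):
    """Return the six axis relabelings of a unit cell.

    lengths = (a, b, c); angles = (alpha, beta, gamma), where each angle is
    the one opposite its axis (alpha lies between b and c), so lengths and
    angles are permuted with the same index order.
    """
    return [(tuple(lengths[i] for i in order), tuple(angles[i] for i in order))
            for order in permutations(range(3))]


variants = augment_cell((7.0, 12.0, 9.5), (90.0, 101.0, 95.0))
```

Two generated structures can then be treated as matching if any relabeling of one matches the other.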
[0148] Optionally, after clustering the intermediate organic crystal structures, the clustered organic crystal structures can be sorted according to the difference in their index parameters, and the stable organic crystal structure can be obtained based on the sorting result. For example, the organic crystal structure with the smallest difference in index parameters after clustering can be taken as the stable organic crystal structure.
[0149] The following explains in detail the step in S501 above of selecting multiple intermediate organic crystal structures from multiple organic crystal structures based on the calculated index parameters, predicted index parameters, and preset index parameter difference threshold. As shown in Figure 6, the above step S501 includes:
[0150] S601: Determine the difference in index parameters for organic crystal structures based on calculated index parameters and predicted index parameters.
[0151] Optionally, the index parameter difference can be the difference between the predicted index parameter and the calculated index parameter.
[0152] For example, assuming the index parameter is the crystal structure density, the index parameter difference, denoted Δdensity, can be calculated as follows:
[0153] Δdensity = Abs(Corrected MolGAT density - Calculated density)
[0154] Here, the Corrected MolGAT density represents the density of the stable organic crystal structure predicted by the discriminant model, and the Calculated density represents the density of the organic crystal structure generated by the generative model.
[0155] S602: Select multiple intermediate organic crystal structures from multiple organic crystal structures based on the difference between the index parameters and the preset threshold value of the difference between the index parameters.
[0156] Optionally, the differences in each index parameter can be filtered based on the threshold value of the index parameter difference. Organic crystal structures with excessively large index parameter differences (i.e., those that differ significantly from stable organic crystal structures) can be filtered out, and multiple organic crystal structures whose index parameter differences meet the threshold value of the index parameter difference can be identified as intermediate organic crystal structures.
[0157] For example, assuming the index parameter is the density of the organic crystal structure and the threshold for the index parameter difference is set to 0-0.05 g/cm³ according to screening experiments, crystal structures with Δdensity less than 0.05 g/cm³ can be retained and regarded as intermediate organic crystal structures.
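Steps S601 and S602, followed by the ranking by index parameter difference, can be sketched as below. The function name and the density values are hypothetical, and the clustering step that comes between screening and ranking is left out:

```python
def screen_by_density(calculated, predicted_density, max_delta=0.05):
    """Filter generated structures by density difference and rank them.

    calculated: {structure name: calculated density in g/cm^3}.
    Returns (name, delta) pairs with delta < max_delta, smallest first.
    """
    deltas = {name: abs(predicted_density - density)
              for name, density in calculated.items()}
    kept = [(name, delta) for name, delta in deltas.items() if delta < max_delta]
    return sorted(kept, key=lambda item: item[1])


# Illustrative densities for four generated structures.
generated = {"A": 1.31, "B": 1.42, "C": 1.36, "D": 1.20}
ranked = screen_by_density(generated, predicted_density=1.35)
```

Structures B and D are discarded because their density difference exceeds the threshold, and C ranks ahead of A because its calculated density is closer to the predicted one.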
[0158] In this embodiment, organic crystal structures are screened by predicting index parameters, calculating index parameters, and using a threshold for the difference between index parameters. This filters out organic crystal structures that differ significantly from stable organic crystal structures, thereby quickly narrowing down the range of organic crystal structures. By clustering similar organic crystal structures and sorting the clustered structures according to the difference in index parameters, the speed of organic crystal structure prediction can be improved. Compared to the prior art of calculating free energy to determine stable organic crystal structures, this application's sorting based on the difference in index parameters effectively reduces the computational load.
[0159] The following is a description of the steps for determining the condition vector of the target compound based on its feature descriptor. Step S201 includes:
[0160] The feature descriptor of the target compound is obtained, and the feature descriptor is discretized to obtain the condition vector of the target compound.
[0161] For example, an electronic device can store the characteristic descriptors of the target compound in the form of a table and discretize the data in the table.
[0162] Optionally, the Tabular GAN framework can be used to discretize the feature descriptors. Tabular GAN represents each discrete column as a one-hot vector, and each continuous column as a one-hot vector indicating the mode together with a scalar representing the value within that mode. For each continuous column, a Variational Gaussian Mixture Model (VGM) is used to estimate the number of modes and fit a Gaussian mixture distribution, yielding the parameters of each mode. Each continuous value is thus normalized into a scalar and a vector: a mode is sampled according to the probability density of each mode at that value, and the value is normalized within the sampled mode. In this way the condition vector of the target compound is obtained and the data are discretized.
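The encoding of one continuous value can be sketched as follows. The mixture parameters are given by hand rather than fitted with a VGM, the division by four standard deviations is an assumed scaling convention, and all names and numbers are hypothetical:

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def encode_continuous(x, modes, rng):
    """Encode a continuous value as (scalar, one-hot mode vector).

    modes: list of (weight, mean, std), as a fitted mixture would provide.
    """
    densities = [w * gaussian_pdf(x, mu, sd) for w, mu, sd in modes]
    total = sum(densities)
    probabilities = [d / total for d in densities]
    # Sample a mode according to its probability at x.
    k = rng.choices(range(len(modes)), weights=probabilities)[0]
    _, mu, sd = modes[k]
    # Normalize the value within the sampled mode (assumed 4-sigma scaling).
    scalar = (x - mu) / (4 * sd)
    one_hot = [1 if i == k else 0 for i in range(len(modes))]
    return scalar, one_hot


# Hand-picked two-mode mixture for a descriptor column (illustrative).
modes = [(0.6, 1.2, 0.05), (0.4, 1.6, 0.08)]
scalar, one_hot = encode_continuous(1.25, modes, random.Random(0))
```

Concatenating the scalar and the one-hot vector of every column then gives a fixed-length vector of the kind used as the condition vector.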
[0163] In the embodiments of this application, by discretizing the feature descriptor of the target compound to obtain the condition vector of the target compound, combinatorial explosion caused by excessive feature correlation can be avoided, effectively improving the efficiency and accuracy of generating organic crystal structures.
[0164] Before obtaining multiple organic crystal structures by inputting the conditional vectors into the pre-trained generative adversarial network, the generative adversarial network can be trained first. The training steps include:
[0165] The sample condition vector and noise vector are input into the initial generative adversarial network (GAN). The generator in the initial GAN generates multiple sample organic crystal structures. The sample organic crystal structures and the pre-labeled organic crystal structures are then input into the discriminator in the initial GAN to determine the difference information between the sample organic crystal structures and the pre-labeled organic crystal structures. Based on the difference information, the loss value of the initial GAN is determined. The GAN is then iteratively corrected based on the loss value to obtain the final GAN.
[0166] In this application, the original dataset can be obtained from the Cambridge Structural Database (CSD). The dataset is preprocessed to address issues such as class imbalance in discrete data and distribution imbalance in continuous data, exclude data that does not meet crystallographic rules, exclude datasets with missing data, and obtain a dataset for training.
[0167] It is worth noting that the dataset in this application can be split using a three-dataset splitting strategy: training the model on a training subset, tuning the model's hyperparameters on a validation subset, and testing the model's performance on a test subset. The test subset can utilize a marketed drug testing dataset to improve the practicality and reliability of the model training.
[0168] Each organic crystal structure in the dataset can be represented by crystal structure parameters.
[0169] Optionally, the sample condition vector can be a feature descriptor of the target compound obtained by the electronic device from the dataset, and the noise vector can be a Gaussian noise vector (which can be a random value).
[0170] Noise vectors and conditional vectors are combined and input into the generator to produce an organic crystal structure vector. The real and generated crystal structure vectors are alternately input into the discriminator, which simultaneously evaluates the difference in class proportions between the discrete columns of the generated and real data, as well as the distance between the generated and real data. Subsequently, the gradient of each network weight is calculated through backpropagation, and this process is iterated using a gradient descent optimization algorithm to obtain the generative adversarial network (GAN).
[0171] Optionally, the pre-labeled organic crystal structure can be the actual crystal structure vector of the target compound. The difference information can be the difference in the category proportions of discrete columns between the generated and actual data, and the distance between the generated and actual data.
[0172] As one possible implementation, the difference information can be directly used as the loss value, and the generative adversarial network can be iteratively corrected based on the loss value to obtain the generative adversarial network.
[0173] As another possible implementation, the loss value can be determined based on the difference information, and the generative adversarial network can be iteratively corrected based on the loss value to obtain the generative adversarial network.
[0174] In this embodiment, the generator and discriminator continuously adjust their parameters by competing against each other, eventually approaching a Nash equilibrium. The generator thereby learns to generate realistic organic crystal structures that the discriminator cannot distinguish from real ones, improving the reliability of the generated organic crystal structures.
[0175] Quantum chemical calculation methods require solving physical equations, so their computational complexity is far greater than that of the machine learning model proposed in this application. For reference, a traditional quantum chemical calculation takes more than 10⁷ seconds per run on conventional hardware. Based on multiple experiments and calculations, the method proposed in this application takes about 10² seconds on the same conventional hardware, consuming about 10,000 times fewer computational resources than existing technologies and greatly saving computational resources.
[0176] Based on the same inventive concept, this application also provides an organic crystal structure prediction device corresponding to the organic crystal structure prediction method. Since the principle of the device in this application is similar to the organic crystal structure prediction method described above in this application, the implementation of the device can refer to the implementation of the method, and the repeated parts will not be described again.
[0177] Figure 7 is a schematic diagram of an organic crystal structure prediction device provided in an embodiment of this application. The device includes: an acquisition module 701, a generation module 702, a determination module 703, a prediction module 704, and a screening module 705, wherein:
[0178] The acquisition module 701 is used to: acquire the molecular structure of the target compound and determine the condition vector of the target compound based on the feature descriptor of the target compound;
[0179] The generation module 702 is used to: input the conditional vector into the pre-trained generation model to generate at least one organic crystal structure;
[0180] The determination module 703 is used to: determine the calculated index parameters of each organic crystal structure;
[0181] The prediction module 704 is used to: input the molecular structure of the target compound into the pre-trained discrimination model to obtain the prediction index parameters of the stable organic crystal structure corresponding to the target compound;
[0182] The screening module 705 is used to obtain a stable organic crystal structure based on the calculated index parameters, the predicted index parameters, the preset index parameter difference threshold, and each organic crystal structure.
[0183] Optionally, the generative model can be a generative adversarial network, and the generation module 702 is specifically used for:
[0184] The conditional vector is input into the generative adversarial network (GAN), and the generator of the GAN generates at least one organic crystal structure based on the conditional vector.
[0185] Optionally, the discriminant model can be a graph convolutional network, which includes: multiple graph convolutional layers connected in sequence and a prediction layer;
[0186] Prediction module 704 is specifically used for:
[0187] The molecular structure is input into the first graph convolutional layer, and according to the connection order of each graph convolutional layer, it is processed by each graph convolutional layer in turn to obtain the fragment features of the molecular structure.
[0188] The fragment features are input into the prediction layer to obtain the prediction index parameters of stable organic crystal structures.
[0189] Optionally, the screening module 705 is specifically used for:
[0190] Based on the calculated index parameters, predicted index parameters, and preset index parameter difference thresholds, multiple intermediate organic crystal structures are selected from multiple organic crystal structures.
[0191] Intermediate organic crystal structures are clustered to obtain at least one clustered organic crystal structure. The clustered organic crystal structures are then sorted according to the differences in index parameters of each clustered organic crystal structure, and the stable organic crystal structure is obtained based on the sorting results.
[0192] Optionally, the screening module 705 is also specifically used for:
[0193] The difference in index parameters for organic crystal structures is determined based on calculated and predicted index parameters.
[0194] Multiple intermediate organic crystal structures are selected from multiple organic crystal structures based on the difference between the index parameters and the preset threshold value of the difference between the index parameters.
[0195] Optionally, the acquisition module 701 is also specifically used for:
[0196] The feature descriptor of the target compound is obtained, and the feature descriptor is discretized to obtain the condition vector of the target compound.
[0197] Optionally, the device may also include a training module, specifically used for:
[0198] The sample condition vector and noise vector are input into the initial generative adversarial network (GAN). The generator in the initial GAN generates multiple sample organic crystal structures. The sample organic crystal structures and the pre-labeled organic crystal structures are then input into the discriminator in the initial GAN to determine the difference information between the sample organic crystal structures and the pre-labeled organic crystal structures. Based on the difference information, the loss value of the initial GAN is determined. The GAN is then iteratively corrected based on the loss value to obtain the final GAN.
[0199] The processing flow of each module in the device and the interaction flow between each module can be referred to the relevant descriptions in the above method embodiments, and will not be detailed here.
[0200] This application's embodiments utilize a generative model that extends beyond existing organic crystal templates in databases, exploring a wider range of organic crystal structures. Compared to existing database-based matching methods, this significantly reduces computational complexity. Instead of directly ranking matched organic crystal structures based on index parameters, this application uses a discriminative model to predict the index parameters of stable organic crystal structures. Then, it filters and ranks the organic crystal structures based on the predicted and calculated index parameters, effectively reducing computational load and quickly identifying stable organic crystal structures from multiple sources. In summary, this application's method predicts organic crystal structures by coupling generative and discriminative models. The generation and filtering process is entirely implemented using machine learning algorithms, effectively reducing computational complexity while ensuring prediction accuracy, thus demonstrating high practicality.
[0201] This application also provides an electronic device. Figure 8 is a schematic structural diagram of an electronic device provided in an embodiment of this application. The electronic device includes: a processor 81, a memory 82, and a bus. The memory 82 stores machine-readable instructions executable by the processor 81 (e.g., the execution instructions corresponding to the acquisition module 701, generation module 702, determination module 703, prediction module 704, and screening module 705 of the device in Figure 7). When the electronic device is running, the processor 81 and the memory 82 communicate via the bus. When the machine-readable instructions are executed by the processor 81, the above-mentioned organic crystal structure prediction method is performed.
[0202] This application also provides a computer-readable storage medium storing a computer program, which, when run by a processor, executes the steps of the above-described organic crystal structure prediction method.
[0203] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems and devices described above can be referred to the corresponding processes in the method embodiments, and will not be repeated here. In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. Furthermore, multiple modules or components can be combined or integrated into another system, or some features can be ignored or not executed. Another point is that the displayed or discussed mutual coupling or direct coupling or communication connection can be through some communication interfaces; the indirect coupling or communication connection of devices or modules can be electrical, mechanical, or other forms.
[0204] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. If the functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0205] The above are merely specific embodiments of this application, but the scope of protection of this application is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application.
Claims
1. A method for predicting an organic crystal structure, characterized in that it comprises: obtaining the molecular structure of a target compound, and determining a condition vector of the target compound based on a feature descriptor of the target compound; inputting the condition vector into a pre-trained generative model to generate at least one organic crystal structure; determining calculated index parameters for each of the organic crystal structures; inputting the molecular structure of the target compound into a pre-trained discriminative model to obtain predicted index parameters of the stable organic crystal structure corresponding to the target compound, wherein the calculated index parameters and the predicted index parameters are related to the stability of the organic crystal structure, both the calculated index parameters and the predicted index parameters include structural parameters and energy parameters, the structural parameters include at least the following: density, three-dimensional atomic coordinates, bond length, bond angle, and dihedral angle, and the energy parameters include at least the following: lattice energy and free energy; and obtaining the stable organic crystal structure based on the calculated index parameters, the predicted index parameters, a preset index parameter difference threshold, and each organic crystal structure.
2. The method according to claim 1, characterized in that the generative model is a generative adversarial network; and the step of inputting the condition vector into a pre-trained generative model to generate at least one organic crystal structure includes: inputting the condition vector into the generative adversarial network, wherein the generator of the generative adversarial network generates at least one organic crystal structure based on the condition vector.
3. The method according to claim 1, characterized in that the discriminative model is a graph convolutional network comprising multiple graph convolutional layers and a prediction layer connected in sequence; and the step of inputting the molecular structure of the target compound into a pre-trained discriminative model to obtain predicted index parameters of the stable organic crystal structure corresponding to the target compound includes: inputting the molecular structure into the first graph convolutional layer and processing it sequentially through each graph convolutional layer, in the order in which the layers are connected, to obtain fragment features of the molecular structure; and inputting the fragment features into the prediction layer to obtain the predicted index parameters of the stable organic crystal structure.
4. The method according to claim 1, characterized in that the step of obtaining the stable organic crystal structure based on the calculated index parameters, the predicted index parameters, the preset index parameter difference threshold, and each of the organic crystal structures includes: selecting multiple intermediate organic crystal structures from multiple organic crystal structures based on the calculated index parameters, the predicted index parameters, and the preset index parameter difference threshold; clustering the intermediate organic crystal structures to obtain at least one clustered organic crystal structure; sorting the clustered organic crystal structures according to the index parameter difference of each clustered organic crystal structure; and obtaining the stable organic crystal structure based on the sorting result.
5. The method according to claim 4, characterized in that the step of selecting multiple intermediate organic crystal structures from multiple organic crystal structures based on the calculated index parameters, the predicted index parameters, and the preset index parameter difference threshold includes: determining the index parameter difference of each organic crystal structure based on the calculated index parameters and the predicted index parameters; and selecting multiple intermediate organic crystal structures from the multiple organic crystal structures based on the index parameter difference and the preset index parameter difference threshold.
6. The method according to claim 1, characterized in that the step of determining the condition vector of the target compound based on the feature descriptor of the target compound includes: obtaining the feature descriptor of the target compound, and discretizing the feature descriptor to obtain the condition vector of the target compound.
7. The method according to any one of claims 1-6, characterized in that, before the condition vector is input into the generative adversarial network and the generator of the generative adversarial network generates at least one organic crystal structure based on the condition vector, the method further includes: inputting a sample condition vector and a noise vector into an initial generative adversarial network, wherein the generator of the initial generative adversarial network generates multiple experimental organic crystal structures; inputting the experimental organic crystal structures and pre-labeled organic crystal structures into the discriminator of the initial generative adversarial network to determine difference information between the experimental organic crystal structures and the pre-labeled organic crystal structures; determining a loss value of the initial generative adversarial network based on the difference information; and iteratively correcting the initial generative adversarial network based on the loss value to obtain the final generative adversarial network.
8. An organic crystal structure prediction device, characterized in that it comprises: an acquisition module, used to acquire the molecular structure of a target compound and determine a condition vector of the target compound based on a feature descriptor of the target compound; a generation module, used to input the condition vector into a pre-trained generative model to generate at least one organic crystal structure; a determination module, used to determine calculated index parameters of each of the organic crystal structures; a prediction module, used to input the molecular structure of the target compound into a pre-trained discriminative model to obtain predicted index parameters of the stable organic crystal structure corresponding to the target compound, wherein the calculated index parameters and the predicted index parameters are related to the stability of the organic crystal structure, both the calculated index parameters and the predicted index parameters include structural parameters and energy parameters, the structural parameters include at least: density, three-dimensional atomic coordinates, bond length, bond angle, and dihedral angle, and the energy parameters include at least: lattice energy and free energy; and a filtering module, used to obtain the stable organic crystal structure based on the calculated index parameters, the predicted index parameters, a preset index parameter difference threshold, and each organic crystal structure.
9. An electronic device, characterized in that it comprises: a processor, a storage medium, and a bus, wherein the storage medium stores program instructions executable by the processor; when the electronic device is running, the processor communicates with the storage medium via the bus; and the processor executes the program instructions to perform the steps of the organic crystal structure prediction method as described in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, performs the steps of the organic crystal structure prediction method as described in any one of claims 1 to 7.
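
Illustrative note (not part of the claims): the screening, clustering, and sorting steps recited in claims 4 and 5 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation. The claims do not define the difference metric or the clustering algorithm, so the mean relative difference, the greedy density-based grouping, the rule that a candidate is kept when its difference does not exceed the threshold, and all names (Candidate, parameter_difference, screen, cluster_and_rank, density_tol) are hypothetical. The numeric values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    """A generated crystal structure with its calculated index parameters."""
    name: str
    calculated: Dict[str, float]


def parameter_difference(calculated: Dict[str, float],
                         predicted: Dict[str, float]) -> float:
    """Mean relative absolute difference over the parameters both dicts share.

    Assumption: the claims leave the metric open; predicted values are
    assumed to be non-zero.
    """
    keys = sorted(set(calculated) & set(predicted))
    if not keys:
        raise ValueError("no shared index parameters to compare")
    total = sum(abs(calculated[k] - predicted[k]) / abs(predicted[k])
                for k in keys)
    return total / len(keys)


def screen(candidates: List[Candidate],
           predicted: Dict[str, float],
           threshold: float) -> List[Tuple[float, Candidate]]:
    """Claim 5 sketch: keep candidates whose difference is within the threshold."""
    scored = [(parameter_difference(c.calculated, predicted), c)
              for c in candidates]
    return [(d, c) for d, c in scored if d <= threshold]


def cluster_and_rank(scored: List[Tuple[float, Candidate]],
                     density_tol: float) -> List[Candidate]:
    """Claim 4 sketch: merge near-duplicates, then rank by difference.

    Greedy stand-in for clustering: visit candidates from smallest to
    largest difference and start a new cluster only when the density is
    farther than density_tol from every existing representative.
    """
    representatives: List[Candidate] = []
    for _, cand in sorted(scored, key=lambda pair: pair[0]):
        density = cand.calculated["density"]
        if all(abs(density - r.calculated["density"]) > density_tol
               for r in representatives):
            representatives.append(cand)
    return representatives


# Made-up predicted and calculated values (density, lattice energy).
predicted = {"density": 1.40, "lattice_energy": -120.0}
candidates = [
    Candidate("A", {"density": 1.410, "lattice_energy": -119.0}),
    Candidate("B", {"density": 1.412, "lattice_energy": -118.0}),
    Candidate("C", {"density": 1.300, "lattice_energy": -110.0}),
    Candidate("D", {"density": 1.350, "lattice_energy": -117.0}),
]
kept = screen(candidates, predicted, threshold=0.05)
ranked = cluster_and_rank(kept, density_tol=0.01)
print([c.name for c in ranked])
```

With these made-up values, candidate C is removed by the threshold, B is merged into the cluster represented by A because their densities nearly coincide, and the script prints ['A', 'D'], with the first entry being the candidate closest to the predicted index parameters.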