Method, system, and medium for recommending traditional Chinese medicine compound prescriptions based on causal modeling

By combining generative adversarial networks and causal modeling, phenotypic data of traditional Chinese medicine (TCM) compound prescriptions are generated and causal inferences are made, which solves the problems of data scarcity and complex modeling in TCM compound prescription recommendation and achieves efficient and interpretable compound prescription recommendation.

CN122245647A · Pending · Publication Date: 2026-06-19 · UNIV OF ELECTRONICS SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA
Filing Date: 2026-03-20
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing methods for recommending traditional Chinese medicine compound prescriptions have limitations in accurately identifying multiple effects and mechanisms between drugs due to their multi-component and multi-target characteristics. HTS platforms struggle to capture synergistic effects between compound components, phenotypic data is scarce and difficult to generate, multimodal data fusion is complex, and recommendation systems lack causal judgment, resulting in poor interpretability.

Method used

By combining generative adversarial networks (GANs) to generate phenotypic data and introducing a causal inference mechanism, the main drugs in compound prescriptions are accurately identified through multimodal feature fusion and causal modeling. A standardized causal modeling framework is constructed, and compound prescription recommendations are made in conjunction with knowledge graphs.

Benefits of technology

It enhances the biological rationality and interpretability of traditional Chinese medicine compound recommendations, breaks through the single-component limitation of HTS experiments, realizes a high-throughput and highly reliable compound screening process, and significantly improves the accuracy and efficiency of the recommendation system.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a method, system, and medium for recommending traditional Chinese medicine (TCM) compound prescriptions based on causal modeling. The method includes: acquiring HTS data of TCM compound prescriptions and extracting multimodal features; aggregating the multimodal features to obtain an aggregated compound prescription feature representation; generating virtual phenotypic images based on generative adversarial networks (GANs) using phenotypic synthesis; constructing a standardized causal modeling framework based on the compound prescription feature representations and corresponding phenotypes; estimating the weighted causal effect of each herb in phenotypic changes based on the causal modeling framework to select a set of principal herbs; generating corresponding compound prescriptions for each principal herb using a completion mechanism based on the set of principal herbs; calculating a recommendation score based on the reliability score of the compound prescription and the weighted causal effect of the principal herbs; and obtaining the TCM compound prescription recommendation result based on the recommendation score. This invention accurately identifies the principal herbs in compound prescriptions while simultaneously addressing the problem of scarce phenotypic data in the field of TCM, thus improving the biological rationality and interpretability of the recommendation system.

Description

Technical Field

[0001] This invention relates to the field of traditional Chinese medicine compound prescription recommendation technology, specifically to a method, system, and medium for recommending traditional Chinese medicine compound prescriptions based on causal modeling.

Background Technology

[0002] With the development of phenomics and high-throughput screening (HTS) technologies, drug discovery is gradually shifting from a "target-oriented" model to a "phenotype-driven" (PDD) model. This is particularly true in the field of traditional Chinese medicine, where the multi-component and multi-target characteristics of compound prescriptions make them more suitable for understanding their mechanisms of action through holistic phenotypic responses.

[0003] In recent years, the HTS platform, combining multimodal data such as cell images, gene expression, and chemical structures, has provided a rich foundation for constructing "compound-phenotype" mapping relationships. Meanwhile, Generative Adversarial Networks (GANs) have shown increasing prominence in phenotypic data completion and simulation. For example, models combining Wasserstein GAN (WGAN) with transfer learning can synthesize high-quality, structurally controllable phenotypic data, effectively alleviating the problem of scarce experimental data. In intelligent recommendation, causal inference methods (such as GIES and IPW) have been gradually introduced into drug recommendation systems to improve the credibility and interpretability of recommendation results. Causal recommendation frameworks, represented by CIDGMed, have achieved significant results in tasks such as drug combination prediction.

[0004] Existing TCM compound prescription recommendation technologies mainly combine deep learning, graph neural networks (GNNs), and network pharmacology to recommend drugs or prescriptions. Deep learning helps identify potential drug effects by automatically analyzing the effects of drugs on symptoms. GNNs improve the accuracy of drug recommendations by constructing complex relationship graphs between medicinal materials and symptoms, especially in scenarios involving multiple drug interactions and multi-target diseases. Network pharmacology provides biological mechanism support by revealing the relationship between drugs and organisms. These methods, however, focus mainly on modeling the association between drugs and symptom descriptions. Because TCM compound prescriptions are multi-component and multi-target, existing methods remain limited in accurately identifying the multiple effects and mechanisms between drugs. Phenotypic drug discovery (PDD) is mature in the field of chemical drugs but is still developing in the field of TCM, where it faces particular challenges in accurately reproducing TCM "syndrome phenotypes" and in identifying the effects of multiple components.

[0005] In summary, the existing methods for recommending traditional Chinese medicine compound prescriptions have the following technical shortcomings:

[0006] 1. HTS is limited to single-component screening and cannot directly cover the compound compatibility mechanism: HTS platforms are mainly designed for evaluating the efficacy of single compounds, which makes it difficult to capture the synergistic effects between the components of a TCM compound prescription. The screening process is also complex and costly.

[0007] 2. Phenotypic data for compound prescriptions is scarce and difficult to generate: Because experimental studies on TCM components are complex and costly, many potential compound prescriptions lack sufficient phenotypic data, which prevents AI models from learning their mechanisms of action comprehensively. Traditional generation methods also struggle to ensure the reliability and biological interpretability of the synthesized data.

[0008] 3. HTS generates multimodal data that is difficult to integrate and complex to model: HTS experimental results cover cell images, gene expression profiles, and molecular structures. Existing methods struggle to integrate this heterogeneous information effectively and often fall back on simplified processing such as PCA, which reduces the accuracy of prediction and screening.

[0009] 4. Recommendation methods are based on correlation, lack causal judgment, and have poor interpretability: Most current recommendation systems rely on statistical correlation (such as collaborative filtering and co-occurrence frequency) to combine and rank components, ignoring the real causal relationship between components and phenotypes, which may lead to recommendation results that deviate from biological reality.

[0010] In view of the above, this application is hereby submitted.

Summary of the Invention

[0011] To address the challenges of missing phenotypic data, modeling difficulties, and a lack of reliable support in recommendation systems for compound traditional Chinese medicine (TCM) formulas in high-throughput experiments, this invention aims to provide a method, system, and medium for recommending TCM compound formulas based on causal modeling. By innovatively combining generative adversarial networks (GANs) to generate phenotypic data and introducing a causal inference mechanism, it integrates phenotypic generation and causal modeling to accurately identify the principal drugs in the compound formula. This approach simultaneously solves the problem of scarce phenotypic data in the field of TCM, improves the biological rationality and interpretability of the recommendation system, and thus provides a more scientific and innovative solution for recommending TCM compound formulas.

[0012] This invention is achieved through the following technical solution:

[0013] In a first aspect, the present invention provides a method for recommending traditional Chinese medicine compound prescriptions based on causal modeling, the method comprising:

[0014] Acquiring HTS data of traditional Chinese medicine compound prescriptions and extracting multimodal features; fusing the multimodal features to obtain a fused compound feature representation;

[0015] Generating a virtual phenotypic image from the compound feature representation using a generative adversarial network-based phenotypic synthesis method, wherein the synthesis method employs an improved Wasserstein GAN structure combined with a conditional input mechanism, so that virtual phenotypic images are generated with the compound feature representation as the conditional input;

[0016] Constructing a standardized causal modeling framework based on the compound feature representation and the corresponding phenotypes; based on the causal modeling framework, estimating the weighted causal effect of each medicinal material on phenotypic changes and screening out the set of principal drugs with significant causal influence;

[0017] Based on the set of principal drugs, generating a corresponding compound prescription for each principal drug through a completion mechanism; calculating a recommendation score from the reliability score of the compound prescription and the weighted causal effect of the principal drug; and obtaining the traditional Chinese medicine compound prescription recommendation result based on the recommendation score.

[0018] Furthermore, multimodal features include image modal features, gene expression modal features, and molecular structure modal features;

[0019] Image modal features are cell images obtained after compound application and are directly used as compound-level features;

[0020] Gene expression modal features and molecular structure modal features are feature representations constructed based on single medicinal materials.

[0021] Furthermore, generating a virtual phenotypic image from the compound feature representation using the generative adversarial network-based phenotypic synthesis method includes:

[0022] The compound feature representation and a noise vector sampled from the latent space are jointly input into the generator network, and the generator network outputs the corresponding virtual phenotypic image.

[0023] Based on the virtual phenotypic image, an improved Wasserstein loss function with an added gradient penalty term is used to construct an objective optimization function. Solving the objective optimization function yields the phenotypic generator, which produces the virtual phenotype.

[0024] Furthermore, generating virtual phenotypic images from the compound feature representation using the generative adversarial network-based phenotypic synthesis method also includes:

[0025] A transfer learning strategy is employed to train the phenotypic generator and fine-tune its parameters. The training includes:

[0026] Pre-training phase: the phenotypic generator is trained on a source compound set $\mathcal{C}_{\text{src}}$ with known phenotypic data to obtain the parameter initialization $\theta_{\text{pre}} = \arg\min_{\theta} \mathcal{L}_{\text{WGAN}}(\theta; \mathcal{C}_{\text{src}})$, where $\theta_{\text{pre}}$ are the pre-training parameters of the phenotypic generator, and $\mathcal{L}_{\text{WGAN}}(\theta; \mathcal{C}_{\text{src}})$ is the Wasserstein generative adversarial network loss function constructed for the source compound set $\mathcal{C}_{\text{src}}$, used to optimize the model parameters of the phenotypic generator during the pre-training phase;

[0027] Fine-tuning phase: on the target compound set $\mathcal{C}_{\text{tgt}}$, $\theta_{\text{pre}}$ is used as the initial parameter values, and only some key parameters in the generator are updated while the remaining parameters stay unchanged, giving the fine-tuned model parameter set $\theta_{\text{ft}} = \text{FineTune}(\theta_{\text{pre}}; \mathcal{C}_{\text{tgt}})$, where $\theta_{\text{ft}}$ denotes the set of generator model parameters obtained from $\theta_{\text{pre}}$ through the fine-tuning phase, and the complete phenotypic generation model corresponding to this parameter set is used for the phenotypic generation of the target compound; $\text{FineTune}(\cdot)$ denotes the fine-tuning training process that updates only some key parameters on the target compound set.
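The partial update in the fine-tuning phase can be illustrated with a minimal sketch. The parameter names (`encoder.w`, `output.w`), the learning rate, the placeholder gradients, and the choice of which parameters count as "key" are all hypothetical; the source does not say which generator parameters are updated.

```python
import numpy as np

def fine_tune_step(params, grads, trainable, lr=0.01):
    """One gradient step that updates only the parameters named in `trainable`."""
    return {
        name: value - lr * grads[name] if name in trainable else value.copy()
        for name, value in params.items()
    }

# Hypothetical pre-trained generator parameters (stand-ins for the pre-training result).
theta_pre = {
    "encoder.w": np.ones((2, 2)),
    "output.w": np.ones((2, 2)),
}
# Placeholder gradients, standing in for gradients computed on the target compound set.
grads = {name: np.full_like(value, 0.5) for name, value in theta_pre.items()}

# Assumption: only the output layer is treated as a "key" parameter.
theta_ft = fine_tune_step(theta_pre, grads, trainable={"output.w"}, lr=0.1)
```

Frozen parameters are copied unchanged, so the pre-trained values stay available for comparison after fine-tuning.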

[0028] Furthermore, the objective optimization function $\mathcal{L}_{\text{WGAN-GP}}$ is expressed as:

[0029] $$\mathcal{L}_{\text{WGAN-GP}} = \mathbb{E}\left[D(\tilde{x}, c)\right] - \mathbb{E}\left[D(x, c)\right] + \lambda\, \mathbb{E}\left[\left(\left\lVert \nabla_{\hat{x}} D(\hat{x}, c) \right\rVert_2 - 1\right)^2\right];$$

[0030] where $\mathbb{E}[\cdot]$ represents the mathematical expectation of a variable; $D(\cdot)$ is the discriminator function, which determines whether the input phenotypic data comes from real HTS data; $c$ represents the compound feature representation; $x$ is a real experimental phenotypic image; $\tilde{x}$ is a virtual phenotypic image; $\hat{x}$ is an interpolated sample randomly sampled between $x$ and $\tilde{x}$; $\nabla_{\hat{x}} D(\hat{x}, c)$ is the gradient of the discriminator with respect to $\hat{x}$; $\left(\lVert \nabla_{\hat{x}} D(\hat{x}, c) \rVert_2 - 1\right)^2$ is the gradient penalty term, which constrains the Lipschitz continuity of the discriminator and prevents unstable model training; and $\lambda$ is the weight coefficient of the gradient penalty term, used to keep the discriminator gradient stable.
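A Wasserstein loss with gradient penalty can be sketched numerically as follows. This is a minimal illustration, not the patented network: the critic is assumed to be linear so that its gradient with respect to the input is available in closed form, whereas a real implementation would use a deep discriminator and automatic differentiation. All names and values are hypothetical.

```python
import numpy as np

def critic(x, c, w_x, w_c):
    """Linear critic D(x, c) = x . w_x + c . w_c (a stand-in for a deep network)."""
    return x @ w_x + c @ w_c

def wgan_gp_critic_loss(x_real, x_fake, c, w_x, w_c, lam=10.0, rng=None):
    """E[D(fake)] - E[D(real)] + lam * E[(||grad D(x_hat)||_2 - 1)^2]."""
    rng = np.random.default_rng() if rng is None else rng
    # Interpolate between real and generated samples, one epsilon per sample.
    eps = rng.uniform(size=(x_real.shape[0], 1))
    x_hat = eps * x_real + (1.0 - eps) * x_fake
    # For a linear critic the gradient w.r.t. x_hat equals w_x at every point.
    grad = np.broadcast_to(w_x, x_hat.shape)
    penalty = np.mean((np.linalg.norm(grad, axis=1) - 1.0) ** 2)
    wasserstein = (np.mean(critic(x_fake, c, w_x, w_c))
                   - np.mean(critic(x_real, c, w_x, w_c)))
    return wasserstein + lam * penalty

# Hypothetical flattened "images" (4 samples, 3 pixels) and condition vectors.
x_real = np.ones((4, 3))
x_fake = np.zeros((4, 3))
cond = np.zeros((4, 2))
w_c = np.zeros(2)

# A unit-norm critic gradient gives zero penalty; a norm of 5 is penalized.
loss_unit = wgan_gp_critic_loss(x_real, x_fake, cond, np.array([0.6, 0.8, 0.0]), w_c)
loss_steep = wgan_gp_critic_loss(x_real, x_fake, cond, np.array([3.0, 4.0, 0.0]), w_c)
```

The two calls show how the penalty discourages critics whose gradient norm drifts away from 1, which is the mechanism the gradient penalty term relies on to enforce Lipschitz continuity.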

[0031] Furthermore, constructing a standardized causal modeling framework based on the compound feature representation and the corresponding phenotypes, estimating the weighted causal effect of each medicinal material on phenotypic changes based on this framework, and selecting the set of principal drugs includes:

[0032] Based on the compound feature representation and the corresponding phenotypes, the medicinal material information in each compound prescription is represented as a binary matrix to construct a medicinal material usage matrix; the phenotypic data is uniformly represented as phenotypic response variables;

[0033] Based on the medicinal material usage matrix and the phenotypic response variables, a causal graph is constructed using an acyclic graph modeling method.

[0034] Based on the causal graph, the causal effect of each medicinal material in the compound prescription on phenotypic changes is estimated, the causal effect coefficient of each medicinal material on each phenotypic dimension is obtained, and a causal effect coefficient matrix is constructed.

[0035] Using the causal effect coefficient matrix and the medicinal material usage matrix, a matrix of phenotypic response variables for multiple compound prescriptions is obtained. The phenotypic response variable matrices are statistically summarized to characterize the comprehensive phenotypic response of each medicinal material in the multidimensional phenotypic space.

[0036] An inverse probability weighting mechanism is introduced at the sample level to correct the contribution of each sample and form a weighted causal effect estimate.

[0037] The weighted causal effect estimate is compared with a preset intervention effect threshold; all medicinal materials whose weighted causal effect estimate exceeds the threshold are selected, and the selected medicinal materials form the principal drug set.
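The weighting and thresholding steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the propensity of using a herb is estimated from its frequency within a single discrete covariate (`stratum`, hypothetical), the phenotypic response is one-dimensional, and the data and threshold are made up. The source does not specify how propensities are modeled.

```python
import numpy as np

def ipw_effect(treated, outcome, stratum):
    """Inverse-probability-weighted (Hajek) effect of one herb on a phenotype score."""
    treated = np.asarray(treated, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    stratum = np.asarray(stratum)
    # Propensity e(x) = P(herb used | stratum), estimated by within-stratum frequency.
    propensity = np.empty_like(treated)
    for s in np.unique(stratum):
        mask = stratum == s
        propensity[mask] = treated[mask].mean()
    propensity = np.clip(propensity, 1e-6, 1 - 1e-6)  # guard against division by zero
    # Weight each sample by the inverse probability of the treatment it received.
    w = treated / propensity + (1 - treated) / (1 - propensity)
    mean_treated = np.sum(w * treated * outcome) / np.sum(w * treated)
    mean_control = np.sum(w * (1 - treated) * outcome) / np.sum(w * (1 - treated))
    return mean_treated - mean_control

def select_principal_herbs(usage, outcome, stratum, threshold):
    """Column indices of herbs whose weighted effect exceeds the threshold."""
    usage = np.asarray(usage)
    return [j for j in range(usage.shape[1])
            if ipw_effect(usage[:, j], outcome, stratum) > threshold]

# Hypothetical binary usage matrix: 8 prescriptions x 2 herbs.
usage = np.array([[1, 1], [1, 1], [1, 0], [0, 0],
                  [1, 0], [0, 1], [0, 1], [0, 0]])
stratum = np.array([0, 0, 0, 0, 1, 1, 1, 1])
# Synthetic response: herb 0 adds 2, the stratum adds its own offset, herb 1 adds nothing.
outcome = 2 * usage[:, 0] + stratum

effect_herb0 = ipw_effect(usage[:, 0], outcome, stratum)
principal = select_principal_herbs(usage, outcome, stratum, threshold=0.5)
```

In this synthetic data herb 0 is used more often in stratum 0, so an unweighted comparison would be confounded by the stratum offset; the weighting recovers the built-in effect of 2 for herb 0 and leaves herb 1 below the threshold.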

[0038] Furthermore, based on the set of principal drugs, a corresponding compound prescription is generated for each principal drug through a completion mechanism; and a recommendation score is calculated based on the reliability score of the compound prescription and the weighted causal effect of the principal drugs; based on the recommendation score, the recommended results of the traditional Chinese medicine compound prescription are obtained, including:

[0039] Based on the set of principal drugs, corresponding compound prescriptions are generated for each principal drug through a prescription co-occurrence-guided completion mechanism, a compatibility rule constraint screening mechanism, and a knowledge graph functional path completion mechanism, forming a complete compound prescription structure for the principal drugs.

[0040] Based on the complete compound prescription structure of each principal drug, the recommendation scores of the different compound prescriptions are calculated using a pre-built unified scoring function;

[0041] The recommendation scores are sorted in descending order to obtain the recommendation results for traditional Chinese medicine compound prescriptions.

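[0042] Furthermore, the formula for the unified scoring function is:

[0043] $$S(F_h) = \alpha \cdot \hat{\tau}_h + \beta \cdot R(F_h);$$

[0044] where $S(F_h)$ is the recommendation score of the complete compound prescription structure $F_h$ built around principal drug $h$; $\hat{\tau}_h$ is the weighted causal effect of principal drug $h$; $R(F_h)$ is the confidence score of the complete compound prescription structure of the principal drug, calculated by weighting factors such as co-occurrence intensity, compatibility matrix score, and path support; and $\alpha$ and $\beta$ are coefficients that adjust the causal weight and the structural weight.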
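A scoring and ranking step of this kind can be sketched as below. The linear combination of the two terms, the coefficient values, and the candidate data are assumptions made for illustration; the text only states that two coefficients balance a causal term against a structural confidence term.

```python
def recommendation_score(causal_effect, confidence, alpha=0.6, beta=0.4):
    """Assumed linear form: alpha * weighted causal effect + beta * structure confidence."""
    return alpha * causal_effect + beta * confidence

def rank_prescriptions(candidates, alpha=0.6, beta=0.4):
    """Sort (name, causal_effect, confidence) tuples by score, highest first."""
    scored = [(name, recommendation_score(tau, conf, alpha, beta))
              for name, tau, conf in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical candidates: (prescription label, weighted causal effect, confidence).
candidates = [("F1", 0.8, 0.5), ("F2", 0.4, 0.9), ("F3", 0.9, 0.2)]
ranking = rank_prescriptions(candidates)
ranked_names = [name for name, _ in ranking]
```

Raising `beta` relative to `alpha` shifts the ranking toward prescriptions with stronger structural support, which is the trade-off the two coefficients are meant to control.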

[0045] In a second aspect, this invention provides a traditional Chinese medicine compound prescription recommendation system based on causal modeling, the system comprising:

[0046] The single-modal aggregation unit is used to acquire HTS data of traditional Chinese medicine compound prescriptions and extract multimodal features. The multimodal features are aggregated to obtain the aggregated compound feature representation.

[0047] The virtual phenotypic generation unit is used to generate virtual phenotypic images from the compound feature representation using a generative adversarial network-based phenotypic synthesis method. The synthesis method adopts an improved Wasserstein GAN structure combined with a conditional input mechanism, so that virtual phenotypic images are generated with the compound feature representation as the conditional input.

[0048] The main drug set identification unit constructs a standardized causal modeling framework based on the compound feature representation and corresponding phenotype; based on the causal modeling framework, it performs weighted causal effect estimation on the role of each medicinal material in phenotypic changes and selects the main drug set.

[0049] The traditional Chinese medicine compound recommendation unit is used to generate a corresponding compound for each main drug based on the set of main drugs through a completion mechanism; and to calculate a recommendation score based on the reliability score of the compound and the weighted causal effect of the main drugs; and to obtain the traditional Chinese medicine compound recommendation result based on the recommendation score.

[0050] In a third aspect, the present invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the aforementioned method for recommending traditional Chinese medicine compound prescriptions based on causal modeling.

[0051] Compared with the prior art, the present invention has the following advantages and beneficial effects:

[0052] 1. This invention provides a method, system, and medium for recommending traditional Chinese medicine compound prescriptions based on causal modeling. By innovatively combining generative adversarial networks (GANs) to generate phenotypic data and introducing a causal inference mechanism, it integrates phenotypic generation and causal modeling to accurately identify the principal drug in the compound prescription. At the same time, it solves the problem of scarce phenotypic data in the field of traditional Chinese medicine, improves the biological rationality and interpretability of the recommendation system, and thus provides a more scientific and innovative solution for recommending traditional Chinese medicine compound prescriptions.

[0053] 2. Improve the accuracy and biological consistency of compound screening: Integrate HTS modal features such as cell images, gene expression and molecular structure, and dynamically adjust feature weights through a self-attention mechanism; introduce cross-modal consistency regularization to ensure the continuity of biological semantics after information integration and improve the model's generalization ability.

[0054] 3. Supports compound phenotype simulation, breaking through the limitations of HTS: By constructing a compound-specific phenotype generation model through WGAN-GP, it simulates the cell response behavior of multi-component combinations; combined with transfer learning, it extends to low-data scenarios, achieving compound-level phenotype coverage and breaking the limitation of HTS experiments that can only handle single components.

[0055] 4. Credibility assurance of recommendations based on causal mechanisms: A causal reasoning graph is constructed based on the identification of the main drug. Through joint modeling of GLM and IPW, the medicinal materials that are truly effective in intervention are identified. The recommendation logic is based on weighted causal effects to avoid co-occurrence misleading and significantly improve the causal interpretability and pharmacological rationality of the results.

[0056] 5. Implements a virtual screening process to improve efficiency and reduce costs: It provides a complete automated process from compound input → phenotype generation → principal drug identification → compound prescription construction → recommendation ranking, which saves significant resources and time compared with traditional experimental verification methods and promotes the development of traditional Chinese medicine research toward high throughput and high reliability.

Attached Figure Description

[0057] The accompanying drawings, which are included to provide a further understanding of embodiments of the invention and form part of this application, do not constitute a limitation thereof. In the drawings:

[0058] Figure 1 is a flowchart of a traditional Chinese medicine compound prescription recommendation method based on causal modeling according to the present invention;

[0059] Figure 2 is a flowchart of the traditional Chinese medicine compound prescription recommendation of the present invention;

[0060] Figure 3 is a structural block diagram of a traditional Chinese medicine compound prescription recommendation system based on causal modeling according to the present invention.

Detailed Implementation

[0061] To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the embodiments and accompanying drawings. The illustrative embodiments and descriptions of the present invention are only used to explain the present invention and are not intended to limit the present invention.

[0062] Focusing on the core problems in the screening of traditional Chinese medicine (TCM) compound prescriptions, such as data scarcity, modeling difficulties, and unreliable recommendations, this invention integrates key technologies such as high-throughput screening (HTS), generative adversarial networks (WGAN-GP), multimodal feature fusion, causal inference, and TCM knowledge reasoning. It combines phenotypic generation and causal modeling for compound prescription recommendations. Its main innovations are as follows:

[0063] 1. HTS Multimodal Feature Fusion Method

[0064] The cell images, gene expression and molecular structure data output by HTS are processed in a unified manner. A modality mapping mechanism is used to fuse image, gene and structural features in a unified manner. The dynamic integration of modal features is achieved through a weight matrix, integrating multimodal signals and improving modeling depth and feature completeness.

[0065] 2. Mechanism of compound-specific phenotype generation (HTS + WGAN)

[0066] A compound-phenotype generator is constructed using WGAN-GP to address the scarcity of compound prescription HTS data. A conditional GAN structure is introduced to align the generated results with the compound vectors, and transfer learning is combined to improve the synthesis quality for small-sample compound prescriptions, achieving low-cost, high-throughput virtual screening of compound prescriptions.

[0067] 3. Causal modeling mechanism for identifying the main drug from compound-phenotypic data

[0068] The compound sample is decomposed into "medicinal herb usage matrix + phenotypic response". Structural learning and generalized linear models are used to characterize the comprehensive phenotypic response of each medicinal herb in the multidimensional phenotypic space. An inverse probability weighting (IPW) mechanism is introduced at the sample level to correct selection bias and construct a principal drug identification model based on weighted causal effects.
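The decomposition into a usage matrix and a phenotypic response can be made concrete with a small sketch. It uses ordinary least squares, the simplest generalized linear model (identity link), and omits the structure learning step entirely; the matrices and the summary statistic (mean absolute coefficient) are hypothetical choices, not taken from the source.

```python
import numpy as np

def estimate_effect_matrix(usage, response):
    """Least-squares fit of response ~= usage @ B (identity-link linear model)."""
    coef, *_ = np.linalg.lstsq(usage, response, rcond=None)
    return coef  # shape: (n_herbs, n_phenotype_dims)

def summarize_herb_response(coef):
    """Collapse each herb's per-dimension coefficients into one score."""
    return np.mean(np.abs(coef), axis=1)

# Hypothetical data: 4 prescriptions x 2 herbs, 2 phenotype dimensions.
usage = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
true_effects = np.array([[2.0, 0.0], [0.0, -1.0]])
response = usage @ true_effects  # noise-free synthetic phenotypic responses

effects = estimate_effect_matrix(usage, response)
herb_scores = summarize_herb_response(effects)
```

With noise-free synthetic data the fit recovers the coefficient matrix exactly; on real HTS data the coefficients would be estimates and would normally be reweighted before thresholding.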

[0069] 4. Compound prescription construction and recommendation mechanism combined with knowledge-based reasoning

[0070] Based on the principal drug, and combining prescription co-occurrence frequency, compatibility rule matrix, and path reasoning from the traditional Chinese medicine knowledge graph, the system completes the supplementary drug selection and compound prescription construction. A recommendation list is output through a causal contribution + structural credibility fusion scoring mechanism, achieving a closed-loop recommendation logic from causal driving to knowledge completion.
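One way to picture the completion step is a greedy selection that ranks candidate herbs by co-occurrence, adds a bonus for knowledge-graph support, and skips herbs that violate a compatibility rule. The sketch below is only an illustration under these assumptions; the herb labels, counts, bonus values, and the greedy strategy itself are hypothetical.

```python
def complete_prescription(principal, cooccurrence, incompatible, kg_support, size=3):
    """Greedily add supporting herbs to a principal herb."""
    # Rank candidates by co-occurrence count plus a knowledge-graph support bonus.
    ranked = sorted(
        cooccurrence.get(principal, {}).items(),
        key=lambda item: item[1] + kg_support.get((principal, item[0]), 0),
        reverse=True,
    )
    prescription = [principal]
    for herb, _ in ranked:
        if len(prescription) >= size:
            break
        # Compatibility constraint: skip herbs that clash with any chosen herb.
        if any(frozenset((herb, chosen)) in incompatible for chosen in prescription):
            continue
        prescription.append(herb)
    return prescription

# Hypothetical inputs with placeholder herb labels.
cooccurrence = {"A": {"B": 10, "C": 8, "D": 7}}
incompatible = {frozenset(("B", "C"))}
kg_support = {("A", "D"): 2}

completed = complete_prescription("A", cooccurrence, incompatible, kg_support, size=3)
completed_large = complete_prescription("A", cooccurrence, incompatible, kg_support, size=4)
```

Here herb C is never added once B is chosen because the pair is marked incompatible, so asking for four herbs still returns three.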

[0071] A systematic path was constructed, which involves generating phenotypes from compound prescriptions, identifying principal drugs, completing compound prescriptions, and ranking and recommending them, thereby improving the interpretability, reliability, and deployment efficiency of traditional Chinese medicine compound prescription screening.

[0072] Example 1

[0073] As shown in Figure 1, this invention provides a method for recommending traditional Chinese medicine compound prescriptions based on causal modeling. The method includes:

[0074] Step 1: Obtain HTS data of traditional Chinese medicine compound and extract multimodal features; fuse the multimodal features to obtain the fused compound feature representation.

[0075] Among them, multimodal features include image modal features, gene expression modal features, and molecular structure modal features; image modal features are cell images obtained after applying the compound formula and are directly used as compound-level features; gene expression modal features and molecular structure modal features are feature representations constructed based on single medicinal materials.

[0076] In this embodiment, in order to characterize the overall effects of traditional Chinese medicine compound at the multimodal level, a unified representation framework of three modal features—image modality, gene expression modality, and molecular structure modality—was constructed, and the features were aggregated within the modality to ultimately form a unified overall representation of the compound.

[0077] (1) Image modality: On the HTS platform, each compound prescription $F$ applied to cells yields a corresponding RGB image $I_F$. Because images from different experimental batches vary in brightness and contrast, the image pixel values are normalized:

[0078] $$\tilde{I}_F = \frac{I_F - I_{\min}}{I_{\max} - I_{\min}}$$

[0079] where $I_{\min}$ and $I_{\max}$ are the minimum and maximum pixel values in the current batch of images, and $\tilde{I}_F$ is the normalized image tensor. Image features are then extracted by an image encoder network $E_{\text{img}}$: $h^{\text{img}}_F = E_{\text{img}}(\tilde{I}_F)$,

[0080] where $h^{\text{img}}_F$ represents the compound-level feature of the compound prescription in the image modality, comprehensively characterizing the main effects of the compound prescription at the cell morphology level, with dimension $d_{\text{img}}$.
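Batch-wise min-max normalization of cell images can be sketched as follows. The batch contents are made up, and the handling of a constant batch is an added safeguard not discussed in the source.

```python
import numpy as np

def normalize_batch(images):
    """Min-max normalize a batch of images using the batch-wide min and max."""
    images = np.asarray(images, dtype=np.float64)
    lo, hi = images.min(), images.max()
    if hi == lo:  # constant batch: avoid division by zero
        return np.zeros_like(images)
    return (images - lo) / (hi - lo)

# Hypothetical batch: 2 RGB images of 2x2 pixels with values 0, 10, ..., 230.
batch = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3) * 10
normalized = normalize_batch(batch)
```

Because the minimum and maximum are taken over the whole batch, relative brightness differences between images in the same batch are preserved while batch-to-batch scale differences are removed.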

[0081] (2) Gene expression modality: A traditional Chinese medicine compound prescription contains multiple medicinal materials. Let the compound prescription $F$ contain the set of medicinal materials $M_F$. For each medicinal material $m \in M_F$, a gene expression profile vector is obtained after its application. First, Z-score normalization is performed on the expression values of each gene:

[0082] $$z_{k,j} = \frac{x_{k,j} - \mu_j}{\sigma_j}$$

[0083] where $x_{k,j}$ is the original expression value of the $j$-th gene in the $k$-th experimental sample, an experimental sample being a biological sample obtained after treatment with the corresponding medicinal material or compound prescription; $\mu_j$ and $\sigma_j$ are the mean and standard deviation of the $j$-th gene across all medicinal material samples; and $z_{k,j}$ is the standardized expression value. To reduce dimensionality and improve model interpretability, the SHAP value method is further introduced to extract the top $s$ genes that contribute the most to the output prediction, constructing medicinal material-level expression features:

[0084] $$g_m = \text{TopShap}(z_m, s), \quad z_m \in \mathbb{R}^{G}, \; g_m \in \mathbb{R}^{s}$$

[0085] where $G$ is the total number of original expression dimensions; TopShap(·) denotes the feature selection function based on SHAP values, used to select from all $G$ dimensions the $s$ dimensions that contribute the most to the prediction; and $s$ is the dimension of the expression features of a medicinal material.

[0086] The expression features of all medicinal materials in the compound prescription are aggregated within the modality to obtain the gene expression modality representation of the compound prescription:

[0087] $$h^{\text{gene}}_F = \text{Agg}\left(\{ g_m : m \in M_F \}\right)$$

[0088] where $\text{Agg}(\cdot)$ is the aggregation function and $h^{\text{gene}}_F$ is the gene expression feature at the compound level.
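The gene expression pipeline (standardize, keep the most important genes, aggregate over herbs) can be sketched as below. The importance scores stand in for precomputed per-gene SHAP magnitudes and are hypothetical, as are the expression values; mean pooling is an assumed choice for the aggregation function, which the source leaves unspecified.

```python
import numpy as np

def zscore(expr, eps=1e-8):
    """Standardize each gene (column) across all herb samples (rows)."""
    return (expr - expr.mean(axis=0)) / (expr.std(axis=0) + eps)

def top_s_genes(importance, s):
    """Indices of the s genes with the largest importance scores."""
    return np.sort(np.argsort(importance)[::-1][:s])

def compound_gene_feature(expr, importance, herb_rows, s):
    """Mean-pool the selected, standardized genes over the herbs in one prescription."""
    z = zscore(expr)
    keep = top_s_genes(importance, s)
    return z[herb_rows][:, keep].mean(axis=0)

# Hypothetical expression matrix: 3 herbs x 4 genes.
expr = np.array([[1.0, 2.0, 3.0, 4.0],
                 [2.0, 2.0, 5.0, 0.0],
                 [3.0, 2.0, 7.0, 2.0]])
# Placeholder importance per gene (e.g. mean absolute SHAP value from some model).
importance = np.array([0.1, 0.0, 0.7, 0.2])

selected = top_s_genes(importance, s=2)
gene_feature = compound_gene_feature(expr, importance, herb_rows=[0, 2], s=2)
```

The small `eps` keeps genes with zero variance (the second column here) from causing a division by zero.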

[0089] (2) Molecular structural modes: each medicinal material The chemical structure is represented by the string SMILES, and its molecular fingerprint is constructed using the Morgan fingerprinting method. Simultaneously construct its molecular graph structure Here, V represents the set of atoms, and E represents the set of chemical bonds. On the graph structure, a graph neural network (GNN) is used to learn the embedding representation of each node. Let the initial features of the i-th atom in layer 0 be denoted as . In the first layer The update features are:

[0090]

[0091] in, Represents aggregate functions, For activation function, Let be the trainable transformation matrix of the 0th layer of the graph neural network. Represents a node The feature representation of the neighboring nodes at layer 0, Let i represent the set of nodes adjacent to node i. By converging the representations of all nodes in the graph, we obtain the structural feature vector of the entire medicinal herb graph. .

[0092] Finally, the molecular structure features of each medicinal material are represented as the concatenation of its molecular fingerprint and its GNN vector:

[0093] $\mathbf{s}_j = \mathbf{p}_j \,\Vert\, \mathbf{q}_j$

[0094] where the concatenation operator $\Vert$ joins the two vectors along the feature dimension, $\mathbf{p}_j$ is the Morgan fingerprint vector of medicinal material $h_j$, of dimension $d_p$, and $\mathbf{q}_j$ is the molecular graph embedding learned by the graph neural network (GNN), of dimension a. The final structural modality feature $\mathbf{s}_j$ therefore has dimension $d_p + a$.

[0095] Aggregating the structural features of all medicinal materials in the compound formula gives the compound-level structural modality features:

[0096] $\mathbf{s}_F = \mathrm{Agg}\big(\{\mathbf{s}_j : h_j \in F\}\big)$
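The structural branch described above (one message-passing step, pooling, concatenation with the fingerprint, and compound-level aggregation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the mean is used for both the neighbor aggregation and the compound-level aggregation, ReLU is used as the activation, and a random bit vector stands in for a real Morgan fingerprint. All function names are hypothetical.

```python
import numpy as np

def gnn_layer(node_feats, adjacency, weight):
    """One message-passing step: mean over neighbours, linear map, ReLU (assumed choices)."""
    degree = adjacency.sum(axis=1, keepdims=True)
    degree[degree == 0] = 1.0  # isolated atoms receive a zero message
    neighbour_mean = adjacency @ node_feats / degree
    return np.maximum(neighbour_mean @ weight, 0.0)

def herb_structure_vector(fingerprint, node_feats, adjacency, weight):
    """Concatenate a fingerprint with the mean-pooled GNN embedding of the molecular graph."""
    graph_vec = gnn_layer(node_feats, adjacency, weight).mean(axis=0)
    return np.concatenate([fingerprint, graph_vec])

def compound_structure_vector(herb_vectors):
    """Aggregate herb-level vectors into one compound-level vector (mean is an assumption)."""
    return np.mean(np.stack(herb_vectors), axis=0)

rng = np.random.default_rng(0)
adjacency = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # toy 3-atom chain
node_feats = rng.normal(size=(3, 4))   # layer-0 atom features
weight = rng.normal(size=(4, 5))       # trainable matrix of layer 0 (random here)
fingerprint = rng.integers(0, 2, size=8).astype(float)  # stand-in for Morgan bits

herb_a = herb_structure_vector(fingerprint, node_feats, adjacency, weight)
herb_b = herb_structure_vector(1.0 - fingerprint, node_feats, adjacency, weight)
compound = compound_structure_vector([herb_a, herb_b])
print(herb_a.shape, compound.shape)
```

The herb-level vector has the fingerprint dimension plus the embedding dimension (8 + 5 here), and the compound-level vector keeps that width because the aggregation is taken across herbs, not across features.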

[0097] To align the three modalities uniformly at the model input, a modality mapping matrix projects each modality's features to the same dimension, and the projections are concatenated to form a comprehensive representation of the compound:

[0098] $\mathbf{z}_F = \big[\, W_{img}\,\mathbf{x}_F \,\Vert\, W_{gene}\,\mathbf{g}_F \,\Vert\, W_{str}\,\mathbf{s}_F \,\big]$

[0099] where $W_{img}$, $W_{gene}$ and $W_{str}$ are the mapping weight matrices of the image, gene and structure modalities; $\mathbf{x}_F$, $\mathbf{g}_F$ and $\mathbf{s}_F$ are the compound-level image, gene expression and structural features; and $\mathbf{z}_F$ is the unified, fused compound vector, which serves as the conditional input for subsequent phenotype generation and as the modeling input of the causal recommendation model. Multimodal feature extraction and unified fusion preserve the complete information of the compound at the levels of cellular response, molecular structure and transcriptional features, and also provide a standardized, scalable model input format for subsequent tasks.
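The projection-and-concatenation step can be illustrated with a short sketch. The feature widths, the shared dimension of 16, and the random projection matrices are illustrative assumptions; in the described method the mapping matrices would be learned.

```python
import numpy as np

def fuse_modalities(features, projections):
    """Project each modality to a shared width, then concatenate in a fixed order."""
    order = ("image", "gene", "structure")
    parts = [projections[name] @ features[name] for name in order]
    return np.concatenate(parts)

rng = np.random.default_rng(1)
# Compound-level features per modality (widths are arbitrary for the example).
features = {
    "image": rng.normal(size=32),
    "gene": rng.normal(size=20),
    "structure": rng.normal(size=13),
}
shared_dim = 16  # assumed common dimension
projections = {name: rng.normal(size=(shared_dim, vec.size)) for name, vec in features.items()}

fused = fuse_modalities(features, projections)
print(fused.shape)
```

Because every modality is mapped to the same width before concatenation, no single modality dominates the fused vector merely by having more raw dimensions.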

[0100] Step 2: Based on the compound feature representation, a virtual phenotype image is generated using a phenotype synthesis method based on a generative adversarial network. This method uses an improved Wasserstein GAN structure combined with a conditional input mechanism, so that virtual phenotype images are generated with the compound feature representation as the conditional input.

[0101] Specifically, step 2 includes:

[0102] The compound feature representation and a noise vector sampled from the latent space are jointly input into the generator network, which outputs the corresponding virtual phenotype image.

[0103] Based on the virtual phenotype image, an objective optimization function is constructed from an improved Wasserstein loss function with a gradient penalty term. Solving this objective yields the phenotype generator, which produces the virtual phenotypes.

[0104] In actual high-throughput screening (HTS) experiments on traditional Chinese medicine compound prescriptions, the available cell phenotype data often cannot cover all potential compound combinations, because experiments are costly, sample preparation is complex, and component combinations vary widely. To compensate for insufficient experimental samples and expand modeling coverage, a phenotype synthesis method based on generative adversarial networks is proposed to simulate the phenotypic response produced after a compound prescription is administered, thereby providing data support for subsequent causal modeling and recommendation system training. Step 2 uses an improved Wasserstein GAN structure (WGAN-GP) combined with a conditional input mechanism (conditional GAN, cGAN) to generate virtual phenotypes with the feature representation of the compound prescription as the conditional input. In addition, to improve adaptability to new compound prescriptions with few samples, a transfer learning mechanism is introduced to fine-tune the parameters of the phenotype generator.

[0105] The improved Wasserstein GAN structure is obtained by introducing transfer learning into the Wasserstein GAN structure and using the Fréchet Inception Distance (FID) and the Pearson correlation coefficient (PCC) as joint training feedback.

[0106] In step 2, the compound feature representation $\mathbf{z}_F$ obtained in step 1 and a noise vector $\boldsymbol{\xi}$ sampled from the latent space are jointly input into the generator network $G$, which outputs the corresponding virtual phenotype image $\hat{\mathbf{I}}$:

[0107] $\hat{\mathbf{I}} = G(\boldsymbol{\xi}, \mathbf{z}_F)$

[0108] where $\hat{\mathbf{I}}$ is the generated virtual phenotype image, which serves as an approximate simulation of the target phenotype. This virtual phenotype image can be further mapped to phenotype feature vectors through a feature extraction network for subsequent causal modeling. To improve generation quality and training stability, an improved Wasserstein loss function with a gradient penalty term is used to construct the objective optimization function $\mathcal{L}_{WGAN}$:

[0109] $\mathcal{L}_{WGAN} = \mathbb{E}_{\hat{\mathbf{I}}}\big[D(\hat{\mathbf{I}}, \mathbf{z}_F)\big] - \mathbb{E}_{\mathbf{I}}\big[D(\mathbf{I}, \mathbf{z}_F)\big] + \lambda\, \mathbb{E}_{\tilde{\mathbf{I}}}\Big[\big(\lVert \nabla_{\tilde{\mathbf{I}}} D(\tilde{\mathbf{I}}, \mathbf{z}_F) \rVert_2 - 1\big)^2\Big]$;

[0110] where $\mathbb{E}[\cdot]$ denotes the mathematical expectation; $D(\cdot)$ is the discriminator function, which judges whether the input phenotype data comes from real HTS data; $\mathbf{z}_F$ is the compound feature representation; $\mathbf{I}$ is a real experimental phenotype image; $\hat{\mathbf{I}}$ is a virtual phenotype image; $\tilde{\mathbf{I}}$ is an interpolated sample drawn at random between $\mathbf{I}$ and $\hat{\mathbf{I}}$; $\nabla_{\tilde{\mathbf{I}}} D$ is the gradient of the discriminator with respect to $\tilde{\mathbf{I}}$; and the last term is the gradient penalty, which constrains the Lipschitz continuity of the discriminator and prevents unstable training. $\lambda$ is the weight coefficient of the gradient penalty term, which keeps the discriminator gradient stable and is usually set to 10. When the real phenotype data of some target compounds is missing or the samples are extremely few, a transfer learning strategy is adopted to transfer the parameters of the phenotype generator across compounds. Training is divided into two stages:

[0111] Pre-training phase: the phenotype generator is trained on a source compound set $\mathcal{S}_{src}$ with known phenotype data to obtain the parameter initialization:

[0112] $\theta_{pre} = \arg\min_{\theta}\, \mathcal{L}_{WGAN}(\theta;\ \mathcal{S}_{src})$

[0113] where $\theta_{pre}$ denotes the pre-training parameters of the phenotype generator, and $\mathcal{L}_{WGAN}(\theta;\ \mathcal{S}_{src})$ is the Wasserstein generative adversarial network loss function constructed on the source compound set $\mathcal{S}_{src}$, used to optimize the model parameters of the phenotype generator during the pre-training phase;

[0114] Fine-tuning phase: on the target compound set $\mathcal{S}_{tgt}$, $\theta_{pre}$ is used as the initial parameter value; only some key parameters of the generator are updated while the remaining parameters are kept fixed, giving the fine-tuned model parameter set:

[0115] $\theta_{ft} = \mathrm{FineTune}(\theta_{pre};\ \mathcal{S}_{tgt})$

[0116] where $\theta_{ft}$ denotes the generator parameter set obtained from $\theta_{pre}$ through the fine-tuning stage; the complete phenotype generation model with this parameter set is used to generate phenotypes for the target compounds. $\mathrm{FineTune}(\cdot)$ denotes the fine-tuning procedure that updates only some key parameters on the target compound set. To evaluate the quality and usability of the generated virtual phenotypes, the following joint loss function is introduced:

[0117] $\mathcal{L}_{joint} = \mathcal{L}_{G} + \alpha\, \mathcal{L}_{FID} + \beta\, \mathcal{L}_{PCC}$

[0118] where $\mathcal{L}_{G}$ is the original generator loss term, and $\alpha$ and $\beta$ are the weighting coefficients of the two evaluation metrics, adjusted dynamically according to performance on the validation set. The two evaluation metrics used in this design, $\mathcal{L}_{FID}$ and $\mathcal{L}_{PCC}$, measure the credibility of the generated results from the perspective of structural consistency and at the feature level, respectively.

[0119] (1) $\mathcal{L}_{FID}$ is the phenotype image distribution consistency index, computed from the Fréchet Inception Distance:

[0120] $\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\big(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\big)$

[0121] where $\mu_r$ and $\Sigma_r$ are the mean and covariance matrix of the real phenotype images in the Inception feature space, $\mu_g$ and $\Sigma_g$ are the corresponding statistics of the virtual phenotype images, and $\mathrm{Tr}(\cdot)$ is the trace operation, i.e. the sum of the diagonal elements of a matrix.

[0122] The smaller the value, the closer the generated virtual phenotypes are to the real data.

[0123] (2) $\mathcal{L}_{PCC}$ is the correlation index, in feature space, between the phenotype features extracted from virtual and real phenotype images, computed from the Pearson correlation coefficient:

[0124] $\mathrm{PCC} = \dfrac{\sum_{i} (f_i^{r} - \bar{f}^{r})(f_i^{g} - \bar{f}^{g})}{\sqrt{\sum_{i} (f_i^{r} - \bar{f}^{r})^2}\ \sqrt{\sum_{i} (f_i^{g} - \bar{f}^{g})^2}}$

[0125] where $f_i^{r}$ and $f_i^{g}$ are the values, in the i-th feature dimension, of the phenotype features obtained from the real and virtual phenotype images respectively, and $\bar{f}^{r}$ and $\bar{f}^{g}$ are their respective means. The closer the PCC is to 1, the more statistically reliable the virtual phenotype is.
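The two evaluation metrics can be computed from feature arrays as in the following sketch. It assumes the Inception (or other) features have already been extracted and are supplied as plain arrays; random arrays stand in for them here. The trace of the matrix square root is obtained from the eigenvalues of the covariance product, which avoids a dependency beyond NumPy. Function names are hypothetical.

```python
import numpy as np

def frechet_distance(real_feats, fake_feats):
    """Frechet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_r, mu_g = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_g = np.cov(fake_feats, rowvar=False)
    # Tr((cov_r cov_g)^(1/2)) equals the sum of square roots of the eigenvalues of cov_r cov_g.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_g) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)

def pearson(real_vec, fake_vec):
    """Pearson correlation between two phenotype feature vectors."""
    a = real_vec - real_vec.mean()
    b = fake_vec - fake_vec.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)
real = rng.normal(size=(200, 6))   # stand-in for features of real phenotype images
shifted = real + 2.0               # same covariance, mean shifted by 2 in each of 6 dims

fid_same = frechet_distance(real, real)
fid_shifted = frechet_distance(real, shifted)
pcc_linear = pearson(real[0], 2.0 * real[0] + 1.0)
```

In this toy setup the distance between a feature set and itself is numerically zero, and a pure mean shift of 2 in six dimensions contributes only the squared mean difference term.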

[0126] This joint loss serves as a feedback signal during training, guiding the generator to optimize the alignment of structural distribution and feature representation simultaneously, thereby improving the quality of the generated phenotype data. The final generated phenotype data is highly consistent at both the structural and feature levels, which increases the credibility of the causal modeling inputs and expands the space of compounds to which the recommender system applies.

[0127] Through the constructed phenotypic generation mechanism, this invention can synthesize virtual phenotypic data with reasonable structure and credible biological significance when the real phenotypic data of compound preparations are missing or limited in number. This provides more training samples for causal inference, expands the coverage of recommendation models, and improves the overall applicability and generalization level of the system.
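The gradient penalty used in the Step 2 objective can be illustrated with a deliberately small example. The sketch below assumes a toy critic $D(\mathbf{I}, \mathbf{z}) = \tanh(\mathbf{w}\cdot\mathbf{I} + \mathbf{v}\cdot\mathbf{z})$ on flattened images, chosen only because its input gradient has a closed form; a real critic would be a neural network differentiated with an automatic differentiation framework. All names and shapes are hypothetical.

```python
import numpy as np

def critic(x, cond, w, v):
    """Toy critic D(x, c) = tanh(w.x + v.c), applied row-wise (an assumption for illustration)."""
    return np.tanh(x @ w + cond @ v)

def critic_grad_x(x, cond, w, v):
    """Closed-form gradient of the toy critic with respect to its image input x."""
    slope = 1.0 - np.tanh(x @ w + cond @ v) ** 2
    return slope[:, None] * w[None, :]

def wgan_gp_critic_loss(real, fake, cond, w, v, lam=10.0, rng=None):
    """Wasserstein critic loss plus gradient penalty on random interpolates."""
    if rng is None:
        rng = np.random.default_rng()
    t = rng.uniform(size=(real.shape[0], 1))
    interp = t * real + (1.0 - t) * fake           # samples between real and virtual images
    grad_norm = np.linalg.norm(critic_grad_x(interp, cond, w, v), axis=1)
    penalty = float(np.mean((grad_norm - 1.0) ** 2))
    wasserstein = float(critic(fake, cond, w, v).mean() - critic(real, cond, w, v).mean())
    return wasserstein + lam * penalty, penalty

rng = np.random.default_rng(3)
real = rng.normal(size=(8, 5))    # flattened real phenotype images (toy size)
fake = rng.normal(size=(8, 5))    # flattened virtual phenotype images
cond = rng.normal(size=(8, 3))    # compound feature vectors used as the condition
v = rng.normal(size=3)

# With a zero image weight the critic ignores the image, so its gradient norm is 0
# and the penalty (||grad|| - 1)^2 is exactly 1 for every sample.
loss_flat, penalty_flat = wgan_gp_critic_loss(real, fake, cond, np.zeros(5), v, rng=rng)
```

The penalty pushes the critic's gradient norm toward 1 on the interpolated samples, which is how the Lipschitz constraint is enforced softly.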

[0128] Step 3: Based on the compound's characteristic representation and corresponding phenotype, construct a standardized causal modeling framework; based on this framework, estimate the weighted causal effect of each herb in the phenotypic changes, and select the set of principal herbs with significant causal influence; specifically, Step 3 includes:

[0129] Step 31: Based on the compound feature representation and corresponding phenotype, represent the medicinal material information in the compound as a binary matrix and construct the medicinal material usage matrix; represent the phenotypic data as phenotypic response variables in a unified manner.

[0130] Step 32: Based on the medicinal material usage matrix and phenotypic response variables, construct a causal graph using the acyclic graph modeling method;

[0131] Step 33: Based on the causal diagram, estimate the causal effect of each herb in the compound on the phenotypic changes, obtain the causal effect coefficient of each herb on each phenotypic dimension, and construct the causal effect coefficient matrix.

[0132] Step 34: Using the causal effect coefficient matrix and the medicinal material usage matrix, obtain the phenotypic response variable matrix of multiple compound prescriptions, and statistically summarize the phenotypic response variable matrix to characterize the comprehensive phenotypic response of medicinal materials in a multidimensional phenotypic space.

[0133] Step 35: Introduce an inverse probability weighting mechanism at the sample level to correct the sample contribution and form a weighted causal effect estimate.

[0134] Step 36: Compare the weighted causal effect estimate with the preset intervention effect threshold, select all medicinal materials that meet the condition that the weighted causal effect estimate is greater than the intervention effect threshold, and use the selected medicinal materials as the main drug set.

[0135] The present invention proposes a principal drug identification mechanism based on causal inference to identify key traditional Chinese medicine components, i.e., the so-called "principal drugs," that play a dominant role in the intervention of phenotypic response from the compound-phenotype relationship. This mechanism utilizes the compound feature representation provided in steps 1 and 2. Based on corresponding real or virtual phenotypes, a standardized causal modeling framework is constructed to estimate the weighted causal effect of each medicinal herb in phenotypic changes, thereby screening out the set of principal drugs with significant causal influence. The entire process consists of three parts: compound-medicinal herb causal modeling transformation, causal structure learning and effect estimation, and principal drug set screening. Details are as follows:

[0136] First, the medicinal material information of each compound $F_i$ is represented in binary form. Suppose there are N compound prescription samples and p candidate medicinal materials (single Chinese herbs). The samples are selected from the intersection of the Traditional Chinese Medicine Prescription Knowledge Base and the HTS dataset, keeping compound prescriptions with a clear medicinal material composition and phenotypic response data; samples with incomplete medicinal material information or missing phenotype dimensions are excluded to ensure data quality and modeling reproducibility. The medicinal material usage matrix is constructed as:

[0137] $X = [x_{ij}] \in \{0, 1\}^{N \times p}$

[0138] where $x_{ij} = 1$ indicates that sample i uses medicinal material j, and $x_{ij} = 0$ otherwise. At the same time, the phenotype data of the samples are uniformly represented as phenotypic response variables: each compound sample i corresponds to a k-dimensional phenotypic response vector $\mathbf{y}_i$, whose m-th dimension is denoted $y_{i,m}$, and $y_m$ denotes the random variable for the m-th phenotypic response dimension at the population level. After this data transformation, a causal graph between medicinal materials and phenotypes is constructed using structure learning methods. To meet the requirements of interpretability and structural sparsity, directed acyclic graph (DAG) modeling methods (such as the NOTEARS or GIES algorithms) are preferred for constructing the causal graph:

[0139] $\mathcal{G} = (\mathcal{V}, \mathcal{E})$

[0140] where the node set $\mathcal{V} = \{x_1, \dots, x_p,\ y_1, \dots, y_k\}$ contains the medicinal material variables and the phenotypic response variables, and the edge set $\mathcal{E}$ satisfies a directed acyclic structure; an edge $x_j \to y_m$ indicates that medicinal material j has a direct causal effect on phenotype $y_m$.

[0141] After the structural relationships are determined, a linear structural causal model (linear SCM) is introduced to model the medicinal material-phenotype relationship parametrically. For each phenotype dimension $y_m$, the structural equation is defined as:

[0142] $y_m = \sum_{j=1}^{p} \beta_{jm}\, x_j + \varepsilon_m$

[0143] where $\beta_{jm}$ is the causal effect coefficient of medicinal material j on the m-th phenotype dimension; $\beta_{jm} \neq 0$ if and only if the edge $x_j \to y_m$ exists in the causal graph, and $\beta_{jm} = 0$ otherwise. Under this causal structure constraint, the causal effect coefficient matrix $B$ is estimated using generalized linear regression, and the $\varepsilon_m$ are independent noise terms. Writing the structural equations of all phenotype dimensions in matrix form gives:

[0144] $Y = XB + \boldsymbol{\varepsilon}$

[0145] where $Y \in \mathbb{R}^{N \times k}$ is the matrix of phenotypic response variables; $X \in \{0,1\}^{N \times p}$ is the medicinal material usage matrix; $B \in \mathbb{R}^{p \times k}$ is the causal effect coefficient matrix of the medicinal materials on each phenotype dimension; and $\boldsymbol{\varepsilon}$ is the Gaussian error term. To characterize multidimensional phenotypic responses uniformly at the sample level, the comprehensive phenotypic response $\bar{y}_i$ of sample i is defined as its average response over all phenotype dimensions:

[0146] $\bar{y}_i = \dfrac{1}{k} \sum_{m=1}^{k} Y_{im}$

[0147] where $Y_{im}$ is the element in the i-th row and m-th column of $Y$, i.e. the response value of the i-th compound sample in the m-th phenotype dimension;
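The structure-constrained estimation of the coefficient matrix can be sketched as below. The example assumes that the causal graph is already available as a Boolean edge mask (herb j to phenotype m) and substitutes ordinary least squares, without an intercept, for the generalized linear regression named in the text; the structure learning step itself (for example NOTEARS or GIES) is not shown. Names are hypothetical.

```python
import numpy as np

def fit_masked_scm(X, Y, edge_mask):
    """Fit Y = X B + noise, forcing B[j, m] = 0 wherever edge_mask[j, m] is False."""
    p, k = edge_mask.shape
    B = np.zeros((p, k))
    for m in range(k):
        parents = np.flatnonzero(edge_mask[:, m])   # herbs with an edge into phenotype m
        if parents.size:
            coef, *_ = np.linalg.lstsq(X[:, parents], Y[:, m], rcond=None)
            B[parents, m] = coef
    return B

def comprehensive_response(Y):
    """Average response of each sample over all phenotype dimensions."""
    return Y.mean(axis=1)

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 4)).astype(float)   # usage matrix: 200 samples, 4 herbs
edge_mask = np.array([[True, False],
                      [False, True],
                      [True, True],
                      [False, False]])
B_true = np.array([[1.5, 0.0],
                   [0.0, -2.0],
                   [0.5, 1.0],
                   [0.0, 0.0]])
Y = X @ B_true                                        # noise-free toy responses

B_hat = fit_masked_scm(X, Y, edge_mask)
y_bar = comprehensive_response(Y)
```

Fitting each phenotype column only on its parent herbs is what keeps the estimated coefficients consistent with the learned graph: an absent edge can never receive a nonzero effect.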

[0148] Considering that HTS experiments are susceptible to interference from background variables (such as sample distribution and target bias), an inverse probability weighting (IPW) mechanism is introduced to correct the contribution of each training sample. The weighted causal effect estimate takes the form:

[0149] $\hat{\tau}_j = \dfrac{1}{N} \sum_{i=1}^{N} w_i\, x_{ij}\, \bar{y}_i, \qquad w_i = \dfrac{P(F_i)}{P(F_i \mid Z_i)}$

[0150] where N is the number of compound samples used in training; $x_{ij}$ is a binary variable indicating whether sample i uses medicinal material j; $\bar{y}_i$ is the response value of sample i over the phenotype dimensions; and $w_i$ is the inverse probability weight of sample i, in which $F_i$ denotes the compound combination applied to the i-th sample (i.e., the subset of medicinal materials it contains), $Z_i$ denotes the conditional variables of the i-th sample, such as physical condition, baseline phenotype or background covariates, $P(F_i)$ is the marginal probability of compound combination $F_i$ in the overall sample, and $P(F_i \mid Z_i)$ is the probability that sample i receives compound $F_i$ given its conditional and background variables $Z_i$. This weighting corrects the overestimation or misjudgment of effects caused by treatment selection bias.

[0151] Finally, an intervention effect threshold $\delta$ is set, and all medicinal materials satisfying $\hat{\tau}_j > \delta$ are collected as the principal drug set:

[0152] $\mathcal{H}^{*} = \{\, h_j : \hat{\tau}_j > \delta \,\}$

[0153] The medicinal materials in this set will serve as the foundational components for subsequent compound prescription construction and recommendation ranking, and will be regarded as the "core medicinal material starting point" for causal recommendations.
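The weighting and thresholding steps can be sketched as follows. Two points are assumptions made for illustration: the weights are taken as the ratio of the marginal to the conditional probability of the applied compound, and the effect of each herb is taken as the weighted average of the comprehensive response over the samples that use it. The probabilities are supplied directly here; in practice they would come from a fitted propensity model. Names are hypothetical.

```python
import numpy as np

def ipw_weights(p_marginal, p_conditional, floor=1e-3):
    """Weights w_i = P(F_i) / P(F_i | Z_i); the denominator is floored to avoid blow-ups."""
    return np.asarray(p_marginal) / np.clip(p_conditional, floor, None)

def weighted_effects(X, y_bar, weights):
    """tau_j = (1/N) * sum_i w_i * x_ij * y_bar_i for every herb j (assumed estimator form)."""
    return (weights * y_bar) @ X / X.shape[0]

def select_principal(effects, threshold):
    """Indices of herbs whose weighted effect exceeds the intervention threshold."""
    return np.flatnonzero(effects > threshold)

X = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 0]], dtype=float)          # 4 samples, 2 herbs
y_bar = np.array([2.0, 4.0, 6.0, 8.0])       # comprehensive responses
weights = ipw_weights([0.1, 0.4, 0.2, 0.3], [0.2, 0.2, 0.2, 0.3])  # -> [0.5, 2, 1, 1]

effects = weighted_effects(X, y_bar, weights)
principal = select_principal(effects, threshold=3.0)
```

With these numbers the second herb passes the threshold and the first does not, so only index 1 enters the principal drug set.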

[0154] Step 3 completes the entire process of identifying single Chinese herbal medicines with significant causal effects on specific phenotypes or target symptoms from compound-phenotype samples. The resulting set of main drugs is not only based on statistical causality, but also has sufficient systematic explanatory power and structural traceability, which is the logical core for the subsequent construction of compound combination and scoring mechanisms.

[0155] Step 4: Based on the set of principal drugs, generate corresponding compound prescriptions for each principal drug through a completion mechanism; calculate recommendation scores based on the reliability scores of the compound prescriptions and the weighted causal effects of the principal drugs; and obtain the recommended results of traditional Chinese medicine compound prescriptions based on the recommendation scores.

[0156] Specifically, step 4 includes:

[0157] Step 41: Based on the set of main drugs, generate corresponding compound prescriptions for each main drug in sequence through the prescription co-occurrence guided completion mechanism, the compatibility rule constraint screening mechanism, and the knowledge graph functional path completion mechanism, forming a complete compound prescription structure of the main drugs;

[0158] Step 42: Based on the complete compound structure of the main drug, calculate the recommended score for different compound preparations using a pre-built unified scoring function;

[0159] Step 43: Sort the recommended scores in descending order to obtain the recommended results for traditional Chinese medicine compound prescriptions.

[0160] After the important principal drugs have been identified in step 3, the above technical solution further proposes a compound prescription completion and recommendation mechanism that integrates traditional Chinese medicine knowledge graphs and empirical prescription rules. The aim is to construct a structurally sound and logically traceable complete compound prescription around each principal drug, and to provide a unified scoring basis for recommendation ranking. The mechanism considers causal dominance and knowledge rationality together, achieving interpretable compound prescription generation and recommendation output through integrated modeling of co-occurrence frequency, compatibility, and functional path consistency. The input of step 4 is the set of principal drugs identified in step 3. Each principal drug $h^{*}$ in this set is a single Chinese herbal medicine with a significant interventional effect on a specific target phenotypic response. To construct a compound formula that can be used in practice, the principal drug must be supplemented with a set of adjuvant medicinal materials that act synergistically, selected on the basis of pharmacological rationality and the knowledge system of traditional Chinese medicine.

[0161] Adopting a "drug-driven, knowledge-screened" strategy, the corresponding compound prescription for each principal drug $h^{*}$ is generated through the following three completion mechanisms:

[0162] (1) Prescription co-occurrence guided completion: from the Chinese herbal prescription knowledge base, the medicinal materials that have historically co-occurred with the principal drug $h^{*}$ are counted, and the set of medicinal materials with the highest co-occurrence frequency is extracted:

[0163] $A_{co}(h^{*}) = \{\, h_j : \mathrm{freq}(h^{*}, h_j) > \theta_{co} \,\}$

[0164] where $\mathrm{freq}(h^{*}, h_j)$ is the co-occurrence frequency of $h^{*}$ and $h_j$ among known prescriptions, and $\theta_{co}$ is the co-occurrence threshold, which controls the quality and size of the set. The screening result is one of the candidate pools for adjuvant medicinal materials.

[0165] (2) Compatibility rule constraint screening: traditional Chinese medicine compatibility rules, such as the four natures and five flavors and the principal, minister, assistant and guide principle, are introduced to construct a knowledge rule matrix $R$, which determines whether a combination of medicinal materials conforms to the rules of compatibility in traditional Chinese medicine:

[0166] $R = [R_{jl}] \in \{0, 1\}^{p \times p}$, with $R_{jl} = 1$ if medicinal materials $h_j$ and $h_l$ are compatible and $R_{jl} = 0$ otherwise.

[0167] This matrix eliminates unsuitable combinations, such as conflicts of medicinal properties and cold-heat incompatibility, ensuring the integrity of the pharmacological structure and consistency with traditional knowledge.

[0168] Furthermore, for any principal drug $h^{*}$, the set of adjuvants that may legitimately be combined with it is derived from the rule matrix $R$ and denoted:

[0169] $A_{rule}(h^{*}) = \{\, h_j : R(h^{*}, h_j) = 1 \,\}$

[0170] where $A_{rule}(h^{*})$ is the set of all candidate adjuvants that are structurally compatible with the principal drug $h^{*}$ and have coordinated medicinal properties under the compatibility rules of traditional Chinese medicine.

[0171] (3) Knowledge graph functional path completion: a traditional Chinese medicine knowledge graph is constructed, whose nodes are entities such as medicinal materials, symptoms, efficacy and pathogenesis, and whose edges represent functional relationships. If, in the graph, the principal drug $h^{*}$ connects to a functional node $u$, which in turn connects to another medicinal material $h_j$, the path weight is defined as:

[0172] $\mathrm{PW}(h^{*}, h_j) = \sum_{u \in \mathcal{U}} \mathbb{1}[\, h^{*} \to u \to h_j \,] \cdot c_u$

[0173] where $\mathcal{U}$ is the set of all functional nodes in the knowledge graph, $\mathbb{1}[\cdot]$ is the indicator function for the existence of the path, and $c_u$ is the weight or credibility of functional node $u$. The K medicinal materials with the highest path scores form the knowledge graph completion set $A_{kg}(h^{*})$.

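[0174] Based on the three completion paths above, the final recommended compound prescription is defined as the union of the principal drug and the admissible adjuvants that remain after screening:

[0175] $F(h^{*}) = \{h^{*}\} \cup \big( A_{co}(h^{*}) \cap A_{rule}(h^{*}) \cap A_{kg}(h^{*}) \big)$

[0176] where $F(h^{*})$ is the complete compound structure built around the principal drug $h^{*}$, and $A_{co}$, $A_{rule}$ and $A_{kg}$ are the candidate adjuvant sets produced by the co-occurrence, compatibility rule and knowledge graph mechanisms. Every adjuvant in $F(h^{*})$ therefore satisfies all three conditions: historical co-occurrence, pharmacological compatibility, and functional path consistency.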
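The three completion mechanisms and their combination can be sketched with plain Python sets. The herb labels A to D and function labels f1 to f3 are placeholders, not real medicinal materials; the knowledge graph is reduced to a mapping from each herb to the functional nodes it connects to; and the compatibility rules are reduced to a set of forbidden pairs. All of these are simplifying assumptions.

```python
from collections import Counter

def cooccurring(principal, prescriptions, min_count):
    """Herbs that appear with the principal drug in at least min_count known prescriptions."""
    counts = Counter()
    for formula in prescriptions:
        if principal in formula:
            counts.update(h for h in formula if h != principal)
    return {h for h, c in counts.items() if c >= min_count}

def rule_compatible(principal, herbs, incompatible_pairs):
    """Herbs not marked incompatible with the principal drug."""
    return {h for h in herbs
            if h != principal and frozenset((principal, h)) not in incompatible_pairs}

def path_supported(principal, herb_functions, node_weight, top_k):
    """Top-k herbs by summed weight of functional nodes shared with the principal drug."""
    scores = {}
    for herb, funcs in herb_functions.items():
        shared = herb_functions[principal] & funcs
        if herb != principal and shared:
            scores[herb] = sum(node_weight[f] for f in shared)
    ranked = sorted(scores, key=lambda h: (-scores[h], h))
    return set(ranked[:top_k])

def complete_formula(principal, prescriptions, herbs, incompatible_pairs,
                     herb_functions, node_weight, min_count=2, top_k=3):
    """Principal drug plus the adjuvants that pass all three screens."""
    adjuvants = (cooccurring(principal, prescriptions, min_count)
                 & rule_compatible(principal, herbs, incompatible_pairs)
                 & path_supported(principal, herb_functions, node_weight, top_k))
    return {principal} | adjuvants

prescriptions = [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "C"}, {"B", "D"}]
herbs = {"A", "B", "C", "D"}
incompatible_pairs = {frozenset(("A", "C"))}
herb_functions = {"A": {"f1", "f2"}, "B": {"f1"}, "C": {"f2"}, "D": {"f3"}}
node_weight = {"f1": 0.9, "f2": 0.5, "f3": 0.7}

formula = complete_formula("A", prescriptions, herbs, incompatible_pairs,
                           herb_functions, node_weight)
```

Here C co-occurs with A and shares a functional node with it, but is removed by the compatibility screen, so the completed formula contains only A and B.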

[0177] To rank and recommend the generated compound prescriptions, a unified scoring function is constructed that integrates the causal contribution of the principal drug with the structural credibility of the corresponding compound:

[0178] $S\big(F(h^{*})\big) = \omega_1\, \hat{\tau}_{h^{*}} + \omega_2\, \mathrm{Conf}\big(F(h^{*})\big)$;

[0179] where $S(F(h^{*}))$ is the recommendation score of the complete compound structure $F(h^{*})$ built around the principal drug $h^{*}$; $\hat{\tau}_{h^{*}}$ is the weighted causal effect of the principal drug $h^{*}$; $\mathrm{Conf}(F(h^{*}))$ is the structural confidence score of the complete compound, calculated by weighting factors such as co-occurrence intensity, compatibility matrix score, and path support; and $\omega_1$ and $\omega_2$ are the coefficients that balance the causal weight and the structural weight.

[0180] The final output consists of all complete compound structures $F(h^{*})$ sorted by $S$ in descending order, forming a list of candidate compound prescriptions with clear structural explanations and well-defined combination criteria.
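Scoring and ranking can be sketched in a few lines. The linear combination and the particular weights (0.6 and 0.4) are assumptions for illustration; the text only states that the score integrates the weighted causal effect with the structural confidence and that adjustable coefficients balance the two.

```python
def recommendation_score(causal_effect, confidence, w_causal=0.6, w_struct=0.4):
    """Weighted sum of the principal drug's causal effect and the formula's confidence."""
    return w_causal * causal_effect + w_struct * confidence

def rank_formulas(candidates, **weights):
    """Sort (name, causal_effect, confidence) triples by score, highest first."""
    scored = [(name, recommendation_score(tau, conf, **weights))
              for name, tau, conf in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical candidates: (label, weighted causal effect, structural confidence).
candidates = [("F1", 0.8, 0.5), ("F2", 0.4, 0.9), ("F3", 0.9, 0.9)]
ranking = rank_formulas(candidates)
```

Changing the two weights shifts the ranking between causally strong and structurally well-supported formulas, which is the trade-off the adjustable coefficients are meant to expose.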

[0181] Step 4 completes the transition from "causally effective single drugs" to "structurally rational compound prescriptions," which not only enhances the TCM knowledge explanation ability of the recommendation method, but also provides a mechanism-oriented compound prescription construction logic foundation for subsequent ranking optimization and practical deployment.

[0182] For details of steps 3 and 4, refer to Figure 2, which is a flowchart of the traditional Chinese medicine compound prescription recommendation of the present invention.

[0183] Example 2

[0184] This embodiment differs from Embodiment 1 in that a multi-objective joint optimization mechanism is proposed, so that the final recommended traditional Chinese medicine compound formula is both causally reliable and structurally compliant with the constraints of traditional Chinese medicine theory. Taking the recommendation score function as its basis, the mechanism constructs three types of loss terms (ranking accuracy, causal consistency, and structural compliance) and uses them to steer the optimization of the recommendation model during training.

[0185] (1) Ranking accuracy: the recommendation method should output first the compound combinations with higher scores (i.e., greater causal contribution and stronger structural rationality). For this purpose, a pairwise ranking loss $\mathcal{L}_{rank}$ is defined:

[0186] $\mathcal{L}_{rank} = \sum_{(a, b) \in \mathcal{P}} \max\big(0,\ \gamma - (S(F_a) - S(F_b))\big)$

[0187] where $\mathcal{P}$ is the set of all compound pairs $(F_a, F_b)$ that should satisfy $S(F_a) > S(F_b)$, and $\gamma$ is the minimum ranking margin constant, which prevents the model from ignoring slight ranking reversals. This loss encourages the model to rank genuinely high-quality compounds first, improving predictive ranking performance.

[0188] (2) Causal consistency constraint: because the recommendation system takes causal contribution as its core decision basis, a causal consistency loss term is introduced to keep the recommendation score consistent with the actual intervention ability of the principal drug:

[0189]

[0190] where the first quantity is the final score of the compound and the second is the weighted causal effect of the principal drug of that compound. This term ensures that the scoring system does not deviate from its causal basis, improving the mechanistic interpretability of the recommendation model.

[0191] (3) Structural rationality constraint (compatibility compliance regularization): the recommendation of traditional Chinese medicine compound prescriptions must not only be learned from data but also conform to traditional compatibility logic. To avoid compatibility contraindications or structural conflicts, a structural compliance loss term $\mathcal{L}_{struct}$ is constructed:

[0192] $\mathcal{L}_{struct} = \sum_{j < l} \mathbb{1}[\, h_j, h_l \in F \,]\, (1 - R_{jl})$

[0193] where $R$ is the compatibility rule matrix, with $R_{jl}$ representing the compatibility of medicinal materials $h_j$ and $h_l$ (1 if compatible, 0 otherwise), and $\mathbb{1}[\cdot]$ is the indicator function for whether the pair co-occurs in a compound prescription $F$. If a pair of medicinal materials is actually used together but is marked as incompatible in the compatibility rules, the combination is penalized by this loss term. This regularization term ensures that the final recommended compound prescription is usable in practice under the knowledge graph and prescription norms.

[0194] (4) Joint optimization objective function: to improve the ranking performance, causal interpretability, and structural legality of the recommendation method together, this invention proposes the following joint optimization objective:

[0195] $\mathcal{L}_{total} = \mathcal{L}_{rank} + \lambda_c\, \mathcal{L}_{causal} + \lambda_s\, \mathcal{L}_{struct}$

[0196] where $\mathcal{L}_{rank}$, $\mathcal{L}_{causal}$ and $\mathcal{L}_{struct}$ are the ranking, causal consistency and structural compliance loss terms, and $\lambda_c$ and $\lambda_s$ are adjustable hyperparameters that control the weights of the causal consistency constraint and the structural rationality constraint in the total loss. During training, this objective is minimized progressively by randomly sampling compound pairs and structural combinations, yielding recommendation model parameters that combine ranking accuracy, causal validity, and knowledge interpretability.
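The three loss terms and their weighted sum can be sketched as follows. The ranking term is a standard pairwise hinge loss. The causal consistency term is written here as a mean squared difference between score and weighted causal effect, which is an assumption because the text does not give its exact form. The structural term counts co-used herb pairs that a binary compatibility matrix marks as incompatible. Names and numbers are hypothetical.

```python
from itertools import combinations
import numpy as np

def ranking_loss(scores, ordered_pairs, margin=0.1):
    """Pairwise hinge loss; each pair (a, b) means formula a should outrank formula b."""
    return sum(max(0.0, margin - (scores[a] - scores[b])) for a, b in ordered_pairs)

def causal_consistency_loss(scores, effects):
    """Mean squared gap between score and causal effect (assumed form)."""
    return float(np.mean((np.asarray(scores) - np.asarray(effects)) ** 2))

def structure_loss(formulas, compat):
    """Count herb pairs used together although compat marks them incompatible (0)."""
    penalty = 0.0
    for herbs in formulas:
        for j, l in combinations(sorted(herbs), 2):
            penalty += 1.0 - compat[j, l]
    return penalty

def joint_loss(scores, ordered_pairs, effects, formulas, compat,
               lam_causal=1.0, lam_struct=0.5, margin=0.1):
    """Ranking loss plus weighted causal consistency and structural compliance terms."""
    return (ranking_loss(scores, ordered_pairs, margin)
            + lam_causal * causal_consistency_loss(scores, effects)
            + lam_struct * structure_loss(formulas, compat))

scores = [0.9, 0.6, 0.65]            # recommendation scores of three formulas
ordered_pairs = [(0, 1), (1, 2)]     # desired orderings
effects = [0.8, 0.6, 0.45]           # weighted causal effects of their principal drugs
formulas = [{0, 1}, {1, 2}, {0, 2}]  # herb indices in each formula
compat = np.ones((3, 3))
compat[0, 2] = compat[2, 0] = 0.0    # herbs 0 and 2 are marked incompatible

total = joint_loss(scores, ordered_pairs, effects, formulas, compat)
```

In this example only the second ordering is violated (contributing 0.15), and only the third formula contains an incompatible pair, so each term can be traced back to a specific formula.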

[0197] By constructing multi-objective loss and joint optimization, a systematic closed-loop control for compound prescription recommendations was achieved. While ensuring the rationality of the recommendations, the interpretability and credibility of the recommendations were significantly improved, providing a unified and executable evaluation standard for the deployment of traditional Chinese medicine compound prescriptions in intelligent recommendation methods.

[0198] Example 3

[0199] As shown in Figure 3, this embodiment differs from Embodiment 1 in that it provides a traditional Chinese medicine compound prescription recommendation system based on causal modeling, which corresponds one-to-one with the traditional Chinese medicine compound prescription recommendation method based on causal modeling in Embodiment 1; the system includes:

[0200] The single-modal aggregation unit is used to acquire HTS data of traditional Chinese medicine compound and extract multimodal features. The multimodal features are aggregated to obtain the aggregated compound feature representation.

[0201] The virtual phenotype generation unit is used to generate virtual phenotype images from the compound feature representation using the phenotype synthesis method based on generative adversarial networks. This method adopts an improved Wasserstein GAN structure combined with a conditional input mechanism, so that virtual phenotype images are generated with the compound feature representation as the conditional input.

[0202] The main drug set identification unit constructs a standardized causal modeling framework based on the compound feature representation and corresponding phenotype; based on the causal modeling framework, it performs weighted causal effect estimation on the role of each medicinal material in phenotypic changes and selects the main drug set.

[0203] The traditional Chinese medicine compound recommendation unit is used to generate a corresponding compound for each main drug based on the set of main drugs through a completion mechanism; and to calculate a recommendation score based on the reliability score of the compound and the weighted causal effect of the main drugs; and to obtain the traditional Chinese medicine compound recommendation result based on the recommendation score.

[0204] The execution process of each unit can be carried out according to the steps of the traditional Chinese medicine compound prescription recommendation method based on causal modeling in Example 1, and will not be described in detail in this example.

[0205] This invention generates compound phenotypic data using GANs, integrates multimodal information, identifies principal drugs through causal inference, and constructs compound prescriptions using knowledge-driven methods, forming an end-to-end traditional Chinese medicine compound prescription recommendation system. This invention improves the interpretability, reliability, and deployment efficiency of traditional Chinese medicine compound prescription screening.

[0206] Meanwhile, the present invention also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the aforementioned method for recommending traditional Chinese medicine compound prescriptions based on causal modeling.

[0207] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0208] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, generate instructions for implementing the flowchart... Figure 1 One or more processes and / or boxes Figure 1 A device that provides the functions specified in one or more boxes.

[0209] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0210] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0211] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for recommending traditional Chinese medicine compound prescriptions based on causal modeling, characterized in that the method includes:
obtaining high-throughput screening (HTS) data of traditional Chinese medicine compounds and extracting multimodal features;
aggregating the multimodal features to obtain an aggregated compound feature representation;
generating, based on the compound feature representation, a virtual phenotypic image using a generative adversarial network-based phenotypic synthesis method, wherein the phenotypic synthesis method employs an improved Wasserstein GAN structure combined with a conditional input mechanism, so that the virtual phenotypic image is generated with the compound feature representation as the conditional input;
constructing a standardized causal modeling framework based on the compound feature representation and the corresponding phenotype, estimating, based on the causal modeling framework, the weighted causal effect of each medicinal material on phenotypic changes, and selecting a set of principal drugs;
generating, based on the set of principal drugs, a corresponding compound prescription for each principal drug through a completion mechanism;
calculating a recommendation score based on the reliability score of the compound prescription and the weighted causal effect of the principal drug; and
obtaining the recommendation result of the traditional Chinese medicine compound prescription based on the recommendation score.

2. The method for recommending traditional Chinese medicine compound prescriptions based on causal modeling according to claim 1, characterized in that the multimodal features include image modal features, gene expression modal features, and molecular structure modal features; the image modal features are cell images obtained after the compound is applied and are used directly as compound-level features; the gene expression modal features and molecular structure modal features are feature representations constructed from single medicinal materials.
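The split between compound-level and herb-level modalities can be illustrated with a minimal sketch. The claims do not specify the aggregation rule, so mean pooling followed by concatenation is an assumption made here, and the function names, herb names, and feature values are hypothetical.

```python
from statistics import fmean


def mean_pool(vectors):
    """Average equal-length per-herb feature vectors element-wise."""
    return [fmean(column) for column in zip(*vectors)]


def aggregate_compound_features(image_features, gene_by_herb, structure_by_herb, herbs):
    """Build one compound-level feature vector.

    Assumption: herb-level modalities (gene expression, molecular structure)
    are mean-pooled over the herbs in the compound, then concatenated with
    the compound-level image features. The source does not fix this rule.
    """
    gene = mean_pool([gene_by_herb[h] for h in herbs])
    structure = mean_pool([structure_by_herb[h] for h in herbs])
    return list(image_features) + gene + structure


# Toy, made-up values purely for illustration.
gene_by_herb = {"herb_a": [1.0, 3.0], "herb_b": [3.0, 5.0]}
structure_by_herb = {"herb_a": [0.0, 1.0], "herb_b": [1.0, 0.0]}
compound_vector = aggregate_compound_features(
    [0.5], gene_by_herb, structure_by_herb, ["herb_a", "herb_b"]
)
```

In practice the image features would come from an image encoder and the pooling could be weighted by dosage; both are outside what the claims state.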

3. The method for recommending traditional Chinese medicine compound prescriptions based on causal modeling according to claim 1, characterized in that generating a virtual phenotypic image based on the compound feature representation using a generative adversarial network-based phenotypic synthesis method includes: jointly inputting the compound feature representation and a noise vector sampled from the latent space into the generator network, and outputting the corresponding virtual phenotypic image from the generator network; constructing, based on the virtual phenotypic image, an objective optimization function using an improved Wasserstein loss function with an added gradient penalty term; and solving the objective optimization function to obtain a phenotypic generator, thereby obtaining the virtual phenotypic image.

4. The method for recommending traditional Chinese medicine compound prescriptions based on causal modeling according to claim 3, characterized in that generating a virtual phenotypic image based on the compound feature representation using a generative adversarial network-based phenotypic synthesis method further comprises: employing a transfer learning strategy to train the phenotypic generator and fine-tune its parameters, wherein the training and fine-tuning include:
a pre-training phase, in which the phenotypic generator is trained on a source compound set with known phenotypic data to obtain the pre-training parameters of the phenotypic generator as the parameter initialization, the Wasserstein generative adversarial network loss function constructed for the source compound set being used to optimize the model parameters of the phenotypic generator during this phase;
a fine-tuning phase, in which, on the target compound set, the pre-training parameters are used as initial parameter values and only some key parameters of the generator are updated while the remaining parameters stay unchanged, yielding the fine-tuned model parameter set, that is, the set of generator model parameters formed from the pre-training parameters through the fine-tuning phase;
the complete phenotypic generation model corresponding to this parameter set is used for phenotypic generation of the target compound, and the fine-tuning operation refers to a training process that updates only some key parameters on the target compound set.
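The partial-update idea in the fine-tuning phase can be sketched as follows. This is a toy illustration under stated assumptions: parameters are plain floats keyed by hypothetical names (`backbone.w`, `head.w`), the optimizer is plain gradient descent, and a quadratic toy gradient stands in for the Wasserstein GAN loss on the target compound set.

```python
def sgd_step(params, grads, trainable, lr=0.1):
    """Return updated parameters; names outside `trainable` stay frozen."""
    return {
        name: (value - lr * grads[name]) if name in trainable else value
        for name, value in params.items()
    }


def fine_tune(pretrained, grad_fn, trainable, steps=3, lr=0.1):
    """Start from the pre-trained parameters and update only `trainable` ones."""
    params = dict(pretrained)  # copy so the pre-trained set is left intact
    for _ in range(steps):
        params = sgd_step(params, grad_fn(params), trainable, lr)
    return params


def toy_grad(params):
    """Gradient of the toy objective sum((p - 1)^2), standing in for the GAN loss."""
    return {name: 2.0 * (value - 1.0) for name, value in params.items()}


theta_pre = {"backbone.w": 0.0, "head.w": 0.0}  # hypothetical pre-trained values
theta_ft = fine_tune(theta_pre, toy_grad, trainable={"head.w"})
```

Only `head.w` moves toward the toy optimum; `backbone.w` keeps its pre-trained value. Which generator parameters count as "key" is not stated in the claim, so the choice of `trainable` is left to the caller.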

5. The method for recommending traditional Chinese medicine compound prescriptions based on causal modeling according to claim 3, characterized in that the objective optimization function is defined in terms of the following quantities: the mathematical expectation operation over a variable; the discriminator function, which determines whether the input phenotypic data comes from real HTS data; the compound feature representation; the real experimental phenotypic images; the virtual phenotypic images; interpolated samples randomly sampled between the real and virtual phenotypic images; the gradient of the discriminator with respect to the interpolated samples; the gradient penalty term, which constrains the Lipschitz continuity of the discriminator and prevents unstable model training; and the weight coefficient of the gradient penalty term, which keeps the discriminator gradient stable.
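The quantities listed in this claim match the terms of the standard conditional WGAN-GP critic objective. A sketch of that standard form is given below in assumed notation: the symbols $D$, $G$, $c$, $x$, $\tilde{x}$, $\hat{x}$, $z$, $\epsilon$, and $\lambda$ are introduced here for illustration and may differ from those used in the patent's own formula.

```latex
% Assumed notation: c = compound feature representation (condition),
% x = real phenotypic image, \tilde{x} = G(z, c) = virtual phenotypic image,
% \hat{x} = interpolated sample, \lambda = gradient penalty weight.
\mathcal{L}_D
  = \mathbb{E}_{\tilde{x}}\bigl[ D(\tilde{x}, c) \bigr]
  - \mathbb{E}_{x}\bigl[ D(x, c) \bigr]
  + \lambda \, \mathbb{E}_{\hat{x}}\Bigl[ \bigl( \lVert \nabla_{\hat{x}} D(\hat{x}, c) \rVert_2 - 1 \bigr)^{2} \Bigr],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

In this form the discriminator minimizes $\mathcal{L}_D$, while the generator minimizes $-\mathbb{E}_{z}\bigl[ D(G(z, c), c) \bigr]$; the penalty term pushes the discriminator's gradient norm toward 1 at the interpolated samples.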

6. The method for recommending traditional Chinese medicine compound prescriptions based on causal modeling according to claim 1, characterized in that constructing a standardized causal modeling framework based on the compound feature representation and corresponding phenotype, estimating the weighted causal effect of each medicinal material on phenotypic changes based on the causal modeling framework, and selecting a set of principal drugs includes:
representing, based on the compound feature representation and corresponding phenotype, the medicinal material information of each compound as a binary matrix to construct a medicinal material usage matrix, and uniformly representing the phenotypic data as phenotypic response variables;
constructing a causal graph from the medicinal material usage matrix and the phenotypic response variables using an acyclic graph modeling method;
estimating, based on the causal graph, the causal effect of each medicinal material in the compound on phenotypic changes, obtaining the causal effect coefficient of each medicinal material on each phenotypic dimension, and constructing a causal effect coefficient matrix;
obtaining a matrix of phenotypic response variables for multiple compound prescriptions from the causal effect coefficient matrix and the medicinal material usage matrix;
statistically summarizing the phenotypic response variable matrices to characterize the overall phenotypic response of the medicinal materials in the multidimensional phenotypic space;
introducing an inverse probability weighting mechanism at the sample level to correct the contribution of each sample and form a weighted causal effect estimate; and
comparing the weighted causal effect estimate with a preset intervention effect threshold, selecting all medicinal materials whose weighted causal effect estimate is greater than the intervention effect threshold, and using the selected medicinal materials as the principal drug set.
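The inverse probability weighting and threshold selection steps can be sketched as follows. This is a simplified illustration, not the claimed framework: it skips the causal graph and the coefficient matrix, uses a single scalar phenotypic response per prescription, and assumes the propensities (the probability that each prescription contains each medicinal material) are supplied by the caller. The normalized (Hajek) form of the estimator and all names and values are assumptions.

```python
def ipw_effect(treated, outcomes, propensities):
    """Normalized (Hajek) IPW estimate of one herb's effect on a scalar response.

    Requires at least one prescription with and one without the herb,
    and propensities strictly between 0 and 1.
    """
    num1 = den1 = num0 = den0 = 0.0
    for t, y, e in zip(treated, outcomes, propensities):
        if t:
            w = 1.0 / e            # weight for samples that use the herb
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - e)    # weight for samples that do not
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


def select_principal_herbs(usage, outcomes, propensities, herbs, threshold):
    """Estimate a weighted effect per herb and keep those above the threshold."""
    effects = {}
    for j, herb in enumerate(herbs):
        column = [row[j] for row in usage]
        prop = [row[j] for row in propensities]
        effects[herb] = ipw_effect(column, outcomes, prop)
    selected = [h for h in herbs if effects[h] > threshold]
    return effects, selected


# Toy binary usage matrix: rows are prescriptions, columns are herbs.
usage = [[1, 0], [1, 1], [0, 1], [0, 0]]
outcomes = [2.0, 2.0, 0.0, 0.0]          # made-up summarized responses
propensities = [[0.5, 0.5]] * 4          # hypothetical, supplied by the caller
effects, principal = select_principal_herbs(
    usage, outcomes, propensities, ["herb_a", "herb_b"], threshold=0.5
)
```

With uniform propensities the estimate reduces to a difference of group means; non-uniform propensities reweight prescriptions so that over-represented combinations contribute less.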

7. The method for recommending traditional Chinese medicine compound prescriptions based on causal modeling according to claim 1, characterized in that generating a corresponding compound prescription for each principal drug through a completion mechanism based on the set of principal drugs, calculating a recommendation score based on the reliability score of the compound prescription and the weighted causal effect of the principal drug, and obtaining the recommendation result based on the recommendation score includes: generating, based on the set of principal drugs, a corresponding compound prescription for each principal drug through a prescription co-occurrence-guided completion mechanism, a compatibility rule constraint screening mechanism, and a knowledge graph functional path completion mechanism, forming a complete compound structure for each principal drug; calculating, based on the complete compound structure of the principal drug, the recommendation scores of the different compound prescriptions using a pre-constructed unified scoring function; and sorting the recommendation scores in descending order to obtain the recommendation result for traditional Chinese medicine compound prescriptions.
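Two of the three completion mechanisms named in this claim, co-occurrence-guided completion and compatibility rule screening, can be sketched in a few lines. The knowledge graph functional path completion is not covered. The function name, the top-`k` cutoff, and the representation of incompatibility rules as unordered herb pairs are assumptions for illustration only.

```python
from collections import Counter


def complete_prescription(principal, history, incompatible, k=2):
    """Add up to k herbs that most often co-occur with `principal`.

    `history` is a list of past prescriptions (sequences of herb names);
    `incompatible` is a set of frozenset pairs that must not appear together.
    """
    counts = Counter()
    for prescription in history:
        if principal in prescription:
            counts.update(h for h in prescription if h != principal)

    chosen = [principal]
    for herb, _ in counts.most_common():
        if len(chosen) - 1 >= k:
            break
        # Compatibility screening: skip herbs that clash with any chosen herb.
        if any(frozenset((herb, c)) in incompatible for c in chosen):
            continue
        chosen.append(herb)
    return chosen


# Toy history and rule, purely illustrative.
history = [("a", "b", "c"), ("a", "b"), ("a", "d"), ("b", "d")]
incompatible = {frozenset(("a", "c"))}
completed = complete_prescription("a", history, incompatible, k=2)
```

Here `b` is added first because it co-occurs with `a` most often, `c` is rejected by the compatibility rule, and `d` fills the remaining slot.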

8. The method for recommending traditional Chinese medicine compound prescriptions based on causal modeling according to claim 7, characterized in that the unified scoring function relates the following quantities: the recommendation score of the complete compound structure of a principal drug; the weighted causal effect of that principal drug; the confidence score of the complete compound structure of the principal drug, calculated by weighting the co-occurrence intensity, the compatibility matrix score, and the path support; and the coefficients that adjust the causal weight and the structural weight.
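A minimal sketch of such a scoring function is shown below. The claim names the ingredients but the formula itself is not reproduced in this text, so the linear combination, the coefficient names `alpha` and `beta`, the confidence weights, and all numeric values are assumptions.

```python
def confidence_score(co_occurrence, compatibility, path_support, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three structural signals (weights are hypothetical)."""
    w_co, w_comp, w_path = weights
    return w_co * co_occurrence + w_comp * compatibility + w_path * path_support


def recommendation_score(causal_effect, confidence, alpha=0.5, beta=0.5):
    """Assumed linear form: alpha * causal effect + beta * structure confidence."""
    return alpha * causal_effect + beta * confidence


def rank_prescriptions(candidates, alpha=0.5, beta=0.5):
    """Sort (name, causal_effect, confidence) tuples by score, highest first."""
    scored = [
        (name, recommendation_score(effect, conf, alpha, beta))
        for name, effect, conf in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up candidates: (prescription id, weighted causal effect, confidence).
candidates = [("rx_1", 0.8, 0.4), ("rx_2", 0.5, 0.9)]
ranking = rank_prescriptions(candidates)
```

Raising `alpha` favors prescriptions whose principal drug has a strong estimated causal effect, while raising `beta` favors prescriptions whose completed structure is well supported by co-occurrence, compatibility, and knowledge graph paths.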

9. A traditional Chinese medicine compound prescription recommendation system based on causal modeling, characterized in that the system includes:
a single-modal aggregation unit, used to acquire HTS data of traditional Chinese medicine compounds, extract multimodal features, and aggregate the multimodal features to obtain the aggregated compound feature representation;
a virtual phenotype generation unit, used to generate a virtual phenotypic image from the compound feature representation using a generative adversarial network-based phenotypic synthesis method, wherein the phenotypic synthesis method adopts an improved Wasserstein GAN structure combined with a conditional input mechanism, so that the virtual phenotypic image is generated with the compound feature representation as the conditional input;
a principal drug set identification unit, used to construct a standardized causal modeling framework based on the compound feature representation and corresponding phenotype, estimate the weighted causal effect of each medicinal material on phenotypic changes based on the causal modeling framework, and select the principal drug set; and
a traditional Chinese medicine compound recommendation unit, used to generate a corresponding compound prescription for each principal drug through a completion mechanism based on the set of principal drugs, calculate a recommendation score based on the reliability score of the compound prescription and the weighted causal effect of the principal drug, and obtain the traditional Chinese medicine compound prescription recommendation result based on the recommendation score.

10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the method for recommending traditional Chinese medicine compound prescriptions based on causal modeling according to any one of claims 1 to 8.