Drug synergy prediction method and system fusing residual graph isomorphic network and cross attention

By integrating residual graph isomorphic networks and cross-attention mechanisms, the problem of insufficient structural discrimination ability and stability in drug synergy prediction in existing technologies is solved, achieving more efficient drug combination prediction and improving accuracy and cross-dataset adaptability.

CN122245840A · Pending · Publication Date: 2026-06-19 · SHIHEZI UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHIHEZI UNIVERSITY
Filing Date
2026-03-20
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing graph neural networks suffer from limitations in predicting drug synergy, including limited ability to discriminate molecular topology, instability during deep training, lack of explicit modeling of drug interactions, and weak generalization ability across datasets.

Method used

We employ a method that integrates residual graph isomorphic networks and cross-attention. The residual graph isomorphic network enhances the structure discrimination capability, while the cross-attention mechanism explicitly models drug interactions and integrates multimodal features for end-to-end prediction.

Benefits of technology

It significantly improves the accuracy, stability, and cross-dataset generalization ability of drug synergy prediction, can capture drug molecule topological information more precisely, solves the problems of oversmoothing and gradient vanishing in deep networks, and achieves more efficient drug combination screening and evaluation.

✦ Generated by Eureka AI based on patent content.

[Figure: CN122245840A_ABST]

Abstract

This invention discloses a method and system for predicting drug synergy by integrating residual graph isomorphic networks and cross-attention, belonging to the field of biomedical information technology. The method includes the core steps of molecular graph representation, feature representation based on graph isomorphic networks, feature representation based on LSTM networks, and interactive attention learning and prediction. Specifically, it enhances the molecular topology discrimination ability by introducing a graph isomorphic network with residual connections, utilizes a long short-term memory network to fuse multi-scale features, and employs a cross-attention mechanism to explicitly model interactions between drug pairs. Finally, it combines cell line features to achieve end-to-end prediction. This invention effectively solves the problems of insufficient structural representation ability, unstable deep training, non-explicit interaction modeling, and weak generalization ability in existing methods. Validation on multiple public datasets demonstrates that this invention can significantly improve the accuracy and robustness of drug synergy prediction, providing an efficient and reliable tool for drug combination screening.

Description

Technical Field

[0001] This invention relates to the field of biomedical information technology, specifically to a method and system for predicting drug synergistic effects by integrating residual graph isomorphic networks and cross-attention.

Background Technology

[0002] Treatment of complex diseases such as cancer and infectious diseases often faces challenges such as limited efficacy of single drugs and the development of drug resistance. In the medical field, to improve treatment outcomes and reduce side effects, drug combination therapy has gradually become an important clinical approach. Drug combinations may produce different effects, such as synergistic, additive, or antagonistic effects, among which synergistic effects have significant clinical value. However, relying on prior biological knowledge to validate all possible drug combinations is not only time-consuming and labor-intensive but also costly, making large-scale screening difficult. Therefore, developing efficient and accurate computational prediction methods to assist in the screening and evaluation of drug synergies has become a key pathway in the development of combination therapies.

[0003] Research in the field of drug combination prediction has evolved from theoretical reasoning relying on prior biological knowledge, to traditional machine learning based on handcrafted features, and then to drug combination prediction in recent years with the introduction of deep learning and graph neural networks. Early methods largely depended on biological theories or computational network pharmacological reasoning. While these methods had theoretical value, they heavily relied on complete and accurate biological pathway knowledge, making large-scale applications difficult and limiting their generalization capabilities. With the establishment of high-throughput drug screening databases, data-driven machine learning methods, such as artificial neural networks and support vector machines, have become mainstream. However, these methods typically rely on expert-designed molecular descriptors or biological features, resulting in subjectivity in feature extraction and difficulty in effectively handling unstructured data such as molecular structures.

[0004] In recent years, deep learning methods have made significant progress in drug combination prediction. However, the following key issues still exist in the field of drug synergy prediction: First, existing graph neural network models have limited ability to discriminate molecular topological structures, making it difficult to fully capture the multi-scale structural features of drugs. Second, deep graph neural networks are prone to oversmoothing, limiting the model's representation depth and stability. Third, existing methods have relatively simple mechanisms for modeling interactions between drug pairs, lacking explicit interaction modeling capabilities. Finally, the generalization ability and robustness of models on different datasets still need improvement. Therefore, the application of deep learning in this field requires further exploration to improve the accuracy and stability of drug synergy prediction.

Summary of the Invention

[0005] This invention aims to improve the accuracy and stability of drug synergy prediction and to address the insufficient generalization ability of existing graph neural network methods. To this end, it provides a drug synergy prediction method and system that integrates residual graph isomorphic networks and cross-attention. To address the shortcomings of existing methods, namely limited ability to discriminate molecular structures, instability in deep training, non-explicit drug interaction modeling, and weak generalization across datasets, this invention enhances structural discrimination by introducing a residual graph isomorphic network, explicitly models drug interactions with a cross-attention mechanism, and integrates multimodal features for end-to-end prediction. This mitigates the aforementioned problems and improves the accuracy and stability of drug synergy prediction.

[0006] Therefore, the present invention provides the following technical solution:

[0007] In a first aspect, this invention provides a method for predicting drug synergistic effects by integrating residual graph isomorphic networks and cross-attention, comprising the following steps:

[0008] Molecular graph characterization: construct molecular graphs for each drug in the target drug pair and obtain the initial features of each node in the molecular graph;

[0009] Feature representation based on the graph isomorphic network: the initial features of all nodes are input into a K-layer graph isomorphic network to obtain an output feature matrix containing multi-scale structural information. Each layer of the graph isomorphic network outputs a set of output features containing the features of all nodes, where K is a positive integer;

[0010] Feature representation based on the LSTM network: the output features of each node in the different layers of the graph isomorphic network are reconstructed into a time series of length K, which is used as the input of the LSTM network to obtain the updated node feature matrix;

[0011] Interactive learning based on the attention mechanism: a multi-head attention pooling operation is applied to the output feature matrix of the last graph isomorphic network layer and to the node feature matrix of the LSTM network, yielding the pooled features of each drug on the GIN path and on the LSTM path. The pooled features of the GIN path and the LSTM path of all drugs are then concatenated to obtain the fused representation vector;

[0012] Cell line feature fusion: the fused representation vector is concatenated with the cell line features to form the final feature vector;

[0013] Prediction output: a prediction is made from the final feature vector to obtain the drug synergy prediction result.
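The data flow of the last three steps (fusion, cell line concatenation, prediction) can be sketched as follows. This is a minimal NumPy illustration under assumptions the source does not state: all pooled vectors share one dimension `d`, the concatenation order is A-GIN, B-GIN, A-LSTM, B-LSTM, and the prediction head is a two-layer perceptron with a two-class softmax. The names `fuse_and_predict`, `W1`, `b1`, `W2`, and `b2` are hypothetical, and the weights are random placeholders.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_and_predict(gin_a, gin_b, lstm_a, lstm_b, cell, W1, b1, W2, b2):
    """Concatenate pooled drug features with cell line features and classify.

    gin_a, gin_b   : pooled GIN-path vectors of drug A and drug B, shape (d,)
    lstm_a, lstm_b : pooled LSTM-path vectors of drug A and drug B, shape (d,)
    cell           : cell line feature vector, shape (d_c,)
    """
    fused = np.concatenate([gin_a, gin_b, lstm_a, lstm_b])  # fused representation, (4d,)
    final = np.concatenate([fused, cell])                    # final feature vector, (4d + d_c,)
    hidden = np.maximum(0.0, final @ W1 + b1)                # ReLU hidden layer (assumed)
    return softmax(hidden @ W2 + b2)                         # class probabilities, (2,)

rng = np.random.default_rng(0)
d, d_c, hidden_dim = 8, 4, 16                                # illustrative sizes only
W1 = rng.normal(size=(4 * d + d_c, hidden_dim))
b1 = np.zeros(hidden_dim)
W2 = rng.normal(size=(hidden_dim, 2))
b2 = np.zeros(2)
probs = fuse_and_predict(*(rng.normal(size=d) for _ in range(4)),
                         rng.normal(size=d_c), W1, b1, W2, b2)
```

Concatenation keeps the four pooled vectors in separate coordinates, so the prediction head can weight the GIN-path and LSTM-path features independently.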

[0014] Optionally, the graph isomorphic network is a residual graph isomorphic network, which is composed of K stacked ResGIN layers, wherein the ResGIN layers are formed by introducing residual connections into the GIN layers;

[0015] In this process, the initial features of a node are used as the input to the first ResGIN layer, and the output of the previous ResGIN layer is used as the input to the next ResGIN layer. Each ResGIN layer adds the original output of its own GIN layer to the input of the current ResGIN layer to obtain the output of the current ResGIN layer.

[0016] Optionally, the mathematical model of the ResGIN layer is as follows:

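[0017] For node v, the raw output $\tilde{h}_v^{(k)}$ of the GIN layer in the k-th ResGIN layer is defined as:

[0018] $$\tilde{h}_v^{(k)} = \mathrm{MLP}^{(k)}\Big(\big(1+\epsilon^{(k)}\big)\, h_v^{(k-1)} + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)}\Big)$$

[0019] where $\tilde{h}_v^{(k)} \in \mathbb{R}^d$, $\mathbb{R}$ is the real vector space, and $d$ is the hidden layer dimension of the node features; $\mathcal{N}(v)$ represents the set of neighboring nodes $u$ of node $v$; $\epsilon^{(k)}$ is a learnable scalar parameter; $\mathrm{MLP}^{(k)}$ is a multilayer perceptron that operates at each node; and $h_v^{(k-1)}$ and $h_u^{(k-1)}$ are the outputs of node $v$ and its neighboring node $u$ in the (k-1)-th ResGIN layer, respectively.

[0020] Through a skip connection, $h_v^{(k-1)}$ is added to the raw output $\tilde{h}_v^{(k)}$ to give the output vector $h_v^{(k)}$ of node v in the k-th ResGIN layer:

[0021] $$h_v^{(k)} = h_v^{(k-1)} + \tilde{h}_v^{(k)}$$

[0022] Therefore, the output feature matrix of the k-th ResGIN layer is expressed as $H^{(k)} = [h_1^{(k)}, h_2^{(k)}, \dots, h_N^{(k)}]^T$, where T is the matrix transpose symbol, N is the number of nodes, and $h_1^{(k)}$, $h_2^{(k)}$, and $h_N^{(k)}$ are the output vectors of the 1st, 2nd, and N-th nodes in the k-th ResGIN layer, respectively.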
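A single ResGIN layer as described above (GIN update plus skip connection) can be sketched in NumPy. This is an illustrative sketch, not the patented implementation: the dense adjacency matrix, the two-layer ReLU perceptron, and the names `mlp` and `resgin_layer` are assumptions, and the input dimension is taken equal to the hidden dimension so the residual sum is defined.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """Two-layer perceptron applied row-wise, i.e. to every node independently."""
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2

def resgin_layer(H_prev, adj, eps, params):
    """One ResGIN layer: GIN update followed by a skip connection.

    H_prev : (N, d) node features from layer k-1
    adj    : (N, N) 0/1 adjacency matrix without self-loops
    eps    : the scalar epsilon^(k) (a plain float in this sketch)
    params : (W1, b1, W2, b2) of the layer's perceptron
    """
    neighbor_sum = adj @ H_prev                               # sum of h_u over u in N(v)
    raw = mlp((1.0 + eps) * H_prev + neighbor_sum, *params)   # raw GIN output
    return H_prev + raw                                       # residual (skip) connection

rng = np.random.default_rng(0)
N, d = 3, 4
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)                      # a 3-atom chain
H0 = rng.normal(size=(N, d))
params = (rng.normal(size=(d, d)), np.zeros(d),
          rng.normal(size=(d, d)), np.zeros(d))
H1 = resgin_layer(H0, adj, eps=0.0, params=params)
```

With all perceptron weights set to zero the layer returns its input unchanged, which is the identity path that the skip connection provides for deep stacks.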

[0023] Optionally, the LSTM network consists of K LSTM layers, corresponding one-to-one with the K ResGIN layers of the graph isomorphic network. The node feature update model of the LSTM network is as follows:

[0024] $$s_v^{(k)} = \mathrm{LSTM}\big(h_v^{(k)},\, s_v^{(k-1)}\big)$$

[0025] where $h_v^{(k)}$ is the output of node $v$ in the k-th ResGIN layer, used as one of the inputs of the LSTM network at the k-th time step; $s_v^{(k-1)}$ and $s_v^{(k)}$ are the hidden states of the LSTM network after processing the features of the (k-1)-th layer and the k-th layer, respectively;

[0026] The node feature matrix S output by the LSTM network is represented as $S = [s_1^{(K)}, s_2^{(K)}, \dots, s_N^{(K)}]^T$, where $s_1^{(K)}$, $s_2^{(K)}$, and $s_N^{(K)}$ are the hidden states output for the 1st, 2nd, and N-th nodes in the K-th LSTM layer, respectively.
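The layer-wise recurrence can be illustrated with a plain NumPy LSTM cell that treats the K ResGIN layer outputs as K time steps. The sketch assumes a standard LSTM cell with input, forget, candidate, and output gates, a hidden size equal to the node feature dimension, and a zero initial cell state; none of these details are given in the source. `lstm_step` and `fuse_layers` are hypothetical names.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step for all nodes at once.

    x    : (N, d) input at this step
    h, c : (N, d) hidden and cell states from the previous step
    W, U : (d, 4d) input and recurrent weights; b : (4d,) bias
    The 4d columns hold the input, forget, candidate, and output gates.
    """
    gates = x @ W + h @ U + b
    i, f, g, o = np.split(gates, 4, axis=1)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def fuse_layers(layer_outputs, W, U, b):
    """Run the LSTM over the K ResGIN layer outputs of every node."""
    N, d = layer_outputs[0].shape
    h = np.zeros((N, d))            # initial hidden state: zero vector
    c = np.zeros((N, d))            # initial cell state: zero (assumed)
    for H_k in layer_outputs:       # time step k consumes the output of layer k
        h, c = lstm_step(H_k, h, c, W, U, b)
    return h                        # node feature matrix S, shape (N, d)

rng = np.random.default_rng(0)
K, N, d = 3, 5, 4
layer_outputs = [rng.normal(size=(N, d)) for _ in range(K)]
W = rng.normal(size=(d, 4 * d)) * 0.1
U = rng.normal(size=(d, 4 * d)) * 0.1
b = np.zeros(4 * d)
S = fuse_layers(layer_outputs, W, U, b)
```

Because the forget and input gates are computed per node and per step, each node can weight shallow (local) and deep (global) layer features differently.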

[0027] Optionally, for both the output feature matrix of the GIN path and the node feature matrix of the LSTM path, the multi-head attention pooling operation is as follows:

[0028] First, the interaction attention score between the two drug molecules in the drug pair is calculated using the cross-attention mechanism;

[0029] Then, the interaction attention scores are aggregated along the dimensions of the other drug and normalized to obtain the importance weight of each node for each drug.

[0030] Finally, the importance weights are used to perform weighted aggregation on the corresponding output feature matrix or node feature matrix to generate pooled features for the corresponding GIN path or LSTM path.

[0031] Optionally, taking the features of the ResGIN path as an example, and given the output feature matrices of drug A and drug B of the target drug pair in the last ResGIN layer, the multi-head attention pooling operation is implemented as follows:

[0032] First, calculate the interaction attention score matrix between drug pairs:

[0033]

[0034]

[0035] where the two score matrices are the interaction attention score matrices of drug A and drug B along the ResGIN path, computed from the output feature matrices of drug A and drug B, respectively; a trainable weight matrix projects the node features into a common attention space; each element of the score matrix of drug A quantifies the interaction strength between the i-th node of drug A and the j-th node of drug B, and the elements of the score matrix of drug B are defined similarly; the remaining symbols denote the numbers of nodes of drug A and drug B and the number of elements in the feature vector of each node, i.e., the dimension of the node features;

[0036] Then, the attention scores are aggregated along the dimensions of the opposing drug and normalized to obtain the importance weight of each node for each drug:

[0037]

[0038]

[0039] where the two resulting vectors are the node importance vectors of drug A and drug B on the ResGIN path, respectively.

[0040] Finally, using the calculated importance weights, the final, interactively enhanced graph-level representation vector is obtained:

[0041]

[0042]

[0043] In the formula, the two resulting vectors are the graph-level representation vectors of drug A and drug B along the ResGIN path, respectively, i.e., GIN feature A and GIN feature B after pooling. The weight matrix applied to the node features is a trainable value transformation matrix, and the superscripts i and j index nodes.
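The three pooling stages (score, aggregate and normalize, weighted sum) can be sketched as follows. The exact formulas are not recoverable from this text, so every concrete choice here is an assumption made for illustration: a bilinear score built from a shared projection `W_att`, summation over the other drug's nodes, softmax normalization, and a single attention head. The names `cross_attention_pool`, `W_att`, and `W_val` are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def cross_attention_pool(H_a, H_b, W_att, W_val):
    """Pool two drugs' node features using cross-attention node weights.

    H_a   : (N_a, d) node feature matrix of drug A
    H_b   : (N_b, d) node feature matrix of drug B
    W_att : (d, d_att) shared projection into the attention space (assumed form)
    W_val : (d, d) value transform applied before the weighted sum
    """
    Q_a, Q_b = H_a @ W_att, H_b @ W_att
    scores_a = Q_a @ Q_b.T                    # (N_a, N_b): node i of A vs node j of B
    scores_b = Q_b @ Q_a.T                    # (N_b, N_a): the symmetric counterpart
    w_a = softmax(scores_a.sum(axis=1))       # aggregate over drug B's nodes, then normalize
    w_b = softmax(scores_b.sum(axis=1))       # aggregate over drug A's nodes, then normalize
    g_a = w_a @ (H_a @ W_val)                 # weighted sum of transformed node features
    g_b = w_b @ (H_b @ W_val)
    return g_a, g_b, w_a, w_b

rng = np.random.default_rng(0)
d, d_att = 4, 3
H_a, H_b = rng.normal(size=(5, d)), rng.normal(size=(7, d))
W_att, W_val = rng.normal(size=(d, d_att)), rng.normal(size=(d, d))
g_a, g_b, w_a, w_b = cross_attention_pool(H_a, H_b, W_att, W_val)
```

The same function could be applied unchanged to the LSTM-path node feature matrices, giving the second pair of pooled vectors; a multi-head variant would repeat it with separate weight matrices per head.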

[0044] Optionally, the initial feature construction process of the node is as follows: construct a molecular graph based on the SMILES string of the drug molecule and generate the basic feature vector of the node in the molecular graph, and convert the gene expression profile of a specific cell line into cell line features; fuse the cell line features with the basic feature vector of each node to obtain the initial features of each node.

[0045] Optionally, the above method further includes: constructing samples and sample labels, and optimizing model parameters by minimizing the cross-entropy loss function, wherein the model parameters include at least graph isomorphic network parameters, LSTM network parameters, and attention mechanism parameters.
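For reference, the cross-entropy objective over class probabilities can be written in a few lines of NumPy. The label convention (1 for synergistic, 0 otherwise), the two-class setup, and the function name `cross_entropy` are assumptions for illustration; the source only states that a cross-entropy loss is minimized.

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy of predicted class probabilities.

    probs  : (B, C) array whose rows sum to 1
    labels : (B,) integer class indices (1 = synergistic, 0 = not, by assumption)
    eps    : small constant that guards against log(0)
    """
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return float(-np.mean(np.log(picked + eps)))

probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.5, 0.5]])
labels = np.array([0, 1, 0])
loss = cross_entropy(probs, labels)
```

In an end-to-end setup this scalar would be differentiated with respect to the graph network, LSTM, and attention parameters by an automatic differentiation framework, which this sketch does not include.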

[0046] In a second aspect, the present invention also provides a system based on the above method, comprising:

[0047] The molecular graph characterization module is used to construct molecular graphs of each drug in the target drug pair and obtain the initial features of each node in the molecular graph.

[0048] The feature representation module based on the graph isomorphic network is used to input the initial features of all nodes into the K-layer graph isomorphic network to obtain the output feature matrix. Each layer of the graph isomorphic network outputs a set of output feature matrices containing the features of all nodes, where K is a positive integer.

[0049] The feature representation module based on the LSTM network is used to reconstruct the output features of each node in different layers of the graph isomorphic network into a time series of length K, which is then used as input to the LSTM network to obtain an updated node feature matrix.

[0050] The graph attention pooling module is used to obtain the pooling features of each drug in the GIN path and LSTM path by taking the output feature matrix of the last layer graph isomorphic network and the node feature matrix of the LSTM network through multi-head attention pooling operation. Then, the pooling features of the GIN path and LSTM path of all drugs are concatenated and fused to obtain the fused representation vector.

[0051] The cell line feature fusion module is used to concatenate the fused representation vector with cell line features to form the final feature vector;

[0052] The prediction output module is used to perform prediction output based on the feature vector to obtain the prediction result of drug synergy.

[0053] In a third aspect, the present invention also provides a computer device, comprising:

[0054] One or more processors;

[0055] Memory, used to store one or more computer programs;

[0056] When the one or more computer programs are executed by the one or more processors, the computer device implements the steps of the above-described method and system for predicting drug synergy by fusing residual graph isomorphic networks and cross-attention.

[0057] In a fourth aspect, the present invention also provides a readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the above-described method and system for predicting drug synergy by fusing residual graph isomorphic networks and cross-attention.

[0058] Compared with the prior art, the present invention achieves the following progress and effects:

[0059] 1. This invention employs a hierarchical coupling design of graph isomorphic networks and long short-term memory (LSTM) networks, significantly enhancing the ability to discriminate the topological structure of drug molecules and stabilizing deep network training. Specifically, this invention establishes a one-to-one correspondence between a K-layer residual graph isomorphic network (ResGIN) and a K-layer LSTM, reconstructing the output features of each node in different GIN layers into a time series, and adaptively fusing multi-scale structural information from local to global through the gating mechanism of LSTM. This design differs from the simple splicing or average pooling fusion methods in existing technologies, enabling more precise capture of the evolution of node features in deep networks, effectively alleviating the oversmoothing and gradient vanishing problems that easily occur in deep graph neural networks, allowing the model to stably construct deeper network structures and capture a wider range of molecular topological information.

[0060] 2. The preferred technical solution of this invention proposes a residual-enhanced graph isomorphic network, which resolves the instability of deep training while preserving strong structural discrimination ability. The residual graph isomorphic network (ResGIN) used in this invention builds on the graph isomorphic network (GIN), which strictly follows the Weisfeiler-Lehman (WL) test framework and has the strongest graph structure discrimination ability within that framework, and adds residual enhancement through skip connections. Compared with existing residual GCNs and similar networks, ResGIN retains GIN's WL-test-level representation ability while alleviating the over-smoothing and gradient vanishing that arise in deep training, so the model can be made deeper without losing node feature discrimination, which better suits the application needs of this technical field.

[0061] 3. This invention also achieves explicit drug interaction modeling through an innovative multi-head attention pooling mechanism. This invention integrates cross-attention and graph pooling operations, in particular by calculating the interaction attention score matrix between drug pairs bidirectionally and symmetrically, aggregating the scores along the other drug's dimension, and then normalizing them to generate node importance weights with clear physical meaning: the weight of each node directly reflects the contribution of that atomic structure to the synergistic effect of the drugs. Furthermore, this invention applies the same mechanism to both the ResGIN and LSTM paths, enabling the final graph-level representation to simultaneously contain interaction information at different levels of abstraction.

[0062] 4. This invention achieves more accurate multimodal feature learning through cell line-aware node representation and dual-path heterogeneous information fusion. Specifically, this invention preferably projects cell line features and adds them to the initial features of each atomic node, integrating cell line information into node representation learning from the network's bottom layer. This enables drug structure perception within a cell line context, effectively capturing the synergistic mechanism of drugs on specific cell lines. Simultaneously, by concatenating the pooling features of the ResGIN and LSTM paths instead of simply adding or averaging them, the heterogeneity of features at different abstraction levels is preserved, avoiding information loss and improving the richness and discriminative power of the final representation.

[0063] 5. This invention significantly improves the model's cross-dataset generalization ability through a fully end-to-end optimization framework. This invention constructs an end-to-end training framework from molecular graph construction to synergistic effect prediction. All modules (including feature extraction, interactive modeling, and prediction output) are jointly optimized through backpropagation, avoiding the problem of feature extraction and prediction tasks being separated in traditional methods, thus achieving task-oriented feature learning. Experimental results on five publicly available datasets (O'Neil, ALMANAC, Oncology Screen, DrugCombDB, and DrugComb) show that this invention significantly outperforms existing methods in terms of AUC, ACC, and BACC: on the most representative O'Neil dataset, the AUC reaches 0.921, and both ACC and BACC reach 0.840; on the other four datasets, the AUC values are stable between 0.873 and 0.912, and the standard deviations of all reported results are at a low level. This result indicates that this invention not only possesses top-tier prediction performance but also strong cross-dataset generalization ability and high robustness and reproducibility, providing an efficient and reliable computational tool for the screening and evaluation of drug synergistic effects.

Attached Figure Description

[0064] Figure 1 is a technical roadmap of the technical solution of this invention;

[0065] Figure 2 is a comparison chart of the performance of various evaluation indicators of the embodiments of the present invention on different datasets;

[0066] Figure 3 is a comparison chart of the AUC performance of different methods in the embodiments of the present invention on a benchmark dataset.

Detailed Implementation

[0067] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. The technical features involved in the various embodiments of the invention described below can be combined with each other as long as they do not conflict with each other.

[0068] It should be noted that although functional modules are divided in the device schematic diagram and a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in a different order than the module division in the device or the order in the flowchart. The terms "first," "second," etc., in the specification, claims, and the aforementioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

[0069] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.

[0070] To address the shortcomings of existing graph neural network-based drug synergy prediction methods in terms of structure dependence and generalization ability, this invention provides a drug synergy prediction method that integrates residual graph isomorphic networks and cross-attention. This method enhances structure discrimination by introducing a residual graph isomorphic network, explicitly models drug interactions using a cross-attention mechanism, and performs end-to-end prediction by fusing multimodal features, effectively improving the accuracy and stability of drug synergy prediction.

[0071] The present invention will be described below with reference to specific embodiments.

[0072] In some embodiments, the present invention provides a method for predicting drug synergistic effects by fusing residual graph isomorphic networks and cross-attention, comprising the following steps:

[0073] Step 1: Molecular graph characterization. In this embodiment, a molecular graph is constructed based on the SMILES string of the drug molecule, and the basic feature vectors of the nodes in the molecular graph are generated. The gene expression profile of a specific cell line is converted into cell line features. Then, the cell line features are fused with the basic feature vector of each node to obtain the initial features of each node.

[0074] Given the SMILES (Simplified Molecular Input Line Entry System) string of a drug molecule, this invention uses the open-source cheminformatics toolkit RDKit to convert the SMILES string into a molecular graph $G = (V, E)$, where $V$ represents the set of atomic nodes (or simply "nodes") and $E$ represents the set of chemical bonds (edges). For the i-th node $v_i \in V$, a feature encoding function $f_{\mathrm{enc}}$ extracts the basic feature vector $x_i \in \mathbb{R}^{d_a}$ of the node, where $\mathbb{R}$ is the real vector space and $d_a$ is the dimension of the atomic basic features:

[0075] $$x_i = f_{\mathrm{enc}}(v_i)$$

[0076] The gene expression profile vector of the specific cancer cell line is retrieved from the Cancer Cell Line Encyclopedia (CCLE). Here, "specific cancer cell line" refers to the target cell line on which the drug combination studied in this invention is intended to act. It should be noted that in the same prediction task, all atomic nodes of drug A and drug B share the same cell line feature vector. This is because cell line features describe the environmental context of drug action, which is consistent across all atoms of the drug molecule. Adding the cell line features to the initial features of each atomic node aims to integrate cell line information into node representation learning from the network's bottom layer, achieving cell line-aware extraction of drug structural features.

[0077] To adapt the gene expression profile vector to the atomic feature space, a dedicated multilayer perceptron (MLP) is used for dimensionality reduction:

[0078] $$c' = \mathrm{MLP}(c)$$

[0079] In the formula, $\mathrm{MLP}(\cdot)$ represents a multilayer perceptron operation, $c$ represents the gene expression profile vector, and $c'$ is the cell line feature vector after the multilayer perceptron operation.

[0080] As shown in Figure 1, the feature concatenation operation for drugs DrugA and DrugB merges the transformed cell line feature vector with the basic feature vector of each atom through an addition operation, forming the initial features $h_i^{(0)}$ of each node:

[0081] $$h_i^{(0)} = x_i + c'$$

[0082] It should be understood that the above way of obtaining the initial node features $h_i^{(0)}$ is the preferred feasible approach in the embodiments of the present invention. In other embodiments, prior-art techniques for obtaining the node features of a drug molecular graph are also feasible and fall within the protection scope of the present invention.
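The cell line projection and its addition to every atom's basic features can be sketched in NumPy. The sketch skips SMILES parsing and uses random arrays in place of real atom encodings and gene expression data; the two-layer ReLU perceptron, all sizes, and the names `project_cell_line` and `initial_node_features` are assumptions for illustration.

```python
import numpy as np

def project_cell_line(c, W1, b1, W2, b2):
    """Reduce a gene expression profile to the atom feature dimension with an MLP."""
    return np.maximum(0.0, c @ W1 + b1) @ W2 + b2

def initial_node_features(X, c_proj):
    """Add the same projected cell line vector to every atom's basic features."""
    return X + c_proj[None, :]            # broadcast over the N atoms

rng = np.random.default_rng(0)
n_genes, d_a = 50, 6                      # illustrative sizes only
X_a = rng.normal(size=(5, d_a))           # placeholder basic features of drug A's 5 atoms
X_b = rng.normal(size=(7, d_a))           # placeholder basic features of drug B's 7 atoms
c = rng.normal(size=n_genes)              # placeholder gene expression profile vector
W1, b1 = rng.normal(size=(n_genes, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, d_a)), np.zeros(d_a)

c_proj = project_cell_line(c, W1, b1, W2, b2)
H0_a = initial_node_features(X_a, c_proj)  # both drugs share the same cell line vector
H0_b = initial_node_features(X_b, c_proj)
```

Because the addition broadcasts one vector over all rows, the difference between any two atoms' initial features equals the difference between their basic features; the cell line acts as a shared offset that later nonlinear layers can exploit.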

[0083] Step 2: Feature representation based on the graph isomorphic network. The initial features of the nodes are input into the graph isomorphic network for multi-layer feature extraction and updating, resulting in an output feature matrix containing multi-scale structural information. In this embodiment, the preferred graph isomorphic network is a residual graph isomorphic network, which is constructed by introducing skip connections into the standard graph isomorphic network (GIN) layer. The output of each layer is obtained by adding the raw output calculated by the graph isomorphic network of that layer to the output of the previous layer. In other embodiments, using a standard graph isomorphic network layer is also feasible; the difference lies in the effect achieved.

[0084] In molecular graph representation learning, the core of graph neural networks lies in updating node representations through neighborhood aggregation. However, many common GNN (Graph Neural Network) variants have an upper limit on their representational capability and fail to distinguish certain different topologies; that is, they are less discriminative than a graph isomorphism test. To address this, this invention employs a graph isomorphism network (GIN), whose design strictly adheres to the Weisfeiler-Lehman (WL) graph isomorphism test framework and which has been proven to be one of the most expressive architectures among GNNs.

[0085] A standard GIN layer updates node features through a learnable, injective aggregation function, ensuring better structural discriminative power. For a node $v$, its embedding vector at the k-th layer is:

[0086] $$h_v^{(k)} = \mathrm{MLP}^{(k)}\Big(\big(1+\epsilon^{(k)}\big)\, h_v^{(k-1)} + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)}\Big)$$

[0087] where $\mathcal{N}(v)$ represents the set of neighboring nodes $u$ of node $v$; $\epsilon^{(k)}$ is a learnable scalar parameter used to adjust the importance of the node's own features in aggregation; $\mathrm{MLP}^{(k)}$ is a multilayer perceptron that operates on each node, which endows the present invention with powerful nonlinear feature transformation capabilities; $h_v^{(k-1)} \in \mathbb{R}^d$ is the feature vector of node $v$ at layer k-1; $h_u^{(k-1)} \in \mathbb{R}^d$ is the feature vector of node $u$ at layer k-1; and $d$ is the hidden layer dimension of the node features. It should be understood that two nodes connected by an edge in $E$ of the molecular graph are considered adjacent nodes.
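A small worked example shows why the sum over neighbors matters for discrimination. It assumes scalar node features and compares sum aggregation with mean and max aggregation, which are not named in the source and appear here only for contrast.

```latex
% Node v has two neighbors with features {1, 1};
% node w has a single neighbor with feature {1}.
\[
\begin{array}{lccl}
                 & \text{neighbors of } v:\ \{1, 1\} & \text{neighbors of } w:\ \{1\} & \\
\text{mean}      & \tfrac{1}{2}(1 + 1) = 1           & 1                              & \text{(indistinguishable)} \\
\text{max}       & \max\{1, 1\} = 1                  & 1                              & \text{(indistinguishable)} \\
\text{sum}       & 1 + 1 = 2                         & 1                              & \text{(distinguished)}
\end{array}
\]
```

Mean and max map the two different neighborhoods to the same value, while the sum keeps them apart, so only the sum lets the subsequent perceptron tell an atom with two identical neighbors from an atom with one.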

[0088] Despite GIN's powerful representational capabilities, deep GINs generally face over-smoothing and gradient vanishing. As the number of network layers increases, the receptive field of each node expands, causing the features of different nodes to converge toward the same value and thus lose the structural information unique to each node. To address this issue and construct deeper and more powerful molecular representation networks, this invention introduces residual connections into the GIN layers to form ResGIN layers, creating a residual graph isomorphic network module (Residual GIN module) composed of K stacked ResGIN layers.
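The over-smoothing effect can be reproduced on a toy graph. The sketch below is a generic illustration and not the network of this invention: it repeatedly applies mean aggregation with self-loops (an assumed, simplified propagation rule with no learnable weights) on a 4-node path and tracks how far apart the node features remain. The name `spread` is hypothetical.

```python
import numpy as np

# 4-node path graph: 0 - 1 - 2 - 3
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
A_hat = adj + np.eye(4)                          # add self-loops
P = A_hat / A_hat.sum(axis=1, keepdims=True)     # row-normalized: mean aggregation

def spread(H):
    """Largest per-feature gap between any two nodes."""
    return float((H.max(axis=0) - H.min(axis=0)).max())

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))                      # random initial node features
spreads = [spread(H)]
for _ in range(30):                              # 30 propagation steps
    H = P @ H
    spreads.append(spread(H))
```

The recorded spread never increases and becomes a small fraction of its starting value after 30 steps, meaning all nodes carry nearly the same vector; a skip connection counters this by re-injecting each layer's input into its output.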

[0089] First, the initial features $h_v^{(0)}$ of node $v$ are used as the input to the first ResGIN layer, and the output of each ResGIN layer is used as the input to the next ResGIN layer. Each ResGIN layer adds the raw output of its own GIN layer to the input of the current ResGIN layer. The mathematical model is as follows:

[0090] For node v, the raw output $\tilde{h}_v^{(k)}$ of the GIN layer in the k-th ResGIN layer is defined as:

[0091] $$\tilde{h}_v^{(k)} = \mathrm{MLP}^{(k)}\Big(\big(1+\epsilon^{(k)}\big)\, h_v^{(k-1)} + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)}\Big)$$

[0092] where $\tilde{h}_v^{(k)} \in \mathbb{R}^d$. Then, through a skip connection, the output of the previous layer $h_v^{(k-1)}$ is added to the raw output $\tilde{h}_v^{(k)}$ to give the final output of this layer, which is the feature vector $h_v^{(k)}$ of node v in the k-th ResGIN layer:

[0093] $$h_v^{(k)} = h_v^{(k-1)} + \tilde{h}_v^{(k)}$$

[0094] After stacking K ResGIN layers, the final feature vector of each node is obtained, such as the final feature vector $h_v^{(K)}$ of node v.

[0095] To facilitate efficient tensor computation in the subsequent attention pooling module, this invention stacks the feature vectors of all nodes in each ResGIN layer row by row to form a node feature matrix; that is, the feature vector of each node becomes one row, and all rows are stacked vertically. The multiple ResGIN layers thus output a set of node feature matrices containing multi-level structural information as the final output of the residual GIN module.

[0096] The output feature matrix $H^{(1)}$ of the first residual GIN layer (the first ResGIN layer) encodes the local chemical environment, while the deep features, i.e., the output feature matrix $H^{(K)}$ of the K-th residual GIN layer, integrate more global molecular topological information. To fuse these multi-scale features from local to global more adaptively, this invention, inspired by MR-GNN, introduces a Long Short-Term Memory (LSTM) network to model the feature sequence.

[0097] After a drug is processed by the K-layer residual GIN module, the output feature matrix of each layer is obtained. The output feature matrix $H^{(K)}$ of the K-th layer is expressed as:

[0098] $$H^{(K)} = \big[ h_1^{(K)}, h_2^{(K)}, \ldots, h_N^{(K)} \big]^{T} \in \mathbb{R}^{N \times d}$$

[0099] where $N$ is the total number of nodes in the molecular graph and $d$ is the node embedding dimension. Each row of the matrix $H^{(K)}$ corresponds to the transposed feature vector $\big(h_v^{(K)}\big)^{T}$ of node $v$, where $T$ is the matrix transpose symbol.

[0100] Step 3: Feature representation based on the LSTM network. The output features of each node in the different layers of the graph isomorphic network are reorganized into a time series of length K, which is used as the input to the LSTM network to obtain a fused node feature matrix that integrates local-to-global structural information.

[0101] For each node $v$ in the graph, its features $h_v^{(1)}, h_v^{(2)}, \ldots, h_v^{(K)}$ in all K residual GIN layers are treated as a time series of length K and input into an LSTM module. The LSTM module consists of K LSTM layers (which can be understood as K time steps of the LSTM), corresponding one-to-one with the K ResGIN layers. For each node, the input to each LSTM layer consists of the output of the corresponding ResGIN layer and the output of the previous LSTM layer.

[0102] The LSTM layers process the sequence step by step, and the update process is as follows:

[0103] $$s_v^{(k)} = \mathrm{LSTM}\big( h_v^{(k)},\, s_v^{(k-1)} \big), \quad k = 1, 2, \ldots, K$$

[0104] where $h_v^{(k)}$ is the output of node $v$ in the $k$-th ResGIN layer, which serves as the input of the LSTM module at the $k$-th time step; $s_v^{(k)}$ is the hidden state of the LSTM module after processing the $k$-th layer features, which fuses the feature information from layer 1 to layer $k$; and the initial hidden state $s_v^{(0)}$ of the LSTM module is initialized to a zero vector.

[0105] After the sequence is processed, the final hidden state $s_v^{(K)}$ is the enhanced representation of node $v$ that incorporates multi-scale information. All nodes undergo this feature fusion through the LSTM module to obtain the node feature matrix $S \in \mathbb{R}^{N \times d_s}$, where $d_s$ is the hidden dimension of the LSTM and $N$ is the total number of nodes in the molecular graph.
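The layer-wise fusion described above can be sketched as follows: each node's K per-layer feature vectors are read as a length-K sequence by a single LSTM cell, and the final hidden state is kept. This is a minimal sketch in plain Python; the standard LSTM gate equations, the gate ordering inside `weight`, the zero initial cell state, and the names `lstm_step` and `fuse_layers` are assumptions for illustration, since the text does not specify the internals of the LSTM.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, weight, bias):
    """One LSTM time step.

    weight has 4*H rows over the concatenated input [x; h]; the rows are
    ordered as input, forget, candidate and output gates (an assumed layout).
    """
    z = x + h  # list concatenation of input and previous hidden state
    size = len(h)
    pre = [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weight, bias)]
    i_gate = [sigmoid(p) for p in pre[:size]]
    f_gate = [sigmoid(p) for p in pre[size:2 * size]]
    g_cand = [math.tanh(p) for p in pre[2 * size:3 * size]]
    o_gate = [sigmoid(p) for p in pre[3 * size:]]
    c_new = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, g_cand)]
    h_new = [o * math.tanh(cv) for o, cv in zip(o_gate, c_new)]
    return h_new, c_new


def fuse_layers(layer_outputs, weight, bias, hidden_size):
    """Fuse K per-layer feature matrices into one N x hidden_size matrix S.

    layer_outputs[k][v] is the feature vector of node v after ResGIN layer k+1.
    """
    n_nodes = len(layer_outputs[0])
    fused = []
    for v in range(n_nodes):
        h = [0.0] * hidden_size  # zero initial hidden state, as in the text
        c = [0.0] * hidden_size  # zero initial cell state (assumption)
        for layer in layer_outputs:  # time step k reads ResGIN layer k
            h, c = lstm_step(layer[v], h, c, weight, bias)
        fused.append(h)  # final hidden state of node v
    return fused


# Toy input: K = 2 layers, N = 2 nodes, d = 2 features, hidden size 3.
layer_outputs = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 1.0], [1.0, 3.0]]]
weight = [[0.1 * ((r + col) % 3 - 1) for col in range(5)] for r in range(12)]
bias = [0.0] * 12
S = fuse_layers(layer_outputs, weight, bias, hidden_size=3)
```

The same weights are shared by all nodes, so the fused matrix has one row per node regardless of molecule size.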

[0106] Step 4: Interactive learning based on attention mechanism. For the drug pair to be predicted, the interaction attention score between the two drug molecules in the drug pair is calculated based on their respective node feature matrices through the cross-attention mechanism. Based on this, a node importance weight reflecting the strength of the interaction influence is generated. Then, the node features are weighted and aggregated using the node importance weight to generate an information-enhanced graph-level representation vector.

[0107] Step 5: Prediction step. The graph-level representation vectors from different feature paths are concatenated and fused with cell line features. The result is then input into the prediction module, which outputs the predicted probability that the drug combination has a synergistic effect.

[0108] Predicting drug synergy depends not only on the properties of the individual drugs but also on the interactions between the two drugs of a pair. To capture this pairwise interaction information and simultaneously identify the key chemical nodes that contribute significantly to synergy, this invention designs a graph attention pooling module. The core of this module is a cross-attention pooling mechanism, whose goal is to generate, for a drug pair, a set of importance weights that reflect the interaction between the two drugs, and to obtain an enhanced graph-level representation accordingly.

[0109] Specifically, the graph attention pooling module is implemented as follows: for drug A and drug B, the output feature matrices $H_A^{(K)}$ and $H_B^{(K)}$ of the last ResGIN layer in the residual GIN module are passed through multi-head attention pooling to obtain the pooled GIN feature A and GIN feature B. Likewise, the node feature matrices $S_A$ and $S_B$ output by the LSTM path for drugs A and B are passed through multi-head attention pooling to obtain the pooled RNN feature A and RNN feature B. The feature concatenation module then concatenates the pooled GIN feature A, GIN feature B, RNN feature A, and RNN feature B.

[0110] The multi-head attention pooling operation is explained below, taking the ResGIN-path features $H_A$ and $H_B$ (corresponding to the output feature matrices $H_A^{(K)}$ and $H_B^{(K)}$ of the last ResGIN layer mentioned above) as an example. Specifically:

[0111] First, calculate the interaction attention score matrix between drug pairs:

[0112]

[0113]

[0114] where $M_A$ and $M_B$ are the interaction attention score matrices of drug A and drug B along the ResGIN path; $H_A$ and $H_B$ are the output feature matrices of drug A and drug B, respectively; $W_a$ is a trainable weight matrix (the same weight matrix is used for $M_A$ and $M_B$) that projects node features into a common $d_a$-dimensional attention space; the element $M_A^{ij}$ quantifies the interaction strength between the $i$-th node of drug A and the $j$-th node of drug B, and the elements of $M_B$ are defined similarly; $N_A$ and $N_B$ are the numbers of nodes of drug A and drug B, respectively.

[0115] Next, the interaction attention scores are aggregated along the dimension of the other drug and normalized to obtain the importance weight of each node for each drug:

[0116]

[0117]

[0118] where $\alpha_A$ and $\alpha_B$ are the node importance vectors of drug A and drug B on the ResGIN path, respectively. Each element of a vector represents the relative importance of the corresponding node in the synergistic drug combination.

[0119] Finally, using the calculated importance weights, the final, interactively enhanced graph-level representation vector is obtained:

[0120]

[0121]

[0122] In the formula, $g_A^{H}$ and $g_B^{H}$ are the graph-level representation vectors of drug A and drug B along the ResGIN path, respectively, i.e., the pooled GIN feature A and GIN feature B; $W_v$ is a trainable value transformation weight matrix.
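The three pooling steps described above (pairwise scores, aggregation over the other drug's nodes with normalization, and weighted aggregation after a value transformation) can be sketched as follows. The exact formulas are not reproduced in this text, so the sketch makes explicit assumptions: scores are the dot products of node features projected by a shared matrix, aggregation is a row sum, normalization is a softmax, and only a single attention head is shown. The names `matmul`, `softmax`, and `cross_attention_pool` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_pool(h_a, h_b, w_att, w_val):
    """Pool two node-feature matrices into graph-level vectors.

    Assumed form: scores M_A = (H_A W)(H_B W)^T, row sums over the other
    drug's nodes, softmax normalisation, then a weighted sum of H W_v.
    """
    p_a, p_b = matmul(h_a, w_att), matmul(h_b, w_att)    # shared projection
    m_a = matmul(p_a, [list(col) for col in zip(*p_b)])   # N_A x N_B scores
    m_b = [list(col) for col in zip(*m_a)]                # N_B x N_A scores
    alpha_a = softmax([sum(row) for row in m_a])          # weight per node of A
    alpha_b = softmax([sum(row) for row in m_b])          # weight per node of B
    g_a = matmul([alpha_a], matmul(h_a, w_val))[0]        # graph-level vector of A
    g_b = matmul([alpha_b], matmul(h_b, w_val))[0]        # graph-level vector of B
    return (alpha_a, g_a), (alpha_b, g_b)


h_a = [[1.0, 0.0], [3.0, 2.0]]                # drug A: 2 nodes, 2 features
h_b = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]    # drug B: 3 nodes, 2 features
identity = [[1.0, 0.0], [0.0, 1.0]]
(alpha_a, g_a), (alpha_b, g_b) = cross_attention_pool(h_a, h_b, identity, identity)
```

Because the importance weights of drug A depend on the node features of drug B and vice versa, the pooled vector of a drug changes with its partner, which is the point of pooling the pair jointly rather than each molecule alone.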

[0123] The LSTM-path features $S_A$ and $S_B$ undergo exactly the same process, with $H_A$ and $H_B$ in the above formulas replaced by $S_A$ and $S_B$. Specifically:

[0124]

[0125]

[0126]

[0127]

[0128]

[0129]

[0130] where $M_A^{S}$ and $M_B^{S}$ are the interaction attention score matrices of drug A and drug B along the LSTM path; $W_a^{S}$ is a trainable weight matrix; the $(i, j)$ element of $M_A^{S}$ quantifies the interaction strength between the $i$-th node of drug A and the $j$-th node of drug B, and the elements of $M_B^{S}$ are defined similarly; $\alpha_A^{S}$ and $\alpha_B^{S}$ are the node importance vectors of drug A and drug B, respectively, and each element of a vector represents the relative importance of the corresponding node in the synergistic drug combination; $g_A^{S}$ and $g_B^{S}$ are the graph-level representation vectors of drug A and drug B on the LSTM path, respectively, i.e., the pooled RNN feature A and RNN feature B; and $W_v^{S}$ is a trainable value transformation weight matrix.

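[0131] Therefore, the graph-level representation vectors on the LSTM path can be written compactly as:

[0132] $$\big( g_A^{S},\, g_B^{S} \big) = \mathrm{AP}\big( S_A,\, S_B \big)$$

[0133] Here, AP denotes the graph attention pooling module operation on the LSTM path, $S_A$ and $S_B$ are the node feature matrices of drug A and drug B output by the LSTM path, and $g_A^{S}$ and $g_B^{S}$ are the resulting graph-level representation vectors.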

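[0134] The final output of the graph attention pooling module is the concatenation of the four graph-level representation vectors from the two paths. Assuming each graph-level representation vector lies in $\mathbb{R}^{d_g}$, the concatenated fused representation vector is:

[0135] $$z = g_A^{H} \,\Vert\, g_B^{H} \,\Vert\, g_A^{S} \,\Vert\, g_B^{S} \in \mathbb{R}^{4 d_g}$$

[0136] where $g_A^{H}$ and $g_B^{H}$ are the pooled GIN features of drug A and drug B, $g_A^{S}$ and $g_B^{S}$ are the pooled RNN features of drug A and drug B, and the symbol $\Vert$ denotes the concatenation of vectors along the feature dimension.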

[0137] The prediction module receives the fused representation vector $z$ from the graph attention pooling module. This vector is then concatenated with the projected cell line features $c$ to form the final feature vector:

[0138] $$x = z \,\Vert\, c$$

[0139] Then, the feature vector $x$ is fed into a multilayer perceptron to obtain the final prediction output:

[0140] $$\hat{y} = \sigma\big( \mathrm{MLP}(x) \big)$$

[0141] For binary classification tasks, $\sigma$ is the Sigmoid function, and the output $\hat{y}$ is the predicted probability that the drug combination has a synergistic effect.

[0142] Compared with the Softmax function, which could also be used for binary classification, the Sigmoid function has clear advantages: it directly outputs the probability of a single class without computing the probability of the other class, which makes the computation more efficient; it also pairs naturally with the binary cross-entropy loss function, giving stable gradients that favor model optimization, and it is the standard practice for this type of task.
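The prediction step can be sketched as: concatenate the four pooled vectors, append the projected cell line features, pass the result through a multilayer perceptron, and squash the single output with a sigmoid. The two-layer ReLU perceptron, the toy dimensions, and the name `predict_synergy` are assumptions made for this sketch; the text does not fix the depth or width of the perceptron.

```python
import math


def linear(x, weight, bias):
    """Affine map; weight is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def predict_synergy(pooled_vectors, cell_features, params):
    """Return the predicted synergy probability for one drug pair.

    pooled_vectors -- the four pooled graph-level vectors (GIN A, GIN B, RNN A, RNN B)
    cell_features  -- projected cell line feature vector
    params         -- (w1, b1, w2, b2) of a two-layer perceptron (assumed depth)
    """
    z = [value for vector in pooled_vectors for value in vector]  # fused vector
    x = z + cell_features                                         # final feature vector
    w1, b1, w2, b2 = params
    hidden = [max(0.0, value) for value in linear(x, w1, b1)]     # ReLU
    logit = linear(hidden, w2, b2)[0]
    return 1.0 / (1.0 + math.exp(-logit))                         # sigmoid


pooled = [[0.2, -0.1], [0.4, 0.3], [0.0, 0.5], [-0.2, 0.1]]
cell = [1.0, 0.5]
# 10 inputs -> 2 hidden units -> 1 logit; the weights are arbitrary toy values.
w1 = [[0.1] * 10, [-0.1] * 10]
params = (w1, [0.0, 0.0], [[1.0, 1.0]], [0.0])
probability = predict_synergy(pooled, cell, params)
```

A single sigmoid output is enough here because the probability of the non-synergistic class is simply one minus the returned value.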

[0143] During training, this invention optimizes all parameters by minimizing the binary cross-entropy loss function, which is defined as:

[0144] $$\mathcal{L} = -\frac{1}{n} \sum_{i=1}^{n} \Big[ y_i \log \hat{y}_i + (1 - y_i) \log\big(1 - \hat{y}_i\big) \Big]$$

[0145] where $n$ is the total number of training samples, $y_i$ is the true label of sample $i$, and $\hat{y}_i$ is the synergy probability of sample $i$ predicted by the present invention. The true label $y_i$ indicates whether a drug pair has a synergistic effect, and its value can be determined using conventional techniques in the field.
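For reference, the cross-entropy loss for binary labels can be computed as in the following sketch. Averaging over the samples and clipping the probabilities by a small `eps` to avoid taking the logarithm of zero are assumptions of this sketch, not details stated in the text.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite (assumed safeguard)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Uninformative predictions of 0.5 cost log(2) per sample.
loss_uninformative = binary_cross_entropy([1, 0], [0.5, 0.5])
# Confident, correct predictions drive the loss toward zero.
loss_confident = binary_cross_entropy([1, 0], [0.99, 0.01])
```

The loss grows without bound as a confident prediction becomes wrong, which is what pushes the predicted probabilities toward the true labels during training.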

[0146] Experimental demonstration

[0147] First, the performance of this invention was evaluated on five widely used public benchmark datasets, including O'Neil, Oncology Screen, ALMANAC, DrugCombDB, and DrugComb. Statistical information for these datasets is shown in Table 1.

[0148] Table 1. Statistical information of the experimental dataset

[0149]

| Dataset | Number of drugs | Number of cell lines | Number of drug combination samples |
| --- | --- | --- | --- |
| O'Neil | 38 | 39 | 23062 |
| Oncology Screen | 21 | 29 | 4176 |
| ALMANAC | 118 | 118 | 296503 |
| DrugCombDB | 600 | 68 | 60932 |
| DrugComb | 354 | 170 | 330917 |

[0150] To comprehensively evaluate the performance of this invention, it is compared with several representative state-of-the-art methods, covering a range of techniques from graph neural networks to multimodal learning:

[0151] AttenSyn is a method that uses a hierarchical attention mechanism to learn the deep interactions between drug pairs and cell line gene expression profiles. Its core lies in aggregating cell line features through drug-aware attention and aggregating drug pair features through cell-line-aware attention, thereby effectively capturing key signals from multimodal information.

[0152] DTSyn is a method that delves into the synergistic mechanisms of drugs in specific cell line contexts. DTSyn constructs a heterogeneous network composed of relationships such as drug-target and disease-gene, and learns low-dimensional representations of drugs and cell lines through network embedding techniques. Finally, it utilizes deep neural networks to predict synergistic probabilities. Its advantage lies in integrating rich prior knowledge of biological networks.

[0153] MR-GNN is a method that uses a multi-relational graph neural network to model the chemical structure of drugs. It treats the molecular graph as a multi-relational graph and learns the information passed between atoms along different types of chemical bonds through graph convolutional networks, thereby extracting more expressive drug representations. It then combines cell line features for synergy prediction.

[0154] DeepSynergy is one of the earliest and most classic methods to apply deep learning in this field. It uses a simple multilayer perceptron whose input directly concatenates the chemical descriptors of the drugs with the gene expression profile of the cell line. Although its structure is relatively simple, this method laid the foundation for deep-learning-based drug synergy prediction and remains an important baseline for comparison.

[0155] To comprehensively evaluate the performance of different methods, this invention employs a set of multi-dimensional metrics. Accuracy (ACC) measures overall classification precision; Precision (PREC) and Recall (RECALL) evaluate the accuracy of predicting positive examples and the coverage of true positive examples, respectively; True Positive Rate (TPR) and True Negative Rate (TNR) reflect the invention's ability to identify positive and negative examples, respectively. Furthermore, to eliminate evaluation bias caused by class imbalance, this invention specifically uses Balanced Accuracy (BACC), the arithmetic mean of TPR and TNR, as a key metric for evaluating the balanced performance of each method. The calculation formulas for each metric are as follows:

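[0156] $$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \mathrm{PREC} = \frac{TP}{TP + FP}, \quad \mathrm{RECALL} = \mathrm{TPR} = \frac{TP}{TP + FN}, \quad \mathrm{TNR} = \frac{TN}{TN + FP}, \quad \mathrm{BACC} = \frac{\mathrm{TPR} + \mathrm{TNR}}{2}$$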

[0157] In the formula, TP represents the number of positive samples correctly predicted as positive, TN represents the number of negative samples correctly predicted as negative, FP represents the number of negative samples incorrectly predicted as positive, and FN represents the number of positive samples incorrectly predicted as negative.
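The metrics named above follow directly from the four confusion-matrix counts, as the short sketch below shows. It assumes labels and predictions are encoded as 0/1 and that both classes and both predicted classes occur, so no denominator is zero; the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute ACC, PREC, RECALL (= TPR), TNR and BACC from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "PREC": tp / (tp + fp),
        "RECALL": tpr,
        "TNR": tnr,
        "BACC": (tpr + tnr) / 2,
    }


# Imbalanced toy example: 2 positive and 6 negative samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

In this toy example accuracy is 0.75 while balanced accuracy is about 0.667, because half of the positive samples are missed; this gap is the reason balanced accuracy is preferred under class imbalance.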

[0158] To comprehensively evaluate the model's generalization ability and reduce the bias caused by random data partitioning, this invention employs five-fold cross-validation. Specifically, the entire dataset is randomly divided into five mutually exclusive subsets. In each round, one subset is used as the test set and the other four are used as the training set. The final performance is reported as the mean and standard deviation of the five test results. This strategy ensures that every sample is used for testing exactly once, yielding a more robust estimate of the model's performance.
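The cross-validation protocol can be sketched with the standard library alone. The fixed random seed, the round-robin assignment of shuffled indices to folds, the use of the sample standard deviation, and the names `five_fold_splits` and `summarize` are assumptions of this sketch; the fold scores in the usage lines are placeholder numbers, not results of the invention.

```python
import random
import statistics


def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) for each fold of a random partition."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]  # mutually exclusive subsets
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def summarize(scores):
    """Mean and sample standard deviation of the per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)


splits = list(five_fold_splits(23))
# Placeholder per-fold scores, used only to show the reporting format.
mean_score, std_score = summarize([0.90, 0.92, 0.91, 0.93, 0.94])
```

Each index appears in exactly one test fold, so the five test sets together cover the whole dataset once.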

[0159] The overall performance is shown in Table 2:

[0160] Table 2. Performance of the present invention on different benchmark datasets

[0161]

| Dataset | AUC | ACC | F1 | PREC | Recall | BACC |
| --- | --- | --- | --- | --- | --- | --- |
| O'Neil | 0.921±0.005 | 0.840±0.009 | 0.829±0.009 | 0.829±0.020 | 0.829±0.016 | 0.840±0.008 |
| ALMANAC | 0.873±0.006 | 0.801±0.007 | 0.795±0.013 | 0.803±0.015 | 0.784±0.019 | 0.794±0.010 |
| Oncology Screen | 0.906±0.009 | 0.827±0.009 | 0.816±0.007 | 0.821±0.017 | 0.820±0.016 | 0.834±0.009 |
| DrugCombDB | 0.912±0.004 | 0.829±0.013 | 0.818±0.005 | 0.824±0.015 | 0.824±0.013 | 0.835±0.005 |
| DrugComb | 0.895±0.007 | 0.812±0.013 | 0.809±0.010 | 0.815±0.017 | 0.804±0.016 | 0.815±0.007 |

[0162] This invention achieved superior performance on the most representative O'Neil dataset, ranking first in all six evaluation metrics, with an AUC of 0.921 and both ACC and BACC reaching 0.840. This demonstrates the effectiveness of this invention in capturing the key features of drug synergy. The invention also maintained high performance on the other four datasets, with AUC values ranging from 0.873 to 0.912. This result indicates that the feature representation and prediction mechanism learned by this invention does not overfit a single data distribution but generalizes well across datasets, adapting to the variations that arise from different experimental conditions and data sources. A comparison of the performance of this invention across the different datasets is shown in Figure 2.

[0163] In terms of PREC and Recall, two metrics crucial for practicality, this invention demonstrates balanced performance across all datasets. For example, on the O'Neil dataset, both PREC and Recall reach 0.829, indicating that while discovering as many real synergistic drug combinations as possible, this invention also ensures high reliability of the predicted synergistic combinations. This balance is crucial for guiding real-world biomedical experiments and avoiding resource waste. The standard deviations of all reported results are low, reflecting minimal performance fluctuations across different data partitions, demonstrating high robustness and reproducibility.

[0164] Experimental results on the five benchmark datasets (O'Neil, ALMANAC, Oncology Screen, DrugCombDB, and DrugComb) show that the proposed model outperforms existing methods such as AttenSyn, DTSyn, MR-GNN, and DeepSynergy in AUC on every dataset, validating the accuracy and generalization ability of this invention in the task of predicting drug synergy. A comparison of the AUC of this invention with that of the other methods on the benchmark datasets is shown in Figure 3.

[0165] In summary, the experiments in this section systematically demonstrate that the present invention not only achieves top performance on specific datasets, but is also a solution with strong generalization ability and robust and reliable prediction results, providing a powerful computational tool for predicting drug synergy.

[0166] In some embodiments, the present invention also provides a system based on the above method, including a molecular graph representation module, a feature representation module based on a graph isomorphic network, a feature representation module based on an LSTM network, a graph attention pooling module, a cell line feature fusion module, and a prediction output module connected sequentially or interconnected.

[0167] The molecular graph characterization module is used to construct molecular graphs for each drug in the target drug pair and obtain the initial features of each node in the molecular graph.

[0168] The feature representation module based on graph isomorphic networks is used to input the initial features of all nodes into a K-layer graph isomorphic network to obtain the output feature matrix. Each layer of the graph isomorphic network outputs a set of output feature matrices containing the features of all nodes, where K is a positive integer.

[0169] The feature representation module based on the LSTM network is used to reconstruct the output features of each node in different layers of the graph isomorphic network into a time series of length K, which is then used as input to the LSTM network to obtain the updated node feature matrix.

[0170] The graph attention pooling module is used to perform multi-head attention pooling operations on the output feature matrix of the last layer graph isomorphic network and the node feature matrix of the LSTM network for each drug to obtain the pooling features of each drug in the GIN path and LSTM path. Then, the pooling features of the GIN path and LSTM path of all drugs are concatenated and fused to obtain the fused representation vector.

[0171] The cell line feature fusion module is used to concatenate the fused representation vector with cell line features to form the final feature vector.

[0172] The prediction output module is used to generate predictions based on feature vectors to obtain the prediction results of drug synergy.

[0173] It should be understood that the specific implementation process of each module is described in the foregoing method embodiments, and will not be repeated here. The division of functional modules in the above device is merely illustrative. In some embodiments, some functional modules may be combined, some functional modules may be split, and each functional module may be implemented in software, hardware, or a combination of software and hardware.

[0174] In some embodiments, the present invention also provides a computer device, comprising: one or more processors and a memory, the memory being used to store one or more computer programs; when the one or more computer programs are executed by the one or more processors, the computer device enables the implementation of the steps of the above-described method and system for predicting drug synergy by fusing residual graph isomorphic networks and cross-attention.

[0175] It should be understood that, in the embodiments of the present invention, the processor may be a central processing unit (CPU), or it may be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.

[0176] In some embodiments, the present invention also provides a readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the above-described method and system for predicting drug synergy by fusing residual graph isomorphic networks and cross-attention.

[0177] For details on the implementation of each step, please refer to the description of the aforementioned drug synergy prediction method embodiment.

[0178] The aforementioned readable storage medium is a computer-readable storage medium, which can be an internal storage unit of the hardware and software device described in any of the foregoing embodiments, such as the hard drive or memory of the controller. The readable storage medium can also be an external storage device of the controller, such as a plug-in hard drive, Smart Media Card (SMC), Secure Digital (SD) card, or Flash Card equipped on the controller. Further, the readable storage medium can include both internal storage units and external storage devices of the controller. The readable storage medium is used to store the computer program and other programs and data required by the controller. The readable storage medium can also be used to temporarily store data that has been output or will be output.

[0179] Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned readable storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0180] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code. This application is described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of this application. The computer program instructions, when executed by a processor, create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be stored in a computer-readable storage medium capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions executed on the computer or other programmable apparatus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

[0181] It should be emphasized that the examples of this invention are illustrative rather than limiting. Therefore, this invention is not limited to the examples in the specific embodiments. Any other embodiments derived by those skilled in the art based on the technical solutions of this invention, without departing from the spirit and scope of this invention, whether modifications or substitutions, are also within the protection scope of this invention.

Claims

1. A method for predicting drug synergistic effects by integrating residual graph isomorphic networks and cross-attention, characterized in that it includes the following steps:
molecular graph characterization: constructing a molecular graph for each drug in the target drug pair and obtaining the initial features of each node in the molecular graph;
feature representation based on a graph isomorphic network: inputting the initial features of all nodes into a K-layer graph isomorphic network to obtain output feature matrices containing multi-scale structural information, wherein each layer of the graph isomorphic network outputs a set of output features containing the features of all nodes, and K is a positive integer;
feature representation based on an LSTM network: reconstructing the output features of each node in the different layers of the graph isomorphic network into a time series of length K, which is used as the input of the LSTM network to obtain an updated node feature matrix;
interactive learning based on an attention mechanism: performing a multi-head attention pooling operation on the output feature matrix of the last layer of the graph isomorphic network and on the node feature matrix of the LSTM network to obtain the pooled features of each drug in the GIN path and the LSTM path, and then concatenating and fusing the pooled features of all drugs in the GIN path and the LSTM path to obtain a fused representation vector;
cell line feature fusion: concatenating the fused representation vector with the cell line features to form the final feature vector;
prediction output: performing prediction based on the final feature vector to obtain the prediction result of drug synergy.

2. The method according to claim 1, characterized in that: The graph isomorphic network is a residual graph isomorphic network, which is composed of K stacked ResGIN layers. The ResGIN layers are formed by introducing residual connections into the GIN layers. The initial features of a node serve as the input to the first ResGIN layer, and the output of the previous ResGIN layer serves as the input to the next ResGIN layer. Each ResGIN layer adds the original output of its respective GIN layer to the input of the current ResGIN layer to obtain the output of the current ResGIN layer.

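3. The method according to claim 2, characterized in that: the mathematical model of the ResGIN layer is as follows: for node $v$, the raw output $\tilde{h}_v^{(k)}$ of the GIN layer in the $k$-th ResGIN layer is defined as: $\tilde{h}_v^{(k)} = \mathrm{MLP}^{(k)}\big( (1 + \epsilon^{(k)})\, h_v^{(k-1)} + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)} \big)$; where $\tilde{h}_v^{(k)} \in \mathbb{R}^{d}$, $\mathbb{R}$ denotes the real vector space, and $d$ is the hidden dimension of the node features; $\mathcal{N}(v)$ denotes the set of neighboring nodes $u$ of node $v$; $\epsilon^{(k)}$ is a learnable scalar parameter; $\mathrm{MLP}^{(k)}$ is a multilayer perceptron applied to each node; $h_v^{(k-1)}$ and $h_u^{(k-1)}$ are the outputs of node $v$ and its neighboring node $u$ in the $(k-1)$-th ResGIN layer, respectively; through a skip connection, $h_v^{(k-1)}$ is added to the raw output $\tilde{h}_v^{(k)}$ to obtain the output vector $h_v^{(k)}$ of node $v$ in the $k$-th ResGIN layer: $h_v^{(k)} = \tilde{h}_v^{(k)} + h_v^{(k-1)}$; accordingly, the output feature matrix of the $k$-th ResGIN layer is expressed as $H^{(k)} = \big[ h_1^{(k)}, h_2^{(k)}, \ldots, h_N^{(k)} \big]^{T}$, where $T$ is the matrix transpose symbol, $N$ is the number of nodes, and $h_1^{(k)}$, $h_2^{(k)}$, and $h_N^{(k)}$ are the output vectors of the 1st, 2nd, and $N$-th nodes in the $k$-th ResGIN layer, respectively.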

4. The method according to claim 2, characterized in that: the LSTM network consists of K LSTM layers, which correspond one-to-one with the K ResGIN layers of the graph isomorphic network; the node feature update model of the LSTM network is: $s_v^{(k)} = \mathrm{LSTM}\big( h_v^{(k)},\, s_v^{(k-1)} \big)$; where $h_v^{(k)}$ is the output of node $v$ in the $k$-th ResGIN layer, which serves as one of the inputs of the LSTM network at the $k$-th time step; $s_v^{(k)}$ and $s_v^{(k-1)}$ are the hidden states of the LSTM network after processing the features of the $k$-th layer and the $(k-1)$-th layer, respectively; the node feature matrix $S$ output by the LSTM network is expressed as $S = \big[ s_1^{(K)}, s_2^{(K)}, \ldots, s_N^{(K)} \big]^{T}$, where $s_1^{(K)}$, $s_2^{(K)}$, and $s_N^{(K)}$ are the hidden states output by the 1st, 2nd, and $N$-th nodes in the K-th LSTM layer, respectively.

5. The method according to claim 1, characterized in that: For both the output feature matrix of the GIN path and the node feature matrix of the LSTM path, the multi-head attention pooling operation is as follows: First, the interaction attention score between the two drug molecules in the drug pair is calculated using the cross-attention mechanism; Then, the interaction attention scores are aggregated along the dimension of the other drug and normalized to obtain the importance weight of each node for each drug. Finally, the importance weights are used to perform weighted aggregation on the corresponding output feature matrix or node feature matrix to generate pooled features for the corresponding GIN path or LSTM path.

6. The method according to claim 5, characterized in that: for the output feature matrices $H_A$ and $H_B$ of drug A and drug B of the target drug pair in the last ResGIN layer, the multi-head attention pooling operation is implemented as follows: first, the interaction attention score matrices between the drug pair are calculated: ; where $M_A$ and $M_B$ are the interaction attention score matrices of drug A and drug B along the ResGIN path; $H_A$ and $H_B$ are the output feature matrices of drug A and drug B, respectively; $W_a$ is a trainable weight matrix used to project node features into a common $d_a$-dimensional attention space; the element $M_A^{ij}$ quantifies the interaction strength between the $i$-th node of drug A and the $j$-th node of drug B, and the elements of $M_B$ are defined similarly; $N_A$ and $N_B$ are the numbers of nodes of drug A and drug B, respectively, and $d$ is the dimension of the node features; then, the attention scores are aggregated along the dimension of the other drug and normalized to obtain the importance weight of each node of each drug: ; where $\alpha_A$ and $\alpha_B$ are the node importance vectors of drug A and drug B on the ResGIN path, respectively; finally, the output feature matrices are weighted and aggregated using the importance weights to obtain the final interaction-enhanced graph-level representation vectors: ; where $g_A^{H}$ and $g_B^{H}$ are the graph-level representation vectors of drug A and drug B along the ResGIN path, respectively, i.e., the pooled GIN feature A and GIN feature B; $W_v$ is a trainable value transformation weight matrix; and the superscripts $i$ and $j$ index nodes.

7. The method according to claim 1, characterized in that: The initial feature construction process of the nodes is as follows: construct a molecular graph based on the SMILES string of the drug molecule and generate the basic feature vector of the nodes in the molecular graph, and convert the gene expression profile of a specific cell line into cell line features; then fuse the cell line features with the basic feature vector of each node to obtain the initial features of each node.

8. A system based on the method of any one of claims 1-7, characterized in that it includes:
a molecular graph characterization module, used to construct a molecular graph for each drug in the target drug pair and obtain the initial features of each node in the molecular graph;
a feature representation module based on the graph isomorphic network, used to input the initial features of all nodes into the K-layer graph isomorphic network to obtain the output feature matrices, wherein each layer of the graph isomorphic network outputs a set of output features containing the features of all nodes, and K is a positive integer;
a feature representation module based on the LSTM network, used to reconstruct the output features of each node in the different layers of the graph isomorphic network into a time series of length K, which is used as the input of the LSTM network to obtain an updated node feature matrix;
a graph attention pooling module, used to perform a multi-head attention pooling operation on the output feature matrix of the last layer of the graph isomorphic network and on the node feature matrix of the LSTM network to obtain the pooled features of each drug in the GIN path and the LSTM path, and then to concatenate and fuse the pooled features of all drugs in the GIN path and the LSTM path to obtain a fused representation vector;
a cell line feature fusion module, used to concatenate the fused representation vector with the cell line features to form the final feature vector;
a prediction output module, used to perform prediction based on the final feature vector to obtain the prediction result of drug synergy.

9. A computer device, characterized in that it includes: one or more processors; and a memory used to store one or more computer programs; wherein, when the one or more computer programs are executed by the one or more processors, the computer device performs the steps of the method as described in any one of claims 1-7.

10. A readable storage medium having a computer program stored thereon, characterized in that: When the program is executed by the processor, it implements the steps of the method as described in any one of claims 1-7.