A Drug-Target Prediction Method Based on Structure-Aware Message Passing

By using a structure-aware message-passing neural network, combined with multi-channel feature extraction and sequence generation tree, the structure and mutual information of drugs and targets are captured, solving the problem of missing information in drug-target prediction in existing technologies and improving prediction accuracy.

CN117238362B · Active · Publication Date: 2026-06-30 · UNIV OF ELECTRONICS SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
UNIV OF ELECTRONICS SCI & TECH OF CHINA
Filing Date
2023-09-28
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing technologies for drug-target interaction prediction ignore the structural information of molecular graphs, ignore the physical, chemical and biological properties of proteins when embedding protein features, and ignore the mutual information between drugs and targets.

Method used

A structure-aware message-passing neural network-based approach is adopted to predict drug-target interactions by preprocessing multi-channel target features, extracting structural features based on sequence generation trees, and extracting interactive information based on Transformer and message-passing neural networks.

Benefits of technology

It improves the accuracy of drug-target interaction prediction by acquiring rich drug structure information and multidimensional protein information to capture the bidirectional interaction between drugs and targets.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention belongs to the field of bioinformatics and relates to drug-target interaction (DTI) identification technology, specifically providing a drug-target prediction method based on structure-aware message passing. This invention proposes a structure feature extraction module based on sequence generation trees to model the multi-scale structure of drugs and extract features, incorporating structural information into the representation update process of the drug molecular graph, thereby enriching the molecular graph representation. Furthermore, multiple protein correlation matrices are used to fuse information such as protein structure and physicochemical properties, enhancing protein representation. Finally, an interaction feature extraction module based on transformers and message passing neural networks is used to better capture the bidirectional interaction between drugs and targets, thereby improving the accuracy of drug-target prediction.

Description

Technical Field

[0001] This invention belongs to the field of bioinformatics and relates to drug-target interaction (DTI) identification technology, specifically providing a drug-target prediction method based on structure-aware message passing.

Background Technology

[0002] Drug targets refer to biomolecules in the body that possess pharmacological functions and can interact with drugs to produce the expected therapeutic effect, such as some proteins and nucleic acids. Among them, proteins account for a large proportion of drug targets in the human body. Drug-target interaction (DTI) identification is one of the important issues in the field of new drug development, namely, determining whether there is an interaction between drug molecules and targets, and based on this, searching for drug molecules that can act on specific targets. At the same time, DTI identification is also the foundation for multiplex pharmacology and drug repositioning studies. Therefore, conducting research on predicting the interaction relationships between drug compound molecules and target proteins is of great significance. However, conducting DTI research through biological experiments is time-consuming, costly, and risky. With the rapid advancement of information science, artificial intelligence has been widely applied in the field of drug development, and computer-aided drug design has become an effective means of DTI research.

[0003] For example, the paper "Communicative representation learning on attributed molecular graphs" (IJCAI) discloses a directed graph-based message-passing neural network (MPNN) that interactively updates the embeddings of edges and nodes, enhancing message interaction between nodes and edges and thereby improving the embedding of molecular graphs. Another example is the paper "Predicting drug-protein interaction using quasi-visual question answering system" (Nature Machine Intelligence), which transforms the drug-target interaction prediction problem into a visual question-answering problem: the image is a distance map of the protein, the question is the one-dimensional sequence of the drug, and the answer is whether they interact. It uses an attention-based dynamic convolutional neural network to learn a fixed-size representation of the variable-length protein distance map, and introduces a bidirectional long short-term memory network to extract drug features from the one-dimensional sequence.

[0004] Currently, artificial intelligence technology has made some progress in drug-target prediction, but many problems remain: 1) the structural information of molecular graphs is ignored during message passing, although substructure information is closely related to DTI; 2) protein feature embedding ignores the physical, chemical and biological properties of the protein; 3) the mutual information between drug and target is ignored when predicting drug-target interactions.

Summary of the Invention

[0005] The purpose of this invention is to address the numerous problems existing in the prior art by proposing a drug-target interaction prediction method based on a structure-aware message-passing neural network. This method improves the accuracy of drug-target interaction prediction by using multi-channel target feature preprocessing, structural feature extraction based on sequence generation trees, and interactive information extraction based on transformers and message-passing neural networks.

[0006] To achieve the above objectives, the technical solution adopted by the present invention is as follows:

[0007] A drug-target prediction method based on structure-aware message passing, characterized by the following steps:

[0008] Step 1: Construct model inputs, including target inputs and drug inputs;

[0009] The target input set SetM_protein is represented as:

[0010]

[0011] where the six matrices of the set respectively represent: the protein amino acid coding matrix; the protein distance map; the protein evolution matrix; the protein physicochemical property difference matrix; the matrix of three-dimensional structural features of the protein at the statistical potential level; and the matrix of three-dimensional structural features of the protein at the contact energy level;

[0012] The drug input includes a SMILES sequence and a drug molecular graph. The SMILES sequence of the drug is represented as Smi = (d_1, …, d_{l′}), where d_i represents the i-th character in the sequence and l′ is the length of the SMILES sequence; the drug molecular graph is represented as G = (V, E), where V represents the set of nodes and E represents the set of edges;

[0013] Step 2: Construct a target feature preprocessing module, including a multi-channel feature extraction module and a feature selection module; the multi-channel feature extraction module is used to extract the protein attention feature matrix of each channel in the target input. The feature selection module is used to calculate the attention score of the protein feature matrix of each channel in the target input. The target feature matrix P is obtained by weighted summation.

[0014] Step 3: Construct a drug feature preprocessing module, including a structural feature extraction module based on subsequence generation tree, a drug molecule graph atomic feature initialization module, and a drug molecule graph edge feature initialization module. The structural feature extraction module based on subsequence generation tree is used to extract the drug structural feature matrix S from the drug's SMILES sequence. The drug molecule graph atomic feature initialization module encodes all nodes in the point set V of the drug molecule graph, obtaining the corresponding node feature vectors. The drug molecule graph edge feature initialization module encodes all edges in the edge set E of the drug molecule graph, resulting in corresponding edge feature vectors.

[0015] Step 4: Construct a drug-target interaction feature extraction module that takes as input the node features X_v and the edge features of the drug molecular graph, the drug structure feature matrix S, and the target feature matrix P, performs structure-aware message passing, and after L iterations outputs the drug feature vector D_final and the target feature vector P_final;

[0016] Step 5: Construct a drug-target interaction prediction module and use a multilayer perceptron model to calculate the prediction result Y:

[0017] Y = sigmoid(W_2 ReLU(W_1 [P_final, D_final]))

[0018] W1 and W2 are weight matrices.
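
The prediction step in [0017] can be illustrated with a minimal sketch, assuming plain Python lists for vectors and matrices. The helper names (`matvec`, `relu`, `predict_interaction`), the toy dimensions, and the toy weights are illustrative assumptions and are not taken from the patent; in the method itself W1 and W2 are learned.

```python
import math

def matvec(W, x):
    """Multiply a matrix W (list of rows) by a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_interaction(p_final, d_final, W1, W2):
    """Y = sigmoid(W2 ReLU(W1 [P_final, D_final]))."""
    joint = list(p_final) + list(d_final)   # concatenation [P_final, D_final]
    hidden = relu(matvec(W1, joint))
    logit = matvec(W2, hidden)[0]           # W2 is assumed to have one output row
    return sigmoid(logit)

# Toy example (made-up numbers): 2-dim target vector, 2-dim drug vector, 3 hidden units.
p_final = [0.5, -1.0]
d_final = [1.0, 2.0]
W1 = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 1.0]]
W2 = [[1.0, 1.0, -1.0]]
y = predict_interaction(p_final, d_final, W1, W2)
```

The output Y lies strictly between 0 and 1 and can be read as an interaction probability.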

[0019] Furthermore, in step 2, the multi-channel feature extraction module consists of a dynamic convolutional neural network and a self-attention block, wherein,

[0020] The dynamic convolutional neural network consists of multiple ResNet blocks and an average pooling layer; it takes each target input matrix as input and transforms it into a protein matrix;

[0021] The self-attention block is a two-layer bias-free perceptron model, and the weight matrix is obtained through the self-attention block:

[0022]

[0023] where both parameters of the perceptron are weight matrices;

[0024] The protein attention feature matrix is calculated from the weight matrix:

[0025]

[0026] This yields the protein attention feature matrices for all channels, forming a set SetM protein ":

[0027]

[0028] Furthermore, in step 2, within the feature selection module, a perceptron model is first applied to calculate the weight scores of the protein feature matrices for each channel.

[0029]

[0030] where the perceptron parameters are a weight matrix and a bias matrix, and q represents the shared attention vector.

[0031] Then, the softmax function is applied to normalize the weight scores to obtain the attention scores of the protein feature matrices for each channel.

[0032]

[0033] Finally, the target feature matrix P is calculated:

[0034] Furthermore, in step 3, the structural feature extraction module based on the subsequence generation tree consists of a multi-scale subsequence generation tree construction module and a feature extraction module;

[0035] In the multi-scale subsequence spanning tree construction module, a subsequence spanning tree is constructed based on the drug's SMILES sequence Smi. Each character in the drug SMILES sequence is modeled as a leaf node, a continuous subsequence of length k in the drug SMILES sequence is modeled as a substructure node, and the entire drug SMILES sequence is modeled as a global node. During the tree construction process, the leaf nodes corresponding to the characters contained in the subsequence are regarded as the child nodes of the corresponding substructure nodes of the continuous subsequence. If two subsequences have overlapping parts, the corresponding leaf nodes are copied. After generating subsequence spanning trees at k scales, the multi-scale subsequence spanning tree construction module performs initialization operations.

[0036] The feature extraction module iteratively updates the subsequence generation tree at each scale, uses the hidden vector of the global node in the updated subsequence generation tree as the drug feature vector, and concatenates the drug feature vectors of the subsequence generation trees at k scales to obtain the drug structure feature matrix S.

[0037] Furthermore, the initialization operation is as follows: for leaf nodes, the word embedding model GloVe is used to obtain the node embedding D_i; for each substructure node, a convolutional neural network model is used to learn and extract the substructure embedding Subs_i; for global nodes, average pooling is performed on the substructure embeddings to obtain their initial embeddings.

[0038] Iterative updates involve updating the representation of each node in the spanning tree of each subsequence. The specific process is as follows:

[0039] 1) Mutual attention;

[0040] A multi-head attention mechanism is applied, mapping the center node and the source node to the query vector Q and the key vector K, respectively; for the i-th center node c and any of its source nodes s, the functions L_Q and L_K are applied, respectively, to perform a linear mapping:

[0041]

[0042]

[0043] where h_i(·) represents the hidden vector of the node inside the parentheses, and Q_i(c) and K_i(s) represent the results of the linear mapping of the center node c and the source node s, respectively;

[0044] Calculate the attention score Att(c,s) of each source node s with respect to the center node c:

[0045] Att(c,s) = softmax([AttHead_1(c,s), …, AttHead_kH(c,s)])

[0046]

[0047] where d_k is the dimension of the key vector, [·] denotes the concatenation operation, and kH is the number of attention heads.

[0048] 2) Message extraction;

[0049] The message vector Mes_i(c,s) of the i-th attention head is extracted from the source node s by a linear projection:

[0050]

[0051] Merging the multi-head attention message vectors yields the message vector Mes(c,s) passed from each source node to the central node:

[0052] Mes(c,s) = [Mes_1(c,s), …, Mes_kH(c,s)]

[0053] 3) Message fusion;

[0054] The hidden vector of the center node is updated as follows:

[0055]

[0056] Where N(c) represents the set of neighboring nodes of the central node c;

[0057] Next, the hidden vector is updated using the nonlinear activation function σ and residual connections to obtain h′_i(c):

[0058]

[0059] Finally, a feedforward fully connected neural network (FFN) is applied to obtain the feature representation of the central node c.

[0060]

[0061] 4) Iterative updates;

[0062] A bottom-up strategy is adopted to update every node of the spanning tree for each subsequence.

[0063] Furthermore, in step 3, the atomic features of the drug molecular graph are initialized as follows: all nodes in the node set V are encoded according to the atomic features Atom type, Degree, Num of H, Valence, and Aromaticity, each using one-hot encoding.

[0064] The edge features of the drug molecule graph are initialized as follows: all edges in edge set E are encoded according to the chemical bond features Bond type, Conjugation, Ring, and Stereo, and one-hot encoding is also used.

[0065] Furthermore, step 4 specifically involves:

[0066] First, for each node v ∈ V in graph G, the representations of all incoming edges are gathered, and the drug structure feature matrix S is added at the same time, to obtain the message vector m^k(v):

[0067]

[0068] where MP denotes the message passing function;

[0069] The current hidden state of the node h^{k-1}(v) is concatenated with the message vector m^k(v), and the node's hidden state is updated via the communication function CF:

[0070]

[0071] where h^0(v) = X_v;

[0072] The Interformer module learns the mutual information between the drug and the target, and updates the representation vectors of the drug and the target:

[0073]

[0074] where P^0 = P;

[0075] For each edge, the hidden state is updated using the update function UF:

[0076] h^k(e_{v,w}) = UF(h^k(v), h^{k-1}(e_{w,v}))

[0077] After L iterations, a final iteration is performed to obtain the feature vectors D_final and P_final of the drug and the target:

[0078]

[0079] h(v) = CF(m(v), h^L(v), X_v)

[0080] h_final(v), P_final = Interformer(h(v), P^L)

[0081]

[0082] Here, Readout represents the readout layer.

[0083] Furthermore, in step 4, the Interformer module is implemented by two interactive transformer decoders. Each decoder consists of multiple identical layers, and each layer includes three sub-layers: a multi-head self-attention layer, a multi-head interactive attention layer, and a feedforward network layer.

[0084] Based on the above technical solution, the beneficial effects of the present invention are as follows:

[0085] This invention proposes a drug-target prediction method based on structure-aware message passing. First, a structural feature extraction module based on a subsequence generation tree is proposed. This module extracts drug structural features and uses them in the subsequent message passing process, ensuring that the transmitted message includes not only neighborhood information but also structural information, resulting in an information-rich drug molecule representation. Second, a target feature preprocessing module is proposed that considers multidimensional protein information, including structural information as well as physical, chemical, and biological information. Feature selection is then performed over these multiple features to ensure sufficient target feature richness while reducing potential information redundancy. Finally, an interaction information extraction module based on Transformer and message passing neural networks is proposed to better capture the bidirectional interaction between the drug and the target, rather than simply concatenating drug and target features, thus obtaining better interaction prediction results.

Attached Figure Description

[0086] Figure 1 is a flowchart illustrating the drug-target prediction method based on structure-aware message passing in this invention.

[0087] Figure 2 is a schematic diagram of the structure of the dynamic convolutional neural network in the multi-channel feature extraction module of this invention.

[0088] Figure 3 is a schematic diagram of the structure of the self-attention block in the multi-channel feature extraction module of this invention.

[0089] Figure 4 is a schematic diagram of the structural feature extraction module based on the subsequence generation tree in this invention.

[0090] Figure 5 is a schematic diagram of the structure of the drug-target interaction feature extraction module in this invention.

Detailed Implementation

[0091] To make the objectives, technical solutions, and beneficial effects of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments.

[0092] This embodiment provides a drug-target interaction prediction method based on a structure-aware message passing neural network. The data used comes from the Yamanashi and DrugBank datasets, which include 1481 drugs, 1408 proteins and 9880 DTIs. The target data used are all proteins, so the concepts of target and protein are not distinguished.

[0093] The flowchart of the drug-target interaction prediction method based on a structure-aware message-passing neural network in this embodiment is shown in Figure 1; the specific steps are as follows:

[0094] Step 1: Model Input

[0095] Step 1.1, Target Input Representation

[0096] Assume the original amino acid sequence of the protein is represented as Seq = (a_1, …, a_l), where a_i represents the i-th amino acid in the sequence and l represents the length of the amino acid sequence;

[0097] The protein input in the model is represented by 6 l×l matrices, and the matrix set is as follows:

[0098]

[0099] where the first matrix represents the protein amino acid coding matrix. Define a mapping f: A_pr × A_pr → N_pr, where A_pr represents the set of amino acids, A_pr × A_pr = {<a,b>; a,b ∈ A_pr} represents the Cartesian product of A_pr with itself, and N_pr represents a set of integers, such that <a,b> and <b,a> map to the same element of N_pr; based on this surjective mapping, the protein amino acid coding matrix can be obtained.
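
As a minimal sketch of the symmetric mapping described in [0099], the following assigns one integer to every unordered amino acid pair and fills an l×l matrix from a sequence. The particular enumeration of pairs, the residue ordering in `AMINO_ACIDS`, and the function names are assumptions chosen for illustration; the patent only requires that <a,b> and <b,a> receive the same code.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (ordering is an assumption)

def pair_code(a, b):
    """Symmetric integer code for an amino acid pair: pair_code(a, b) == pair_code(b, a)."""
    n = len(AMINO_ACIDS)
    i, j = sorted((AMINO_ACIDS.index(a), AMINO_ACIDS.index(b)))
    # Enumerate unordered pairs (i <= j) row by row: row i starts at i*n - i*(i-1)/2.
    return i * n - i * (i - 1) // 2 + (j - i)

def coding_matrix(seq):
    """Build the l x l amino acid coding matrix for a sequence of length l."""
    return [[pair_code(a, b) for b in seq] for a in seq]

matrix = coding_matrix("ACA")
```

With 20 residues this enumeration covers 210 distinct unordered pairs, so the mapping is surjective onto the integers 0 to 209.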

[0100] A protein distance map is used to generate and compare protein 3D structures. It is a compact representation of the protein's three-dimensional structure, characterized by pairwise distances between amino acids. The formula for calculating the pairwise distances between amino acids is as follows:

[0101]

[0102] where d(a_i, a_j) represents the distance between the Cα atoms of amino acids i and j, and d_0 represents the distance between adjacent Cα atoms.

[0103] The remaining four matrices represent the protein evolution matrix, the protein physicochemical property difference matrix, the matrix of three-dimensional structural features of the protein at the statistical potential level, and the matrix of three-dimensional structural features of the protein at the contact energy level. AAindex is a database of matrices representing the physicochemical and biochemical properties of amino acids and amino acid pairs. This invention uses four 20×20 amino acid pair matrices. For the amino acid sequence of a protein, the values of the corresponding amino acid pairs are looked up in the four AAindex matrices to obtain the protein matrices.
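
The lookup described in [0103] can be sketched as follows, assuming the pair matrix is available as a nested list indexed by a residue alphabet. The three-letter alphabet and the values in `toy_table` are made up for illustration and are not AAindex data; a real table would be 20×20.

```python
def pair_property_matrix(seq, alphabet, table):
    """Return the l x l matrix whose (i, j) entry is table[seq[i]][seq[j]]."""
    index = {aa: k for k, aa in enumerate(alphabet)}
    return [[table[index[a]][index[b]] for b in seq] for a in seq]

# Toy 3x3 pair table (hypothetical values) over a reduced alphabet.
alphabet = "ACD"
toy_table = [[0.0, 1.5, -0.2],
             [1.5, 0.3, 0.7],
             [-0.2, 0.7, 0.9]]
protein_matrix = pair_property_matrix("CADA", alphabet, toy_table)
```

Applying the same function to each of the four pair matrices yields four l×l channels for a protein of length l.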

[0104] Step 1.2, Drug Input Representation

[0105] The model uses two numerical representations of the drug input: SMILES sequences and drug molecular graphs, as follows:

[0106] Drug SMILES sequences: The Simplified Molecular Input Line Entry System (SMILES) is a specification that explicitly describes the structure of drug molecules using ASCII strings. The SMILES sequences of drug compounds are extracted from the PubChem compound database based on PubChem CID. The SMILES sequence of a drug is represented as Smi = (d_1, …, d_{l′}), where d_i represents the i-th character in the sequence and l′ is the length of the SMILES sequence;

[0107] Drug molecule graph: The drug molecule graph uses atoms as nodes and the chemical bonds between atoms as edges, represented as G=(V,E), where V represents the set of nodes and E represents the set of edges. The node set V and edge set E of the drug molecule graph are obtained by converting the SMILES sequence using the open source software package RDKit.

[0108] Step 2: Target feature preprocessing module, including multi-channel feature extraction module and feature selection module;

[0109] Step 2.1, for the multi-channel feature extraction module

[0110] The multi-channel feature extraction module consists of a dynamic convolutional neural network and a self-attention block. For the input of 6 l×l protein representation matrices, the dynamic convolutional neural network can handle variable-length inputs, and the self-attention block reflects the importance of different amino acids, ultimately resulting in 6 fixed-size protein attention feature matrices.

[0111] The dynamic convolutional neural network, shown in Figure 2, consists of multiple ResNet blocks (the number of ResNet blocks is a model hyperparameter, and the specific number is obtained through training) and an average pooling layer. The network transforms each protein input matrix into a new protein matrix.

[0112] The self-attention block, shown in Figure 3, is a two-layer bias-free perceptron model. Treating the number of neurons r in the last layer as the number of attention heads, a weight matrix is obtained through this self-attention block:

[0113]

[0114] where both weight matrices of the perceptron are learnable;

[0115] Then, the weight matrix is multiplied by the protein matrix to obtain the protein attention feature matrix:

[0116]

[0117] Finally, the protein attention feature matrices for all channels are obtained, and their set is as follows:

[0118]

[0119] Step 2.2, for the feature selection module

[0120] For multi-channel protein attention feature matrices, a feature-level attention network is used for feature selection to learn the importance of each feature matrix for DTI prediction. This module yields the protein feature matrix after feature selection.

[0121] First, the perceptron model is used to calculate the weight scores of the protein feature matrices for each channel:

[0122]

[0123] where the perceptron parameters are a learnable weight matrix and a bias matrix, and q represents a shared attention vector.

[0124] Next, the attention scores of the protein feature matrices for each channel are obtained by normalizing the weight scores using the softmax function:

[0125]

[0126] Then, the final representation of the protein feature matrix can be obtained:

[0127]
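
The last two parts of the feature selection module (softmax normalization of the per-channel weight scores and the weighted sum of the channel matrices) can be sketched as follows. The sketch assumes the per-channel weight scores have already been computed by the perceptron; the function names and the toy inputs are illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_channels(channel_matrices, weight_scores):
    """Weighted sum of same-shaped channel matrices using softmax attention scores."""
    alphas = softmax(weight_scores)
    rows, cols = len(channel_matrices[0]), len(channel_matrices[0][0])
    fused = [[sum(a * M[r][c] for a, M in zip(alphas, channel_matrices))
              for c in range(cols)] for r in range(rows)]
    return fused, alphas

# Two toy channels with equal scores: the fused matrix is their average.
P, alphas = fuse_channels([[[1.0, 2.0]], [[3.0, 4.0]]], [0.0, 0.0])
```

Channels with higher weight scores contribute more to P, which is how the module can down-weight redundant channels.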

[0128] Step 3: Drug Feature Preprocessing Module

[0129] Step 3.1: Structural feature extraction module based on subsequence generation tree.

[0130] The structural feature extraction module based on the subsequence generation tree, shown in Figure 4, mainly consists of a multi-scale subsequence generation tree construction module and a feature extraction module;

[0131] To obtain substructure and global structure information of drug compound molecules, a multi-scale subsequence spanning tree is constructed. The subsequence spanning tree is built based on the drug's SMILES sequence Smi. Each character in the drug's SMILES sequence is modeled as a leaf node, consecutive subsequences of length k in the drug's SMILES sequence are modeled as substructure nodes, and the entire drug SMILES sequence is modeled as a global node. During tree construction, the leaf nodes corresponding to characters in a subsequence are considered as child nodes of the corresponding substructure node of that consecutive subsequence. If two subsequences overlap, the corresponding leaf nodes are copied. Figure 4 gives a schematic diagram of a 2-subsequence spanning tree: for the drug Smi = (d_1 d_2 … d_{l′}), each character d_i is modeled as a leaf node, all consecutive 2-subsequences d_1d_2, d_2d_3, … are modeled as substructure nodes, and the entire SMILES sequence d_1d_2…d_{l′} is modeled as a global node;
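
The tree construction in [0131] can be sketched as below, assuming a nested-dictionary representation. The function name and the dictionary keys (`text`, `children`, `char`, `pos`) are illustrative assumptions; the sketch treats every character of the SMILES string as one leaf, as the text describes.

```python
def build_subsequence_tree(smiles, k):
    """Build the k-subsequence tree of a SMILES string as nested dictionaries."""
    if not 1 <= k <= len(smiles):
        raise ValueError("k must be between 1 and the SMILES length")
    substructures = []
    for start in range(len(smiles) - k + 1):
        sub = smiles[start:start + k]
        # Leaves are created per window, so overlapping windows get their own copies.
        leaves = [{"char": ch, "pos": start + off} for off, ch in enumerate(sub)]
        substructures.append({"text": sub, "children": leaves})
    # The global node covers the whole sequence and owns all substructure nodes.
    return {"text": smiles, "children": substructures}

tree = build_subsequence_tree("CCO", 2)
```

For "CCO" with k = 2 the global node has the substructure nodes "CC" and "CO", and the middle character appears as a separate leaf copy under each of them. Calling the function for several values of k gives the multi-scale set of trees.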

[0132] The model generates k tree structures at different scales (k is a model hyperparameter), and then performs initialization operations for the generated trees; specifically, for leaf nodes, the word embedding model GloVe is used to obtain the node embedding D_i; for each substructure node, a convolutional neural network model is used to learn and extract the substructure embedding Subs_i; for global nodes, average pooling is performed on the substructure embeddings to obtain their initial embeddings.

[0133] To perform feature extraction, this invention constructs a tree transformer layer based on the transformer model to iteratively update the representation of each node in each subsequence generation tree. This mainly includes four steps: mutual attention, message extraction, message fusion, and iterative update. Note that this invention refers to the node currently being updated as the center node, and to nodes connected to the center node by edges as source nodes. For a specific node in the model, h_i denotes its hidden vector representation;

[0134] 1) Mutual attention

[0135] The importance of each source node to the center node is estimated by establishing mutual attention between the center node and the source nodes. Specifically, a multi-head attention mechanism is first applied, mapping the center node and the source nodes to the query vector Q and the key vector K, respectively. That is, for the i-th center node c and any of its source nodes s, the functions L_Q and L_K are applied, respectively, to perform the linear mapping as follows:

[0136]

[0137]

[0138] Then, calculate the attention score Att(c,s) of each source node s with respect to the center node c:

[0139] Att(c,s) = softmax([AttHead_1(c,s), …, AttHead_kH(c,s)])

[0140]

[0141] where d_k is the dimension of the key vector, [·] denotes the concatenation operation, and kH is the number of attention heads.

[0142] 2) Message Extraction

[0143] This process extracts the information passed from the source node s to the center node c. Specifically, the message vector Mes_i(c,s) of the i-th attention head is first extracted from the source node s by a linear projection:

[0144]

[0145] Then, the multi-head attention message vectors are merged to obtain the message vector Mes(c,s) passed from each source node to the central node:

[0146] Mes(c,s) = [Mes_1(c,s), …, Mes_kH(c,s)]

[0147] 3) Message fusion

[0148] This process obtains the context node representation by aggregating neighborhood messages using the corresponding attention scores. The hidden vector update of the central node is represented as follows:

[0149]

[0150] Where N(c) represents the set of neighboring nodes of the central node c;

[0151] Next, the hidden vector representation is updated using a nonlinear activation function σ and residual connections:

[0152]

[0153] Finally, a feedforward fully connected neural network (FFN) is applied to obtain the final representation of the center node c:

[0154]

[0155] 4) Iterative updates

[0156] This invention adopts a bottom-up update strategy, where coarse-grained nodes aggregate fine-grained node features to obtain rich substructure features and drug feature representations. Specifically, substructure nodes obtain fine-grained information from leaf nodes to update substructure feature representations, and global nodes obtain fine-grained information from substructure nodes to update drug feature representations.
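
The four steps above can be illustrated with a deliberately simplified sketch. It assumes a single attention head, identity maps in place of the learned projections L_Q, L_K and the message projection, scaled dot-product scoring, ReLU as the activation σ, and no feedforward network; none of these choices is stated in the patent, and all function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(center, sources):
    """Update a center node from its source nodes (single head, identity projections)."""
    d_k = len(center)
    # Mutual attention: scaled dot product between the center and each source.
    logits = [dot(center, s) / math.sqrt(d_k) for s in sources]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    att = [e / total for e in exps]
    # Message fusion: attention-weighted sum of the source messages.
    fused = [sum(a * s[i] for a, s in zip(att, sources)) for i in range(d_k)]
    # Nonlinearity (ReLU assumed) followed by a residual connection.
    return [max(0.0, f) + c for f, c in zip(fused, center)]

def bottom_up_update(leaf_groups, sub_vectors, global_vector):
    """Leaves update substructure nodes, then substructure nodes update the global node."""
    new_subs = [attend(sub, leaves) for sub, leaves in zip(sub_vectors, leaf_groups)]
    new_global = attend(global_vector, new_subs)
    return new_subs, new_global

leaf_groups = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
subs, glob = bottom_up_update(leaf_groups, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```

The updated global vector is what would serve as the drug feature vector for one scale; repeating this for each of the k trees and concatenating the results corresponds to forming S.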

[0157] Finally, the features of global nodes under different scale spanning trees are integrated to obtain drug features rich in substructure and global structure. Specifically, the integration method is to concatenate the drug feature vectors of different scale substructure spanning trees to obtain the final drug structure feature matrix S as the structural information for subsequent message passing.

[0158] Step 3.2: Initialization of Atomic Features of the Drug Molecular Graph

[0159] All nodes in the node set V are encoded according to five atomic features, which are described in detail in Table 1. The present invention uses one-hot encoding for all five features.

[0160] Table 1

[0161]

[0162] For the atomic feature "Atom type", there are 43 known atomic descriptors. "Unknown" is defined as any atom other than these 43. One-hot encoding is used to encode this atomic feature, with a one-hot vector length of 44. For the atomic features "Degree", "Num of H", and "Valence", the range of the number of adjacent atoms, hydrogen atoms, and implicit hydrogen atoms is set to [0,10]. Therefore, one-hot encoding is used to encode these three atomic features, with a one-hot vector length of 11 for each. For the atomic feature "Aromaticity", there are only two possibilities: whether the atom is in an aromatic structure. Therefore, when using one-hot encoding to encode this atomic feature, the one-hot vector length is set to 1. Concatenating the one-hot encoded vectors of these five atomic features yields a vector of length 78, which is the node feature vector.
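
The length-78 node feature vector can be sketched as follows. Because the 43 atom types are not listed in the text, the sketch identifies an atom type by a hypothetical index 0 to 42 (or `None` for "Unknown"); the function names are likewise assumptions.

```python
N_ATOM_TYPES = 43   # known atom types; index 43 is reserved for "Unknown"
MAX_COUNT = 10      # Degree, Num of H and Valence each range over [0, 10]

def one_hot(index, size):
    vec = [0] * size
    vec[index] = 1
    return vec

def atom_features(type_index, degree, num_h, valence, aromatic):
    """Concatenate the five one-hot encoded atom features into one vector."""
    if type_index is not None and not 0 <= type_index < N_ATOM_TYPES:
        raise ValueError("type_index must be in [0, 42] or None")
    slot = N_ATOM_TYPES if type_index is None else type_index
    feats = one_hot(slot, N_ATOM_TYPES + 1)           # 44 entries
    for count in (degree, num_h, valence):
        if not 0 <= count <= MAX_COUNT:
            raise ValueError("counts must be in [0, 10]")
        feats += one_hot(count, MAX_COUNT + 1)        # 11 entries each
    feats.append(1 if aromatic else 0)                # 1 entry
    return feats

x_v = atom_features(type_index=0, degree=4, num_h=0, valence=0, aromatic=False)
```

The lengths add up as 44 + 11 + 11 + 11 + 1 = 78, matching the total stated above.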

[0163] Step 3.3: Initialization of drug molecule graph edge features

[0164] Similarly, all edges in edge set E are encoded according to the four characteristics of chemical bonds, and a detailed description of the chemical bond characteristics is shown in Table 2; one-hot encoding is also used for these four characteristics.

[0165] Table 2

[0166]
Feature       Feature description                              Size
Bond type     single, double, triple, aromatic                 4
Conjugation   Whether the chemical bond is conjugated          1
Ring          Whether the chemical bond is located in a ring   1
Stereo        Stereochemical structure of the chemical bond    4
Total                                                          10

[0167] "Bond type" encodes the type of chemical bond, with a one-hot vector length of 4; "Conjugation" and "Ring" encode whether the chemical bond is conjugated and whether it is in a ring, respectively, each with a one-hot vector length of 1; "Stereo" encodes the stereochemical structure of the chemical bond, with a one-hot vector length of 4. Concatenating the one-hot encoded vectors of these four bond features yields a vector of length 10, which is the edge feature vector.
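
A minimal sketch of the length-10 edge feature vector follows. The four stereo categories are not named in the text, so they are represented here by a hypothetical index 0 to 3; the function name is also an assumption.

```python
BOND_TYPES = ("single", "double", "triple", "aromatic")
N_STEREO = 4  # the four stereo categories are not named in the text

def bond_features(bond_type, conjugated, in_ring, stereo_index):
    """Concatenate the four encoded bond features into one vector."""
    if bond_type not in BOND_TYPES:
        raise ValueError("unsupported bond type")
    if not 0 <= stereo_index < N_STEREO:
        raise ValueError("stereo_index must be in [0, 3]")
    feats = [1 if bond_type == b else 0 for b in BOND_TYPES]        # 4 entries
    feats += [int(conjugated), int(in_ring)]                         # 1 + 1 entries
    feats += [1 if stereo_index == s else 0 for s in range(N_STEREO)]  # 4 entries
    return feats

e_feat = bond_features("aromatic", conjugated=True, in_ring=True, stereo_index=0)
```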

[0168] Step 4: Drug-Target Interaction Feature Extraction Module

[0169] This embodiment combines a Transformer-based architecture (Interformer) with a message-passing neural network to build the drug-target interaction feature extraction module, as shown in Figure 5;

[0170] The drug molecule is modeled as a directed graph G = (V, E). Taking as input the node features, the edge features, the drug structure feature matrix S, and the protein feature matrix P, structure-aware message passing is run for L iterations to learn and output the drug feature vector $D_{final}$ and the target feature vector $P_{final}$, both of which carry the learned interaction features;

[0171] Specifically, for each node v ∈ V in graph G, the representations $h^{k-1}(e_{u,v})$ of all incoming edges are first aggregated. At the same time, the drug structure feature matrix S is added, so that the transmitted message contains not only information about neighboring nodes but also structural information, thus obtaining the message vector $m^k(v)$:

[0172]

[0173] where MP denotes the message passing function;

[0174] The current hidden state $h^{k-1}(v)$ of the node is concatenated with the message vector $m^k(v)$, and the hidden state of the node is updated through the communication function CF:

[0175]

[0176] where $h^0(v) = X_v$;

[0177] The communication function here uses a multilayer perceptron model, which can exchange messages from feature vectors of different dimensions:

[0178]

[0179] where the weight matrix of the multilayer perceptron is learnable;
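As a non-limiting illustration, one node update of the structure-aware message passing can be sketched with NumPy. The exact forms of MP and CF are not reproduced in this text, so the sketch makes explicit assumptions: MP sums the incoming edge states and adds a structure vector `s` (taken to be derived from S and already projected to the hidden size), and CF is a single linear layer with ReLU applied to the concatenation. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (assumed)

# Toy directed graph: a chain of 3 atoms, each bond stored in both directions.
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
h_node = rng.normal(size=(3, d))           # h^{k-1}(v)
h_edge = rng.normal(size=(len(edges), d))  # h^{k-1}(e_{u,v})
s = rng.normal(size=d)                     # structure vector from S (assumed)


def message_passing(h_edge, edges, s, num_nodes):
    """Assumed MP: m^k(v) = s + sum of the states of edges entering v."""
    m = np.tile(s, (num_nodes, 1))
    for idx, (_, v) in enumerate(edges):
        m[v] += h_edge[idx]
    return m


def communicate(h_node, m, W):
    """Assumed CF: ReLU of a linear map applied to [h^{k-1}(v), m^k(v)]."""
    return np.maximum(np.concatenate([h_node, m], axis=1) @ W, 0.0)


W = rng.normal(size=(2 * d, d)) * 0.1  # learnable weight matrix (random here)
m = message_passing(h_edge, edges, s, num_nodes=3)
h_new = communicate(h_node, m, W)      # h^k(v)
print(h_new.shape)
```

Because the structure vector enters every message, each node receives the same global structural signal in addition to its local neighborhood, which is the intent described above.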

[0180] The Interformer module learns the mutual information between the drug and the target, and updates the representation vectors of the drug and the target:

[0181]

[0182] where $P^0 = P$;

[0183] For each edge, the hidden state of the edge is updated using the update function UF:

[0184]

[0185] Here, a multilayer perceptron model is used to implement the update function:

[0186]

[0187] $m^k(e_{v,w}) = h^k(v) - h^{k-1}(e_{w,v})$

[0188] where the weight matrix of the multilayer perceptron is learnable, and $m^k(e_{v,w})$ is the message vector passed from node v to edge $e_{v,w}$;
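As a non-limiting illustration, the edge update can be sketched with NumPy. The edge message follows the formula given above, in which the state of the reverse edge is subtracted from the updated node state; the form of UF is an assumption (a single linear layer with ReLU over the concatenation of the previous edge state and the message). All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8  # hidden size (assumed)

edges = [(0, 1), (1, 0), (1, 2), (2, 1)]        # each bond in both directions
h_node_k = rng.normal(size=(3, d))              # h^k(v), already updated
h_edge_prev = rng.normal(size=(len(edges), d))  # h^{k-1}(e)


def edge_messages(h_node_k, h_edge_prev, edges):
    """m^k(e_{v,w}) = h^k(v) - h^{k-1}(e_{w,v}) for every directed edge."""
    index = {e: i for i, e in enumerate(edges)}
    m = np.empty_like(h_edge_prev)
    for i, (v, w) in enumerate(edges):
        m[i] = h_node_k[v] - h_edge_prev[index[(w, v)]]
    return m


def update_edges(h_edge_prev, m_edge, W):
    """Assumed UF: ReLU of a linear map applied to [h^{k-1}(e), m^k(e)]."""
    return np.maximum(np.concatenate([h_edge_prev, m_edge], axis=1) @ W, 0.0)


W = rng.normal(size=(2 * d, d)) * 0.1  # learnable weight matrix (random here)
m_edge = edge_messages(h_node_k, h_edge_prev, edges)
h_edge_k = update_edges(h_edge_prev, m_edge, W)  # h^k(e)
print(h_edge_k.shape)
```

Subtracting the reverse edge state keeps the information that came from w out of the message sent back toward w along edge $e_{v,w}$.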

[0189] The Interformer in this module is implemented by two interactive transformer decoders to extract drug and target features that can learn mutual information; each decoder consists of multiple identical layers, each layer including three sub-layers: a multi-head self-attention layer, a multi-head interactive attention layer, and a feedforward network layer.
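As a non-limiting illustration, one layer of each of the two interactive decoders can be sketched with NumPy. This is a simplified sketch under stated assumptions: single-head attention stands in for multi-head attention, layer normalization and masking are omitted, both decoders read the other side's input representation, and all parameter names and sizes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q_in, kv_in, Wq, Wk, Wv):
    """Scaled dot-product attention; queries from q_in, keys/values from kv_in."""
    q, k, v = q_in @ Wq, kv_in @ Wk, kv_in @ Wv
    return softmax(q @ k.T / np.sqrt(k.shape[-1])) @ v


def make_params(rng, d):
    p = {n: rng.normal(size=(d, d)) * 0.1
         for n in ("sq", "sk", "sv", "cq", "ck", "cv")}
    p["f1"] = rng.normal(size=(d, 2 * d)) * 0.1
    p["f2"] = rng.normal(size=(2 * d, d)) * 0.1
    return p


def decoder_layer(x, other, p):
    """Three sub-layers with residual connections."""
    x = x + attention(x, x, p["sq"], p["sk"], p["sv"])      # self-attention
    x = x + attention(x, other, p["cq"], p["ck"], p["cv"])  # interactive attention
    return x + np.maximum(x @ p["f1"], 0.0) @ p["f2"]       # feedforward network


def interformer_layer(drug, protein, p_drug, p_prot):
    """The drug decoder attends to the protein and vice versa."""
    return decoder_layer(drug, protein, p_drug), decoder_layer(protein, drug, p_prot)


rng = np.random.default_rng(2)
d = 8
drug = rng.normal(size=(5, d))     # 5 atom representations
protein = rng.normal(size=(7, d))  # 7 protein positions
drug_out, prot_out = interformer_layer(
    drug, protein, make_params(rng, d), make_params(rng, d)
)
print(drug_out.shape, prot_out.shape)
```

The interactive attention sub-layer is what lets each side condition its representation on the other, so the information flow is bidirectional.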

[0190] After L iterations, a final iteration is performed to obtain the feature vectors $D_{final}$ and $P_{final}$ of the drug and the target.

[0191]

[0192] $h(v) = \mathrm{CF}(m(v), h^L(v), X_v)$

[0193] $h_{final}(v), P_{final} = \mathrm{Interformer}(h(v), P^L)$

[0194]

[0195] where Readout denotes the readout layer;

[0196] Step 5: Drug-target interaction prediction module;

[0197] The prediction results are calculated using a multilayer perceptron model:

[0198] $Y = \mathrm{sigmoid}(W_2\,\mathrm{ReLU}(W_1[P_{final}, D_{final}]))$

[0199] where $P_{final}$ is the target feature vector, $D_{final}$ is the drug feature vector, and $W_1$ and $W_2$ are learnable weight matrices.
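As a non-limiting illustration, the prediction step can be sketched with NumPy. The sketch follows the formula above, reading the brackets as concatenation of the two feature vectors; the vector sizes, the hidden width, and the random weights are assumptions, and `predict` is a hypothetical name.

```python
import numpy as np


def predict(p_final, d_final, W1, W2):
    """Y = sigmoid(W2 ReLU(W1 [P_final, D_final]))."""
    z = np.concatenate([p_final, d_final])  # concatenated target and drug vectors
    hidden = np.maximum(W1 @ z, 0.0)        # ReLU
    return 1.0 / (1.0 + np.exp(-(W2 @ hidden)))  # sigmoid


rng = np.random.default_rng(3)
p_final = rng.normal(size=16)  # target feature vector (assumed size)
d_final = rng.normal(size=16)  # drug feature vector (assumed size)
W1 = rng.normal(size=(32, 32)) * 0.1
W2 = rng.normal(size=(1, 32)) * 0.1

y = predict(p_final, d_final, W1, W2)
print(y.shape)
```

The sigmoid maps the output to the interval (0, 1), so Y can be read as the predicted probability that the drug and the target interact.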

[0200] The above description is merely a specific embodiment of the present invention. Any feature disclosed in this specification may be replaced by other equivalent or similar features unless otherwise specified. All disclosed features, or steps in all methods or processes, may be combined in any way except for mutually exclusive features and / or steps.

Claims

1. A drug-target prediction method based on structure-aware message passing, characterized in that it comprises the following steps:
Step 1: Construct the model inputs, including a target input and a drug input. The target input is represented as a set of six matrices: the protein amino acid coding matrix, the protein distance map, the protein evolution matrix, the matrix of differences in the physicochemical properties of the protein, the matrix of three-dimensional structural features of the protein at the statistical potential level, and the matrix of three-dimensional structural features of protein contact energy. The drug input includes a SMILES sequence and a drug molecule graph. The SMILES sequence of the drug is represented as a sequence of characters, each indexed by its position in the sequence, up to the length of the SMILES sequence; the drug molecule graph is represented as G = (V, E), where V represents the set of nodes and E represents the set of edges;
Step 2: Construct a target feature preprocessing module, including a multi-channel feature extraction module and a feature selection module; the multi-channel feature extraction module is used to extract the protein attention feature matrix of each channel in the target input, and the feature selection module is used to calculate the attention score of the protein feature matrix of each channel in the target input and to obtain the target feature matrix P by weighted summation;
Step 3: Construct a drug feature preprocessing module, including a structural feature extraction module based on the subsequence generation tree, a drug molecule graph atomic feature initialization module, and a drug molecule graph edge feature initialization module; the structural feature extraction module based on the subsequence generation tree is used to extract the drug structure feature matrix S from the SMILES sequence of the drug; the drug molecule graph atomic feature initialization module encodes all nodes in the node set V of the drug molecule graph, obtaining the corresponding node feature vectors; the drug molecule graph edge feature initialization module encodes all edges in the edge set E of the drug molecule graph, obtaining the corresponding edge feature vectors;
Step 4: Construct a drug-target interaction feature extraction module, which takes as input the node features and edge features of the drug molecule graph, the drug structure feature matrix S, and the target feature matrix P, and outputs the drug feature vector $D_{final}$ and the target feature vector $P_{final}$ after L iterations of structure-aware message passing;
Step 5: Construct a drug-target interaction prediction module and use a multilayer perceptron model to calculate the prediction result Y: $Y = \mathrm{sigmoid}(W_2\,\mathrm{ReLU}(W_1[P_{final}, D_{final}]))$, where $W_1$ and $W_2$ are the weight matrices.

2. The drug-target prediction method based on structure-aware message passing as described in claim 1, characterized in that, in step 2, the multi-channel feature extraction module consists of a dynamic convolutional neural network and a self-attention block, wherein: the dynamic convolutional neural network consists of multiple ResNet blocks and an average pooling layer, and transforms the matrix of each channel in the target input into a corresponding feature matrix; the self-attention block is a two-layer perceptron model without bias, through which an attention weight matrix is obtained using two weight matrices; the protein attention feature matrix is calculated from this attention weight matrix; this yields the protein attention feature matrices of all channels, which together form a set.

3. The drug-target prediction method based on structure-aware message passing as described in claim 1, characterized in that, in step 2, within the feature selection module, a perceptron model is first applied to calculate the weight score of the protein feature matrix of each channel, using a weight matrix, a bias matrix, and a shared attention vector; the softmax function is then applied to normalize the weight scores into the attention scores of the protein feature matrices of the channels; finally, the target feature matrix P is calculated as the attention-weighted sum of the protein feature matrices.

4. The drug-target prediction method based on structure-aware message passing as described in claim 1, characterized in that, in step 3, the structural feature extraction module based on the subsequence generation tree consists of a multi-scale subsequence generation tree construction module and a feature extraction module; in the multi-scale subsequence generation tree construction module, a subsequence generation tree is constructed from the SMILES sequence Smi of the drug: each character in the SMILES sequence is modeled as a leaf node, each continuous subsequence of length k in the SMILES sequence is modeled as a substructure node, and the entire SMILES sequence is modeled as a global node; during tree construction, the leaf nodes corresponding to the characters contained in a continuous subsequence are taken as the child nodes of the substructure node of that subsequence, and if two subsequences overlap, the corresponding leaf nodes are copied; after the subsequence generation trees at k scales are generated, the multi-scale subsequence generation tree construction module performs the initialization operation; the feature extraction module iteratively updates the subsequence generation tree at each scale, uses the hidden vector of the global node in the updated subsequence generation tree as the drug feature vector, and concatenates the drug feature vectors of the subsequence generation trees at the k scales to obtain the drug structure feature matrix S.

5. The drug-target prediction method based on structure-aware message passing as described in claim 4, characterized in that the initialization operation is as follows: for leaf nodes, the node embeddings are obtained using the word embedding model GloVe; for each substructure node, a convolutional neural network model is used to learn and extract the substructure embedding; for global nodes, average pooling is performed on the substructure embeddings to obtain their initial embeddings;
the iterative update updates the representation of each node in each subsequence generation tree, and the specific process is as follows:
1) Mutual attention: a multi-head attention mechanism is applied, mapping the center node and the source node to the query vector Q and the key vector K, respectively; for the i-th center node c and any of its source nodes s, linear mapping functions are applied to the hidden vectors of the center node c and the source node s, respectively; the attention score of each source node s with respect to the center node c is then calculated from the results of the linear mappings, where the calculation involves the dimension of the key vector and a concatenation operation over the attention heads, and kH is the number of attention heads;
2) Message extraction: for each attention head, a message vector is extracted from the source node s using a linear projection; the message vectors of the multiple attention heads are merged to obtain the message vector passed from each source node to the center node;
3) Message fusion: the hidden vector of the center node is updated from the messages of its neighboring nodes, where N(c) represents the set of neighboring nodes of the center node c; next, a nonlinear activation function is applied and the hidden vector is updated through a residual connection; finally, a feedforward fully connected neural network is applied to obtain the feature representation of the center node c;
4) Iterative update: a bottom-up strategy is adopted to update every node of each subsequence generation tree.

6. The drug-target prediction method based on structure-aware message passing as described in claim 1, characterized in that, in step 3, the atomic features of the drug molecule graph are initialized as follows: all nodes in the node set V are encoded according to the atomic features Atom type, Degree, Num of H, Valence, and Aromaticity, all using one-hot encoding; the edge features of the drug molecule graph are initialized as follows: all edges in the edge set E are encoded according to the chemical bond features Bond type, Conjugation, Ring, and Stereo, also using one-hot encoding.

7. The drug-target prediction method based on structure-aware message passing as described in claim 1, characterized in that step 4 specifically comprises: first, for each node v ∈ V in graph G, the representations $h^{k-1}(e_{u,v})$ of all incoming edges are aggregated and the drug structure feature matrix S is added at the same time, obtaining the message vector $m^k(v)$ through the message passing function MP; the current hidden state $h^{k-1}(v)$ of the node is concatenated with the message vector $m^k(v)$, and the hidden state of the node is updated to $h^k(v)$ through the communication function CF, where $h^0(v) = X_v$; the Interformer module learns the mutual information between the drug and the target and updates the representation vectors of the drug and the target, where $P^0 = P$; for each edge, the update function UF is used to update the hidden state of the edge; after L iterations, a final iteration is performed to obtain the feature vectors $D_{final}$ and $P_{final}$ of the drug and the target, where Readout denotes the readout layer.

8. The drug-target prediction method based on structure-aware message passing as described in claim 7, characterized in that, In step 4, the Interformer module is implemented by two interactive transformer decoders. Each decoder consists of multiple identical layers, and each layer includes three sub-layers: a multi-head self-attention layer, a multi-head interactive attention layer, and a feedforward network layer.