Peptide-protein interaction prediction method and system based on multi-source features and feature fusion

By employing a multi-source feature fusion method, combined with the Mamba-2 model and CNN-BiLSTM network, the problem of insufficient structural information in existing peptide-protein interaction prediction methods is solved, achieving high-precision and efficient peptide-protein interaction prediction.

CN122201407A · Pending · Publication Date: 2026-06-12 · HAINAN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HAINAN UNIV
Filing Date
2026-03-02
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing peptide-protein interaction prediction methods rely on structural information and lack the ability to fuse heterogeneous features, resulting in low prediction accuracy and high computational complexity, making it difficult to balance modeling ability and computational efficiency.

Method used

We employ a multi-source feature and feature fusion approach, which extracts residue-level hand-coded features, contact map structure features, and ESM-2 pre-trained residue-level embedding features from peptide sequences, as well as sequence-level statistical features and ESM-2 global pooling features from protein sequences. These features are then combined with the Mamba-2 model, global-local channel attention, CNN-BiLSTM network, and multi-scale sparse cross attention for feature fusion, and finally, prediction is performed using a multilayer perceptron.

Benefits of technology

It achieves high-precision and high-efficiency peptide-protein interaction prediction, enhances the ability to mine complementary relationships between peptide-protein features, improves prediction accuracy, and reduces computational complexity.


Figure CN122201407A_ABST

Abstract

The present application relates to a peptide-protein interaction prediction method and system based on multi-source features and feature fusion. The method comprises: defining peptide sequences and protein sequences; for each peptide sequence, extracting residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features; for each protein sequence, extracting sequence-level statistical features and ESM-2 global pooling features; inputting the peptide multi-source features into a peptide branch, where they are processed by a Mamba-2 model, global-local channel attention, and a CNN-BiLSTM network respectively and then fused through multi-scale sparse cross attention to output a peptide feature representation; inputting the protein features into a protein branch, where they are processed in turn by global-local channel attention and multi-scale linear attention to output a protein feature representation; and outputting the predicted probability of peptide-protein interaction through a multilayer perceptron. The method can achieve high-precision, high-efficiency, low-structure-dependence peptide-protein interaction prediction.

Description

Technical Field

[0001] This invention relates to the field of bioinformatics, and in particular to a method and system for predicting peptide-protein interactions based on multi-source features and feature fusion.

Background Technology

[0002] Peptides are biomolecules composed of several to thousands of amino acid residues, playing crucial regulatory roles in physiological processes. Due to their high safety, good human tolerability, and balanced structural flexibility and stability, they have become ideal frameworks for new drug design. Peptide-protein interactions (PepPIs) are a core mechanism of basic cellular activities and the development of complex diseases, participating in processes such as signal transduction, protein transport, programmed cell death, and gene regulation. For example, short peptides can initiate or inhibit intracellular signaling pathways by binding to cell surface receptors (especially receptor tyrosine kinases). PepPIs are closely related to a variety of diseases, including cancer, amyloidosis, and neurodegenerative diseases. Studies have shown that secreted peptides can act as signaling molecules, regulating growth, development, immune responses, and environmental adaptation through receptor signaling pathways. Genome-wide studies indicate that PepPIs account for 15-40% of all protein-protein interactions (PPIs). This proportion is even higher when many PPIs involve disordered regions, as binding sites are often formed from short peptide-like fragments. Accurate identification of PepPIs is of great significance for understanding molecular binding patterns, revealing disease regulatory mechanisms, and accelerating the discovery of antimicrobial / anticancer peptides. It is an important frontier in bioinformatics and drug development.

[0003] Existing methods for predicting peptide-protein interactions mainly include experimental and structure-based computational methods, traditional machine learning methods, and deep learning methods. Experimental and structure-based computational methods are time-consuming and costly, have high technical barriers, struggle to detect weak interactions, and have high false positive rates; in addition, they depend on three-dimensional structures, and the high flexibility of peptide-protein complexes reduces their accuracy. Among artificial intelligence-based prediction tools, some methods require the three-dimensional structure of proteins or high-precision structural models as input. However, in practical applications, many proteins lack reliable three-dimensional structural data, limiting the applicability of these methods. Furthermore, most current PepPI prediction methods neglect the effective fusion of the peptide's own multi-source heterogeneous features. When processing long protein sequences, fully connected attention-based models have high computational overhead, making it difficult to balance modeling ability and computational efficiency.

[0004] Therefore, traditional peptide-protein interaction prediction methods often suffer from low prediction accuracy due to their strong dependence on structural information, insufficient ability to fuse heterogeneous features, and the high computational complexity of modeling long protein sequences.

Summary of the Invention

[0005] In order to solve the above-mentioned technical problems, a method and system for predicting peptide-protein interactions based on multi-source features and feature fusion is provided, which can achieve high-precision, high-efficiency, and low-structure-dependent peptide-protein interaction prediction.

[0006] A peptide-protein interaction prediction method based on multi-source features and feature fusion, the method comprising:

[0007] Acquire various peptide-protein interaction data, and define each peptide-protein interaction data as a peptide sequence or a protein sequence based on the amino acid sequence length;

[0008] For the peptide sequence, residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features are extracted as peptide multi-source features; for the protein sequence, sequence-level statistical features and ESM-2 global pooling features are extracted as protein features.

[0009] The peptide multi-source features are input into the peptide branch, and after being processed by the Mamba-2 model, global-local channel attention, and CNN-BiLSTM network respectively, the multi-source features are fused through multi-scale sparse cross attention to output the peptide feature representation; the protein features are input into the protein branch and processed sequentially by global-local channel attention and multi-scale linear attention to output the protein feature representation.

[0010] The peptide feature representation and protein feature representation are concatenated and then input into a multilayer perceptron, which outputs the predicted probability of peptide-protein interaction.

[0011] In one embodiment, various peptide-protein interaction data are acquired, and each peptide-protein interaction data is defined as a peptide sequence or a protein sequence based on its amino acid sequence length, including:

[0012] Peptide-protein interaction data are selected from the STRING database according to a confidence index.

[0013] Determine the amino acid sequence length corresponding to each of the peptide-protein interaction data;

[0014] Sequences with an amino acid sequence length greater than a length threshold are defined as protein sequences, and sequences with an amino acid sequence length less than or equal to the length threshold are defined as peptide sequences.

[0015] In one embodiment, for the peptide sequence, residue-level hand-coded features, contact map structure features, and ESM-2 pre-trained residue-level embedding features are extracted as peptide multi-source features, including:

[0016] For each amino acid residue in the peptide sequence, binary encoding, BLOSUM62 matrix encoding, AAIndex physicochemical property index encoding, and PC6 six-dimensional physicochemical descriptor encoding are performed to obtain each encoding vector;

[0017] The individual encoded vectors are concatenated along the channel dimension to obtain residue-level hand-coded features;

[0018] A contact map matrix is constructed based on the residue length of the peptide sequence. The elements of the contact map matrix are assigned values according to the spatial interaction rules of amino acid residues in the peptide sequence to characterize the spatial proximity relationship between residues in the peptide sequence. The contact map matrix is then converted into a residue-level vector to generate contact map structural features.

[0019] The peptide sequence is input into a pre-trained ESM-2 protein language model, and the hidden state vector corresponding to each residue position is extracted from the output of the last layer of the ESM-2 protein language model to generate residue-level embedding features.

[0020] In one embodiment, sequence-level statistical features and ESM-2 global pooling features are extracted from the protein sequence as protein features, including:

[0021] The amino acid composition features, dipeptide composition features, and amino acid entropy features of the protein sequence are calculated respectively.

[0022] The amino acid composition features, dipeptide composition features, and amino acid entropy features are concatenated along the feature dimension to obtain sequence-level statistical features;

[0023] The protein sequence is fed into a pre-trained ESM-2 protein language model, and the hidden state vector corresponding to each residue position is extracted from the output of the last layer of the ESM-2 protein language model to generate protein residue-level embedding features.

[0024] A global average pooling operation is performed on the protein residue-level embedding features to obtain global pooled features.

[0025] In one embodiment, the multi-source features of the peptide are input into the peptide branch and processed by the Mamba-2 model, global-local channel attention, and CNN-BiLSTM network, respectively, including:

[0026] The residue-level hand-coded features and contact map structure features are linearly projected onto the dimensional space of the ESM-2 pre-trained residue-level embedding features to obtain the peptide multi-source feature tensor.

[0027] The peptide multi-source feature tensor is input into the Mamba-2 model, and long-range dependency modeling is performed on the peptide multi-source feature tensor based on the structured state-space duality algorithm to output the first peptide feature representation.

[0028] The peptide multi-source feature tensor is input into the global-local channel attention. Local channel dependencies are extracted through local branches, and global channel interaction information is extracted through global branches. The local channel dependencies and global channel interaction information are fused and superimposed with residual connections to output the second peptide feature representation.

[0029] The peptide multi-source feature tensor is input into a CNN-BiLSTM network. The local interaction features of the peptide sequence are captured by a multi-scale convolutional network. The sequence context information is modeled by a BiLSTM network. The local interaction features and sequence context information are concatenated and linearly projected to output the third peptide feature representation.

[0030] In one embodiment, multi-source features are fused through multi-scale sparse cross-attention to output peptide feature representations, including:

[0031] Input the first peptide feature representation, the second peptide feature representation, and the third peptide feature representation into the multi-scale sparse cross attention;

[0032] The multi-scale sparse cross-attention is used to perform multi-window average pooling to mine multi-scale information for each peptide feature representation. Then, the peptide feature representation is generated by weighted fusion of sparse attention weights and learnable parameters.
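The multi-window average pooling step can be illustrated with a minimal sketch. It covers only the pooling, not the sparse attention weights or the learnable fusion parameters. The window sizes (1, 3, 5), the truncated windows at the sequence edges, and the function names are assumptions made for illustration and are not specified in the text.

```python
def window_average(sequence, window):
    """Same-length moving average over a list of feature vectors (odd window)."""
    half = window // 2
    pooled = []
    for i in range(len(sequence)):
        # Windows are truncated at the sequence boundaries (an assumption).
        lo, hi = max(0, i - half), min(len(sequence), i + half + 1)
        span = sequence[lo:hi]
        pooled.append([sum(col) / len(span) for col in zip(*span)])
    return pooled


def multi_window_pool(sequence, windows=(1, 3, 5)):
    """Return one pooled copy of the sequence per window size."""
    return {w: window_average(sequence, w) for w in windows}


# Toy peptide representation: 4 residues, 1 channel each.
seq = [[1.0], [2.0], [3.0], [4.0]]
scales = multi_window_pool(seq)
```

Each entry of `scales` has the same length as the input, so the per-scale views can be weighted and summed position by position in a later fusion step.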

[0033] In one embodiment, the protein features are input into a protein branch, and processed sequentially through global-local channel attention and multi-scale linear attention to output a protein feature representation, including:

[0034] The sequence-level statistical features and ESM-2 global pooling features are linearly projected to the same feature dimension to obtain the projected protein features.

[0035] The projected protein features are input into global-local channel attention. Local channel dependencies of the sequence-level statistical features are extracted through local branches, and global channel interaction information of the ESM-2 global pooling features is extracted through global branches. The residual connections are then stacked to output the first protein feature representation.

[0036] The first protein feature representation is input into multi-scale linear attention. Pseudo-sequences are constructed through feature segmentation. The ReLU linear attention mechanism is applied to perform global dependency modeling. Multi-scale feature relationships are obtained by combining scale-aware position bias, and the final protein feature representation is output.
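A minimal sketch of ReLU linear attention may help show why its cost grows linearly with the number of tokens: the key-value summary is accumulated once and reused for every query. The construction of pseudo-sequences and the scale-aware position bias are omitted, and the function names and the `eps` stabilizer are illustrative assumptions.

```python
def relu(vec):
    return [max(0.0, x) for x in vec]


def relu_linear_attention(queries, keys, values, eps=1e-6):
    """out_i = relu(q_i) . sum_j(relu(k_j)^T v_j) / (relu(q_i) . sum_j relu(k_j))."""
    d, dv = len(keys[0]), len(values[0])
    kv = [[0.0] * dv for _ in range(d)]   # sum over tokens of relu(k)^T v
    k_sum = [0.0] * d                     # sum over tokens of relu(k)
    for k, v in zip(keys, values):
        pk = relu(k)
        for a in range(d):
            k_sum[a] += pk[a]
            for b in range(dv):
                kv[a][b] += pk[a] * v[b]
    outputs = []
    for q in queries:
        pq = relu(q)
        norm = sum(pq[a] * k_sum[a] for a in range(d)) + eps
        outputs.append([sum(pq[a] * kv[a][b] for a in range(d)) / norm
                        for b in range(dv)])
    return outputs


# Two pseudo-tokens with orthogonal queries/keys: each query picks its own value.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[2.0], [4.0]]
out = relu_linear_attention(queries, keys, values)
```

Because `kv` and `k_sum` are built in a single pass over the keys, no n×n attention matrix is ever formed.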

[0037] In one embodiment, the peptide feature representation and the protein feature representation are concatenated and then input into a multilayer perceptron, outputting the predicted probability of peptide-protein interaction, including:

[0038] A global average pooling operation is performed on the peptide feature representation, and a layer normalization process is performed on the protein feature representation to align the dimensions of the peptide feature representation and the protein feature representation.

[0039] The aligned peptide feature representation and protein feature representation are concatenated along the feature dimension to generate a fused feature vector.

[0040] The fused feature vector is input into a multilayer perceptron and then subjected to nonlinear transformation through at least two fully connected layers to output the predicted probability of peptide-protein interactions.
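The concatenation and prediction steps can be sketched as a forward pass through two fully connected layers. The ReLU hidden activation, the sigmoid output, the layer sizes, and the hand-picked weights are assumptions for illustration; the text only requires at least two fully connected layers with a nonlinear transformation.

```python
import math


def linear(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def mlp_predict(fused, w1, b1, w2, b2):
    hidden = [max(0.0, h) for h in linear(fused, w1, b1)]  # ReLU (assumed)
    logit = linear(hidden, w2, b2)[0]
    return 1.0 / (1.0 + math.exp(-logit))                  # sigmoid (assumed)


# Toy, already-aligned representations (values are illustrative only).
peptide_repr = [0.5, -0.2]
protein_repr = [0.1, 0.3]
fused = peptide_repr + protein_repr        # concatenation along the feature axis

w1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
b1 = [0.0, 0.0]
w2 = [[1.0, -1.0]]
b2 = [0.0]
prob = mlp_predict(fused, w1, b1, w2, b2)
```

In a trained model the weights would be learned; here they are fixed so that the arithmetic can be followed by hand.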

[0041] A peptide-protein interaction prediction system based on multi-source features and feature fusion, the system comprising:

[0042] The data acquisition module is used to acquire various peptide-protein interaction data and define each peptide-protein interaction data as a peptide sequence or a protein sequence according to the amino acid sequence length.

[0043] The feature extraction module is used to extract residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features for the peptide sequence as peptide multi-source features; and to extract sequence-level statistical features and ESM-2 global pooling features for the protein sequence as protein features.

[0044] The feature learning module is used to input the multi-source features of the peptide into the peptide branch, process them through the Mamba-2 model, global-local channel attention, and CNN-BiLSTM network respectively, and then fuse the multi-source features through multi-scale sparse cross attention to output the peptide feature representation; the protein features are input into the protein branch, and processed through global-local channel attention and multi-scale linear attention in sequence to output the protein feature representation.

[0045] The feature fusion and prediction module is used to concatenate the peptide feature representation and protein feature representation, input them into a multilayer perceptron, and output the predicted probability of peptide-protein interaction.

[0046] The aforementioned peptide-protein interaction prediction method and system based on multi-source features and feature fusion extracts residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features from peptide sequences, and extracts sequence-level statistical features and ESM-2 global pooling features from protein sequences. Furthermore, it achieves cross-modal and cross-scale feature fusion through the peptide and protein branches, fully exploring the complementary relationships between peptide and protein features. For peptide sequences, it combines Mamba-2 and CNN-BiLSTM to achieve long-range dependency and fine-grained context modeling; for protein sequences, it uses multi-scale linear attention to capture long-range dependencies with linear complexity, making feature mining of both sequences more targeted and improving prediction accuracy. The use of a multilayer perceptron for prediction efficiently processes large-scale peptide-protein interaction data, improving prediction efficiency.

Attached Figure Description

[0047] Figure 1 is an application environment diagram of a peptide-protein interaction prediction method based on multi-source features and feature fusion in one embodiment;

[0048] Figure 2 is a flowchart illustrating a peptide-protein interaction prediction method based on multi-source features and feature fusion in one embodiment;

[0049] Figure 3 is a schematic diagram of the application model framework of a peptide-protein interaction prediction method based on multi-source features and feature fusion in one embodiment;

[0050] Figure 4 is a schematic diagram showing the comparison and analysis between the MCMSC model and state-of-the-art models;

[0051] Figure 5 is a schematic diagram of the ablation experiment analysis of each segment of the MCMSC model;

[0052] Figure 6 is a block diagram of a peptide-protein interaction prediction system based on multi-source features and feature fusion in one embodiment;

[0053] Figure 7 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0054] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0055] It is understood that the terms "first," "second," etc., used herein may be used to describe peptide feature representations, but these peptide feature representations are not limited by these terms. These terms are only used to distinguish one peptide feature representation from another. For example, without departing from the scope of this application, a first peptide feature representation may be referred to as a second peptide feature representation, and similarly, a second peptide feature representation may be referred to as a first peptide feature representation. Both the first and second peptide feature representations are peptide feature representations, but they are not the same peptide feature representation.

[0056] The peptide-protein interaction prediction method based on multi-source features and feature fusion provided in this application can be applied to the application environment shown in Figure 1. As shown in Figure 1, the application environment includes computer device 110. Computer device 110 can acquire various peptide-protein interaction data, defining each peptide-protein interaction data as a peptide sequence or a protein sequence based on the amino acid sequence length. For peptide sequences, computer device 110 can extract residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features as peptide multi-source features; for protein sequences, it can extract sequence-level statistical features and ESM-2 global pooling features as protein features. Computer device 110 can input the peptide multi-source features into the peptide branch, process them through a Mamba-2 model, global-local channel attention, and a CNN-BiLSTM network respectively, and then fuse the multi-source features through multi-scale sparse cross-attention to output the peptide feature representation. It can also input the protein features into the protein branch and process them sequentially through global-local channel attention and multi-scale linear attention to output the protein feature representation. Finally, computer device 110 can concatenate the peptide feature representation and the protein feature representation and input them into a multilayer perceptron to output the predicted probability of peptide-protein interactions. Computer device 110 may include, but is not limited to, various personal computers, laptops, robots, tablets, and other devices.

[0057] In one embodiment, as shown in Figure 2, a peptide-protein interaction prediction method based on multi-source features and feature fusion is provided, including the following steps:

[0058] Step 202: Acquire various peptide-protein interaction data and define each peptide-protein interaction data as a peptide sequence or a protein sequence based on the amino acid sequence length.

[0059] Computer devices can retrieve peptide-protein association information from databases, derived from experimental verification and functional inference. This information includes both direct physical interactions and indirect functional relationships, covering multiple species. The acquired peptide-protein interaction data enables models to learn the interaction characteristics between peptide sequences of different lengths and proteins of different sizes, thereby improving the applicability and stability of the models in practical applications.

[0060] In one embodiment, a peptide-protein interaction prediction method based on multi-source features and feature fusion may further include a data acquisition and processing process, specifically including: selecting each peptide-protein interaction data from the STRING database according to a confidence index; determining the amino acid sequence length corresponding to each peptide-protein interaction data; defining sequences with amino acid sequence lengths greater than a length threshold as protein sequences, and defining sequences with amino acid sequence lengths not greater than the length threshold as peptide sequences.

[0061] The STRING database is a publicly available bioinformatics resource. Computer equipment can first screen plant-derived peptide-protein interaction data from the STRING database, using the "combined_score" provided by the STRING database as the interaction confidence index. To ensure data quality, only peptide-protein interactions with a combined_score greater than 0.95 are retained as positive samples to reduce potential errors and minimize the introduction of false positives.

[0062] Next, samples can be partitioned based on amino acid sequence length. Specifically, sequences of no more than 100 residues are defined as peptide sequences, and sequences of more than 100 residues are defined as protein sequences. To avoid overfitting during model training due to excessive sequence homology, the CD-HIT tool is used to remove redundancy from peptide and protein sequences. The sequence identity threshold for redundancy removal is set to 80% so that highly similar sequences are deleted. When constructing negative samples, pairs are randomly drawn from peptide-protein combinations without interaction annotations in the STRING database, with the number of negative samples matching the number of positive samples, thus constructing a dataset with a balanced class distribution.
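The length-based partition and the balanced negative sampling can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the toy identifiers, and the fixed random seed are hypothetical, and CD-HIT redundancy removal is assumed to have been run beforehand as a separate step.

```python
import random

LENGTH_THRESHOLD = 100  # residues; value taken from the embodiment


def classify_sequence(seq, threshold=LENGTH_THRESHOLD):
    """Return 'peptide' if the sequence has at most `threshold` residues."""
    return "peptide" if len(seq) <= threshold else "protein"


def sample_negatives(peptides, proteins, positives, seed=0):
    """Draw as many unannotated (peptide, protein) pairs as there are positives."""
    positive_set = set(positives)
    candidates = [(p, q) for p in peptides for q in proteins
                  if (p, q) not in positive_set]
    if len(candidates) < len(positive_set):
        raise ValueError("not enough unannotated pairs to balance the classes")
    return random.Random(seed).sample(candidates, len(positive_set))


# Toy identifiers standing in for real sequence IDs.
peptides = ["pepA", "pepB", "pepC"]
proteins = ["protX", "protY"]
positives = [("pepA", "protX"), ("pepB", "protY")]
negatives = sample_negatives(peptides, proteins, positives)
```

Pairs without an annotation are treated as negatives here, as in the text; such pairs are unverified rather than proven non-interacting.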

[0063] For example, in this embodiment, two model plants with well-annotated genomes and mature research foundations, Arabidopsis thaliana and Solanum lycopersicum, can be selected as research subjects to construct species-specific datasets. Specifically, a first dataset, BD1, is constructed for Arabidopsis thaliana, containing 3966 positive interaction samples involving 243 peptide sequences and 1039 protein sequences; a second dataset, BD2, is constructed for Solanum lycopersicum, containing 6196 positive interaction samples involving 294 peptide sequences and 1263 protein sequences.

[0064] In the subsequent model training and evaluation process, the BD1 and BD2 datasets are each split separately. 80% of the data is randomly assigned to training and validation for model training and five-fold cross-validation, while the remaining 20% is held out as an independent test set to evaluate the generalization performance of the model.
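One way to realize the 80/20 split with five-fold cross-validation is sketched below. The function name, the seed, and the strided fold assignment are illustrative assumptions; the text does not say whether the split is stratified by class or grouped by sequence.

```python
import random


def split_and_folds(n_samples, test_fraction=0.2, n_folds=5, seed=0):
    """Shuffle indices, hold out a test set, and cut the rest into folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    test_idx, dev_idx = indices[:n_test], indices[n_test:]
    # Fold k takes every n_folds-th development index, starting at offset k.
    folds = [dev_idx[k::n_folds] for k in range(n_folds)]
    return test_idx, folds


test_idx, folds = split_and_folds(100)

# In round k, fold k is the validation set and the other folds form the training set.
train_round_0 = [i for k, fold in enumerate(folds) if k != 0 for i in fold]
```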

[0065] Step 204: For peptide sequences, extract residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features as peptide multi-source features; for protein sequences, extract sequence-level statistical features and ESM-2 global pooling features as protein features.

[0066] Since peptide sequences and protein sequences have different characteristics, differential feature extraction methods can be used to capture both local and global information, thereby improving the accuracy and robustness of peptide-protein interaction prediction.

[0067] For peptide sequences (short sequences), residue-level encoding can be used to featurize each amino acid residue, including binary encoding (BE), the BLOSUM62 matrix, selected AAIndex physicochemical properties, and the PC6 six-dimensional physicochemical descriptor. As prior structural information, a contact map is introduced to encode the spatial proximity relationships of residues in the peptide and to capture local interaction context. A pre-trained ESM-2 model is used to extract residue-level embedding features, generating an L×1280-dimensional feature matrix (L is the peptide length).

[0068] Specifically, in one embodiment, a peptide-protein interaction prediction method based on multi-source features and feature fusion may further include a feature extraction process for the peptide sequence. This process includes: encoding each amino acid residue in the peptide sequence using binary encoding, BLOSUM62 matrix encoding, AAIndex physicochemical property index encoding, and PC6 six-dimensional physicochemical descriptor encoding to obtain the respective encoding vectors; concatenating these encoding vectors along the channel dimension to obtain residue-level hand-coded features; constructing a contact map matrix based on the residue length of the peptide sequence, and assigning values to the elements of the contact map matrix according to the spatial interaction rules of amino acid residues in the peptide sequence to characterize the spatial proximity relationship between residues in the peptide sequence; converting the contact map matrix into residue-level vectors to generate contact map structural features; and inputting the peptide sequence into a pre-trained ESM-2 protein language model and extracting the hidden state vector corresponding to each residue position from the output of the last layer of the ESM-2 protein language model to generate residue-level embedding features.

[0069] Residue-level hand-coded features capture the physicochemical properties, evolutionary information, and compositional information of each amino acid residue by encoding it numerically; the contact map is a two-dimensional binary matrix used to encode the spatial proximity relationship between residues in a peptide, providing prior structural information; and the ESM-2 pre-trained residue-level embedding features use the large-scale protein language model ESM-2 to perform end-to-end feature extraction on peptide sequences, yielding high-dimensional semantic features that capture sequence semantics, evolutionary conservation, and potential structural features.

[0070] Computer equipment can convert each amino acid in a peptide sequence into a corresponding standard letter code, thereby performing binary encoding, BLOSUM62 matrix encoding, AAIndex physicochemical property index encoding, and PC6 six-dimensional physicochemical descriptor encoding on each amino acid residue in the peptide sequence in sequence. The four types of encoded features are then concatenated along the feature dimension to obtain residue-level hand-coded features.
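The per-residue encoding and concatenation can be illustrated with a reduced sketch that uses only two encoders: binary (one-hot) encoding and a single physicochemical property. The Kyte-Doolittle hydropathy scale stands in for the "selected AAIndex properties" as an assumption; the BLOSUM62 and PC6 encoders are omitted and would be added to `encoders` in the same way. All function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here as a single stand-in property.
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def binary_encode(residue):
    """20-dimensional one-hot vector over the standard amino acids."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def property_encode(residue):
    """One-dimensional physicochemical descriptor (hydropathy)."""
    return [HYDROPATHY[residue]]


def hand_coded_features(peptide, encoders=(binary_encode, property_encode)):
    """Concatenate every encoder's output per residue: shape L x sum(dims)."""
    features = []
    for residue in peptide:
        row = []
        for encode in encoders:
            row.extend(encode(residue))
        features.append(row)
    return features


features = hand_coded_features("ACK")  # 3 residues x 21 channels
```

Non-standard residue codes are not handled in this sketch; `property_encode` raises a `KeyError` for any letter outside the 20 standard amino acids.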

[0071] In the three-dimensional structure of a protein, if the Euclidean distance between specific atoms of two residues is less than a preset threshold, the two residues are considered to be in contact. The contact map is a symmetric matrix that records whether all residue pairs are in contact. In this embodiment, a computer device predicts the contact map, thereby obtaining the three-dimensional structure of the peptide sequence, and calculates the spatial distance between each pair of non-adjacent residues based on the three-dimensional structure to obtain the contact map structural features.
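The distance-threshold definition of a contact can be sketched directly. The 8 Å cutoff, the use of one representative 3D point per residue, and the minimum sequence separation of 2 (to skip adjacent residues) are assumptions for illustration; the text only states that a preset threshold is used and that non-adjacent pairs are considered.

```python
import math


def contact_map(coords, threshold=8.0, min_separation=2):
    """Binary, symmetric L x L contact matrix from one 3D point per residue."""
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        # Only residue pairs at least `min_separation` apart in sequence are tested.
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Toy coordinates (in angstroms) for a 4-residue peptide laid out on a line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = contact_map(coords)

# Each row is one residue-level vector; its sum counts that residue's contacts.
contacts_per_residue = [sum(row) for row in cmap]
```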

[0072] Computer equipment can convert peptide sequences into the standard input format of the ESM-2 pre-trained model. After the sequence is input into the ESM-2 pre-trained model, feature learning is performed through the Transformer encoder layers of the model. The high-dimensional embedding vector corresponding to each residue of the peptide sequence is then extracted from the hidden-layer output of the last (or a specified) layer of the model, thereby obtaining the ESM-2 pre-trained residue-level embedding features.

[0073] For protein sequences (long sequences), computer devices can employ sequence-level statistical features, including amino acid composition (AAC), dipeptide composition (DPC), and amino acid entropy (AAE), to summarize the global evolution and compositional information of the protein. The ESM-2 model is used to extract sequence-level embeddings, and global average pooling is used to obtain a fixed 1×1280-dimensional vector representing the entire protein sequence.

[0074] In one embodiment, a peptide-protein interaction prediction method based on multi-source features and feature fusion may further include a process of extracting protein features. Specifically, this process includes: calculating amino acid composition features, dipeptide composition features, and amino acid entropy features for the protein sequence; concatenating the amino acid composition features, dipeptide composition features, and amino acid entropy features along the feature dimension to obtain sequence-level statistical features; feeding the protein sequence into a pre-trained ESM-2 protein language model, extracting the hidden state vector corresponding to each residue position from the output of the last layer of the ESM-2 protein language model to generate protein residue-level embedding features; and performing a global average pooling operation on the protein residue-level embedding features to obtain global pooled features.

[0075] Computer equipment can convert each protein sequence into two complementary global feature representations, providing input for subsequent protein branch feature learning. Unlike peptide multi-source features, protein features use sequence-level global representations, ensuring both computational efficiency and capturing the overall characteristics of the protein.

[0076] Among them, sequence-level statistical features are obtained by performing global statistical analysis on the entire protein sequence to capture its amino acid composition preferences, sequence patterns, and evolutionary information; global pooling features are extracted, which can not only retain the deep semantic information of ESM-2, but also fix the feature dimension to 1×1280, which is convenient for subsequent processing.

[0077] Computer equipment can sequentially calculate the amino acid composition (AAC) features, dipeptide composition (DPC) features, and amino acid entropy (AAE) features of protein sequences. After concatenating the AAC, DPC, and AAE features according to their dimensions, they are standardized to generate sequence-level statistical features with fixed dimensions. The AAC feature is a 20-dimensional amino acid frequency vector, and the DPC feature is a 400-dimensional dipeptide frequency vector.
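The sequence-level statistics described above can be sketched in plain Python as follows. The AAC and DPC definitions follow the text (20-dimensional and 400-dimensional frequency vectors). The amino acid entropy is not defined in detail in this document, so the sketch assumes it is the Shannon entropy of the amino acid composition, giving a single scalar; the standardization step is omitted because it depends on dataset statistics. All function names are hypothetical.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aac(seq):
    """20-dimensional amino acid frequency vector."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """400-dimensional dipeptide frequency vector over overlapping residue pairs."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are ignored
            counts[pair] += 1
    total = max(len(seq) - 1, 1)
    return [counts[d] / total for d in DIPEPTIDES]


def aae(seq):
    """Assumed definition: Shannon entropy (in bits) of the amino acid composition."""
    return -sum(p * math.log2(p) for p in aac(seq) if p > 0)


def sequence_statistics(seq):
    """Concatenate AAC, DPC, and AAE along the feature dimension (20 + 400 + 1)."""
    return aac(seq) + dpc(seq) + [aae(seq)]


features = sequence_statistics("ACACGG")
```

Under these assumptions the concatenated vector has 421 entries regardless of sequence length, which is what makes it usable as a fixed-dimension protein feature.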

[0078] Computer equipment can convert protein sequences into the input format required by the ESM-2 pre-trained model. After inputting the sequence into the ESM-2 pre-trained model, residue-level embedding features are extracted from the output of the last hidden layer, and global average pooling is performed to obtain global pooling features. Finally, the global pooling features and sequence-level statistical features are used as the final protein features.

[0079] In this embodiment, by combining the residue-level local information of short sequences with the global statistical information of long sequences, and supplementing it with protein language model features, efficient prediction of peptide-protein interactions is achieved.

[0080] Step 206: Input the peptide multi-source features into the peptide branch, process them through the Mamba-2 model, global-local channel attention, and CNN-BiLSTM network respectively, and then fuse the multi-source features through multi-scale sparse cross attention to output the peptide feature representation; input the protein features into the protein branch, process them through global-local channel attention and multi-scale linear attention in sequence to output the protein feature representation.

[0081] In this embodiment, a dual-branch architecture can be adopted, and a differentiated feature learning strategy can be designed based on the characteristics of peptide and protein sequences to achieve efficient and robust peptide-protein interaction prediction.

[0082] For the peptide branch, the inputs are the peptide multi-source features: residue-level hand-coded features (BE, BLOSUM62, AAindex, PC6), contact map structure features, and ESM-2 pre-trained residue-level embedding features. Mamba-2 is used to model long-range peptide dependencies, improving the efficiency of sequence information expression; GLCA extracts local and global channel features to supplement multi-scale information of short peptide sequences; CNN-BiLSTM captures fine-grained contextual patterns, enhancing the local interactive expression ability of short sequences; and multi-scale sparse cross-attention (MSC) fuses the peptide multi-source features to suppress redundant information and achieve a unified peptide representation.

[0083] Specifically, in one embodiment, a peptide-protein interaction prediction method based on multi-source features and feature fusion may further include a process of processing by the peptide branch. This process includes: linearly projecting residue-level hand-coded features and contact map structural features onto the dimensional space of ESM-2 pre-trained residue-level embedding features to obtain a peptide multi-source feature tensor; inputting the peptide multi-source feature tensor into a Mamba-2 model, performing long-range dependency modeling on the peptide multi-source feature tensor based on the structured state-space duality algorithm, and outputting a first peptide feature representation; inputting the peptide multi-source feature tensor into a global-local channel attention model, extracting local channel dependencies through the local branch, extracting global channel interaction information through the global branch, fusing the local channel dependencies and global channel interaction information, and adding a residual connection to output a second peptide feature representation; and inputting the peptide multi-source feature tensor into a CNN-BiLSTM network, capturing local interaction features of the peptide sequence through a multi-scale convolutional network, modeling sequence context information through a BiLSTM network, and concatenating and linearly projecting the local interaction features and sequence context information to output a third peptide feature representation.

[0084] Computer equipment can perform standardization and dimension alignment operations on peptide multi-source features. The residue-level hand-coded features and contact map structural features are linearly projected so that they are uniformly mapped to a feature dimension space that matches the ESM-2 pre-trained residue-level embedding features, eliminating dimensional differences. The aligned features are then concatenated along the channel dimension to form a peptide multi-source feature tensor.

[0085] Next, the computer device can input the peptide multi-source feature tensor into the Mamba-2 model. The Mamba-2 model is used only in the peptide branch and applies the Structured State-Space Duality (SSD) algorithm and a State-Space Model (SSM) to model long-range dependencies of peptide sequences with linear computational complexity. The linear time-step update mechanism of the Mamba-2 model models long-range associations between residues in the peptide sequence, capturing the spatial and sequence dependencies of distant residues within the peptide. Efficient tensor operations, data-parallel execution, and tensor rearrangement optimization are employed to improve the speed and accuracy of peptide feature extraction, and a first peptide feature representation incorporating long-range dependency information is output.
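To illustrate the idea behind a linear time-step update and its dual matrix form, the following NumPy sketch uses a single-channel state-space recurrence with a scalar decay per step. It is not the Mamba-2 implementation: the shapes, the names `ssm_recurrent` and `ssm_dual`, and the random toy inputs are assumptions for illustration. The sketch only shows that the sequential scan and a masked, attention-like matrix product compute the same output.

```python
import numpy as np


def ssm_recurrent(x, a, B, C):
    """Linear-time scan: h_t = a_t * h_{t-1} + B_t * x_t,  y_t = C_t . h_t."""
    T, N = B.shape
    h = np.zeros(N)
    y = np.zeros(T)
    for t in range(T):
        h = a[t] * h + B[t] * x[t]
        y[t] = C[t] @ h
    return y


def ssm_dual(x, a, B, C):
    """Dual form: y = (L * (C B^T)) x with L[i, j] = a_{j+1} * ... * a_i for i >= j."""
    log_a = np.cumsum(np.log(a))
    decay = np.exp(log_a[:, None] - log_a[None, :])
    L = np.tril(decay)  # causal mask: no contribution from future positions
    return (L * (C @ B.T)) @ x


rng = np.random.default_rng(0)
T, N = 16, 4                       # sequence length and state size (toy values)
x = rng.normal(size=T)             # one input channel
a = rng.uniform(0.5, 1.0, size=T)  # positive per-step decay factors
B = rng.normal(size=(T, N))
C = rng.normal(size=(T, N))

y_scan = ssm_recurrent(x, a, B, C)
y_dual = ssm_dual(x, a, B, C)
```

The scan costs time linear in the sequence length, while the dual form trades that for a single batched matrix product, which is the kind of trade-off the paragraph above refers to.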

[0086] Global-local channel attention is a dual-branch channel attention mechanism that can simultaneously capture local fine-grained patterns and global channel dependencies. The local branch captures key peptide residue-level features through convolution; the global branch encodes overall channel interactions through non-local attention, enhancing multi-level feature expression; finally, rich feature representations are output through channel-level concatenation, while retaining residuals to ensure gradient stability.

[0087] Global-local channel attention can be applied to the peptide branch and the protein branch, respectively. The features of the input local channel are denoted $X_1$, a local channel feature with channel dimension $C/2$. The calculation formula can be expressed as: $U_1 = \mathrm{GAP}(X_1)$; here $\mathrm{GAP}(\cdot)$ denotes global average pooling, used to condense the channel information of $X_1$. A one-dimensional convolutional layer with a kernel size of 3 is used to capture local channel dependencies, and the result of the local stage is output through a Sigmoid activation function.

[0088] The formula for calculating the local attention weights of each channel can be expressed as: $A_l = \sigma(\mathrm{Conv1D}_{k=3}(U_1))$; here $\sigma$ represents the Sigmoid activation function. Multiplying the channels by these weights highlights the interactions between key channels and enhances the expression of local features. The input $X_1$ and $A_l$ are multiplied element-wise, and a residual connection is added to preserve the original information, resulting in the final local attention feature map: $F_l = A_l \odot X_1 + X_1$. The residual branch retains important information from the original features, keeps the gradient stable during backpropagation, and prevents feature quality degradation.

[0089] The second branch, the global channel branch, uses a non-local attention mechanism to capture global channel dependencies. The features of the input global channel are denoted $X_2$. First, global average pooling is performed on $X_2$ to obtain the channel-level feature vector: $U_2 = \mathrm{GAP}(X_2)$. The global channel feature vector $U_2$ is input into a one-dimensional convolutional layer with a kernel size of 3 and processed with the Sigmoid activation function to obtain the query vector $Q$ and the key vector $K$, where $Q, K \in \mathbb{R}^{C/2}$. The outer product of $Q$ and $K$ yields the global attention map $M$, which is used to capture global interaction relationships; the calculation formula can be expressed as: $M = Q K^{\top}$. The global attention map is normalized along the channel dimension using Softmax to highlight key global channel dependencies; matrix multiplication is then used to weight the value tensor $V$, giving a feature representation incorporating global information: $F_g = \mathrm{Softmax}(M)\,V$. The local attention feature map $F_l$ and the global attention feature map $F_g$ are concatenated along the channel dimension, and a residual connection with the input $X$ is added to preserve the initial input information, forming the final output of the attention module: $Y = \mathrm{Concat}(F_l, F_g) + X$.

[0090] In other words, Global-Local Channel Attention (GLCA) captures both local fine-grained channel patterns and global channel dependencies of peptide features through a dual-branch channel attention mechanism, supplementing multi-scale information while preserving residual connections to ensure gradient stability. The specific processing steps for global-local channel attention can be as follows. The input features are split into two sub-tensors along the channel dimension and input into the local and global branches of GLCA, respectively. In the local branch, global average pooling (GAP) is performed on one of the sub-tensors to condense channel information, a one-dimensional convolution (Conv1D) with a kernel size of 3 is used to capture residue-level local channel dependencies, and local attention weights are generated by Sigmoid activation. These weights are multiplied element-wise with the sub-tensor, and a residual connection is added to output a local attention feature map. In the global branch, global average pooling is performed on the other sub-tensor to obtain a channel-level feature vector. This vector is passed through Conv1D (k=3) and Sigmoid activation to generate the query and key vectors, respectively. Their outer product is calculated and normalized using Softmax to obtain a global attention map, and the value tensor is weighted using this attention map to output a global attention feature map. Finally, the local attention feature map and the global attention feature map are concatenated along the channel dimension, and a residual connection with the initial input feature tensor of GLCA is added to output an enhanced second peptide feature representation, which integrates local residue details and global channel association information.
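The GLCA steps above can be sketched in NumPy as follows. This is an illustrative reading of the description, not the claimed implementation: the input layout (channels × positions), the use of separate convolution kernels for the query and key, and the choice of the global sub-tensor itself as the value tensor are assumptions, and all names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def conv1d_k3(v, kernel):
    """Zero-padded 1D convolution (kernel size 3) sliding over the channel axis."""
    padded = np.pad(v, 1)
    return np.array([padded[i:i + 3] @ kernel for i in range(len(v))])


def glca(x, k_local, k_query, k_key):
    """x: (C, L) feature map with an even number of channels C."""
    c = x.shape[0]
    x1, x2 = x[: c // 2], x[c // 2:]          # split along the channel dimension

    # Local branch: GAP -> Conv1D(k=3) -> Sigmoid -> reweight, plus residual.
    weights = sigmoid(conv1d_k3(x1.mean(axis=1), k_local))
    f_local = weights[:, None] * x1 + x1

    # Global branch: GAP -> query/key -> outer product -> Softmax -> weight the values.
    pooled = x2.mean(axis=1)
    q = sigmoid(conv1d_k3(pooled, k_query))
    k = sigmoid(conv1d_k3(pooled, k_key))
    attn = softmax(np.outer(q, k), axis=-1)   # (C/2, C/2) channel attention map
    f_global = attn @ x2                      # assumption: the value tensor is x2

    # Concatenate both branches along channels and add the input as a residual.
    return np.concatenate([f_local, f_global], axis=0) + x


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))
out = glca(x, rng.normal(size=3), rng.normal(size=3), rng.normal(size=3))
```

Because each branch keeps C/2 channels, the concatenated result has the same shape as the input, which is what allows the residual addition at the end.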

[0091] CNN-BiLSTM combines the local feature extraction capabilities of Convolutional Neural Networks (CNNs) with the sequence context modeling capabilities of Bidirectional Long Short-Term Memory Networks (BiLSTMs) to capture fine-grained local interactions and bidirectional contextual patterns in peptide sequences. The specific processing steps using CNN-BiLSTM can include: convolving the input peptide multi-source feature tensor with a one-dimensional CNN layer using multi-scale convolutional kernels (kernel sizes 3 and 5) to extract local interaction features of adjacent residues in the peptide sequence; outputting a local contextual feature tensor after ReLU activation and pooling operations; inputting the feature tensor output by the CNN into the BiLSTM network to model the contextual information of the peptide sequence along both the forward and backward directions, capturing the bidirectional dependencies of residues throughout the sequence; and concatenating the forward and backward hidden states of the BiLSTM at each sequence position along the feature dimension, projecting them through a linear layer, and outputting a third peptide feature representation that integrates fine-grained local interactions and bidirectional context.
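A compact NumPy sketch of the CNN-BiLSTM path is given below, assuming kernel sizes 3 and 5, zero padding that keeps the sequence length, and an LSTM gate order of input, forget, cell, output. The pooling step is omitted for brevity, and the local and contextual features are concatenated before the final projection; these choices, the parameter shapes, and all names are assumptions for illustration only.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def conv1d_same(x, w):
    """x: (L, d_in); w: (k, d_in, d_out) with odd k; zero padding keeps length L."""
    k = w.shape[0]
    xp = np.pad(x, ((k // 2, k // 2), (0, 0)))
    return np.stack([np.einsum("kd,kdo->o", xp[t:t + k], w) for t in range(len(x))])


def lstm(x, wx, wh, b):
    """Single-direction LSTM; gates are packed along the last axis as i, f, g, o."""
    hdim = wh.shape[0]
    h, c, outputs = np.zeros(hdim), np.zeros(hdim), []
    for xt in x:
        z = xt @ wx + h @ wh + b
        i, f, o = sigmoid(z[:hdim]), sigmoid(z[hdim:2 * hdim]), sigmoid(z[3 * hdim:])
        g = np.tanh(z[2 * hdim:3 * hdim])
        c = f * c + i * g
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)


def cnn_bilstm(x, p):
    # Multi-scale convolution with ReLU; both scales are concatenated per residue.
    local = np.concatenate([np.maximum(conv1d_same(x, p["w3"]), 0.0),
                            np.maximum(conv1d_same(x, p["w5"]), 0.0)], axis=-1)
    fwd = lstm(local, *p["fwd"])
    bwd = lstm(local[::-1], *p["bwd"])[::-1]   # run on the reversed sequence, then re-align
    context = np.concatenate([fwd, bwd], axis=-1)
    return np.concatenate([local, context], axis=-1) @ p["w_out"]


rng = np.random.default_rng(0)
L, d_in, c, H, d_out = 10, 6, 4, 5, 8          # toy sizes


def lstm_params():
    return (rng.normal(scale=0.1, size=(2 * c, 4 * H)),
            rng.normal(scale=0.1, size=(H, 4 * H)),
            np.zeros(4 * H))


params = {"w3": rng.normal(scale=0.1, size=(3, d_in, c)),
          "w5": rng.normal(scale=0.1, size=(5, d_in, c)),
          "fwd": lstm_params(), "bwd": lstm_params(),
          "w_out": rng.normal(scale=0.1, size=(2 * c + 2 * H, d_out))}
third_repr = cnn_bilstm(rng.normal(size=(L, d_in)), params)
```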

[0092] In one embodiment, a peptide-protein interaction prediction method based on multi-source features and feature fusion may further include a feature fusion process, specifically including: inputting the first peptide feature representation, the second peptide feature representation, and the third peptide feature representation into multi-scale sparse cross-attention; performing multi-window average pooling on each peptide feature representation through multi-scale sparse cross-attention to mine multi-scale information; and then generating the peptide feature representation by weighted fusion through sparse attention weight calculation and learnable parameters.

[0093] Multi-scale sparse cross-attention captures structural information of different granularities through multi-scale pooling; sparse attention suppresses redundant information through Top-k selection; and learnable parameter fusion adaptively adjusts the weights of the different sparse strategies.

[0094] Computer devices can use multi-scale sparse cross-attention to select one peptide feature representation from the first, second, and third peptide feature representations as the baseline feature, and the other two peptide feature representations as features to be fused. For each feature to be fused, average pooling with three different window sizes is performed to mine multi-scale information. Then, attention mapping and double Top-k sparse operations are used to filter key information. The attention maps are fused by learnable parameters and multiplied by the value vector to obtain the fused feature. Finally, all fused features are concatenated with the baseline feature, and after linear layer projection and layer normalization, a unified peptide feature representation is output.
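The fusion procedure above can be sketched in NumPy as follows. The window sizes (1, 2, 4), the two Top-k values, the fixed fusion weights standing in for learnable parameters, the projection shapes, and the choice of the first representation as the baseline are all assumptions for illustration; the final linear projection and layer normalization are omitted.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)


def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def avg_pool(y, window):
    """Non-overlapping average pooling along the sequence axis."""
    n = (len(y) // window) * window
    return y[:n].reshape(-1, window, y.shape[1]).mean(axis=1)


def topk_softmax(m, k):
    """Keep the k largest scores in each row, mask the rest, then normalize."""
    k = min(k, m.shape[1])
    kth = np.sort(m, axis=1)[:, -k][:, None]
    return softmax(np.where(m >= kth, m, -np.inf))


def msc(x, y, wq, wkv, alphas=(0.5, 0.5), ks=(2, 4), windows=(1, 2, 4)):
    """Fuse y into the baseline x with multi-scale pooling and double Top-k attention."""
    pooled = np.concatenate([avg_pool(y, w) for w in windows], axis=0)
    q = layer_norm(x) @ wq
    k, v = np.split(layer_norm(pooled) @ wkv, 2, axis=-1)
    m = q @ k.T / np.sqrt(q.shape[-1])
    attn = sum(a * topk_softmax(m, kk) for a, kk in zip(alphas, ks))
    return attn @ v


rng = np.random.default_rng(0)
L, d = 12, 8
f1, f2, f3 = (rng.normal(size=(L, d)) for _ in range(3))
wq = rng.normal(scale=0.1, size=(d, d))
wkv = rng.normal(scale=0.1, size=(d, 2 * d))

# f1 plays the role of the baseline; f2 and f3 are fused into it in turn.
peptide_repr = np.concatenate([f1, msc(f1, f2, wq, wkv), msc(f1, f3, wq, wkv)], axis=-1)
```

With tied scores `topk_softmax` may keep more than k entries in a row; a production implementation would typically break ties by index.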

[0095] While the peptide branch processes the peptide multi-source features, the protein branch processes the protein features. For the protein branch, the input consists of sequence-level statistical features and ESM-2 global pooling features. Global-local channel attention (GLCA) is used to extract key channel relationships, enhancing global context and local dependency information. Multi-scale linear attention (LiteMLA) is employed to capture long-range dependencies, increasing the global receptive field while maintaining linear computational complexity, resulting in a compact protein feature representation.

[0096] In one embodiment, the peptide-protein interaction prediction method based on multi-source features and feature fusion may further include a process of processing protein features. Specifically, this process includes: linearly projecting sequence-level statistical features and ESM-2 global pooling features to the same feature dimension to obtain projected protein features; inputting the projected protein features into a global-local channel attention mechanism, extracting local channel dependencies of sequence-level statistical features through local branches, extracting global channel interaction information of ESM-2 global pooling features through global branches, and performing superimposed residual connections to output a first protein feature representation; inputting the first protein feature representation into a multi-scale linear attention mechanism, constructing pseudo-sequences through feature segmentation, applying a ReLU linear attention mechanism for global dependency modeling, and combining scale-aware positional bias to obtain multi-scale feature relationships, outputting the final protein feature representation.

[0097] Since the input sequence-level statistical features and ESM-2 global pooling features have different dimensions, they need to be linearly projected to a unified dimension to obtain the projected protein features.

[0098] The projected protein features are input into Global-Local Channel Attention (GLCA). GLCA shares the same dual-branch channel attention architecture as the peptide branch. Targeting the global nature of protein features, it focuses on capturing local fine-grained associations and global channel dependencies in the channel dimension. At the same time, it ensures gradient stability through residual connections to avoid feature information degradation.

[0099] The first protein feature representation is output through global-local channel attention. The first protein feature representation is then input into multi-scale linear attention (LiteMLA). ReLU linear attention is used to replace traditional softmax attention, thereby achieving a global receptive field and linear computational complexity.

[0100] Multi-scale linear attention can generate the query tensor Q, key tensor K, and value tensor V from the input first protein feature representation through three independent learnable linear projection layers. The ReLU similarity function is then used to calculate the similarity between the query tensor and the key tensor, reducing the computational complexity of attention from quadratic to linear. Using the associative law of matrix multiplication, the key tensor and the value tensor are multiplied first to aggregate the global key-value pairs in a single pass; the aggregation result is then multiplied with the ReLU-processed query tensor, which avoids calculating each query-key pair individually and significantly improves computational efficiency. Finally, scale-aware positional bias is incorporated to add positional weights to feature associations at different scales, capturing the multi-scale long-range dependencies of protein features. The attention-weighted features then undergo a nonlinear transformation through a lightweight feedforward path to enhance key feature information while suppressing invalid signals in low-activation regions, thereby improving feature effectiveness. Global average pooling (GAP) is performed on the feature tensor after multi-scale linear attention processing to compress the sequence dimension and restore a fixed-dimensional one-dimensional feature vector. After layer normalization and linear layer projection, a unified protein feature representation is output. The protein feature representation is a compact, high-information-density representation that integrates local and global features and long-range dependency information of the channel dimension. Since its dimension is fixed, it can be directly concatenated and fused with the peptide feature representation.

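[0101] Specifically, by combining scale-aware positional bias with a lightweight feedforward path, fine-grained sequence information is recovered while maintaining computational efficiency, and the ReLU similarity function is used instead of softmax. For input $x$, with $Q = xW_Q$, $K = xW_K$, and $V = xW_V$, the generalized form of softmax attention can be expressed as: $O_i = \sum_{j=1}^{N} \frac{\mathrm{Sim}(Q_i, K_j)}{\sum_{j=1}^{N} \mathrm{Sim}(Q_i, K_j)} V_j$, where $\mathrm{Sim}(Q, K) = \mathrm{ReLU}(Q)\,\mathrm{ReLU}(K)^{\top}$ for ReLU linear attention. Subsequently, by utilizing the associative law of matrix multiplication, the computation is rearranged into a one-time aggregation of global key-value pairs before multiplication with the query, thereby reducing the computational complexity and memory usage from quadratic to linear, while maintaining the original functionality of the attention mechanism: $O_i = \frac{\mathrm{ReLU}(Q_i)\left(\sum_{j=1}^{N} \mathrm{ReLU}(K_j)^{\top} V_j\right)}{\mathrm{ReLU}(Q_i)\left(\sum_{j=1}^{N} \mathrm{ReLU}(K_j)^{\top}\right)}$.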
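The rearrangement by the associative law can be checked numerically. The NumPy sketch below compares a quadratic ReLU attention, which builds the full N × N similarity matrix, with the linear form that aggregates the key-value pairs once. The small `eps` term in the denominators, the toy shapes, and the function names are assumptions added for illustration; scale-aware positional bias and the feedforward path are not included.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def relu_attention_quadratic(q, k, v, eps=1e-9):
    """Builds the full N x N similarity matrix: O(N^2) time and memory."""
    sim = relu(q) @ relu(k).T
    return (sim @ v) / (sim.sum(axis=1, keepdims=True) + eps)


def relu_attention_linear(q, k, v, eps=1e-9):
    """Aggregates keys and values once, then applies each query: O(N) in sequence length."""
    rq, rk = relu(q), relu(k)
    kv = rk.T @ v                # (d, d_v) global key-value summary
    norm = rq @ rk.sum(axis=0)   # (N,) normalizer for each query
    return (rq @ kv) / (norm[:, None] + eps)


rng = np.random.default_rng(0)
N, d = 32, 8
q, k, v = (rng.normal(size=(N, d)) for _ in range(3))

out_quadratic = relu_attention_quadratic(q, k, v)
out_linear = relu_attention_linear(q, k, v)
```

Both functions return the same values up to floating-point error; only the order of the matrix products, and therefore the cost, differs.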

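[0102] Multi-scale sparse cross-attention (MSC) performs multi-scale fusion of peptide multi-source features, utilizing the Top-k sparse mechanism to suppress irrelevant information and adaptively adjusting the fusion weights through learnable parameters, thereby achieving efficient modeling of key cross dependencies. MSC is applied only to peptides to fully utilize the dense information of their short sequences. This not only fuses multi-scale cues but also effectively suppresses irrelevant interference through sparse operations, resulting in more accurate feature fusion results. Given two inputs $X$ and $Y$, to extract the implicit multi-scale information from the inputs, MSC first performs average pooling on $Y$ with three different window sizes, obtaining $Y_1$, $Y_2$, and $Y_3$. A larger pooling window captures the overall structural information of the peptide sequence, while a smaller window captures finer-grained local information. The inputs are then mapped to $Q$, $K$, and $V$. The mapping formula can be expressed as: $Q = \mathrm{LN}(X)\,W_Q$; $[K, V] = \mathrm{Split}\big(\mathrm{LN}(\mathrm{Flatten}(Y_1, Y_2, Y_3))\,W_{KV}\big)$; where $\mathrm{Flatten}(\cdot)$ indicates the flattening operation, $\mathrm{Split}(\cdot)$ indicates a uniform segmentation operation along the channel dimension, $\mathrm{LN}(\cdot)$ represents the layer normalization function, and $W_Q$ and $W_{KV}$ are learnable linear projection parameters. The attention map $M$ can be represented as: $M = \frac{QK^{\top}}{\sqrt{c}}$, where $M$ represents the similarity relationship between elements in $Q$ and $K$, and $c$ is the dimension of $Q$. The adjusted attention map $\hat{M}$ can be represented as: $\hat{M} = \alpha_1\,\mathrm{Softmax}(\mathcal{T}_{k_1}(M)) + \alpha_2\,\mathrm{Softmax}(\mathcal{T}_{k_2}(M))$; where $\alpha_1$ and $\alpha_2$ are learnable parameters, and $\mathcal{T}_{k_1}$ and $\mathcal{T}_{k_2}$ are Top-k sparse operation functions that adopt $k_1$ and $k_2$, respectively, each retaining the $k$ largest entries in every row of $M$ and masking the rest. Finally, the adjusted attention map $\hat{M}$ is applied to $V$ to achieve weighted fusion: $Z = \hat{M}V$; where $Z$ is the output of the MSC, in which the two sets of input information have already been fused. Based on the MSC, the two features to be fused, $\{F_i, F_j\}$, and the baseline feature $F_b$ are fused in sequence and then simply concatenated to obtain the classification features. The specific process can be represented as follows: $Z_i = \mathrm{MSC}(F_b, F_i)$, $Z_j = \mathrm{MSC}(F_b, F_j)$; $F_{pep} = \mathrm{Concat}(F_b, Z_i, Z_j)$.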

[0103] Step 208: The peptide feature representation and protein feature representation are concatenated and then input into a multilayer perceptron, which outputs the predicted probability of peptide-protein interaction.

[0104] Finally, peptide feature representations and protein feature representations are concatenated and input into a multilayer perceptron (MLP) for prediction, outputting the probability of peptide-protein interactions. By learning the heterogeneous features of peptide and protein branches and fusing them at multiple scales, local, global, and long-range dependency information is effectively integrated, achieving high-precision prediction of peptide-protein interactions.

[0105] In one embodiment, a peptide-protein interaction prediction method based on multi-source features and feature fusion may further include a prediction process, specifically including: performing global average pooling on peptide feature representations and layer normalization on protein feature representations to align the dimensions of peptide and protein feature representations; concatenating the aligned peptide and protein feature representations along the feature dimensions to generate a fused feature vector; inputting the fused feature vector into a multilayer perceptron and performing nonlinear transformations through at least two fully connected layers to output the predicted probability of peptide-protein interactions.

[0106] The computer device can perform global pooling along the sequence dimension of the peptide feature representation and layer normalization on the protein feature representation to align the dimensions and feature value distributions of the two. Next, the aligned peptide and protein feature representations can be concatenated along the feature dimension to generate a fused feature vector. After nonlinear transformation by the ReLU activation function, a Dropout regularization operation is performed to obtain the fused feature tensor for prediction. The fused feature tensor for prediction is input into a multilayer perceptron; after linear transformation by the first fully connected layer, ReLU activation, layer normalization, and a further linear transformation, the peptide-protein interaction prediction probability is output, and whether the peptide and the protein interact can be determined based on this probability value.
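The prediction step can be sketched as a small NumPy forward pass. The sketch assumes inference mode, where Dropout acts as the identity, and a Sigmoid on the final output to obtain a probability; the layer sizes, the parameter dictionary, and the function names are hypothetical.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)


def relu(x):
    return np.maximum(x, 0.0)


def predict_interaction(peptide_repr, protein_repr, params):
    """peptide_repr: (L, d) residue-level features; protein_repr: (d,) vector."""
    pep = peptide_repr.mean(axis=0)              # global pooling over the sequence
    prot = layer_norm(protein_repr)
    fused = relu(np.concatenate([pep, prot]))    # Dropout is the identity at inference
    hidden = layer_norm(relu(fused @ params["W1"] + params["b1"]))
    logit = hidden @ params["W2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logit))          # assumed Sigmoid output


rng = np.random.default_rng(0)
d, hidden_dim = 16, 8
params = {"W1": rng.normal(scale=0.1, size=(2 * d, hidden_dim)),
          "b1": np.zeros(hidden_dim),
          "W2": rng.normal(scale=0.1, size=hidden_dim),
          "b2": 0.0}
prob = predict_interaction(rng.normal(size=(10, d)), rng.normal(size=d), params)
```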

[0107] This application provides a peptide-protein interaction prediction method based on multi-source features and feature fusion. It integrates peptide contact maps, ESM-2 pre-trained features, and hand-coded features. Enhanced peptide representations are generated through a multi-scale sparse cross-attention (MSC) architecture. Peptide feature fusion is separated from subsequent peptide-protein interaction modeling, ensuring the fusion process focuses on optimizing peptide representation quality. Sparsity weights reduce repetitive signals, achieving efficient and high-quality feature fusion. A PepPIs prediction framework is provided, utilizing Mamba-2 global dependency modeling, Global Local Channel Attention (GLCA), and Multi-Scale Linear Attention (LiteMLA) for cross-scale linear attention fusion. Mamba-2's state-space model-based global dependency modeling provides linear complexity for global sequence feature learning. GLCA performs attention modeling in both global and local channel dimensions, separating global context from fine-grained local cues. LiteMLA replaces softmax with linear kernels for cross-scale attention, improving the efficiency of cross-scale interactions and suppressing low-activation regions. The method boasts advantages such as low-rank resource consumption, linear or near-linear time complexity, high robustness to large-scale data, and effective utilization of multimodal features.

[0108] In one embodiment, a peptide-protein interaction prediction method based on multi-source features and feature fusion can be applied to, for example, the model architecture shown in Figure 3, a model for predicting peptide-protein interactions based on multi-source features and multi-scale sparse cross-attention (MSMSC). As shown in Figure 3, the MSMSC model is divided into three core branches: A) Protein Feature Representations: processing long protein sequences and extracting global features; B) Peptide Feature Representations: processing short peptide sequences and performing residue-level refined feature learning; C) Predicted Module: fusing peptide and protein features to output the final interaction prediction results. It also includes four core components used in the above branches: D) Mamba-2: used for linear-complexity long-range dependency modeling in the peptide branch; E) GLCA (Global Local Channel Attention): dual-branch channel attention, simultaneously capturing local and global channel dependencies; F) MSC (Multi Scale Sparse Cross Attention): multi-scale sparse cross-attention, used for peptide multi-source feature fusion; G) LiteMLA (Multi Scale Linear Attention): multi-scale linear attention, used for long-range dependency modeling in the protein branch.

[0109] As shown in Figure 3, branch A is the protein feature representation, which is specifically designed for long protein sequences and employs a global and lightweight feature extraction strategy; branch B is the peptide feature representation, which is designed for short peptide sequences and employs residue-level refined feature learning to fully explore multi-source heterogeneous information; branch C is the prediction module, which inputs the fused features into a multilayer perceptron (MLP) classifier and outputs the prediction results of peptide-protein interactions.

[0110] Among them, Mamba-2 is based on the State-Space Model (SSM) and achieves long-range dependency modeling with linear time complexity, which is specifically adapted to short peptide sequences; GLCA is a dual-branch channel attention structure, where the local branch uses a one-dimensional convolution with a kernel size of 3 to capture fine-grained dependencies, the global branch uses non-local attention to encode global dependencies between channels, and the outputs of the two branches are finally fused; MSC mines multi-scale information through multi-window average pooling and combines Top-k sparse operations to select key attention, so as to achieve efficient fusion of multi-source features; LiteMLA replaces the traditional softmax attention with ReLU linear attention, reducing the complexity from quadratic to linear, and combines multi-scale convolution to efficiently process long sequences.

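[0111] In one embodiment, the MSMSC model is evaluated using multiple standard classification performance metrics, including accuracy (ACC), precision (PRE), recall (REC), F1 score (F1), and Matthews correlation coefficient (MCC). With TP, TN, FP, and FN denoting the numbers of true positives, true negatives, false positives, and false negatives, the evaluation formulas for each performance metric can be expressed as follows: $\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$; $\mathrm{PRE} = \frac{TP}{TP + FP}$; $\mathrm{REC} = \frac{TP}{TP + FN}$; $\mathrm{F1} = \frac{2 \times \mathrm{PRE} \times \mathrm{REC}}{\mathrm{PRE} + \mathrm{REC}}$; $\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$. Among them, ACC measures the overall correctness of predictions, PRE reflects the accuracy of positive sample predictions, REC measures the ability to identify positive samples, F1 combines PRE and REC to balance precision and recall, and MCC is used as the main evaluation metric, since it remains informative when sample classes are imbalanced and is therefore suitable for bioinformatics datasets. Through the above metrics, the predictive performance and robustness of the model for peptide-protein interactions can be comprehensively evaluated.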
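For reference, the five metrics can be computed from confusion-matrix counts as in the following Python sketch, which uses the standard definitions of ACC, PRE, REC, F1, and MCC. The function name, the zero-division fallbacks, and the example counts are illustrative assumptions and are not results from this application.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return ACC, PRE, REC, F1, and MCC from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "PRE": pre, "REC": rec, "F1": f1, "MCC": mcc}


# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```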

[0112] In one embodiment, to verify the effectiveness of the MSMSC model in this application, it was compared with several existing peptide-protein interaction prediction methods, including CAMP, PIPR, HyperAttDTI, CSGNN, and DeepPepPI, under independent test set conditions. All models were evaluated with the same metrics, and the experimental results are expressed as the mean and standard deviation of 20 independent tests. As shown in Figure 4, MSMSC achieves the best overall performance on the BD1 dataset. On the BD2 dataset, it outperforms the comparison methods in all metrics except precision. Compared with DeepPepPI, the best-performing existing method, MSMSC achieves a steady improvement in key metrics such as accuracy, MCC, and F1 on both datasets, demonstrating stronger predictive ability and generalization performance.

[0113] In one embodiment, to verify the contribution of each functional module of this application to the overall performance, a systematic ablation experiment analysis was conducted on the MSMSC model. The experiment was carried out on an independent test set, focusing on examining the impact of each module on the predictive performance of peptide-protein interactions. The ablation analysis of each module of the model is shown in Figure 5. Removing peptide contact map encoding significantly reduced the model's performance across all evaluation metrics on both the BD1 and BD2 datasets. The structural prior information provided by contact maps effectively complements sequence features and enhances the ability to characterize interaction interfaces, and is thus a crucial factor in improving prediction accuracy.

[0114] As shown in panels A and B of Figure 5, the model performance degrades significantly when the MSC module is removed or replaced with standard cross-attention. The experimental results are shown in the table below:

[0115]

[0116] It can be seen that MSC achieves efficient fusion of peptide multi-source features through multi-scale and sparse mechanisms, and can effectively extract high-order cross-molecular dependency information. Its performance is significantly better than that of traditional cross-attention structures.

[0117] As shown in panels C and D of Figure 5, removing the Mamba-2, Global Local Channel Attention (GLCA), or Multi-Scale Linear Attention (LiteMLA) module resulted in varying degrees of performance degradation. Mamba-2 helps capture long-range peptide dependencies; GLCA can simultaneously model global and local channel features; and LiteMLA enhances the ability to filter key information while maintaining linear complexity. The synergistic effect of these three modules significantly improves the model's overall perceptual ability.

[0118] As shown in panels E and F of Figure 5, after comparing various pre-trained protein language models, it was found that ESM-2 achieved the best or consistently leading performance on both the BD1 and BD2 datasets, indicating that it has advantages in feature expression ability and training corpus size, and is suitable as the default feature extraction model.

[0119] In summary, the ablation experiment results show that the functional modules proposed in this application are structurally complementary and synergistic, jointly improving the accuracy and robustness of peptide-protein interaction prediction, and verifying the rationality and advancement of the overall model design.

[0120] It should be understood that although the steps in the flowchart above are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowchart above may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.

[0121] In one embodiment, as shown in Figure 6, a peptide-protein interaction prediction system based on multi-source features and feature fusion is provided, including: a data acquisition module 610, a feature extraction module 620, a feature learning module 630, and a feature fusion and prediction module 640, wherein:

[0122] The data acquisition module 610 is used to acquire various peptide-protein interaction data and define each peptide-protein interaction data as a peptide sequence or a protein sequence according to the amino acid sequence length.

[0123] The feature extraction module 620 is used to extract residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features for peptide sequences as peptide multi-source features; and to extract sequence-level statistical features and ESM-2 global pooling features for protein sequences as protein features.

[0124] The feature learning module 630 is used to input the peptide multi-source features into the peptide branch, where they are processed by the Mamba-2 model, global-local channel attention, and the CNN-BiLSTM network respectively; the multi-source features are then fused through multi-scale sparse cross attention, and the peptide feature representation is output. The protein features are input into the protein branch, where they are processed sequentially by global-local channel attention and multi-scale linear attention to output the protein feature representation.

[0125] The feature fusion and prediction module 640 is used to concatenate peptide feature representations and protein feature representations and input them into a multilayer perceptron, and output the predicted probability of peptide-protein interaction.

[0126] In one embodiment, the data acquisition module 610 is further configured to select the peptide-protein interaction data from the STRING database according to a confidence index; determine the amino acid sequence length corresponding to each item of peptide-protein interaction data; define sequences whose amino acid sequence length is greater than a length threshold as protein sequences, and define sequences whose amino acid sequence length is not greater than the length threshold as peptide sequences.
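The selection and length-based labelling described above can be sketched as follows. This is a minimal illustration, not the implementation of this application: the record layout, the constant names `LENGTH_THRESHOLD` and `MIN_CONFIDENCE`, and their values are all assumptions, since the text names neither the length threshold nor the confidence cutoff.

```python
from typing import List, Tuple

# Hypothetical values: the source states neither the cutoff nor the confidence scale.
LENGTH_THRESHOLD = 50
MIN_CONFIDENCE = 700

# Assumed record layout: (sequence_a, sequence_b, confidence score).
Record = Tuple[str, str, float]


def select_by_confidence(records: List[Record],
                         min_confidence: float = MIN_CONFIDENCE) -> List[Record]:
    """Keep only the interaction records whose confidence score reaches the cutoff."""
    return [r for r in records if r[2] >= min_confidence]


def classify_sequence(seq: str, threshold: int = LENGTH_THRESHOLD) -> str:
    """Return 'protein' if the sequence is longer than the threshold, else 'peptide'."""
    return "protein" if len(seq) > threshold else "peptide"


records = [("ACDEFGHIK", "M" * 120, 910.0), ("ACD", "M" * 80, 320.0)]
kept = select_by_confidence(records)
labels = [(classify_sequence(a), classify_sequence(b)) for a, b, _ in kept]
```

A sequence whose length equals the threshold is labelled a peptide, matching the "not greater than" wording of the embodiment.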

[0127] In one embodiment, the feature extraction module 620 is further configured to perform binary encoding, BLOSUM62 matrix encoding, AAIndex physicochemical property index encoding, and PC6 six-dimensional physicochemical descriptor encoding on each amino acid residue in the peptide sequence to obtain the respective encoding vectors; concatenate the encoding vectors along the channel dimension to obtain the residue-level hand-coded features; construct a contact map matrix based on the residue length of the peptide sequence, and assign values to the elements of the contact map matrix according to the spatial interaction rules of amino acid residues in the peptide sequence, so as to characterize the spatial proximity relationship between residues in the peptide sequence; convert the contact map matrix into residue-level vectors to generate the contact map structural features; input the peptide sequence into a pre-trained ESM-2 protein language model, extract the hidden state vector corresponding to each residue position from the output of the last layer of the ESM-2 protein language model, and generate the residue-level embedding features.
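As a rough illustration of the hand-coded branch, the sketch below builds a binary (one-hot) encoding, concatenates it with a second per-residue encoding along the channel dimension, and produces an L x L contact map whose rows serve as residue-level vectors. Several parts are assumptions: `toy_property_encode` is a one-dimensional placeholder standing in for the BLOSUM62, AAIndex and PC6 encodings (it contains no real index values), and `toy_contact_map` uses a sequence-separation rule only because the text does not spell out the spatial interaction rules.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def binary_encode(seq):
    """One-hot (binary) encoding: one 20-dimensional row per residue."""
    rows = []
    for aa in seq:
        row = [0.0] * len(AMINO_ACIDS)
        if aa in AA_INDEX:  # unknown residues stay all-zero
            row[AA_INDEX[aa]] = 1.0
        rows.append(row)
    return rows


def toy_property_encode(seq, flagged="AVILMFWC"):
    """Placeholder 1-dimensional encoding (NOT a real AAIndex or PC6 descriptor)."""
    return [[1.0 if aa in flagged else 0.0] for aa in seq]


def concat_channels(*feature_sets):
    """Concatenate per-residue feature rows along the channel dimension."""
    length = len(feature_sets[0])
    return [[v for fs in feature_sets for v in fs[i]] for i in range(length)]


def toy_contact_map(seq, max_separation=2):
    """L x L matrix; assumed rule: residues within max_separation positions are in contact."""
    n = len(seq)
    return [[1.0 if i != j and abs(i - j) <= max_separation else 0.0
             for j in range(n)] for i in range(n)]


peptide = "ACDW"
hand_coded = concat_channels(binary_encode(peptide), toy_property_encode(peptide))
contact_rows = toy_contact_map(peptide, max_separation=1)  # row i = vector for residue i
```

Both outputs have one row per residue, so each can later be projected to the ESM-2 embedding dimension independently.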

[0128] In one embodiment, the feature extraction module 620 is further configured to calculate amino acid composition features, dipeptide composition features, and amino acid entropy features for the protein sequence; concatenate the amino acid composition features, dipeptide composition features, and amino acid entropy features along the feature dimension to obtain sequence-level statistical features; feed the protein sequence into a pre-trained ESM-2 protein language model, extract the hidden state vector corresponding to each residue position from the output of the last layer of the ESM-2 protein language model, and generate protein residue-level embedding features; and perform a global average pooling operation on the protein residue-level embedding features to obtain global pooling features.
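The sequence-level statistics and the pooling step can be sketched as below. The sketch assumes that the amino acid entropy feature is the Shannon entropy of the residue frequencies (the text does not define it), that the sequence has at least two residues, and that the ESM-2 residue embeddings are already available as a list of equal-length vectors; the function names are illustrative.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """20 values: fraction of the sequence made up by each standard amino acid."""
    n = len(seq)
    return [seq.count(aa) / n for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """400 values: frequency of each overlapping residue pair (needs len(seq) >= 2)."""
    total = len(seq) - 1
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    return [counts[dp] / total for dp in DIPEPTIDES]


def shannon_entropy(seq):
    """Entropy in bits of the amino acid frequency distribution (assumed definition)."""
    freqs = [f for f in amino_acid_composition(seq) if f > 0]
    return -sum(f * math.log2(f) for f in freqs)


def sequence_statistics(seq):
    """Concatenate the three feature groups along the feature dimension (421 values)."""
    return amino_acid_composition(seq) + dipeptide_composition(seq) + [shannon_entropy(seq)]


def global_average_pool(residue_embeddings):
    """Average per-residue embedding vectors into one sequence-level vector."""
    n = len(residue_embeddings)
    return [sum(col) / n for col in zip(*residue_embeddings)]


stats = sequence_statistics("AACC")
pooled = global_average_pool([[1.0, 2.0], [3.0, 4.0]])
```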

[0129] In one embodiment, the feature learning module 630 is further configured to linearly project the residue-level hand-coded features and contact map structural features onto the dimensional space of the ESM-2 pre-trained residue-level embedding features to obtain a peptide multi-source feature tensor; input the peptide multi-source feature tensor into the Mamba-2 model, perform long-range dependency modeling on the peptide multi-source feature tensor based on the structured state space duality algorithm, and output a first peptide feature representation; input the peptide multi-source feature tensor into global-local channel attention, extract local channel dependencies through local branches, extract global channel interaction information through global branches, fuse the local channel dependencies and global channel interaction information and superimpose residual connections to output a second peptide feature representation; input the peptide multi-source feature tensor into a CNN-BiLSTM network, capture local interaction features of the peptide sequence through a multi-scale convolutional network, model sequence context information through a BiLSTM network, concatenate the local interaction features and sequence context information and linearly project them to output a third peptide feature representation.

[0130] In one embodiment, the feature learning module 630 is further configured to input the first peptide feature representation, the second peptide feature representation, and the third peptide feature representation into the multi-scale sparse cross-attention; perform multi-window average pooling on each peptide feature representation within the multi-scale sparse cross-attention to mine multi-scale information; and then compute sparse attention weights and perform weighted fusion with learnable parameters to generate the peptide feature representation.
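One way to read this fusion step is sketched below. It is only a plausible arrangement under stated assumptions, not the MSC module itself: pooling uses non-overlapping windows, sparsity is realised as a top-k softmax, every representation attends over the pooled tokens of all three representations, and the learnable parameters are modelled as one scalar gate per representation. The window sizes, `k`, and all function names are hypothetical.

```python
import math


def window_avg_pool(x, window):
    """Non-overlapping average pooling over the residue axis (last window may be shorter)."""
    pooled = []
    for start in range(0, len(x), window):
        chunk = x[start:start + window]
        pooled.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return pooled


def topk_softmax(scores, k):
    """Softmax over the k largest scores; every other position gets weight 0."""
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in keep)
    exps = {i: math.exp(scores[i] - peak) for i in keep}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]


def sparse_attend(queries, tokens, k):
    """Each query row attends to its k best-matching tokens (scaled dot product)."""
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, t)) / math.sqrt(dim) for t in tokens]
        weights = topk_softmax(scores, k)
        out.append([sum(w * t[c] for w, t in zip(weights, tokens)) for c in range(dim)])
    return out


def fuse(reps, gates, windows=(1, 2, 4), k=3):
    """Pool every representation at several window sizes, attend sparsely, mix by gates."""
    tokens = [tok for rep in reps for w in windows for tok in window_avg_pool(rep, w)]
    mix = topk_softmax(gates, len(gates))  # k = len(gates) gives an ordinary softmax
    attended = [sparse_attend(rep, tokens, k) for rep in reps]
    length, dim = len(reps[0]), len(reps[0][0])
    return [[sum(m * a[i][c] for m, a in zip(mix, attended)) for c in range(dim)]
            for i in range(length)]


rep1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
rep2 = [[0.2, 0.1], [0.3, 0.4], [0.0, 0.9], [0.7, 0.7]]
rep3 = [[0.5, 0.5], [0.5, 0.5], [0.1, 0.2], [0.9, 0.0]]
fused = fuse([rep1, rep2, rep3], gates=[0.0, 0.0, 0.0])
```

The fused output keeps the residue length and channel width of the inputs, so it can be pooled downstream in the same way as any single branch output.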

[0131] In one embodiment, the feature learning module 630 is further configured to linearly project the sequence-level statistical features and the ESM-2 global pooling features respectively, mapping them to the same feature dimension to obtain projected protein features; input the projected protein features into global-local channel attention, extract the local channel dependencies of the sequence-level statistical features through the local branch, extract the global channel interaction information of the ESM-2 global pooling features through the global branch, and superimpose a residual connection to output the first protein feature representation; input the first protein feature representation into multi-scale linear attention, construct a pseudo-sequence through feature segmentation, apply the ReLU linear attention mechanism for global dependency modeling, incorporate a scale-aware position bias to obtain multi-scale feature relationships, and output the final protein feature representation.
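The pseudo-sequence construction and the ReLU linear attention step can be illustrated as follows. The sketch uses the usual ReLU linear attention form, in which the key-value summary is accumulated once and reused for every query so that the cost grows linearly with the number of tokens. It is a simplification under assumptions: the query, key and value projections are omitted (the tokens are used directly), the scale-aware position bias is left out, and the vector length is assumed to be divisible by the number of tokens.

```python
def segment(vec, n_tokens):
    """Split one feature vector into n_tokens equal chunks to form a pseudo-sequence."""
    size = len(vec) // n_tokens
    return [vec[i * size:(i + 1) * size] for i in range(n_tokens)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def relu_linear_attention(q, k, v, eps=1e-6):
    """out_i = relu(q_i) @ (sum_j relu(k_j)^T v_j) / (relu(q_i) . sum_j relu(k_j) + eps)."""
    qs, ks = [relu(r) for r in q], [relu(r) for r in k]
    d_k, d_v = len(ks[0]), len(v[0])
    # Key-value summary (d_k x d_v) and key sum (d_k), each computed once for all queries.
    kv = [[sum(ks[j][a] * v[j][b] for j in range(len(ks))) for b in range(d_v)]
          for a in range(d_k)]
    k_sum = [sum(row[a] for row in ks) for a in range(d_k)]
    out = []
    for row in qs:
        norm = sum(x * s for x, s in zip(row, k_sum)) + eps
        out.append([sum(row[a] * kv[a][b] for a in range(d_k)) / norm for b in range(d_v)])
    return out


tokens = segment([1.0, 0.0, 0.0, 1.0], n_tokens=2)      # pseudo-sequence of 2 tokens
values = [[1.0, 2.0], [3.0, 4.0]]
attended = relu_linear_attention(tokens, tokens, values)
```

With the orthogonal tokens in this example each query matches exactly one key, so the output rows reproduce the corresponding value rows up to the small `eps` term.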

[0132] In one embodiment, the feature fusion and prediction module 640 is further configured to perform global average pooling on the peptide feature representation and layer normalization on the protein feature representation to align the dimensions of the peptide feature representation and the protein feature representation; concatenate the aligned peptide feature representation and protein feature representation along the feature dimension to generate a fused feature vector; input the fused feature vector into a multilayer perceptron and perform nonlinear transformation through at least two fully connected layers in sequence to output the predicted probability of peptide-protein interaction.
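The alignment, concatenation and prediction steps can be sketched as below. The weights here are placeholders rather than trained parameters, and two choices are assumptions not stated in the text: the hidden layer uses ReLU, and the output layer uses a sigmoid to produce a probability. Layer normalization is shown without learnable scale and shift.

```python
import math


def global_average_pool(x):
    """Collapse the residue axis of an L x d peptide representation into one d-vector."""
    n = len(x)
    return [sum(col) / n for col in zip(*x)]


def layer_norm(vec, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learnable affine terms)."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def linear(vec, weights, bias):
    """Fully connected layer; weights has one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def predict(peptide_rep, protein_rep, w1, b1, w2, b2):
    """Pool, normalize, concatenate, then apply a two-layer perceptron with sigmoid output."""
    fused = global_average_pool(peptide_rep) + layer_norm(protein_rep)
    hidden = [max(0.0, h) for h in linear(fused, w1, b1)]  # assumed ReLU activation
    logit = linear(hidden, w2, b2)[0]
    return 1.0 / (1.0 + math.exp(-logit))


peptide_rep = [[1.0, 3.0], [3.0, 5.0]]   # 2 residues x 2 channels
protein_rep = [1.0, 3.0]                 # already a single vector
w1 = [[0.0] * 4 for _ in range(3)]       # placeholder weights, not trained values
b1 = [0.0] * 3
w2 = [[0.0] * 3]
prob = predict(peptide_rep, protein_rep, w1, b1, w2, [0.0])
prob_shifted = predict(peptide_rep, protein_rep, w1, b1, w2, [2.0])
```

Pooling removes the residue axis from the peptide side, which is what allows the two representations to be concatenated along the feature dimension.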

[0133] In one embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in Figure 7. The computer device includes a processor, memory, network interface, display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The network interface is used to communicate with external terminals via a network connection. When executed by the processor, the computer program implements a peptide-protein interaction prediction method based on multi-source features and feature fusion. The display screen can be a liquid crystal display (LCD) or an e-ink display. The input devices can be a touch layer covering the display screen, buttons, a trackball, or a touchpad mounted on the computer device casing, or an external keyboard, touchpad, or mouse.

[0134] Those skilled in the art will understand that the structure shown in Figure 7 is merely a block diagram of the portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, combine certain components, or have different component arrangements.

[0135] In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, the processor executing the computer program to implement the steps of a peptide-protein interaction prediction method based on multi-source features and feature fusion.

[0136] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, the computer program being executed by a processor to implement the steps of a peptide-protein interaction prediction method based on multi-source features and feature fusion.

[0137] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

[0138] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0139] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the invention patent. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application should be determined by the appended claims.

Claims

1. A method for predicting peptide-protein interactions based on multi-source features and feature fusion, characterized in that the method comprises: acquiring peptide-protein interaction data, and defining each item of peptide-protein interaction data as a peptide sequence or a protein sequence according to its amino acid sequence length; for the peptide sequence, extracting residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features as peptide multi-source features; for the protein sequence, extracting sequence-level statistical features and ESM-2 global pooling features as protein features; inputting the peptide multi-source features into a peptide branch, processing them by a Mamba-2 model, global-local channel attention, and a CNN-BiLSTM network respectively, and then fusing the multi-source features through multi-scale sparse cross attention to output a peptide feature representation; inputting the protein features into a protein branch, and processing them sequentially through global-local channel attention and multi-scale linear attention to output a protein feature representation; and concatenating the peptide feature representation and the protein feature representation, inputting the result into a multilayer perceptron, and outputting the predicted probability of peptide-protein interaction.

2. The peptide-protein interaction prediction method based on multi-source features and feature fusion according to claim 1, characterized in that acquiring the peptide-protein interaction data and defining each item of peptide-protein interaction data as a peptide sequence or a protein sequence according to its amino acid sequence length comprises: selecting the peptide-protein interaction data from the STRING database according to a confidence index; determining the amino acid sequence length corresponding to each item of the peptide-protein interaction data; and defining sequences whose amino acid sequence length is greater than a length threshold as protein sequences, and sequences whose amino acid sequence length is less than or equal to the length threshold as peptide sequences.

3. The peptide-protein interaction prediction method based on multi-source features and feature fusion according to claim 1, characterized in that extracting, for the peptide sequence, residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features as peptide multi-source features comprises: performing binary encoding, BLOSUM62 matrix encoding, AAIndex physicochemical property index encoding, and PC6 six-dimensional physicochemical descriptor encoding on each amino acid residue in the peptide sequence to obtain the respective encoding vectors; concatenating the encoding vectors along the channel dimension to obtain the residue-level hand-coded features; constructing a contact map matrix based on the residue length of the peptide sequence, and assigning values to the elements of the contact map matrix according to the spatial interaction rules of amino acid residues in the peptide sequence, so as to characterize the spatial proximity relationship between residues in the peptide sequence; converting the contact map matrix into residue-level vectors to generate the contact map structural features; and inputting the peptide sequence into a pre-trained ESM-2 protein language model, and extracting the hidden state vector corresponding to each residue position from the output of the last layer of the ESM-2 protein language model to generate the residue-level embedding features.

4. The peptide-protein interaction prediction method based on multi-source features and feature fusion according to claim 1, characterized in that extracting, for the protein sequence, sequence-level statistical features and ESM-2 global pooling features as protein features comprises: calculating amino acid composition features, dipeptide composition features, and amino acid entropy features of the protein sequence respectively; concatenating the amino acid composition features, dipeptide composition features, and amino acid entropy features along the feature dimension to obtain the sequence-level statistical features; inputting the protein sequence into a pre-trained ESM-2 protein language model, and extracting the hidden state vector corresponding to each residue position from the output of the last layer of the ESM-2 protein language model to generate protein residue-level embedding features; and performing a global average pooling operation on the protein residue-level embedding features to obtain the global pooling features.

5. The peptide-protein interaction prediction method based on multi-source features and feature fusion according to claim 1, characterized in that inputting the peptide multi-source features into the peptide branch and processing them by the Mamba-2 model, global-local channel attention, and the CNN-BiLSTM network respectively comprises: linearly projecting the residue-level hand-coded features and the contact map structural features onto the dimensional space of the ESM-2 pre-trained residue-level embedding features to obtain a peptide multi-source feature tensor; inputting the peptide multi-source feature tensor into the Mamba-2 model, and performing long-range dependency modeling on the peptide multi-source feature tensor based on the structured state space duality algorithm to output a first peptide feature representation; inputting the peptide multi-source feature tensor into the global-local channel attention, extracting local channel dependencies through a local branch, extracting global channel interaction information through a global branch, and fusing the local channel dependencies and the global channel interaction information and superimposing a residual connection to output a second peptide feature representation; and inputting the peptide multi-source feature tensor into the CNN-BiLSTM network, capturing local interaction features of the peptide sequence through a multi-scale convolutional network, modeling sequence context information through a BiLSTM network, and concatenating the local interaction features and the sequence context information and linearly projecting them to output a third peptide feature representation.

6. The peptide-protein interaction prediction method based on multi-source features and feature fusion according to claim 5, characterized in that fusing the multi-source features through multi-scale sparse cross attention to output the peptide feature representation comprises: inputting the first peptide feature representation, the second peptide feature representation, and the third peptide feature representation into the multi-scale sparse cross attention; performing multi-window average pooling on each peptide feature representation within the multi-scale sparse cross attention to mine multi-scale information; and then computing sparse attention weights and performing weighted fusion with learnable parameters to generate the peptide feature representation.

7. The peptide-protein interaction prediction method based on multi-source features and feature fusion according to claim 1, characterized in that inputting the protein features into the protein branch and processing them sequentially through global-local channel attention and multi-scale linear attention to output the protein feature representation comprises: linearly projecting the sequence-level statistical features and the ESM-2 global pooling features respectively, mapping them to the same feature dimension to obtain projected protein features; inputting the projected protein features into the global-local channel attention, extracting local channel dependencies of the sequence-level statistical features through a local branch, extracting global channel interaction information of the ESM-2 global pooling features through a global branch, and superimposing a residual connection to output a first protein feature representation; and inputting the first protein feature representation into the multi-scale linear attention, constructing a pseudo-sequence through feature segmentation, applying the ReLU linear attention mechanism for global dependency modeling, incorporating a scale-aware position bias to obtain multi-scale feature relationships, and outputting the final protein feature representation.

8. The peptide-protein interaction prediction method based on multi-source features and feature fusion according to claim 1, characterized in that concatenating the peptide feature representation and the protein feature representation, inputting the result into a multilayer perceptron, and outputting the predicted probability of peptide-protein interaction comprises: performing a global average pooling operation on the peptide feature representation and a layer normalization operation on the protein feature representation to align the dimensions of the peptide feature representation and the protein feature representation; concatenating the aligned peptide feature representation and protein feature representation along the feature dimension to generate a fused feature vector; and inputting the fused feature vector into the multilayer perceptron and performing nonlinear transformation through at least two fully connected layers in sequence to output the predicted probability of peptide-protein interaction.

9. A peptide-protein interaction prediction system based on multi-source features and feature fusion, characterized in that the system comprises: a data acquisition module, used to acquire peptide-protein interaction data and define each item of peptide-protein interaction data as a peptide sequence or a protein sequence according to its amino acid sequence length; a feature extraction module, used to extract residue-level hand-coded features, contact map structural features, and ESM-2 pre-trained residue-level embedding features for the peptide sequence as peptide multi-source features, and to extract sequence-level statistical features and ESM-2 global pooling features for the protein sequence as protein features; a feature learning module, used to input the peptide multi-source features into a peptide branch, process them through a Mamba-2 model, global-local channel attention, and a CNN-BiLSTM network respectively, and then fuse the multi-source features through multi-scale sparse cross attention to output a peptide feature representation, and to input the protein features into a protein branch and process them sequentially through global-local channel attention and multi-scale linear attention to output a protein feature representation; and a feature fusion and prediction module, used to concatenate the peptide feature representation and the protein feature representation, input the result into a multilayer perceptron, and output the predicted probability of peptide-protein interaction.