RNA-protein binding site prediction method and system based on self-attention mechanism

A technology for predicting RNA-protein binding sites, applied in the field of bioinformatics. It addresses the insufficient features extracted from RNA data by existing models and the resulting limits on prediction accuracy, so as to improve prediction accuracy and reduce experimental time and cost.

Active Publication Date: 2022-02-08
SICHUAN UNIV

AI Technical Summary

Problems solved by technology

At the modeling stage, existing models mostly combine a convolutional neural network (CNN) with a recurrent neural network (RNN). Compared with a self-attention mechanism, these models extract insufficient features from RNA data, so prediction accuracy still needs to be improved.


Examples

Embodiment Construction

[0071] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments.

[0072] The present invention proposes a method and system for predicting RNA-protein binding sites based on a self-attention mechanism. The method encodes the contextual relationship between adjacent nucleotides through k-mer embedding encoding, and introduces a self-attention mechanism to build the prediction model. The self-attention mechanism gives higher weights to key subsequences so that the network can fully learn key features, thereby improving the model's prediction accuracy.
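As an illustration of the k-mer encoding idea described above, the following is a minimal sketch, not the patented implementation. The function names, the choice of k = 3, the stride of 1, and the reserved id 0 for unknown k-mers are all assumptions made for this example; the source does not state them. The sketch splits an RNA sequence into overlapping k-mers so that each token carries the context of its neighbouring nucleotides, then maps each k-mer to an integer id.

```python
from itertools import product


def build_kmer_vocab(k, alphabet="ACGU"):
    """Map every possible k-mer over the alphabet to an integer id.

    Ids start at 1; id 0 is reserved (an assumption of this sketch)
    for k-mers containing symbols outside the alphabet, such as N.
    """
    return {"".join(p): i + 1 for i, p in enumerate(product(alphabet, repeat=k))}


def kmer_tokens(seq, k, stride=1):
    """Split a sequence into overlapping k-mers with a sliding window."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as RNA
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def encode(seq, k, vocab):
    """Turn a sequence into a list of k-mer ids (0 for unknown k-mers)."""
    return [vocab.get(kmer, 0) for kmer in kmer_tokens(seq, k)]


vocab = build_kmer_vocab(3)
tokens = kmer_tokens("AUGGCU", 3)
ids = encode("AUGGCU", 3, vocab)
print(tokens)
print(ids)
```

In a TensorFlow model, integer ids of this kind would typically be passed to an embedding layer such as `tf.keras.layers.Embedding`, which learns a dense vector per k-mer. Whether the patent uses a learned or a pretrained embedding is not stated in the visible text.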

[0073] This embodiment provides a method for predicting RNA-protein binding sites based on the self-attention mechanism. Referring to Figure 1 and Figure 2, the process is implemented with Python 3.8.6 and TensorFlow 2.4.0. The method includes:

[0074] S1: Data acquisition and preprocessing: acquire RNA sequence data and preprocess it;

[0075] S2: Based...

Abstract

The invention discloses an RNA-protein binding site prediction method and system based on a self-attention mechanism. In the method, a deep learning model is trained on the sequence features of RNA-protein binding sites and their upstream and downstream regions, and the trained model is used to predict RNA-protein binding sites. When encoding the sequence features, a k-mer embedding encoding is introduced to encode the contextual relationship between adjacent nucleotides, which provides more effective features for the model. During feature extraction, the prediction model is built with a self-attention mechanism, which attends to RNA sequence features from a global perspective and gives key subsequences higher weights so that the network can fully learn key features, thereby improving the model's prediction accuracy. Finally, the method is evaluated on a benchmark data set, where it is superior to the prior art in prediction accuracy.
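The weighting behaviour of self-attention described in the abstract can be sketched with single-head scaled dot-product attention. This is a generic illustration written in plain Python to stay dependency-free; it is not the patent's network, whose layer sizes and number of heads are not given in the visible text. The helper names and the toy identity projection matrices are assumptions for the example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x holds one feature vector per sequence position (for example, one
    per k-mer). Returns the attended outputs and the attention weights.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = matmul(q, k_t)               # similarity of every position pair
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy input: three positions with two features each; identity projections
# are a hypothetical choice that keeps the arithmetic easy to follow.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = self_attention(x, identity, identity, identity)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and shows how strongly one position attends to every other position. Positions whose keys match the query receive larger weights, which is the sense in which key subsequences are given higher weight.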

Description

Technical field

[0001] The invention relates to the technical field of bioinformatics, in particular to a method and system for predicting RNA-protein binding sites based on a self-attention mechanism.

Background

[0002] RNA-binding proteins (RBPs) play an important role in gene regulation. Studies have shown that, except for a few RNAs that can function independently as ribozymes, most RNAs participate in gene regulation by binding to proteins to form RNA-protein complexes. RNA-binding proteins play key roles in regulating life activities such as RNA synthesis, alternative splicing, modification, transport and translation. For example, heterogeneous nuclear ribonucleoprotein L (HNRNPL) not only directly regulates the alternative splicing of many RNAs, but also regulates the formation of circular RNAs (circRNAs) through back-splicing.

[0003] To better understand the function of RNA-binding proteins, researchers need reliable predictive mod...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B15/30, G16B40/00, G06K9/62, G06N3/04, G06N3/08
CPC: G16B15/30, G16B40/00, G06N3/084, G06N3/047, G06N3/048, G06F18/241, G06F18/2415, Y02A90/10
Inventor: 朱敏, 王心翌, 张铭洋, 姚林, 龙春林
Owner: SICHUAN UNIV