
A Deep Learning-Based Identification Method for RNA and Protein Binding Sites

Deep learning is applied to the identification of RNA and protein binding sites. It addresses the drawbacks of experimental methods: harsh requirements on the experimental environment, results that are sensitive to uncontrolled factors, and a heavy cost in time and effort.

Active Publication Date: 2022-03-08
JILIN UNIV
10 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

At present, there are two main approaches to studying the binding sites of RNA and proteins. One is biological experimentation, which places strict requirements on the experimental environment and demands a high level of expertise from researchers.
Although this approach is accurate and reliable, it takes a great deal of time and effort, and many uncontrollable factors in the experimental process can strongly affect the results.



Examples


Embodiment Construction

[0025] As shown in Figure 1 and Figure 2:

[0026] The deep-learning-based method for identifying RNA and protein binding sites provided by the present invention is divided into two parts, data processing and model design and training. The main steps of each part are as follows:

[0027] Part 1: data processing

[0028] This method uses the data in CLIPdb as its data set. The data set consists of 31 experiments covering 19 RBPs. In each experiment, all nucleotides of the RNA within the interaction-site clusters derived from CLIP-seq are considered binding sites. The data of each experiment is divided into a training set of 30,000 records and a test set of 10,000 records. We then use 80% of the original training set for training and the remaining 20% for validation. As a result, in each biological experiment, 24,000 samples were used as the training set, 6,000 samples were used as the validation set, and ...
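The 80/20 split described in [0028] can be sketched as follows. This is a minimal illustration only: the function name `split_train_validation`, the shuffle, the fixed seed, and the placeholder integer records are assumptions, since the text does not say how the records are selected for each subset.

```python
import random


def split_train_validation(records, train_fraction=0.8, seed=0):
    """Split records into (train, validation) after a seeded shuffle.

    The shuffle and the seed are assumptions; the source only gives
    the 80% / 20% proportions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder IDs standing in for the 30,000 original training records.
original_training_set = list(range(30000))
train, validation = split_train_validation(original_training_set)
print(len(train), len(validation))  # 24000 6000
```

Fixing the seed keeps the split reproducible across runs, which matters when models from different experiments are compared on the same validation records.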


Abstract

The invention discloses a deep-learning-based method for identifying RNA and protein binding sites. The method is divided into two parts, data processing and model design and training. Part 1, data processing: step 1, remove redundancy from the original data; step 2, predict secondary structure information; step 3, encode the secondary structure obtained in step 2; step 4, save all data in the corresponding CSV file as input data. Part 2, model design and training: step 1, encode the data into a matrix that serves as the model input; step 2, make the dimensions of the two feature matrices the same; step 3, input the combined features into the encoder; step 4, obtain the final encoding result from the encoder; step 5, obtain the final category through the sigmoid function. Beneficial effects: using a Transformer network to learn the long-range dependencies of sequences and to encode features allows the binding sites of RNA and proteins to be predicted effectively.
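The model-side steps in the abstract can be illustrated with a toy forward pass in plain Python. It is a sketch under stated assumptions, not the patented model: the two feature matrices are taken to be one-hot encodings of the sequence and of a dot-bracket secondary structure, they are combined by per-position concatenation, the encoder is reduced to a single self-attention layer with identity projections, and the weights are arbitrary untrained values. All function names are hypothetical.

```python
import math

SEQ_ALPHABET = "ACGU"    # RNA nucleotides
STRUCT_ALPHABET = ".()"  # assumed dot-bracket structure symbols


def one_hot(text, alphabet):
    """Encode each character as a one-hot row; unknown symbols give all zeros."""
    return [[1.0 if ch == a else 0.0 for a in alphabet] for ch in text]


def combine(seq_rows, struct_rows):
    """Concatenate per-position features (an assumed combination rule)."""
    if len(seq_rows) != len(struct_rows):
        raise ValueError("sequence and structure must have the same length")
    return [s + t for s, t in zip(seq_rows, struct_rows)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(rows):
    """Single-head scaled dot-product attention with identity projections."""
    dim = len(rows[0])
    encoded = []
    for query in rows:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in rows]
        weights = softmax(scores)
        encoded.append([sum(w * value[j] for w, value in zip(weights, rows))
                        for j in range(dim)])
    return encoded


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(seq, struct, weights, bias=0.0):
    """Encode, attend, mean-pool over positions, then apply a sigmoid."""
    features = combine(one_hot(seq, SEQ_ALPHABET),
                       one_hot(struct, STRUCT_ALPHABET))
    encoded = self_attention(features)
    pooled = [sum(column) / len(encoded) for column in zip(*encoded)]
    return sigmoid(sum(w * x for w, x in zip(weights, pooled)) + bias)


# Untrained, arbitrary weights: the score only demonstrates the data flow.
score = predict("GGGAAACCC", "(((...)))", weights=[0.1] * 7)
```

Because every position attends to every other position, the attention step is what lets a model of this kind relate distant nucleotides; a trained Transformer encoder would add learned projections, multiple heads, and feed-forward layers on top of this.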

Description

Technical field

[0001] The invention relates to a method for identifying the binding sites of RNA and proteins, and in particular to a deep-learning-based method for identifying the binding sites of RNA and proteins.

Background art

[0002] An RNA-binding protein (RBP) is a protein that regulates RNA metabolism. RBPs are closely related to many important biological processes, such as post-transcriptional regulation and gene expression. For example, miRNA is a type of RNA about 21 nucleotides long; after interacting with mRNA, it stops translation and thereby plays a role in post-transcriptional regulation. In addition, RBP dysregulation may lead to various diseases. Mutations in RBPs such as FUS and TDP-43 are closely associated with ALS. RBPs can also disrupt post-transcriptional mechanisms in diabetes, so understanding the regulatory role of RBPs in diabetes can help in designing RNA-based treatments for diabetic complications. Integrating the information...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B20/30, G06N3/04
CPC: G16B20/30, G06N3/045
Inventors: 朱晓冬, 李泽晋, 刘元宁
Owner: JILIN UNIV