RNA and protein binding site recognition method based on deep learning

A deep learning technique applied to the identification of RNA and protein binding sites. It addresses the drawbacks of experimental approaches: strict requirements on the experimental environment, high demands on researchers' expertise, and results that are affected by uncontrollable factors.

Active Publication Date: 2021-07-27
JILIN UNIV
10 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

At present, there are two main methods for studying the binding sites of RNA and proteins. One is through biological experiments, which place strict requirements on the experimental environment and demand a high level of expertise from researchers.
Although this method is accurate and reliable, it takes a great deal of time and effort, and many uncontrollable factors in the experimental process can strongly affect the results.



Examples


Embodiment Construction

[0025] Refer to Figures 1 and 2:

[0026] The present invention provides a deep-learning-based method for identifying RNA and protein binding sites. It is divided into two parts, data processing and model design and training, and the main steps are as follows:

[0027] Part 1: data processing

[0028] The method uses data from clipdb as the data set for this experiment. It consists of 31 experiments covering 19 proteins. For the data in each experiment, all nucleotides within the interaction regions identified by CLIP-seq are considered binding sites. The data of each experiment is divided into a training set and a test set, with 30,000 training samples and 10,000 test samples. We resample the original training data, using 80% as the training set and 20% as the validation set. In the end, each biological experiment has 24,000 samples as the training set, 6,000 samples as the validation set, and 10,000 or more samples as an independent test set.
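
The 80/20 split described in [0028] can be sketched as follows. This is a minimal illustration, not the patent's own code: the function name `split_train_validation`, the fixed random seed, and the use of integer IDs in place of real samples are all assumptions made for the example.

```python
import random


def split_train_validation(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into training and validation lists.

    Hypothetical helper: the source only states the 80%/20% proportions,
    not how the split is drawn.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(samples)           # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder IDs standing in for the original training samples of one experiment.
original_training = list(range(30_000))
train, validation = split_train_validation(original_training)
print(len(train), len(validation))
```

The independent test set is never passed to this function, so it stays separate from both the training and validation data.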

[0029] The original data in the data set is the RNA sequence f...


Abstract

The invention discloses a deep-learning-based method for identifying RNA and protein binding sites. The method is divided into two parts, data processing and model design and training, and each part mainly comprises the following steps. 1, data processing: step 1, removing redundancy from the original data; step 2, predicting secondary structure information; step 3, encoding the secondary structure obtained in step 2; step 4, storing all data in the corresponding csv file as input data. 2, model design and training: step 1, encoding the data as a matrix to serve as model input; step 2, ensuring that the two feature matrices have the same dimensions; step 3, inputting the combined features into an encoder; step 4, obtaining the final encoding result of the encoder; step 5, obtaining the final category through a sigmoid function. The beneficial effect of the invention is that a Transformer network is used to learn the long-range dependencies of the sequence and to encode the features, so that RNA and protein binding sites can be effectively predicted.
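
The data flow of the model part (encode, combine, encode with attention, sigmoid) can be sketched in plain Python. This is a rough illustration under several assumptions that are not stated in the source: the secondary structure is taken to be in dot-bracket notation, the two feature matrices are combined by concatenating them position by position, and a single untrained self-attention step with no learned projections stands in for the full Transformer encoder. The weights are hypothetical and untrained, so the printed probability has no biological meaning.

```python
import math

NUCLEOTIDES = "ACGU"
STRUCTURE_SYMBOLS = "()."  # assumption: dot-bracket secondary structure


def one_hot(text, alphabet):
    """Encode each character as a one-hot row over the given alphabet."""
    return [[1.0 if ch == sym else 0.0 for sym in alphabet] for ch in text]


def combine(seq_matrix, struct_matrix):
    """Concatenate the two feature matrices row by row (assumed combination)."""
    if len(seq_matrix) != len(struct_matrix):
        raise ValueError("sequence and structure must have the same length")
    return [s + t for s, t in zip(seq_matrix, struct_matrix)]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned weights)."""
    d = len(x[0])
    out = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        weights = softmax(scores)
        # Each output row is a weighted average over all positions.
        out.append([sum(w * row[j] for w, row in zip(weights, x)) for j in range(d)])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(sequence, structure, weights, bias=0.0):
    """Return a probability-like score for one sequence (untrained sketch)."""
    features = combine(one_hot(sequence, NUCLEOTIDES),
                       one_hot(structure, STRUCTURE_SYMBOLS))
    encoded = self_attention(features)
    pooled = [sum(col) / len(encoded) for col in zip(*encoded)]  # mean over positions
    return sigmoid(sum(w * p for w, p in zip(weights, pooled)) + bias)


# Hypothetical weights: 4 nucleotide columns followed by 3 structure columns.
weights = [0.5, -0.25, 0.75, -0.5, 0.1, 0.1, -0.2]
probability = predict("GGGAAACCC", "(((...)))", weights)
print(f"binding probability: {probability:.3f}")
```

Because every position attends to every other position, the attention step can relate distant nucleotides directly, which is the property the abstract attributes to the Transformer network. A real implementation would add learned projections, multiple heads, feed-forward layers, and training.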

Description

Technical field

[0001] The present invention relates to a method for identifying RNA and protein binding sites, and more particularly to a deep-learning-based method for identifying RNA and protein binding sites.

Background technique

[0002] RNA-binding proteins (RBPs) are a class of proteins that regulate RNA metabolism and are closely related to many important biological processes, such as post-transcriptional regulation and gene expression. For example, miRNAs are a class of RNAs with a length of 21 bp; after interaction, they halt translation and play a role in transcriptional regulation. In addition, the dysregulation of RBPs may lead to various diseases. Mutations in the RBPs FUS and TDP-43 are closely associated with amyotrophic lateral sclerosis. RBPs can also disrupt transcriptional mechanisms in diabetes, and understanding the regulation of RBPs in diabetes can help in designing RNA-based treatments for diabetes complications. Integrated RBP and RNA interactions and single nucleotid...


Application Information

IPC(8): G16B20/30, G06N3/04
CPC: G16B20/30, G06N3/045
Inventors: 朱晓冬, 李泽晋, 刘元宁
Owner: JILIN UNIV