Method for predicting RNA secondary structure

A method for predicting RNA secondary structure, applied in the field of biological research. It addresses the difficulty of obtaining RNA crystals and the high cost and long duration of determining secondary structure experimentally, and achieves improved prediction efficiency and accurate prediction results.

Pending Publication Date: 2019-07-12
ZHEJIANG UNIVERSITY OF SCIENCE AND TECHNOLOGY

AI Technical Summary

Problems solved by technology

RNA molecules are difficult to crystallize and degrade quickly, so determining their secondary structure through physical and chemical experiments is time-consuming and costly.
Computational prediction based on comparative sequence analysis and minimum free energy methods is more efficient than these traditional experiments, but for RNA with long primary sequences the time and cost of prediction still increase greatly.

Examples


Embodiment 1

[0035] Example 1: A method for predicting RNA secondary structure. Download the PDB data set from the official website of the PDB database. Figure 1 shows part of one RNA entry, '2JTP.pdb', from the downloaded PDB data. Each PDB entry contains three parts: RNA sequence information, the RNA primary sequence, and three-dimensional spatial coordinates. As the figure shows, the RNA primary sequence is recorded under 'SEQRES'. First, the PDB data set is preprocessed and the primary sequence is extracted with regular expressions. Some records contain characters other than A, C, G, and U; these characters must be cleaned out to obtain the correct RNA primary sequence. The known RNA secondary structure prediction software RNAview is then run in batches under Linux to predict the RNA secondary structure corresponding to each primary sequence, and RNA tertiary structures of excessively high dimensionality are removed, leaving ...
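
The extraction and cleaning step described above can be sketched in Python as follows. This is a minimal illustration and not the inventors' code: the regular expression, the function name `extract_primary_sequences`, and the choice to drop (rather than map) residue names other than A, C, G, and U are all assumptions, and the sample text is made up for the demonstration.

```python
import re

# Assumed layout of a SEQRES record: record name, serial number,
# one-character chain ID, residue count, then the residue names.
SEQRES_RE = re.compile(r"^SEQRES\s+\d+\s+(\w)\s+\d+\s+(.+)$")
VALID_BASES = {"A", "C", "G", "U"}


def extract_primary_sequences(pdb_text):
    """Return {chain_id: sequence} built only from A, C, G, U residues."""
    residues_by_chain = {}
    for line in pdb_text.splitlines():
        match = SEQRES_RE.match(line)
        if match is None:
            continue  # not a SEQRES record
        chain_id, residue_field = match.groups()
        residues_by_chain.setdefault(chain_id, []).extend(residue_field.split())
    # Cleaning step (assumption): drop every residue name that is not A, C, G or U.
    return {
        chain: "".join(r for r in residues if r in VALID_BASES)
        for chain, residues in residues_by_chain.items()
    }


# Made-up sample records, not taken from 2JTP.pdb.
example = """\
HEADER    RNA
SEQRES   1 A    6    G   G   C PSU   A   U
SEQRES   1 B    3    C   N   G
"""
print(extract_primary_sequences(example))
```

Grouping by chain ID keeps multi-chain entries separate; whether the original method treats chains separately is not stated in the text.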

Embodiment 2

[0045] Example 2: A method for predicting the secondary structure of RNA. Download the PDB data set from the official website of the PDB database. First, preprocess the PDB data set and extract the primary sequence with regular expressions. Some records contain characters other than A, C, G, and U; these characters must be cleaned out to obtain the correct RNA primary sequence. The known RNA secondary structure prediction software RNAview is then run in batches under Linux to predict the RNA secondary structure corresponding to each primary sequence, and RNA tertiary structures of excessively high dimensionality are removed, leaving only the secondary structure and partial pseudoknot structures. After preprocessing, the PDB data set is divided into an RNA primary sequence data set and an RNA secondary structure data set. The RNA primary sequences in the primary sequence data set are then encoded by computer, and a 5bit orthogonal 0...
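
The description of the encoding is cut off in this excerpt, so the exact bit assignment cannot be recovered from it. As a rough illustration of what a 5-bit orthogonal (one-hot) encoding can look like, the Python sketch below assumes one bit each for A, C, G, and U plus a fifth bit for a hypothetical padding or unknown symbol `-`. The symbol order, the meaning of the fifth bit, and the function names are assumptions, not details from the patent.

```python
# Symbol order is an assumption; "-" is a hypothetical padding/unknown symbol.
SYMBOLS = "ACGU-"
INDEX = {symbol: position for position, symbol in enumerate(SYMBOLS)}


def one_hot(symbol):
    """Return a 5-element 0/1 list with a single 1 at the symbol's index."""
    vector = [0] * len(SYMBOLS)
    # Anything outside A, C, G, U falls back to the fifth (padding) slot.
    vector[INDEX.get(symbol, len(SYMBOLS) - 1)] = 1
    return vector


def encode_sequence(sequence):
    """Encode a primary sequence as a list of 5-bit one-hot vectors."""
    return [one_hot(base) for base in sequence]


print(encode_sequence("GA"))
```

The codes are orthogonal in the sense that the dot product of the vectors for any two different symbols is zero, so no artificial ordering is imposed on the bases.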

Embodiment 3

[0053] Example 3: A method for predicting RNA secondary structure. The prediction follows the operating steps of Example 2, but when the encoded RNA primary sequence is input into the machine learning model as features, the number of windows and the number of base pairs in RNA long-range correlations are varied to test the overall prediction accuracy of the RNA secondary structure. In this example the SVM classifier is used alone, and quantitative analysis is used to determine the settings best suited to RNA secondary structure prediction. The test results are shown in Figure 8. Figure 8 shows that when there is no base pairing in the RNA long-range correlation, that is, when base pair = 0, the overall prediction accuracy rises with the number of windows and can reach up to 80%, because the relationship between bases in the RNA secondary structure becomes larger, the larger the number of wi...
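
To make the role of the window setting concrete, the Python sketch below builds one feature vector per base from the encoded bases around it. It rests on assumptions: "number of windows" is read here as the number of neighbouring positions taken on each side of the central base (`half_width`), the 5-bit encoding and the padding symbol `-` are hypothetical, and the long-range base-pair feature is left out because this excerpt does not say how it is encoded. The resulting vectors are the kind of fixed-length input that an SVM classifier could be trained on.

```python
SYMBOLS = "ACGU-"  # assumed order; "-" is a hypothetical padding symbol
INDEX = {symbol: position for position, symbol in enumerate(SYMBOLS)}


def one_hot(symbol):
    """5-bit one-hot code; unknown symbols share the padding slot."""
    vector = [0] * len(SYMBOLS)
    vector[INDEX.get(symbol, len(SYMBOLS) - 1)] = 1
    return vector


def window_features(sequence, half_width):
    """Return one flat feature vector per position of the sequence.

    Each vector concatenates the one-hot codes of the central base and of
    half_width neighbours on each side, padding beyond the sequence ends.
    """
    padded = "-" * half_width + sequence + "-" * half_width
    features = []
    for center in range(len(sequence)):
        window = padded[center:center + 2 * half_width + 1]
        vector = []
        for symbol in window:
            vector.extend(one_hot(symbol))
        features.append(vector)
    return features


vectors = window_features("GCAU", half_width=1)
print(len(vectors), len(vectors[0]))
```

A wider window gives each position more sequence context at the cost of longer feature vectors, which is the trade-off the window experiments in this example explore.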


Abstract

The invention discloses a method for predicting RNA secondary structure. The method comprises: preprocessing a PDB data set and dividing it into an RNA primary sequence data set and an RNA secondary structure data set; encoding the RNA primary sequences in the primary sequence data set by computer, and inputting the encoded primary sequences as features into a machine learning model built on a supervised learning algorithm to obtain an objective function; using the RNA secondary structure data set as the output labels of the machine learning model, and training and testing the model; and finally predicting RNA secondary structure with the trained and tested model. By using a supervised learning algorithm, an artificial intelligence approach, to predict RNA secondary structure, the method greatly improves prediction efficiency and achieves relatively high prediction accuracy.
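
The training and testing step in the abstract implies holding out part of the paired data and scoring predictions against the known structures. The Python sketch below shows one way to do that with the standard library only. It is illustrative: the 20% test fraction, the fixed seed, the function names, and the representation of each structure as a per-position label string (dot-bracket notation is used in the demonstration) are assumptions, not details given in the patent.

```python
import random


def split_dataset(sequences, structures, test_fraction=0.2, seed=0):
    """Shuffle (sequence, structure) pairs and split them into train and test."""
    pairs = list(zip(sequences, structures))
    random.Random(seed).shuffle(pairs)  # seeded for a reproducible split
    n_test = max(1, int(len(pairs) * test_fraction))
    return pairs[n_test:], pairs[:n_test]


def overall_accuracy(predicted, reference):
    """Fraction of positions whose predicted label equals the reference label."""
    total = 0
    correct = 0
    for pred, ref in zip(predicted, reference):
        if len(pred) != len(ref):
            raise ValueError("prediction and reference lengths differ")
        total += len(ref)
        correct += sum(p == r for p, r in zip(pred, ref))
    return correct / total if total else 0.0


# Made-up labels for demonstration: 3 of 5 positions agree.
print(overall_accuracy(["(..)."], ["(...)"]))
```

Keeping each sequence together with its structure during the shuffle ensures that features and labels stay aligned after the split.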

Description

Technical field

[0001] The invention relates to the field of biological research, in particular to a method for predicting RNA secondary structure.

Background

[0002] Ribonucleic acid (RNA) is a macromolecule and an important substance in living organisms. It not only works together with deoxyribonucleic acid (DNA) and proteins to maintain the activities of organisms, but also plays an important role in DNA and protein synthesis. Studying RNA structure helps us understand the function of RNA molecules more comprehensively, which helps biological researchers explore the relationships among RNA, DNA, and proteins, and thereby understand how organisms function and how to understand and treat diseases.

[0003] The molecular structure of RNA consists of three parts: primary sequence, secondary structure, and tertiary spatial structure. The tertiary spatial structure of RNA is a stable structure formed i...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B15/20, G16B20/20, G16B40/20, G06K9/62
CPC: G16B40/20, G16B15/20, G16B20/20, G06F18/2411, G06F18/214
Inventors: 孙婷婷, 苏静杰
Owner: ZHEJIANG UNIVERSITY OF SCIENCE AND TECHNOLOGY