Circular RNA recognition method based on machine learning strategy

A machine-learning-based recognition method, applied in the field of data science, that addresses the low sensitivity and accuracy of existing circular RNA identification methods and reduces cost.

Active Publication Date: 2020-08-25
XI AN JIAOTONG UNIV
4 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

A variety of computational methods for identifying circular RNAs from RNA-seq data have been proposed, but they generally suffer from low sensitivity and low accuracy.


Examples

Embodiment Construction

[0045] Referring to Figure 1, the present invention provides a circular RNA identification method based on a machine learning strategy, comprising the following steps:

[0046] S1: Input data

[0047] The input consists of the SAM file generated from RNA-seq and a gene annotation FASTA file; the required input formats are therefore SAM and FASTA. An existing circular RNA identification algorithm is run to obtain the candidate circular RNA set and to determine the breakpoint positions of each candidate circular RNA;
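
As an illustration of reading the SAM input, here is a minimal sketch in Python. It relies only on the tab-delimited layout of SAM records (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, ...) and the 0x4 "unmapped" flag bit. The names `Alignment`, `parse_sam`, and `count_reads_near`, the window size, and the sample records are hypothetical and are not taken from the patent.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Alignment(NamedTuple):
    read_name: str  # QNAME
    flag: int       # FLAG (bitwise)
    chrom: str      # RNAME
    pos: int        # POS, 1-based leftmost mapping position
    cigar: str      # CIGAR


def parse_sam(lines: Iterable[str]) -> Iterator[Alignment]:
    """Yield mapped alignments from SAM text, skipping '@' header lines."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:  # bit 0x4 marks an unmapped read
            continue
        yield Alignment(fields[0], flag, fields[2], int(fields[3]), fields[5])


def count_reads_near(alignments: List[Alignment], chrom: str,
                     position: int, window: int = 50) -> int:
    """Count alignments starting within `window` bases of `position`.

    The window-based count is an assumed, illustrative feature only.
    """
    return sum(1 for a in alignments
               if a.chrom == chrom and abs(a.pos - position) <= window)


# Made-up records for demonstration.
SAM_TEXT = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n"
    "r2\t16\tchr1\t130\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n"
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n"
    "r4\t0\tchr1\t400\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n"
)

alignments = list(parse_sam(SAM_TEXT.splitlines()))
near_count = count_reads_near(alignments, "chr1", 120, window=30)
print(len(alignments), near_count)
```

For real data, a dedicated SAM/BAM library would probably be preferable to hand parsing. The sketch only shows which fields a read-level feature near a breakpoint might draw on.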

[0048] Running the existing circular RNA identification algorithm outputs a set of candidate circular RNAs. Using reference genome coordinates, the position of the 5' splice site of a circular RNA is defined as the left breakpoint, denoted brk1, and the position of the 3' splice site is defined as the right breakpoint, denoted brk2; together the two represent a candidate circular RNA. Sort the circular RNAs in the candidate circular RNA set acc...
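
The breakpoint-pair representation described above can be sketched as a small immutable record. This is only an illustration: the class name `Candidate`, the `chrom` field, the helper `collect_candidates`, and the sample calls are assumptions, not part of the patent text.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Candidate:
    """A candidate circular RNA identified by its two breakpoints."""
    chrom: str  # reference sequence name (assumed field)
    brk1: int   # left breakpoint: position of the 5' splice site
    brk2: int   # right breakpoint: position of the 3' splice site


def collect_candidates(calls: Iterable[Tuple[str, int, int]]) -> Dict[Candidate, int]:
    """Merge repeated calls; map each unique candidate to its call count."""
    return dict(Counter(Candidate(c, int(b1), int(b2)) for c, b1, b2 in calls))


# Made-up output of an upstream identification tool.
calls = [("chr1", 100, 500), ("chr1", 100, 500), ("chr2", 40, 90)]
candidates = collect_candidates(calls)
print(candidates)
```

Because the record is frozen (hashable), a (brk1, brk2) pair can serve directly as a dictionary key when features are later attached to each candidate.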

Abstract

The invention discloses a circular RNA recognition method based on a machine learning strategy. The method comprises the steps of: inputting data, locating each candidate circular RNA on a reference genome, and extracting read features near the circular RNA regions; training a supervised machine learning model on the extracted features; and classifying the candidate circular RNA set into true and false positives with the trained model, then outputting the final circular RNAs. The method is a machine learning filtering strategy, retains the advantages of such strategies, and can significantly save cost and time in clinical practice.
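
The train-then-filter flow in the abstract can be sketched as follows. The visible text does not name the supervised model or the features, so this sketch substitutes a plain logistic regression trained by gradient descent and two made-up features per candidate (hypothetically, a scaled junction-read count and a scaled mapping quality). All function names and data are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(model: Model, x: Sequence[float]) -> float:
    """Probability that a candidate with features x is a true positive."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(features: List[List[float]], labels: List[int],
                   lr: float = 0.5, epochs: int = 2000) -> Model:
    """Fit logistic regression with full-batch gradient descent."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = predict_proba((weights, bias), x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def filter_candidates(model: Model,
                      candidates: List[Tuple[str, List[float]]],
                      threshold: float = 0.5) -> List[str]:
    """Keep the names of candidates classified as true positives."""
    return [name for name, x in candidates
            if predict_proba(model, x) >= threshold]


# Made-up labelled training features: 1 = true positive, 0 = false positive.
train_X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7],
           [0.1, 0.2], [0.2, 0.1], [0.3, 0.2]]
train_y = [1, 1, 1, 0, 0, 0]
model = train_logistic(train_X, train_y)

unlabelled = [("circ_A", [0.85, 0.9]), ("circ_B", [0.15, 0.1])]
kept = filter_candidates(model, unlabelled)
print(kept)
```

Any supervised classifier could take the place of the logistic regression here. The point of the sketch is the order of operations: extract features, fit on labelled candidates, then threshold the predictions to filter the candidate set.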

Description

Technical field

[0001] The invention belongs to the technical field of data science, and in particular relates to a circular RNA identification method based on a machine learning strategy.

Background

[0002] Circular RNA (circRNA) is an important member of the non-coding RNA (ncRNA) family. It is a type of non-coding RNA molecule with a closed circular structure, lacking both a 5' cap and a 3' poly(A) tail. Its existence was discovered as early as the 1970s, but owing to the limits of technology and knowledge at the time, circular RNA was long regarded as a product of splicing errors or as transcriptional noise. In recent years, with the deepening of research and the development of sequencing technology, for the first time in 2012, RNA sequencing (English name: RNA sequencing, Eng...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B40/00, G16B30/00, G06N20/00
CPC: G16B40/00, G16B30/00, G06N20/00
Inventor: 张选平, 王一丹, 王嘉寅
Owner: XI AN JIAOTONG UNIV