Transcript assembly method based on reference genome

A reference-genome-based transcript assembly technology, applied in genomics, biochemical equipment and methods, proteomics, etc., that improves the accuracy of reconstructed transcripts.

Pending Publication Date: 2022-04-05
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

Many tools have been developed to combine transcripts from multiple RNA-seq samples, such as the merging methods of StringTie2 and TACO, but the reconstructed transcripts still have many deficiencies.

Examples

Embodiment 1

[0048] A transcript assembly method based on a reference genome comprises the following steps:

[0049] (1) Extract the mRNA of the sample to be tested, amplify it, obtain circularized cDNA, and sequence it; align the raw sequencing data with two alignment tools (HISAT2 and STAR) to obtain, for each tool, the alignment of the reads to the reference genome.

[0050] (2) Construct a splicing graph from the alignment results of each alignment tool:

[0051] Based on the alignment of the reads to the reference genome, cluster the reads into different gene loci; the exon-exon splice junctions are derived from the spliced reads;
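The clustering and junction extraction in [0051] can be illustrated with a minimal Python sketch. It is not the patent's implementation: the read representation (a chromosome name plus sorted, half-open aligned blocks), the overlap rule used to group reads into loci, and the function names `cluster_into_loci` and `splice_junctions` are all assumptions made for illustration, and strand is ignored. In practice the blocks would be parsed from the aligner's output.

```python
from collections import Counter

def splice_junctions(blocks):
    """Return the (donor_end, acceptor_start) gap between consecutive aligned blocks."""
    return [(blocks[i][1], blocks[i + 1][0]) for i in range(len(blocks) - 1)]

def cluster_into_loci(reads):
    """Group reads whose genomic spans overlap into loci (assumed rule).

    Each read is (chrom, blocks), where blocks is a sorted list of
    half-open (start, end) intervals; a spliced read has several blocks.
    """
    loci = []
    ordered = sorted(reads, key=lambda r: (r[0], r[1][0][0]))
    for chrom, blocks in ordered:
        start, end = blocks[0][0], blocks[-1][1]
        if loci and loci[-1]["chrom"] == chrom and start < loci[-1]["end"]:
            # Read overlaps the current locus: extend the locus.
            locus = loci[-1]
            locus["end"] = max(locus["end"], end)
        else:
            # No overlap: open a new locus.
            locus = {"chrom": chrom, "start": start, "end": end,
                     "reads": [], "junctions": Counter()}
            loci.append(locus)
        locus["reads"].append(blocks)
        # Count how many spliced reads support each junction.
        locus["junctions"].update(splice_junctions(blocks))
    return loci

# Hypothetical toy alignments: three overlapping reads and one distant read.
reads = [
    ("chr1", [(100, 200), (300, 400)]),
    ("chr1", [(150, 200), (300, 380)]),
    ("chr1", [(350, 400), (500, 560)]),
    ("chr1", [(1000, 1100)]),
]
loci = cluster_into_loci(reads)
```

Here the first three reads fall into one locus and the unspliced read at position 1000 forms a second locus with no junctions.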

[0052] Specifically, construct a traditional splicing graph G=(V,E) for each gene locus, where each node v corresponds to an exon and each edge e corresponds to the splice junction between two exons;

[0053] In addition, edges and nodes are weighted by the number of reads supporting them; edges or nodes with low weights that may b...
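The weighted splicing graph of [0052] and [0053] can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: exon boundaries are taken as given, a read is counted only if every aligned block lies inside a known exon, and the `prune` step with its `min_weight` threshold is a hypothetical filter, since the source text is truncated before it says how low-weight nodes and edges are handled.

```python
from collections import Counter

def exon_of(block, exons):
    """Return the index of the exon containing the block, or None."""
    for i, (start, end) in enumerate(exons):
        if start <= block[0] and block[1] <= end:
            return i
    return None

def build_splicing_graph(exons, reads):
    """Build read-support weights for one locus.

    exons: list of half-open (start, end) intervals, the nodes V.
    reads: list of reads, each a sorted list of aligned (start, end) blocks.
    Returns (node_weight, edge_weight) keyed by exon index / index pair.
    """
    node_weight = Counter()
    edge_weight = Counter()
    for blocks in reads:
        hit = [exon_of(b, exons) for b in blocks]
        if None in hit:
            continue  # assumed: skip reads that do not fit the exon model
        node_weight.update(set(hit))
        # Consecutive blocks of a spliced read support the edge between exons.
        edge_weight.update((u, v) for u, v in zip(hit, hit[1:]) if u != v)
    return node_weight, edge_weight

def prune(node_weight, edge_weight, min_weight):
    """Drop nodes and edges below min_weight (hypothetical filter)."""
    nodes = {v: w for v, w in node_weight.items() if w >= min_weight}
    edges = {e: w for e, w in edge_weight.items()
             if w >= min_weight and e[0] in nodes and e[1] in nodes}
    return nodes, edges

# Hypothetical locus with three exons and three spliced reads.
exons = [(100, 200), (300, 400), (500, 560)]
reads = [
    [(100, 200), (300, 400)],
    [(150, 200), (300, 380)],
    [(350, 400), (500, 560)],
]
node_weight, edge_weight = build_splicing_graph(exons, reads)
nodes, edges = prune(node_weight, edge_weight, min_weight=2)
```

With `min_weight=2`, the third exon and the edge leading to it are dropped because only one read supports them; the threshold value is arbitrary and chosen only to make the effect visible.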

Abstract

The invention discloses a transcript assembly method based on a reference genome, belonging to the field of transcript assembly in bioinformatics. The specific steps are as follows: (1) aligning the raw sequencing data obtained from the sequencer using at least two alignment tools; (2) constructing a splicing graph from the alignment result of each alignment tool; (3) merging the splicing graphs of the alignment tools to generate a labeled splicing graph; (4) extracting the paired-end paths marked in the labeled splicing graph; and (5) searching for a path cover in the labeled splicing graph, finally obtaining a path cover that represents the transcripts, such that the assembled transcripts cover all paired-end paths. The method combines the alignment results of different alignment tools, constructs the labeled splicing graph from them, and traverses the graph with a dynamic path extension algorithm to find the paths that represent transcripts. It thereby takes the characteristics and advantages of the different alignment tools into account and improves the accuracy of the reconstructed transcripts.
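Steps (3) to (5) can be illustrated with a rough Python sketch. The abstract does not define what the labels record or how the dynamic path extension algorithm works, so everything here is an assumption: each merged edge is labeled with the set of aligners that support it, edge weights are summed, and a simple greedy heaviest-edge extension stands in for the patent's path search. The names `merge_graphs`, `extend`, `contains`, and `cover_paired_end_paths` are hypothetical, and the graph is assumed acyclic (edges run from upstream to downstream exons).

```python
def merge_graphs(graphs):
    """Union the edges of several per-aligner splicing graphs.

    graphs: dict aligner name -> dict edge (u, v) -> read support.
    Returns dict edge -> {"weight": summed support, "labels": aligner names}.
    """
    merged = {}
    for name, edges in graphs.items():
        for edge, weight in edges.items():
            entry = merged.setdefault(edge, {"weight": 0, "labels": set()})
            entry["weight"] += weight
            entry["labels"].add(name)
    return merged

def extend(seed, merged):
    """Greedily extend a seed path both ways along the heaviest edge."""
    path = list(seed)
    while True:
        out = [(d["weight"], v) for (u, v), d in merged.items() if u == path[-1]]
        if not out:
            break
        path.append(max(out)[1])
    while True:
        inc = [(d["weight"], u) for (u, v), d in merged.items() if v == path[0]]
        if not inc:
            break
        path.insert(0, max(inc)[1])
    return path

def contains(path, sub):
    """True if sub occurs as a contiguous subpath of path."""
    n = len(sub)
    return any(path[i:i + n] == list(sub) for i in range(len(path) - n + 1))

def cover_paired_end_paths(pe_paths, merged):
    """Return transcript paths such that every paired-end path is covered."""
    transcripts = []
    for seed in sorted(pe_paths, key=len, reverse=True):
        if not any(contains(t, seed) for t in transcripts):
            transcripts.append(extend(seed, merged))
    return transcripts

# Hypothetical per-aligner edge weights over exon indices 0..3.
graphs = {
    "HISAT2": {(0, 1): 5, (1, 2): 4, (1, 3): 1},
    "STAR": {(0, 1): 4, (1, 3): 2, (2, 3): 3},
}
merged = merge_graphs(graphs)
transcripts = cover_paired_end_paths([(0, 1, 2), (1, 3)], merged)
```

In this toy case the edge between exons 1 and 2 is supported only by the hypothetical HISAT2 graph, and two transcript paths are needed because the paired-end path (1, 3) skips exon 2 and so is not a contiguous subpath of the first transcript.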

Description

Technical field

[0001] The invention relates to a transcript assembly method based on a reference genome, which combines the alignment results of different alignment tools, and belongs to the field of transcript assembly in bioinformatics.

Background

[0002] Transcriptome sequencing (RNA-seq), a powerful technique for transcriptome analysis, has been widely used worldwide. Especially in the past five years, the technology has moved from research into clinical applications, providing clues for the study of complex diseases (such as cancer) associated with aberrant splicing events or differential expression levels. It also provides the opportunity to observe the complexity of eukaryotic transcriptomes, identify expressed transcripts, and precisely quantify their expression abundance at the whole-transcriptome level. One of the critical steps in RNA-seq data analysis is the accurate assembly of large numbers of sequencing reads i...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/00; C12Q1/6869
Inventor: 赵晓宇, 于婷, 姚鸿彬, 李国君
Owner: SHANDONG UNIV