Third-generation full-length transcriptome sequencing result analysis method suitable for Sequel sequencing

A transcriptome analysis method based on full-length sequencing technology, applied in genomics, proteomics, instruments, and related fields. It addresses the slow running time and high computer resource consumption of earlier methods, and achieves faster running speed, easier analysis, and finer annotation.

Pending Publication Date: 2020-12-15
南京派森诺基因科技有限公司

AI Technical Summary

Problems solved by technology

A single sample usually yields millions of sequencing reads, and previous analysis methods suffer from high computer resource consumption and slow running time.



Embodiment Construction

[0057] A third-generation full-length transcriptome analysis method applicable to the Sequel sequencing platform, comprising the following steps:

[0058] Step 1, sequencing data filtering step:

[0059] Use PacBio's official isoseq3 pipeline to process the raw data:

[0060] Use the ccs program to process the raw subreads from the sequencer to obtain the circular consensus sequence (CCS) for each zero-mode waveguide (ZMW). As shown in Figure 1, the accuracy values of the PacBio CCS reads are distributed mainly around 0.99, indicating that the sequences are of very high quality after processing;

[0061] Use the lima program to identify the adapters in the consensus sequences to obtain the full-length (FL) sequences. As shown in Figure 2, full-length non-chimeric sequences containing poly(A) account for the vast majority, and the proportion of valid sequences in the result is relatively high;
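The two sub-steps above (ccs, then lima) can be sketched as a pair of command lines assembled in Python. This is a minimal sketch and not the patent's actual invocation: the file names, the `build_step1_commands` helper, and the choice of the `--isoseq` mode for lima are assumptions made for illustration, and the source does not state which options were used.

```python
import shlex
from pathlib import PurePosixPath


def build_step1_commands(subreads_bam, primers_fasta, out_dir):
    """Return the ccs and lima command lines for step 1 as argument lists.

    All file names are hypothetical placeholders; nothing is executed here.
    """
    out = PurePosixPath(out_dir)
    ccs_bam = str(out / "sample.ccs.bam")  # hypothetical output name
    fl_bam = str(out / "sample.fl.bam")    # hypothetical output name

    # ccs: subreads in, one consensus read per ZMW out.
    ccs_cmd = ["ccs", str(subreads_bam), ccs_bam]

    # lima: consensus reads + primer FASTA in, full-length reads out.
    # --isoseq (assumed here) selects lima's Iso-Seq primer-removal mode.
    lima_cmd = ["lima", "--isoseq", ccs_bam, str(primers_fasta), fl_bam]

    return [ccs_cmd, lima_cmd]


if __name__ == "__main__":
    for cmd in build_step1_commands("sample.subreads.bam", "primers.fasta", "out"):
        print(shlex.join(cmd))
```

Returning argument lists rather than shell strings keeps the sketch easy to inspect and lets a caller pass each list to `subprocess.run` unchanged once the real paths and options are known.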

[00...


Abstract

The invention discloses a third-generation full-length transcriptome analysis method suitable for the Sequel sequencing platform, comprising the following steps: 1, a sequencing data filtering step; 2, a sequencing data alignment step; 3, a transcript annotation step; 4, an ORF prediction step; 5, a transcript function annotation step; 6, a fusion gene analysis step; 7, a lncRNA prediction step; 8, an alternative splicing analysis step; and 9, an alternative polyadenylation analysis step. The method runs faster, annotates transcripts more finely than the commonly used MatchAnnot software, and makes it easier to analyze transcript types.
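As an illustration of what step 4 (ORF prediction) involves, the sketch below scans the three forward reading frames of a transcript for the longest stretch that runs from an ATG start codon to an in-frame stop codon. The text available here does not say which ORF tool or criteria the method uses, so the `longest_orf` function and its forward-strand-only, ATG-to-stop rule are simplifying assumptions, not the patented procedure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, frame) of the longest forward-strand ORF, or None.

    start/end are 0-based, end-exclusive, and the ORF includes its stop codon.
    Only ATG starts and complete ORFs (with a stop codon) are considered.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end, frame)
                start = None  # look for the next ORF in this frame
    return best


if __name__ == "__main__":
    transcript = "CCATGAAATAGGG"  # toy sequence for illustration
    hit = longest_orf(transcript)
    if hit is not None:
        start, end, frame = hit
        print(frame, transcript[start:end])
```

On the toy sequence, the only complete ORF is `ATGAAATAG` in frame 2. A production tool would typically also handle the reverse strand and incomplete ORFs.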

Description

Technical field

[0001] The invention relates to the field of gene detection, in particular to a third-generation full-length transcriptome analysis method applicable to the Sequel sequencing platform.

Background

[0002] A transcriptome is the collection of all transcripts produced by a species or a particular cell type. Transcriptome research can examine gene function and gene structure at a global level, and reveal specific biological processes and the molecular mechanisms of disease occurrence. It has been widely used in basic research, clinical diagnosis, and drug development. Eukaryotic protein-coding genes have a poly(A) tail at the 3' end, so for eukaryotic organisms, after total RNA is extracted, the RNA can be reverse-transcribed into cDNA using a reverse transcription primer containing poly(T). Using the cDNA as a template, a full-length cDNA library is prepared, and the constructed library is sequenced with a Sequel sequencer.

[0003] The sequ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/30; G16B50/00
CPC: G16B20/30; G16B50/00
Inventor: 沈立姜丽荣孙子奎
Owner: 南京派森诺基因科技有限公司