Mutation sequence annotation method

A technology for annotating variant sequences, applied in the field of bioinformatics

Active Publication Date: 2020-09-11
Third Affiliated Hospital of Naval Medical University, Chinese People's Liberation Army (中国人民解放军海军军医大学第三附属医院) +1

AI Technical Summary

Problems solved by technology

[0008] HGVS (Human Genome Variation Society) has formulated the mutation naming rules recognized by the academic community (http://varnomen.hgvs.org/), but ANNOVAR does not use HGVS specification naming by default.

Examples

Embodiment

[0070] (1) Determine the variant sequence information (call variants)

[0071] (1.1) Obtain the variant sequence

[0072] Using probe capture technology, next-generation sequencing of human exons is performed to obtain the sequence to be analyzed. Using the variant analysis software GATK (https://gatk.broadinstitute.org/hc/en-us), the sequence to be analyzed is compared with the reference genome to obtain the variation information (call variants).

[0073] (1.2) Integrate reference sequence information

[0074] Obtain the hg19 reference genome sequence and the hg19 reference genome annotation file, which includes the gene name, transcript name, physical location, strand (positive or negative), information on each element (elements include UTR, Intron, and CDS), etc. The download address of the hg19 reference genome sequence is ftp://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz; the access address of the hg19 reference genome annotation file is: ftp://hgdownlo...
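The annotation fields listed in [0074] can be pictured as a small record per transcript. The following is a minimal Python sketch under stated assumptions: the class names `Element` and `Transcript`, the method `element_at`, the 1-based inclusive coordinate convention, and the toy coordinates are all hypothetical and are not taken from the patent or from the UCSC file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    kind: str   # "UTR", "Intron" or "CDS", the element types named in [0074]
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)


@dataclass
class Transcript:
    gene: str    # gene name
    name: str    # transcript name
    chrom: str   # chromosome of the physical location
    strand: str  # "+" or "-"
    elements: List[Element] = field(default_factory=list)

    def element_at(self, pos: int) -> Optional[Element]:
        """Return the element covering a genomic position, or None."""
        for element in self.elements:
            if element.start <= pos <= element.end:
                return element
        return None


# Toy transcript with made-up names and coordinates, for illustration only.
toy = Transcript(
    gene="GENE1",
    name="TX1",
    chrom="1",
    strand="+",
    elements=[
        Element("UTR", 100, 149),
        Element("CDS", 150, 299),
        Element("Intron", 300, 399),
        Element("CDS", 400, 499),
    ],
)
```

Looking up a position with `toy.element_at(pos)` is a linear scan, which is enough for illustration; a real annotation set would likely need an interval index.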

Example 1

[0130] Mutation site: A-to-G substitution at position 69511 of chromosome 1 (standardized variation information from step 1.3: 1:69511:69511:A:G)
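The standardized record from step 1.3 appears to be a colon-separated string. A minimal Python sketch of parsing it, assuming the field order is chromosome, start, end, reference allele, alternate allele (inferred from the two records shown in this document); the names `Variant` and `parse_variant` and the `kind` labels are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    start: int
    end: int
    ref: str
    alt: str

    @property
    def kind(self) -> str:
        # "-" marks an empty allele in the records shown in this document.
        if self.alt == "-":
            return "deletion"
        if self.ref == "-":
            return "insertion"
        if len(self.ref) == 1 and len(self.alt) == 1:
            return "SNV"
        return "complex"


def parse_variant(record: str) -> Variant:
    """Parse a 'chrom:start:end:ref:alt' record (assumed field order)."""
    fields = record.strip().split(":")
    if len(fields) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(fields)}")
    chrom, start, end, ref, alt = fields
    return Variant(chrom, int(start), int(end), ref, alt)


snv = parse_variant("1:69511:69511:A:G")
print(snv.kind)  # SNV
```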

[0131] Amino acid sequence variation annotation results:

[0132] Comparative example: OR4F5:NM_001005484:exon1:c.421A>G:p.T141A

[0133] Example: OR4F5:NM_001005484:exon1:c.421A>G:p.Thr141Ala

[0134] The annotation results are consistent, but the one-letter amino acid abbreviations used by ANNOVAR in the comparative example do not conform to the specification.
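The difference between the two protein-level strings is only the amino acid alphabet. A minimal Python sketch of the conversion for simple substitutions such as `p.T141A`, using the standard one-letter and three-letter amino acid codes; the function name `to_three_letter` is hypothetical, and frameshift or stop notations are deliberately not handled:

```python
import re

# Standard one-letter to three-letter codes for the 20 common amino acids.
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Matches simple substitutions only, e.g. "p.T141A".
_SUBSTITUTION = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def to_three_letter(protein_change: str) -> str:
    """Rewrite 'p.T141A' as 'p.Thr141Ala'; reject anything else."""
    match = _SUBSTITUTION.match(protein_change)
    if match is None:
        raise ValueError(f"not a simple substitution: {protein_change!r}")
    ref, pos, alt = match.groups()
    if ref not in AA3 or alt not in AA3:
        raise ValueError(f"unknown amino acid code in {protein_change!r}")
    return f"p.{AA3[ref]}{pos}{AA3[alt]}"


print(to_three_letter("p.T141A"))  # p.Thr141Ala
```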

[0135] Gene information:

[0136] Comparative example: Symbol: OR4F5

[0137] Example: Symbol: OR4F5, EntrezID: 79501

[0138] The annotation results are consistent, but ANNOVAR has no EntrezID information in the comparative example.

[0139] Functional region annotation: both are exonic; the results are consistent.

[0140] Variation type annotation: both are nonsynonymous SNV; the results are consistent.

Example 2

[0142] Variation site: deletion of G at position 70176769 of chromosome 9 (standardized variation information from step 1.3: 9:70176769:70176769:G:-)

[0143] Nucleic acid and amino acid sequence variation annotation results:

[0144] Comparative example: FOXD4L5:NM_001126334:exon1:c.1215delC:p.W406Gfs*21

[0145] Example: FOXD4L5:NM_001126334:exon1:c.1215_1215del:p.Trp406Glyfs

[0146] The annotation results are consistent, but the comparative example omits the start and stop positions of the deletion, whereas the example gives both (c.1215_1215del) and uses three-letter amino acid codes.
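The example's nucleic acid notation writes out both the first and the last deleted position. A minimal Python sketch that reproduces the string shown in [0145] from its parts; the helper names `format_cdna_deletion` and `join_annotation` are hypothetical, and the sketch mirrors the form printed in this example rather than asserting what any specification requires:

```python
def format_cdna_deletion(start: int, end: int) -> str:
    """Return a deletion in 'c.START_ENDdel' form, as printed in the example."""
    if start < 1 or end < start:
        raise ValueError("positions must satisfy 1 <= start <= end")
    return f"c.{start}_{end}del"


def join_annotation(gene: str, transcript: str, exon: str,
                    cdna: str, protein: str) -> str:
    """Join the fields with ':' in the order used by both annotation strings."""
    return ":".join([gene, transcript, exon, cdna, protein])


annotation = join_annotation(
    "FOXD4L5",
    "NM_001126334",
    "exon1",
    format_cdna_deletion(1215, 1215),
    "p.Trp406Glyfs",
)
print(annotation)  # FOXD4L5:NM_001126334:exon1:c.1215_1215del:p.Trp406Glyfs
```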

[0147] Functional region annotation: both are exonic; the results are consistent.

[0148] Variation type annotations:

[0149] Comparative example: frameshift deletion

[0150] Example: del_frameshift_stoploss

[0151] In the example, the stoploss information is successfully annotated.

[0152] Gene information:

[0153] Comparative example: Symbol: FOXD4L5

[0154] Example: Symbol: FOXD4L5, EntrezID: 653427

[0155] In the comparative example, ANNOVAR has no EntrezID information.

Abstract

The invention belongs to the technical field of biological information, and particularly relates to a mutation sequence annotation method, which comprises the following steps: (1) determination of mutation sequence information: obtaining a mutation sequence, integrating reference sequence information, and standardizing the mutation information; and (2) variation annotation, wherein the annotation result comprises the functional region, the variation type, the nucleic acid sequence variation and the amino acid sequence variation. The method realizes the existing functions of ANNOVAR, the industry gold standard, and overcomes its defects: it improves the distinction between splice-site and splice-region variation, the handling of CDS edge variation, and the annotation of frameshift and stoploss/stopgain variants; it uses a normative expression mode; and it adds the Entrez ID gene identifier. The method has good application value.

Description

Technical field

[0001] The invention belongs to the technical field of biological information, and in particular relates to a method for annotating variant sequences.

Background art

[0002] With the development of sequencing technology, sequencing throughput continues to increase and sequencing cost continues to decrease. Genome and transcriptome information has been obtained for more and more species. In specialized subfields, a growing number of studies focus on the variation among different breeds or populations of the same species, and even between individuals, in order to find the phenotypic differences caused by variation in individual genetic information against a large genetic background. This poses a challenge to the search for and annotation of variant sequences.

[0003] Taking humans as an example, ANNOVAR is the mainstream software for annotating mutations and is considered the gold standard in the industry. However, in actual use, the inv...

Application Information

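IPC(8): G16B30/00, G16B50/10
CPC: G16B30/00, G16B50/10
Inventor: Wen Wen (文文), Wang Hongyang (王红阳), Zhu Ying (朱赢), Chen Shuzhen (陈淑桢), He Huisi (何慧斯), Gao Yong (高勇), Wang Depeng (汪德鹏)
Owner: Third Affiliated Hospital of Naval Medical University, Chinese People's Liberation Army (中国人民解放军海军军医大学第三附属医院)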