Method for detecting expansion of short tandem repeat

A short tandem repeat detection method based on sequencing technology, applied in biochemical equipment and methods, microbial determination/inspection, etc. It addresses the problems of sensitivity loss, low sensitivity of sequence insertion detection, and high sequence error rate, and achieves the effect of improving specificity.

Active Publication Date: 2018-10-16
BEIJING GRANDOMICS BIOTECH +1

AI Technical Summary

Problems solved by technology

Compared with second-generation sequencing, third-generation sequencing has the advantages of longer read length and no GC bias. Its disadvantage is a higher sequence error rate (about 15%).
[0014] 2) Short tandem repeat expansion is a special type of sequence insertion. Existing third-generation sequencing tools for detecting genomic structural varia...

Examples

Embodiment

[0059] Next-generation sequencing data has a short read length (100-300 bp) but high sequencing accuracy (>99%), so it is suitable for detecting short tandem repeat expansions whose expanded length is smaller than the read length. In this example, HipSTR (Willems, T. et al. Genome-wide profiling of heritable and de novo STR variations. Nature Methods 14, 590 (2017)) was first used to determine the number of short tandem repeats in the NA12878 next-generation sequencing data (Illumina HiSeq 2500, 150 bp paired-end, 30X). From these results, 1000 regions with short tandem repeat expansions and 4000 regions without short tandem repeat expansions were randomly selected as the test data set. The test data set was then used to evaluate how well different detection methods detect short tandem repeat expansions in the third-generation sequencing data of NA12878 (PacBio Sequel sequencing platform, about 44X). The main workflow of this technical solution is shown in Figure 1:

[0060] 1. Build...
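
The construction of the test data set in [0059] (randomly selecting 1000 regions with expansions and 4000 without, then scoring each detection method against them) can be sketched as follows. This is only an illustrative sketch: the function names, the representation of a region as a (chromosome, start, end) tuple, and the fixed random seed are assumptions made here and are not taken from the patent.

```python
import random

def build_test_set(expanded, not_expanded, n_pos=1000, n_neg=4000, seed=0):
    """Randomly sample labelled regions (True = expansion, False = no expansion).

    `expanded` and `not_expanded` are hypothetical sets of
    (chrom, start, end) tuples derived from a short-read genotyper.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    pos = rng.sample(sorted(expanded), n_pos)
    neg = rng.sample(sorted(not_expanded), n_neg)
    truth = {region: True for region in pos}
    truth.update({region: False for region in neg})
    return truth

def sensitivity_specificity(truth, called):
    """Score one method; `called` is the set of regions it reports as expanded."""
    tp = sum(1 for r, label in truth.items() if label and r in called)
    fn = sum(1 for r, label in truth.items() if label and r not in called)
    tn = sum(1 for r, label in truth.items() if not label and r not in called)
    fp = sum(1 for r, label in truth.items() if not label and r in called)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec
```

Sensitivity here is the fraction of the expanded regions that a method recovers, and specificity is the fraction of the non-expanded regions that it correctly leaves uncalled.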

Abstract

The invention provides a method for detecting the expansion of short tandem repeats. The method comprises the following steps: first, performing sequence alignment; second, detecting short tandem repeats in third-generation sequencing data with RepeatHMM; third, detecting sequence insertions in short tandem repeat regions with inScan; fourth, calculating the intersection of the RepeatHMM detection results and the sequence insertion detection results for the short tandem repeat regions. By combining the sequence insertion results with the RepeatHMM short tandem repeat detection results, the method improves the specificity of detecting short tandem repeat expansion.
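
The fourth step, intersecting the two sets of detection results, can be illustrated with a minimal sketch. It assumes that both result sets have already been parsed into lists of (chromosome, start, end) tuples with half-open coordinates. That representation and the function names are hypothetical and do not reflect the actual output formats of RepeatHMM or inScan.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def intersect_calls(repeat_calls, insertion_calls):
    """Keep repeat-expansion calls that are also supported by an insertion call.

    repeat_calls:    regions reported as expanded by the repeat detector
    insertion_calls: insertion positions reported inside STR regions
    """
    return [
        region
        for region in repeat_calls
        if any(overlaps(region, ins) for ins in insertion_calls)
    ]

# Toy, made-up coordinates purely for illustration.
repeat_calls = [("chr4", 100, 160), ("chr9", 500, 530)]
insertion_calls = [("chr4", 120, 121), ("chrX", 500, 530)]
confirmed = intersect_calls(repeat_calls, insertion_calls)
```

Because a region is kept only when both detectors agree, a false positive from one detector alone is filtered out, which is how the combination can raise specificity.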

Description

Technical field

[0001] The invention belongs to the technical field of gene sequencing, and in particular relates to a method for detecting short tandem repeat (STR) expansion.

Background

[0002] Short tandem repeats are repetitive sequences formed by joining copies of a short nucleotide unit (unit length of at least 2 and at most 6 nucleotides) end to end in the DNA sequence. They can affect the expression and modification of genes and the corresponding physiological functions. When the number of repeat units increases, this is called short tandem repeat expansion.

[0003] Third-generation sequencing refers to sequencing technology that reads single DNA/RNA molecules. Current commercial third-generation sequencing technologies include PacBio's single-molecule real-time sequencing technology and Oxford Nanopore's nanopore sequencing technology. The average length of reads measured by PacBio's single-molecule real-time sequencing technology is ...
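
To make the definition in [0002] concrete, the sketch below scans a DNA string for tandem runs of a 2 to 6 base unit and compares copy numbers between a reference and a sample. It is a toy illustration under stated assumptions: the function names, the `min_copies` threshold, and the choice to skip single-base runs are made up here, and the approach is unsuitable for error-prone long reads, which is the problem the patent addresses with dedicated tools.

```python
import re

def find_strs(seq, min_copies=3):
    """Return (start, unit, copies) for tandem repeats with a 2-6 base unit."""
    # Lazy {2,6}? prefers the shortest unit, e.g. "CAG" rather than "CAGCAG".
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_copies - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip runs of a single base such as "AAAAAA"
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

def is_expanded(ref_copies, sample_copies):
    """An expansion means the sample carries more repeat units than the reference."""
    return sample_copies > ref_copies

print(find_strs("TTCAGCAGCAGCAGGA"))  # [(2, 'CAG', 4)]
```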

Application Information

IPC(8): C12Q1/6869
CPC: C12Q1/6869, C12Q2525/151
Inventors: 杨旗, 唐北沙, 梁帆, 江泓, 杨帆, 沈璐, 汪德鹏
Owner: BEIJING GRANDOMICS BIOTECH