Method for detecting and typing repeat number of short tandem repeat sequence

A method for detecting and typing STR repeat numbers, applied in the field of bioinformatics analysis. It addresses inaccurate STR typing in next-generation sequencing, such as missed or extra allele calls when the read counts of the two alleles at a heterozygous locus differ greatly (the ratio sometimes exceeds 1:10). It achieves high accuracy, strong objectivity, and high data utilization.

Active Publication Date: 2021-09-07
BEIJING MICROREAD GENE TECH

AI Technical Summary

Problems solved by technology

In next-generation sequencing, the amount of data varies with fragment length: generally, the longer the fragment, the fewer the reads. When a heterozygous STR locus is tested and the two alleles differ greatly in repeat number, their read counts can therefore differ greatly as well (the ratio sometimes exceeds 1:10). Under a fixed 40% threshold, the allele with the larger repeat number is then missed.
[0006] Thus, STR typing by next-generation sequencing still suffers from many inaccuracies. The present invention is proposed in view of this.
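The arithmetic behind the missed allele can be illustrated with a minimal sketch. It assumes the 40% threshold is applied to the ratio between an allele's read count and the count of the best-supported allele at the locus; the function name and the read counts are hypothetical and are not taken from the patent.

```python
def call_with_fixed_threshold(read_counts, threshold=0.40):
    """Keep alleles whose read count is at least `threshold` times the top count.

    `read_counts` maps repeat number -> number of supporting reads.
    Applying the threshold relative to the top allele is an assumption.
    """
    top = max(read_counts.values())
    return sorted(a for a, n in read_counts.items() if n >= threshold * top)


# Hypothetical heterozygous locus: the long allele (25 repeats) gets
# one tenth of the reads of the short allele (10 repeats).
unbalanced = {10: 1000, 25: 100}
print(call_with_fixed_threshold(unbalanced))
```

With these counts the ratio is 100/1000 = 0.1, which is below 0.40, so only the 10-repeat allele is reported and the heterozygous locus is mistyped as homozygous.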


Examples


Embodiment 1

[0089] Embodiment 1: Construction of the method system of the present invention

[0090] The overall workflow of the next-generation sequencing STR analysis of the present invention is shown in Figure 1. First, the raw BCL files from the sequencer are demultiplexed into FASTQ files, and the reads are aligned to determine the STR locus to which each read belongs. For all reads aligned to a given locus, flanking sequence matching is performed; if the flanking sequences match, the repeat number calculation formula is used to calculate the number of repeats in the repeat region. The reads of each allele type at each locus are then counted, and finally the typing algorithm is applied to obtain the correct genotype. The key steps are the detection of the repeat region and the determination of the genotype, as follows.

[0091] 1. Construction of the search/alignment method for STR flanking sequences
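The counting stage of this workflow can be sketched as follows. This is a simplified illustration, not the patent's implementation: it uses exact flank matching, and because the repeat number calculation formula is not given in this excerpt, it assumes the repeat number is the length of the region between the flanks divided by the motif length, with a partial repeat written in the usual forensic style (for example 9.3). All function names and sequences are hypothetical.

```python
from collections import Counter


def repeat_allele(read, left_flank, right_flank, motif_len):
    """Return the allele label for one read, or None if a flank is missing."""
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)              # first base of the repeat region
    end = read.find(right_flank, start)   # first base of the right flank
    if end == -1:
        return None
    full, partial = divmod(end - start, motif_len)
    return f"{full}.{partial}" if partial else str(full)


def count_alleles(reads, left_flank, right_flank, motif_len):
    """Count the reads supporting each allele at one locus."""
    counts = Counter()
    for read in reads:
        allele = repeat_allele(read, left_flank, right_flank, motif_len)
        if allele is not None:
            counts[allele] += 1
    return counts


# Hypothetical locus with a 4-base motif.
LEFT, RIGHT = "ACGTAC", "TTGCAA"
reads = [
    "CC" + LEFT + "GATA" * 5 + RIGHT + "GG",
    "CC" + LEFT + "GATA" * 5 + RIGHT + "GG",
    LEFT + "GATA" * 9 + "ATA" + RIGHT,
    "ACGT",  # no flanks: ignored
]
print(count_alleles(reads, LEFT, RIGHT, motif_len=4))
```

The resulting per-allele read counts are the input to the typing step.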

[0092] When using flanking sequences to match reads, the flanking sequences may contain mismatches, insertions and del...
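Tolerating mismatches, insertions, and deletions when locating a flanking sequence can be illustrated with a standard approximate string search (edit-distance dynamic programming in which a match may start anywhere in the read). This is a generic textbook technique used here as a stand-in; the patent's own search procedure is truncated in this excerpt. The function name and the `max_edits` parameter are hypothetical.

```python
def find_flank(read, flank, max_edits):
    """Find the best approximate occurrence of `flank` in `read`.

    Returns (end_index, edits), where `end_index` is the position just past
    the match and `edits` counts mismatches, insertions, and deletions.
    Returns None if no occurrence has at most `max_edits` edits.
    """
    m = len(flank)
    # prev[i]: fewest edits to align flank[:i] with a substring of the read
    # ending at the previous read position (column 0: empty read prefix).
    prev = list(range(m + 1))
    best = None
    for j, base in enumerate(read, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start at any position
        for i in range(1, m + 1):
            cost = 0 if flank[i - 1] == base else 1
            cur[i] = min(prev[i - 1] + cost,  # match or mismatch
                         prev[i] + 1,         # extra base in the read
                         cur[i - 1] + 1)      # base missing from the read
        if cur[m] <= max_edits and (best is None or cur[m] < best[1]):
            best = (j, cur[m])
        prev = cur
    return best


print(find_flank("TTTACGTACGGG", "ACGTAC", max_edits=1))  # exact occurrence
print(find_flank("TTTACCTACGGG", "ACGTAC", max_edits=1))  # one mismatch
```

Running the same search for the left and right flanks gives the boundaries of the repeat region even when a flank contains a sequencing error.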

Embodiment 2

[0120] Embodiment 2: Sample detection test

[0121] After the algorithm was constructed and the optimal parameter set obtained, the method of the present invention was tested on both standard samples and real samples. Specifically, the standard positive samples were the 9948 and 9947 cell lines, and more than 70 groups of real samples were tested. The positive control for the real samples was the capillary electrophoresis (CE) result. The steps were: first, next-generation library construction and high-throughput sequencing of the STR loci; then, demultiplexing and sequence alignment of the sequencer output; finally, detection of repeat numbers and STR typing with the method proposed by the present invention, with the old method (direct typing by ACR value) as the control.
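A comparison against the CE control can be expressed as a per-locus concordance rate. The sketch below assumes a locus counts as concordant when the NGS genotype and the CE genotype contain the same alleles; the function name, locus names, and genotypes are made up for illustration and are not results from the patent.

```python
def concordance_rate(ngs_calls, ce_calls):
    """Fraction of shared loci where the NGS and CE genotypes agree.

    Both arguments map locus name -> iterable of allele labels.
    """
    shared = ngs_calls.keys() & ce_calls.keys()
    if not shared:
        raise ValueError("no loci in common")
    agree = sum(
        1 for locus in shared
        if sorted(ngs_calls[locus]) == sorted(ce_calls[locus])
    )
    return agree / len(shared)


# Made-up genotypes for three hypothetical loci.
ce = {"locus_A": ("10", "25"), "locus_B": ("9.3",), "locus_C": ("12", "13")}
ngs = {"locus_A": ("25", "10"), "locus_B": ("9.3",), "locus_C": ("12",)}
print(concordance_rate(ngs, ce))
```

In this made-up example two of the three loci agree; locus_C is discordant because the NGS call misses one allele.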

[0122] For the test results of standard samples:

[0123] Using the new method to detect STR typing at 66 loci of 9948 cell lines, the concordance rate b...


Abstract

The invention relates to the field of sequencing data analysis, and in particular provides a method, based on next-generation sequencing, for detecting and typing the repeat number of short tandem repeat sequences. In the detection method, the two flanking sequences can be aligned at the same time across the intervening repeat region, with mismatches, insertions, and deletions in the flanking sequences taken into account. The typing method is an STR typing method that, based on the detection of effective peaks, sets a dynamic threshold in combination with the repeat number difference. The method can be applied to repeat number detection on different next-generation sequencing platforms.
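The idea of a threshold that depends on the repeat number difference can be sketched as follows. The threshold formula and every parameter value here (`base`, `slope`, `floor`, `min_reads`) are hypothetical choices made only for illustration; this excerpt does not give the patent's actual formula or parameters. The sketch treats alleles with at least `min_reads` reads as effective peaks and lowers the required read ratio for a candidate allele as it gets longer than the best-supported allele.

```python
def call_genotype(read_counts, base=0.40, slope=0.03, floor=0.05, min_reads=10):
    """Call up to two alleles using a repeat-difference-dependent threshold.

    `read_counts` maps repeat number -> supporting reads. All default
    parameter values are illustrative assumptions.
    """
    # Effective peaks: alleles with enough supporting reads.
    peaks = {a: n for a, n in read_counts.items() if n >= min_reads}
    if not peaks:
        return []
    top = max(peaks, key=peaks.get)
    second = None
    for allele, n in peaks.items():
        if allele == top:
            continue
        diff = allele - top
        # Longer alleles yield fewer reads, so relax the threshold for them;
        # shorter candidates keep the base threshold.
        threshold = max(floor, base - slope * diff) if diff > 0 else base
        if n / peaks[top] >= threshold and (second is None or n > peaks[second]):
            second = allele
    return sorted([top] if second is None else [top, second])


print(call_genotype({10: 1000, 25: 100}))  # long allele, 1:10 read ratio
print(call_genotype({10: 1000, 11: 100}))  # adjacent allele, same ratio
```

With these illustrative parameters, the 25-repeat allele is kept despite its 1:10 read ratio because it is 15 repeats longer than the top allele, whereas an adjacent allele with the same ratio is not called.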

Description

Technical field

[0001] The invention belongs to the field of bioinformatics analysis, and in particular relates to a method for detecting and typing the repeat number of short tandem repeat sequences.

Background

[0002] A short tandem repeat (STR), also known as microsatellite DNA, is a DNA sequence formed by tandem repetition of a core unit of 2 to 6 bases. STRs have a high mutation rate, are highly polymorphic, and are easy to detect, so they are widely used in forensic testing. STR detection has been applied in the forensic field since 1985, usually by capillary electrophoresis with fluorescent labels. Specific primers are designed for different STR loci, amplification yields products of different lengths carrying different fluorescent labels, and the different STR loci are distinguished by capillary electrophoresis.

[0003] However, this capillary electrophoresis-based detection method ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/30, G16B30/10
CPC: G16B20/30, G16B30/10
Inventor: 李梦, 郭茂平, 胡欢, 陈初光
Owner: BEIJING MICROREAD GENE TECH