Method for constructing a genetic marker reference system for group differentiation and identification, and genetic marker reference system

A genetic marker and construction method technology, applied in fields such as special data processing applications, instruments, and electrical digital data processing.

Active Publication Date: 2019-09-06
BEIJING INST OF GENOMICS CHINESE ACAD OF SCI CHINA NAT CENT FOR BIOINFORMATION

AI Technical Summary

Problems solved by technology

However, such methods are currently lacking for forensic investigations



Examples


Embodiment 1

[0089] This embodiment illustrates how to use the method of the present invention to construct a reference set of 16 SNPs from 55,786,541 SNPs for distinguishing Africans, Europeans and Asians (Figure 2, Table 1).

[0090] Specific steps are as follows:

[0091] 1. Data segmentation

[0092] Based on the 55,786,541 SNPs of 108 Africans, 313 Europeans, and 993 Asians in the 1000 Genomes Project, the data were segmented according to the continental origin of the population, yielding two classes. The first class is {Africa, (Europe, Asia)} and the second class is {Europe, Asia}.
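The segmentation into two binary classes can be sketched as follows. This is a minimal illustration, not the patent's implementation: the label codes 'AFR', 'EUR' and 'ASN', the function name and the class names are assumptions introduced here, and only the sample counts come from the text.

```python
def segment_by_continent(labels):
    """Build two binary classification tasks from continental labels.

    labels: one label per individual, each 'AFR', 'EUR' or 'ASN'
            (hypothetical codes chosen for this sketch).
    Returns a dict mapping a class name to a list of (sample_index, binary_label).
    """
    # Class 1: Africa versus (Europe, Asia) pooled together.
    class1 = [(i, 0 if lab == "AFR" else 1) for i, lab in enumerate(labels)]
    # Class 2: Europe versus Asia, with African samples left out.
    class2 = [(i, 0 if lab == "EUR" else 1)
              for i, lab in enumerate(labels) if lab != "AFR"]
    return {"AFR_vs_EUR+ASN": class1, "EUR_vs_ASN": class2}


# Sample counts taken from the embodiment: 108, 313 and 993 individuals.
labels = ["AFR"] * 108 + ["EUR"] * 313 + ["ASN"] * 993
tasks = segment_by_continent(labels)
sizes = {name: len(samples) for name, samples in tasks.items()}
```

Each class is then filtered and searched independently, so the second class never sees the African samples.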

[0093] 2. Data filtering

[0094] Calculate the F_ST value of each SNP in each class, sort the SNPs in each class in descending order of F_ST, and keep the top 20,000 SNPs.
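The filtering step can be sketched as below. The excerpt does not say which F_ST estimator is used, so this sketch assumes a simple Wright-style estimate, (H_T - H_S) / H_T, for a biallelic SNP and two groups weighted by sample size. The function names and SNP identifiers are hypothetical.

```python
def fst_biallelic(p1, p2, n1, n2):
    """Wright-style F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    p1, p2: allele frequency of the same allele in group 1 and group 2.
    n1, n2: sample sizes, used as weights (an assumption of this sketch).
    """
    w1 = n1 / (n1 + n2)
    w2 = 1 - w1
    p_bar = w1 * p1 + w2 * p2
    h_t = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                          # monomorphic site carries no signal
    h_s = w1 * 2 * p1 * (1 - p1) + w2 * 2 * p2 * (1 - p2)  # within-group mean
    return (h_t - h_s) / h_t


def top_k_by_fst(freqs, n1, n2, k):
    """freqs: dict snp_id -> (p1, p2). Return the k ids with the highest F_ST."""
    ranked = sorted(freqs, key=lambda s: fst_biallelic(*freqs[s], n1, n2),
                    reverse=True)
    return ranked[:k]


# Toy frequencies; the embodiment would use k = 20000 on real data.
freqs = {"snp_a": (0.9, 0.1), "snp_b": (0.5, 0.5), "snp_c": (0.7, 0.4)}
kept = top_k_by_fst(freqs, 100, 100, 2)
```

With these toy frequencies the strongly differentiated `snp_a` ranks first and the undifferentiated `snp_b` is dropped.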

[0095] 3. SNP selection

[0096] A feature selection algorithm was used to select a subset of 100 SNPs in each class after data filterin...
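The excerpt is truncated before it names the feature selection algorithm, so the following is only one plausible shape for this step: a greedy forward selection that repeatedly adds the marker giving the best accuracy. The nearest-centroid classifier, the function names and the toy genotypes are all assumptions made for illustration.

```python
def centroid_accuracy(X, y, cols):
    """Training accuracy of a nearest-centroid classifier using only `cols`."""
    groups = {}
    for row, lab in zip(X, y):
        groups.setdefault(lab, []).append([row[c] for c in cols])
    centroids = {lab: [sum(v) / len(rows) for v in zip(*rows)]
                 for lab, rows in groups.items()}
    correct = 0
    for row, lab in zip(X, y):
        x = [row[c] for c in cols]
        pred = min(centroids,
                   key=lambda l: sum((a - b) ** 2
                                     for a, b in zip(x, centroids[l])))
        correct += pred == lab
    return correct / len(y)


def forward_select(X, y, k):
    """Greedily pick up to k marker columns, one per round."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        # Evaluate every remaining marker together with the current subset.
        best = max(remaining,
                   key=lambda c: centroid_accuracy(X, y, selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy genotypes coded 0/1/2; column 0 separates the two groups perfectly.
X = [[0, 1, 0], [0, 1, 0], [0, 1, 2], [2, 1, 2], [2, 1, 2], [2, 1, 0]]
y = [0, 0, 0, 1, 1, 1]
first = forward_select(X, y, 1)
```

Each round scores at most n candidate markers, so selecting k markers costs at most n times k evaluations instead of one evaluation per possible subset.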

Embodiment 2

[0104] In this embodiment, the method of the present invention (AIM-SNPtag) is used to select an SNP reference system from a data set of 178 SNPs. These 178 SNPs were publicly reported in the article "Li C-X, Pakstis AJ, Jiang L, Wei Y-L, Sun Q-F, Wu H, Bulbul O, Wang P, Kang L-L, Kidd JR, Kidd KK. A panel of 74 AISNPs: Improved ancestry inference within Eastern Asia. Forensic Science International: Genetics 23 (2016) 101-110."

[0105] This embodiment illustrates how to skip step (2), data filtering, and directly use steps (1), (3) and (4) (data segmentation, SNP selection and integration optimization) to construct an SNP reference system from a smaller SNP set.

[0106] Specific steps are as follows:

[0107] 1. Data segmentation

[0108] Based on the Africans (AFR), Europeans (EUR), South Asians (SA), East Asians (EA) and Southeast Asians (SEA) in the Thousand Genomes Project (1000Genomes Projec...

Embodiment 3

[0121] This example is used to illustrate how to use the method of the present invention to select a reference set containing 47 STRs from 670,646 STR loci for distinguishing Africans, Europeans and Asians. This embodiment only involves steps (1) to (3) of the method of the present invention, and does not involve step (4).

[0122] Specific steps are as follows:

[0123] 1. Data segmentation

[0124] Based on the 670,646 STRs of 108 Africans, 313 Europeans, and 993 Asians in the 1000 Genomes Project, the data are segmented according to the continental origin of the population, and the resulting class is {Africa, Europe, Asia}.

[0125] 2. Data filtering

[0126] First, STR sites with more than 10% missing data were filtered out; a total of 90,537 STR sites passed this filtering criterion. Then, calculate the F_ST value of the retained STRs, sort the STRs in each class in descending order accordingly, and keep the first 20,000 ST...
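The missing-data criterion can be sketched as follows. The data layout (a dict of per-locus genotype calls with `None` marking a missing call), the function names and the locus names are assumptions for illustration; only the 10% threshold comes from the text.

```python
def missing_rate(calls):
    """Fraction of missing calls at one locus; None marks a missing call."""
    return sum(c is None for c in calls) / len(calls)


def filter_by_missingness(genotypes, max_missing=0.10):
    """Keep loci whose missing rate does not exceed max_missing.

    genotypes: dict locus_id -> list of calls, one per individual.
    Loci with MORE than max_missing missing data are dropped.
    """
    return {locus: calls for locus, calls in genotypes.items()
            if missing_rate(calls) <= max_missing}


# Toy STR repeat counts for ten individuals at two hypothetical loci.
genotypes = {
    "str_1": [12, 13, None, 12, 14, 12, 13, 12, 12, 13],     # 10% missing
    "str_2": [9, None, None, 10, 9, 9, 10, 9, 9, 10],        # 20% missing
}
passed = filter_by_missingness(genotypes)
```

A locus at exactly 10% missing is kept here, because the text removes only sites with more than 10% missing data.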


Abstract

The invention belongs to the fields of molecular biology and genetics, and discloses a method for constructing a genetic marker reference system for group differentiation and identification, and a genetic marker reference system. The method comprises performing data segmentation on genetic marker data and performing genetic marker selection and, depending on the circumstances, filtering the segmented data or performing integration optimization on the selected genetic markers. The method reduces computational complexity from O(2^n) to O(n^2). Combined with simple pre-screening strategies, the method can process whole-genome data from thousands to tens of thousands of individuals and select a genetic marker reference system from them. In practical applications, the method can be used to select a reference system that reaches a preset accuracy (such as 95% or 99%) with a relatively small number of genetic markers, according to actual requirements. These characteristics have important application value in forensic examination and medical genetics research.
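To make the difference between exponential and quadratic cost concrete, the sketch below counts candidate evaluations for n markers. It assumes the quadratic procedure is a greedy forward selection, which this abstract does not state. The function names are hypothetical.

```python
def exhaustive_subsets(n):
    """Number of non-empty marker subsets an exhaustive search must score."""
    return 2 ** n - 1


def greedy_evaluations(n):
    """Evaluations for greedy forward selection run until all n markers are
    chosen: n candidates in round 1, n - 1 in round 2, and so on."""
    return n * (n + 1) // 2


n = 20
exhaustive = exhaustive_subsets(n)   # grows as 2^n
greedy = greedy_evaluations(n)       # grows as n^2
```

Even at n = 20 the exhaustive count exceeds one million while the greedy count is 210, which is why an exhaustive search is impractical for the 20,000 markers that remain after filtering.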

Description

Technical field

[0001] The invention belongs to the fields of molecular biology and genetics, and in particular relates to a method for constructing a genetic marker reference system for group differentiation and identification and a genetic marker reference system.

Background technique

[0002] One of the important tasks of forensic analysis is to clarify the group origin and group source of individuals, so as to effectively narrow the scope of investigation. Over the past few decades, although many group-specific genetic markers have been continuously developed, only a few have been used in actual forensic testing. In recent years, with the rapid development of genotype analysis technology and sequencing technology, a large amount of genetic data has emerged, providing an opportunity to fully explore the application potential of molecular genetic markers. In fact, polymorphic genetic markers have been successfully used for the prediction of physical characteristics and th...


Application Information

IPC(8): G16B40/00, G16B50/00
CPC: G16B40/00, G16B50/00, Y02A90/10
Inventors: 陈华, 赵石磊, 马亮, 石承民
Owner BEIJING INST OF GENOMICS CHINESE ACAD OF SCI CHINA NAT CENT FOR BIOINFORMATION