SNP marker combination for deducing crowds in different geographic areas of Asia

Applied in the biological field, this geographic-ancestry SNP marker technology addresses the large DNA sample requirement and high cost of whole-genome SNP analysis, which make such analysis hard to apply to genetic studies of large sample populations, and thereby meets the needs of forensic and medical genetic analysis.

Active Publication Date: 2020-03-17
BEIJING INST OF GENOMICS CHINESE ACAD OF SCI CHINA NAT CENT FOR BIOINFORMATION

AI Technical Summary

Problems solved by technology

[0006] However, the high cost of genome-wide SNP analysis and its large demand for DNA samples make it difficult to apply to forensic work and to genetic analysis of very large sample populations.



Examples


Embodiment 1

[0041] In this example, samples from six geographical populations (East Asia, West Asia, South Asia, North Asia, Central Asia, and the Southeast Asian islands) were selected from the 1000 Genomes Project (1000GP), EGDP, HGDP, SGDP, SSIP, and SSMP databases, and genome-level SNP data were obtained for a total of 2,276 samples. The populations and sample sizes included for each region are shown in Table 1. After integration and merging of the data from the different sources, site quality control, and individual screening, an original data set of 349,381 SNP sites from 2,128 unrelated individuals was formed for the subsequent construction of SNP combinations indicative of geographical ancestry.

[0042] 1000GP: The 1000 Genomes Project. A global reference for human genetic variation, Nature 526(7571) (2015) 68-74.

[0043] EGDP: Estonian Biocentre Human Genome Diversity Panel. Genomic analyses inform on migration events during the peopling of Eurasia, Natur...

Embodiment 2

[0059] Example 1 extracts SNP combinations that are useful for ancestry inference from a total of 349,381 SNPs. The algorithm weighs each SNP's own ancestry-inference ability against the information overlap between different SNPs in order to obtain the best combined inference performance. The screened SNPs are added one by one and the average classification accuracy (AAC) is calculated at each step, giving the curve shown in Figure 1. The classification accuracy (AC) is defined as the ratio of the number of correctly classified samples to the total number of test samples.

[0060] $$\mathrm{AC} = \frac{N_{\text{correct}}}{N_{\text{test}}}$$
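The selection step described in [0059], which weighs a SNP's own informativeness against its overlap with other SNPs, can be illustrated with a greedy relevance-minus-redundancy ranking. This is only a sketch: the excerpt does not say which algorithm the patent uses, so the mutual-information scoring, the function names `mutual_information` and `greedy_select`, and the 0/1/2 genotype coding are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def greedy_select(genotypes, labels, k):
    """Pick up to k SNP column indices, one at a time.

    Score of a candidate = MI(SNP, population label)
                         - mean MI(SNP, each already selected SNP).
    This scoring rule is a stand-in, not the patent's stated method.
    """
    n_snps = len(genotypes[0])
    columns = [[row[j] for row in genotypes] for j in range(n_snps)]
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    while len(selected) < min(k, n_snps):
        best, best_score = None, float("-inf")
        for j in range(n_snps):
            if j in selected:
                continue
            if selected:
                redundancy = sum(
                    mutual_information(columns[j], columns[s]) for s in selected
                ) / len(selected)
            else:
                redundancy = 0.0
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


# Toy data (hypothetical): column 1 duplicates column 0, column 2 is complementary.
toy_genotypes = [
    [0, 0, 0], [0, 0, 0],   # population A
    [2, 2, 0], [2, 2, 0],   # population B
    [2, 2, 2], [2, 2, 2],   # population C
]
toy_labels = ["A", "A", "B", "B", "C", "C"]
print(sorted(greedy_select(toy_genotypes, toy_labels, 2)))
```

On the toy data the duplicated column adds no new information once its twin is chosen, so the two selected columns are the complementary pair rather than the duplicate pair.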

[0061] The average classification accuracy (AAC) is defined as the mean of the AC values obtained over 1000 repetitions, each with a randomly selected test set.
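The AC and AAC definitions above can be written out as a short sketch. The classifier is not named in this excerpt, so a nearest-centroid rule is used purely as a stand-in; the function names, the `test_fraction` value, and the fixed random seed are likewise assumptions. Only the definition of AC and the 1000 random repetitions come from the text.

```python
import random


def classification_accuracy(true_labels, predicted_labels):
    """AC: number of correctly classified test samples / total test samples."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (assumption): assign each test sample to the
    population whose mean genotype vector is closest in squared distance."""
    groups = {}
    for x, y in zip(train_x, train_y):
        groups.setdefault(y, []).append(x)
    centroids = {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in groups.items()
    }

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda y: dist(x, centroids[y])) for x in test_x]


def average_classification_accuracy(samples, labels, n_repeats=1000,
                                    test_fraction=0.2, seed=0):
    """AAC: mean AC over n_repeats randomly drawn test sets.

    test_fraction is a hypothetical choice; the excerpt does not state it.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = max(1, int(len(samples) * test_fraction))
    total = 0.0
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        predicted = nearest_centroid_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        total += classification_accuracy(
            [labels[i] for i in test_idx], predicted
        )
    return total / n_repeats
```

Calling `average_classification_accuracy` once per prefix of the ranked SNP list (first 1 SNP, first 2 SNPs, and so on) would produce an accuracy curve of the kind the text attributes to Figure 1.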

[0062] In this example, three methods are used to evaluate the performance of the SNP reference system obtained in Examples 1-3. The first is to directly compare the true ancestry with the predicted ancestry; the second is to calcula...
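The first evaluation method, comparing true ancestry with predicted ancestry, amounts to tabulating (true, predicted) pairs. A minimal sketch follows; the function names and the sample labels are hypothetical, and the excerpt does not say how the patent reports this comparison.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true population, predicted population) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def per_population_accuracy(true_labels, predicted_labels):
    """Fraction of each true population's samples assigned to that population."""
    totals = Counter(true_labels)
    counts = confusion_counts(true_labels, predicted_labels)
    return {pop: counts[(pop, pop)] / n for pop, n in totals.items()}


# Hypothetical labels for illustration only.
true_anc = ["East Asia", "East Asia", "South Asia", "South Asia"]
pred_anc = ["East Asia", "South Asia", "South Asia", "South Asia"]
for pair, count in sorted(confusion_counts(true_anc, pred_anc).items()):
    print(pair, count)
```

The off-diagonal counts show which regions are confused with one another, which a single overall accuracy figure does not reveal.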



Abstract

The invention belongs to the field of biotechnology and discloses an SNP marker combination for inferring populations in different geographic areas of Asia. The specific information of the SNP molecular markers contained in the combination is shown in Table 1. The SNP marker combination provided by the invention can be used to distinguish and infer populations from East Asia, West Asia, South Asia, North Asia, Central Asia and the Southeast Asian islands, and SNP marker combinations of different sizes reach distinguishing accuracies of 90.04%, 95.05% and 96.19%, respectively.

Description

Technical field

[0001] The invention belongs to the field of biotechnology. In particular, it relates to a combination of SNP markers for inferring populations in different geographical regions of Asia (East Asia, West Asia, South Asia, North Asia, Central Asia and the Southeast Asian islands).

Background technique

[0002] Asia is the largest and most populous continent, accounting for about 30% of the earth's land area and 60% of the world's population. At the same time, Asia is also a region with many ethnic groups, diverse language families, and complex religions. Although different groups of people present the characteristics of transnational distribution, the long-term distribution of the same ethnic, language and religious groups in history has significant regional characteristics. This evolutionary feature of the population enables different human populations to have definite and clear geographical ancestry, which constitutes the population genetics basis for inferring...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): C12Q1/6888, C40B40/08, C12N15/11
CPC: C12Q1/6888, C40B40/08, C12Q2600/156, C12Q2600/124
Inventor: 陈华, 石承民, 赵石磊, 刘琪
Owner: BEIJING INST OF GENOMICS CHINESE ACAD OF SCI CHINA NAT CENT FOR BIOINFORMATION