SNP marker combinations for inferring major ethnic groups in northwestern China and adjoining Central Asian countries

An ethnic-group SNP marker technology, applied in the field of biotechnology, that addresses the problem that highly degraded forensic DNA supports the analysis of only a limited number of SNP marker sites

Active Publication Date: 2022-04-29
BEIJING INST OF GENOMICS CHINESE ACAD OF SCI CHINA NAT CENT FOR BIOINFORMATION

AI Technical Summary

Problems solved by technology

SNP molecular markers are the most abundant markers in the human genome and cover a full range of allele frequency patterns. Because genotyping them requires only short amplified fragments, they are suitable for the analysis of degraded DNA samples. However, most forensic samples yield only a small amount of highly degraded DNA, which is enough to analyze only a limited number of SNP marker sites.



Examples


Embodiment 1

[0039] This embodiment draws on data from two articles: "Characterizing private and shared signatures of positive selection in 37 Asian populations" by Xuanyao Liu, et al., published in European Journal of Human Genetics (2017), and "Genomic analysis of natural selection and phenotypic variation in high-altitude Mongolians" by Jinchuan Xing, et al., published in PLOS Genetics (2013). From these data, six geographical populations (Han, Tibetan, Kazakh, Kirgiz, Uyghur and Tajik) were selected and combined with self-collected samples, giving genome-wide SNP data for a total of 551 samples. The populations and sample sizes included for each region are shown in Table 2. After the data from the different sources were integrated and merged, and after site quality control and individual screening, a raw data set of 551 unrelated individuals, each with 150,793 SNP sites, was formed for the subsequent construction of ancestry-informative SNP marker combinations.

[0040] Table 2. Sample sources

[0041]

[0042] This exa...

Embodiment 2

[0051] Example 1 extracts SNP combinations that are informative for ancestry inference from a total of 150,793 SNPs. The algorithm weighs the ancestry inference ability of each SNP itself against the information overlap between different SNPs to obtain the best combined inference effect. The screened SNPs are added one by one and the average classification accuracy (AAC) is calculated at each step, giving the curve shown in Figure 1. The classification accuracy (AC) is defined as the ratio of the number of correctly classified samples to the total number of test samples.

[0052] AC = (number of correctly classified test samples) / (total number of test samples)

[0053] The average classification accuracy (AAC) is defined as the mean of the AC values obtained over 1000 repetitions, each with a randomly selected test set.
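The AC and AAC definitions above can be illustrated with a minimal Python sketch. It is not the patent's implementation. The 1000 repetitions and the ratio definition of AC come from the text. Everything else is an assumption made for illustration: the nearest-centroid classifier, the 20% test fraction, the 0/1/2 genotype coding, and the function names are all hypothetical.

```python
import random


def classification_accuracy(true_labels, predicted_labels):
    """AC: correctly classified test samples / total test samples."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def nearest_centroid_predict(train_x, train_y, test_x):
    """Placeholder classifier (an assumption, not the patent's method).

    Each population is summarized by its mean genotype vector, and a test
    sample is assigned to the population with the closest mean.
    """
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def closest(x):
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return [closest(x) for x in test_x]


def average_classification_accuracy(genotypes, labels, test_fraction=0.2,
                                    repeats=1000, seed=0):
    """AAC: mean AC over `repeats` randomly selected test sets.

    `test_fraction` and `seed` are illustrative assumptions; the text only
    states that the test set is randomly selected 1000 times.
    """
    rng = random.Random(seed)
    n = len(labels)
    n_test = max(1, int(round(n * test_fraction)))
    total = 0.0
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        test_idx, train_idx = order[:n_test], order[n_test:]
        predicted = nearest_centroid_predict(
            [genotypes[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [genotypes[i] for i in test_idx],
        )
        total += classification_accuracy([labels[i] for i in test_idx], predicted)
    return total / repeats
```

To trace a curve like the one described for Figure 1, `average_classification_accuracy` could be called once per panel size, each time passing only the genotype columns of the SNPs selected so far.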

[0054] In this example, three methods are used to evaluate the performance of the SNP reference system obtained in Example 1. The first way is to directly compare the real ancestry with the predicted ancestry; the second way is to calcul...


Abstract

The invention belongs to the field of biotechnology and specifically discloses combinations of SNP markers for inferring the major ethnic groups in Northwest China and adjacent Central Asian countries. The specific information of the SNP molecular markers included is shown in Table 1. The SNP marker combinations provided by the invention can distinguish and infer Uyghur, Kazakh, Kirgiz, Tajik, Tibetan and Han populations, and marker combinations of different sizes achieve discrimination accuracies above 90%, 95% and 99%, respectively.

Description

Technical field

[0001] The invention belongs to the field of biotechnology and specifically relates to SNP marker combinations for inferring the major ethnic groups (Uyghur, Kazakh, Kirgiz, Tajik, Tibetan and Han) in Northwest China and adjacent Central Asian countries.

Background technique

[0002] With the continuing deepening of political and economic cooperation between China and Central Asian countries and increasingly frequent economic and trade exchanges, the northwest region has gradually become an important inland window for China's opening to the outside world. However, cross-regional and cross-border population flows also bring, to a certain extent, major risks to social security and national security. Since the populations of Central Asian countries and the residents of Northwest China overlap to a certain degree in ethnicity, language and religious beliefs, and are highly similar to the domestic population in terms of physical characte...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): C12Q1/6888, C12N15/11
CPC: C12Q1/6888, C12Q2600/156, C12Q2600/166
Inventor: 陈华, 谭晓彤, 石承民, 赵石磊, 刘琪
Owner BEIJING INST OF GENOMICS CHINESE ACAD OF SCI CHINA NAT CENT FOR BIOINFORMATION