Efficient high-accuracy whole-genome selection method capable of performing parallel operation

A genome-wide selection technology, applied in genomics, proteomics, instruments, etc., that addresses the problems of a complex parameter-solving process, poor accuracy of breeding values, and complex model assumptions.

Active Publication Date: 2019-12-24
武汉影子基因科技有限公司

AI Technical Summary

Problems solved by technology

The direct method is computationally efficient, but because it makes simple assumptions about the genetic architecture of traits, the accuracy of its estimated breeding values is poor; the indirect method makes relatively complex assumptions about the genetic architecture of traits, which is more in line ...



Examples


Detailed Description of the Embodiments

[0036] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0037] As shown in Figure 1, the efficient, parallel-computable, high-accuracy genome-wide selection method of the present invention comprises the following steps:

[0038] Step 1: Read the original genotype file and the original phenotype file, extract the genotype data and phenotype data of the individuals that appear in both files to form a new genotype file and a new phenotype file, and use the new genotype file to calculate the kinship matrix G among all individuals.
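Step 1 can be illustrated with a minimal Python sketch. The source does not state how G is computed, so the formula below (centered genotype codes divided by 2Σp(1−p), a commonly used genomic relationship estimator) is an assumption, as are the function names `match_individuals` and `kinship_matrix`, the dictionary inputs, and the 0/1/2 genotype coding.

```python
def match_individuals(geno, pheno):
    """Keep individuals that appear in both inputs, in genotype order.

    geno:  dict of individual ID -> list of 0/1/2 allele counts (assumed coding)
    pheno: dict of individual ID -> phenotype value
    """
    ids = [i for i in geno if i in pheno]
    return ids, [geno[i] for i in ids], [pheno[i] for i in ids]


def kinship_matrix(genotypes):
    """Genomic relationship matrix G = Z Z' / (2 * sum(p * (1 - p))).

    Assumption: this estimator is a stand-in; the source only says that
    G is calculated from the new genotype file.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of each marker, estimated from the matched individuals.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic; G is undefined")
    # Center each genotype by twice the allele frequency of its marker.
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: individual "D" has no phenotype, "E" has no genotype.
geno = {"A": [0, 1, 2], "B": [2, 1, 0], "C": [1, 1, 1], "D": [0, 0, 2]}
pheno = {"A": 1.2, "B": 0.7, "C": 1.0, "E": 0.3}

ids, new_geno, new_pheno = match_individuals(geno, pheno)
G = kinship_matrix(new_geno)
```

Each entry of G is independent of the others, so the rows can be computed in parallel, which is consistent with the parallel-computation goal stated in the title.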

[0039] In this embodiment, the S4 data format of the R language is used to establish the data mapping between disk and memory; the genotype file and the phenotype file are read and stored in the big.matrix format of the bigmemory package from CRAN, and provi...


Abstract

The invention relates to the technical field of animal and plant breeding and human disease prediction, and provides an efficient, high-accuracy whole-genome selection method capable of parallel computation. The method comprises the following steps: first, reading an original genotype file and an original phenotype file, constructing a new genotype file and a new phenotype file, and calculating a genetic relationship matrix of all individuals; then, extracting all individuals in the new phenotype file as a reference group, and extracting all individuals without phenotype data in the original genotype file as a prediction group; carrying out a genome-wide association analysis using the reference group data, and extracting the result features of that analysis; constructing a trait-specific model library, sequentially optimizing the fixed effects and the random effects using a cross-validation strategy, and selecting an optimal prediction model from the model library; and finally, calculating the genomic estimated breeding values of the prediction group using the optimal prediction model. The method can quickly, accurately, and stably predict individual genomic breeding values, thereby improving the accuracy and efficiency of whole-genome selection.
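The group split and the cross-validated model choice described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the names `split_groups`, `k_folds`, `pearson`, and `select_model` are hypothetical, each model is represented as a plain callable, and scoring a model by the mean Pearson correlation between predicted and observed phenotypes is an assumed accuracy measure that the source does not specify.

```python
import math


def split_groups(geno_ids, pheno):
    """Reference group: individuals with a phenotype; prediction group: the rest."""
    reference = [i for i in geno_ids if pheno.get(i) is not None]
    prediction = [i for i in geno_ids if pheno.get(i) is None]
    return reference, prediction


def k_folds(ids, k):
    """Deterministic k-fold partition (every k-th individual per fold)."""
    return [ids[f::k] for f in range(k)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_model(models, ids, pheno, k=5):
    """Pick the model with the highest mean cross-validated accuracy.

    models: dict of name -> fit_predict(train_ids, test_ids) returning
            one prediction per test ID (hypothetical interface).
    """
    best_name, best_score = None, -math.inf
    for name, fit_predict in models.items():
        scores = []
        for fold in k_folds(ids, k):
            held_out = set(fold)
            train = [i for i in ids if i not in held_out]
            predicted = fit_predict(train, fold)
            observed = [pheno[i] for i in fold]
            scores.append(pearson(predicted, observed))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_name, best_score = name, mean_score
    return best_name, best_score
```

In use, `models` would hold one entry per candidate in the model library, and the selected model would then be refitted on the whole reference group before predicting the prediction group. The folds are independent of one another, so they are a natural unit for parallel execution.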

Description

Technical field

[0001] The invention relates to the technical fields of animal and plant breeding and human disease prediction, and in particular to an efficient, parallel-computable, high-accuracy genome-wide selection method.

Background

[0002] With the development of high-density single nucleotide polymorphism (SNP) genotyping technology covering the whole genome, genome-wide selection (prediction), a powerful tool for statistical analysis of the genome, has been widely used to predict and evaluate genetic values (breeding values) for complex traits in animal and plant breeding, and in human genetics research.

[0003] Existing genome-wide selection methods fall into two categories. One is the direct method, represented by genomic best linear unbiased prediction (GBLUP), which only needs to construct the genome-wide relationship matrix between individuals and obtain the variance components. ...


Application Information

IPC(8): G16B50/30, G16B20/20, G16B25/00
CPC: G16B20/20, G16B25/00, G16B50/30
Inventor: 赵书红, 尹立林, 刘小磊, 李新云, 余梅, 朱猛进, 唐振双, 许婧雅, 殷东
Owner 武汉影子基因科技有限公司