Method, device and server for filling missing genotype data

A technique for filling missing genotype data, applied in genomics, instrumentation, proteomics, etc., that addresses the low filling efficiency and high error rate of predicted gene filling values.

Active Publication Date: 2020-04-17
SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0004] In view of this, embodiments of the present invention provide a method, device and server for filling missing genotype data, so as to solve the problems of low filling efficiency and high error rate of predicted gene filling values.


Examples

Embodiment 1

[0047] Figure 1 is a schematic flow chart of the method for filling missing genotype data provided in the first embodiment of the present invention. This embodiment applies to scenarios in which missing gene values in genetic data are predicted and filled. The method can be executed by a processor in a device for filling missing genotype data; the device can be a server, a smart terminal, a tablet, a PC, or the like. In the embodiments of this application, the device for filling missing genotype data is described as the execution subject. The method specifically includes the following steps:

[0048] S110: Obtain gene data of several different individuals from the gene bank to generate several gene samples, each of which includes several gene values that are randomly covered;

[0049] In the actual sequencing of genetic samples, some genetic data in the samples will be missing in the process of processing the genetic samples t...
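
Step S110 can be illustrated with a minimal sketch. It is not the patent's implementation: the sentinel MISSING, the 0/1/2 genotype encoding, the mask_rate parameter and the function name mask_sample are all assumptions introduced here for illustration. The sketch covers a random subset of positions in one gene sample and records which positions were covered, so that the original values remain available for later comparison.

```python
import random

MISSING = -1  # hypothetical sentinel for a covered gene value


def mask_sample(genotypes, mask_rate, rng):
    """Return (masked_copy, covered_positions) for one gene sample."""
    # Cover at least one position so every sample has something to predict.
    n_mask = max(1, round(len(genotypes) * mask_rate))
    positions = sorted(rng.sample(range(len(genotypes)), n_mask))
    masked = list(genotypes)  # copy, so the original values are preserved
    for pos in positions:
        masked[pos] = MISSING
    return masked, positions


rng = random.Random(0)
# Toy genotype vector; 0/1/2 = alternate-allele count (assumed encoding).
sample = [0, 1, 2, 0, 0, 1, 2, 2, 1, 0]
masked, positions = mask_sample(sample, mask_rate=0.3, rng=rng)
```

Keeping the unmasked sample alongside the covered positions is what lets a later training step compare predicted values with the original ones.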

Embodiment 2

[0074] Figure 3 is a schematic flow chart of the method for filling missing genotype data provided in the second embodiment of the present invention. On the basis of the first embodiment, this embodiment further provides a process for optimizing both the parameters carried by the filling values and the parameters of the missing gene prediction model, thereby further improving the accuracy of the predicted gene values. The method specifically includes:

[0075] S210: Compute a gradient by backpropagation according to the original values of the complete gene sample and of the corresponding gene sample, and update the parameters of the missing gene prediction model using that gradient;

[0076] In the method for filling missing genotype data, the process of predicting and filling test gene samples with missing gene data has two parts. In the first part, for each missing position of the gene to be tested, based on the dynamic linka...
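
The gradient update in S210 can be sketched in miniature. The patent does not specify the model or the loss in the text available here, so this example substitutes a deliberately simple stand-in: a one-feature linear predictor pred = w * prefill + b trained with mean squared error. The names sgd_step, prefill, target and lr are hypothetical, and the toy numbers are illustrative only.

```python
def sgd_step(w, b, prefill, target, lr):
    """One gradient step for pred = w * prefill + b under mean squared error."""
    n = len(prefill)
    preds = [w * x + b for x in prefill]
    errors = [p - t for p, t in zip(preds, target)]
    # Analytic gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, prefill)) / n
    grad_b = 2.0 * sum(errors) / n
    loss = sum(e * e for e in errors) / n
    return w - lr * grad_w, b - lr * grad_b, loss


# Toy data: pre-filled values at covered positions and their original values.
prefill = [0.5, 1.5, 1.0, 0.2]
target = [0.0, 2.0, 1.0, 0.0]

w, b = 0.0, 0.0
losses = []
for _ in range(200):
    w, b, loss = sgd_step(w, b, prefill, target, lr=0.1)
    losses.append(loss)
```

The loss recorded at each step is computed before the update, so losses[0] is the loss of the untrained parameters. A real missing gene prediction model would have many more parameters, but the update rule has the same shape: compare predictions with the original values, differentiate, and step against the gradient.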

Embodiment 3

[0085] Figure 4 shows the device for filling missing genotype data provided in the third embodiment of the present invention. On the basis of Embodiment 1 or 2, an embodiment of the present invention also provides a detection device 4, which includes:

[0086] The gene sample generating module 401 is configured to obtain gene data of several different individuals from the gene bank to generate several gene samples, each of which includes several gene values that are randomly covered;

[0087] In an implementation example, where gene data of several different individuals is obtained from the gene bank to generate several gene samples, each including several randomly covered gene values, the gene sample generating module 401 includes:

[0088] The intercepting unit is used to obtain gene data of several different individuals from the gene bank, and to cut the gene data into several gene samples with the same leng...
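
The intercepting unit's job of cutting gene data into equal-length samples can be sketched as follows. The source text is truncated here, so the details are assumptions: this sketch uses non-overlapping windows and drops any trailing remainder shorter than the window, and the function name segment and its length parameter are hypothetical.

```python
def segment(genotypes, length):
    """Split one individual's genotype sequence into equal-length samples."""
    if length <= 0:
        raise ValueError("length must be positive")
    # Ignore a trailing remainder that cannot fill a whole window (assumption).
    usable = len(genotypes) - len(genotypes) % length
    return [genotypes[i:i + length] for i in range(0, usable, length)]


# Ten toy genotype values cut into windows of four; the last two are dropped.
windows = segment(list(range(10)), 4)
```

Other choices, such as overlapping windows or padding the final window, would also yield equal-length samples; the text available here does not say which the patent uses.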

Abstract

The invention belongs to the technical field of gene prediction and provides a method, device and server for filling missing genotype data. The method comprises: obtaining gene data of a plurality of different individuals from a gene bank and generating a plurality of gene samples, each gene sample comprising a plurality of gene values that are randomly covered; for each missing gene position in a gene sample, generating a filling value for pre-filling that position according to a dynamic linkage relationship between the missing gene position and the gene sample in which it is located, the filling value carrying a parameter corresponding to the dynamic linkage relationship; and inputting each pre-filled gene sample into a missing gene prediction model, predicting the gene value at each missing gene position according to the pre-filled filling value, and outputting a complete gene sample filled with the predicted gene values. The embodiments of the invention solve the problems of low filling efficiency and high error rate of predicted gene filling values.

Description

Technical field

[0001] The present invention relates to the technical field of gene prediction, and in particular to a method, device and server for filling missing genotype data.

Background technique

[0002] The loss of genetic data caused by SNP (single nucleotide polymorphism) chip sequencing poses great challenges for genome-wide association analysis. The loss of genotype data is divided into genetic loss and detectable loss. When analyzing missing genotypes, technical loss rather than artificial loss is generally discussed. Technical loss is mainly caused by whole-genome resequencing, simplified gene sequencing, exome sequencing, target region capture sequencing, SNP chips, and so on.

[0003] In the prior art, it is common to fit a parameter by a gene sequence with missing values, learn the overall characteristics of the missing data, and then...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/20
CPC: G16B20/20; Y02A90/10
Inventor: 殷力, 殷鹏
Owner SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI