Classification method for optimizing genome sequencing result

A classification method for optimizing genome sequencing results. It addresses the problems that existing filtering approaches cannot truly reflect the true/false classification of real data and perform poorly in practice, and it improves the precision, accuracy, and reliability of the results.

Status: Inactive
Publication date: 2019-09-06
JINAN UNIVERSITY +1

AI Technical Summary

Problems solved by technology

The VQSR algorithm was developed with reference only to the surface patterns of whole-genome sequencing data, and it was verified on simulated data, which cannot truly reflect the true/false classification of real data. As a result, its performance in practical applications is unsatisfactory, and its use is not recommended.


Examples

Detailed Description of the Embodiments

[0019] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

[0020] In this embodiment of the present invention, the flowchart of the classification method for optimizing genome sequencing results is shown in Figure 1. As shown in Figure 1, the classification method for optimizing genome sequencing results includes the following steps:

[0021] Step S01: Read the input polymorphism record text file: In this step, read the input poly...
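Step S01 can be illustrated with a minimal sketch. The source does not name the file format, so this example assumes a tab-separated, VCF-like layout in which lines beginning with `#` are headers. The function name `read_records` and the dictionary keys are hypothetical and are not taken from the patent.

```python
import io


def read_records(handle):
    """Yield one dict per data line of a VCF-like, tab-separated text stream.

    Assumption (not from the source): lines starting with '#' are header
    lines, and the first eight columns follow the VCF order
    CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO.
    """
    columns = ["chrom", "pos", "id", "ref", "alt", "qual", "filter", "info"]
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        record = dict(zip(columns, fields[:8]))
        record["pos"] = int(record["pos"])
        record["extra"] = fields[8:]  # FORMAT and per-sample columns, if any
        yield record


# Illustrative input only; the values are made up for this sketch.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n"
    "chr1\t100\t.\tA\tG\t50\t.\tDP=30\tGT\t0/1\n"
)
records = list(read_records(example))
```

Reading lazily with a generator keeps memory use flat for large whole-genome files, which is one reasonable design choice among several.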

Abstract

The invention discloses a classification method for optimizing genome sequencing results. The method comprises the following steps: A) reading an input polymorphism record text file; B) classifying the contents of the polymorphism record text file according to their annotations to obtain the corresponding classification information, which comprises homozygous single nucleotide polymorphisms, homozygous insertion/deletion variants, heterozygous single nucleotide polymorphisms, and heterozygous insertion/deletion variants; C) filtering each class of classification information with its own filtering indexes, removing the content that does not meet the filtering standard, to obtain filtered results; and D) aggregating the filtered results and outputting them. The method has the beneficial effect of improving the precision of whole-genome sequencing results.
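Steps B) to D) of the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the four classes are read here as homozygous or heterozygous SNPs and indels, each record is assumed to be a biallelic, non-reference call carrying a genotype string such as `0/1`, and the per-class quality thresholds in `MIN_QUAL` are placeholders, because the source does not list the actual filtering indexes. All names are hypothetical.

```python
from collections import defaultdict


def classify(ref, alt, genotype):
    """Return one of four class labels for a biallelic variant call.

    Assumption: a genotype with identical alleles (e.g. '1/1') is
    homozygous, anything else is heterozygous; a variant is a SNP when
    REF and ALT are both one base long, otherwise an indel.
    """
    alleles = genotype.replace("|", "/").split("/")
    zygosity = "hom" if len(set(alleles)) == 1 else "het"
    kind = "snp" if len(ref) == 1 and len(alt) == 1 else "indel"
    return f"{zygosity}_{kind}"


# Placeholder thresholds, one per class. The source states that different
# classes use different filtering indexes but does not give them here.
MIN_QUAL = {"hom_snp": 30.0, "hom_indel": 40.0, "het_snp": 50.0, "het_indel": 60.0}


def filter_and_merge(records):
    """Classify records, filter each class separately, then merge the results."""
    by_class = defaultdict(list)
    for rec in records:
        by_class[classify(rec["ref"], rec["alt"], rec["gt"])].append(rec)
    kept = []
    for label, group in by_class.items():
        kept.extend(r for r in group if r["qual"] >= MIN_QUAL[label])
    # Aggregate into a single list ordered by position for output.
    return sorted(kept, key=lambda r: (r["chrom"], r["pos"]))


# Illustrative records only; the values are made up for this sketch.
calls = [
    {"chrom": "chr1", "pos": 200, "ref": "A", "alt": "G", "gt": "0/1", "qual": 45.0},
    {"chrom": "chr1", "pos": 100, "ref": "A", "alt": "G", "gt": "1/1", "qual": 45.0},
    {"chrom": "chr1", "pos": 300, "ref": "AT", "alt": "A", "gt": "0/1", "qual": 70.0},
]
result = filter_and_merge(calls)
```

With these placeholder thresholds, the same quality score of 45 passes for the homozygous SNP but fails for the heterozygous SNP, which shows how per-class filtering differs from applying one rigid cutoff to every record.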

Description

Technical field

[0001] The invention relates to the field of information technology, and in particular to a classification method for optimizing genome sequencing results.

Background art

[0002] After whole-genome sequencing data has been run through an analysis pipeline, the result is a summary list of mutations. To improve the accuracy of this result, the mutations usually need to be screened so that false positives can be filtered out. The most commonly used filtering methods are the hard filter and the VQSR function of GATK (the Genome Analysis Toolkit, software used for second-generation resequencing data analysis). GATK's hard-filtering conditions are relatively rigid and do not distinguish between specific situations, so although hard filtering removes false positives, it also loses many true positives. The VQSR method is relatively comprehensive, but it runs very slowly, and its performance varies widely across data types. VQSR i...

Application Information

IPC(8): G16B30/00
CPC: G16B30/00
Inventors: 谭宇翔, 张宇, 尹芝南
Owner: JINAN UNIVERSITY