Method for calculating classification unit components of sequencing data

A sequencing-data analysis technology, applied in the field of bioinformatics analysis, that addresses the high false-positive rate (low specificity) of species-level results and the resulting loss of accuracy in pathogen identification.

Active Publication Date: 2020-08-28
SIMCERE DIAGNOSTICS CO LTD +2

AI Technical Summary

Problems solved by technology

[0015] However, the conventional method for analyzing sequencing data described above suffers from a high rate of false-positive species results (low specificity), which greatly impairs the accuracy of pathogen identification.
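To make the link between false positives and specificity concrete, the following is a minimal sketch using the standard definition, specificity = TN / (TN + FP). The function name and the counts in the usage lines are illustrative assumptions and are not taken from the patent.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Standard specificity: the share of truly absent taxa that are reported as absent."""
    total_negatives = true_negatives + false_positives
    if total_negatives == 0:
        raise ValueError("specificity is undefined when there are no true-negative cases")
    return true_negatives / total_negatives


# Hypothetical counts, for illustration only: with the same number of truly
# absent taxa, more false-positive calls mean lower specificity.
few_false_positives = specificity(true_negatives=90, false_positives=10)
many_false_positives = specificity(true_negatives=60, false_positives=40)
```

Under these assumed counts, the second call yields the lower value, which is the effect the paragraph above describes.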

Examples

Embodiment 1

[0136] Embodiment 1: Design of the invention

[0137] The present invention does not address a) sample contamination introduced during sampling, library construction, and sequencing, or b) sample contamination introduced by barcode splitting errors. The former is contamination introduced by experimental operation; it can be investigated by establishing negative controls and other experimental measures during the procedure, and is outside the scope of the present invention. The latter can be addressed, on the one hand, by selecting a barcode system with better discriminating power and, on the other hand, by quantitative positive-control experiments that yield an empirical value for the erroneous introduction ratio, which is then used for abundance screening to remove this kind of false positive. Both of these measures are likewise outside the scope of the present invention.

[0138] 1. During sequence alignment, a seq...

Embodiment 2

[0177] Embodiment 2: Clinical experimental verification

[0178] The present invention collected 114 urine samples from patients with urinary infections. Microbial culture and PCR detection were performed on each sample, and the presence of a given taxonomic unit in a sample was judged from the combined results of culture and PCR. Of these, 36 samples were used to calculate the subtaxon frequency threshold; the remaining 78 samples were used to evaluate the taxon results of the conventional bioinformatics analysis method and of the new method of the present invention, in order to illustrate the improvement of the new method over the conventional one. The results are as follows:

[0179] 1. The present invention takes 36 samples of urinary infection patients as a training set, and obtains the taxon identification results of the samples through a culture method. Taxon composition results for samples were obtained using conventi...

Abstract

The invention relates to a method for calculating classification unit components of sequencing data. The method is based on an index, the frequency of sub-classification units of sequencing reads, and a calculation framework for that index. The index measures the extent of misalignment among classification units in a sequence alignment result, so that false-positive results can be effectively removed from the component calculation of the classification units, improving the specificity and accuracy of that calculation. In addition, by re-counting after the abnormal classification units are removed, misaligned sequences are regressed to the true component result, and the quantitative abundance of the classification units is effectively corrected.
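The remove-then-recount idea in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: the per-taxon index is taken as a precomputed input because its definition is not given in this excerpt, the rule that a taxon is kept when its index is at or above the threshold is an assumed direction, and the function name, parameter names, and sample values are all hypothetical. The sketch also only drops flagged taxa and renormalizes; it does not reassign their reads.

```python
from typing import Dict


def recount_abundance(
    read_counts: Dict[str, int],
    subtaxon_frequency: Dict[str, float],
    threshold: float,
) -> Dict[str, float]:
    """Drop taxa flagged as abnormal, then recompute relative abundance.

    read_counts: reads assigned to each taxon by the alignment step.
    subtaxon_frequency: hypothetical precomputed index per taxon.
    threshold: taxa with an index below this value are treated as abnormal
               (assumed direction; the excerpt does not state it).
    """
    # Keep only taxa whose index meets the threshold; a missing index counts as 0.
    kept = {
        taxon: count
        for taxon, count in read_counts.items()
        if subtaxon_frequency.get(taxon, 0.0) >= threshold
    }
    total = sum(kept.values())
    if total == 0:
        return {}
    # Re-count: abundances are normalized over the retained taxa only.
    return {taxon: count / total for taxon, count in kept.items()}


# Illustrative values only.
counts = {"taxon_A": 80, "taxon_B": 15, "taxon_C": 5}
index = {"taxon_A": 0.9, "taxon_B": 0.1, "taxon_C": 0.8}
corrected = recount_abundance(counts, index, threshold=0.5)
```

With these assumed inputs, taxon_B is removed and the abundances of taxon_A and taxon_C are recomputed over the remaining 85 reads, so the quantitative result changes as well as the list of reported taxa.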

Description

Technical field

[0001] The invention relates to the field of bioinformatics analysis, in particular to a method for calculating the taxon components of sequencing data.

[0002] Technical background

[0003] Infectious diseases are a class of diseases caused by pathogenic microorganisms. There are many types of infection sources and many patients, and these diseases have a major impact on public health in countries around the world. According to the World Health Organization, in 2016, for example, lower respiratory tract infections alone caused about 3 million deaths worldwide. At the same time, the abuse of antibiotics caused by the blind treatment of infectious diseases is becoming more and more serious. Accurate detection of infectious pathogens is the most important part of solving these problems.

[0004] The traditional method for detecting pathogens of infectious diseases is microbial culture, but culture has the disadvantages of long detection time and low sensitivity. The...

Application Information

IPC(8): G16B30/10, G16B30/20, G16B40/00
CPC: G16B30/10, G16B30/20, G16B40/00
Inventor: 梁忱, 胡龙, 吴苏生, 杨帆, 肖念清, 任用
Owner: SIMCERE DIAGNOSTICS CO LTD