Microbial data processing method for high-throughput sequencing

A data processing technology for sequencing data, applied in the field of high-throughput sequencing quality control. It addresses the high false positive rate of read assignment and the failure to consider potentially valuable information, and it achieves more accurate gene expression results, a concentrated distribution of reads and contigs, and high clustering purity.

Active Publication Date: 2019-01-25
EZHOU INST OF IND TECH HUAZHONG UNIV OF SCI & TECH +1

AI Technical Summary

Problems solved by technology

However, the false positive rate of read assignment remains high, and existing methods do not take into account potentially valuable information, such as the abundance correlation of certain target species across multiple samples (with similar contaminants).
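
As an illustration of what an abundance correlation across multiple samples might look like, the following is a minimal sketch that computes the Pearson correlation between per-sample abundance profiles. It is not the patented procedure: the function name `pearson`, the contig names, and all abundance values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical coverage of three contigs across four samples (assumed values).
abundance = {
    "contig_A": [10.0, 20.0, 30.0, 40.0],
    "contig_B": [1.0, 2.0, 3.0, 4.0],      # same trend as contig_A
    "contig_C": [40.0, 30.0, 20.0, 10.0],  # opposite trend to contig_A
}

r_ab = pearson(abundance["contig_A"], abundance["contig_B"])
r_ac = pearson(abundance["contig_A"], abundance["contig_C"])
```

Under these assumptions, sequences whose abundances rise and fall together across samples would score near 1, which is one way such co-variation could be used as evidence that they share an origin.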

Examples

Embodiment Construction

[0031] In the following, the high-throughput sequencing microbial data processing method provided by the present invention is described fully and in further detail in conjunction with examples. The embodiments described below are exemplary; they serve only to explain the present invention and should not be construed as limiting it.

[0032] The experimental methods in the following examples are conventional methods unless otherwise specified. Unless otherwise specified, the experimental materials used in the following examples are all commercially available.

[0033] In this embodiment, quality control is demonstrated on the high-throughput sequencing results of microorganisms in human saliva samples. The specific operation steps are as follows:

[0034] 1. Simulated and real data sets

[0035] 1. Information about simulated and real metagenomic data sets.

[0036] In this embodiment, three types of metagenomic data sets are selected: simulated...

Abstract

The invention discloses a microbial data processing method for high-throughput sequencing. In the method, contig assembly and binning are performed on the microbial 16S rRNA reads from high-throughput sequencing; the microbial contigs are marked using q-PCR so that they comprise marker genes; and biological contigs containing the marker gene are removed, yielding high-quality microbial metagenomic sequencing data. Sequence clustering and other methods are used to identify and remove sequences originating from contaminants, so that microbial metagenomic sequencing data of higher purity are obtained and gene expression results based on microbial metatranscriptomic sequencing data are more accurate. Taking microbial metagenomic sequencing data as its research object, the method improves the quality of such data using bioinformatics approaches.
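
The removal step described above can be pictured with a minimal sketch. It assumes each binned contig already carries a flag saying whether it contains the marker gene. The `Contig` type, its fields, the function `remove_marked_contigs`, and the sample data are all hypothetical, and the abstract does not state whether removal is applied per contig or per bin; this sketch filters per contig.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contig:
    name: str
    bin_id: int       # bin assigned during binning (assumed integer label)
    has_marker: bool  # hypothetical flag: contig contains the marker gene


def remove_marked_contigs(contigs):
    """Split contigs into those kept and those removed for carrying the marker."""
    kept = [c for c in contigs if not c.has_marker]
    removed = [c for c in contigs if c.has_marker]
    return kept, removed


# Assumed example input: four contigs in two bins, two of them flagged.
contigs = [
    Contig("contig_1", bin_id=0, has_marker=False),
    Contig("contig_2", bin_id=0, has_marker=True),
    Contig("contig_3", bin_id=1, has_marker=False),
    Contig("contig_4", bin_id=1, has_marker=True),
]

kept, removed = remove_marked_contigs(contigs)
kept_names = [c.name for c in kept]
removed_names = [c.name for c in removed]
```

Returning the removed contigs alongside the kept ones makes it possible to audit what was discarded, for example by counting removals per bin.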

Description

Technical field

[0001] The invention relates to a microbial data processing method for high-throughput sequencing, and belongs to the field of high-throughput sequencing quality control.

Background technique

[0002] Next-generation sequencing (NGS), also known as high-throughput sequencing, is characterized by high output and high resolution. It can read hundreds of thousands to millions of DNA molecules in parallel in a single run. While providing abundant genetic information, it also greatly reduces sequencing costs and shortens sequencing time. Because high-throughput sequencing generates a large amount of data and the processing is complicated, controlling sequencing quality and identifying and eliminating sources of contamination have become important research topics. Many factors affect sequencing quality. The most common are errors in operation. The main source of batch eff...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G16B30/00, G16B20/00
Inventor: 宁康, 奚望, 高岩, 成章昱, 陈超云, 韩毛振
Owner: EZHOU INST OF IND TECH HUAZHONG UNIV OF SCI & TECH