
Metagenome sequencing data processing system based on IIB type restriction enzyme characteristics and processing method thereof

A technology relating to restriction endonucleases and sequencing data, applied in the field of metagenomic sequencing data processing systems. It addresses the data loss caused by fragment selection, PCR amplification bias, inaccurate quantification, and the lack of a data processing method for microbial detection. Its aims are to reduce computing resources and running time, to achieve fast and accurate qualitative and quantitative analysis, and to improve detection rate and accuracy.

Active Publication Date: 2022-02-22
青岛欧易生物科技有限公司 +1
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] There are relatively few reports on analyzing microorganisms by simplified (reduced-representation) genome sequencing. In the studies reported so far, RAD, ddRAD, and GBS have been used for microbial identification, but all of these methods involve fragment selection, which causes two problems: 1. data are lost; 2. the fragments obtained after enzyme digestion differ in length, resulting in biased PCR amplification, uneven sequencing depth, and inaccurate quantification.
Simplified genome sequencing based on type IIB restriction endonucleases yields digested fragments of equal length (hereafter referred to as "tags"). It therefore avoids the fragment selection, amplification bias, and inaccurate quantification problems described above, and can detect microorganisms at low cost and high resolution. However, no data processing method for detecting microorganisms currently exists for metagenomic sequencing based on type IIB restriction endonucleases.

Examples

Embodiment 1

[0055] The standard MOCK-MSA1002 used by the US Human Microbiome Project (HMP) was taken as the test object (this standard mixes 20 bacteria in equal 16S proportions; the technical scheme is shown in Figures 1 and 2). A library was constructed with a type IIB restriction enzyme such as BcgI and then sequenced on the HiSeq X Ten SE50 platform.

[0056] 1) For the sequencing data of each sample, first perform data preprocessing: remove adapters, remove reads in which N bases make up more than 8% of the read, remove low-quality reads (reads in which bases with a quality value below Q30 account for more than 15% of all bases in the read), and remove reads that do not contain a BcgI restriction site. The reads that remain are the high-quality reads.
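The filters in step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, adapter removal is assumed to have been done already, and the BcgI recognition sequence is taken to be CGA(N6)TGC as commonly documented by enzyme suppliers (an assumption, since the text above does not state it).

```python
import re

# Assumption: BcgI recognizes CGA(N6)TGC; the second alternative is its
# reverse complement, so sites on either strand are found.
BCGI_SITE = re.compile(r"CGA[ACGT]{6}TGC|GCA[ACGT]{6}TCG")


def decode_phred33(qual_string):
    """Convert a FASTQ quality string (Phred+33 encoding) to integer scores."""
    return [ord(c) - 33 for c in qual_string]


def passes_qc(seq, quals, max_n_frac=0.08, min_q=30, max_lowq_frac=0.15,
              site=BCGI_SITE):
    """Return True if a read survives the three filters of step 1.

    seq   -- read bases as a string (adapters already trimmed)
    quals -- per-base Phred scores as a list of ints
    """
    if not seq or len(seq) != len(quals):
        return False
    seq = seq.upper()
    n = len(seq)
    # Filter 1: more than 8% N bases.
    if seq.count("N") / n > max_n_frac:
        return False
    # Filter 2: more than 15% of bases below Q30.
    low = sum(1 for q in quals if q < min_q)
    if low / n > max_lowq_frac:
        return False
    # Filter 3: the read must contain the restriction site.
    return site.search(seq) is not None
```

The thresholds are keyword arguments so that the same function can be reused with a different enzyme pattern or different cut-offs.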

[0057] 2) Download 180,412 microbial genomes, including bacteria, fungi, archaea, and viruses, from the NCBI RefSeq database.

[0058] 3) Use BcgI to electronically digest 180...

Embodiment 2

[0067] Human fecal samples were used as experimental material (the technical scheme is shown in Figures 1 and 2). Genomic DNA from 5 human fecal samples was digested with a type IIB restriction enzyme such as BcgI to construct libraries, which were then sequenced on the Illumina Nova PE150 platform.

[0068] 1) For the sequencing data of each sample, first perform data preprocessing: merge the paired-end reads with FLASH, then remove adapters, remove reads in which N bases make up more than 8% of the read, remove low-quality reads (reads in which bases with a quality value below Q30 account for more than 15% of all bases in the read), and remove reads that do not contain a BcgI restriction site. The reads that remain are the high-quality reads.

[0069] 2) Download 180,412 microbial genomes, including bacteria, fungi, archaea, and viruses, from the NCBI RefSeq database.

[0070] 3) Use BcgI to electronically digest 180,412 microbial genomes, use a hash table to recor...
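Paragraph [0070] mentions in silico ("electronic") digestion and a hash table, but the text is cut off before it says what the table records. The sketch below shows one plausible arrangement under stated assumptions: the BcgI site is taken to be CGA(N6)TGC, each tag is the 12-bp site plus 10 flanking bases on each side (32 bp), and the hash table maps each tag to the set of genomes containing it so that genome-specific tags can be picked out. All function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: BcgI site CGA(N6)TGC or its reverse complement. The lookahead
# lets overlapping sites be reported as well.
SITE = re.compile(r"(?=(CGA[ACGT]{6}TGC|GCA[ACGT]{6}TCG))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def digest(seq, flank=10, site_len=12):
    """Return the set of strand-independent tags found in one sequence."""
    seq = seq.upper()
    tags = set()
    for m in SITE.finditer(seq):
        start = m.start() - flank
        end = m.start() + site_len + flank
        if start < 0 or end > len(seq):
            continue  # site too close to the sequence end for a full tag
        tag = seq[start:end]
        if set(tag) - set("ACGT"):
            continue  # skip tags with ambiguous bases in the flanks
        # Store the lexicographically smaller strand as the canonical form.
        tags.add(min(tag, revcomp(tag)))
    return tags


def build_index(genomes):
    """Hash table: tag -> set of genome IDs in which the tag occurs."""
    index = defaultdict(set)
    for genome_id, seq in genomes.items():
        for tag in digest(seq):
            index[tag].add(genome_id)
    return index


def unique_tags(index):
    """Tags that occur in exactly one genome, mapped to that genome ID."""
    return {tag: next(iter(ids)) for tag, ids in index.items() if len(ids) == 1}
```

For 180,412 genomes a real pipeline would stream sequences from disk and would probably group genomes by taxon before deciding which tags are unique; the dictionary-of-sets layout here only illustrates the idea.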

Embodiment 3

[0078] Human armpit samples were used as experimental material (the technical scheme is shown in Figures 1 and 2). The 5 human fecal genomic DNAs were digested separately with type IIB restriction enzymes such as BcgI and BsaXI to construct libraries, which were then sequenced on the Illumina HiSeq X Ten PE150 platform.

[0079] 1) For the sequencing data of each sample, first perform data preprocessing: merge the paired-end reads with FLASH, then remove adapters, remove reads in which N bases make up more than 8% of the read, remove low-quality reads (reads in which bases with a quality value below Q30 account for more than 15% of all bases in the read), and remove reads that do not contain BcgI and BsaXI restriction sites. The reads that remain are the high-quality reads.

[0080] 2) Download 180,412 microbial genomes, including bacteria, fungi, archaea, and viruses, from the NCBI RefSeq database.

[0081] 3) Use BcgI and BsaXI to electronically digest 180,...


Abstract

The invention discloses a metagenome sequencing data processing system based on the characteristics of type IIB (2B) restriction enzymes. The system comprises a data preprocessing module, a qualitative module, a quantitative module, and a module for merging the qualitative and quantitative results of multi-enzyme digestion. Using a two-step quantification method based on unique tags, the system analyzes and processes metagenome sequencing data that carry type IIB restriction enzyme characteristics. It offers high detection speed, low cost, a low false positive rate, high accuracy, and high resolution, and lays a foundation for applying the technology to microbiological detection. The invention further discloses a metagenome sequencing data processing method based on the characteristics of type IIB (2B) restriction enzymes. With the method and the system, microorganisms such as bacteria and fungi can be identified and their relative content obtained at the same time, at low cost and high resolution, which fills the current gap in simplified genome detection of microorganisms.
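The abstract names a two-step quantification based on unique tags (labels) but does not define it in the text available here. The sketch below is therefore not that method. It only illustrates one generic way to turn hits on taxon-specific tags into relative content, assuming (hypothetically) that each taxon's read count is divided by its number of unique reference tags and the results are normalized to sum to 1. The function and argument names are invented for illustration.

```python
from collections import Counter


def relative_abundance(read_tags, unique_tag_to_taxon, tags_per_taxon):
    """Estimate relative content from reads that hit taxon-specific tags.

    read_tags           -- iterable of canonical tags observed in the reads
    unique_tag_to_taxon -- dict: tag -> taxon, for tags specific to one taxon
    tags_per_taxon      -- dict: taxon -> number of unique tags in the reference
    """
    hits = Counter()
    for tag in read_tags:
        taxon = unique_tag_to_taxon.get(tag)
        if taxon is not None:  # reads on shared or unknown tags are ignored
            hits[taxon] += 1
    # Average depth per unique tag, so taxa with many tags are not favored.
    depth = {t: hits[t] / tags_per_taxon[t] for t in hits}
    total = sum(depth.values())
    if total == 0:
        return {}
    return {t: d / total for t, d in depth.items()}
```

Dividing by the number of unique tags is what makes equal-length tags convenient here: every tag is expected to be amplified and sequenced with similar efficiency, so per-tag depth is comparable across taxa.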

Description

Technical field

[0001] The invention belongs to the technical field of bioinformatics, and in particular relates to a metagenomic sequencing data processing system based on the characteristics of type IIB restriction endonucleases.

Background

[0002] Currently, there are two main high-throughput techniques in microbial diversity research: amplicon sequencing and whole-metagenome sequencing (WMS). Amplicon sequencing suffers from amplification bias and resolves only to the genus level, so it cannot effectively distinguish differences between species and strains. The resolution of metagenomic sequencing can reach the species or even strain level, but the sequencing cost is too high. Thus, simplified genome sequencing technologies enable large-scale studies within limited budgets.

[0003] There are few reports on the analysis of microorganisms by simplified genome sequencing. Currently, RAD, ddRAD, and GBS have been used to identify microorga...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/30, G16B40/00
CPC: G16B20/30, G16B40/00
Inventor: 孙政, 王师, 张荣超, 黄适, 周丽沙, 王修评
Owner: 青岛欧易生物科技有限公司