Microbial genome database construction method and application thereof

A method for constructing a microbial genome database and its application in microbial identification. The method addresses problems such as false positives and achieves good compatibility.

Active Publication Date: 2021-06-18
NANJING SIMCERE MEDICAL LAB CO LTD +2

AI Technical Summary

Problems solved by technology

Closely related species share a higher proportion of consensus sequences. When only one of the species is present in a sample, reads that align to those consensus sequences may lead to the false-positive conclusion that another species is present as well.

Examples

Embodiment 1

[0052] Embodiment 1: Establishing the method

[0053] 1. Microbial genome database construction method:

[0054] 1) Data acquisition: Obtain representative genome data for each microbial species. Each strain of a species may have multiple genome sequences. For example, when genome data are obtained from NCBI, the genome sequences whose RefSeq category is marked "reference genome" or "representative genome" are used as the genome sequences of the strains of that species; if no genome is marked "reference genome" or "representative genome", a genome marked "na" is selected instead.
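The selection rule in [0054] can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout (dicts with the hypothetical keys `accession` and `refseq_category`) and the function name `select_genomes` are not from the patent; only the three category labels are.

```python
# Category labels quoted in paragraph [0054].
PREFERRED = {"reference genome", "representative genome"}
FALLBACK = "na"


def select_genomes(records):
    """Pick accessions for one species according to the rule in [0054].

    `records` is a list of dicts with the hypothetical keys
    "accession" and "refseq_category" (an assumed layout).
    """
    # First choice: genomes marked "reference genome" or "representative genome".
    preferred = [r["accession"] for r in records
                 if r["refseq_category"] in PREFERRED]
    if preferred:
        return preferred
    # Otherwise fall back to genomes marked "na".
    return [r["accession"] for r in records
            if r["refseq_category"] == FALLBACK]
```

Returning every fallback genome, rather than a single one, is a choice of this sketch; the text does not say how a genome marked "na" is picked when several exist.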

[0055] 2) Plasmid sequence removal: To prevent plasmid sequences from affecting identification, remove the plasmid sequences present in the above genomes to obtain plasmid-free genome sequences.

[0056] 3) Identification of the consensus sequence set and the specific sequence set: the above-mentioned plasmid-removed genomes of each speci...

Embodiment 2

[0068] Embodiment 2: Database construction for Escherichia coli and Shigella

[0069] The following uses Escherichia coli and Shigella flexneri as examples to construct the database.

[0070] 1 Data Acquisition:

[0071] The microbial genome sequences were downloaded from NCBI: GCF_000008865.2 and GCF_003697165.2 for two strains of Escherichia coli, and GCF_000006925.2 and GCF_007197595.1 for two strains of Shigella.

[0072] 2 Remove the plasmid sequences: in each genome sequence file, remove every sequence whose name contains "plasmid".
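A minimal Python sketch of the name-based filter in [0072], assuming the genome is available as FASTA text and that a plasmid record is any record whose header line contains the word "plasmid" in any letter case. The function name `remove_plasmid_records` is hypothetical.

```python
def remove_plasmid_records(fasta_text):
    """Drop FASTA records whose header mentions 'plasmid' (case-insensitive).

    Assumes standard FASTA text: each record starts with a '>' header line
    followed by one or more sequence lines.
    """
    kept_lines = []
    keep_current = False  # lines before the first header are ignored
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Decide once per record, based on its header line.
            keep_current = "plasmid" not in line.lower()
        if keep_current:
            kept_lines.append(line)
    return "\n".join(kept_lines) + ("\n" if kept_lines else "")
```

For example, a file holding one chromosome record and one record named "... plasmid pO157" would come back with only the chromosome record.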

[0073] 3 Identification of consensus and specific sequence sets: Merge the two genome sequences of Escherichia coli, then use jellyfish to fragment them into 76 bp pieces with a step size of 1 bp (jellyfish also removes redundant sequences) to obtain sequence set 1. The genome sequences were merged, and then jellyfish was used to cut them according to the le...
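The fragmentation described in [0073] can be illustrated in plain Python. This sketch does not call jellyfish; it imitates the described behaviour (fixed-length windows, step of 1 bp, duplicates removed) with a set. It also assumes, since the source text is truncated, that the consensus set holds fragments shared by both species and the specific sets hold fragments found in only one. The function names are hypothetical, and reverse complements are not considered.

```python
def kmer_set(sequences, k=76, step=1):
    """Collect the distinct length-k windows of all sequences (step `step`)."""
    result = set()
    for seq in sequences:
        seq = seq.upper()
        # A sequence shorter than k yields no windows.
        for start in range(0, len(seq) - k + 1, step):
            result.add(seq[start:start + k])
    return result


def split_consensus_specific(set1, set2):
    """Return (consensus, specific to set1, specific to set2).

    Assumption: consensus = fragments present in both sets.
    """
    consensus = set1 & set2
    return consensus, set1 - consensus, set2 - consensus
```

With a toy window length of 3, `ACGTA` and `CGTAC` share `CGT` and `GTA`, leaving `ACG` specific to the first sequence and `TAC` specific to the second.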

Embodiment 3

[0089] Embodiment 3: Comparison with conventional library construction and screening methods

[0090] The database was constructed by the conventional method: the four downloaded E. coli and Shigella genome sequences were used and, after the plasmid sequences were removed, merged together as the microbial genome reference database for these four genomes.

[0091] For this database, a comparison library was built with the BLAST command makeblastdb. The 7 simulated datasets mentioned above were then aligned against this library with BLAST, and the alignment results were screened.
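As a rough sketch of the two BLAST steps in [0091], the Python function below only assembles the command lines; it does not run them. The source names `makeblastdb` but not the search program or any options, so the use of `blastn`, tabular output (`-outfmt 6`) and all file names here are assumptions.

```python
def blast_commands(reference_fasta, reads_fasta,
                   db_name="ref_db", out_path="hits.tsv"):
    """Build argument lists for a nucleotide BLAST database and search.

    All file names and the choice of blastn with tabular output are
    illustrative assumptions, not taken from the patent text.
    """
    make_db = ["makeblastdb", "-in", reference_fasta,
               "-dbtype", "nucl", "-out", db_name]
    search = ["blastn", "-query", reads_fasta, "-db", db_name,
              "-outfmt", "6", "-out", out_path]
    return make_db, search
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ is installed.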

[0092] In the first method, when all alignment results of a read point to the same species, that species is taken as the source of the read; if the alignment results of a single read do not all point to the same species, the read is discarded. The results are as follows:

[0093]
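The first screening rule, described in [0092], can be sketched in Python as follows. The input layout, an iterable of `(read_id, species)` pairs with one pair per alignment hit, and the function name `assign_reads_strict` are assumptions made for illustration.

```python
def assign_reads_strict(hits):
    """Assign a read to a species only if all of its hits agree.

    `hits` is an iterable of (read_id, species) pairs, one per alignment
    (an assumed layout). Reads with hits to more than one species are
    discarded, as described in [0092].
    """
    species_by_read = {}
    for read_id, species in hits:
        species_by_read.setdefault(read_id, set()).add(species)
    # Keep only reads whose hits all name a single species.
    return {read_id: next(iter(species))
            for read_id, species in species_by_read.items()
            if len(species) == 1}
```

A read that hits both Escherichia coli and Shigella flexneri is dropped, while a read hitting only one of them is kept and labelled with that species.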

[0094] In the second method, each read retains only the alignment result with the highe...

Abstract

The invention provides a method for constructing a microbial genome database and an application thereof. The construction method comprises the steps of fragmenting the genomes and labelling the fragments, labelling the consensus sequences and the specific sequences of multiple species, and constructing a comparison score matrix among the specific sequences of the species, so that the source of a sequence can be determined quickly and accurately.

Description

Technical field

[0001] The invention relates to the field of bioinformatics, in particular to a method for constructing a microbial genome database and its application in microbial identification.

Background

[0002] Metagenomic next-generation sequencing (mNGS) does not rely on traditional microbial culture. It directly performs high-throughput sequencing of the nucleic acids in clinical samples and can quickly and objectively detect a variety of pathogenic microorganisms (including viruses, bacteria, fungi and parasites). With the improvement of mNGS technology platforms and the growth of clinical research, the clinical application of mNGS is becoming more and more extensive. Metagenomic sequencing analysis has two important parts: one is the construction of the microbial genome database, and the other is the analysis and screening of the comparison results. The construction method of the microbial genome database affects and determines the a...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B50/30, G16B30/10
CPC: G16B30/10, G16B50/30
Inventor 陈莉张岩李振中戴岩梁相志郭昊张林李诗濛任用
Owner NANJING SIMCERE MEDICAL LAB CO LTD