Bacterial identification and typing analysis genomic database and identification and typing analysis methods

A bacterial identification and typing technology applied in the field of bacterial genome identification and typing. It addresses biased identification results, the threat that mislabeled genomes pose to identification, and the difficulty of distinguishing the completeness and contamination of draft genomes, while making identification faster and easier to operate.

Active Publication Date: 2022-07-26
杭州微数生物科技有限公司

AI Technical Summary

Problems solved by technology

Mislabeled genome sequences pose a great threat to identification.
The completeness and contamination rate of a genome also need to be taken seriously, as they may severely bias the identification results.
For example, contamination in genome sequences can produce misleadingly high ANI values between two different species.
However, because genome size and gene content vary widely between species, the completeness and contamination of draft genomes may not be easy to determine.

Examples


Embodiment 1

[0035] As shown in Figure 1, the construction process of the bacterial identification and typing analysis genome database MDBACDB of the present invention is as follows:

[0036] 1) Collect bacterial genome information from NCBI together with meta-information, including the "strain", "culture collection", "clone" and "annotation" fields; build a table linking each genome to its meta-information so that the source of each genome is clear.

[0037] 2) Obtain a list of validly published bacterial names and type strains from the LPSN, and consult Bergey's Manual of Systematics of Archaea and Bacteria and articles in IJSEM. After screening, the genomes of qualifying strains are entered into the database for management.

[0038] 3) Screen the bacterial genome sequences in the library to complete the construction of MDBACDB. Erroneous and low-quality bacterial genomes are filtered out by the in-house Python program MDBacQCTools.
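The screening in steps 1) to 3) can be pictured with a small sketch. This is not MDBacQCTools, whose internals are not given here; the record fields, the `select_genomes` helper, and both quality thresholds are hypothetical and chosen only for illustration. The idea is to keep a genome only when its strain label matches a known type-strain designation for its species and the assembly passes a quality check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeRecord:
    accession: str        # assembly accession taken from the genome table
    species: str          # species label attached to the assembly
    strain: str           # "strain" / "culture collection" meta-information
    completeness: float   # percent; hypothetical QC metric
    contamination: float  # percent; hypothetical QC metric


def normalize(strain: str) -> str:
    """Compare designations loosely: ignore case and non-alphanumeric characters."""
    return "".join(ch for ch in strain.upper() if ch.isalnum())


def select_genomes(records, type_strains, min_completeness=90.0, max_contamination=5.0):
    """Keep records that match the type-strain list and pass the quality check.

    type_strains maps a species name to the set of its type-strain designations.
    The two thresholds are placeholders, not values taken from the source.
    """
    kept = []
    for rec in records:
        designations = {normalize(s) for s in type_strains.get(rec.species, ())}
        if normalize(rec.strain) not in designations:
            continue  # label cannot be confirmed against the type-strain list
        if rec.completeness < min_completeness or rec.contamination > max_contamination:
            continue  # low-quality assembly
        kept.append(rec)
    return kept
```

Matching on a normalized designation is a deliberate simplification: culture collection numbers are often written with varying spacing and punctuation, so an exact string comparison would reject many correctly labeled genomes.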

[0039] The steps for qualit...

Embodiment 2

[0043] As shown in Figure 1, the analysis performed by the bacterial genome data analysis platform FIDBac of the present invention is as follows:

[0044] 1) Obtain the genome sequence GCF_008121515.1_genomic.fna of a published bacterium;

[0045] 2) Submit it to the bacterial genome data analysis platform FIDBac (Figure 3), where identification is performed by the in-house Python program FIDBac.

[0046] The analysis process of FIDBac is as follows: first, extract the 16S rRNA sequence from the genome of the bacterium to be identified (GCF_008121515.1_genomic.fna) and compare it with the LTP database; second, use KmerFinder (v3.1) to compare the K-mers extracted from GCF_008121515.1_genomic.fna with the K-mer database of MDBACDB; third, take the top 20 bacterial ID numbers screened in the first two steps, extract their genome sequences from the bacterial genome database MDBACDB, and use fastANI (v1.1) to calculate the ANI value of the query genom...
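The candidate-then-ANI logic of paragraph [0046] can be sketched as follows. This is an illustration, not the FIDBac code. It assumes that the 16S and K-mer searches each return a ranked list of reference IDs, that "top 20" is applied to each list before merging (the text does not say whether the cutoff is per search or combined), and that fastANI output is read in its tab-separated form (query, reference, ANI, mapped fragments, total fragments). The 95% ANI cutoff is a commonly used species boundary and is an assumption here, not a value stated in this text; all function names are hypothetical.

```python
def merge_candidates(hits_16s, hits_kmer, top_n=20):
    """Union of the top_n reference IDs from each ranked list, order preserved."""
    merged = []
    for ref_id in list(hits_16s[:top_n]) + list(hits_kmer[:top_n]):
        if ref_id not in merged:
            merged.append(ref_id)
    return merged


def parse_fastani(text):
    """Parse fastANI tab-separated output into {reference: ANI percent}."""
    ani = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        ani[fields[1]] = float(fields[2])
    return ani


def call_species(ani_by_ref, species_of, threshold=95.0):
    """Return (species, ANI) for the best reference, or (None, ANI) below threshold."""
    if not ani_by_ref:
        return None, 0.0
    best_ref = max(ani_by_ref, key=ani_by_ref.get)
    best_ani = ani_by_ref[best_ref]
    if best_ani < threshold:
        return None, best_ani  # no reference is close enough for a species call
    return species_of[best_ref], best_ani
```

Restricting the ANI calculation to a short candidate list is what keeps the workflow fast: the two cheap searches narrow the field, and the more expensive whole-genome comparison runs only against the genomes that survive.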

Embodiment 3

[0048] As shown in Figure 1, the analysis performed by the bacterial genome data analysis platform FIDBac of the present invention is as follows:

[0049] 1) Obtain from NCBI the genome sequences of Staphylococcus capitis, Bacillus cereus and Bacillus anthracis (GCA_001650475.1, GCA_002564865.1 and GCA_000725325.1);

[0050] 2) Extract the 16S rRNA gene sequence from each genome, align it against the reference database LTP with BLAST for identification, and rank the hits by score.

[0051] 3) Submit the genomes to the bacterial genome data analysis platform FIDBac (Figure 3), where identification is performed by the in-house Python program FIDBac. The analysis process of FIDBac is as follows: first, the 16S rRNA sequences in the genomes of the bacteria to be identified (GCA_001650475.1.fna, GCA_002564865.1.fna and GCA_000725325.1.fna) are extracted and compared with the LTP database; second, with KmerFinder (v3.1), the K-mers extracted from GCF_008121515.1_genomic...
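Step 2) of this embodiment ranks 16S rRNA BLAST hits by score. A minimal sketch of that ranking is given below, assuming the alignment was written in BLAST's standard 12-column tabular format and that "score" means the bit score in the last column; both points are assumptions, since the text names neither the output format nor the score used. The function name is hypothetical.

```python
def rank_blast_hits(tabular_text):
    """Sort BLAST tabular hits by bit score, best first.

    Expects the standard 12 tab-separated columns: qseqid, sseqid, pident,
    length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        cols = line.split("\t")
        hits.append({
            "query": cols[0],
            "subject": cols[1],
            "identity": float(cols[2]),
            "bitscore": float(cols[11]),
        })
    return sorted(hits, key=lambda h: h["bitscore"], reverse=True)
```

Closely related species can share nearly identical 16S rRNA genes, so the top-ranked hits may be separated by very small score differences. Keeping the full ranked list, rather than only the first hit, makes it easier to compare this result with the genome-level identification in step 3).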


Abstract

The invention discloses a genome database for bacterial identification and typing analysis, together with an identification and typing analysis method. A high-quality genome database for bacterial identification and typing is created by removing mislabeled and low-quality genome assemblies. Building on this database, the invention provides bacterial identification and typing analysis methods based on genomic information and develops a rapid bacterial genome identification platform (FIDBac). The identification accuracy of FIDBac is more than 97%, which is significantly higher than that of other similar identification systems or software. This single, coherent, and automated workflow for bacterial genome identification is of great importance in the food industry, pharmaceutical industry, clinical diagnostics, and microbial resource development.

Description

Technical field

[0001] The invention relates to the field of bacterial genome identification and typing, in particular to a bacterial identification and typing analysis genome database and an identification and typing analysis method.

Background

[0002] Accurate bacterial species identification is the key to successful bacterial classification, pathogen detection and source tracing, and is of great significance in the food industry, pharmaceutical industry, clinical diagnosis, and microbial resource development. Traditionally, bacterial identification has relied on phenotypic identification, which suffers from limited reproducibility and from labor-intensive, time-consuming experiments; molecular biology methods are expected to overcome these disadvantages. The 16S rRNA gene has become a popular molecular marker in prokaryotic taxonomy because of its ubiquitous distribution in bacterial and archaeal genome...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B50/00, C12Q1/689
CPC: G16B50/00, C12Q1/689
Inventor: 陈欢, 梁倩, 徐荣, 王莹, 刘程智, 何陆平
Owner: 杭州微数生物科技有限公司