Genome database for bacterial identification and typing analysis, and identification and typing analysis method

A bacterial genome identification technology, applied in the field of bacterial genome identification and typing. It addresses biased identification results, the threat that mislabeled genomes pose to identification, and the difficulty of assessing the completeness and contamination of draft genomes, while increasing speed and simplifying operation.

Active Publication Date: 2021-05-28
杭州微数生物科技有限公司

AI Technical Summary

Problems solved by technology

Mislabeled genome sequences pose a great threat to identification.
In addition, genome completeness and contamination rate must be taken seriously, as they can severely bias identification results.
For example, contamination in genome sequences can produce misleadingly high ANI values between two different species.
However, because genome size and gene content vary widely between species, the completeness and contamination of draft genomes are not easy to assess.

Examples

Embodiment 1

[0035] As shown in FIG. 1, the genome database for bacterial identification and typing analysis of the present invention, MDBACDB, is constructed as follows:

[0036] 1) Collect bacterial genome information from NCBI, together with the corresponding "strain", "cultured", "cloned", and "annotation" records; build the genome information and metadata tables; and establish the source of each genome.

[0037] 2) Obtain the list of validly published bacterial names and type strains from LPSN, and cross-check it against Bergey's Manual of Systematics of Archaea and Bacteria and articles in IJSEM. After screening, the genomes of qualifying strains are entered into the database for management.

[0038] 3) Screen the bacterial genome sequences entered into the database, filtering out mislabeled and low-quality genomes with the in-house Python program MDBACQCTools. This completes the construction of MDBACDB.
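
The screening in steps 1) to 3) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `GenomeRecord` class, its fields, and the function `screen_type_strains` are hypothetical, and the LPSN list is assumed to have been loaded beforehand into a mapping from species name to the set of its type-strain designations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeRecord:
    """Hypothetical metadata row for one genome collected from NCBI."""
    accession: str
    species: str
    strain: str


def screen_type_strains(records, type_strains):
    """Keep records whose species is in the LPSN-derived list and whose
    strain designation matches one of that species' type strains.

    type_strains: dict mapping species name -> set of type-strain
    designations (assumed to be prepared from LPSN beforehand).
    """
    kept = []
    for rec in records:
        designations = type_strains.get(rec.species)
        if designations is not None and rec.strain in designations:
            kept.append(rec)
    return kept


# Toy data; accessions, species, and strain names are made up for illustration.
records = [
    GenomeRecord("ACC_0001", "Genus alpha", "T1"),
    GenomeRecord("ACC_0002", "Genus alpha", "X9"),     # not a type strain
    GenomeRecord("ACC_0003", "Genus unlisted", "T7"),  # name not in the list
]
type_strains = {"Genus alpha": {"T1", "T1-alt"}}

qualified = screen_type_strains(records, type_strains)
print([r.accession for r in qualified])
```

Keeping the name list as a plain mapping means the same function can be rerun whenever the list of validly published names is refreshed.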

[0039] The quality-control steps of MDBACQCTools are as follows: first, the completeness and contamination rate of each genome are evaluated u...
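
A completeness and contamination filter of the kind described above might look like the sketch below. The tool used for the evaluation is cut off in the text, so the estimates are assumed to be precomputed; the function name `passes_quality_control` and both cutoff values are placeholders chosen for this example, not values taken from the patent.

```python
def passes_quality_control(completeness, contamination,
                           min_completeness=90.0, max_contamination=5.0):
    """Return True if a genome passes both checks.

    Both values are percentages. The default cutoffs are placeholders
    chosen for this sketch, not values taken from the patent.
    """
    return completeness >= min_completeness and contamination <= max_contamination


# Hypothetical precomputed estimates: accession -> (completeness %, contamination %).
estimates = {
    "ACC_0001": (99.2, 0.4),
    "ACC_0002": (71.5, 1.0),   # too incomplete
    "ACC_0003": (98.0, 12.3),  # too contaminated
}

retained = [acc for acc, (comp, cont) in estimates.items()
            if passes_quality_control(comp, cont)]
print(retained)
```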

Embodiment 2

[0043] As shown in FIG. 1, the analysis performed by FIDBAC, the bacterial genome data analysis platform of the present invention, proceeds as follows:

[0044] 1) Obtain a publicly available bacterial genome sequence, GCF_008121515.1_genomic.fna;

[0045] 2) Submit it to the bacterial genome data analysis platform FIDBAC (FIG. 3), where identification is performed by the in-house Python program FIDBAC.

[0046] The FIDBAC analysis proceeds as follows: first, the 16S rRNA sequence of the genome to be identified (GCF_008121515.1_genomic.fna) is compared against the LTP database; second, KmerFinder (v3.1) compares the k-mers extracted from GCF_008121515.1_genomic.fna against the k-mer database of MDBACDB; third, the top 20 bacterial IDs obtained in the first two steps are used to extract the corresponding genome sequences from the bacterial genome database MDBACDB, and FastANI (v1.1) calculates the ANI value between the query genome and each corresponding type-strain genome; finally, the identification result returns only the closest species and the ANI v...
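
The last two stages described above, pooling the candidates from the 16S rRNA and k-mer searches and then ranking them by ANI, can be sketched as follows. This is an illustration under stated assumptions: the function names, candidate IDs, and ANI figures are invented; taking the union of the top 20 from each list is only one possible reading of the text; and the rows mimic the tab-separated layout that FastANI writes (query, reference, ANI, mapped fragments, total fragments) rather than coming from a real run.

```python
def pool_candidates(hits_16s, hits_kmer, top_n=20):
    """Union of the top_n IDs from each ranked list, keeping first-seen order."""
    pooled = []
    for candidate in hits_16s[:top_n] + hits_kmer[:top_n]:
        if candidate not in pooled:
            pooled.append(candidate)
    return pooled


def closest_by_ani(fastani_lines):
    """Return (reference, ANI) for the highest ANI among FastANI-style rows."""
    best = None
    for line in fastani_lines:
        fields = line.rstrip("\n").split("\t")
        reference, ani = fields[1], float(fields[2])
        if best is None or ani > best[1]:
            best = (reference, ani)
    return best


# Invented ranked hit lists from the 16S rRNA and k-mer searches.
hits_16s = ["ref_A", "ref_B", "ref_C"]
hits_kmer = ["ref_B", "ref_D"]
candidates = pool_candidates(hits_16s, hits_kmer)

# Invented rows in FastANI's output layout.
fastani_lines = [
    "query.fna\tref_A.fna\t88.10\t700\t1500",
    "query.fna\tref_B.fna\t98.75\t1400\t1500",
    "query.fna\tref_D.fna\t91.30\t900\t1500",
]
print(candidates)
print(closest_by_ani(fastani_lines))
```

Restricting the ANI calculation to a small pooled candidate set is what keeps the final comparison cheap relative to comparing the query against every genome in the database.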

Embodiment 3

[0048] As shown in FIG. 1, the analysis performed by FIDBAC, the bacterial genome data analysis platform of the present invention, proceeds as follows:

[0049] 1) Obtain from NCBI the genome sequences of Staphylococcus capitis, Bacillus cereus, and Bacillus anthracis: GCA_001650475.1, GCA_002564865.1, and GCA_000725325.1;

[0050] 2) Extract the 16S rRNA gene sequence from each genome, identify it by BLAST comparison against the LTP reference database, and sort the hits by score.

[0051] 3) Submit the genomes to the bacterial genome data analysis platform FIDBAC (FIG. 3), where identification is performed by the in-house Python program FIDBAC. The FIDBAC analysis proceeds as follows: first, the 16S rRNA sequences are extracted from the bacterial genomes (GCA_001650475.1.fna, GCA_001650475.1.fna, GCA_002564865.1.fna) and compared against the LTP database; second, KmerFinder (v3.1) compares the k-mers extracted from GCF_008121515.1_genomic.fna against the k-mer database of MDBACDB; third, obtain the top...
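
The sorting in step 2) can be sketched as below. The text does not say how BLAST was invoked, so this assumes the standard 12-column tabular output (`-outfmt 6`), where the second column is the subject ID, the third the percent identity, and the twelfth the bit score. The function name `rank_blast_hits` and all rows are made up for illustration.

```python
def rank_blast_hits(tabular_lines):
    """Parse 12-column BLAST tabular rows and sort by bit score, best first."""
    hits = []
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.rstrip("\n").split("\t")
        hits.append({
            "subject": cols[1],
            "identity": float(cols[2]),
            "bitscore": float(cols[11]),
        })
    return sorted(hits, key=lambda h: h["bitscore"], reverse=True)


# Invented rows: qseqid sseqid pident length mismatch gapopen
# qstart qend sstart send evalue bitscore
rows = [
    "q16S\tsubj_1\t97.50\t1400\t35\t0\t1\t1400\t1\t1400\t0.0\t2300",
    "q16S\tsubj_2\t99.80\t1450\t3\t0\t1\t1450\t1\t1450\t0.0\t2660",
    "q16S\tsubj_3\t94.10\t1390\t82\t2\t1\t1390\t5\t1392\t0.0\t2010",
]
ranked = rank_blast_hits(rows)
print([h["subject"] for h in ranked])
```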

Abstract

The invention discloses a genome database for bacterial identification and typing analysis, and an identification and typing analysis method. A high-quality genome database for bacterial identification and typing analysis is created by removing mislabeled genomes and low-quality assemblies. Based on this database, a genome-based bacterial identification and typing analysis method is provided, and a rapid bacterial genome identification platform (FIDBAC) is developed. The identification accuracy of FIDBAC reaches 97% or above, which is clearly higher than that of other comparable identification systems or software. The single, coherent, and automated workflow for bacterial genome identification is of significant value in the food industry, the pharmaceutical industry, clinical diagnosis, microbial resource development, and related fields.

Description

Technical field

[0001] The present invention relates to bacterial genome identification and typing, and in particular to a genome database for bacterial identification and typing analysis and an identification and typing analysis method.

Background art

[0002] Accurate identification of bacterial strains is the key to successful bacterial classification, pathogen detection, and source tracking, and is of great significance in the food industry, the pharmaceutical industry, clinical diagnosis, and microbial resource development. Traditionally, bacterial identification has relied on phenotypic characterization, which suffers from poor reproducibility, high experimental labor intensity, and long turnaround times; molecular biological methods are expected to overcome these shortcomings. The 16S rRNA gene became a popular molecular tool in prokaryotic biology due to its universal distribution and syst...

Application Information

IPC(8): G16B50/00, C12Q1/689
CPC: G16B50/00, C12Q1/689
Inventors: 陈欢, 梁倩, 徐荣, 王莹, 刘程智, 何陆平
Owner: 杭州微数生物科技有限公司