Methods and systems for constructing clinical pathogenic microorganism metagenomic databases

By correcting species information, filtering genomes, and marking contaminated regions in the pathogen database, combined with k-mer alignment and the lowest common ancestor algorithm, the method addresses database redundancy and long detection times, enabling efficient and accurate identification of pathogens.

CN117316299B · Active · Publication Date: 2026-06-30 · BEIJING WEIYAN MEDICAL LAB CO LTD +3

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING WEIYAN MEDICAL LAB CO LTD
Filing Date
2023-10-07
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing techniques for constructing clinical pathogen databases suffer from high database redundancy, long detection times, low assembly efficiency, and difficulty in guaranteeing accuracy.

Method used

Species information for pathogenic microorganisms is collected and corrected, and focused web crawling is used for retrieval; the genomes are filtered and statistically analyzed; a k-mer alignment strategy is used to break the genomes into fragments and mark contaminated regions; the lowest common ancestor algorithm is used to build the database, which is regularly updated and partitioned for storage.
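The combination of k-mers with the lowest common ancestor (LCA) can be illustrated with a minimal sketch. It assumes a scheme similar to that used by k-mer-based taxonomic classifiers: each k-mer is mapped to the LCA of every taxon whose genome contains it. The patent text does not specify its data structures, so the toy taxonomy `PARENT`, the function names, and the value of `k` below are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical toy taxonomy: child taxon -> parent taxon (the root is its own parent).
PARENT: Dict[str, str] = {
    "root": "root",
    "Enterobacteriaceae": "root",
    "Escherichia": "Enterobacteriaceae",
    "Klebsiella": "Enterobacteriaceae",
    "E. coli": "Escherichia",
    "K. pneumoniae": "Klebsiella",
}


def lineage(taxon: str) -> List[str]:
    """Return the path from a taxon up to the root, inclusive."""
    path = [taxon]
    while PARENT[path[-1]] != path[-1]:
        path.append(PARENT[path[-1]])
    return path


def lca(a: str, b: str) -> str:
    """Return the lowest common ancestor of two taxa."""
    ancestors = set(lineage(a))
    for node in lineage(b):
        if node in ancestors:
            return node
    return "root"


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_lca_index(genomes: Iterable[Tuple[str, str]], k: int) -> Dict[str, str]:
    """Map each k-mer to the LCA of all taxa whose genome contains it."""
    index: Dict[str, str] = {}
    for taxon, seq in genomes:
        for kmer in kmers(seq.upper(), k):
            index[kmer] = lca(index[kmer], taxon) if kmer in index else taxon
    return index


# Two made-up sequences that share some 4-mers.
index = build_kmer_lca_index(
    [("E. coli", "ACGTACGA"), ("K. pneumoniae", "TTACGTAC")], k=4
)
print(index["ACGA"])  # unique to the first genome: E. coli
print(index["ACGT"])  # shared by both genomes: Enterobacteriaceae
```

A k-mer unique to one genome keeps that species label, while a k-mer shared across genomes is assigned to the shared ancestor. This sketch ignores reverse complements and memory-efficient storage, both of which a production database would need to handle.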

Benefits of technology

A high-quality pathogen database has been established, which has improved the accuracy and sensitivity of species identification, reduced false positives, and shortened the detection time.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN117316299B_ABST

Abstract

This application provides a method and system for constructing a metagenomic database of clinical pathogenic microorganisms. The method includes: collecting and correcting species information of pathogenic microorganisms; performing searches based on the corrected pathogenic microorganism genome results; filtering and counting bacterial genomes; screening and filtering reference genomes of viral pathogenic microorganisms; downloading the screened pathogenic microorganism genomes; breaking the screened genomes into fixed-length fragments; classifying reads with overlapping regions to identify contaminated regions in each genome; labeling contaminated sequences in all genomes; removing the contaminated sequences from the genomes; constructing an alignment database from the resulting genomes based on the k-mer algorithm and the lowest common ancestor algorithm; managing the database, including regular updates and maintenance; and optimizing data storage by partitioning the database into separate databases according to species boundaries and different preset criteria.
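The fragmentation and contamination-labeling steps in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the window length and step, the `is_foreign` predicate (a stand-in for whatever classifier decides that a fragment belongs to a different organism), and the choice to mask flagged regions with `N` rather than excise them are all hypothetical.

```python
from typing import Callable, Iterator, List, Tuple


def fragment(seq: str, length: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) windows of fixed length.

    Consecutive windows overlap when step < length. A sequence shorter
    than `length` yields a single, shorter fragment.
    """
    for start in range(0, max(len(seq) - length, 0) + 1, step):
        yield start, seq[start:start + length]


def contaminated_regions(
    seq: str, length: int, step: int, is_foreign: Callable[[str], bool]
) -> List[Tuple[int, int]]:
    """Merge overlapping or adjacent flagged windows into half-open (start, end) regions."""
    regions: List[Tuple[int, int]] = []
    for start, frag in fragment(seq, length, step):
        if not is_foreign(frag):
            continue
        end = start + len(frag)
        if regions and start <= regions[-1][1]:
            # This window touches the previous flagged region: extend it.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


def mask(seq: str, regions: List[Tuple[int, int]], fill: str = "N") -> str:
    """Replace every labeled region with a filler character."""
    out = list(seq)
    for start, end in regions:
        out[start:end] = fill * (end - start)
    return "".join(out)


# Toy genome: a run of G's stands in for a contaminating insert.
genome = "AAAAAA" + "GGGGGG" + "AAAAAA"


def looks_foreign(frag: str) -> bool:
    # Hypothetical rule: a window is foreign if it is mostly G.
    return frag.count("G") >= 3


regions = contaminated_regions(genome, length=4, step=2, is_foreign=looks_foreign)
print(regions)                # [(6, 12)]
print(mask(genome, regions))  # AAAAAANNNNNNAAAAAA
```

Keeping the labeling step (`contaminated_regions`) separate from the removal step (`mask`) mirrors the order given in the abstract, where contaminated sequences are first labeled in all genomes and only then removed.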