Method for screening important characteristic genes related to drug resistance phenotype of bacteria based on machine learning

Feature gene screening with machine learning technology, applied in the field of gene sequencing

Active Publication Date: 2022-02-18
天津金匙医学科技有限公司 +2

AI Technical Summary

Problems solved by technology

Although carrying and expressing a single drug resistance gene can make bacteria drug resistant (for example, the KPC-2 gene can cause carbapenem resistance), there is also a single mechanism or the expression of a single drug resistance ge...



Examples


Embodiment 1

[0085] Embodiment 1: Design optimization of this application

[0086] As mentioned in the background technology section of this application, most existing research on bacterial drug resistance screening focuses on single nucleotide polymorphisms (SNPs), insertions/deletions (indels) or k-mer features at the core genome level. In addition to these features, however, it is also very important to screen for important non-core drug resistance genes associated with drug resistance phenotypes.

[0087] Figure 1 shows the design idea of this application. Based on this idea, this embodiment designs and optimizes specific methods for drug resistance screening. Model selection is taken as an example to show how the method of this application was established.

[0088] When performing the association analysis between genotype and drug susceptibility result data, this application compared and used the generalized linear model (GLM; R language glm(), stepAIC() command...
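The association step described in [0088] can be pictured with a small sketch. The source names R's glm() and stepAIC(); the Python below is not that workflow. It is a minimal illustration of a binomial GLM (logistic regression) fitted by gradient descent on a gene presence/absence matrix, written with the standard library only. The gene names, the toy strains, the L2 penalty and the learning rate are all assumptions made for illustration, and no stepwise AIC selection is attempted.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000, l2=0.01):
    """Fit a binomial GLM by batch gradient descent.

    X: rows of 0/1 gene presence values, y: 0/1 resistance labels.
    The L2 penalty on the weights is an illustrative choice that keeps
    the weights finite when one gene separates the classes perfectly.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad_w)]
    return w, b

# Hypothetical gene names and toy strains (not data from the patent).
genes = ["geneA", "geneB", "geneC"]
X = [
    [1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 1, 1],  # resistant strains
    [0, 0, 1], [0, 1, 0], [0, 0, 0], [0, 1, 1],  # susceptible strains
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, intercept = fit_logistic(X, y)

# Rank genes by absolute weight: larger magnitude means a stronger association.
ranked = sorted(zip(genes, weights), key=lambda gw: abs(gw[1]), reverse=True)
for gene, weight in ranked:
    print(f"{gene}\t{weight:+.3f}")
```

In this toy matrix only geneA tracks the resistance label, so it should receive the largest weight while the other two stay near zero. On real data the weights would need validation on held-out strains before being read as evidence of association.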

Embodiment 2

[0097] Embodiment 2: Screening and verification of important characteristic genes related to carbapenem and cephalosporin resistance in Klebsiella pneumoniae

[0098] Step 1. Search for and download Klebsiella pneumoniae strain genomes and their corresponding antibiotic susceptibility test results from public databases.

[0099] Download from the NCBI NDARO database: open the website https://www.ncbi.nlm.nih.gov/pathogens/isolates and enter "Klebsiella pneumoniae" in the search bar to retrieve the Klebsiella pneumoniae isolates. In the Matched Isolates sub-window, click "Choose columns" and select "AST phenotypes" to display this column, then download the tabular data of the entire window. Pick out the Klebsiella pneumoniae strains that have drug susceptibility test results and, according to their Assembly ID information, download the genome sequences in bulk from the NCBI genome database (ftp://ftp.ncbi.nlm.nih.gov/genomes).
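The filtering step in [0099] (keep only isolates that have both susceptibility results and an assembly ID) might be sketched as follows. This is an illustration under stated assumptions, not the applicant's script: the column names "Assembly" and "AST phenotypes", the tab-separated layout, the "drug=call" encoding of the phenotype field, and every accession and strain name in the sample are hypothetical placeholders. The bulk genome download itself is left out.

```python
import csv
import io

# Hypothetical stand-in for the table exported from the isolates browser.
# Column names, field format and accessions are assumptions, not real records.
SAMPLE_TSV = (
    "Strain\tAssembly\tAST phenotypes\n"
    "kp_demo_1\tGCA_000000001.1\tmeropenem=R,ceftazidime=R\n"
    "kp_demo_2\tGCA_000000002.1\t\n"
    "kp_demo_3\t\timipenem=S\n"
    "kp_demo_4\tGCA_000000004.1\tmeropenem=S,ceftazidime=I\n"
)

def parse_ast(field):
    """Turn 'drugA=R,drugB=S' into {'drugA': 'R', 'drugB': 'S'}."""
    result = {}
    for item in field.split(","):
        drug, sep, call = item.strip().partition("=")
        if sep and drug and call:
            result[drug] = call
    return result

def select_isolates(handle):
    """Keep rows that have both an assembly ID and at least one AST result."""
    selected = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        assembly = (row.get("Assembly") or "").strip()
        phenotypes = parse_ast(row.get("AST phenotypes") or "")
        if assembly and phenotypes:
            selected[assembly] = phenotypes
    return selected

isolates = select_isolates(io.StringIO(SAMPLE_TSV))
for assembly, phenotypes in isolates.items():
    print(assembly, phenotypes)
```

The resulting mapping from assembly ID to per-drug calls is the list one would hand to a separate bulk download step, and later join against the genotype matrix.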

[0100] Download from the PATRIC platf...


Abstract

The invention relates to a method for screening important characteristic genes related to the drug resistance phenotype of bacteria based on machine learning technology. Targeting bacterial antibiotic resistance and following the BGWAS approach, the method collects genomes of the target bacteria from public platforms, or large-sample strain genome data obtained by collecting, sequencing and assembling isolates, together with the corresponding antibiotic susceptibility test results. A machine learning method is then used to analyze the association between genotypes and drug resistance phenotypes, so that important characteristic genes (non-core drug resistance genes) related to the drug resistance phenotypes are screened out and their weight coefficients are obtained. Finally, ROC analysis is used to determine the reliability of the drug resistance genes associated with each drug.
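The ROC analysis mentioned in the abstract can be illustrated with a short sketch. It computes the area under the ROC curve as the fraction of resistant/susceptible strain pairs in which the resistant strain receives the higher score, counting ties as half (the Mann-Whitney formulation). The labels and scores are toy values, and using a single gene's presence as the score is an assumption for illustration; the source does not state how its ROC analysis is set up.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for resistant, 0 for susceptible.
    scores: higher means the model leans towards 'resistant'.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one resistant and one susceptible strain")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Toy example 1: continuous model scores for four strains.
auc_model = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])

# Toy example 2: presence (1) or absence (0) of one hypothetical gene as the score.
auc_gene = roc_auc([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])

print(f"model AUC = {auc_model:.3f}")
print(f"single-gene AUC = {auc_gene:.3f}")
```

An AUC near 0.5 indicates that the gene or model carries little information about the phenotype, while values approaching 1.0 indicate that it separates resistant from susceptible strains well.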

Description

Technical field

[0001] The present application relates to the field of gene sequencing technology, in particular to a method for screening important characteristic genes related to bacterial drug resistance phenotypes based on machine learning technology.

[0002] Technical background

[0003] A genome-wide association study (GWAS) screens, at the genome level, for genetic variation significantly associated with a phenotype, and then clarifies the genetic mechanism of that phenotype. Compared with traditional molecular genetics methods, GWAS makes no assumptions about the genetic mechanism of the phenotype. Instead, it starts directly from the phenotype, sets a reasonable control group, and finds phenotype-related variation through statistical analysis of large sample data. In the study of human complex diseases, GWAS has achieved fruitful results, greatly improving the understanding of complex phenotypes. Similarly, GWAS can also be used i...


Application Information

IPC(8): G16B30/10, G16B30/20, G16B50/00, G16B40/00, G16H70/40, G06N20/00
CPC: C12Q1/689, C12Q2600/106
Inventor: 韩朋, 饶冠华, 高建鹏, 陈方媛, 蒋智
Owner 天津金匙医学科技有限公司