Method for screening important characteristic genes related to bacterial drug resistance phenotype based on machine learning

Feature gene screening using machine learning technology, applied in the field of gene sequencing

Active Publication Date: 2022-06-17
天津金匙医学科技有限公司 +2

AI Technical Summary

Problems solved by technology

Carrying and expressing a single drug resistance gene can make bacteria drug resistant; for example, the KPC-2 gene can cause carbapenem resistance. In other cases, however, a single mechanism or the expression of a single drug resistance gene is not enough to cause resistance directly. It is therefore necessary, while screening for resistance-related features, to quantify the contribution of each feature to drug resistance, that is, to rank the importance of these feature genes.



Examples


Embodiment 1

[0085] Embodiment 1 Design optimization of this application

[0086] As mentioned in the background section of this application, most existing bacterial resistance screening studies focus on screening single nucleotide polymorphism (SNP), insertion/deletion (Indel) or k-mer features at the core genome level. Beyond these features, however, it is also very important to screen for important non-core drug resistance genes associated with drug resistance phenotypes.

[0087] Figure 1 shows the design idea of this application. Based on this idea, this example designs a specific method for optimizing drug resistance screening. The establishment process is illustrated using model selection as an example.

[0088] In the association analysis between genotype and drug susceptibility result data, this application uses the GLM generalized linear model (R language glm(), stepAIC() commands), Lasso regression model (R language ...
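The association step above can be illustrated with a minimal sketch. The source names R's glm(), stepAIC() and a Lasso regression model; the block below is not that pipeline but a Python stand-in, assuming an L1-penalized logistic regression fitted by proximal gradient descent on a binary gene presence/absence matrix. The gene names, the toy data and the values of `lam`, `lr` and `n_iter` are all made up for illustration.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_lasso_logistic(X, y, lam=0.1, lr=0.5, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent.

    X: list of rows, each a list of 0/1 gene presence flags.
    y: list of 0/1 phenotype labels (1 = resistant).
    Returns (weights, intercept); the intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        # Gradient step on the smooth loss, then L1 shrinkage.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def rank_genes(gene_names, weights):
    # Keep genes with non-zero weight, largest absolute weight first.
    kept = [(g, wt) for g, wt in zip(gene_names, weights) if wt != 0.0]
    return sorted(kept, key=lambda gw: abs(gw[1]), reverse=True)

# Made-up toy data: 8 strains x 3 hypothetical genes.
# gene_A tracks the phenotype; gene_B and gene_C are uninformative.
genes = ["gene_A", "gene_B", "gene_C"]
X = [
    [1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 1, 1],
    [0, 0, 1], [0, 1, 0], [0, 0, 0], [0, 1, 1],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, intercept = fit_lasso_logistic(X, y)
for gene, weight in rank_genes(genes, weights):
    print(f"{gene}\t{weight:.3f}")
```

The L1 penalty is what makes the fitted weights usable as a ranking: uninformative genes are shrunk toward zero, while the surviving coefficients play the role of weight coefficients for the retained feature genes.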

Embodiment 2

[0097] Embodiment 2 Screening and verification of important characteristic genes related to carbapenem and cephalosporin resistance in Klebsiella pneumoniae

[0098] Step 1. Search public databases and download Klebsiella pneumoniae strain genomes and their corresponding antibiotic susceptibility test result data.

[0099] Download from the NCBI NDARO database: open https://www.ncbi.nlm.nih.gov/pathogens/isolates and enter "Klebsiella pneumoniae" in the search bar to retrieve Klebsiella pneumoniae records. In the Matched Isolates sub-window, click "Choose columns" and select "AST phenotypes" to display that column, then download the tabular data for the entire window. Sort out the Klebsiella pneumoniae strains that have drug susceptibility test result data and, using their Assembly ID information, download the genome sequences in bulk from NCBI's genome database (ftp://ftp.ncbi.nlm.nih.gov/genomes).
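The "sort out the strains with susceptibility data" step can be sketched as a small filter over the downloaded table. This is only an illustration under assumptions: the column names `Assembly` and `AST phenotypes`, the tab delimiter, and the `drug=call` comma-separated phenotype format are assumed and should be checked against the actual export; the accession values below are placeholders, not real records.

```python
import csv
import io

def assemblies_with_ast(table_text, assembly_col="Assembly",
                        ast_col="AST phenotypes", delimiter="\t"):
    """Return (assembly_id, ast_field) pairs for rows that have both values.

    Column names and delimiter are assumptions about the exported table.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter=delimiter)
    selected = []
    seen = set()
    for row in reader:
        assembly = (row.get(assembly_col) or "").strip()
        ast = (row.get(ast_col) or "").strip()
        # Skip isolates without an assembly or without AST results,
        # and keep each assembly only once.
        if assembly and ast and assembly not in seen:
            seen.add(assembly)
            selected.append((assembly, ast))
    return selected

def parse_ast(ast_field):
    """Split an assumed 'drug=call,drug=call' string into a dict."""
    calls = {}
    for item in ast_field.split(","):
        if "=" in item:
            drug, call = item.split("=", 1)
            calls[drug.strip()] = call.strip()
    return calls

# Placeholder table; accessions and phenotypes are invented for illustration.
sample = (
    "Assembly\tOrganism group\tAST phenotypes\n"
    "GCA_000000001.1\tKlebsiella pneumoniae\tmeropenem=R,ceftazidime=R\n"
    "GCA_000000002.1\tKlebsiella pneumoniae\t\n"
    "\tKlebsiella pneumoniae\tmeropenem=S\n"
)

for assembly, ast in assemblies_with_ast(sample):
    print(assembly, parse_ast(ast))
```

The resulting list of assembly IDs is what would then drive the bulk genome download; how each ID maps to a path on the FTP server is left out here because the source does not specify it.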

[0100] Download from the PATRIC platform database...


Abstract

This application relates to a method for screening important characteristic genes related to bacterial drug resistance phenotypes based on machine learning technology. The method targets bacterial antibiotic resistance phenotypes. Following the idea of BGWAS, genomic data for a large sample of target bacterial strains, either collected from public platforms or obtained by fresh collection, sequencing and assembly, are gathered together with the corresponding antibiotic susceptibility test results. Machine learning methods are then used to analyze the association between genotype and drug resistance phenotype, in order to screen out important characteristic genes related to the drug resistance phenotype (non-core drug resistance genes) and, at the same time, to obtain the weight coefficients of these genes. Finally, ROC analysis is used to determine the reliability of the drug resistance genes associated with each drug.
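The final ROC step can be made concrete with a short sketch. The source does not say how the ROC analysis is computed; the block below assumes the usual definition, where the area under the curve equals the probability that a resistant strain scores higher than a susceptible one, with ties counted as one half. The labels and scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both resistant and susceptible samples are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def roc_points(labels, scores):
    """(FPR, TPR) pairs, one per distinct score threshold, from strict to lax."""
    n_pos = sum(1 for lab in labels if lab == 1)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        # A sample is called resistant when its score reaches the threshold.
        tp = sum(1 for lab, s in zip(labels, scores) if s >= thr and lab == 1)
        fp = sum(1 for lab, s in zip(labels, scores) if s >= thr and lab == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

# Invented example: 1 = resistant, 0 = susceptible; scores are model outputs.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

print("AUC:", roc_auc(labels, scores))
for fpr, tpr in roc_points(labels, scores):
    print(f"FPR={fpr:.2f} TPR={tpr:.2f}")
```

An AUC near 1 would indicate that the selected genes separate resistant from susceptible strains well, while a value near 0.5 would indicate no better than chance.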

Description

Technical field

[0001] The present application relates to the technical field of gene sequencing, in particular to a method for screening important characteristic genes related to bacterial drug resistance phenotypes based on machine learning technology.

[0002] Technical background

[0003] Genome-wide association study (GWAS) is a method for screening genetic variation significantly related to a phenotype at the genome level, and thereby clarifying the genetic mechanism of the phenotype. Compared with traditional molecular genetics methods, GWAS makes no assumptions about the genetic mechanism of a phenotype. Instead, it starts directly from the phenotype, sets up a reasonable control group, and finds phenotype-related variation through statistical analysis of large samples. In the study of complex human diseases, GWAS has achieved fruitful results, greatly improving people's understanding of complex phenotypes. Similarly, GWAS can also be used in bacterial research, prov...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B30/10, G16B30/20, G16B50/00, G16B40/00, G16H70/40, G06N20/00
CPC: C12Q1/689, C12Q2600/106
Inventors: 韩朋, 饶冠华, 高建鹏, 陈方媛, 蒋智
Owner 天津金匙医学科技有限公司