Prediction method and device for potential BGC in genome sequence, equipment and medium

A method for predicting potential BGCs in genome sequences, applied in fields such as sequence analysis, the analysis of two-dimensional or three-dimensional molecular structures, and instruments. It addresses the high false positive rate of BGC prediction, which hinders drug research and development, with the effect of reducing the false positive rate and improving accuracy.

Active Publication Date: 2021-12-07
TENCENT TECH (SHENZHEN) CO LTD +1

AI Technical Summary

Problems solved by technology

[0004] However, when machine learning methods are used for BGC prediction, the false positive rate of the prediction results is high; that is, the results contain a large number of non-BGCs, which hinders subsequent drug development.




Embodiment Construction

[0035] In order to make the purpose, technical solution and advantages of the present application clearer, the implementation manners of the present application will be further described in detail below in conjunction with the accompanying drawings.

[0036] Artificial Intelligence (AI) comprises the theory, methods, technology and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the nature of intelligence and to produce a new kind of intelligent machine that can respond in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and de...


Abstract

The invention discloses a method, device, equipment and medium for predicting potential BGCs in a genome sequence, and relates to the field of artificial intelligence. The method comprises the following steps: performing domain prediction on each gene in a genome sequence to obtain the Pfam domains contained in each gene; determining a Pfam score for each Pfam domain, where the Pfam score represents the probability that the Pfam domain belongs to a BGC; determining candidate BGCs in the genome sequence based on the Pfam score of each Pfam domain; and performing BGC category prediction on the candidate BGCs and determining the potential BGCs among them based on the category prediction result. The embodiments of the invention adopt a two-stage serial prediction mechanism: first-stage filtering of BGCs is based on the Pfam scores, and second-stage filtering is performed through category prediction on the first-stage result, thereby reducing the false positive rate of the BGC prediction result.
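The two-stage filtering described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how Pfam scores are turned into candidate regions or how the category predictor works, so everything concrete here is an assumption made for illustration: the max-over-domains gene score, the `threshold` and `min_genes` parameters, the placeholder domain IDs (`PF_A`, `PF_B`, `PF_X`), the `"non-BGC"` label, and the rule-based `toy_classifier`, which stands in for a trained model.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Gene = Sequence[str]  # a gene, represented by the Pfam domain IDs it contains


def gene_score(gene: Gene, pfam_scores: Dict[str, float]) -> float:
    """Assumed aggregation: a gene scores as its highest-scoring Pfam domain."""
    return max((pfam_scores.get(d, 0.0) for d in gene), default=0.0)


def candidate_regions(
    genes: Sequence[Gene],
    pfam_scores: Dict[str, float],
    threshold: float = 0.5,   # hypothetical cutoff
    min_genes: int = 2,       # hypothetical minimum cluster size
) -> List[Tuple[int, int]]:
    """Stage 1: contiguous runs of high-scoring genes, as [start, end) indices."""
    regions: List[Tuple[int, int]] = []
    start = None
    for i, gene in enumerate(genes):
        if gene_score(gene, pfam_scores) >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_genes:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the genome.
    if start is not None and len(genes) - start >= min_genes:
        regions.append((start, len(genes)))
    return regions


def predict_potential_bgcs(
    genes: Sequence[Gene],
    pfam_scores: Dict[str, float],
    classify: Callable[[Sequence[Gene]], str],
    threshold: float = 0.5,
    min_genes: int = 2,
) -> List[Tuple[int, int, str]]:
    """Stage 2: keep only candidates whose predicted category is not 'non-BGC'."""
    results = []
    for start, end in candidate_regions(genes, pfam_scores, threshold, min_genes):
        label = classify(genes[start:end])
        if label != "non-BGC":
            results.append((start, end, label))
    return results


def toy_classifier(region: Sequence[Gene]) -> str:
    """Hypothetical stand-in for a trained category model."""
    has_a = any("PF_A" in gene for gene in region)
    return "polyketide" if has_a else "non-BGC"


# Placeholder data: domain IDs and scores are invented for the example.
pfam_scores = {"PF_A": 0.9, "PF_B": 0.7, "PF_X": 0.1}
genes = [["PF_X"], ["PF_A"], ["PF_A", "PF_B"], ["PF_X"],
         ["PF_B"], ["PF_B"], ["PF_X"]]

potential = predict_potential_bgcs(genes, pfam_scores, toy_classifier)
print(potential)
```

In this toy run, stage 1 yields two candidate regions, and stage 2 discards the second one because the stand-in classifier labels it `"non-BGC"`. This mirrors the serial design in which a candidate must pass both filters to count as a potential BGC.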

Description

Technical field

[0001] The embodiments of the present application relate to the field of artificial intelligence, in particular to a method, device, equipment and medium for predicting potential BGCs in genome sequences.

Background technique

[0002] A biosynthetic gene cluster (BGC) is a group of genes with biosynthetic functions that can encode and synthesize secondary metabolites (small-molecule compounds). Microbial secondary metabolites are an important source for drug development.

[0003] In related technologies, drug developers use machine learning methods to analyze the genome sequences of bacteria or fungi in order to discover potential BGCs associated with small-molecule compounds that have novel structures. In the subsequent research and development process, targeted experiments can be carried out based on the discovered potential BGCs.

[0004] However, when using machine learning methods for BGC prediction, the false positive rate of BGC prediction results is ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B15/00, G16B15/30, G16B30/10, G16B40/00
CPC: G16B15/00, G16B15/30, G16B30/10, G16B40/00
Inventor: 杨子翊, 廖奔犇, 张胜誉, 辛志伟, 梁恒宇
Owner: TENCENT TECH (SHENZHEN) CO LTD