Automatic analysis method and system for sequencing data of whole genome of bacteria

A whole-genome sequencing and automated analysis technology, applied to sequence analysis, bioinformatics, and instruments, that addresses the insufficient comprehensiveness and poor user-friendliness of existing workflows and their inability to meet current sequencing data analysis needs.

Pending Publication Date: 2021-05-28
NANKAI UNIV

AI Technical Summary

Problems solved by technology

[0004] The inventors found that existing bacterial whole-genome workflows usually focus on analyzing second-generation sequencing data. They cannot meet the analysis requirements of the data generated by the rapidly developing third-generation, and even some fourth-generation, sequencing technologies, which are characterized by long reads. The aspects they cover are also usually not comprehensive enough: they focus on only one of de novo sequencing or resequencing.
[0005] Existing metagenomic workflows usually focus on metagenomic assembly, binning, abundance calculation, and similar steps, and provide good analysis at the metagenomic level (metagenomics usually focuses on the species diversity and functional potential of the entire microbial community in an environment). However, they ignore the in-depth analysis of individual bacterial genomes after strain isolation and screening, such as identifying an individual bacterial genome (accurate to the strain level) and producing the corresponding annotations, which have important applications in the breeding and improvement of industrial microbial strains.
[0006] In addition, existing workflows generally offer few choices of analysis tools for sequence preprocessing and assembly, which makes them less user-friendly.


Examples


Embodiment 1

[0036] The purpose of this example is to provide an automated analysis method for bacterial whole genome sequencing data.

[0037] An automated analysis method for bacterial whole genome sequencing data, comprising:

[0038] Obtain bacterial genome sequencing data and determine the type of sequencing data;

[0039] Perform corresponding preprocessing according to the type of sequencing data;

[0040] Perform resequencing analysis and de novo sequencing analysis on the preprocessed sequencing data according to the analysis type selected by the user and the preset tool software and software parameters;

[0041] Identify and annotate the bacterial whole genome.
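The steps in [0038] to [0041] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the source does not say how the data type is determined, so the median read-length heuristic, the `LONG_READ_THRESHOLD` cutoff, the four-line FASTQ layout, and all function and step names are assumptions made for the example.

```python
from statistics import median
from typing import Iterable, Iterator, List

# Assumed cutoff (in bases) between short and long reads; not from the source.
LONG_READ_THRESHOLD = 1000
# Analysis types named in the text; the identifiers themselves are hypothetical.
ANALYSIS_TYPES = ("resequencing", "de_novo")


def read_lengths(fastq_lines: Iterable[str]) -> Iterator[int]:
    """Yield the sequence length of each record, assuming 4-line FASTQ records."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record holds the bases
            yield len(line.strip())


def classify_reads(fastq_lines: Iterable[str]) -> str:
    """Label the data 'short' or 'long' from the median read length."""
    lengths = list(read_lengths(fastq_lines))
    if not lengths:
        raise ValueError("no reads found")
    return "long" if median(lengths) >= LONG_READ_THRESHOLD else "short"


def plan_steps(data_type: str, analysis_types: List[str]) -> List[str]:
    """List the pipeline steps for one data type and the user's analysis types."""
    steps = [f"preprocess_{data_type}"]
    for analysis in analysis_types:
        if analysis not in ANALYSIS_TYPES:
            raise ValueError(f"unknown analysis type: {analysis}")
        steps.append(f"{analysis}_{data_type}")
    # Identification and annotation follow the selected analyses.
    steps += ["identify_reference", "annotate_genome"]
    return steps
```

Here `plan_steps(classify_reads(lines), ["de_novo"])` would return the ordered step names for one input file; a real workflow would map each name to an external tool invocation.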

[0042] Further, the analysis type selected by the user and the preset tool software and software parameters are saved in a configuration file, and the user customizes these settings by modifying the configuration file.
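One way such a configuration file might work is sketched below. The source does not specify a file format or any key names, so JSON, the `DEFAULT_CONFIG` layout, and the tool names `example_qc_tool` and `example_assembler` are all hypothetical. The sketch shows preset values being overridden only where the user's file says so.

```python
import json
from copy import deepcopy

# Hypothetical presets; tool names and parameters are placeholders, not from the source.
DEFAULT_CONFIG = {
    "analysis_types": ["resequencing", "de_novo"],
    "tools": {
        "preprocess": {"name": "example_qc_tool", "params": {"min_quality": 20}},
        "assembly": {"name": "example_assembler", "params": {"threads": 4}},
    },
}


def merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied recursively, key by key."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def load_config(text: str) -> dict:
    """Parse the user's JSON settings and layer them over the presets."""
    return merge(DEFAULT_CONFIG, json.loads(text))
```

With this layout, a user file containing only `{"tools": {"assembly": {"params": {"threads": 16}}}}` changes the thread count and leaves every other preset untouched.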

[0043] Further, the resequencing analysis is specifically:

[0044] Assemble...

Embodiment 2

[0070] The purpose of this example is to provide an automated analysis system for bacterial whole genome sequencing data.

[0071] An automated analysis system for bacterial whole genome sequencing data, including:

[0072] A data acquisition unit, which is used to acquire bacterial genome sequencing data and determine the type of sequencing data;

[0073] A preprocessing unit, which is used to perform corresponding preprocessing according to the type of sequencing data;

[0074] An analysis unit, which is used to perform resequencing analysis and de novo sequencing analysis on the preprocessed sequencing data according to the analysis type selected by the user and the preset tool software and software parameters, and to identify and annotate the bacterial whole genome.
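The three units could be wired together along the following lines. This is only a structural sketch: the class names mirror the units named in the text, but the method names, the mean-length rule, the length filter standing in for preprocessing, and the callables standing in for analyses are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SequencingData:
    reads: List[str]
    data_type: str  # "short" or "long"


class DataAcquisitionUnit:
    """Acquires reads and determines the data type (assumed mean-length rule)."""

    def __init__(self, long_read_threshold: int = 1000):
        self.long_read_threshold = long_read_threshold

    def acquire(self, reads: List[str]) -> SequencingData:
        if not reads:
            raise ValueError("no reads supplied")
        mean_length = sum(len(r) for r in reads) / len(reads)
        data_type = "long" if mean_length >= self.long_read_threshold else "short"
        return SequencingData(reads, data_type)


class PreprocessingUnit:
    """Applies type-specific preprocessing; a length filter is used as a stand-in."""

    def __init__(self, min_length: Dict[str, int]):
        self.min_length = min_length

    def preprocess(self, data: SequencingData) -> SequencingData:
        cutoff = self.min_length[data.data_type]
        return SequencingData([r for r in data.reads if len(r) >= cutoff], data.data_type)


class AnalysisUnit:
    """Runs only the analyses the user selected."""

    def __init__(self, analyses: Dict[str, Callable[[SequencingData], str]]):
        self.analyses = analyses

    def analyze(self, data: SequencingData, selected: List[str]) -> Dict[str, str]:
        return {name: self.analyses[name](data) for name in selected}
```

A caller would chain them as `analysis.analyze(pre.preprocess(acq.acquire(reads)), selected)`, which keeps each unit replaceable on its own.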

[0075] In further embodiments, there is also provided:

[0076] A computer readable instruction. When the computer instruction is executed by a processor, the method described in the first embo...


Abstract

The invention provides an automatic analysis method for bacterial whole genome sequencing data, which comprises the following steps: acquiring bacterial genome sequencing data and determining the type of the sequencing data; carrying out the corresponding preprocessing according to the type of the sequencing data; performing resequencing analysis and de novo sequencing analysis on the preprocessed sequencing data according to an analysis type selected by a user and preset tool software and software parameters; and identifying and annotating the bacterial whole genome. The scheme provides a user-friendly automated analysis method that gives researchers and clinicians without professional bioinformatics knowledge automated bioinformatics analysis steps, including sequencing quality control, resequencing and de novo assembly, identification of similar bacterial reference genomes, and bacterial genome annotation. Customized bioinformatics analysis can also be carried out on the short-read and long-read sequencing data generated by different platforms, yielding accurate analysis results.

Description

Technical field

[0001] The disclosure belongs to the technical field of gene sequencing, and in particular relates to an automatic analysis method and system for bacterial whole genome sequencing data.

Background

[0002] The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.

[0003] The widespread use of bacterial genome information requires automated workflows for genome sequencing analysis. Research on bacterial genome analysis workflows has achieved some results.

[0004] The inventors found that existing bacterial whole-genome workflows usually focus on analyzing second-generation sequencing data. They cannot meet the analysis requirements of the data generated by the rapidly developing third-generation, and even some fourth-generation, sequencing technologies, which are characterized by long reads. And the a...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B30/10
CPC: G16B30/10
Inventors: 刘健, 孙嘉良, 陈娇
Owner: NANKAI UNIV