Application of Docker technology in high-throughput sequencing data analysis

Docker technology is applied to high-throughput sequencing data analysis in the fields of biocomputing and molecular biology. It addresses the problems of installing, configuring, and migrating analysis software and of differences in computing resources, enabling efficient mining and analysis of sequencing data and reducing processing time.

Status: Inactive
Publication Date: 2018-01-16
武汉古奥基因科技有限公司

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the limitations that biological big data analysis currently faces in the installation, configuration, and migration of analysis software and in differences between the computing resources it depends on, by providing an application of Docker technology in high-throughput sequencing data analysis.



Examples


Embodiment 1

[0022] Example 1: Application of Docker technology in chromatin immunoprecipitation high-throughput sequencing (ChIP-seq) data analysis

[0023] (1) Construction of the ChIP-Seq V0 base image:

[0024] Ⅰ. Build the base image ChIP-Seq V0 with the docker commit command, then start the image in interactive mode with docker run;

[0025] Ⅱ. In interactive mode, install the software and languages required for ChIP-seq analysis and save the base image. The main software and languages installed are listed in Table 1 below.

[0026] Table 1. Main software and languages required for ChIP-seq analysis

[0027]

[0028]
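
Step (1) can be sketched as a small Python helper that only assembles the docker command lines, without running them. This is a minimal sketch under stated assumptions: the starting image `ubuntu:16.04` and the container name `chipseq-build` are hypothetical, and the image called ChIP-Seq V0 in the text is written as `chip-seq:v0` because Docker repository names must be lowercase. The source lists docker commit before docker run; the sketch assumes the more common order, in which a container is started interactively, the software is installed by hand, and the container is then committed.

```python
import shlex

# Hypothetical names: the source does not state the starting image or the
# container name. "ChIP-Seq V0" is lowercased to form a valid image reference.
START_IMAGE = "ubuntu:16.04"
BUILD_CONTAINER = "chipseq-build"
BASE_IMAGE = "chip-seq:v0"


def interactive_run_cmd(image, container):
    """Return the argv that opens an interactive shell in a named container."""
    return ["docker", "run", "-it", "--name", container, image, "/bin/bash"]


def commit_cmd(container, image):
    """Return the argv that saves a container's current state as an image."""
    return ["docker", "commit", container, image]


def base_image_plan():
    """Return the commands in the order they would be typed by hand."""
    return [
        # Install the software from Table 1 inside this shell, then exit.
        interactive_run_cmd(START_IMAGE, BUILD_CONTAINER),
        # Snapshot the stopped container as the base image.
        commit_cmd(BUILD_CONTAINER, BASE_IMAGE),
    ]


if __name__ == "__main__":
    for argv in base_image_plan():
        print(shlex.join(argv))
```

Printing the commands rather than executing them keeps the sketch usable on a machine without Docker; each list could be passed to `subprocess.run` once the names have been adapted.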

[0029] (2) Construction of ChIP-Seq analysis process and construction of ChIP-Seq V1.0 image:

[0030] In the base image ChIP-Seq V0, with the above software and languages installed, build the analysis pipeline ChIP-Seq.sh as shown in Figure 1, then commit it with docker commit to generate a new image, ChIP-Seq V1.0.
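
One way step (2) might look is sketched below, again as Python that only builds the docker command lines. The container name `chipseq-pipeline`, the destination path `/usr/local/bin/ChIP-Seq.sh`, and the lowercase image references `chip-seq:v0` and `chip-seq:v1.0` are assumptions made for illustration; only the script name ChIP-Seq.sh and the use of docker commit come from the text. The text does not say how the script gets into the image, so copying it in with `docker cp` is an assumption as well.

```python
import shlex

# Hypothetical names and paths. Only ChIP-Seq.sh and the image names
# (lowercased to form valid image references) come from the text.
BASE_IMAGE = "chip-seq:v0"
PIPELINE_IMAGE = "chip-seq:v1.0"
CONTAINER = "chipseq-pipeline"
SCRIPT_DEST = "/usr/local/bin/ChIP-Seq.sh"


def pipeline_image_plan(script="ChIP-Seq.sh"):
    """Return the docker commands that add the script and commit a new image."""
    return [
        # Create (but do not start) a container from the base image.
        ["docker", "create", "--name", CONTAINER, BASE_IMAGE, "/bin/bash"],
        # Copy the pipeline script from the host into the container.
        ["docker", "cp", script, f"{CONTAINER}:{SCRIPT_DEST}"],
        # Snapshot the container, script included, as the new image.
        ["docker", "commit", CONTAINER, PIPELINE_IMAGE],
    ]


if __name__ == "__main__":
    for argv in pipeline_image_plan():
        print(shlex.join(argv))
```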

...


Abstract

The invention discloses an application of Docker technology in high-throughput sequencing data analysis, belonging to the fields of biocomputing and molecular biology. The application comprises the following steps: first, a base image is built with Docker, and the software and languages required for analysis are installed into the image in interactive mode; an analysis pipeline is then built and a biological cloud computing platform image is generated; this image is saved as a tar package, uploaded, and imported into a directory on the server where the analysis is to be run; finally, the data and the annotation files to be analyzed are mounted into the image, and the configured pipeline is invoked to analyze the data. By applying Docker technology to high-throughput sequencing data analysis, the invention addresses the limitations that existing biological big data analysis faces in software installation, configuration, and migration and in differences between the computing resources it depends on, so that researchers can mine and analyze biological sequencing data efficiently and processing time is reduced.
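
The deployment steps in the abstract (save the image as a tar package, import it on the analysis server, mount the data and annotation files, invoke the pipeline) could be sketched as follows. This is an illustrative sketch, not the patent's own commands: the archive name, the container mount points `/data` and `/annotation`, the script path, and the image reference `chip-seq:v1.0` are all hypothetical. It assumes the `docker save` and `docker load` pair; the abstract's word "importing" could also refer to `docker import`, which pairs with `docker export` instead. The upload step is left out because the source names no server or transfer tool.

```python
import shlex

# Hypothetical names and paths, chosen only for illustration.
IMAGE = "chip-seq:v1.0"
ARCHIVE = "chip-seq_v1.0.tar"
SCRIPT = "/usr/local/bin/ChIP-Seq.sh"


def save_cmd(image=IMAGE, archive=ARCHIVE):
    """Return the argv that writes an image to a tar archive."""
    return ["docker", "save", "-o", archive, image]


def load_cmd(archive=ARCHIVE):
    """Return the argv that restores an image from a tar archive."""
    return ["docker", "load", "-i", archive]


def run_cmd(data_dir, annotation_dir, image=IMAGE, script=SCRIPT):
    """Return the argv that mounts both directories and runs the pipeline."""
    return [
        "docker", "run", "--rm",
        "-v", f"{data_dir}:/data",                  # sequencing data
        "-v", f"{annotation_dir}:/annotation:ro",   # annotation files, read-only
        image, "bash", script,
    ]


if __name__ == "__main__":
    # Host directories below are placeholders for the server's real paths.
    for argv in (save_cmd(), load_cmd(), run_cmd("/srv/reads", "/srv/annotation")):
        print(shlex.join(argv))
```

Mounting the annotation directory read-only is a design choice of this sketch; the data directory is left writable on the assumption that the pipeline writes its results next to its inputs.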

Description

Technical field

[0001] The invention relates to the fields of biocomputing and molecular biology, in particular to the application of Docker technology in high-throughput sequencing data analysis.

Background

[0002] High-throughput sequencing technology, also known as next-generation sequencing (NGS), can sequence hundreds of thousands or even millions of DNA molecules in parallel in a single run. As high-throughput sequencing techniques such as transcriptome sequencing, genome resequencing, de novo genome sequencing, exome sequencing, and metagenomic sequencing have matured, the resulting biological data have grown in type, volume, and complexity. How to analyze and use these biological big data effectively has become both an opportunity and a challenge for modern biology.

[0003] The data analysis software used in the high-throughput data analysi...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/28
Inventor: 袁晓辉
Owner: 武汉古奥基因科技有限公司