DNA self-index interval decompression method

A decompression and self-indexing technology, classified under electrical components and code conversion, that addresses the long decompression time and large storage space required by existing methods. It shortens decompression time, reduces the storage space needed for decompressed data, and applies to a wider range of uses.

Active Publication Date: 2021-07-09
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to solve the problems that existing decompression algorithms require a long decompression time and that the decompressed data requires a large storage space. To this end, a DNA self-index interval decompression method is proposed.

Examples

Specific Embodiment 1

[0015] Specific Embodiment 1: This embodiment is described with reference to Figure 1. The DNA self-index interval decompression method of this embodiment is implemented through the following steps:

[0016] Step 1. Input the sequence data file to be decompressed, and configure the index interval parameter ([start, end]) and the decompression output mode parameter (mode);

[0017] Step 2. According to the index interval parameter, determine the range of the sequence data file that needs to be decompressed;

[0018] Step 3. According to the header information of the sequence data file to be decompressed, determine, within the range that needs to be decompressed, the bit information of the sequenced short-read bases (short-read "column" for short), the sequencing quality score bit information of the bases that can be aligned to the reference genome (alignment quality score for short), sequencing quality score b...
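Steps 1 and 2 can be illustrated with a minimal Python sketch. The names `DecompressConfig` and `resolve_interval`, the 0-based inclusive coordinates, and the clamping of `end` to the data length are all assumptions made for illustration; the source does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecompressConfig:
    """Step 1 inputs (hypothetical container, not from the patent text)."""
    path: str      # sequence data file to be decompressed
    start: int     # first position of the index interval (assumed 0-based)
    end: int       # last position of the index interval (assumed inclusive)
    mode: int = 0  # decompression output mode parameter


def resolve_interval(cfg: DecompressConfig, total_length: int) -> tuple[int, int]:
    """Step 2: turn [start, end] into a valid range inside the data.

    Assumption: an interval that runs past the data is clamped rather
    than rejected; the source does not say which behaviour is intended.
    """
    if cfg.start > cfg.end:
        raise ValueError("start must not exceed end")
    lo = max(cfg.start, 0)
    hi = min(cfg.end, total_length - 1)
    if lo > hi:
        raise ValueError("interval lies outside the data")
    return lo, hi
```

For example, `resolve_interval(DecompressConfig("reads.lyz", 100, 5000), 1000)` clamps the request to the last available position; the file name here is only a placeholder.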

Specific Embodiment 2

[0030] Specific Embodiment 2: This embodiment further details Specific Embodiment 1. The decompression output mode parameter determines the type of data that is decompressed and output.

Specific Embodiment 3

[0031] Specific Embodiment 3: This embodiment further details Specific Embodiment 2. When the decompression output mode parameter is set to 1, the decompressed output is a gene sequence. When it is set to 2, the decompressed output is a short-read sequence. When it is set to 3, the decompressed output is a whole genome sequence.

[0032] The default value is 0, in which case decompression proceeds as for short-read sequences.
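The mode values described in this embodiment can be written as a small lookup. This is only a sketch: the names `OutputMode` and `output_type` are hypothetical, and rejecting values outside 0 to 3 is an assumption, since the source does not say how other values are handled.

```python
from enum import IntEnum


class OutputMode(IntEnum):
    """Values of the decompression output mode parameter (names are illustrative)."""
    DEFAULT = 0        # treated as short-read output, per paragraph [0032]
    GENE_SEQUENCE = 1
    SHORT_READ = 2
    WHOLE_GENOME = 3


_OUTPUT_TYPE = {
    OutputMode.DEFAULT: "short-read sequence",
    OutputMode.GENE_SEQUENCE: "gene sequence",
    OutputMode.SHORT_READ: "short-read sequence",
    OutputMode.WHOLE_GENOME: "whole genome sequence",
}


def output_type(mode: int = 0) -> str:
    """Return the kind of data that would be output for a given mode.

    OutputMode(mode) raises ValueError for values outside 0..3
    (assumed behaviour; not specified in the source).
    """
    return _OUTPUT_TYPE[OutputMode(mode)]
```

Calling `output_type()` with no argument follows the default path, while `output_type(3)` selects whole genome output.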


Abstract

The invention discloses a DNA self-index interval decompression method and belongs to the technical field of decompressing compressed DNA data. The invention solves the problems that existing decompression algorithms require a long decompression time and that the decompressed data requires a large storage space. With the self-index interval decompression algorithm, the decompression range can be selected as needed; compared with the global, static TPBWT decompression algorithm, decompression time is greatly shortened and the storage space for decompressed data is reduced. Compared with traditional decompression algorithms, the algorithm is more flexible: data with different meanings can be decompressed according to different requirements, which gives it wider applicability. The invention can be applied to the decompression of compressed DNA data.
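The reason interval decompression touches less data than global decompression can be shown with a generic sketch. It is not the TPBWT or LYZip algorithm: it swaps in plain `zlib` over fixed-size blocks, with the block position acting as a simple index, and the names `compress_blocks`, `decompress_interval`, and `BLOCK` are hypothetical.

```python
import zlib

BLOCK = 1000  # hypothetical block size in bases


def compress_blocks(seq: str, block: int = BLOCK) -> list[bytes]:
    """Compress seq in fixed-size blocks; a block's list position is its index."""
    return [
        zlib.compress(seq[i:i + block].encode("ascii"))
        for i in range(0, len(seq), block)
    ]


def decompress_interval(blocks: list[bytes], start: int, end: int,
                        block: int = BLOCK) -> str:
    """Return positions start..end (0-based, inclusive).

    Only the blocks that overlap the interval are inflated, so time and
    memory scale with the interval rather than with the whole data set.
    """
    first, last = start // block, end // block
    chunk = b"".join(zlib.decompress(b) for b in blocks[first:last + 1])
    offset = start - first * block
    return chunk[offset:offset + (end - start + 1)].decode("ascii")
```

With a 10,000-base sequence split into ten blocks, a request for positions 1500 to 2499 inflates only two of them, whereas global decompression would inflate all ten.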

Description

Technical field

[0001] The invention relates to the technical field of decompressing compressed DNA data, and in particular to a DNA self-index interval decompression method.

Background art

[0002] With the development of DNA sequencing technology, biomedical research faces the problem of how to store and transmit DNA data. Compressing DNA data and decompressing it when needed has become one of the important ways to address this problem.

[0003] After the LYZip tool compresses short-read sequencing data based on the TPBWT algorithm, the existing decompression algorithm can only perform global, static decompression. Although it can decompress the DNA data, decompression takes a long time and the decompressed data requires a large amount of storage space. A method that reduces both decompression time and storage space is therefore very necessary.

Contents of th...


Application Information

IPC(8): H03M7/30
CPC: H03M7/30
Inventor: 李杨, 刘博, 王亚东
Owner HARBIN INST OF TECH