DNA data storage coding and decoding method

An encoding and decoding technology applied to a DNA data storage process based on data compression and error-correction coding, in the field of large-capacity data storage. It addresses the problems of high data storage cost and slow reading speed, and it increases error-correction capacity, reduces data redundancy, and improves data storage and reading efficiency.

Active Publication Date: 2019-02-01
NANJING GENSCRIPT BIOTECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to address the existing problems of high data storage cost and slow reading speed by providing a complete and effective encoding framework for DNA data storage, together with a joint encoding process that combines compression and error-correction algorithms, thereby realizing a data preservation method that uses DNA sequences as the storage medium.

Examples

Embodiment Construction

[0022] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0023] As shown in Figures 1-8.

[0024] An encoding and decoding method for DNA data storage provides a complete and effective encoding framework for DNA data storage, together with a joint encoding process that combines compression and error-correction algorithms, realizing a data preservation method that uses DNA sequences as the storage medium, as shown in Figure 1. It not only improves the efficiency of data storage and reading, but also effectively corrects the errors generated during DNA data storage and reading. Its data encoding includes the following steps: 1) Data compression: first pack one or more electronic documents into a single file, A.tar, in TAR format, and then apply a second stage of compression to the TAR file with the Lempel-Ziv-Markov chain algorithm (LZMA) to generate A.tar.lzma. 2) Da...
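
The data compression step in [0024] (pack into TAR, then compress with LZMA) can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the patent's implementation: the function names `pack_and_compress` and `decompress_and_unpack` are hypothetical, the archive is built in memory rather than written to A.tar and A.tar.lzma, and the choice of the legacy `.lzma` container (`lzma.FORMAT_ALONE`) is an assumption made to match the `.lzma` file extension.

```python
import io
import lzma
import tarfile


def pack_and_compress(files: dict[str, bytes]) -> bytes:
    """Pack named byte strings into an in-memory TAR archive, then LZMA-compress it.

    Hypothetical helper: mirrors the A.tar -> A.tar.lzma step without touching disk.
    """
    tar_buffer = io.BytesIO()
    with tarfile.open(fileobj=tar_buffer, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    # FORMAT_ALONE is the legacy .lzma container (an assumption, see lead-in).
    return lzma.compress(tar_buffer.getvalue(), format=lzma.FORMAT_ALONE)


def decompress_and_unpack(blob: bytes) -> dict[str, bytes]:
    """Reverse of pack_and_compress: LZMA-decompress, then read every file in the TAR."""
    tar_bytes = lzma.decompress(blob)  # the default auto-detects .xz and .lzma
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        return {
            member.name: tar.extractfile(member).read()
            for member in tar.getmembers()
            if member.isfile()
        }
```

Decoding reverses the two stages in the opposite order, which is why `decompress_and_unpack` decompresses first and unpacks second.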

Abstract

The invention discloses a DNA data storage coding and decoding method comprising data coding and data decoding. The data coding comprises the following steps. Data compression: one or more electronic documents are first packaged into a single document. Data transcoding: the compressed document is read in binary form, and the binary data are converted to an integer-type numeric string. Adding data redundancy: error-correction coding is carried out using an RS coding system, generating an integer-type numeric string with added redundancy. Second transcoding: the integer-type numeric string with added redundancy is transcoded to a set of DNA sequences for chip synthesis. Data reading is the reverse process of data coding. Compared with other algorithms, the framework interfaces better with the Custom Array high-throughput synthesis platform through a brand-new 5-bit coding framework, and the coding potential is 1.67. The algorithm combines the TAR and LZMA compression algorithms with the RS coding system to achieve a good balance between reducing data redundancy and increasing error-correction capability.
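
The two transcoding steps in the abstract (binary data to an integer string, then integers to DNA) can be illustrated with a short Python sketch. Everything specific here is an assumption: the text above does not give the actual mapping, so the sketch assumes that each 5-bit integer (0 to 31) is written as one 3-nucleotide word, which is one reading consistent with the stated coding potential of 1.67 (5 bits / 3 nt ≈ 1.67). The codebook (the first 32 triplets in lexicographic order) and all function names are hypothetical, and the RS error-correction step is omitted.

```python
from itertools import product

BASES = "ACGT"
# Hypothetical codebook: first 32 of the 64 possible triplets, lexicographic order.
# A real design would likely choose triplets to suit synthesis constraints.
CODEBOOK = ["".join(t) for t in product(BASES, repeat=3)][:32]
DECODEBOOK = {codon: value for value, codon in enumerate(CODEBOOK)}


def bytes_to_symbols(data: bytes) -> list[int]:
    """Read bytes as a bit string and cut it into 5-bit integers (zero-padded at the end)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 5)
    return [int(bits[i:i + 5], 2) for i in range(0, len(bits), 5)]


def symbols_to_bytes(symbols: list[int], n_bytes: int) -> bytes:
    """Inverse of bytes_to_symbols; n_bytes tells the decoder where padding starts."""
    bits = "".join(f"{symbol:05b}" for symbol in symbols)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


def symbols_to_dna(symbols: list[int]) -> str:
    """Write each 5-bit integer as its 3-nucleotide word from the assumed codebook."""
    return "".join(CODEBOOK[symbol] for symbol in symbols)


def dna_to_symbols(dna: str) -> list[int]:
    """Read a DNA string three nucleotides at a time back into 5-bit integers."""
    return [DECODEBOOK[dna[i:i + 3]] for i in range(0, len(dna), 3)]
```

Under this assumed mapping, every 3 nucleotides carry 5 bits, so the density is 5/3 bits per nucleotide before any redundancy is added. In the full pipeline the RS coding step would sit between `bytes_to_symbols` and `symbols_to_dna`.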

Description

Technical field

[0001] The invention relates to data storage technology, in particular large-capacity data storage technology, and specifically to a coding framework for the DNA data storage process and to the development of a DNA data storage process based on data compression and error-correction coding.

Background technique

[0002] DNA data storage began in 1988. It refers to the use of DNA molecules to record and store data. Compared with traditional data storage methods such as tape storage and disk storage, its advantages are high data density (1 cubic millimeter of DNA can store 1 EB), long-term storage (an estimated one million years at -18°C), and low ongoing maintenance costs. To date, its disadvantages are that DNA synthesis is still expensive and data reading is slow, so DNA data storage is better suited to archival storage of big data. DNA data storage relies on the cutting-edge exploratory direction jointly created b...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B50/00, H03M7/30
CPC: H03M7/70, H03M7/30, H03M7/3068, H03M7/3086, G16B50/00, G16B50/40, G16B25/20, G16B30/00, G06F11/1076, H03M13/1515, Y10S977/704
Inventor: 樊隆, 蒋浩君, 刘家栋, 王建鹏, 盛夏, 张丽华, 吴政宪, 柳振宇
Owner: NANJING GENSCRIPT BIOTECH CO LTD