Method for information storage with DNA (Deoxyribonucleic Acid)

An information storage method based on DNA sequence technology, applied in the field of information storage. It addresses the high difficulty of sequence synthesis, sequencing and reading, discontinuous storage, and increased storage costs, so as to reduce the cost of synthesis and sequencing, improve storage and reading efficiency, and reduce data recovery errors.

Inactive Publication Date: 2017-06-13
SUZHOU HONGXUN BIOTECH CO LTD
4 Cites, 36 Cited by

AI Technical Summary

Problems solved by technology

[0007] The above-mentioned methods store digital information in binary or ternary form and are therefore universal, but storage methods based on binary and ternary coding have a low degree of information compression, complicated storage algorithms, and poor storage continuity (with the rotation coding method, once information is written, the information after the write position changes accordingly, resulting in discontinuous storage). In addition, the output DNA is too long and carries only a single index, so DNA synthesis and information recovery are error-prone; at the same time, the fourfold overlapping walking structure causes data redundancy and increases storage costs.
Long coding sequences make sequence synthesis, sequencing, and reading difficult, thus hindering their practical application [5].
[0008] In order to overcome the above problems, the applicant proposed a new informa



Examples


Embodiment 1

[0046] Embodiment 1: Storing and retrieving the digitized information of the mixed Chinese-English text "Hello, World! Hello, World!"

[0047] With reference to Figure 2, the text file (26 B) containing "Hello, World! Hello, World!", a mix of Chinese and English words and punctuation, is first converted into quaternary BitDNA coding sequence data (the complete DNA sequence) according to the method of the present invention, as follows:

[0048] TACATCTTTCGATCGATCGGACGAACAATGTGTCGGTGACTCGATCTAACATGCTACGGTCCAAGCTTCCTTCGGTGCGGCGGACAGAGCTACGCACTTCGCTGCTTTCAGAGCGGCGGACAAT.
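As a hedged illustration of how such a sequence could be read back, the sketch below decodes the complete sequence above using the two-bit mapping given in the abstract (00→A, 01→T, 10→C, 11→G). The function name `dna_to_bytes`, the most-significant-pair-first order within each byte, and the use of UTF-8 for the text are assumptions, not details stated in this excerpt.

```python
# Assumed mapping, taken from the abstract: 00->A, 01->T, 10->C, 11->G.
BASE_TO_BITS = {"A": 0b00, "T": 0b01, "C": 0b10, "G": 0b11}


def dna_to_bytes(seq: str) -> bytes:
    """Decode a DNA string, 4 bases per byte, most significant pair first (assumed)."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASE_TO_BITS[base]
        out.append(value)
    return bytes(out)


# Complete DNA sequence as printed in paragraph [0048].
FULL_SEQUENCE = (
    "TACATCTTTCGATCGATCGGACGAACAATGTGTCGGTGACTCGA"
    "TCTAACATGCTACGGTCCAAGCTTCCTTCGGTGCGGCGGACAGA"
    "GCTACGCACTTCGCTGCTTTCAGAGCGGCGGACAAT"
)

decoded = dna_to_bytes(FULL_SEQUENCE)
print(decoded[:13].decode("ascii"))  # Hello, world!
print(decoded.decode("utf-8"))       # UTF-8 is an assumption
```

Under these assumptions the first 13 bytes decode to the English phrase and the remaining bytes decode as UTF-8 to the Chinese phrase, which suggests that at least for this short text the encoding is the direct two-bit mapping.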

[0049] Break the entire DNA sequence above into three DNA fragments, which are as follows:

[0050] DNA fragment 1: TACATCTTTCGATCGATCGGACGAACAATGTGTCGGTGACTCGA;

[0051] DNA fragment 2: TCTAACATGCTACGGTCCAAGCTTCCTTCGGTGCGGCGGACAGA;

[0052] DNA fragment 3: GCTACGCACTTCGCTGCTTTCAGAGCGGCGGACAAT.
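The splitting step can be sketched as a plain non-overlapping chunking of the complete sequence. This is a minimal sketch, assuming a fixed fragment size of 44 nt (the size used in the embodiments) and a final fragment that simply holds whatever remains; `split_fragments` is a hypothetical helper name.

```python
def split_fragments(seq: str, size: int = 44) -> list[str]:
    """Cut seq into consecutive, non-overlapping pieces of at most `size` bases."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [seq[i:i + size] for i in range(0, len(seq), size)]


# Complete DNA sequence as printed in paragraph [0048].
FULL_SEQUENCE = (
    "TACATCTTTCGATCGATCGGACGAACAATGTGTCGGTGACTCGA"
    "TCTAACATGCTACGGTCCAAGCTTCCTTCGGTGCGGCGGACAGA"
    "GCTACGCACTTCGCTGCTTTCAGAGCGGCGGACAAT"
)

fragments = split_fragments(FULL_SEQUENCE)
for number, fragment in enumerate(fragments, start=1):
    print(number, len(fragment), fragment)
```

Because the pieces do not overlap, joining them in index order reproduces the complete sequence, which is what makes recovery from the indexed fragments possible.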

[0053] According to the output DNA format, the above-mentioned 3 DNA fragments were constructed into 3 sequences with ...

Embodiment 2

[0062] Embodiment 2: Storing and retrieving the digitized information of the picture "emoji.jpg" (3.83 KB)

[0063] The emoji image file "emoji.jpg" (3.83 KB, JPG format) shown in Figure 3 is converted into quaternary BitDNA encoded data according to the encoding method of the present invention, yielding a complete DNA sequence of 15708 bases, as shown in sequence 1;

[0064] The complete DNA sequence was divided by the non-overlapping splitting method into 357 DNA fragments of 44 nt each, which were built into 357 output DNA sequences of 100 nt each according to the output DNA format (flanking primer sequence length 20 nt and index coding sequence length 8 nt), thereby completing the conversion of the digital information into DNA sequences; then, based on the 357 output DNA sequences obtained above, an oligonucleotide synthesizer is used to prepare a DNA library, which is stored on a gene chip, thus completing t...
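The stated lengths are consistent with an output layout of primer (20) + index (8) + fragment (44) + index (8) + primer (20) = 100 nt. The sketch below assembles one such output sequence. It is illustrative only: the primer strings are placeholders, the base-4 index encoding with digits A, T, C, G is an assumption, and repeating the same index on both sides is also an assumption, since this excerpt does not give those details.

```python
DIGITS = "ATCG"  # quaternary digits 0..3, following the abstract's 00/01/10/11 order


def encode_index(n: int, width: int = 8) -> str:
    """Write n as a fixed-width base-4 number using A/T/C/G as digits (assumed scheme)."""
    if not 0 <= n < 4 ** width:
        raise ValueError("index does not fit in the index field")
    digits = []
    for _ in range(width):
        digits.append(DIGITS[n % 4])
        n //= 4
    return "".join(reversed(digits))


def build_output(fragment: str, index: int, fwd_primer: str, rev_primer: str) -> str:
    """Assemble primer + index + fragment + index + primer."""
    idx = encode_index(index)
    return fwd_primer + idx + fragment + idx + rev_primer


# Placeholder primers (hypothetical, 20 nt each); the real primers are not given here.
FWD_PRIMER = "ACGT" * 5
REV_PRIMER = "TGCA" * 5

example = build_output("A" * 44, 5, FWD_PRIMER, REV_PRIMER)
print(len(example))     # 100
print(encode_index(5))  # AAAAAATT
```

An 8 nt quaternary index can distinguish 4^8 = 65536 fragments, comfortably more than the 357 needed here.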

Embodiment 3

[0066] Embodiment 3: Storing and retrieving the digitized information of the audio file "Example Audio-Laughter.mp3" (4.18 KB)

[0067] The sample audio file "Example Audio-Laughter.mp3" (4.18 KB, MP3 format) is converted into quaternary BitDNA coded data according to the coding method of the present invention, yielding a complete DNA sequence of 17148 bases, as shown in sequence 2;

[0068] The complete DNA sequence was divided by the non-overlapping splitting method into 389 DNA fragments of 44 nt each and 1 DNA fragment of 32 nt, which were built into 390 output DNA sequences of 100 nt each according to the output DNA format (flanking primer sequence length 20 nt and index coding sequence length 8 nt), thereby completing the conversion of the digital information into DNA sequences; then, based on the 390 output DNA sequences obtained above, a DNA library is prepared by ...
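The fragment counts quoted in the embodiments follow from integer division of the sequence length by the fragment size. A small sketch, with `fragment_plan` as a hypothetical helper name:

```python
def fragment_plan(total_bases: int, size: int = 44) -> tuple[int, int, int]:
    """Return (full fragments, length of the short last fragment, total fragments)."""
    full, rest = divmod(total_bases, size)
    count = full + (1 if rest else 0)
    return full, rest, count


print(fragment_plan(17148))  # (389, 32, 390)
print(fragment_plan(15708))  # (357, 0, 357)
```

For the 17148-base sequence the last fragment is 12 nt short of 44 nt. This excerpt does not say how that gap is filled to reach a 100 nt output sequence, so the sketch only reports the remainder.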


Abstract

The invention relates to a method for information storage with DNA (deoxyribonucleic acid). The method comprises the following steps: (1) converting the binary information of an original computer file into quaternary information and encoding it into a complete DNA sequence, wherein the binary codes 00, 01, 10 and 11 are converted into the four deoxyribonucleotides A, T, C and G, respectively; (2) splitting the complete DNA sequence into a plurality of DNA fragments, and constructing output DNA sequences 90-110 nt in length, each comprising an interpolation nucleotide encoding sequence consisting of a DNA fragment, flanking primer sequences at both ends, and index encoding sequences on the inner sides of the primer sequences; (3) synthesizing artificial DNA sequences according to the output DNA sequences, and storing them. The method has the notable advantages of good universality, simplified calculation, improved continuity, storage efficiency and density of DNA information storage, reduced error rates, and lower sequence synthesis and detection costs.
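The bit-to-base mapping in step (1) can be illustrated with a short encoder. This is a minimal sketch of the plain two-bit mapping only; the full BitDNA encoding may involve further steps not visible in this excerpt. The function name `bytes_to_dna` and the choice to read each byte from its most significant bit pair downward are assumptions.

```python
# Mapping from the abstract: 00->A, 01->T, 10->C, 11->G (position in string = bit pair value).
BITS_TO_BASE = "ATCG"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as 4 bases, most significant bit pair first (assumed order)."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


print(bytes_to_dna(b"Hello"))  # TACATCTTTCGATCGATCGG
```

Each byte becomes exactly four bases, so a file of n bytes maps to 4n bases under this mapping alone.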

Description

Technical field

[0001] The invention belongs to the technical field of information storage, and in particular relates to a method for storing information using artificially synthesized DNA.

Background

[0002] In recent years, global digital information has been growing explosively. It is estimated that by 2017 the global demand for digital data preservation will exceed 16 zettabytes (ZB). It is therefore urgent to develop reliable, large-scale storage media for managing digital information. However, the capacity of existing storage media cannot keep up with the growth rate of digitized information. At present, the main storage media are magnetic and optical: magnetic media are the densest storage form currently available on the market, with tapes storing up to 185 TB of data at a storage density of about 10 GB/mm³; recently, there are research reports that optical discs store 1 PB of data, about 100 GB/mm³ Feasibility of sto...


Application Information

IPC(8): G06F19/28
CPC: G16B50/00, G16B50/50
Inventor 杨平, 蔡晓辉, 钟云鹏, 盛付旭, 李彦敏, 祁姗姗, 齐金才, 田净净, 朱沛煌
Owner SUZHOU HONGXUN BIOTECH CO LTD