An optimization method for DNA storage encoding based on a multiverse algorithm with k-means clustering

A k-means clustering and optimization technology, applied in the fields of DNA computers, computation, and genetic models, which can solve the problem of the high cost of reading and writing DNA data.

Active Publication Date: 2021-09-10
DALIAN UNIV

AI Technical Summary

Problems solved by technology

However, the cost of reading and writing DNA data remains high.

Examples

Embodiment 1

[0024] This embodiment is implemented on the premise of the technical solution of the present invention. A detailed implementation and a specific operating procedure are given, but the protection scope of the present invention is not limited to the following embodiment. In this example, the DNA coding length n is 6, the Hamming distance constraint is d ≥ 4, and the full discontinuity constraint and GC content constraint are as described above.
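The three constraints in this embodiment can be checked with a few small predicates. The sketch below is illustrative only: the function names are hypothetical, the GC content constraint is assumed to mean that exactly half of the bases are G or C, and the full discontinuity constraint is assumed to mean that no two adjacent bases are identical. The source defines both constraints in a passage that is not reproduced here, so these readings may differ from the patent's.

```python
from typing import Iterable


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def gc_ok(seq: str, target: float = 0.5) -> bool:
    """ASSUMED GC content constraint: fraction of G/C equals `target`."""
    return sum(base in "GC" for base in seq) == target * len(seq)


def no_runs(seq: str) -> bool:
    """ASSUMED full discontinuity constraint: no two adjacent bases match."""
    return all(x != y for x, y in zip(seq, seq[1:]))


def can_join(seq: str, code_set: Iterable[str], d: int = 4) -> bool:
    """True if `seq` meets the per-sequence constraints and is at
    Hamming distance >= d from every sequence already in `code_set`."""
    return gc_ok(seq) and no_runs(seq) and all(
        hamming(seq, other) >= d for other in code_set
    )


if __name__ == "__main__":
    print(can_join("ACGTCA", []))
    print(can_join("TGCAGT", ["ACGTCA"]))
```

With n = 6 and d ≥ 4 as in this embodiment, a candidate that differs from an accepted sequence in three or fewer positions is rejected even if it passes the per-sequence checks.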

[0025] Step 1: Initialize the population by generating 500 DNA coding sequences of length 6, and initialize the parameters required by the algorithm. For the wormhole existence probability (WEP), min is set to 0.2 and max is set to 1; for the travel distance rate (TDR), p is set to 6.
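As a rough illustration of Step 1, the sketch below draws a random initial population and computes the two schedule parameters. The WEP and TDR formulas are those of the standard multi-verse optimizer, which is assumed here because the excerpt gives only the parameter values. The iteration count `max_iter`, the random seed, and all function names are hypothetical.

```python
import random

BASES = "ACGT"


def init_population(size: int = 500, n: int = 6, seed: int = 0) -> list:
    """Draw `size` random DNA sequences of length `n`."""
    rng = random.Random(seed)
    return ["".join(rng.choice(BASES) for _ in range(n)) for _ in range(size)]


def wep(l: int, max_iter: int, wep_min: float = 0.2, wep_max: float = 1.0) -> float:
    """Wormhole existence probability (standard MVO form, assumed):
    grows linearly from `wep_min` towards `wep_max` over the run."""
    return wep_min + l * (wep_max - wep_min) / max_iter


def tdr(l: int, max_iter: int, p: float = 6.0) -> float:
    """Travel distance rate (standard MVO form, assumed):
    shrinks from 1 to 0 as iteration `l` approaches `max_iter`."""
    return 1.0 - (l ** (1.0 / p)) / (max_iter ** (1.0 / p))


if __name__ == "__main__":
    population = init_population()
    print(len(population), population[:3])
    for l in (1, 50, 100):  # hypothetical run of 100 iterations
        print(l, round(wep(l, 100), 3), round(tdr(l, 100), 3))
```

Under these formulas a larger p makes TDR fall faster in early iterations, which shifts the search from exploration to local refinement sooner.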

[0026] Step 2: Use the multiverse algorithm to search the initial population. First initialize the fitness of the universe population and sort the universes by fitness, then select the universe with the best fitness and the ...

Abstract

The invention discloses a DNA storage code optimization method based on a multiverse algorithm with K-means clustering. To construct an optimal set of DNA coding sequences that satisfies the combined constraints, the method first constructs a certain number of DNA sequences as the initial population, then evaluates and sorts the population by fitness. Second, the resulting DNA coding sequences are optimized with the k-means clustering algorithm and wormhole crossover to obtain DNA coding sequences with high fitness. Then, constraint comparison determines whether each sequence joins the set of candidate solutions. Finally, the optimal DNA coding sequences are output. This method can find a larger number of DNA coding sequences.
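The constraint-comparison step can be pictured as a greedy filter: a candidate joins the solution set only if it stays far enough from every sequence already accepted. The sketch below is a simplified illustration, not the patented procedure. It checks the Hamming distance constraint only, visits candidates in plain lexicographic order instead of an order produced by the multiverse search, and the names `build_code_set` and `min_dist` are hypothetical.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def build_code_set(candidates, min_dist: int = 4) -> list:
    """Greedily accept each candidate whose Hamming distance to every
    previously accepted sequence is at least `min_dist`."""
    accepted = []
    for seq in candidates:
        if all(hamming(seq, kept) >= min_dist for kept in accepted):
            accepted.append(seq)
    return accepted


if __name__ == "__main__":
    # Enumerate all 4**6 sequences of length 6 as the candidate stream.
    all_seqs = ("".join(p) for p in product("ACGT", repeat=6))
    codes = build_code_set(all_seqs)
    print(len(codes), codes[:5])
```

The size of the accepted set depends on the order in which candidates arrive, which is one reason a search heuristic that proposes candidates in a better order may end up with a larger set.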

Description

Technical field

[0001] The invention relates to swarm intelligence optimization algorithms and DNA storage coding. Specifically, a multiverse algorithm, a K-means clustering algorithm, and wormhole crossover are used to optimize DNA coding sequences. The invention belongs to the field of coding design in DNA storage.

Background technique

[0002] The earliest example of DNA storage is generally considered to be the Microvenus project initiated by Joe Davis, which aimed to store non-biological data such as images in DNA. The encoding is based on the molecular sizes of the bases in the order C, T, A, G (C=1, T=2, A=3, G=4), and the four bases are assigned as phase-transition values rather than delta values. Each base represents how many times a binary bit (0 or 1) is repeated before it switches to the other bit, a technique used for compressed storage in computing. Equivalently, C=X, T=XX, A=XXX, G=XXXX. For example, 10101→CCCCC and 100101→CTCCC. However, there is a problem when decoding: C can be decoded into 0 or 1, which lead...
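The Microvenus mapping and its decoding ambiguity can be made concrete with a short sketch. It assumes a plain run-length reading of the scheme (C, T, A, G stand for runs of 1, 2, 3, 4 identical bits) and, as a further simplification, rejects runs longer than four. The function names are hypothetical.

```python
from itertools import groupby

RUN_TO_BASE = {1: "C", 2: "T", 3: "A", 4: "G"}
BASE_TO_RUN = {base: run for run, base in RUN_TO_BASE.items()}


def encode(bits: str) -> str:
    """Replace each run of identical bits with the base for its length."""
    out = []
    for _, group in groupby(bits):
        run = len(list(group))
        if run not in RUN_TO_BASE:
            raise ValueError("runs longer than 4 are not handled in this sketch")
        out.append(RUN_TO_BASE[run])
    return "".join(out)


def decode(dna: str, first_bit: str) -> str:
    """Expand each base into a run, alternating bits from `first_bit`.
    The DNA string alone does not say which bit comes first."""
    bit, out = first_bit, []
    for base in dna:
        out.append(bit * BASE_TO_RUN[base])
        bit = "1" if bit == "0" else "0"
    return "".join(out)


if __name__ == "__main__":
    dna = encode("1100010")
    print(dna)
    print(decode(dna, "1"), decode(dna, "0"))
```

Because the encoded string carries only run lengths, `decode` needs the first bit supplied from outside: the same DNA string expands to two complementary bit strings.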

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G06N3/12
CPC: G06N3/123, G06F18/23213
Inventor: 王宾, 曹犇, 周士华, 张强, 魏小鹏
Owner: DALIAN UNIV