DNA storage coding optimization method based on K-means clustering multivariate universe algorithm

A DNA storage coding optimization method that combines K-means clustering with a multiverse optimization algorithm, applicable to DNA computing, computation, and genetic models. It addresses the high cost of reading and writing DNA data, and offers fast iteration, higher average fitness, and faster convergence.

Active Publication Date: 2019-12-03
DALIAN UNIV

AI Technical Summary

Problems solved by technology

However, the cost of reading and writing DNA data remains high.



Examples


Embodiment 1

[0024] This embodiment is implemented on the basis of the technical solution of the present invention, and a detailed implementation and specific operating procedure are given; the protection scope of the invention is not, however, limited to the following embodiment. In this example, the DNA coding length n is 6, the Hamming distance constraint is d ≥ 4, and the full discontinuity constraint and the GC content constraint are as described above.
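The three constraints named in this paragraph can be sketched as simple predicates. This is an illustration only: the excerpt does not reproduce the constraint definitions, so the sketch assumes that the full discontinuity constraint means no two adjacent bases are identical and that the GC content constraint means exactly half of the bases are G or C. All function names are hypothetical.

```python
N = 6      # coding length n from the embodiment
D_MIN = 4  # Hamming distance constraint d >= 4 from the embodiment


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def has_balanced_gc(seq: str) -> bool:
    """Assumed GC constraint: exactly half of the bases are G or C."""
    gc_count = sum(base in "GC" for base in seq)
    return 2 * gc_count == len(seq)


def has_no_adjacent_repeat(seq: str) -> bool:
    """Assumed discontinuity constraint: no two neighbouring bases are equal."""
    return all(x != y for x, y in zip(seq, seq[1:]))


def satisfies_sequence_constraints(seq: str) -> bool:
    """Check the two per-sequence constraints (GC content and discontinuity)."""
    return len(seq) == N and has_balanced_gc(seq) and has_no_adjacent_repeat(seq)


a, b = "ACGTAC", "CATGCA"
print(satisfies_sequence_constraints(a), satisfies_sequence_constraints(b))
print(hamming(a, b) >= D_MIN)
```

The Hamming distance is a pairwise property, so it is checked between sequences, while the other two constraints are checked on each sequence alone.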

[0025] Step 1: Initialize the population by generating 500 DNA coding sequences of length 6, and initialize the parameters required by the algorithm: for the wormhole existence probability (WEP), the minimum is 0.2 and the maximum is 1; for the travel distance rate (TDR), p is 6.
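To show how these parameters are typically used, the sketch below computes WEP and TDR with the schedules from the standard multi-verse optimizer: WEP rises linearly from its minimum to its maximum, and TDR falls from 1 towards 0 at a rate controlled by p. The excerpt does not give the patent's own formulas, so treating them as identical to the standard ones is an assumption, and `max_iterations` is a hypothetical parameter that the embodiment does not state.

```python
def wormhole_existence_probability(iteration: int, max_iterations: int,
                                   wep_min: float = 0.2,
                                   wep_max: float = 1.0) -> float:
    """Linear increase from wep_min to wep_max over the run (standard MVO form)."""
    return wep_min + iteration * (wep_max - wep_min) / max_iterations


def travel_distance_rate(iteration: int, max_iterations: int,
                         p: float = 6.0) -> float:
    """Decrease from 1 to 0 over the run; larger p gives a faster early drop."""
    return 1.0 - iteration ** (1.0 / p) / max_iterations ** (1.0 / p)


MAX_ITERATIONS = 100  # hypothetical value, not given in the embodiment
for step in (1, 50, 100):
    wep = wormhole_existence_probability(step, MAX_ITERATIONS)
    tdr = travel_distance_rate(step, MAX_ITERATIONS)
    print(step, round(wep, 3), round(tdr, 3))
```

A rising WEP makes wormhole moves toward the best universe more frequent late in the search, while a falling TDR shrinks the size of those moves, shifting the search from exploration to exploitation.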

[0026] Step 2: Use the multiverse algorithm to search the initial population. First initialize the fitness of the universe population and sort the universes by fitness, select the universe with the best fitness and the worst fitnes...


Abstract

The invention discloses a DNA storage coding optimization method based on a K-means clustering multivariate universe algorithm. The method constructs optimal DNA coding sequences that satisfy a combination of constraints. First, a certain number of DNA sequences are constructed as an initial population, and the fitness of the population is evaluated and sorted. Second, the resulting DNA coding sequences are optimized using a K-means clustering algorithm and wormhole crossover to obtain DNA coding sequences with higher fitness. Then, each sequence is compared against the constraints to decide whether it is added to the candidate solution set. Finally, the optimal DNA coding sequences are output. The method can find a comparatively larger number of DNA coding sequences that satisfy the constraints.
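The constraint comparison step, which decides whether a sequence joins the candidate solution set, can be sketched as a greedy admission test. This is a minimal illustration under stated assumptions, not the patented procedure: it checks only the pairwise Hamming distance, the function names are hypothetical, and the example sequences are made up.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def try_add(candidate: str, solution_set: list, min_distance: int) -> bool:
    """Admit the candidate only if it is far enough from every accepted sequence."""
    if all(hamming(candidate, kept) >= min_distance for kept in solution_set):
        solution_set.append(candidate)
        return True
    return False


solutions = []
for candidate in ["ACGTAC", "ACGTAG", "CATGCA", "TGCATG"]:
    try_add(candidate, solutions, min_distance=4)

print(solutions)  # "ACGTAG" is rejected: it differs from "ACGTAC" in one position
```

Because admission depends on what is already in the set, the order in which candidates are offered affects the final set size, which is one reason a search over candidate orderings and contents can pay off.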

Description

Technical field

[0001] The invention relates to swarm intelligence optimization algorithms and DNA storage coding. Specifically, a multiverse algorithm, a K-means clustering algorithm and wormhole crossover are used to optimize DNA coding sequences. The invention belongs to the field of coding design in DNA storage.

Background

[0002] The earliest DNA storage work is generally considered to be the Microvenus project initiated by Joe Davis, which aimed to store non-biological data such as images in DNA. The encoding ordered the bases C, T, A, G by molecular size (C=1, T=2, A=3, G=4) and assigned the four bases as phase-change values rather than incremental values. Each base represents how many times a binary bit (0 or 1) repeats before it changes to the other bit, which is a compression technique used in computer storage. Equivalently, C=X, T=XX, A=XXX, G=XXXX. For example, 10101→CCCCC, 100101→CTCCC. However, there is a problem when decoding, C can be decoded into 0 or 1, which lead...
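The run-length rule described in the background can be sketched as follows. Under that rule, 10101 has five runs of length one and 100101 has runs of lengths 1, 2, 1, 1, 1. The function names are hypothetical, and rejecting runs longer than four is an assumption of this sketch, since the excerpt does not say how such runs are handled. The decoder takes the starting bit as an extra argument to make the decoding ambiguity explicit.

```python
from itertools import groupby

RUN_TO_BASE = {1: "C", 2: "T", 3: "A", 4: "G"}
BASE_TO_RUN = {base: run for run, base in RUN_TO_BASE.items()}


def encode(bits: str) -> str:
    """Map each run of identical bits to the base for that run length."""
    bases = []
    for _, group in groupby(bits):
        run = len(list(group))
        if run not in RUN_TO_BASE:
            # Assumption: runs longer than 4 are simply rejected here.
            raise ValueError(f"run of length {run} has no base")
        bases.append(RUN_TO_BASE[run])
    return "".join(bases)


def decode(dna: str, first_bit: str) -> str:
    """Expand each base into a run, alternating bits from the given start."""
    bit = first_bit
    out = []
    for base in dna:
        out.append(bit * BASE_TO_RUN[base])
        bit = "1" if bit == "0" else "0"
    return "".join(out)


print(encode("10101"), encode("100101"))
print(decode("CCCCC", "1"), decode("CCCCC", "0"))
```

The last line shows why decoding is ambiguous: the same DNA string yields two different bit strings depending on the assumed first bit, because the code records run lengths but not which bit each run contains.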


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/12
CPC: G06N3/123, G06F18/23213
Inventor: 王宾, 曹犇, 周士华, 张强, 魏小鹏
Owner DALIAN UNIV