Methods of compressing and decompressing gene sequences and device of compressing the gene sequence

Gene sequence compression and decompression technology, applied in the fields of computational biology and bioinformatics, addresses the large storage space occupied by gene sequences and their low compression rate, so as to reduce storage space and improve the compression rate.

Active Publication Date: 2018-01-26
SAMSUNG (CHINA) SEMICONDUCTOR CO LTD +1

AI Technical Summary

Problems solved by technology

[0004] An exemplary embodiment of the present invention is to provide a method and device for compressing and decompressing gene sequences, so as to solve t



Example Embodiment

[0036] Now, different example embodiments will be described more fully with reference to the accompanying drawings, in which some example embodiments are shown.

[0037] Figure 1 is a flowchart showing a method for compressing a gene sequence according to an exemplary embodiment of the present invention.

[0038] Referring to Figure 1, in step S10, a mutation reference sequence is generated according to high-frequency mutation information and a standard reference sequence.
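
Step S10 can be illustrated with a minimal Python sketch. The source does not say how high-frequency mutation information is represented, so this sketch assumes it is a mapping from a 0-based position to a substituted base (single-base substitutions only); the function name `build_variation_reference` and the sample data are hypothetical.

```python
def build_variation_reference(standard_ref, high_freq_variants):
    """Return a copy of standard_ref with each substitution applied.

    high_freq_variants maps a 0-based position to the replacement base.
    This format is an assumption; the source does not specify one.
    """
    bases = list(standard_ref)
    for pos, alt in high_freq_variants.items():
        if not 0 <= pos < len(bases):
            raise ValueError(f"position {pos} is outside the reference")
        if alt not in "ACGT":
            raise ValueError(f"unexpected base {alt!r}")
        bases[pos] = alt
    return "".join(bases)


# Hypothetical toy data: a 10-base reference and two common substitutions.
standard_ref = "ACGTACGTAC"
variants = {2: "T", 7: "A"}
variation_ref = build_variation_reference(standard_ref, variants)
print(variation_ref)
```

Insertions and deletions would shift later coordinates and need extra bookkeeping, which this sketch deliberately leaves out.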

[0039] Here, it should be understood that a biological gene can be described by the precise arrangement of the base pairs of deoxyribonucleic acid (DNA); that is, a gene can be represented as an ordered sequence composed of the four bases A (adenine), G (guanine), T (thymine), and C (cytosine), which is called a gene sequence.
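
Because a gene sequence uses only four symbols, each base fits in two bits. The following Python sketch shows that baseline packing for comparison; it is a generic illustration and not the encoding described in this patent, and the names `pack` and `unpack` are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so unpacking reads bases in order.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


packed = pack("GATTACA")
print(len(packed), unpack(packed, 7))
```

The original length has to be stored alongside the packed bytes, since the last byte may contain padding.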

[0040] The lengths of gene sequences of different organisms are different. Various existing genetic research institutions provide multiple standard reference sequences for different biologi...


Abstract

The invention provides methods of compressing and decompressing gene sequences and a device for compressing a gene sequence. The compression method includes: generating a variation reference sequence according to high-frequency variation information and a standard reference sequence; and compressing a to-be-processed gene sequence according to the result of matching it against the variation reference sequence, to obtain a compressed gene sequence. With these methods and this device, the compression rate of gene sequences can be increased, which reduces the storage space they occupy and facilitates copying and transmitting them.
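
The benefit of matching against a variation reference rather than the standard reference can be shown with a small Python sketch. The abstract does not describe the matching procedure or the compressed format, so this sketch assumes the match position is already known, the sequence aligns without gaps, and the compressed record is a tuple of start position, length, and a list of mismatches. The names `compress` and `decompress` and the toy data are hypothetical.

```python
def compress(seq, reference, start):
    """Encode seq as (start, length, mismatches) relative to reference."""
    window = reference[start:start + len(seq)]
    if len(window) != len(seq):
        raise ValueError("sequence runs past the end of the reference")
    mismatches = [(i, b) for i, (b, r) in enumerate(zip(seq, window)) if b != r]
    return (start, len(seq), mismatches)


def decompress(record, reference):
    """Rebuild the original sequence from a record and the same reference."""
    start, length, mismatches = record
    bases = list(reference[start:start + length])
    for i, base in mismatches:
        bases[i] = base
    return "".join(bases)


# Hypothetical toy data: the variation reference differs from the standard
# reference at two positions that are common in the population.
standard_ref = "ACGTACGTAC"
variation_ref = "ACTTACGAAC"
read = "CTTACGA"  # carries both common variants

vs_standard = compress(read, standard_ref, 1)
vs_variation = compress(read, variation_ref, 1)
print(vs_standard)
print(vs_variation)
```

In this toy case the record against the standard reference has to list two mismatches, while the record against the variation reference lists none, so a sequence that carries common variants costs less to store. Decompression must use the same reference that was used for compression.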

Description

Technical field

[0001] The present invention relates to the technical fields of computational biology and bioinformatics, and more specifically to a method and equipment for compressing and decompressing gene sequences.

Background technique

[0002] Gene sequences are generated through collection and sequencing using biological gene sequencing technology. They are the research basis of bioinformatics, genetics, genomics, medicine, and many other fields, and have important scientific value and practical significance. With the increasing maturity and widespread use of next-generation high-throughput sequencing technology (Next-generation Sequencing, NGS), the time needed to obtain biological gene sequences has been greatly shortened and the cost significantly reduced. Sequencing projects will become more common in the biomedical field.

[0003] At the same time, the volume of genetic data to be stored is also increasing rapidly. Taking the whole gene sequencing resul...


Application Information

IPC(8): G06F19/18, G06F19/22, G06F19/24
Inventor 石永刚孔鑫令狐雄展郭世硕张周
Owner SAMSUNG (CHINA) SEMICONDUCTOR CO LTD