Data partition storage method and device

A data partitioning and partition-table technology, applied in the field of bioinformatics, that addresses the problems of unbalanced data volume and too many partitions, and improves partition balance and query efficiency.

Pending Publication Date: 2021-01-15
BEIJING NOVOGENE TECH CO LTD

AI Technical Summary

Problems solved by technology

[0007] The embodiments of the present invention provide a data partition storage method and device to at least solve the technical problems in the related art that human whole-genome variation detection result data has too many or too few partitions in the Hive data warehouse, and that the data volume across partitions is unbalanced.


Examples

Embodiment 1

[0031] According to an embodiment of the present invention, a method embodiment of a data partition storage method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

[0032] Figure 1 is a flowchart of a data partition storage method according to an embodiment of the present invention. As shown in Figure 1, the data partition storage method includes the following steps:

[0033] Step S102, initializing a partition table storing gene mutation sites in a predetermined data warehouse.

[0034] In this embodiment, the predetermined data warehouse may be a Hive data warehouse.
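As a rough illustration of Step S102, the following Python sketch assembles a HiveQL statement that creates a partitioned table for gene mutation sites. The table name `gene_mutation_sites`, the column names, and the partition keys `chrom` and `region_id` are hypothetical; the text above does not specify a schema.

```python
def build_partition_table_ddl(table="gene_mutation_sites"):
    """Return a HiveQL CREATE TABLE statement for a partitioned table.

    Every identifier below is an assumption made for illustration only.
    """
    # Ordinary data columns (hypothetical schema).
    columns = [
        ("sample_id", "STRING"),
        ("start_pos", "BIGINT"),
        ("end_pos", "BIGINT"),
        ("ref_allele", "STRING"),
        ("alt_allele", "STRING"),
    ]
    # Partition keys: one partition per (chromosome, coordinate region).
    partition_keys = [("chrom", "STRING"), ("region_id", "INT")]

    col_sql = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    part_sql = ", ".join(f"{name} {hive_type}" for name, hive_type in partition_keys)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n  {col_sql}\n)\n"
        f"PARTITIONED BY ({part_sql})\n"
        "STORED AS ORC"
    )


if __name__ == "__main__":
    print(build_partition_table_ddl())
```

The generated statement would still need to be submitted to Hive through whatever client the deployment uses; that step is left out here.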

[0035] In an optional embodiment, before initializing the partition table for storing g...

Embodiment 2

[0059] According to another aspect of the embodiments of the present invention, a data partition storage device is also provided. Figure 4 is a schematic diagram of a data partition storage device according to an embodiment of the present invention. As shown in Figure 4, the data partition storage device may include an initialization unit 41, a division unit 43, a first acquisition unit 45, and a storage unit 47. The data partition storage device is described below.

[0060] The initialization unit 41 is configured to initialize a partition table storing gene mutation sites in a predetermined data warehouse.

[0061] The division unit 43 is configured to divide the partition table into multiple sub-regions according to the data interval corresponding to each gene mutation site among the multiple gene mutation sites.

[0062] The first acquisition unit 45 is configured to acquire the starting value and the ending value of the mutation site of the target gene to...

Embodiment 3

[0072] According to another aspect of the embodiments of the present invention, a computer-readable storage medium is also provided. The computer-readable storage medium includes a stored computer program, wherein, when the computer program is executed by a processor, the device on which the computer-readable storage medium is located is controlled to execute any one of the data partition storage methods described above.


Abstract

The invention discloses a data partition storage method and device. The method comprises: initializing a partition table for storing gene mutation sites in a predetermined data warehouse; dividing the partition table into a plurality of subareas according to the data interval corresponding to each gene mutation site among a plurality of gene mutation sites; obtaining a starting-point value and an end-point value of a target gene mutation site to be stored; and matching the target gene mutation site based on its starting-point value and end-point value, and storing it in one or more subareas of the partition table. This solves the technical problems in the related art that human whole-genome variation detection result data has too many or too few partitions in a Hive data warehouse, and that the data volume across partitions is unbalanced.
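The matching step can be sketched as follows. This is a minimal Python illustration, assuming each subarea is a half-open coordinate interval defined by a sorted list of boundary values. The boundary values, the function names `match_subareas` and `store_site`, and the in-memory dictionary standing in for the partition table are all hypothetical and are not taken from the source.

```python
from bisect import bisect_right
from collections import defaultdict


def match_subareas(boundaries, start, end):
    """Return the index of every subarea overlapped by the site [start, end].

    boundaries is a sorted list b[0] < b[1] < ... < b[n];
    subarea i covers the half-open interval [b[i], b[i + 1]).
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if start < boundaries[0] or end >= boundaries[-1]:
        raise ValueError("site lies outside the partitioned range")
    first = bisect_right(boundaries, start) - 1  # subarea containing start
    last = bisect_right(boundaries, end) - 1     # subarea containing end
    return list(range(first, last + 1))


def store_site(table, boundaries, site):
    """Append the site to every subarea its interval overlaps."""
    for index in match_subareas(boundaries, site["start"], site["end"]):
        table[index].append(site)


# Hypothetical boundaries giving three subareas of unequal width.
boundaries = [0, 1000, 5000, 20000]
table = defaultdict(list)
store_site(table, boundaries, {"id": "snv", "start": 1200, "end": 1200})
store_site(table, boundaries, {"id": "del", "start": 900, "end": 5100})
```

Here the single-position site lands in one subarea, while the longer site spans three and is stored in each, which mirrors the "one or more subareas" wording of the abstract. The unequal widths are only meant to show that intervals need not be uniform.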

Description

Technical field

[0001] The present invention relates to the technical field of bioinformatics, in particular to a data partition storage method and device.

Background

[0002] With the rapid development of life science and gene sequencing technology, sequencing costs have followed a Moore's-law-style decline, and data output capacity has greatly improved. As scientific research deepens, the diagnosis, treatment and screening of diseases such as cancer and genetic disorders focus not only on the influence of single genes on disease, but also on the complex mechanisms by which multiple genes act on disease. More and more countries have initiated research on applying human whole-genome sequencing to human health. The whole human genome is about 3 billion bases long, and storing one person's 30X WGS (whole genome sequencing) data likely requires more than 90 GB of hard disk space. Therefore, as the sequencing samples increase, the accumulated data ...
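The 90 GB figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes roughly one byte per sequenced base and ignores quality scores, read headers, and compression; these are simplifying assumptions, not values from the source.

```python
GENOME_LENGTH_BASES = 3_000_000_000  # human genome length stated in the text
COVERAGE = 30                        # 30X whole genome sequencing
BYTES_PER_BASE = 1                   # assumption: one byte per base call


def raw_wgs_size_gb(genome_length=GENOME_LENGTH_BASES,
                    coverage=COVERAGE,
                    bytes_per_base=BYTES_PER_BASE):
    """Estimate raw base-call storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bases = genome_length * coverage
    return total_bases * bytes_per_base / 1e9


print(raw_wgs_size_gb())  # 90.0
```

Under these assumptions the base calls alone come to 90 GB; per-base quality scores and metadata would add to that, which is in line with the "more than 90 GB" estimate.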

Application Information

IPC(8): G16B50/30, G16B20/30, G16B20/50
CPC: G16B50/30, G16B20/30, G16B20/50
Inventor 孙成全, 李雷, 曹银川, 成岗, 刘冰, 吴俊, 李瑞强
Owner BEIJING NOVOGENE TECH CO LTD