Duplicated data deletion method for medical big data

A data deduplication technology for medical data, applied in the computer field. It addresses problems such as the uncertain time attributes of medical data, the generation of fragments, and the inability to efficiently ensure the security of medical data, so as to improve recovery and deletion performance, ensure security and integrity, and reduce storage overhead.

Pending Publication Date: 2022-07-08
CENT SOUTH UNIV

AI Technical Summary

Problems solved by technology

[0005] However, data deduplication disperses data blocks across many containers, which destroys data locality and generates a large number of fragments, degrading the system's recovery and deletion performance.
Existing data deduplication methods improve recovery performance at the expense of DER. They do not exploit the characteristics of medical data to improve DER, recovery performance and deletion performance at the same time, and they cannot determine which medical data is redundant. In addition, some methods rely on the time attribute of the backup version to improve recovery performance, but medical data has an indeterminate time attribute.
More importantly, medical data is subject to stricter compliance requirements for security and personal information protection than ordinary data, and existing encrypted deduplication technology requires verification by a third-party organization, which cannot effectively guarantee the security of medical data.
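The fragmentation problem described in [0005] can be illustrated with a toy model. The sketch below is not taken from the patent: the class `DedupStore`, the function `containers_read`, the container capacity of 4 blocks and the use of SHA-256 are all assumptions made for illustration. It shows how a later backup that shares blocks with an earlier one ends up referencing several containers, so restoring it requires more container reads.

```python
import hashlib

CONTAINER_CAPACITY = 4  # blocks per container (hypothetical value)


class DedupStore:
    """Toy container-based deduplicating store (illustrative only)."""

    def __init__(self) -> None:
        self.index: dict[str, int] = {}  # block hash -> container id
        self.stored = 0                  # number of unique blocks written so far

    def backup(self, blocks: list[bytes]) -> list[int]:
        """Store blocks, skipping duplicates; return the container id of each block."""
        recipe = []
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.index:
                # New block: append it to the currently open container.
                self.index[digest] = self.stored // CONTAINER_CAPACITY
                self.stored += 1
            recipe.append(self.index[digest])
        return recipe


def containers_read(recipe: list[int]) -> int:
    """Number of distinct containers that must be read to restore a backup."""
    return len(set(recipe))


store = DedupStore()
first = store.backup([b"a", b"b", b"c", b"d"])
second = store.backup([b"a", b"x", b"c", b"y"])  # shares "a" and "c" with the first backup
print(containers_read(first), containers_read(second))
```

In this model the first backup sits in a single container, while the second is split between the container holding the shared blocks and the one holding its new blocks. Rewriting selected duplicate blocks next to the new ones trades some storage for fewer container reads.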

Examples

Embodiment Construction

[0065] Figure 1 is a schematic flow chart of the method of the present invention, and Figure 2 is a schematic diagram of the corresponding overall framework. The method for deduplication of medical big data provided by the present invention includes the following steps:

[0066] S1. Obtain the medical data to be analyzed;

[0067] S2. Perform similarity calculation on the medical data obtained in step S1 to obtain data files sorted by maximum similarity. Specifically, to measure the similarity between files, each data stream is divided into fixed-size segments; each segment is divided into blocks, and the hash value of each block is calculated and stored; then the similarity between files is calculated from the number of blocks they share; finally, the maximum similarity of each file is calculated, yielding data files sorted by maximum similarity;

[0068] The specific implementation includes the followi...
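The similarity calculation outlined in step S2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's specific implementation, which is truncated above. The function names, the segment and block sizes, the use of SHA-256, fixed-size blocks within a segment, the Jaccard ratio as the similarity measure, and the descending sort order are all assumptions; the source only states that similarity is based on the number of shared blocks.

```python
import hashlib


def block_hashes(data: bytes, segment_size: int = 4096, block_size: int = 1024) -> set[str]:
    """Split data into fixed-size segments, split each segment into blocks,
    and return the set of block hashes (sizes are hypothetical defaults)."""
    hashes = set()
    for seg_start in range(0, len(data), segment_size):
        segment = data[seg_start:seg_start + segment_size]
        for blk_start in range(0, len(segment), block_size):
            block = segment[blk_start:blk_start + block_size]
            hashes.add(hashlib.sha256(block).hexdigest())
    return hashes


def similarity(a: set[str], b: set[str]) -> float:
    """Shared blocks divided by all distinct blocks (Jaccard ratio, an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sort_by_max_similarity(files: dict[str, bytes]) -> list[tuple[str, float]]:
    """Return (file name, maximum similarity to any other file), highest first."""
    fingerprints = {name: block_hashes(data) for name, data in files.items()}
    ranked = []
    for name, fp in fingerprints.items():
        scores = [similarity(fp, other) for n, other in fingerprints.items() if n != name]
        ranked.append((name, max(scores, default=0.0)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


files = {
    "scan_a": b"A" * 1024 + b"B" * 1024,
    "scan_b": b"A" * 1024 + b"C" * 1024,  # shares one block with scan_a
    "scan_c": b"D" * 2048,                # shares nothing
}
for name, score in sort_by_max_similarity(files):
    print(name, round(score, 3))
```

Under these assumptions, a file whose maximum similarity is zero shares no blocks with any other file, so it offers nothing to deduplicate at the block level.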

Abstract

The invention discloses a duplicate data deletion method for medical big data. The method comprises the following steps: acquiring the medical data to be processed; performing similarity calculation on the medical data to obtain data files sorted by maximum similarity; and performing the duplicate data deletion operation on the data files, with rewriting based on the maximum similarity, to delete duplicate medical data. In this method, similarity calculation determines whether medical data is redundant and removes noise data at the same time, and rewriting data blocks based on the maximum similarity improves recovery performance and deletion performance. A blockchain strategy and a deduplication recovery algorithm are designed to efficiently ensure the security and integrity of the medical data. The method is therefore suited to the medical industry, with high efficiency and good security.

Description

Technical field

[0001] The invention belongs to the technical field of computers, and in particular relates to a method for removing duplicate data for medical big data.

Background technique

[0002] With the development of economy and technology, digitization and informatization have been widely integrated into all walks of life. The development of digitalization and informatization in the medical industry has contributed to the generation of medical big data.

[0003] In addition to the large-scale, high-speed, diverse and high-value characteristics of general big data, medical big data also has the characteristics of multi-modality, redundancy, security, incompleteness and timeliness. Among them, redundancy and security have brought huge challenges to the application of medical big data. Since medical data is growing much faster than storage devices, and data storage, adequate equipment, and energy consumption are all expensive, data redundancy puts enormous pressure on...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/174, G06F21/62, G06F21/64, G06K9/62
CPC: G06F16/1748, G06F21/6218, G06F21/64, G06F18/22, Y02D10/00
Inventor: 邹北骥, 肖伶, 朱承璋, 聂凡博, 曾梦, 陈智
Owner: CENT SOUTH UNIV