A storage method and usage method based on the HDFS distributed file system

A distributed file system technology applied in the field of medical image processing. It addresses the following problems: FC SAN network bandwidth and processing capacity struggle to meet the fast processing and transmission requirements of PB-scale data; HDFS write performance is not real-time; and existing small-file schemes lack a content index and random access to individual images. Its effects are to improve concurrent access capability, meet online high-concurrency access requirements, and meet the performance requirements of cloud storage distribution.

Active Publication Date: 2021-07-27
NORTHEASTERN UNIV LIAONING

AI Technical Summary

Problems solved by technology

[0004] 1) High construction costs: image data volumes reach the TB and PB levels, traditional storage architectures (such as FC SAN / iSCSI) are expensive, and they offer poor flexibility for heterogeneous integration and expansion;
[0005] 2) Transmission bandwidth bottleneck: even high-performance FC SAN network bandwidth and processing capacity struggle to meet the fast processing and transmission requirements of PB-scale data;
[0006] 3) Limited availability: large hospital PACS systems often use an "online-nearline-offline" storage mode. Most offline data is stored in tape libraries, so availability is poor and the data cannot be obtained in real time;
[0007] 4) Lack of an integrated application sharing platform: medical image collaboration services, such as Web DICOM terminals, image consultation, image referral, distance education, and digital film storage, basically adopt a "point-to-point" model. Integrated, cross-platform, highly available regional medical imaging collaboration software is lacking, and data sharing is difficult; for example, data cannot be transferred online for patient transfers or remote medical treatment.
[0013] 3) High data redundancy: by default, each piece of data is replicated on 3 servers;
At present, Hadoop also provides solutions for small files, such as Hadoop Archive file archiving and SequenceFile files, but these methods cannot fully meet the application requirements of medical DICOM sequence images; for example, they lack a content index and random access to individual images.
[0018] 2) HDFS is not suitable for real-time applications
HDFS was not designed for real-time applications. During a write, each data block is replicated (3 copies by default), so write performance is much lower than read performance. As a result, HDFS writes are not real-time and are poorly suited to concurrent multi-task workloads, which makes HDFS a poor fit for real-time PACS applications that need to obtain image resources quickly and write diagnostic reports.
In addition, for every HDFS access the client must establish a connection, open the file, close it, and disconnect. For a sequence of hundreds of images that is read frequently, read efficiency is significantly lower than with a local file system;
[0019] 3) Low efficiency of random reads and writes of HDFS file content
Accessing small files requires jumping from one DataNode to another, which greatly reduces read performance.
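
The two costs named above, replicated storage and per-access connection handling, can be put into numbers with a minimal sketch. The helper names `raw_bytes_needed` and `connection_setups` are hypothetical and only illustrate the arithmetic; the decimal petabyte and the 300-image series are assumptions made for the example, not figures from the patent.

```python
PB = 10**15  # decimal petabyte; an assumption made for this example


def raw_bytes_needed(logical_bytes: int, replication: int = 3) -> int:
    """Raw capacity consumed when every block is stored `replication` times."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return logical_bytes * replication


def connection_setups(num_files: int, reuse_connection: bool) -> int:
    """Connect/disconnect cycles needed to read `num_files` files.

    Without reuse, each file pays for its own connection; with reuse,
    a single connection serves the whole series.
    """
    if num_files < 0:
        raise ValueError("num_files must not be negative")
    if num_files == 0:
        return 0
    return 1 if reuse_connection else num_files


if __name__ == "__main__":
    print(raw_bytes_needed(1 * PB) // PB, "PB of raw capacity for 1 PB of images")
    print(connection_setups(300, reuse_connection=False), "setups without reuse")
    print(connection_setups(300, reuse_connection=True), "setup with reuse")
```

Under these assumptions, 1 PB of images occupies 3 PB of raw capacity, and reading a 300-image series file by file costs 300 connection cycles against 1 when a connection is reused.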

Embodiment Construction

[0050] The invention is further described below with reference to the accompanying drawings and specific embodiments:

[0051] A storage method and usage method based on the HDFS distributed file system. The present invention improves the HDFS distributed file system as shown in Figure 1, where the part enclosed by the dotted line is the content of the present invention. On top of the HDFS distributed file system, it adds an integrated content storage file block structure and a file cache pool based on that structure, and defines the file cache pool access process, including:

[0052] (1) Integrated content storage file block structure: all local image files are saved as the following blocks: a content index table block, a sampling volume data block, a basic information table block, a three-dimensional volume matrix block, and a header information backup block. The original image file is stored according to the above f...
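
The idea of packing many small image files into one container with a content index can be sketched as follows. This is a simplified illustration, not the patent's block layout, which is truncated in this excerpt: the `ICSF` magic bytes, the JSON-encoded index, and the function names `pack_series`, `read_index`, and `read_image` are all hypothetical, and only a content index plus raw data region is modeled.

```python
import io
import json
import struct
from typing import Dict, Mapping, Tuple

MAGIC = b"ICSF"  # hypothetical 4-byte marker, not defined by the patent


def pack_series(images: Mapping[str, bytes]) -> bytes:
    """Pack small files into one blob: header, content index, then data."""
    index: Dict[str, Tuple[int, int]] = {}
    body = io.BytesIO()
    for name, data in images.items():
        index[name] = (body.tell(), len(data))  # offset within data region
        body.write(data)
    index_bytes = json.dumps(index).encode("utf-8")
    header = MAGIC + struct.pack(">I", len(index_bytes))
    return header + index_bytes + body.getvalue()


def read_index(blob: bytes) -> Tuple[Dict[str, list], int]:
    """Return the content index and the absolute start of the data region."""
    if blob[:4] != MAGIC:
        raise ValueError("not an integrated content storage blob")
    (index_len,) = struct.unpack(">I", blob[4:8])
    index = json.loads(blob[8:8 + index_len].decode("utf-8"))
    return index, 8 + index_len


def read_image(blob: bytes, name: str) -> bytes:
    """Random access to one image through the index, without scanning."""
    index, data_start = read_index(blob)
    offset, length = index[name]
    start = data_start + offset
    return blob[start:start + length]


if __name__ == "__main__":
    blob = pack_series({"slice_001.dcm": b"AAAA", "slice_002.dcm": b"BB"})
    print(read_image(blob, "slice_002.dcm"))
```

Because the index stores an offset and length per image, a reader can fetch one slice from the container with a single ranged read, which is the kind of content index and per-image random access that Hadoop Archive and SequenceFile are described above as lacking.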

Abstract

The present invention proposes a storage method and usage method based on the HDFS distributed file system, including: an integrated content storage file block structure, comprising a content index table block, a sampling volume data block, a basic information table block, a three-dimensional volume matrix block, and a header information backup block; a file cache pool based on the integrated content storage file block structure, comprising a user queue, a user data queue, and an HDFS connection pool; and a file cache pool access process. The technology is built on a distributed file system, which lowers data storage cost, makes the system easy to expand, supports storage expansion without downtime, and improves data safety through redundant storage. It adopts distributed data access, which significantly improves concurrent access capability, offers better read and write performance than traditional centralized storage technologies, and meets online high-concurrency access requirements. Deployed on a cloud platform, the technology makes it possible to build an application sharing platform quickly and meets the performance requirements of cloud storage distribution for mobile application development.
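
How a connection pool and a cache can work together to avoid repeated connection setup is sketched below. This is an illustration under stated assumptions, not the patent's access process, which is not given in this excerpt: `FakeHdfsConnection` is a stand-in for a real HDFS client, the LRU eviction policy is a choice made for the example, and the user queue and user data queue named in the abstract are not modeled.

```python
import queue
from collections import OrderedDict
from contextlib import contextmanager


class FakeHdfsConnection:
    """Stand-in for a real HDFS client connection (hypothetical)."""

    opened = 0  # counts how many connections were ever created

    def __init__(self):
        FakeHdfsConnection.opened += 1

    def read(self, path: str) -> bytes:
        return f"data:{path}".encode("utf-8")


class ConnectionPool:
    """Fixed set of connections created once and reused across requests."""

    def __init__(self, size: int, factory=FakeHdfsConnection):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse


class FileCachePool:
    """LRU cache in front of the pool (eviction policy is an assumption)."""

    def __init__(self, pool: ConnectionPool, capacity: int):
        self._pool = pool
        self._capacity = capacity
        self._cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self._cache:
            self._cache.move_to_end(path)  # mark as most recently used
            self.hits += 1
            return self._cache[path]
        self.misses += 1
        with self._pool.connection() as conn:
            data = conn.read(path)
        self._cache[path] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return data
```

With this shape, a repeated read of the same series is served from memory, and a cache miss borrows an already open connection instead of paying for a new connect and disconnect cycle.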

Description

Technical field

[0001] The invention belongs to the field of medical image processing, and in particular relates to a storage method and a usage method based on the HDFS distributed file system.

Background art

[0002] With the rapid development of medical imaging technology, medical images have become an important basis for clinical diagnosis. These data are currently stored in PACS (Picture Archiving and Communication System) systems, using high-performance, large-capacity network storage arrays, tape libraries, and other storage media. PACS follows the DICOM 3.0 international standard, which is the organization and communication standard for medical images.

[0003] At present, PACS systems have gradually developed from single machines and departments to whole hospitals and regions, realizing filmless hospitals. Regionalization is the current main research goal of government health departments and medical institutions, but building a large shared me...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/182, G06F16/13
Inventor: 栗伟, 于鯤, 郭志伟, 赵大哲, 丁邦杰
Owner: NORTHEASTERN UNIV LIAONING