Distributed file cleaning method, device, and system

A distributed file cleaning method, device, and system. The technology addresses the problems that HDFS files cannot be partially deleted, that cluster storage space is wasted, and that storage systems are costly, and thereby saves storage space and reduces storage cost.

Pending Publication Date: 2022-01-28
INDUSTRIAL AND COMMERCIAL BANK OF CHINA

AI Technical Summary

Problems solved by technology

HDFS writes once and reads many times, which makes it impossible to partially delete HDFS ...




Embodiment Construction

[0057] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0058] It should be noted that the distributed file cleaning method, device, and system disclosed in this application can be used in the field of artificial intelligence technology, and can also be used in any other field. The field of application of the disclosed distributed file cleaning method, device, and system is not limited.

[0059] In order to facilitate the understanding of...


Abstract

An embodiment of the invention provides a distributed file cleaning method, device, and system, applied to the technical field of artificial intelligence. The method comprises the following steps: checking the objects in a distributed file in the main hot cluster against a configured data life cycle table to determine the to-be-processed objects; counting the to-be-processed objects; if the proportion of to-be-processed objects in the total number of objects is greater than or equal to a preset proportion threshold, marking the distributed file as a target cleaning file; and cleaning the objects in the target cleaning file according to the data life cycle table and the processing mode corresponding to each preset object state. In this way, HDFS files can be partially deleted, storage resources can be released, cluster storage space can be saved, and storage cost can be reduced.
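
The steps in the abstract can be sketched roughly as follows. This is a minimal in-memory Python illustration, not the patented implementation: the names StoredObject, LIFECYCLE_DAYS, and PROCESSING_MODE, the state labels, the retention values, and the 0.5 threshold are all hypothetical, and no real HDFS or HBase calls are made.

```python
from dataclasses import dataclass

# Hypothetical life cycle table: object category -> retention period in days.
LIFECYCLE_DAYS = {"log": 30, "image": 365}

# Hypothetical mapping from object state to processing mode.
PROCESSING_MODE = {"valid": "keep", "expired": "drop", "invalid": "drop"}


@dataclass
class StoredObject:
    key: str
    category: str
    age_days: int
    deleted: bool = False  # assumed flag for objects already invalidated


def object_state(obj, lifecycle=LIFECYCLE_DAYS):
    """Classify one object against the life cycle table."""
    if obj.deleted:
        return "invalid"
    if obj.age_days > lifecycle[obj.category]:
        return "expired"
    return "valid"


def is_target_cleaning_file(objects, threshold=0.5):
    """Mark the file when the to-be-processed share reaches the threshold."""
    if not objects:
        return False
    pending = sum(1 for o in objects if object_state(o) != "valid")
    return pending / len(objects) >= threshold


def clean_file(objects):
    """Return the objects that survive cleaning (processing mode 'keep')."""
    return [o for o in objects if PROCESSING_MODE[object_state(o)] == "keep"]


# One packed file holding four objects (illustrative data only).
packed_file = [
    StoredObject("a", "log", 45),
    StoredObject("b", "log", 10),
    StoredObject("c", "image", 400),
    StoredObject("d", "image", 20, deleted=True),
]

if is_target_cleaning_file(packed_file, threshold=0.5):
    survivors = clean_file(packed_file)
    print([o.key for o in survivors])
```

Here three of the four objects are expired or invalid, so the file reaches the threshold and only object "b" survives. How the surviving objects are written back and how their index entries are updated is not described in this excerpt, so the sketch stops at selecting them.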

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to the field of artificial intelligence technology, and specifically to a distributed file cleaning method, device, and system.

Background

[0002] Object storage is a technology widely used on the Internet. In an object storage system, multiple objects are combined into one large file stored in the Hadoop Distributed File System (HDFS), and the location of each object within the large file is written as an index into a distributed storage system (HBase). In the era of big data, the data in object storage systems grows extremely rapidly, and expired or invalid objects occupy considerable storage space. HDFS follows a write-once, read-many model, which makes it impossible to partially delete an HDFS file, resulting in a huge waste of cluster storage space and a high cost for object storage systems.

Contents of the invention

[0003...


Application Information

IPC(8): G06F16/182, G06F16/16, G06F16/174
CPC: G06F16/182, G06F16/162, G06F16/174
Inventor: 张艺, 张志海, 林丹, 李俊谦
Owner INDUSTRIAL AND COMMERCIAL BANK OF CHINA