HDFS-based small file processing method, apparatus and device, and storage medium
A small file processing technology, applied in the computer field, that addresses problems such as resource consumption and occupation, including large memory consumption on name nodes, and achieves the effect of improving access efficiency.
Examples
Embodiment 1
[0059] Figure 1 is a flow chart of an HDFS-based small file processing method provided by Embodiment 1 of the present invention. This embodiment is applicable to processing small files in HDFS. The method can be executed by an HDFS-based small file processing device, which can be implemented in software and/or hardware and can generally be integrated into computer equipment. As shown in Figure 1, the method includes the following operations:
[0060] S110. Retrieve the small files in the HDFS according to a preset retrieval period.
[0061] The preset retrieval period may be set according to actual needs, for example half an hour, 1 hour, or 2 hours. The embodiment of the present application does not limit the specific value of the preset retrieval period.
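The periodic retrieval in S110 can be illustrated with a minimal sketch. It is not the implementation of the present invention: the function name `run_retrieval_cycles`, its parameters, and the injected `retrieve` callable (which stands in for whatever actually lists small files in HDFS) are all hypothetical names introduced for illustration.

```python
import time
from typing import Callable, List


def run_retrieval_cycles(
    retrieve: Callable[[], List[str]],
    period_seconds: float,
    cycles: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Call `retrieve` once per cycle, waiting `period_seconds` between calls.

    `retrieve` is a hypothetical callable returning the paths found in one
    retrieval pass. `sleep` is injectable so the loop can be exercised
    without actually waiting.
    """
    results: List[List[str]] = []
    for i in range(cycles):
        results.append(retrieve())
        # No wait is needed after the final cycle.
        if i < cycles - 1:
            sleep(period_seconds)
    return results
```

With a half-hour period, `period_seconds` would be 1800. A production scheduler would more likely run indefinitely or be driven by an external job scheduler; the fixed `cycles` count here only keeps the sketch finite.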
[0062] In the embodiment of the present invention, the small files in the HDFS are retrieved according to the preset retrieval cycle, specifica...
Embodiment 2
[0070] Figure 2 is a flow chart of an HDFS-based small file processing method provided by Embodiment 2 of the present invention. This embodiment builds on the above embodiments and gives a specific implementation of classifying the small files according to the keyword of each small file, and of merging and storing the classified small files according to the preset file merging method. As shown in Figure 2, the method of this embodiment may include:
[0071] S210. According to the preset retrieval cycle, take each file whose file size satisfies the small file retrieval condition as a small file.
[0072] The small file retrieval condition may be that the file size is smaller than a set threshold. For example, the set threshold may be 216M or 512M, and may be set according to actual requirements, which is not limited in this embodiment of the present invention.
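The retrieval condition described above, a file size smaller than a set threshold, might be sketched as follows. The function `select_small_files` and the in-memory `file_sizes` mapping are hypothetical stand-ins for a real HDFS listing, and reading "512M" as 512 MiB is an assumption.

```python
from typing import Dict, List

MIB = 1024 * 1024
# Assumption: "512M" in the text is interpreted as 512 MiB.
DEFAULT_THRESHOLD_BYTES = 512 * MIB


def select_small_files(
    file_sizes: Dict[str, int],
    threshold_bytes: int = DEFAULT_THRESHOLD_BYTES,
) -> List[str]:
    """Return the paths whose size is strictly smaller than the threshold.

    `file_sizes` maps a file path to its size in bytes (hypothetical input;
    a real system would obtain this from the file system's metadata).
    """
    return sorted(
        path for path, size in file_sizes.items() if size < threshold_bytes
    )
```

A file whose size equals the threshold is not selected, because the condition is stated as "smaller than" the threshold.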
[0073] In t...
Embodiment 3
[0090] Figure 3 is a schematic diagram of an HDFS-based small file processing device provided in Embodiment 3 of the present invention. As shown in Figure 3, the device includes a small file retrieval module 310, a small file classification module 320, and a small file storage module 330, wherein:
[0091] A small file retrieval module 310, configured to retrieve small files in HDFS according to a preset retrieval cycle;
[0092] A small file classification module 320, configured to classify the small files according to the keywords of each of the small files;
[0093] A small file storage module 330, configured to merge and store the small files according to a preset file merging method, wherein the preset file merging method includes an item method or a dictionary method.
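The classification and storage modules can be pictured with a small in-memory sketch. It does not reproduce the item method or the dictionary method, which are not detailed in this excerpt. Instead it assumes a simple scheme: files are grouped by a keyword that is already known for each file, and each group is concatenated into one merged blob with an offset index so that individual files can still be located. All function names and data structures here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def classify_by_keyword(keywords: Dict[str, str]) -> Dict[str, List[str]]:
    """Group file paths by keyword.

    `keywords` maps each path to its keyword; how the keyword is derived
    is outside this sketch.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for path, keyword in sorted(keywords.items()):
        groups[keyword].append(path)
    return dict(groups)


def merge_group(
    paths: List[str], contents: Dict[str, bytes]
) -> Tuple[bytes, Dict[str, Tuple[int, int]]]:
    """Concatenate the files in `paths` and record (offset, length) per path."""
    merged = bytearray()
    index: Dict[str, Tuple[int, int]] = {}
    for path in paths:
        data = contents[path]
        index[path] = (len(merged), len(data))
        merged.extend(data)
    return bytes(merged), index


def read_from_merged(
    merged: bytes, index: Dict[str, Tuple[int, int]], path: str
) -> bytes:
    """Recover one original file from the merged blob using the index."""
    offset, length = index[path]
    return merged[offset:offset + length]
```

Merging many small files into one larger file reduces the number of entries the name node has to track, which is the memory problem the summary describes; the index is what keeps the individual files addressable after merging.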
[0094] The embodiment of the present invention retrieves the small files in HDFS according to the preset retrieval cycle, classifies the retrieved small files according to the keywords of each ...


