File processing method and system for HDFS, equipment and storage medium

A file processing technique, applied to storage media and to HDFS file processing methods, systems, and devices. It addresses the poor processing performance caused by large numbers of small files in HDFS by merging them, reducing the number and size of files.

Status: Inactive
Publication Date: 2020-04-10
车轮互联科技(上海)股份有限公司

AI Technical Summary

Problems solved by technology

[0005] The main purpose of this application is to provide a file processing method, system, device, and storage medium for HDFS that solve the problem of poor processing performance when HDFS holds a large number of small files.



Embodiment Construction

[0026] To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained from them by persons of ordinary skill in the art without creative effort fall within the protection scope of this application.

[0027] It should be noted that the terms "first" and "second" in the description and claims of the present application and in the above drawings are used to distinguish similar objects, and not necessarily to describe a specific order or sequence. It should be understood that the data so used may be interchanged under appropriate circumstances for th...


Abstract

The invention discloses a file processing method and system for HDFS, a device, and a storage medium. The method comprises: configuring a first directory, where the small HDFS files to be merged are located, and a second directory, where the merged output is written; merging the small HDFS files with a MapReduce program; and checking whether the number of data lines in the small files before merging equals the number of data lines in the merged files. If so, the files under the first directory are deleted and the files under the second directory are then moved to the first directory; if not, the merge is determined to have failed. This solves the technical problem of poor processing performance when a large number of small files exist in HDFS. Because a MapReduce program performs the merge, merging is fast, and because the result is verified, the correctness of the merge is guaranteed.
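The merge, verify, and swap sequence described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes ordinary local directories as stand-ins for the HDFS directories and a plain sequential copy in place of the MapReduce job. The names `count_lines`, `merge_small_files`, and the output file name `part-00000` are hypothetical and chosen only for illustration.

```python
import shutil
from pathlib import Path


def count_lines(directory: Path) -> int:
    """Total number of lines across all regular files directly under `directory`."""
    total = 0
    for path in sorted(directory.iterdir()):
        if path.is_file():
            with path.open("r", encoding="utf-8") as handle:
                total += sum(1 for _ in handle)
    return total


def merge_small_files(first_dir: Path, second_dir: Path,
                      merged_name: str = "part-00000") -> bool:
    """Merge the files in first_dir into one file in second_dir, verify, then swap.

    Local-filesystem stand-in (assumption): a real deployment would run the
    merge as a MapReduce job against HDFS paths.
    Returns True on success. On a line-count mismatch it returns False and
    leaves first_dir untouched.
    """
    second_dir.mkdir(parents=True, exist_ok=True)
    sources = sorted(p for p in first_dir.iterdir() if p.is_file())
    lines_before = count_lines(first_dir)

    # Merge step: concatenate every small file into a single output file.
    merged_path = second_dir / merged_name
    with merged_path.open("w", encoding="utf-8") as out:
        for src in sources:
            with src.open("r", encoding="utf-8") as handle:
                for line in handle:
                    # Ensure a trailing newline so the last line of one file
                    # does not fuse with the first line of the next.
                    out.write(line if line.endswith("\n") else line + "\n")

    # Verification step: line counts before and after the merge must match.
    lines_after = count_lines(second_dir)
    if lines_before != lines_after:
        return False  # merge failed; the originals are kept

    # Swap step: delete the originals, then move the merged output back.
    for src in sources:
        src.unlink()
    for merged in sorted(second_dir.iterdir()):
        shutil.move(str(merged), str(first_dir / merged.name))
    return True
```

Deleting the originals only after the line counts match is what makes a failed merge recoverable: until that check passes, the first directory still holds the unmodified small files.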

Description

Technical field

[0001] The present application relates to the field of distributed file processing, and in particular to a file processing method for HDFS, a system, a device, and a storage medium.

Background

[0002] Hadoop is a distributed system infrastructure developed by the Apache Foundation. Hadoop implements a distributed file system, the Hadoop Distributed File System, referred to as HDFS. This invention focuses on a method for merging small HDFS files, which reduces the storage space used and the performance impact.

[0003] The inventor found that a large number of small files on HDFS causes serious problems for system performance, and that existing methods cannot make full use of cluster resources, resulting in data loss.

[0004] For the problem in the related art of poor processing performance when HDFS holds a large number of small files, no effective solution has yet been proposed.

Contents of the inv...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/11, G06F16/16, G06F16/182
CPC: G06F16/119, G06F16/16, G06F16/182
Inventor: 徐涛, 吴峰, 郭伟
Owner: 车轮互联科技(上海)股份有限公司