File comparison method and device for HDFS

A file comparison method and device, applied in the Internet field, that addresses the problems of low comparison efficiency and large network transmission volume, thereby improving comparison efficiency and reducing network transmission.

Active Publication Date: 2017-05-10
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0006] Both of the above methods must download the files and compare them byte by byte, which incurs a large network transmission volume and low comparison efficiency, especially for large files.

Embodiment Construction

[0022] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.

[0023] In the description of the present invention, it should be understood that the terms "first", "second" and so on are used for descriptive purposes only, and cannot be interpreted as indicating or implying relative importance. In the description of the present invention, it should be noted that unless otherwise specified and limited, the terms "connected" and "connect...

Abstract

The invention provides a file comparison method and device for HDFS. The method of the embodiment comprises: obtaining information on a first file and a second file from the master node of the HDFS; checking whether the number of first data blocks constituting the first file equals the number of second data blocks constituting the second file; if so, obtaining the first CRC (Cyclic Redundancy Check) values of the first data blocks and the second CRC values of the second data blocks from the slave nodes of the HDFS; comparing the first CRC values with the second CRC values one by one, in sequence; if every comparison matches, judging that the first file is identical to the second file; if any comparison differs, judging that the first file is different from the second file. The method reduces network transmission volume and improves file comparison efficiency.
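The steps in the abstract can be illustrated with a minimal Python sketch. It does not use a real HDFS client: the master node is stood in for by a `FileInfo` record holding an ordered list of block IDs, and the slave nodes by a `fetch_crc` callable backed by an in-memory dictionary. All of these names are hypothetical and exist only for illustration. Two further assumptions go beyond the abstract: files with different block counts are treated as different, and the comparison stops at the first mismatching CRC.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FileInfo:
    """Hypothetical stand-in for the file information held by the master node."""
    path: str
    block_ids: List[str]  # data blocks of the file, in file order


def make_fetcher(store: Dict[str, bytes]) -> Callable[[str], int]:
    """Simulate slave nodes: return a function mapping a block ID to its CRC.

    In this sketch the CRC is computed on demand with zlib.crc32; only the
    integer checksum, never the block content, is returned to the caller.
    """
    def fetch_crc(block_id: str) -> int:
        return zlib.crc32(store[block_id])
    return fetch_crc


def files_identical(first: FileInfo, second: FileInfo,
                    fetch_crc: Callable[[str], int]) -> bool:
    # Step 1: compare the number of data blocks of the two files.
    # Assumption: a differing block count means the files differ.
    if len(first.block_ids) != len(second.block_ids):
        return False
    # Step 2: compare per-block CRC values pairwise, in file order.
    # Assumption: stop at the first mismatch instead of checking every pair.
    for block_a, block_b in zip(first.block_ids, second.block_ids):
        if fetch_crc(block_a) != fetch_crc(block_b):
            return False
    # Every pair of CRC values matched.
    return True
```

With a `store` dictionary mapping block IDs to bytes, a call looks like `files_identical(FileInfo("/a", ["a1", "a2"]), FileInfo("/b", ["b1", "b2"]), make_fetcher(store))`. Only block counts and checksums cross the simulated network, which is the source of the transmission savings the abstract describes. Note that matching CRC values indicate identical content with high probability rather than with certainty, since distinct blocks can share a checksum.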

Description

Technical field

[0001] The invention relates to the technical field of the Internet, in particular to a file comparison method and device for HDFS.

Background

[0002] HDFS (Hadoop Distributed File System) is a distributed file system. It is highly fault tolerant and provides high-throughput access to application data, which makes it suitable for applications with very large data sets.

[0003] When comparing files on HDFS, the traditional file comparison methods include:

[0004] 1. Direct comparison method: first download the two files to be compared from HDFS to the local machine, and then compare them locally with a file comparison tool such as diff;

[0005] 2. Hash value comparison method: first download the two files to be compared from HDFS to the local machine, then calculate the hash value of each file separately, for example with the md5 algorithm, and finally compare the calculated md5 values.

[0006] Both of the above two methods need to downlo...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
Inventor: 潘瑾瑜
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD