A method and apparatus for processing duplicate data, an electronic device, and a medium

CN122308735A · Pending · Publication Date: 2026-06-30 · JINAN INSPUR DATA TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: JINAN INSPUR DATA TECH CO LTD
Filing Date: 2026-03-25
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

In distributed storage systems, deduplicating data can leave logical blocks referencing physical blocks on other nodes. These cross-node references increase network bandwidth consumption and transmission latency, load nodes unevenly, and create performance bottlenecks, which makes it difficult to balance the deduplication rate against node access.

Method used

On receiving a write request, the system determines whether the data to be written is a duplicate, counts the logical data blocks mapped to each candidate physical data block, and selects a target physical data block for the mapping. This prioritizes local storage resources, avoids cross-node access, and balances load.

Benefits of technology

The method reduces the network bandwidth consumption and transmission latency caused by cross-node data access, evens out the access distribution, avoids the performance bottleneck that arises when too many logical blocks reference a single physical data block, and improves overall system performance and storage efficiency.

✦ Generated by Eureka AI based on patent content.

Smart Images

Figure CN122308735A_ABST

Patent Text Reader

Abstract

This application discloses a method, apparatus, electronic device, and medium for processing duplicate data, relating to the field of computer technology. The method is applied to a storage system and includes: receiving a write request; if the storage system already stores the data to be written, determining the candidate physical data blocks within the storage system that store that data; if the storage nodes holding the candidate physical data blocks include a first node, counting, based on a fingerprint table corresponding to the storage system, the number of logical data blocks corresponding to each candidate physical data block in the first node; determining a first target physical data block from the storage system based on those counts; and storing, in the fingerprint table, the mapping relationship between the first logical data block corresponding to the first logical block address and the first target physical data block. By prioritizing data access within a storage node, this application saves transmission bandwidth and reduces transmission latency.
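
The write path described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation. The `FingerprintTable` and `PhysicalBlock` classes, the `handle_write` function, and the use of SHA-256 as the fingerprint are all hypothetical. The sketch also makes three assumptions the abstract does not state: the target is the candidate on the first node with the fewest mapped logical blocks, any candidate may be used when none is on the first node, and non-duplicate data is stored as a new block on the first node.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PhysicalBlock:
    block_id: int
    node: str                               # storage node holding this block
    lbas: set = field(default_factory=set)  # logical block addresses mapped to it


class FingerprintTable:
    """Hypothetical fingerprint table: content fingerprint -> physical blocks."""

    def __init__(self):
        self.by_fingerprint = {}  # fingerprint (hex str) -> list of PhysicalBlock
        self.next_id = 0

    def candidates(self, fingerprint):
        # Physical blocks that already store data with this fingerprint.
        return self.by_fingerprint.get(fingerprint, [])

    def add_block(self, fingerprint, node):
        block = PhysicalBlock(self.next_id, node)
        self.next_id += 1
        self.by_fingerprint.setdefault(fingerprint, []).append(block)
        return block


def handle_write(table, first_node, lba, data):
    """Map logical block `lba` to a physical block, preferring `first_node`.

    Remapping an LBA that was previously written is omitted for brevity.
    """
    fingerprint = hashlib.sha256(data).hexdigest()
    candidates = table.candidates(fingerprint)
    if not candidates:
        # Assumption: non-duplicate data becomes a new block on the first node.
        target = table.add_block(fingerprint, first_node)
    else:
        local = [b for b in candidates if b.node == first_node]
        # Assumption: with no candidate on the first node, consider all of them.
        pool = local or candidates
        # Assumption: pick the least-referenced candidate; ties go to lowest id.
        target = min(pool, key=lambda b: (len(b.lbas), b.block_id))
    target.lbas.add(lba)  # record the logical-to-physical mapping
    return target
```

Calling `handle_write(table, "node-1", 10, b"payload")` twice with different logical block addresses would map both addresses to a block on `node-1`, so later reads served by that node would not need to cross the network. Choosing the least-referenced local block is one way to spread references across copies, in line with the load-balancing goal stated in the summary.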
