Data deduplication method and device

A data deduplication technology, applied in the computer field, that addresses the waste of storage and system resources and achieves the effect of saving system resources.

Active Publication Date: 2018-03-23
RUN TECH CO LTD BEIJING

AI Technical Summary

Problems solved by technology

[0013] 1. The same data is saved in both h1 and h2 at the same time, which wastes resources;
[0014] 2. Additional auxiliary threads are needed to clear the data in h1 and h2, which wastes system resources.


Examples

Embodiment 1

[0034] Figure 1 is a schematic flowchart of a data deduplication method provided in the first embodiment of the present invention. As shown in Figure 1, the method includes the following steps:

[0035] Step 101. Send a data collection request to a collection device so that the collection device collects data from a network; the data is network data packets or communication signaling.

[0036] The data collection request includes a target address of the data to be collected, such as an IP address, and the collection device collects data from the network according to the target address. The collected data may be network data packets, communication signaling, or data files.

[0037] The data collection request may contain multiple different target addresses, in which case the collection device acquires data from the different networks in sequence according to the target addresses.
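
Step 101 and paragraphs [0036] and [0037] can be illustrated with a minimal sketch. The names `CollectionRequest`, `collect`, and `fetch`, as well as the sample addresses and payloads, are assumptions made for illustration; the text does not specify the request format or the capture mechanism.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List


@dataclass
class CollectionRequest:
    """Hypothetical request carrying one or more target addresses."""
    target_addresses: List[str] = field(default_factory=list)


def collect(request: CollectionRequest,
            fetch: Callable[[str], Iterable[bytes]]) -> Iterator[bytes]:
    """Visit each target address in order and yield the data it returns.

    `fetch` stands in for the real capture mechanism, which the text
    does not describe.
    """
    for address in request.target_addresses:
        yield from fetch(address)


# Stand-in data source keyed by address (illustrative values only).
fake_network = {
    "192.0.2.1": [b"pkt-a", b"pkt-b"],
    "192.0.2.2": [b"pkt-b", b"pkt-c"],
}
request = CollectionRequest(["192.0.2.1", "192.0.2.2"])
collected = list(collect(request, lambda addr: fake_network.get(addr, [])))
```

In this sketch the second address returns a packet that the first address already produced, which is the kind of duplicate the later steps are meant to filter out.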

[0038] Step 102. Receive the first data sent by the collection device.

[0039] H...

Embodiment 2

[0064] Figure 2 is a schematic flowchart of a data deduplication method provided by the second embodiment of the present invention. As shown in Figure 2, the method includes the following steps:

[0065] Step 201. Receive a deduplication request message input by a user.

[0066] For details, refer to the relevant description of this step in the foregoing embodiments, and details are not repeated here.

[0067] Step 202. Send a data collection request to the collection device so that the collection device collects data from the network; the data is network data packets or communication signaling.

[0068] Step 203. Receive the first data sent by the collection device.

[0069] For details, refer to the relevant description of this step in the foregoing embodiments, and details are not repeated here.

[0070] Step 204. Detect whether the first data is stored in the first data structure in the cache.

[0071] Wherein, the first data structure is used to store data an...

Embodiment 3

[0094] Figure 3 is a schematic flowchart of a data deduplication method provided by the third embodiment of the present invention. On the basis of the above embodiments, this embodiment adds steps for cleaning second data, that is, data stored in the second data structure whose time in the cache exceeds a preset duration. As shown in Figure 3, the method specifically includes the following steps:

[0095] Step 301. Query the time at which the data stored in the second data structure arrived in the cache.

[0096] For example, the arrival time of the data stored in the second data structure can be queried periodically, such as once every first preset duration. The first preset duration may be equal to the preset duration or may be any duration greater than the preset duration.

[0097] Step 302: Determine the second data that stays in the cache longer than a preset duration according to the time when the data arrives in th...
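
The time-based cleaning described in this embodiment can be sketched roughly as follows. This is an illustration under assumptions, not the patented implementation: a single dictionary stands in for the second data structure, and the names `TimedCache`, `insert`, `expired`, and `clean` are hypothetical. The `now` parameter exists only so the example can run deterministically.

```python
import time
from typing import Dict, List, Optional


class TimedCache:
    """Cache that records when each item arrived (assumed layout)."""

    def __init__(self, preset_duration: float) -> None:
        self.preset_duration = preset_duration
        self._arrival: Dict[bytes, float] = {}  # data -> arrival time

    def insert(self, data: bytes, now: Optional[float] = None) -> bool:
        """Store new data with its arrival time; reject duplicates."""
        now = time.monotonic() if now is None else now
        if data in self._arrival:
            return False
        self._arrival[data] = now
        return True

    def expired(self, now: Optional[float] = None) -> List[bytes]:
        """Return data that has stayed longer than the preset duration."""
        now = time.monotonic() if now is None else now
        return [d for d, t in self._arrival.items()
                if now - t > self.preset_duration]

    def clean(self, now: Optional[float] = None) -> List[bytes]:
        """Remove expired data from the cache and return what was removed."""
        stale = self.expired(now)
        for d in stale:
            del self._arrival[d]
        return stale


cache = TimedCache(preset_duration=10.0)
cache.insert(b"old", now=0.0)
cache.insert(b"new", now=8.0)
removed = cache.clean(now=12.0)  # only b"old" has exceeded 10 seconds
```

A caller would invoke `clean` once per first preset duration, for example from the same loop that processes incoming data.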

Abstract

The embodiments of the present invention disclose a data deduplication method and device. The method includes: sending a data collection request to a collection device so that the collection device collects data from the network, the data being network data packets or communication signaling; receiving first data sent by the collection device; and detecting whether the first data is stored in the cache. If the first data is already stored, the first data is discarded; if it is not stored, the first data is inserted into the cache. Because only one copy of the data needs to be stored to complete deduplication, the embodiments achieve data deduplication while saving system resources.
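
The detect, discard, or insert logic summarized in the abstract can be sketched as below. This is a minimal illustration under assumptions: the class name `DedupCache`, the method `accept`, and the choice to key the cache by a SHA-256 digest are not taken from the patent text, which does not state how stored data is identified.

```python
import hashlib
from typing import Iterable, List, Set


class DedupCache:
    """Single-structure deduplication: one set holds every item seen."""

    def __init__(self) -> None:
        self._seen: Set[bytes] = set()

    def accept(self, data: bytes) -> bool:
        """Return True and store `data` if it is new, False if duplicate."""
        key = hashlib.sha256(data).digest()  # assumption: key by digest
        if key in self._seen:
            return False  # already in the cache: discard
        self._seen.add(key)  # not in the cache: insert
        return True


def deduplicate(stream: Iterable[bytes]) -> List[bytes]:
    """Keep the first occurrence of each item, preserving arrival order."""
    cache = DedupCache()
    return [item for item in stream if cache.accept(item)]


unique = deduplicate([b"pkt-a", b"pkt-b", b"pkt-b", b"pkt-c", b"pkt-a"])
```

Only one structure is kept here, which matches the abstract's point that a single stored copy is enough to decide whether incoming data is a duplicate.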

Description

Technical field

[0001] Embodiments of the present invention relate to the field of computer technology, and in particular to a data deduplication method and device.

Background art

[0002] With the development of computer and communication technology, network applications have spread rapidly, and the network has increasingly become an indispensable everyday tool. At the same time, in order to meet the needs of network security and services, network data must be collected and analyzed. Because of the network design and the collection scheme, the collected data often contains a large amount of duplicate data, which has a significant impact on subsequent storage and analysis. Therefore, in practical applications, data is deduplicated before storage and analysis.

[0003] The data deduplication method commonly used in the prior art is the double hash method. The deduplication process of the double hash method mainly includes a data processing flow and an ...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 陶小龙
Owner: RUN TECH CO LTD BEIJING