Method, device, equipment and storage medium for deduplication and storage of massive log data

A database and log technology, applied in the field of data processing, that addresses the problem of a single computer's disk capacity being insufficient and not indefinitely expandable, thereby avoiding excessive demand for disk capacity.

Active Publication Date: 2020-09-11
RUN TECH CO LTD BEIJING
Cites: 14; Cited by: 0

AI Technical Summary

Problems solved by technology

[0007] The present invention provides a method, device, equipment and storage medium for deduplication and storage of massive log data, so as to improve the efficiency of deduplicating massive log data while avoiding the problem that, as the amount of log data grows, the disk capacity of a single computer becomes insufficient and cannot be expanded indefinitely.

Examples

Embodiment 1

[0027] Figure 1 is a flowchart of a method for deduplicating massive log data provided in the first embodiment of the present invention. This embodiment is applicable where valuable information is stored after massive log data has been deduplicated; for example, the deduplicated log data can be provided to the police to facilitate investigations. The method can be executed by the device for deduplication of massive log data provided by an embodiment of the present invention. The device can be implemented in software and/or hardware and can generally be integrated in computer equipment. As shown in Figure 1, the method of this embodiment specifically includes:

[0028] S110. Acquire massive log data to be stored in the first time interval.

[0029] The first time interval is a preset time interval, and the massive log data within that interval is deduplicated. Preferably, the first time interval is one day, that is, the massive log data ...
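
As an illustration of step S110 with a one-day interval, the following is a minimal sketch, not the patent's implementation. It assumes each log record is a hypothetical `(timestamp, message)` pair held in memory; the source does not specify the record format or where the logs are read from.

```python
from datetime import date, datetime
from typing import Iterable, List, Tuple

# Hypothetical record shape: (timestamp, message). The source does not define it.
TimedLog = Tuple[datetime, str]


def logs_for_day(records: Iterable[TimedLog], day: date) -> List[TimedLog]:
    """Return the records whose timestamp falls on the given calendar day."""
    return [record for record in records if record[0].date() == day]


if __name__ == "__main__":
    sample = [
        (datetime(2020, 9, 10, 23, 59), "login user=1"),
        (datetime(2020, 9, 11, 0, 0), "login user=2"),
        (datetime(2020, 9, 11, 12, 30), "logout user=2"),
    ]
    # Select the batch to be stored for the interval 2020-09-11.
    for ts, msg in logs_for_day(sample, date(2020, 9, 11)):
        print(ts.isoformat(), msg)
```

At real scale the batch would not fit in a Python list; the point here is only how the interval bounds the data that one deduplication run handles.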

Embodiment 2

[0064] Figure 2 is a schematic diagram of the structure of a device for deduplication of massive log data provided in the second embodiment of the present invention. This embodiment is applicable where valuable information is stored after massive log data has been deduplicated; for example, the deduplicated log data can be provided to the police to facilitate investigations. The device can be implemented in software and/or hardware and can generally be integrated in computer equipment. As shown in Figure 2, the device for deduplication of massive log data specifically includes: a to-be-stored data acquisition module 210, a to-be-stored pre-deduplication result acquisition module 220, a full deduplication result acquisition module 230, and a database update module 240, in which:

[0065] The to-be-stored data acquisition module 210 is configured to acquire the massive log data to be stored in the first time interval;

[0066] The pre-deduplication result...

Embodiment 3

[0089] Figure 3 is a schematic diagram of the hardware structure of a computer device provided in the third embodiment of the present invention. As shown in Figure 3, the computer equipment includes:

[0090] One or more processors 310 (Figure 3 takes one processor 310 as an example);

[0091] Memory 320;

[0092] The computer equipment may further include: an input device 330 and an output device 340.

[0093] The processor 310, the memory 320, the input device 330, and the output device 340 in the computer equipment may be connected by a bus or by other means; Figure 3 takes a bus connection as an example.

[0094] As a non-transitory computer-readable storage medium, the memory 320 can be used to store software programs, computer-executable programs, and modules, such as the program instructions / modules corresponding to a method for deduplication of massive log data in an embodiment of the present invention (for example, in Figure 2, the dat...

Abstract

The invention discloses a method, device, equipment and storage medium for deduplication and storage of massive log data. The method comprises: acquiring the massive log data to be stored in a first time interval; performing local deduplication on the massive log data to be stored to obtain a to-be-stored pre-deduplication result; performing global deduplication on the to-be-stored pre-deduplication result and a reference full deduplication result to obtain the full deduplication result corresponding to the first time interval, wherein the reference full deduplication result is the full deduplication result obtained by the previous deduplication and storage operation; and updating a log database according to the full deduplication result corresponding to the first time interval. The method achieves deduplication and storage of massive log data, solves the problem of excessive demand on the disk capacity of a single computer, and greatly improves the efficiency of deduplicating, counting, and storing massive log data.
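
The two-stage flow described in the abstract can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the patent's implementation: records are hypothetical `(primary_key, payload)` pairs, the first record seen for a key is the one kept, and a plain dictionary stands in for both the reference full deduplication result and the log database. The function names are invented for this example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: (primary_key, payload). Not specified by the source.
LogRecord = Tuple[str, str]


def local_dedup(batch: Iterable[LogRecord]) -> Dict[str, str]:
    """Local deduplication: keep one record per key within a single interval."""
    pre_result: Dict[str, str] = {}
    for key, payload in batch:
        pre_result.setdefault(key, payload)  # assumption: first occurrence wins
    return pre_result


def global_dedup(pre_result: Dict[str, str],
                 reference: Dict[str, str]) -> Dict[str, str]:
    """Global deduplication: merge the interval's pre-deduplication result
    with the full result left by the previous run."""
    full = dict(reference)
    for key, payload in pre_result.items():
        full.setdefault(key, payload)  # keys already stored are not duplicated
    return full


def run_interval(batch: Iterable[LogRecord],
                 reference: Dict[str, str]) -> Dict[str, str]:
    """One deduplication-and-storage pass for one time interval."""
    return global_dedup(local_dedup(batch), reference)


if __name__ == "__main__":
    log_db: Dict[str, str] = {}  # stand-in for the log database
    day1 = [("a", "x"), ("a", "x"), ("b", "y")]
    day2 = [("b", "y"), ("c", "z"), ("c", "z")]
    log_db = run_interval(day1, log_db)  # database updated with day 1's full result
    log_db = run_interval(day2, log_db)  # day 1's result is the reference for day 2
    print(sorted(log_db))
```

Deduplicating inside the batch first means the global step only compares one record per key against the previous full result, rather than every raw log line.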

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of data processing, and in particular to a method, device, equipment, and storage medium for deduplication of massive log data.

Background art

[0002] In a computer, log files record events that occur during the operation of the operating system or other software, or messages exchanged between different users of communication software. People's work and life are now inseparable from computers, and the total amount of log data exceeds one trillion entries. It is therefore necessary to deduplicate massive log data and store the valuable information extracted from it.

[0003] Two methods are usually used for deduplication of massive log data:

[0004] The first way is to use the Redis cache database to save the primary key information of the log data. The system reads massive log data one by one, obtains the primary key information of the log data from the ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/23, G06F16/215, G06F16/22
CPC: G06F16/215, G06F16/2282, G06F16/2358
Inventors: 谢永恒, 邹焱, 火一莽, 万月亮
Owner: RUN TECH CO LTD BEIJING