
Method for carrying out operation processing on massive files by using bitmap

A technology for massive files and arithmetic processing, which is applied in the processing of input data, electrical digital data processing, special data processing applications, etc.

Status: Inactive. Publication date: 2015-04-29
深圳市光息谷科技发展有限公司

AI Technical Summary

Problems solved by technology

However, as the size of the data set increases, the running time of the above method increases sharply.


Embodiment Construction

[0022] Embodiments of the present invention will now be described with reference to the drawings, in which like reference numerals represent like elements.

[0023] The core of the present invention is to use a bitmap to record whether a given value has appeared in the data set. A bitmap lookup takes constant time, which greatly improves processing efficiency.

[0024] As shown in Figure 5, in a computer system a byte is composed of 8 bits, and each bit can be in one of two states, 0 or 1. One byte can therefore record the presence of up to 8 numbers: for example, bits 0-7 indicate whether the numbers 0-7 are in the data set. If a number is present, the corresponding bit is set to 1.
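The bit layout described in paragraph [0024] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `Bitmap` and the methods `insert` and `contains` are hypothetical, and a Python `bytearray` stands in for the byte storage.

```python
class Bitmap:
    """Presence bitmap: bit i is 1 if the number i is in the data set.

    Hypothetical helper for illustration; names are not from the source.
    """

    def __init__(self, max_value):
        # One bit per possible value in 0..max_value, packed 8 per byte.
        self.max_value = max_value
        self.bits = bytearray(max_value // 8 + 1)

    def insert(self, n):
        # n >> 3 selects the byte, n & 7 selects the bit inside that byte.
        self.bits[n >> 3] |= 1 << (n & 7)

    def contains(self, n):
        # Constant-time lookup: one index and one mask operation.
        return bool(self.bits[n >> 3] & (1 << (n & 7)))


bm = Bitmap(7)  # a single byte covers the numbers 0-7
for n in (1, 3, 6):
    bm.insert(n)
```

After these three inserts the single byte holds the pattern `01001010`, so `bm.contains(3)` is true and `bm.contains(4)` is false.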

[0025] The commonly used 32-bit unsigned integer has a value range of 0 to 4294967295. If each value is represented by one byte, 4 GB of memory is required. On a 32-bit operating system, the memory available to an application is generally within 2 GB. Expressed by bits, o...
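The source text is cut off at this point, so its own figure for the bit representation is not visible here. As a plain arithmetic check, assuming one flag per possible 32-bit value, the two storage sizes can be compared like this (the variable names are illustrative only):

```python
VALUES = 2 ** 32                  # distinct 32-bit unsigned integers: 0..4294967295

bytes_one_byte_each = VALUES      # one byte of storage per value
bytes_one_bit_each = VALUES // 8  # one bit per value, 8 values packed per byte

GIB = 2 ** 30
MIB = 2 ** 20
print(bytes_one_byte_each // GIB, "GiB with one byte per value")
print(bytes_one_bit_each // MIB, "MiB with one bit per value")
```

Packing eight flags into each byte reduces the footprint by a factor of 8, which brings it under the roughly 2 GB limit mentioned for a 32-bit application.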


Abstract

The invention discloses a method for carrying out operation processing on massive files by using a bitmap. The method comprises the following steps: inserting a datum into the bitmap; judging whether a given number exists in the bitmap; outputting all data stored in the bitmap; carrying out deduplication; and taking a union, an intersection or a difference set. The method can be used to take the intersection, union and difference of massive data sets and to deduplicate them, so that data processing is greatly accelerated.
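The operations listed in the abstract can be sketched with byte-wise bit operations. This is a minimal sketch under stated assumptions, not the patented procedure: the function names (`make_bitmap`, `members`, `union`, `intersection`, `difference`) are hypothetical, both bitmaps are assumed to cover the same value range, and small sample values stand in for massive files.

```python
def make_bitmap(numbers, max_value):
    """Build a bitmap from an iterable of integers in 0..max_value."""
    bits = bytearray(max_value // 8 + 1)
    for n in numbers:
        bits[n >> 3] |= 1 << (n & 7)  # repeated values set the same bit again
    return bits


def members(bits):
    """List every number whose bit is set, in ascending order."""
    return [
        byte_index * 8 + bit
        for byte_index, byte in enumerate(bits)
        for bit in range(8)
        if byte & (1 << bit)
    ]


def union(a, b):
    return bytearray(x | y for x, y in zip(a, b))


def intersection(a, b):
    return bytearray(x & y for x, y in zip(a, b))


def difference(a, b):
    # Bits set in a but not in b.
    return bytearray(x & ~y for x, y in zip(a, b))


a = make_bitmap([5, 1, 9, 5, 1], max_value=15)  # input contains duplicates
b = make_bitmap([9, 2, 5], max_value=15)

deduplicated = members(a)
in_either = members(union(a, b))
in_both = members(intersection(a, b))
only_in_a = members(difference(a, b))
```

Deduplication falls out of the representation itself: setting a bit twice has no further effect, so reading the bitmap back yields each value once, already sorted.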

Description

Technical field

[0001] The invention relates to a method for computing and processing massive files by using a bitmap.

Background

[0002] In many software systems, there are application scenarios similar to the following:

[0003] Take the intersection of two batches of numbers. For example, QQ has 30 million members and Yellow Diamond has 20 million members, and the list of users who are both QQ members and Yellow Diamond members must be extracted.

[0004] Deduplicate a batch of numbers. For example, an e-commerce website ran a big "10.1" (October 1) promotion, and 70 million people logged in with their QQ numbers to browse and purchase. After the purchase records are exported, the list of QQ numbers is extracted, with each QQ number extracted only once.

[0005] Take the union of two batches of data. For example, there are 3 million users who play game A and 4.5 million users who play game B. Each user can only post once, and it is necessary to extract the player ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F7/06, G06F7/24, G06F16/951
Inventor: 国睿
Owner: 深圳市光息谷科技发展有限公司