
Method and system of cardinality estimation based on real-time calculation

A real-time calculation and cardinality estimation technology, applied in fields such as data error detection, calculation, and structured data retrieval. It addresses the problems that processing capacity cannot be scaled out, that the merging property of cardinality algorithms is not realized, and that operating costs are high. It achieves efficient real-time counting, occupies less storage space, and avoids waste.

Active Publication Date: 2017-11-14
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1
Cites: 1. Cited by: 4.

AI Technical Summary

Problems solved by technology

[0004] However, using Redis's HyperLogLog Counting for cardinality estimation still has the following disadvantages: Redis does not implement the merging feature of the cardinality algorithm, so processing capacity cannot be scaled out under large data volumes; because the entire calculation pipeline is handed over to Redis, the system forms a strong dependency on Redis; in addition, building a Redis cluster incurs a large operating cost.
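The merging property mentioned in [0004] can be illustrated with a minimal sketch. It is not the patent's implementation: the precision `P`, the SHA-1 based hash, and the function names `new_sketch`, `add`, `merge`, and `estimate` are all assumptions made for illustration. The point it shows is that two HyperLogLog register arrays built independently can be combined with a register-wise maximum, and the result is identical to a sketch built over the union of both inputs.

```python
import hashlib
import math

P = 12          # assumed precision: 2**P registers (illustrative choice)
M = 1 << P      # number of registers


def _hash64(value: str) -> int:
    """64-bit hash derived from SHA-1 (an illustrative choice of hash)."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")


def new_sketch() -> list:
    return [0] * M


def add(sketch: list, value: str) -> None:
    h = _hash64(value)
    idx = h >> (64 - P)                       # top P bits select a register
    rest = h & ((1 << (64 - P)) - 1)          # remaining 64-P bits
    rank = (64 - P) - rest.bit_length() + 1   # position of the leftmost 1-bit
    sketch[idx] = max(sketch[idx], rank)


def merge(a: list, b: list) -> list:
    """Register-wise maximum: the sketch of the union of both inputs."""
    return [max(x, y) for x, y in zip(a, b)]


def estimate(sketch: list) -> float:
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in sketch)
    zeros = sketch.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)        # small-range (linear counting) correction
    return raw


# Two hypothetical workers see overlapping slices of the same stream.
worker_a, worker_b, single = new_sketch(), new_sketch(), new_sketch()
for i in range(6000):
    add(worker_a, f"user-{i}")
for i in range(4000, 10000):
    add(worker_b, f"user-{i}")
for i in range(10000):
    add(single, f"user-{i}")

merged = merge(worker_a, worker_b)
print(round(estimate(merged)))   # an estimate of the 10000 distinct users
```

Because merging is lossless, partial sketches can be computed on separate workers and combined later, which is what allows processing capacity to scale horizontally.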



Examples


Detailed Description of the Embodiments

[0027] Exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0028] FIG. 1 is a schematic diagram of the main steps of a method for cardinality estimation based on real-time calculation according to an embodiment of the present invention.

[0029] As shown in FIG. 1, the method for cardinality estimation based on real-time calculation in this embodiment mainly includes performing the following steps i...


Abstract

The invention provides a method and a system for cardinality estimation based on real-time calculation. Based on probability and statistics theory, the method and system perform efficient cardinality estimation and thus meet the need for real-time cardinality calculation in big data scenarios. The method includes the following steps, executed in a Storm system: acquiring a log message in real time; parsing the log message to obtain index information, wherein the index information includes the name of each index and a corresponding index value; using an HLL cardinality estimation algorithm to perform cardinality estimation on each index; and outputting the cardinality of each index.
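The steps in the abstract (acquire a log message, parse it into index names and values, estimate per-index cardinality with HLL, output the result) can be sketched in plain Python. This is a simplified illustration, not the Storm topology the patent describes: the log format (`name=value` pairs joined by `&`), the class `HLL`, and the functions `parse_log` and `run` are hypothetical names chosen for this example.

```python
import hashlib
import math
from collections import defaultdict


class HLL:
    """Compact HyperLogLog estimator (illustrative, p=10 assumed by default)."""

    def __init__(self, p: int = 10):
        self.p = p
        self.m = 1 << p
        self.reg = [0] * self.m

    def add(self, value: str) -> None:
        h = int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")
        idx = h >> (64 - self.p)                       # register index
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1   # leftmost 1-bit position
        if rank > self.reg[idx]:
            self.reg[idx] = rank

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.reg)
        zeros = self.reg.count(0)
        if raw <= 2.5 * self.m and zeros:
            raw = self.m * math.log(self.m / zeros)    # small-range correction
        return round(raw)


def parse_log(line: str) -> dict:
    """Parse an assumed 'name=value&name=value' log line into index info."""
    fields = (f.split("=", 1) for f in line.strip().split("&") if "=" in f)
    return {name: value for name, value in fields}


def run(lines) -> dict:
    """Feed every index value into that index's own estimator, then report."""
    sketches = defaultdict(HLL)
    for line in lines:
        for name, value in parse_log(line).items():
            sketches[name].add(value)
    return {name: sketch.count() for name, sketch in sketches.items()}


# Simulated log stream: 500 distinct visitors and 7 distinct shops.
logs = [f"visitor=u{i % 500}&shop=s{i % 7}" for i in range(5000)]
result = run(logs)
print(result)
```

In a Storm deployment, the log source would typically play the role of a spout and the parsing and estimation steps would run in bolts; that mapping is an assumption here, since the excerpt does not spell it out.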

Description

Technical field

[0001] The invention relates to the field of computer technology and software, and in particular to a method and system for cardinality estimation based on real-time calculation.

Background technique

[0002] Cardinality counting is the calculation of the number of unique elements in a collection that may contain duplicates, for example counting the unique visitors of an entire website or store. In the context of big data, traditional cardinality calculation encounters difficulties, mainly the rapid growth of the required computing and storage resources as the data volume and the number of analysis dimensions increase. Therefore, an efficient cardinality estimation mechanism is needed.

[0003] A cardinality estimation algorithm is a kind of probabilistic algorithm that, provided the error is controllable, can estimate cardinality at much lower time and space cost than exact calculation. Algorithm features: 1. The error is controllab...
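The trade-off described in the background can be made concrete with a small sketch. Exact counting must remember every distinct element, whereas a HyperLogLog estimator with m registers has a commonly cited standard error of about 1.04/√m regardless of the data volume. The function names `exact_cardinality` and `hll_precision_for` are hypothetical and used only for this illustration.

```python
import math


def exact_cardinality(items) -> int:
    """Exact baseline: memory grows with the number of distinct elements."""
    return len(set(items))


def hll_precision_for(target_error: float) -> tuple:
    """Smallest power-of-two register count whose standard error
    (approximately 1.04 / sqrt(m)) does not exceed target_error.

    Returns (p, m, achieved_error) with m = 2**p registers.
    """
    m_needed = (1.04 / target_error) ** 2
    p = math.ceil(math.log2(m_needed))
    m = 1 << p
    return p, m, 1.04 / math.sqrt(m)


print(exact_cardinality(["u1", "u2", "u1", "u3"]))
p, m, err = hll_precision_for(0.01)
print(p, m, err)
```

The register count depends only on the target error, not on how many elements are counted, which is why the estimator's storage stays small as the data volume and the number of analysis dimensions grow.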


Application Information

IPC(8): G06F11/14, G06F17/30
CPC: G06F11/1448, G06F11/1464, G06F16/27
Inventor: 王向长, 邵先凯, 李威, 张鹏
Owner: BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD