Method and device for obtaining unique visitor (UV)

A technique for computing unique visitor (UV) counts, applied in the database field, that addresses the heavy computation and long running time of existing approaches and saves computing resources and computing time.

Active Publication Date: 2013-03-27
ALIBABA GRP HLDG LTD
6 Cites, 20 Cited by

AI Technical Summary

Problems solved by technology

To calculate the UV for two dimensions separately, the whole table must be traversed twice, and N dimensions require N full-table traversals. Each dimension also requires two traversal-and-deduplication passes, so N dimensions require 2N such passes. When many dimensions must be calculated, the number of full-table traversals and deduplication passes grows linearly with the number of dimensions, and so does the consumption of computing resources and computing time, which makes the calculation very costly and slow.
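As an illustration only (this is not code from the patent), the per-dimension approach described above can be sketched in Python. The function name `naive_uv`, the field names, and the sample log are all hypothetical. The sketch scans the whole log once per dimension and then makes a second pass over the deduplicated pairs to count users, so the work grows with the number of dimensions.

```python
from collections import defaultdict


def naive_uv(log, dimensions):
    """Compute UV per dimension value with one full scan per dimension.

    Hypothetical baseline: `log` is a list of dicts that each carry a
    "user_id" field plus one field per dimension.
    """
    result = {}
    full_scans = 0
    for dim in dimensions:
        # Pass 1: full traversal of the log, deduplicating (value, user) pairs.
        seen = set()
        for row in log:
            seen.add((row[dim], row["user_id"]))
        full_scans += 1

        # Pass 2: traverse the deduplicated pairs and count users per value.
        counts = defaultdict(int)
        for value, _user_id in seen:
            counts[value] += 1
        result[dim] = dict(counts)
    return result, full_scans


# Hypothetical search log with two dimensions.
naive_log = [
    {"user_id": "u1", "category": "books", "keyword": "python"},
    {"user_id": "u1", "category": "books", "keyword": "hive"},
    {"user_id": "u2", "category": "books", "keyword": "python"},
    {"user_id": "u2", "category": "music", "keyword": "python"},
]
naive_result, naive_scans = naive_uv(naive_log, ["category", "keyword"])
```

With two dimensions the log is scanned twice. Adding a third dimension would add a third full scan and a third counting pass.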



Embodiment Construction

[0049] In order to make the above objects, features and advantages of the present application more clearly understood, the present application will be described in further detail below with reference to the accompanying drawings and specific embodiments.

[0050] Referring to Figure 1, which shows a schematic flow chart of the method of the present application for obtaining the unique visitor (UV) count, the steps include:

[0051] Labeling step 100: traverse the user log once and, taking the user id as the reference, tag each record under each dimension with the global label of the dimension it belongs to;

[0052] First analysis operation step 110: aggregate and summarize the data, using the label, the dimension combination, and the user id as keys, to obtain user-granularity data;

[0053] Second analysis operation step 120: traverse the obtained user-granularity data, and aggregate and summarize the user-granularity data by using the com...
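The three steps can be pictured with a small Python sketch. This is a minimal in-memory illustration under stated assumptions, not the patent's Hive implementation. The function names, field names, and sample log are hypothetical, and each "dimension combination" is assumed to be a single field. The grouping key for step 120 is truncated in this passage, so the sketch follows the abstract, which names the label and the dimension combination.

```python
from collections import Counter


def label_records(log, dimensions):
    """Step 100 (sketch): one pass over the log, one labeled row per dimension."""
    for row in log:
        for dim in dimensions:
            # The label is the dimension name. The value stands in for the
            # "dimension combination" (assumed here to be a single field).
            yield (dim, row[dim], row["user_id"])


def to_user_granularity(labeled_rows):
    """Step 110 (sketch): aggregate by (label, value, user id).

    The result holds one entry per distinct key, with the visit count as value.
    """
    return Counter(labeled_rows)


def uv_per_dimension(user_rows):
    """Step 120 (sketch): aggregate by (label, value); each user counts once."""
    uv = Counter()
    for label, value, _user_id in user_rows:
        uv[(label, value)] += 1
    return dict(uv)


# Hypothetical search log with two dimensions.
sample_log = [
    {"user_id": "u1", "category": "books", "keyword": "python"},
    {"user_id": "u1", "category": "books", "keyword": "hive"},
    {"user_id": "u2", "category": "books", "keyword": "python"},
    {"user_id": "u2", "category": "music", "keyword": "python"},
]
user_rows = to_user_granularity(label_records(sample_log, ["category", "keyword"]))
uv = uv_per_dimension(user_rows)
```

The original log is read only once, in `label_records`. The later step works on the smaller user-granularity data, whatever the number of dimensions.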


Abstract

The invention provides a method and device for obtaining unique visitor (UV) counts, relating to the field of databases. The method comprises the following steps: traversing the user log once and, taking the user id as the reference, tagging each record under each dimension with the global label of the dimension it belongs to; aggregating and summarizing the data, using the label, the dimension combination, and the user id as keys, to obtain user-granularity data; traversing the obtained user-granularity data; and aggregating and summarizing the user-granularity data, using the label and the dimension combination as keys, to obtain the UV of each sub-dimension under each dimension. With the disclosed method and device, the original table is traversed only once in Hive to obtain the commonly needed UV figures under any dimension, which greatly saves computing resources and computing time.

Description

Technical field

[0001] The present application relates to the field of databases, and in particular to a method and device for obtaining unique visitor (UV) counts.

Background

[0002] In the back-end ETL (Extraction-Transformation-Loading: data extraction, transformation, and loading) work of a data warehouse, a common business scenario is to calculate UV (Unique Visitor, the number of distinct visiting users) under different dimensions for the user log, or user behavior record table, generated by the same user behavior. For example, for search access user logs, UV is calculated by search category, search keyword, and search sorting algorithm. The volume of this data is very large, usually hundreds of millions of records, so it is necessary to use hadoop (a distributed system infrastructure) or Hive (Hive is a data query and programming language based on the hadoop ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventors: 刘凡, 吕春建
Owner: ALIBABA GRP HLDG LTD