Method and device for predicting advertisement click rate

An advertisement click-rate prediction technology, applied in the field of big data computing, that addresses problems such as high memory usage and the impact of missing data, and that improves data utilization efficiency and yields a more accurate advertisement click rate.

Inactive Publication Date: 2015-04-22
BEIJING ZHANGKUO TECH
Cites: 5. Cited by: 36.

AI Technical Summary

Problems solved by technology

[0002] At present, logistic regression is generally used to calculate the estimated click-through rate when predicting advertisement clicks. However, logistic regression depends heavily on the data. If the data show a linear relationship, predicting the click-through rate with logistic regression gives good results; for non-linear data, its results are markedly worse. In current click-through rate estimation methods, non-linear data are mostly linearized by combining feature discretization with 0-1 encoding, but no feature selection is performed on the original features to exclude irrelevant ones. This causes the following disadvantages: 1. The number of features grows exponentially and a large number of irrelevant features appear, which reduces the accuracy of the subsequent click-through rate estimate. 2. The growth in the number of features leads to a very serious problem: it takes up a large amount of memory. 3. The whole process does not screen out irrelevant features, and the method is very sensitive to missing data: the loss of part of the data has a large impact on click-through rate estimates made with logistic regression.



Examples


Embodiment 1

[0049] This method proposes an idea and its specific algorithmic process, and expresses it with a formula. Compared with existing advertisement click-through rate estimation methods, it obtains a more accurate click-through rate estimate: it selects the most valuable features, excludes irrelevant or less relevant features, is not sensitive to missing data, and identifies irrelevant features automatically.

[0050] As shown in Figures 1, 2, and 3, a method for predicting the click-through rate of an advertisement includes:

[0051] Step 1) Obtain historical data samples as training data;

[0052] Use the random forest method to select the most valuable features and exclude irrelevant or less relevant features;

[0053] Step 2) Use a regression model to predict the advertisement click-through rate from the training data obtained above, after the irrelevant or less relevant features have been removed.

[0054] In step 1), obtain historical data...
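
The two steps of this embodiment can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the forest is reduced to single-split trees (stumps), features are scored by the total Gini gain of the splits they win (the source does not say how features are scored), the data are synthetic, and every function and parameter name (`select_features`, `fit_logistic`, `n_trees`, `n_sub`, `keep`) is hypothetical.

```python
import math
import random


def gini(labels):
    """Gini impurity of a list of 0/1 click labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def stump_gain(rows, labels, j):
    """Largest impurity reduction of any single threshold split on feature j."""
    parent, n, best = gini(labels), len(labels), 0.0
    for t in sorted({r[j] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[j] <= t]
        right = [y for r, y in zip(rows, labels) if r[j] > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def select_features(rows, labels, keep, n_trees=25, n_sub=2, seed=0):
    """Step 1 (simplified): rank features by the gain they earn across stumps."""
    rng = random.Random(seed)
    n, m = len(rows), len(rows[0])
    importance = [0.0] * m
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # resample the rows
        feats = rng.sample(range(m), n_sub)          # random feature subset
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        gains = {j: stump_gain(b_rows, b_labels, j) for j in feats}
        winner = max(gains, key=gains.get)           # feature this stump splits on
        importance[winner] += gains[winner]
    ranked = sorted(range(m), key=lambda j: importance[j], reverse=True)
    return sorted(ranked[:keep])


def predict(w, b, x):
    """Click probability 1 / (1 + exp(-(w.x + b)))."""
    return 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, x)) + b)))


def fit_logistic(rows, labels, lr=0.5, epochs=500):
    """Step 2: logistic regression fitted by batch gradient descent."""
    m, n = len(rows[0]), len(rows)
    w, b = [0.0] * m, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * m, 0.0
        for x, y in zip(rows, labels):
            err = predict(w, b, x) - y
            grad_b += err
            for j in range(m):
                grad_w[j] += err * x[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic history (assumption): only feature 0 drives clicks; 1-3 are noise.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(4)] for _ in range(60)]
y = [1 if row[0] > 0.5 else 0 for row in X]

selected = select_features(X, y, keep=2)


def project(row):
    """Keep only the selected feature columns."""
    return [row[j] for j in selected]


w, b = fit_logistic([project(r) for r in X], y)
p_high = predict(w, b, project([0.9, 0.5, 0.5, 0.5]))
p_low = predict(w, b, project([0.1, 0.5, 0.5, 0.5]))
```

On this toy data the informative feature should survive selection, and the model fitted on the reduced feature set should assign a higher click probability to the sample with the larger value of that feature. A production system would more likely rely on a full random forest library rather than hand-written stumps.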

Embodiment 2

[0062] Step 2) may use existing technology, or may be further optimized according to the present invention. Specifically, in step 2), using the regression model to predict the advertisement click-through rate on the data obtained above, after irrelevant or less relevant features have been removed, includes predicting based on the logistic regression model:

[0063] P{y=1|f(x)} = 1 / (1 + exp(-(w·x + b))),   (1)

[0064] Here, f(x) is the value predicted for the advertisement by logistic regression, w is the advertisement weight vector, x is the advertisement sample data after irrelevant or less relevant features have been removed, and P{y=1|f(x)} is the posterior probability, given the predicted value, that the advertisement is actually clicked by the user. The formula above therefore gives the probability that a new advertisement sample is clicked.
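
As a worked illustration of formula (1), the short sketch below evaluates the click probability for one sample. It assumes that f(x) = w·x + b with b a scalar bias, which the formula implies but the text does not state; the weights and sample values are made up, and the function name `click_probability` is hypothetical.

```python
import math


def click_probability(w, x, b):
    """Formula (1): P{y=1|f(x)} = 1 / (1 + exp(-(w.x + b)))."""
    f_x = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Branch on the sign so that exp() never receives a large positive argument.
    if f_x >= 0:
        return 1.0 / (1.0 + math.exp(-f_x))
    e = math.exp(f_x)
    return e / (1.0 + e)


w = [0.8, -0.4]   # hypothetical weight vector
x = [1.0, 2.0]    # hypothetical sample after feature selection
b = 0.0           # hypothetical bias

p = click_probability(w, x, b)            # f(x) = 0.8 - 0.8 = 0, so p = 0.5
p_larger = click_probability(w, x, 2.0)   # raising f(x) raises the probability
```

A value of f(x) equal to zero maps to a probability of exactly 0.5, and the probability increases monotonically with f(x).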

[0065] Further include: Step 22) By constructing a loss function, u...

Embodiment 3

[0070] The following are specific embodiments, specifically including:

[0071] 1. Use the random forest method to select the most valuable features and exclude irrelevant or less relevant features. The method is not sensitive to missing data and identifies irrelevant features automatically.

[0072] The details are as follows:

[0073] For the training data T={x,y}, x denotes the training samples, of which there are n, each with m feature dimensions; y is the label of the corresponding sample, where 0 means that the advertisement was not clicked by the user and 1 means that it was clicked. Randomly draw from the n training samples a sample set that also has size n and, at the same time, randomly select from the m feature dimensions, to obtain a d(
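
The sampling step described in paragraph [0073] might look like the sketch below. Two assumptions are made because the source text is cut off: rows are drawn with replacement (the usual bootstrap reading of drawing n samples from n), and the size of the feature subset is passed in as the hypothetical parameter `n_sub_features`, since the source's own definition of d is truncated. The function name `draw_tree_sample` is also hypothetical.

```python
import random


def draw_tree_sample(X, y, n_sub_features, rng):
    """Draw one resampled training set restricted to a random feature subset."""
    n, m = len(X), len(X[0])
    # Assumption: sample n row indices with replacement (bootstrap).
    row_idx = [rng.randrange(n) for _ in range(n)]
    # Choose distinct feature columns at random.
    feat_idx = sorted(rng.sample(range(m), n_sub_features))
    X_sub = [[X[i][j] for j in feat_idx] for i in row_idx]
    y_sub = [y[i] for i in row_idx]
    return X_sub, y_sub, feat_idx


# Toy data: row i holds the values 10*i + j, so each value identifies its origin.
X = [[10 * i + j for j in range(4)] for i in range(6)]
y = [i % 2 for i in range(6)]

X_sub, y_sub, feat_idx = draw_tree_sample(X, y, 2, random.Random(0))
```

Each call yields the data for one tree: the same number of rows as the original set, with labels kept aligned to their rows, and only the chosen feature columns.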


Abstract

The invention discloses a method and a device for predicting advertisement click rate. The method includes: 1) acquiring historical data samples as training data, and using a random forest method to select the most valuable features and remove irrelevant or weakly correlated features; 2) using a regression model to predict the advertisement click rate from the training data with the irrelevant or weakly correlated features removed. With this scheme, the range of usable data is greatly widened, and data utilization efficiency improves once the irrelevant or weakly correlated features are removed. The method is not sensitive to missing data and can achieve a good result even when some data are lost. Irrelevant features in the data are determined automatically, and an accurate advertisement click rate can be obtained.

Description

Technical field

[0001] The invention belongs to the field of big data computing, and in particular relates to a method and device for predicting the click-through rate of advertisements.

Background technique

[0002] At present, logistic regression is generally used to calculate the estimated click-through rate when predicting advertisement clicks. However, logistic regression depends heavily on the data. If the data show a linear relationship, predicting the click-through rate with logistic regression gives good results; for non-linear data, its results are markedly worse. In current click-through rate estimation methods, non-linear data are mostly linearized by using feature di...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06Q30/02
CPC: G06Q30/0242
Inventor: 王玮
Owner: BEIJING ZHANGKUO TECH