
CTR estimation method and system based on FM algorithm

A CTR estimation method in the computer field that addresses the lack of generalization, difficulty of implementation, and high complexity of existing approaches, and enhances the generalization ability of the model.

Active Publication Date: 2021-02-05
玩咖欢聚文化传媒(北京)有限公司

AI Technical Summary

Problems solved by technology

Existing approaches have drawbacks. The kernel method is hard to implement because of its high complexity. The tree-based method, first proposed by the Facebook team in 2014, effectively solves the feature combination problem of the LR model, but it still only memorizes historical behavior and lacks generalization.



Examples


Embodiment 1

[0040] A CTR estimation method based on the FM algorithm (see Figure 1) includes:

[0041] S1: Implement the FM model as an extension of the ml package in the Spark cluster, and perform dimensionality reduction optimization on the FM model to obtain a quasi-linear model;

[0042] Specifically, Spark clusters can be used to build large-scale, low-latency data analysis applications. Spark provides in-memory distributed datasets, which speed up iterative workloads as well as supporting interactive queries. Spark is implemented in Scala and uses Scala as its application framework, and Scala's language features have contributed much to Spark's success.

[0043] Step S1 refers to the machine learning library implemented by Spark, which makes practical machine learning scalable and easy to use. In implementing the FM algorithm on Spark, step S1 also follows Spark's official recommendation to use the DataFrame API instead of the RDD API, beca...

Embodiment 2

[0054] Embodiment 2 adds the following to Embodiment 1:

[0055] The FM model is the sum of the linear model's objective function and the pairwise feature-cross terms. Its objective function is:

[0056] $$y(x) = \omega_0 + \sum_{i=1}^{n} \omega_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \omega_{ij} x_i x_j$$

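[0057] The quadratic parameters $\omega_{ij}$ form a symmetric matrix $W$, which is decomposed as $W = V V^{T}$. The $i$-th row of $V$ is the hidden vector $v_i$ of the $i$-th feature, so each parameter is $\omega_{ij} = \langle v_i, v_j \rangle$, and the FM model can be rewritten as:

[0058] $$y(x) = \omega_0 + \sum_{i=1}^{n} \omega_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j$$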

[0059] where $\omega_0 \in \mathbb{R}$ and $V \in \mathbb{R}^{n \times k}$; $\mathbb{R}$ is the set of real numbers and $\mathbb{R}^{n \times k}$ is the set of $n \times k$ real matrices; $n$ is the number of sample features; $k$ is the length of the hidden vector, with $k$ much smaller than $n$; $i$ and $j$ are index variables; $x_i$ is the value of the $i$-th feature; $v_i$ is the hidden vector of $x_i$; and $\omega_0$ and $\omega_i$ are parameters of the FM model.
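
As a minimal sketch of the factorized model above, the following Python computes the FM score directly from $\omega_0$, $\omega_i$, and the hidden vectors. The function names `dot` and `fm_score` and all numeric values are illustrative assumptions, not taken from the patent.

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product <a, b> of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def fm_score(w0, w, V, x):
    """Score = w0 + sum_i w[i]*x[i] + sum_{i<j} <V[i], V[j]> * x[i] * x[j]."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = sum(
        dot(V[i], V[j]) * x[i] * x[j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    return w0 + linear + pairwise


# Made-up parameters: n = 3 features, hidden vectors of length k = 2.
w0 = 0.1
w = [0.2, -0.1, 0.4]
V = [[1.0, 0.0], [0.5, 0.5], [1.0, 2.0]]  # row i is the hidden vector v_i
x = [1.0, 0.0, 1.0]  # sparse input: the second feature is inactive

score = fm_score(w0, w, V, x)
print(score)
```

Because the second feature is zero, only the pair formed by the first and third features contributes to the cross term, which is why FM-style models tend to be cheap to evaluate on sparse inputs.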

[0060] Computed directly, the model has time complexity $O(kn^2)$, since all pairwise inte...
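
The quadratic cost can be avoided with the well-known identity $\sum_{i<j} \langle v_i, v_j \rangle x_i x_j = \tfrac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i} v_{if} x_i \right)^2 - \sum_{i} v_{if}^2 x_i^2 \right]$, which needs only $O(kn)$ operations. The sketch below checks that rewrite against the direct double sum. The function names and values are hypothetical, and this illustrates the standard FM reformulation rather than claiming to be the exact optimization described in the patent.

```python
def pairwise_direct(V, x):
    """O(k * n^2): sum over i < j of <V[i], V[j]> * x[i] * x[j]."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            inner = sum(a * b for a, b in zip(V[i], V[j]))
            total += inner * x[i] * x[j]
    return total


def pairwise_linear(V, x):
    """O(k * n): 0.5 * sum_f [(sum_i V[i][f]*x[i])^2 - sum_i (V[i][f]*x[i])^2]."""
    k = len(V[0])
    total = 0.0
    for f in range(k):
        s = 0.0      # running sum of v_if * x_i
        s_sq = 0.0   # running sum of (v_if * x_i)^2
        for v_i, x_i in zip(V, x):
            term = v_i[f] * x_i
            s += term
            s_sq += term * term
        total += s * s - s_sq
    return 0.5 * total


# Made-up values: n = 4 features, hidden vectors of length k = 2.
V = [[1.0, 0.5], [0.5, -1.0], [2.0, 0.0], [0.0, 3.0]]
x = [1.0, 2.0, 0.0, 1.0]

direct = pairwise_direct(V, x)
fast = pairwise_linear(V, x)
print(direct, fast)
```

Both functions return the same cross term; the second touches each of the $n$ hidden vectors once per latent dimension instead of once per feature pair.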

Embodiment 3

[0067] A CTR estimation system based on the FM algorithm (see Figure 2) includes:

[0068] Construction unit: used to implement the FM model as an extension of the ml package in the Spark cluster, and to perform dimensionality reduction optimization on the FM model to obtain a quasi-linear model;

[0069] Training unit: used to select different feature combinations in the environment to be tested, and to train the quasi-linear model on them;

[0070] Comparison unit: used to run an A/B test on the training results of the different feature combinations, select the best feature combination together with its trained quasi-linear model as the best model, and persist it in HDFS;

[0071] Estimation unit: used to call the quasi-linear model of the best model, select the features in the environment to be tested according to the feature combination of the best model, and pass the selected features into the called quasi-linear model for calculation to obtain the CTR prediction result.
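
The comparison step can be sketched as follows, assuming the A/B test compares feature combinations by observed CTR (clicks divided by impressions). The patent excerpt does not state the selection criterion, so that choice, the `VariantResult` and `pick_best` names, the feature names, and all counts are hypothetical. Persisting the winner to HDFS is omitted.

```python
from dataclasses import dataclass


@dataclass
class VariantResult:
    """A/B test outcome for one feature combination (all values hypothetical)."""
    features: tuple
    impressions: int
    clicks: int

    @property
    def ctr(self) -> float:
        # CTR = clicks / impressions; an empty bucket counts as 0.
        return self.clicks / self.impressions if self.impressions else 0.0


def pick_best(results):
    """Return the variant with the highest observed CTR (assumed criterion)."""
    return max(results, key=lambda r: r.ctr)


results = [
    VariantResult(("ad_id", "user_id"), impressions=10_000, clicks=120),
    VariantResult(("ad_id", "user_id", "hour"), impressions=10_000, clicks=150),
    VariantResult(("ad_id", "slot"), impressions=8_000, clicks=100),
]

best = pick_best(results)
print(best.features, best.ctr)
```

A production comparison would normally also check that the difference between variants is statistically significant before fixing the best feature combination.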

[0072] Further, the object...


Abstract

The invention provides a CTR estimation method and system based on the FM algorithm. The method implements the FM model as an extension of the ml package in a Spark cluster and performs dimensionality reduction optimization on the FM model to obtain a quasi-linear model; selects different feature combinations in the environment to be tested and trains the quasi-linear model on them; runs an A/B test on the training results of the different feature combinations, selects the best feature combination together with its trained quasi-linear model as the best model, and persists it in HDFS; and calls the quasi-linear model of the best model, selects features in the environment to be tested according to the best model's feature combination, and passes the selected features into the called quasi-linear model to obtain the CTR estimate. The FM model automatically learns the weights of high-order attributes without manual selection of features to cross, and takes the relationships between features into account, which enhances the generalization ability of the model. It is suited to sparse data and can be used for CTR estimation in advertising, where computation time requirements are strict.

Description

Technical field

[0001] The invention belongs to the technical field of computers, and in particular relates to a CTR estimation method and system based on the FM algorithm.

Background

[0002] CTR (click-through rate) is a commonly used term in Internet advertising: the number of actual clicks on an ad divided by its number of impressions. CTR estimation is a key technical link in mainstream Internet applications (advertising, recommendation, search, etc.), and its accuracy directly affects the user experience and revenue of Internet products. In the advertising industry, click-through rate estimation is a very important component of the programmatic advertising transaction framework. There are two main indicators for click-through rate estimation:

[0003] 1. Sorting indicators. The sorting index is the most basic index. The quality of the sorting determines whether we have the ability to find th...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G06Q30/02
CPC: G06Q30/0242, G06F18/24
Inventor: 张震, 吕传成
Owner: 玩咖欢聚文化传媒(北京)有限公司