Feature engineering recommendation method and device based on Spark Streaming real-time streams, and video website

A feature engineering recommendation method, applied in video data retrieval and video metadata retrieval, that addresses the lack of timeliness, the difficulty of construction, and the inability to meet video recommendation requirements found in existing feature engineering, while achieving effective, accurate, and timely features.

Active Publication Date: 2019-10-11
FLYING FOX INFORMATION TECH TIANJIN CO LTD

AI Technical Summary

Problems solved by technology

[0008] Traditional feature engineering relies heavily on business understanding and experience: features suited to a particular field are obtained through repeated experimentation. Most of these features are offline features, which are strongly limited and lack timeliness. As a result, the range of application of the feature engineering is narrow and its construction is difficult, which makes it unsuitable for platform building and external promotion.
These shortcomings are at odds with the ecosystem orientation, platformization, sharing, timeliness, effectiveness, and convenience demanded in today's Internet era, and in particular they cannot meet the needs of video recommendation.

Embodiment Construction

[0033] The present invention will be further described in detail below in conjunction with the drawings and specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, but not to limit the present invention.

[0034] As shown in the figure, the feature engineering recommendation method based on Spark Streaming real-time streams according to the present invention includes:

[0035] Step 101: Obtain the exposure log and click log of the client, clean them, and write them to a distributed message queue;

[0036] In this step, a log collection tool such as Flume is used to collect the relevant logs from the client. Two main types of logs are collected: the display log, which is the exposure log of a video, and the click log, which records that a video was clicked. The collected logs are cleaned, and the cleaned logs are then written to a message queue (MQ) such as Kafka. Stream data cleaning mainly includes stream data fo...
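As a rough illustration of Step 101, the sketch below cleans raw log lines and groups them by log type before they would be handed to a message queue. The source text is cut off before it lists the actual cleaning rules, so everything specific here is an assumption made for illustration: the JSON line format, the field names (`event`, `user_id`, `video_id`, `ts`), the topic names, and the helper names `clean_record` and `route_to_topics`. A plain dictionary stands in for the queue; a real deployment would hand the records to a Kafka producer instead.

```python
import json
from collections import defaultdict

# Hypothetical schema: the patent text does not name the log fields.
REQUIRED_FIELDS = ("user_id", "video_id", "ts")
KNOWN_EVENTS = ("exposure", "click")


def clean_record(raw_line):
    """Parse one raw log line; return a normalized dict, or None if unusable."""
    try:
        rec = json.loads(raw_line)
    except (json.JSONDecodeError, TypeError):
        return None  # not valid JSON
    if not isinstance(rec, dict):
        return None
    if any(rec.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return None  # a required field is missing or empty
    if rec.get("event") not in KNOWN_EVENTS:
        return None  # neither an exposure nor a click
    try:
        ts = int(rec["ts"])
    except (ValueError, TypeError):
        return None  # timestamp is not an integer number of seconds
    return {
        "event": rec["event"],
        "user_id": str(rec["user_id"]),
        "video_id": str(rec["video_id"]),
        "ts": ts,
    }


def route_to_topics(raw_lines):
    """Clean the lines and group them under a hypothetical per-event topic name."""
    topics = defaultdict(list)
    for line in raw_lines:
        rec = clean_record(line)
        if rec is not None:
            topics[rec["event"] + "_log"].append(rec)
    return dict(topics)
```

Keeping exposures and clicks in separate topics mirrors the two log types described above and lets a downstream consumer subscribe to each stream independently.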

Abstract

The invention discloses a feature engineering recommendation method based on Spark Streaming real-time streams. The method comprises the following steps: exposure logs and click logs of a client are acquired, cleaned, and written to a distributed message queue; Spark Streaming subscribes to the exposure log stream and the click log stream, and the stream data of the two log streams are combined in the project; operations are performed on the stream data to generate labels that distinguish exposed-and-clicked stream data from exposed-but-not-clicked stream data; multidimensional features are constructed for the exposure logs and click logs from the basic features and combined with basic time features; and the stream data carrying the new features undergoes offline training and online training to generate recommendation stream data. The method provides a feature extraction approach applicable to most fields, which addresses the narrow range of application of feature engineering. Timeliness is achieved by making online processing primary and using offline processing for correction, and the effectiveness and accuracy of the features are achieved through a series of feature combinations and transformations.
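The label generation and time-feature steps in the abstract can be sketched in plain Python as follows. This is a minimal stand-in for what a Spark Streaming job would do over micro-batches, not the patented implementation. The record fields, the 300-second matching window, the crossed feature `user_x_hour`, and the function names `label_exposures` and `add_features` are all assumptions introduced for illustration; the source does not specify them.

```python
from datetime import datetime, timezone


def label_exposures(exposures, clicks, window_s=300):
    """Label each exposure 1 if the same user clicked the same video
    within window_s seconds after the exposure, else 0.

    window_s is a hypothetical parameter; the source gives no window size.
    """
    click_times = {}
    for c in clicks:
        key = (c["user_id"], c["video_id"])
        click_times.setdefault(key, []).append(c["ts"])

    labeled = []
    for e in exposures:
        times = click_times.get((e["user_id"], e["video_id"]), [])
        hit = any(0 <= t - e["ts"] <= window_s for t in times)
        labeled.append({**e, "label": 1 if hit else 0})
    return labeled


def add_features(rec):
    """Attach basic time features and one crossed feature to a labeled record."""
    dt = datetime.fromtimestamp(rec["ts"], tz=timezone.utc)
    out = dict(rec)
    out["hour"] = dt.hour          # hour of day in UTC
    out["weekday"] = dt.weekday()  # Monday == 0
    # Crossing a basic feature (user) with a time feature (hour) gives a
    # simple example of a multidimensional feature.
    out["user_x_hour"] = f'{rec["user_id"]}_{dt.hour}'
    return out
```

For example, `[add_features(r) for r in label_exposures(exposures, clicks)]` yields one training row per exposure, with a 0/1 label and the added time columns. In an actual streaming job the same join would need a bounded window or watermark so that state for unmatched exposures does not grow without limit.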

Description

Technical field

[0001] The present invention relates to the technical field of video recommendation processing, and in particular to a feature engineering recommendation method based on Spark Streaming real-time streams.

Background technique

[0002] With the advent of the Internet 2.0 era, the network is flooded with a large amount of information and data. How to mine valuable information from this huge and messy data has become a hot topic and is an important branch of data mining. It has also brought a period of rapid development to the field of machine learning. In machine learning practice, few people pay attention to feature engineering; most attention goes to the selection and optimization of models and algorithms. However, features are the raw material of machine learning systems, and their impact on the final model is beyond question.

[0003] Most models can learn well through the good structure in the data. Even if it is not the best model, high-quality features can also get good ...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F16/78
Inventor: 刘严泽田文宝李修鹏陈福欣莅党磊张玲
Owner: FLYING FOX INFORMATION TECH TIANJIN CO LTD