Method and device for finding hot videos based on user query logs in real time

A user query log technique, applied to video data querying, video data retrieval, and special-purpose data processing. It addresses the inability of vocabulary-based word segmentation to segment new words, the unsatisfactory search results this causes, and the failure to reflect semantic associations between words. It avoids combinatorial explosion, is simple and efficient to engineer, and improves efficiency.

Active Publication Date: 2013-04-03
ALIBABA (CHINA) CO LTD

AI Technical Summary

Problems solved by technology

One difficulty in analyzing user logs is that new terms and hot topics continuously emerge in daily user query logs, such as "European Cup" and "Corridor Faye Wong and Liu Meilin". The original word segmentation program cannot reflect the semantic associations of these new words; that is, it may split a string that should form a single semantically connected word into multiple words.
The word segmentation program generally adopts a vocabulary-based method: it scans a string against a predetermined vocabulary and finds the most suitable segmentation through a matching strategy (forward maximum, reverse maximum, two-way matching, etc.). The disadvantage of this method is that it cannot segment words that are not in the original vocabulary, that is, new words.
This defect can lead to unsatisfactory fuzzy matching results (fuzzy matching means that only part of the query words are matched during a search).
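The limitation of vocabulary-based segmentation can be illustrated with a minimal sketch of forward maximum matching. The function name `forward_max_match`, the `max_len` parameter, and the toy vocabulary are assumptions made for illustration only; the source does not specify the segmenter's implementation.

```python
def forward_max_match(text, vocab, max_len=8):
    """Greedy left-to-right segmentation (illustrative sketch).

    At each position, take the longest vocabulary entry (up to max_len
    characters) that matches; if nothing matches, emit one character.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabulary that predates the new term "europeancup".
old_vocab = {"european", "cup", "video"}
print(forward_max_match("europeancupvideo", old_vocab))
```

Because "europeancup" is missing from the vocabulary, the segmenter can only return its known parts as separate words, which is the new-word problem described above.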

Examples

Embodiment Construction

[0032] To make the above objects, features, and advantages of the present invention more apparent and understandable, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments:

[0033] Because real-time hot spots are characterized by a large number of searches within a relatively short period of time, new hot words and hot events are most likely to be discovered by analyzing the latest user query logs, which improves the real-time responsiveness of search ranking results. Figure 1 is a schematic diagram of the implementation principle of the method for discovering hot videos in real time based on user query logs. As shown in Figure 1, the present invention inputs the user query logs from a period of time into the word segmentation program to obtain the word segmentation result of each user query. The words extracted here are called atomic words. Then, on this basis, count th...

Abstract

The invention relates to a method and a device for finding hot videos in real time based on user query logs. The method comprises the following steps: first, performing word segmentation on the user video query logs from a certain period of time to obtain atomic words; then, counting the number of occurrences of each atomic word in those logs and the number of times any two atomic words appear together in the same user query; calculating the association degree of any two atomic words in the logs from these counts using the pointwise mutual information (PMI) method, merging any two atomic words whose association degree exceeds a certain threshold into a compound word, and placing the compound word into a compound word list; and finally, sorting the compound words in descending order and taking a certain proportion of the top-ranked compound words as keywords for finding hot videos in real time.
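The steps in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation: the function name `find_compound_words`, the parameters `threshold` and `top_ratio`, counting each word at most once per query, estimating probabilities by dividing counts by the number of queries, and sorting by PMI score are all assumptions, since the abstract does not fix these details.

```python
import math
from collections import Counter
from itertools import combinations


def find_compound_words(segmented_queries, threshold, top_ratio):
    """Return the top-ranked (word pair, PMI) entries above a threshold.

    segmented_queries: list of queries, each a list of atomic words.
    threshold, top_ratio: hypothetical tuning parameters.
    """
    n = len(segmented_queries)
    word_count = Counter()  # number of queries containing each atomic word
    pair_count = Counter()  # number of queries containing both words
    for words in segmented_queries:
        unique = sorted(set(words))  # assumption: count once per query
        word_count.update(unique)
        pair_count.update(combinations(unique, 2))

    scored = []
    for (a, b), n_ab in pair_count.items():
        # PMI(a, b) = log( p(a, b) / (p(a) * p(b)) )
        pmi = math.log((n_ab / n) / ((word_count[a] / n) * (word_count[b] / n)))
        if pmi > threshold:
            scored.append(((a, b), pmi))

    # Assumption: rank compound words by PMI, highest first.
    scored.sort(key=lambda item: item[1], reverse=True)
    keep = math.ceil(len(scored) * top_ratio)
    return scored[:keep]


# Toy, made-up query log that has already been segmented into atomic words.
queries = [
    ["european", "cup", "video"],
    ["european", "cup"],
    ["european", "cup", "final"],
    ["funny", "video"],
    ["cat", "video"],
    ["funny", "cat"],
]
for pair, score in find_compound_words(queries, threshold=0.5, top_ratio=1.0):
    print(pair, round(score, 3))
```

Only pairwise co-occurrence counts are kept, so the work grows with the number of word pairs per query rather than with all possible word combinations. On tiny samples like this one, rare words can score as high as genuinely associated pairs, so a real log would need enough volume for the counts to be meaningful.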

Description

Technical field

[0001] The invention belongs to the technical field of statistical analysis of Internet data, and in particular relates to a method and device for discovering hot videos in real time based on user query logs.

Background technique

[0002] With the rapid development of the Internet, users have placed higher requirements on video search results: results must not only be relevant but also highly timely, which makes real-time search increasingly important. Real-time video search refers to searching the video library instantly and quickly, so that results are available as soon as they are requested. Through real-time search, users can obtain first-hand information on hot events as soon as they occur. However, compared with traditional search, real-time search also brings great challenges. Because hot events are sudden and unpredictable, the number of relevant videos and the number of clicks are likely to be relatively small, resulting in thei...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/73, G06F16/951
Inventor: 李力行, 姚健, 潘柏宇, 卢述奇, 尹玉宗
Owner: ALIBABA (CHINA) CO LTD