A hot event aggregation method and device

A hot event aggregation method, applied in the field of information processing. It addresses flawed similarity representations and the inaccurate text similarity judgments that result from them, and it achieves a better aggregation effect.

CN108829699B | Active | Publication Date: 2021-05-25 | BEIJING QIYI CENTURY SCI & TECH CO LTD

Detailed Description of the Embodiments

[0072] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0073] Refer to FIG. 1, which shows a flow chart of the steps of an embodiment of a hot event aggregation method of the present invention. The method may specifically include the following steps:

[0074] Step 101, obtaining an original report based on the title of the hot event;

[0075] In practical applications, a search engine can obtain hot events from a hot search list, or mine them from data in which the number of query hits rises sharply. Hot events can of course also be determined in other ways; this embodiment places no limitation on how they are determined.
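As a rough illustration of mining hot events from query data, the sketch below flags queries whose current hit count is far above their recent average. The function name `find_spiking_queries`, the `ratio` and `min_hits` parameters, and the data layout are all hypothetical; the source does not specify how a sharp rise is detected.

```python
from typing import Dict, List


def find_spiking_queries(
    history: Dict[str, List[int]],
    current: Dict[str, int],
    ratio: float = 3.0,    # assumed: how many times above baseline counts as a spike
    min_hits: int = 100,   # assumed: ignore queries with too little traffic
) -> List[str]:
    """Return queries whose current hits exceed `ratio` times their past mean.

    `history` maps a query to its hit counts in earlier time windows;
    `current` maps a query to its hit count in the latest window.
    """
    hot = []
    for query, hits in current.items():
        past = history.get(query, [])
        # A query with no history has a baseline of zero.
        baseline = sum(past) / len(past) if past else 0.0
        if hits >= min_hits and hits > ratio * baseline:
            hot.append(query)
    # Most-queried candidates first.
    return sorted(hot, key=lambda q: current[q], reverse=True)
```

A production system would likely also account for seasonality and noise; this threshold rule is only the simplest possible stand-in.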

[0076] In this embodiment of the present invention, there may be a single hot event and, correspondingly, a single hot event title.

[0077] In a preferred embodiment of the present invent...

Abstract

An embodiment of the present invention provides a hot event aggregation method and device. The method comprises: obtaining original reports based on the title of a hot event; determining a seed report and multiple non-seed reports based on the title of the hot event and the original reports; generating a hot event cluster from the seed report; calculating the similarity between each non-seed report and both the title of the hot event and each report in the hot event cluster; obtaining the non-seed report with the highest similarity; determining whether that similarity is greater than a similarity threshold; and, if so, storing the non-seed report with the highest similarity in the hot event cluster. By introducing the seed report, and by using the similarity to the hot event as well as the similarity between reports during aggregation, the embodiment lets the clustering algorithm focus on the event itself, measure text similarity more accurately, and obtain a better aggregation effect.
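The steps in the abstract can be sketched as a greedy loop. This is a minimal illustration under several assumptions that the source does not confirm: the seed is taken to be the report most similar to the title, the similarity measure is a token-set Jaccard score, the title and cluster similarities are combined with equal weights, and the select-and-compare step is repeated until no candidate passes the threshold. The names `jaccard`, `aggregate`, and `threshold` are hypothetical.

```python
from typing import Callable, List


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (an assumed stand-in for the patent's measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def aggregate(
    title: str,
    reports: List[str],
    threshold: float = 0.3,
    sim: Callable[[str, str], float] = jaccard,
) -> List[str]:
    """Grow a hot event cluster from a seed report, one best candidate at a time."""
    if not reports:
        return []

    # Assumption: the seed is the original report closest to the event title.
    seed = max(reports, key=lambda r: sim(title, r))
    cluster = [seed]
    remaining = list(reports)
    remaining.remove(seed)  # everything else is a non-seed report

    def score(report: str) -> float:
        # Assumption: equal weighting of title similarity and mean cluster similarity.
        to_title = sim(title, report)
        to_cluster = sum(sim(report, c) for c in cluster) / len(cluster)
        return 0.5 * to_title + 0.5 * to_cluster

    while remaining:
        best = max(remaining, key=score)
        if score(best) <= threshold:
            break  # the best candidate fails the threshold, so stop
        cluster.append(best)
        remaining.remove(best)
    return cluster
```

Because each candidate is scored against the title as well as the current cluster members, a report that merely shares vocabulary with one clustered report but is unrelated to the event title receives a lower score than it would under pairwise report similarity alone.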

Description

Technical field

[0001] The present invention relates to the technical field of information processing, in particular to a hot event aggregation method and a hot event aggregation device.

Background

[0002] Hot event aggregation is an important basic technology of NLP (natural language processing), and it plays an important role in recommendation, search, bubble and other businesses.

[0003] To aggregate reports related to a hot event, most current approaches use TF-IDF word-weight clustering, which captures the similarity between related reports to a certain extent. After a text is segmented into words, the TF-IDF value of each word is calculated and used as that word's weight. Once the word vector is generated, similarity is calculated from the cosine distance, and a clustering algorithm then aggregates the corresponding reports according to the similarity between texts.

[0004] However, since ...
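The TF-IDF baseline described in paragraph [0003] can be illustrated with a small pure-Python sketch. It assumes whitespace tokenization in place of real word segmentation and uses a smoothed IDF formula; both choices, along with the function names `tfidf_vectors` and `cosine`, are assumptions made for illustration, not details from the source.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector (token -> weight) per document."""
    # Assumption: whitespace split stands in for a proper word segmenter.
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each token appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            # Smoothed IDF keeps weights positive even for ubiquitous tokens.
            t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

For example, `cosine(*tfidf_vectors(["earthquake hits city", "new phone released"]))` evaluates to `0.0` because the two texts share no tokens. The weights depend only on word overlap, with no reference to the event itself.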

Application Information

Patent Timeline
25 May 2021: Publication of CN108829699B

IPC: G06F16/9535; G06F16/35; G06F40/258; G06F40/289; G06F40/30
CPC: G06F40/258; G06F40/289; G06F40/30
Inventors: 张轩玮