Web news retrieval method based on event analysis

A Web news retrieval technology based on event analysis, applied in the field of information retrieval. It addresses three problems of existing methods: they do not analyze the roles of the different query items in the query content, they fail to use the structural features of Web news, and they do not meet the needs of practical applications.

Active Publication Date: 2013-03-20
HUAIHAI INST OF TECH

AI Technical Summary

Problems solved by technology

[0005] However, the existing Web news retrieval methods do not analyze the effects of different query items in the query content, fail to use the structural characteristics of Web news, and do

Examples

Embodiment 1

[0043] Embodiment 1: a Web news retrieval method based on event analysis. The specific steps are as follows:

[0044] A. Enter the two parts of the query, the event item Q_e and the constraint item Q_c, in the query boxes to obtain the query term Q = {Q_e, Q_c};

[0045] B. Represent a Web news item d_i by three parts, the title T, the first paragraph FP, and the last paragraph LP, so that d_i = {T, FP, LP}. The specific steps are as follows:

[0046] B1. From the HTML source code of the Web news page, extract the text between the two tags as the title (T) of d_i;

[0047] B2. Extract the first paragraph of the body of the Web news as the first paragraph (FP) of d_i;

[0048] B3. Extract the last paragraph of the body of the Web news as the last paragraph (LP) of d_i, giving the three-part representation d_i = {T, FP, LP};

[0049] C. Calculate the weight of the feature items in each part of d_i = {T, FP, LP}. The specific steps are as follows:

[0050] C1. Assume the event item Q_e = {a_2} and the constraint item Q_c = {t, l, o, a_1}; in the news d_i the n...
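Step B above can be illustrated with a minimal sketch. It is not the patented implementation: the excerpt does not name the tags that delimit the title, so the sketch assumes the title sits in `<title>` and that each body paragraph is a `<p>` element. The names `NewsPartsParser` and `represent_news` are hypothetical.

```python
from html.parser import HTMLParser


class NewsPartsParser(HTMLParser):
    """Collect title text and paragraph texts.

    Assumption: title is in <title>, body paragraphs are <p> elements.
    """

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._in_title = False
        self._in_p = False
        self._buf = []

    def _flush_paragraph(self):
        # Normalize whitespace and keep only non-empty paragraphs.
        text = " ".join("".join(self._buf).split())
        if text:
            self.paragraphs.append(text)
        self._buf = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            if self._in_p:  # previous <p> was never closed
                self._flush_paragraph()
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._flush_paragraph()

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self._buf.append(data)


def represent_news(html):
    """Return d_i = {T, FP, LP} for one news page."""
    parser = NewsPartsParser()
    parser.feed(html)
    parser.close()
    if parser._in_p:  # flush a trailing unclosed paragraph
        parser._flush_paragraph()
    paras = parser.paragraphs
    return {
        "T": parser.title.strip(),
        "FP": paras[0] if paras else "",
        "LP": paras[-1] if paras else "",
    }
```

A real news page would also need body detection (navigation and advertisement paragraphs removed) before FP and LP are taken; that step is outside what this excerpt describes.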

Embodiment 2

[0072] Embodiment 2, with reference to Figure 1: an application experiment of the Web news retrieval method based on event analysis. The method includes the following steps:

[0073] Step 101: input the event item Q_e and the constraint item Q_c to obtain the query term Q = {Q_e, Q_c}, as follows:

[0074] A1. Provide two types of input boxes, one for event items and one for constraint items, and enter the query content in the corresponding input box;

[0075] A2. An event is expressed as a four-tuple e = {t, l, o, a}, where t denotes time, l denotes location, o denotes object, and a denotes action. The event item is Q_e = {a_1, a_2, ..., a_m}; a typical query contains one event item, that is, m = 1. The constraint item is Q_c = {t, l, o, a_1, a_2, ..., a_n}; the constraints are t, l, o, or other events a_i, and a typical query contains 0-2 event constraint items. For example, if Q_c = {"2008", "Wenchuan", "Earthquake"} and Q_e = {"Death"}, then the query is Q = {"2008 Wenchuan earthquake death"}, where "2008" is a time constraint, "Wenchuan" is a location co...
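The query structure in step A2 can be sketched as a small data model. This is only an illustration; the class names `Constraint` and `EventQuery` and their methods are hypothetical. The role of "Earthquake" is left unset because the excerpt is cut off before stating it.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Roles from the four-tuple e = {t, l, o, a}: time, location, object, action.
VALID_ROLES = {"t", "l", "o", "a"}


@dataclass
class Constraint:
    value: str
    role: Optional[str] = None  # None when the role is not known

    def __post_init__(self):
        if self.role is not None and self.role not in VALID_ROLES:
            raise ValueError(f"unknown role: {self.role!r}")


@dataclass
class EventQuery:
    event_items: List[str]                                  # Q_e
    constraints: List[Constraint] = field(default_factory=list)  # Q_c

    def terms(self):
        """All query terms: constraint values first, then event items."""
        return [c.value for c in self.constraints] + list(self.event_items)

    def as_text(self):
        """Flatten Q = {Q_e, Q_c} into a single query string."""
        return " ".join(self.terms())


# The example from the text: Q_c = {"2008", "Wenchuan", "Earthquake"}, Q_e = {"Death"}.
query = EventQuery(
    event_items=["Death"],
    constraints=[
        Constraint("2008", role="t"),
        Constraint("Wenchuan", role="l"),
        Constraint("Earthquake"),  # role not stated in the excerpt
    ],
)
```

Keeping the event item and the constraint items in separate fields preserves the distinction that the two input boxes capture, which a single keyword string would lose.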

Abstract

The invention discloses a Web news retrieval method based on event analysis. The method comprises the following steps: providing two types of input boxes, one for the event term Q_e and one for the constraint term Q_c, and obtaining the query term Q = {Q_e, Q_c}; representing a Web news item d_i by its title (T), first paragraph (FP), and last paragraph (LP), so that d_i = {T, FP, LP}; calculating the weight of the feature terms in each part of d_i = {T, FP, LP}; setting the weight of each feature term in the query term to one; calculating the relevance R(Q, d_i) between the query term Q and the news item d_i; and sorting by R(Q, d_i) in descending order and outputting the retrieval result. The method distinguishes the roles of the different terms in the query content and proposes a way to compute the relevance between an event query and Web news that combines the event action element, the importance of the Web news title, and the distance between the event term and the constraint term. It can noticeably improve retrieval accuracy for event-type Web news.
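The final steps of the abstract (score each news item, then sort in descending order) can be sketched as follows. The excerpt does not give the formula for R(Q, d_i), so `toy_relevance` is a stand-in, not the patented computation. It counts query-term occurrences per part, uses weight 1 for every query term as the abstract states, and applies hypothetical part weights that favor the title. It does not model the distance between the event term and the constraint term.

```python
# Hypothetical part weights; the excerpt does not give actual values.
PART_WEIGHTS = {"T": 2.0, "FP": 1.0, "LP": 1.0}


def toy_relevance(query_terms, news):
    """Stand-in for R(Q, d_i): weighted count of query-term matches."""
    score = 0.0
    for part, weight in PART_WEIGHTS.items():
        words = news.get(part, "").lower().split()
        for term in query_terms:
            # Each query term carries weight 1, as stated in the abstract.
            score += weight * words.count(term.lower())
    return score


def rank_news(query_terms, news_items, relevance=toy_relevance):
    """Return (score, news) pairs sorted by descending relevance."""
    scored = [(relevance(query_terms, d), d) for d in news_items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


news_items = [
    {"T": "Sports results", "FP": "The match ended", "LP": "Fans left"},
    {
        "T": "Wenchuan earthquake death toll",
        "FP": "The earthquake struck in 2008",
        "LP": "Rescue continues",
    },
]
ranked = rank_news(["2008", "Wenchuan", "earthquake", "death"], news_items)
```

Passing the relevance function as a parameter keeps the ranking step independent of the scoring step, so a faithful implementation of R(Q, d_i) could replace the stand-in without touching `rank_news`.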

Description

Technical field

[0001] The invention belongs to the field of information retrieval, and specifically relates to a Web news retrieval method based on event analysis.

Background technique

[0002] Because real-world events are clearly reflected on the Internet, there is a large number of event-oriented Web news reports online. Using search engines to obtain event-related information from the Internet has become an urgent need for users. However, because of the rapid growth of information on the Internet, the results returned by general search engines are often voluminous and imprecise. After the user enters a keyword, little of what is returned is useful, especially when retrieving event information.

[0003] Event retrieval means obtaining related information or accurate answers from query keywords entered by the user. TDT (Topic Detection and Tracking) has a certain connection with event retrieval, and its main purpose is to organize and utilize information flow...

Application Information

IPC(8): G06F17/30
Inventor: 仲兆满, 李存华, 管燕
Owner: HUAIHAI INST OF TECH