Web news retrieval method based on event analysis

A Web news retrieval method based on event analysis, applied in the field of information retrieval. It addresses the problems that existing methods do not analyze the roles of the different query items in the query content, do not use the structural features of Web news, and therefore fail to meet the needs of practical applications.

Active Publication Date: 2015-06-03
HUAIHAI INST OF TECH

AI Technical Summary

Problems solved by technology

[0005] However, existing Web news retrieval methods do not analyze the roles of the different query items in the query content, do not use the structural characteristics of Web news, and do not consider the distance between query items. As a result, their retrieval accuracy for some event information is low and cannot meet the needs of practical applications.

Examples

Embodiment 1

[0043] Embodiment 1: a Web news retrieval method based on event analysis. The specific steps are as follows:

[0044] A. Enter the two parts, the event item Qe and the constraint item Qc, in the query boxes to obtain the query item Q = {Qe, Qc};

[0045] B. Represent a piece of Web news di by three parts, the title T, the first paragraph FP and the last paragraph LP, so that di = {T, FP, LP}. The specific steps are as follows:

[0046] B1. From the HTML source code of the Web news, extract the part between the two tags that delimit the title as the title (T) of di;

[0047] B2. Extract the first paragraph of the body text of the Web news as the first paragraph (FP) of di;

[0048] B3. Extract the last paragraph of the body text of the Web news as the last paragraph (LP) of di, giving the three parts of the Web news di = {T, FP, LP};

[0049] C. Calculate the weight of each feature item in each part of di = {T, FP, LP}. The specific steps are as follows:

[0050] C1. Suppose the event item Qe = {a2}, the constraint item Q ...
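
Step B above can be illustrated with a minimal Python sketch. It is not the patent's implementation: it assumes the title is delimited by <title> and </title> and that body paragraphs are <p> elements, neither of which the text confirms. The names `_NewsParts` and `split_news` are hypothetical.

```python
from html.parser import HTMLParser


class _NewsParts(HTMLParser):
    """Collect the <title> text and the text of every <p> element.

    Assumption: the page marks its title with <title> and its body
    paragraphs with <p>; real news pages may need site-specific rules.
    """

    def __init__(self):
        super().__init__()
        self.title_chunks = []
        self.paragraphs = []
        self._in_title = False
        self._buf = None  # None while outside a <p> element

    def flush(self):
        # Store the paragraph collected so far, skipping empty ones.
        if self._buf is not None:
            text = "".join(self._buf).strip()
            if text:
                self.paragraphs.append(text)
            self._buf = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            self.flush()  # an unclosed <p> ends where the next one starts
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p":
            self.flush()

    def handle_data(self, data):
        if self._in_title:
            self.title_chunks.append(data)
        elif self._buf is not None:
            self._buf.append(data)


def split_news(html):
    """Return di as a dict with the keys T, FP and LP (steps B1 to B3)."""
    parser = _NewsParts()
    parser.feed(html)
    parser.close()
    parser.flush()
    paras = parser.paragraphs
    return {
        "T": "".join(parser.title_chunks).strip(),
        "FP": paras[0] if paras else "",
        "LP": paras[-1] if paras else "",
    }
```

For a page with a single body paragraph, FP and LP are the same text in this sketch; how the method itself treats that case is not stated in the excerpt.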

Embodiment 2

[0072] Embodiment 2: with reference to Figure 1, an application experiment of the Web news retrieval method based on event analysis. The method includes the following steps:

[0073] Step 101: input the event item Qe and the constraint item Qc to obtain the query item Q = {Qe, Qc}, as follows:

[0074] A1. Provide two types of input box, one for the event item and one for the constraint item, and enter the query content in the corresponding input box;

[0075] A2. An event is expressed as a quadruple e = {t, l, o, a}, where t represents time, l location, o object and a action. The event item is Qe = {a1, a2, ..., am}; a typical query contains one event item, that is, m = 1. The constraint item is Qc = {t, l, o, a1, a2, ..., an}; the constraints are t, l, o or other events ai, and a typical query contains 0-2 event constraint items. For example, if Qc = {"2008", "Wenchuan", "Earthquake"} and Qe = {"death"}, then the query item is Q = {"2008 Wenchuan earthquake death"}, wherein "...
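
The query structure described in Step 101 can be pictured with a small Python sketch. The class name `EventQuery`, its field names and the ordering used by `text()` (constraints first, then the event item, as in the Wenchuan example) are assumptions made for illustration, not part of the patent text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventQuery:
    """Query item Q = {Qe, Qc} entered through two separate input boxes."""

    # Qe: action terms; the text says a typical query has exactly one.
    event_items: List[str]
    # Qc: time, location, object or other event terms (may be empty).
    constraint_items: List[str] = field(default_factory=list)

    def terms(self):
        """All query terms, constraints first (assumed ordering)."""
        return self.constraint_items + self.event_items

    def text(self):
        """Flatten the query into a single search string."""
        return " ".join(self.terms())


# The example from paragraph [0075].
q = EventQuery(event_items=["death"],
               constraint_items=["2008", "Wenchuan", "Earthquake"])
print(q.text())
```

Keeping Qe and Qc in separate fields preserves the distinction between event and constraint items that the method relies on, instead of collapsing the query into one keyword list.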

Abstract

The invention discloses a Web news retrieval method based on event analysis. The method comprises the following steps: providing two types of input box, for an event item Qe and a constraint item Qc, and obtaining a query item Q = {Qe, Qc}; representing a piece of Web news di by its title (T), first paragraph (FP) and last paragraph (LP), so that di = {T, FP, LP}; calculating the weight of each feature item in each part of di = {T, FP, LP}; setting the weight of each feature item in the query item to 1; calculating the relevance R(Q, di) between the query item Q and the news di; and sorting the news in descending order of R(Q, di) and outputting the retrieval result. The method distinguishes the roles of the different items in the query content and proposes a way of computing the relevance between an event query and Web news that combines the event action element, the importance of the Web news title, and the distance between the event item and the constraint item, which can significantly improve the retrieval accuracy for event-type Web news.
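
The final ranking step of the abstract can be sketched in Python as follows. The formula for R(Q, di) is not given in this excerpt, so `toy_relevance` is only a stand-in: it counts query terms found in each part and weights title matches by a hypothetical `title_boost`. It ignores the distance between event and constraint items, which the actual method takes into account. All function and parameter names are assumptions.

```python
def toy_relevance(query_terms, doc, title_boost=2.0):
    """Stand-in score, NOT the patent's R(Q, di).

    Each query term has weight 1, as in the abstract; a match in the
    title T counts title_boost, a match in FP or LP counts 1.
    """
    score = 0.0
    for term in query_terms:
        t = term.lower()
        if t in doc["T"].lower():
            score += title_boost
        if t in doc["FP"].lower():
            score += 1.0
        if t in doc["LP"].lower():
            score += 1.0
    return score


def rank_news(query_terms, docs, relevance=toy_relevance):
    """Return the documents sorted by descending relevance to the query."""
    return sorted(docs, key=lambda d: relevance(query_terms, d), reverse=True)


docs = [
    {"T": "Local sports results",
     "FP": "A match was played.",
     "LP": "Fans went home."},
    {"T": "Wenchuan earthquake death toll",
     "FP": "The 2008 quake struck Sichuan.",
     "LP": "Rescue continues."},
]
ranked = rank_news(["2008", "Wenchuan", "earthquake", "death"], docs)
print([d["T"] for d in ranked])
```

Because `rank_news` takes the scoring function as a parameter, a faithful implementation of R(Q, di) could replace the stand-in without changing the sorting step.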

Description

Technical field

[0001] The invention belongs to the field of information retrieval, and in particular relates to a Web news retrieval method based on event analysis.

Background art

[0002] Because real-world events are promptly reflected on the Internet, it carries a large number of event-oriented Web news reports. Obtaining event-related information from the Internet with the help of search engines has become an urgent need for users. However, with the rapid growth of information on the Internet, general search engines often return large volumes of results that match the query poorly. After a user enters a keyword, little of what is returned is useful, especially when retrieving event information.

[0003] Event retrieval refers to taking the event keywords entered by a user as a query and returning relevant information or precise answers. TDT (Topic Detection and Tracking) has a certain relationship with event retrieval, and its main purpose is to o...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
Inventor: 仲兆满, 李存华, 管燕
Owner: HUAIHAI INST OF TECH