Named entity recognition-based news search result similarity calculation method

A named entity recognition and similarity calculation technique, applied in the field of computer science, that addresses problems such as reduced discrimination in similarity calculation

Status: Inactive
Publication Date: 2013-07-24
Owner: BEIJING UNIV OF POSTS & TELECOMM
Cites: 2; Cited by: 20

AI Technical Summary

Problems solved by technology

[0004] The term matrix constructed by the above method suffers from high-dimensional sparsity, and when similarity is calculated the words affect one another, which reduces discrimination.

Embodiment Construction

[0028] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

[0029] To illustrate the similarity calculation method for news search results based on named entity recognition, the following implementation example uses two news search results. The first news search result, denoted 1, is "On March 10, 2013, Kobe, Japan, Japan will usher in the second anniversary of the Fukushima nuclear accident, and the citizens of Kobe will hold an anti-nuclear parade." The second news search result is "On March 10, 2013, Toky...

Abstract

The invention provides a named entity recognition-based method for calculating the similarity of news search results. The method comprises the following steps: establishing a plurality of keyword subsets for each news search result by using named entity recognition; establishing a lexical item matrix corresponding to each subset; calculating a similarity within each lexical item matrix separately; and finally weighting the several similarities to obtain a final similarity. The method highlights the characteristic elements of a piece of news, effectively reduces the dimension of the lexical item matrices, and avoids interaction among lexical items of different types during similarity calculation. It has three characteristics: extracting keywords based on named entity recognition, establishing a plurality of lexical item matrices based on the keyword subsets, and calculating a weighted similarity based on the lexical item matrices.
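The steps in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the named entity recognition step is assumed to have already produced the keyword subsets, the subset names (`time`, `place`, `other`), the weights, and the toy keywords are all hypothetical, and cosine similarity over raw term counts is swapped in because the text shown here does not name the per-subset similarity measure.

```python
from collections import Counter
from math import sqrt


def cosine(terms_a, terms_b):
    """Cosine similarity between two keyword lists, using raw term counts.

    Assumption: cosine is used here only for illustration.
    """
    counts_a, counts_b = Counter(terms_a), Counter(terms_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty subset contributes no similarity
    return dot / (norm_a * norm_b)


def weighted_similarity(subsets_a, subsets_b, weights):
    """Compute one similarity per keyword subset, then combine them by weight.

    subsets_a / subsets_b map a subset name to its keyword list.
    weights maps a subset name to a non-negative weight (hypothetical values).
    """
    total = sum(weights.values())
    if total == 0:
        raise ValueError("weights must not sum to zero")
    score = 0.0
    for name, weight in weights.items():
        score += weight * cosine(subsets_a.get(name, []), subsets_b.get(name, []))
    return score / total


# Toy, hand-written subsets standing in for the output of an NER step.
result_a = {"time": ["2013-03-10"], "place": ["Berlin", "Germany"], "other": ["parade"]}
result_b = {"time": ["2013-03-10"], "place": ["Munich", "Germany"], "other": ["rally"]}
weights = {"time": 0.2, "place": 0.4, "other": 0.4}

print(weighted_similarity(result_a, result_b, weights))
```

Because each subset gets its own small term vector, a shared place name cannot be diluted by unrelated words from another subset, which is the dimension-reduction and separation effect the abstract describes.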

Description

Technical field

[0001] The invention relates to a method for calculating the similarity of news search results based on named entity recognition. It is mainly applied to search engine clustering and text classification, and belongs to the technical field of computer science.

Background technique

[0002] At present, search engines are the main way for users to obtain information on the Internet, and they bring great convenience. However, as the amount of information on the Internet grows, the search results returned by search engines become more and more complicated, and people must filter a large number of search results to find the information they really want. Therefore, some researchers apply clustering techniques from information retrieval to cluster search results and present them to users by category, which makes the search results easier to browse.

[0003] The principle of the search result clustering technique is t...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 陆月明, 党秋月, 张吉伟
Owner: BEIJING UNIV OF POSTS & TELECOMM