Polysemy keyword based text filtering method and device

A text filtering and keyword technology, applied in the Internet field, that addresses the low efficiency and high labor cost of manual review and achieves high efficiency, low cost, and good filtering results.

Active Publication Date: 2014-08-27
TENCENT TECH (SHENZHEN) CO LTD
4 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0005] The existing manual review method filters well, but its efficiency is low. When the number of TAGs is large and information is updated rapidly, the labor cost is also high.

Embodiment Construction

[0027] The solution of the embodiment of the present invention is mainly as follows: collect a text set with a designated keyword, so that for an ambiguous keyword the text list corresponding to its mainstream meaning can be screened out; generate predetermined ambiguous-keyword vectors and text vectors based on the text set; calculate the similarity between each text vector and the predetermined ambiguous-keyword vector; and filter out the texts whose text vectors have a similarity below a predetermined threshold, so that the articles corresponding to the mainstream meaning of the ambiguous keyword required by the user are retained.
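The flow in [0027] can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the excerpt does not say how the vectors are built or which similarity measure is used, so the sketch assumes bag-of-words term counts, a keyword vector formed by summing term counts over the whole collected text set, and cosine similarity. The function names (tokenize, to_vector, build_keyword_vector, filter_texts) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def to_vector(tokens):
    # Bag-of-words vector as a term -> count mapping (an assumption).
    return Counter(tokens)


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_keyword_vector(texts):
    # Assumption: sum term counts over the collected text set, so that the
    # vocabulary of the most common (mainstream) sense dominates the vector.
    total = Counter()
    for text in texts:
        total.update(tokenize(text))
    return total


def filter_texts(texts, threshold):
    # Keep only texts whose similarity to the keyword vector reaches the threshold.
    keyword_vector = build_keyword_vector(texts)
    kept = []
    for text in texts:
        similarity = cosine(to_vector(tokenize(text)), keyword_vector)
        if similarity >= threshold:
            kept.append(text)
    return kept
```

Calling filter_texts(texts, threshold) on a text set collected for one keyword returns the texts that score at or above the threshold. The threshold has to be tuned per data set; the excerpt gives no value for it.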

[0028] As shown in Figure 1, a preferred embodiment of the present invention provides a text filtering method based on ambiguous keywords, including:

[0029] Step S101, collecting text sets with designated keywords;

[0030] In this embodiment, based on the ambiguous TAG, the text list corresponding to the mainstream meaning is screened out, an...
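Step S101 can be pictured as a simple selection over tagged articles. The sketch below assumes each article is a dictionary with a "text" field and a "tags" list; that data layout and the function name collect_text_set are hypothetical, since the excerpt does not describe how articles or TAGs are stored.

```python
def collect_text_set(articles, keyword):
    # Return the texts of all articles whose TAG list contains the keyword.
    # Assumption: each article is a dict with "text" (str) and "tags" (list of str).
    return [article["text"] for article in articles if keyword in article["tags"]]
```

The resulting list is the text set from which the keyword vector and the text vectors would then be generated.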

Abstract

The invention discloses a polysemy keyword based text filtering method and device. The method comprises: collecting a text set with a designated keyword; generating predetermined polysemy keyword vectors and text vectors based on the text set, wherein the predetermined polysemy keywords include the designated keyword; calculating the similarity between the text vectors and the predetermined polysemy keyword vectors; and filtering out the texts whose text vectors have a similarity less than a predetermined threshold value. Based on the polysemy TAG, the method screens out the text list corresponding to the mainstream meaning and thereby selects the texts required by the user. The cost is low, the efficiency is high, the filtering effect is good, no manual intervention is needed, and the method applies to all polysemy keywords.
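The filtering rule in the abstract can be written compactly. The symbols are illustrative and do not come from the patent: D is the collected text set, d_i is the vector of the i-th text, k is the predetermined polysemy keyword vector, and θ is the predetermined threshold. Cosine similarity is assumed for sim, since the excerpt does not name the measure.

```latex
\[
\operatorname{sim}(\mathbf{d}_i, \mathbf{k})
  = \frac{\mathbf{d}_i \cdot \mathbf{k}}{\lVert \mathbf{d}_i \rVert \, \lVert \mathbf{k} \rVert},
\qquad
D_{\text{kept}} = \{\, d_i \in D : \operatorname{sim}(\mathbf{d}_i, \mathbf{k}) \ge \theta \,\}
\]
```

Texts with a similarity below θ are the ones filtered out.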

Description

Technical field

[0001] The invention relates to the technical field of the Internet, and in particular to a text filtering method and device based on ambiguous keywords.

Background technique

[0002] Many words have more than one meaning, and the primary meaning can vary with context. For example, the word "apple" has meanings related to technology, fruit, newspapers, and so on. The vast majority of information users care about its technology-related meaning and the related article content. Articles with other meanings therefore need to be removed from the list of articles subscribed to by the user.

[0003] Figure 1 shows a list of articles from which the polysemous word "Xiaomi" was extracted as a TAG (a keyword extracted from the text of the article, which can represent the main content of the article). Names and other related content. Users who subscribe to "Xiaomi" should be most concerned about its technology-related meaning, and ar...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/9535
Inventor: 蔡兵
Owner: TENCENT TECH (SHENZHEN) CO LTD