Field word identification method and device

A domain word recognition method and device, applied in the field of information recognition, address the problems that domain word recognition is difficult and that its accuracy is limited by the degree of human cognition.

Active Publication Date: 2011-06-01
BEIJING KINGSOFT OFFICE SOFTWARE INC

AI Technical Summary

Problems solved by technology

[0006] Through research on the existing technology, the inventor found that the rule-based method uses linguistic rules to identify and extract terms. Linguistic rules are difficult to find, and with today's highly developed Internet, where modes of expression are increasingly diverse, they are even more difficult to find. At present, artificial intelligence is mainly used to discover linguistic rules, which are then applied in automatic computer recognition. This method makes the recognition speed of domain words slow a...

Embodiment Construction

[0066] As shown in Figure 1, a method for identifying field words provided by an embodiment of the present invention includes:

[0067] S101. Submit the domain word to be identified to a search engine as a query, obtain the sub-results in the search results, and record the position at which each sub-result occurs.

[0068] The method provided by the embodiment of the present invention uses an existing search engine to identify the field words to be recognized.

[0069] When content to be queried is entered into a search engine as a search term, the search engine returns the information most relevant to that content, and this information includes corpus data. Therefore, when the domain word to be recognized is entered into a search engine as the search term, the information obtained is highly relevant to that domain word.
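
Step S101 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `SubResult` type, the `collect_sub_results` helper, and the `fake_search` stand-in are all hypothetical names, and "occurrence position" is assumed here to mean the 1-based rank of a sub-result in the result list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubResult:
    position: int  # assumed meaning: 1-based rank in the result list
    title: str
    snippet: str


def collect_sub_results(
    word: str, search: Callable[[str], List[Dict[str, str]]]
) -> List[SubResult]:
    """Query the search engine with the word and record each sub-result's position."""
    raw_results = search(word)
    return [
        SubResult(
            position=rank,
            title=item.get("title", ""),
            snippet=item.get("snippet", ""),
        )
        for rank, item in enumerate(raw_results, start=1)
    ]


def fake_search(query: str) -> List[Dict[str, str]]:
    # Hypothetical stand-in for a real search engine client; returns canned data.
    return [
        {"title": f"{query} rankings", "snippet": f"Latest {query} results and team news"},
        {"title": f"About {query}", "snippet": f"History of {query}"},
    ]


sub_results = collect_sub_results("boxing champion", fake_search)
for sub_result in sub_results:
    print(sub_result.position, sub_result.title)
```

Passing the search function in as a parameter keeps the sketch independent of any particular search engine; a real client would replace `fake_search`.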

[0070] Under normal c...

Abstract

The embodiment of the invention discloses a field word identification method and device. The scheme is based on a search engine: the field keywords of a field to which a field word to be identified may belong are determined from the search engine's results for that word; a score for the word belonging to the field is calculated from information on the pre-determined field keywords and the search results; the score is compared with the field's conformity threshold; and the comparison result determines whether the word belongs to the field. By exploiting the characteristics of the search engine, the scheme obtains corpus data highly correlated with the field word to be identified, thereby greatly improving the speed and accuracy of field word identification.
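
The score-and-threshold flow described in the abstract might look like the sketch below. The abstract does not give the scoring formula, so everything concrete here is an assumption made for illustration: the keyword weights, the 1/rank discount, the `>=` comparison, and the names `domain_score`, `identify_domains`, `DOMAINS`, and `SNIPPETS`.

```python
from typing import Dict, List, Tuple

# domain name -> (keyword weights, conformity threshold); all values are illustrative
DomainTable = Dict[str, Tuple[Dict[str, float], float]]


def domain_score(snippets: List[str], keyword_weights: Dict[str, float]) -> float:
    """Sum the weights of the keywords found in each snippet, discounted by rank."""
    score = 0.0
    for rank, text in enumerate(snippets, start=1):
        lowered = text.lower()  # keywords are expected in lowercase
        hit_weight = sum(w for kw, w in keyword_weights.items() if kw in lowered)
        score += hit_weight / rank  # assumed discount: earlier sub-results count more
    return score


def identify_domains(snippets: List[str], domains: DomainTable) -> List[str]:
    """Return the domains whose score reaches their conformity threshold."""
    matched = []
    for name, (weights, threshold) in domains.items():
        score = domain_score(snippets, weights)
        # A zero score means no domain keyword occurred, so the domain is not a candidate.
        if score > 0 and score >= threshold:
            matched.append(name)
    return matched


DOMAINS: DomainTable = {
    "sports": ({"game": 1.0, "team": 1.0, "champion": 2.0}, 2.0),
    "finance": ({"stock": 1.0, "bank": 1.0}, 1.0),
}

# Snippets of the sub-results, listed in the order the search engine returned them.
SNIPPETS = [
    "The boxing champion defended the title in the final game",
    "Team news and match reports",
]

print(identify_domains(SNIPPETS, DOMAINS))
```

Keeping a separate threshold per domain mirrors the abstract's "field conformity threshold value of the field"; how those thresholds and keyword weights are actually derived is not stated in the visible text.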

Description

Technical field

[0001] The present invention relates to the field of information recognition, and in particular to a method and device for field word recognition.

Background technique

[0002] Domain words are characteristic words with a strong text representation function; that is, they clearly express the content characteristics of a text (such as domain category, theme, or central meaning). According to how widely they circulate within a field, field words can be divided into field common words and field-specific words.

[0003] Field common words are the basic words of a field and represent its centroid characteristics, such as "games" and "teams" in sports. Field-specific words are highly specific and highly discriminative, and can distinguish the detailed characteristics of the field. For example, "World Boxing Council" and "Boxing Champion" in the sports category can not only distinguish the sports category from other categories, but also dis...

Application Information

IPC(8): G06F17/27, G06F17/30
Inventor: 于亮, 张宇峰
Owner: BEIJING KINGSOFT OFFICE SOFTWARE INC