Terms disambiguation method based on semantic dictionary

A semantic dictionary, word technology, applied in the semantic field, can solve problems such as difficulties

Inactive Publication Date: 2012-01-04
NANJING UNIV OF POSTS & TELECOMM
5 Cites, 25 Cited by

AI Technical Summary

Problems solved by technology

As a rule, however, this technique needs a language processor as its basis: unless a text is first divided into a series of words, sentences and fixed expressions, it is impossible to tell whether a given word is a noun or a verb, and determining the meaning of a word in its context can be very difficult.

Embodiment Construction

[0061] Based on the correlation between concepts, with the help of semantic dictionaries, context-based semantic disambiguation is realized. The detailed steps are as follows:

[0062] 1. Obtain the list of sentences in the text set:

[0063] Step 11) Read in the text set D.

[0064] Step 12) Use the word segmentation component to segment each text in the text set D, producing tagged text in the format word1/pos1 word2/pos2 word3/pos3, denoted D1.

[0065] Step 13) Read in the text set D1 and process one of its text files.

[0066] Step 14) Convert each quoted sentence in the text into an ordinary sentence, that is, remove the quotation marks of the quoted sentence.

[0067] Step 15) Read each sentence in the text: a full stop, question mark or exclamation mark marks the end of a sentence, and each sentence read is put into the sentence list, one sentence per line.
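
Steps 12 to 15 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the segmenter has already produced whitespace-separated `word/pos` tokens, and the names `parse_tagged`, `split_sentences`, `SENTENCE_END` and `QUOTES`, as well as the tag letters in the sample, are hypothetical.

```python
# Hypothetical helpers; the punctuation sets below are assumptions.
SENTENCE_END = {"。", "？", "！", ".", "?", "!"}
QUOTES = {'"', "'", "“", "”", "‘", "’"}


def parse_tagged(text):
    """Split 'word1/pos1 word2/pos2' into a list of (word, pos) pairs."""
    pairs = []
    for token in text.split():
        # rpartition splits on the last '/', so a word that is itself '/' survives.
        word, sep, pos = token.rpartition("/")
        if not sep:  # token carries no tag at all
            word, pos = token, ""
        pairs.append((word, pos))
    return pairs


def split_sentences(pairs):
    """Drop quotation marks (step 14) and cut at sentence terminators (step 15)."""
    sentences, current = [], []
    for word, pos in pairs:
        if word in QUOTES:
            continue  # turn a quoted sentence into an ordinary one
        current.append((word, pos))
        if word in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:  # keep trailing words that lack a terminator
        sentences.append(current)
    return sentences


tagged = "He/r said/v “/w go/v !/w ”/w It/r works/v ./w"
sentence_list = split_sentences(parse_tagged(tagged))
for sentence in sentence_list:
    print(" ".join(word for word, _ in sentence))
```

Keeping the part-of-speech tag next to each word means the sentence list can be passed directly to the dictionary lookup in the next stage.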

[0068] 2. Segment the words and part-of-speech tags in the text set, use the semantic dictionary to find the definition ...

Abstract

The invention provides a term disambiguation method based on a semantic dictionary. The method calculates the concept correlation of terms in order to carry out term disambiguation, which is a preprocessing step for automatic text summarization. Disambiguation relies on the correlation between concepts and takes into account the concept itself, its gloss, its synonyms, its extended gloss, the synset of the extended concept and other factors, as well as sentence coherence. A concept correlation formula and a backtracking method are used to select the best sense of each word, thereby realizing context-based semantic disambiguation. Experiments show that the method can improve the recall and accuracy of semantic disambiguation and is better suited to producing text summaries.
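
The idea of selecting senses by concept correlation and backtracking can be sketched as follows. The patent's correlation formula is not reproduced in this excerpt, so the score below is a stand-in: a simple Lesk-style overlap between bags of gloss and synonym words. The toy dictionary, the sense identifiers and the names `concept_correlation` and `best_senses` are all hypothetical, the search is exhaustive with no pruning, and sentence coherence is not modeled.

```python
# Toy semantic dictionary (hypothetical): each sense has an id and a bag of
# words drawn from its gloss and synonyms.
TOY_DICTIONARY = {
    "bank": [
        {"id": "bank.finance", "bag": {"money", "deposit", "institution", "account"}},
        {"id": "bank.river", "bag": {"river", "water", "slope", "land"}},
    ],
    "interest": [
        {"id": "interest.finance", "bag": {"money", "loan", "rate", "account"}},
        {"id": "interest.curiosity", "bag": {"attention", "curiosity", "feeling"}},
    ],
    "deposit": [
        {"id": "deposit.money", "bag": {"money", "account", "payment"}},
        {"id": "deposit.sediment", "bag": {"river", "water", "layer"}},
    ],
}


def concept_correlation(sense_a, sense_b):
    """Stand-in score (an assumption): number of words the two bags share."""
    return len(sense_a["bag"] & sense_b["bag"])


def best_senses(words, dictionary):
    """Backtrack over sense assignments, maximizing summed pairwise correlation."""
    best = {"score": -1, "choice": []}
    chosen = []

    def backtrack(index, score):
        if index == len(words):
            if score > best["score"]:
                best["score"] = score
                best["choice"] = list(chosen)
            return
        for sense in dictionary[words[index]]:
            # Score the candidate against every sense already chosen.
            gain = sum(concept_correlation(sense, prev) for prev in chosen)
            chosen.append(sense)
            backtrack(index + 1, score + gain)
            chosen.pop()  # undo the choice and try the next sense

    backtrack(0, 0)
    return [sense["id"] for sense in best["choice"]], best["score"]


senses, score = best_senses(["bank", "interest", "deposit"], TOY_DICTIONARY)
print(senses, score)
```

Scoring each candidate only against senses already chosen lets the total be accumulated incrementally, which is also the natural place to add pruning for longer sentences.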

Description

Technical field

[0001] The invention proposes a word disambiguation method based on a semantic dictionary. The method uses the concept correlation of words to carry out word disambiguation, a preprocessing step for automatic text summarization, and belongs to the field of semantic technology.

Background technique

[0002] The development of semantic dictionaries rests on three main assumptions. The first is the separability assumption: the lexical component of a language can be isolated and extracted by certain methods and studied in its own right. The second is the patterning assumption: a person cannot master all the vocabulary needed for the language he uses unless he can exploit the systematic patterns and relationships that exist between word meanings. The third is the comprehensiveness assumption: if computational linguistics is to handle natural language as a human does, it must, like a human, store as much lexical knowledge as possible. Semantic dictionaries make use of ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
Inventor: 张卫丰, 张静, 王慕妮, 周国强, 张迎周, 许碧欢, 陆柳敏
Owner: NANJING UNIV OF POSTS & TELECOMM