Word sense disambiguation method fusing sentence local context with document domain information

A word sense disambiguation technique that uses document domain information, applied in the field of natural language processing. It addresses problems such as domain mismatch and improves disambiguation accuracy and domain adaptability.

Active Publication Date: 2016-07-06
山东经伟晟睿数据技术有限公司
3 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the "domain mismatch" problem faced by existing word sense disambiguation technology by proposing a new word sense disambiguation method that fuses sentence local context with document domain information.

Examples

Embodiment Construction

[0042] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

[0043] Take the sentence "The Argentines took a 18-9 advantage into the second half of the basketball game." as an example. The document containing the sentence belongs to the sports domain, and word sense disambiguation is performed on the noun "half" in it.
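As a rough illustration of the first step of the method on this sentence, the sketch below selects the words that stand in a direct dependency relation with the ambiguous word. The dependency tuples are hand-written assumptions rather than the output of any particular parser, and the names `DepTuple` and `direct_context_words` are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set


class DepTuple(NamedTuple):
    """One dependency edge, head --relation--> dependent (hypothetical format)."""
    head: str
    relation: str
    dependent: str


# Hand-written, illustrative tuples for the example sentence.
# A real dependency parser may label or attach these words differently.
SENTENCE_TUPLES: List[DepTuple] = [
    DepTuple("took", "nsubj", "Argentines"),
    DepTuple("took", "dobj", "advantage"),
    DepTuple("took", "prep_into", "half"),
    DepTuple("half", "amod", "second"),
    DepTuple("half", "prep_of", "game"),
    DepTuple("game", "nn", "basketball"),
]


def direct_context_words(tuples: Iterable[DepTuple], target: str) -> Set[str]:
    """Return every word that shares a dependency edge with the target word."""
    related: Set[str] = set()
    for edge in tuples:
        if edge.head == target:
            related.add(edge.dependent)
        elif edge.dependent == target:
            related.add(edge.head)
    return related


if __name__ == "__main__":
    print(sorted(direct_context_words(SENTENCE_TUPLES, "half")))
```

Under this assumed parse, "basketball" is not selected because it attaches to "game" rather than to "half"; only directly linked words count as sentence local context.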

[0044] According to the WordNet 3.0 dictionary, the senses of the ambiguous word "half" are shown in Table 1.

[0045] Table 1: Senses of half#n

[0046]
Sense number | Gloss
half#n#1 | one-half, half -- (one of two equal parts of a divisible whole; "half a loaf"; "half an hour"; "a century and one half")
half#n#2 | (one of two divisions into which some games or performances are divided: the two divisions are separated by an interval)

[0047] Among them, #n indicates that the pa...

Abstract

The invention relates to a word sense disambiguation method that fuses sentence local context with document domain information, and belongs to the technical field of natural language processing. The method comprises the following steps: (1) performing dependency parsing on the sentence containing the ambiguous word, and obtaining the sentence local context related words that have a direct dependency relation with the ambiguous word; (2) performing dependency parsing on a domain document set, collecting all dependency tuples it contains, and constructing a dependency tuple library; (3) performing statistical analysis on the dependency tuple library, and finding a group of domain related words most closely related to the ambiguous word; (4) determining the disambiguation weights of the domain related words according to their dependency distribution similarity and the word sense relevance between the domain related words and the local context; (5) merging the sentence local context related words with the domain related words to construct a related word set; and (6) selecting the correct word sense according to the weighted cumulative relevance between each sense of the ambiguous word and the related word set. The method improves the adaptability of a word sense disambiguation system to a specific domain and improves disambiguation accuracy.
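The later steps of the abstract can be sketched on toy data as follows. This is a minimal sketch under stated assumptions, not the patented procedure: relative co-occurrence frequency stands in for the dependency distribution similarity and relevance-based weighting, local context words are given a weight of 1.0, and the relevance values in `TOY_RELEVANCE` are made-up illustrative numbers. All function and variable names are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Set, Tuple

Dep = Tuple[str, str, str]  # (head, relation, dependent); hypothetical format


def build_tuple_library(parsed_docs: Iterable[Iterable[Dep]]) -> Counter:
    """Count every dependency tuple found in the parsed domain documents."""
    library: Counter = Counter()
    for doc in parsed_docs:
        library.update(doc)
    return library


def domain_related_words(library: Counter, target: str, k: int) -> Dict[str, float]:
    """Pick the k words sharing the most tuples with the target.

    The weight is each word's share of those tuples (a simplifying stand-in).
    """
    counts: Counter = Counter()
    for (head, _rel, dep), n in library.items():
        if head == target:
            counts[dep] += n
        elif dep == target:
            counts[head] += n
    top = counts.most_common(k)
    total = sum(n for _, n in top)
    return {word: n / total for word, n in top} if total else {}


def choose_sense(
    senses: List[str],
    local_words: Set[str],
    domain_weights: Dict[str, float],
    relevance: Callable[[str, str], float],
) -> Tuple[str, Dict[str, float]]:
    """Merge both word sets, then pick the sense with the largest weighted sum."""
    weights = {word: 1.0 for word in local_words}  # assumption: local weight 1.0
    for word, weight in domain_weights.items():
        weights[word] = max(weights.get(word, 0.0), weight)
    scores = {
        sense: sum(wt * relevance(sense, word) for word, wt in weights.items())
        for sense in senses
    }
    return max(scores, key=scores.get), scores


# Toy sports-domain tuples and made-up relevance values, for illustration only.
DOMAIN_DOCS: List[List[Dep]] = [
    [("scored", "prep_in", "half"), ("half", "prep_of", "game")],
    [("scored", "prep_in", "half"), ("half", "prep_of", "game"),
     ("scored", "prep_in", "half")],
]
TOY_RELEVANCE = {
    ("half#n#1", "second"): 0.3, ("half#n#2", "second"): 0.3,
    ("half#n#1", "took"): 0.1, ("half#n#2", "took"): 0.1,
    ("half#n#1", "game"): 0.1, ("half#n#2", "game"): 0.8,
    ("half#n#2", "scored"): 0.7,
}


def toy_relevance(sense: str, word: str) -> float:
    """Look up the made-up relevance value; unknown pairs score 0."""
    return TOY_RELEVANCE.get((sense, word), 0.0)


if __name__ == "__main__":
    library = build_tuple_library(DOMAIN_DOCS)
    domain = domain_related_words(library, "half", k=2)
    best, scores = choose_sense(
        ["half#n#1", "half#n#2"], {"took", "second", "game"}, domain, toy_relevance
    )
    print(best, scores)
```

Passing the relevance measure in as a function keeps the sketch independent of any particular lexical resource; a WordNet-based measure could be substituted without changing the rest of the code.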

Description

Technical field

[0001] The invention relates to a word sense disambiguation method, in particular to one that combines sentence local context with document domain information, and belongs to the technical field of natural language processing.

Background

[0002] "Domain mismatch" is a common problem in natural language processing systems, and word sense disambiguation is no exception: the same method often performs very differently in different domains. In large-scale word sense disambiguation tasks, the domains of the texts to be processed vary widely. If a word sense disambiguation system cannot actively adapt to these differences, its disambiguation performance drops sharply. "Domain adaptation" has become a key issue limiting the improvement of word sense disambiguation performance in specific domains. Domain-specific word sense disambiguation has attracted the attention of researchers ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
CPC: G06F40/211, G06F40/216, G06F40/30
Inventor: 鹿文鹏, 孟凡擎, 杜月寒
Owner: 山东经伟晟睿数据技术有限公司