A Word Sense Disambiguation Method Fused with Sentence Local Context and Document Domain Information

A word sense disambiguation method that fuses sentence local context with document domain information. Applied in the field of natural language processing, it addresses the domain mismatch problem and improves the accuracy and domain adaptability of disambiguation.

Active Publication Date: 2019-02-01
山东经伟晟睿数据技术有限公司

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the "domain mismatch" problem faced by existing word sense disambiguation techniques by proposing a new word sense disambiguation method that fuses sentence local context with document domain information.

Examples

Embodiment Construction

[0042] Specific implementations of the present invention are described in further detail below with reference to the accompanying drawings and specific embodiments.

[0043] Take the sentence "The Argentinas took a 18-9 advantage into the second half of the basketball game." as an example. The document containing the sentence belongs to the sports domain, and the noun "half" in the sentence is the word to be disambiguated.

[0044] According to the WordNet 3.0 dictionary, the senses of the ambiguous word "half" are listed in Table 1.

[0045] Table 1: Senses of half#n

[0046]

[0047] Among them, #n indicates that the part of speech is a noun; #1 and #2 indicate the sequence number of the word meaning in WordNet 3.0.

[0048] Step 1: Perform dependency parsing on the sentence containing the ambiguous word, and obtain the local context words in the sentence that have a direct dependency relationship with the ambiguous word; the details are as follo...
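The selection in Step 1 can be illustrated with a minimal Python sketch. It assumes the sentence has already been parsed into (head, relation, dependent) tuples; the tuples and relation labels below are written by hand for illustration and are not the output of any specific parser or of the patent's own procedure. The helper name `direct_context_words` is hypothetical.

```python
from typing import List, Tuple

# A dependency tuple is (head, relation, dependent).
DependencyTuple = Tuple[str, str, str]


def direct_context_words(tuples: List[DependencyTuple], target: str) -> List[str]:
    """Return the words that share a dependency edge with `target`,
    in order of first appearance and without duplicates."""
    related: List[str] = []
    for head, _relation, dependent in tuples:
        if head == target and dependent not in related:
            related.append(dependent)
        elif dependent == target and head not in related:
            related.append(head)
    return related


# ASSUMPTION: hand-written parse of the example sentence, for illustration only.
example_tuples: List[DependencyTuple] = [
    ("took", "nsubj", "Argentinas"),
    ("took", "dobj", "advantage"),
    ("took", "prep_into", "half"),
    ("half", "amod", "second"),
    ("half", "prep_of", "game"),
    ("game", "nn", "basketball"),
]

local_context = direct_context_words(example_tuples, "half")
print(local_context)
```

Only words one edge away from the ambiguous word are kept, so "basketball", which attaches to "game" rather than to "half" in this hand-written parse, is excluded.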

Abstract

The invention relates to a word sense disambiguation method that fuses sentence local context with document domain information, and belongs to the technical field of natural language processing. The method comprises the following steps: (1) performing dependency parsing on the sentence containing the ambiguous word to obtain the local context words that have a direct dependency relationship with the ambiguous word; (2) performing dependency parsing on a domain document set, collecting all dependency tuples it contains, and constructing a dependency tuple library; (3) statistically analyzing the dependency tuple library to find a group of domain related words most closely related to the ambiguous word; (4) determining the disambiguation weight of each domain related word from its dependency distribution similarity and its word sense relevance to the local context; (5) merging the local context words with the domain related words to construct a related word set; and (6) selecting the correct word sense according to the weighted cumulative relevance between each sense of the ambiguous word and the related word set. The method improves the adaptability of a word sense disambiguation system to a specific domain and improves disambiguation accuracy.
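The merging and scoring stages named in the abstract (steps 5 and 6) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented procedure: the function names, the weight of 1.0 given to local context words, the domain word weights, and every relatedness value are invented placeholders, and the sketch makes no claim about which sense the actual method selects.

```python
from typing import Callable, Dict, List, Tuple


def merge_related_words(local_words: List[str],
                        domain_weights: Dict[str, float],
                        local_weight: float = 1.0) -> Dict[str, float]:
    """Step 5 (sketch): union of local context words and domain related words.
    ASSUMPTION: local words get a fixed weight; on overlap the larger weight wins."""
    merged = dict(domain_weights)
    for word in local_words:
        merged[word] = max(local_weight, merged.get(word, 0.0))
    return merged


def choose_sense(senses: List[str],
                 related: Dict[str, float],
                 relatedness: Callable[[str, str], float]) -> Tuple[str, Dict[str, float]]:
    """Step 6 (sketch): score each sense by the weighted sum of its relatedness
    to every related word, then return the highest-scoring sense."""
    scores = {
        sense: sum(weight * relatedness(sense, word)
                   for word, weight in related.items())
        for sense in senses
    }
    best = max(scores, key=lambda s: scores[s])
    return best, scores


# ASSUMPTION: placeholder relatedness values; a real system would compute them
# from a lexical resource. Missing pairs default to 0.0.
TOY_RELATEDNESS = {
    ("half#n#1", "second"): 0.2, ("half#n#1", "game"): 0.1, ("half#n#1", "team"): 0.1,
    ("half#n#2", "second"): 0.3, ("half#n#2", "game"): 0.7, ("half#n#2", "team"): 0.5,
}


def toy_relatedness(sense: str, word: str) -> float:
    return TOY_RELATEDNESS.get((sense, word), 0.0)


related_set = merge_related_words(["took", "second", "game"],
                                  {"game": 0.6, "team": 0.8})  # hypothetical weights
best_sense, sense_scores = choose_sense(["half#n#1", "half#n#2"],
                                        related_set, toy_relatedness)
print(best_sense, sense_scores)
```

Keeping the relatedness measure as a function argument separates the accumulation rule from the choice of measure, which the abstract does not specify.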

Description

Technical field

[0001] The invention relates to a word sense disambiguation method, in particular to a word sense disambiguation method that combines sentence local context and document domain information, and belongs to the technical field of natural language processing.

Background art

[0002] The problem of "domain mismatch" is common in natural language processing systems, and word sense disambiguation is no exception. The same method often performs very differently across domains. In large-scale word sense disambiguation tasks, the domains of the texts to be processed vary widely. If a word sense disambiguation system cannot actively adapt to differences in text domain, its disambiguation performance drops sharply. "Domain adaptation" has become a key issue restricting the improvement of word sense disambiguation performance in specific domains. Domain-specific word sense disambiguation has attracted the attention of researchers ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27
CPC: G06F40/211, G06F40/216, G06F40/30
Inventors: 鹿文鹏, 孟凡擎, 杜月寒
Owner: 山东经伟晟睿数据技术有限公司