
System and method for processing text utilizing a suite of disambiguation techniques

Inactive Publication Date: 2005-04-14
IDILIA
25 Cites, 416 Cited by

AI Technical Summary

Benefits of technology

[0016] After applying the selection to the text, the method may refine a knowledge base of each component in the selection utilizing the disambiguated sense (or senses).

Problems solved by technology

It is accepted by those skilled in the art that, although humans perform word sense disambiguation effortlessly, and this is a critical step in understanding naturally expressed language, no system has yet been developed to accomplish word sense disambiguation of general texts to an accuracy sufficient to permit deployment in such applications.
Even current advanced word sense disambiguation systems may have an accuracy of only approximately 33%, thereby making their results too inaccurate for many applications.



Examples


Embodiment Construction

[0036] The description which follows, and the embodiments described therein, are provided by way of illustration of an example, or examples, of particular embodiments of the principles of the present invention. These examples are provided for the purposes of explanation, and not limitation, of those principles and of the invention. In the description which follows, like parts are marked throughout the specification and the drawings with the same respective reference numerals.

[0037] The following terms will be used in the following description, and have the meanings shown below:

[0038] Computer readable storage medium: hardware for storing instructions or data for a computer. For example, magnetic disks, magnetic tape, optically readable medium such as CD ROMs, and semi-conductor memory such as PCMCIA cards. In each case, the medium may take the form of a portable item such as a small disk, floppy diskette, cassette, or it may take the form of a relatively large or immobile item su...



Abstract

The invention relates to a system and method for processing natural language text utilizing disambiguation components to identify a disambiguated sense for the text. The method comprises applying a selection of the components to the text, each component providing a local disambiguated sense of the text together with a confidence score and a probability score. The disambiguated sense is then determined from a selection of the local disambiguated senses. The invention also relates to a system and method for generating sense-tagged text. That method comprises the steps of: disambiguating a quantity of documents utilizing a disambiguation component; generating a confidence score and a probability score for a sense identified by the component for a word; ignoring the sense if its confidence score is below a set threshold; and adding the sense to the sense-tagged text if its confidence score is above the set threshold.
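The two methods in the abstract can be illustrated with a minimal Python sketch. Everything named here (LocalSense, merge_senses, build_sense_tagged_text, and the field names) is hypothetical and does not come from the patent. In particular, the abstract does not say how the local senses are combined, so the merge rule below (sum of probability times confidence per candidate sense) is only one plausible assumption. The abstract also leaves the case of a confidence score exactly equal to the threshold unspecified; this sketch treats it as ignored.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalSense:
    """One component's proposal for one word (all names are hypothetical)."""
    word: str
    sense: str
    confidence: float   # component's trust in its own answer, assumed in 0..1
    probability: float  # probability the component assigns to this sense


def merge_senses(proposals):
    """Choose one sense per word from several components' proposals.

    Assumed rule (not stated in the abstract): add up
    probability * confidence for each candidate sense of a word and
    keep the candidate with the highest total.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for p in proposals:
        totals[p.word][p.sense] += p.probability * p.confidence
    return {word: max(scores, key=scores.get) for word, scores in totals.items()}


def build_sense_tagged_text(proposals, threshold):
    """Keep (word, sense) pairs whose confidence is above the threshold.

    Senses below the threshold are ignored, as in the abstract. A score
    exactly equal to the threshold is also ignored here (an assumption).
    """
    return [(p.word, p.sense) for p in proposals if p.confidence > threshold]
```

As a usage example, three proposals for the word "bank" with (sense, confidence, probability) of ("financial institution", 0.9, 0.7), ("embankment", 0.4, 0.8) and ("financial institution", 0.5, 0.6) would merge to "financial institution" under the assumed rule, and a threshold of 0.45 would drop the "embankment" proposal from the sense-tagged output.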

Description

RELATED APPLICATION

[0001] This application claims the benefit of U.S. Provisional Application No. 60/496,681 filed on Aug. 21, 2003.

FIELD OF THE INVENTION

[0002] The present invention relates to disambiguating natural language text, such as queries to an Internet search engine, web pages and other electronic documents, and disambiguating textual output of a speech to text system.

BACKGROUND

[0003] Word sense disambiguation is the process of determining the meaning of words in text. For example, the word “bank” can mean a financial institution, an embankment, or an aerial manoeuvre (or several other meanings). When humans listen to or read naturally expressed language, they automatically select the correct meaning of each word based on the context in which it is expressed. A word sense disambiguator is a computer-based system for accomplishing this task, and is a critical component of technology for making naturally expressed language understandable to computers.

[0004] A word sense...


Application Information

IPC(8): G06F7/00, G06F17/27, G06F17/30
CPC: G06F17/2785, G06F17/30864, G06F17/2795, G06F16/951, G06F40/247, G06F40/30, Y10S707/99935, Y10S707/99934, Y10S707/99933
Inventors: COLLEDGE, MATTHEW; BELZILE, PIERRE; BARNES, JEREMY
Owner IDILIA