Systems and methods for aggregating related inputs using finite-state devices and extracting meaning from multimodal inputs using aggregation

A technology of related inputs and finite-state devices, applied in the field of aggregating related inputs, that addresses the problems of limited screen real estate, limited keyboard interfaces, and complex unimodal gestures.

Inactive Publication Date: 2003-03-20
NUANCE COMM INC +1

AI Technical Summary

Problems solved by technology

One barrier to adopting such wireless portable computing devices is that they offer limited screen real estate, and often have limited keyboard interfaces, if any keyboard interface at all.
While the technique disclosed in Johnston 1 overcomes many of the limitations of earlier multimodal systems, this technique does not scale well to support multi-gesture utterances, complex unimodal gestures, or other modes and combinations of modes.
However, the unification-based approach disclosed in Johnston 1-Johnston 3 does not allow for tight coupling of multimodal parsing with speech and gesture recognition.
Moreover, multimodal parsing cannot directly influence the progress of either speech recognition or gesture recognition.
The multi-dimensional parsing approach is also subject to significant concerns in terms of computational complexity.
This complexity is manageable when the inputs yield only n-best results for small n. However, the complexity quickly gets out of hand if the inputs are sizable lattices with associated probabilities.
The unification-based approach also runs into significant problems when choosing between multiple competing parses and interpretations.
Systems and methods for recognizing and representing gestural input have been very limited.
However, neither approach allows for efficiently and generically representing arbitrary gestures.
However, it is generally difficult, if not effectively impossible, to capture all of the different possible sequences of coordinates that occur in a gestural input in order to copy them from the gesture input tape to the meaning output tape of the finite-state automaton.
However, this limits the number of distinct gesture events in a single utterance to no more than the available number of variables, because the number of gestural inputs that can be handled is limited by the number of variables used.

Examples

Embodiment Construction

[0076] FIG. 1 illustrates one exemplary embodiment of an automatic speech recognition system 100 usable with the multimodal recognition and/or meaning system 1000 according to this invention that is shown in FIG. 2. As shown in FIG. 1, automatic speech recognition can be viewed as a processing pipeline or cascade.

[0077] In each step of the processing cascade, one or two lattices are input and composed to produce an output lattice. In automatic speech recognition and in the following description of the exemplary embodiments of the systems and methods of this invention, the term "lattice" denotes a directed and labeled graph, which is possibly weighted. In each lattice, there is typically a designated start node "s" and a designated final node "t". Each possible pathway through the lattice from the start node s to the final node t induces a hypothesis based on the arc labels between each pair of nodes in the path. For example, in a word lattice, the arc labels are words and the variou...
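
The lattice described in paragraph [0077] can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Lattice` class, its method names, the example words, and the use of additive costs as arc weights are all assumptions made for illustration. It shows how each path from the start node s to the final node t induces one hypothesis from its arc labels.

```python
from collections import defaultdict


class Lattice:
    """Directed, labeled, optionally weighted graph with a start and a final node."""

    def __init__(self, start, final):
        self.start = start
        self.final = final
        # node -> list of (label, weight, next_node); hypothetical layout
        self.arcs = defaultdict(list)

    def add_arc(self, src, label, dst, weight=0.0):
        self.arcs[src].append((label, weight, dst))

    def hypotheses(self):
        """Yield (labels, total_weight) for every start-to-final path.

        Assumes the lattice is acyclic; weights are summed as costs.
        """
        stack = [(self.start, (), 0.0)]
        while stack:
            node, labels, cost = stack.pop()
            if node == self.final:
                yield list(labels), cost
                continue
            for label, weight, dst in self.arcs[node]:
                stack.append((dst, labels + (label,), cost + weight))


# A tiny word lattice with two competing words in the middle position.
word_lattice = Lattice(start="s", final="t")
word_lattice.add_arc("s", "show", 1)
word_lattice.add_arc(1, "these", 2, weight=0.5)
word_lattice.add_arc(1, "this", 2, weight=1.5)
word_lattice.add_arc(2, "restaurants", "t")

for words, cost in sorted(word_lattice.hypotheses(), key=lambda h: h[1]):
    print(cost, " ".join(words))
```

Enumerating every path is only practical for small lattices, since the number of paths grows multiplicatively with the number of alternatives at each position.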

Abstract

When using finite-state methods to represent inputs, it is beneficial to combine related inputs and to represent combinations of related inputs in the finite-state device. In multimodal recognition systems and methods that integrate different recognized modes, combining related inputs helps the unification process.
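
The abstract does not spell out how related inputs are combined, so the following Python sketch is only one possible reading. It assumes that the inputs form a linear chain of gesture arcs and that "aggregation" means adding an extra arc spanning each contiguous run of adjacent arcs of the same type, so that a later integration step could treat the run as a single unit. The `Arc` type, the `aggregate` function, and the gesture types and identifiers are hypothetical.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    src: int     # source state
    dst: int     # destination state
    kind: str    # hypothetical gesture type, e.g. "restaurant"
    ids: tuple   # hypothetical identifiers of the selected entities


def aggregate(arcs):
    """Return the input arcs plus a spanning arc for every contiguous
    run of two or more adjacent arcs that share the same kind.

    Assumes `arcs` is a linear chain: arcs[i].dst == arcs[i + 1].src.
    """
    result = list(arcs)
    for i in range(len(arcs)):
        ids = arcs[i].ids
        for j in range(i + 1, len(arcs)):
            if arcs[j].kind != arcs[i].kind:
                break  # the run of related inputs ends here
            ids = ids + arcs[j].ids
            result.append(Arc(arcs[i].src, arcs[j].dst, arcs[i].kind, ids))
    return result


# Two adjacent restaurant selections followed by one theatre selection.
gestures = [
    Arc(0, 1, "restaurant", ("r1",)),
    Arc(1, 2, "restaurant", ("r2",)),
    Arc(2, 3, "theatre", ("t1",)),
]
combined = aggregate(gestures)
for arc in combined:
    print(arc)
```

In this sketch the original arcs are kept alongside the aggregates, so both the individual and the combined readings remain available in the resulting graph.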

Description

[0001] This non-provisional application claims the benefit of U.S. provisional application No. 60/313,121, filed on Aug. 15, 2001, which is incorporated herein by reference in its entirety.

[0002] 1. Field of Invention

[0003] This invention is directed to aggregating related inputs when using finite-state methods.

[0004] 2. Description of Related Art

[0005] Multimodal interfaces allow input and/or output to be conveyed over multiple different channels, such as speech, graphics, gesture and the like. Multimodal interfaces enable more natural and effective interaction, because particular modes are best-suited for particular kinds of content. Multimodal interfaces are likely to play a critical role in the ongoing migration of interaction from desktop computing to wireless portable computing devices, such as personal digital assistants, like the Palm Pilot®, digital cellular telephones, public information kiosks that are wirelessly connected to the Internet or other distributed networks...

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F3/00, G06F3/01, G06F3/038, G06F9/44, G06K9/68, G10L15/24
CPC: G06F3/017, G06F3/038, G06F3/0481, G06F3/04883, G06F8/35, G06F8/37, G06K9/6885, G10L15/183, G10L15/193, G10L15/24, G10L15/32, G01C21/3664, G06V30/1985
Inventor: JOHNSTON, MICHAEL J.; BANGALORE, SRINIVAS
Owner: NUANCE COMM INC