
Learning Personalized Entity Pronunciations

A technology for entity names and pronunciation dictionaries, applied in speech analysis, speech recognition, instruments, and related fields. It addresses the difficulty automatic speech recognizers have in accurately recognizing voice commands, with the aim of achieving higher-quality transcription and improving speech recognition.

Active Publication Date: 2017-08-11
GOOGLE LLC
Cites: 10 | Cited by: 0

AI Technical Summary

Problems solved by technology

When processing voice commands, automatic speech recognizers (ASRs) may have difficulty accurately recognizing a command if the speaker pronounces a particular word in a way that deviates from the canonical pronunciation associated with that word in the pronunciation dictionary.



Examples


Embodiment Construction

[0015] FIG. 1A is a context diagram illustrating features of a system 100A for learning pronunciations of personalized entity names. System 100A may include a user 100, a user's mobile device 120, and a server 130. The mobile device 120 may communicate 160, 162, 164 with the server 130 via one or more wired or wireless networks. The network may include, for example, a wireless cellular network, a wireless local area network (WLAN) or Wi-Fi network, a third generation (3G) or fourth generation (4G) mobile communication network, a private network (such as an intranet), a public network (such as the Internet), or any combination thereof. Mobile device 120 may be a mobile phone, smart phone, smart watch, tablet computer, laptop or desktop computer, e-book reader, music player, PDA, or any other fixed or portable device that includes one or more processors and a computer-readable medium.

[0016] The user's mobile device 120 may include one or more physical buttons 121a, 121b...


Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer-storage medium, for implementing a pronunciation dictionary. The method includes receiving audio data corresponding to an utterance that includes a command and an entity name. Additionally, the method may include generating, by an automated speech recognizer, an initial transcription for a portion of the audio data that is associated with the entity name, receiving a corrected transcription for the portion of the utterance that is associated with the entity name, obtaining a phonetic pronunciation that is associated with the portion of the audio data that is associated with the entity name, updating a pronunciation dictionary to associate the phonetic pronunciation with the entity name, receiving a subsequent utterance that includes the entity name, and transcribing the subsequent utterance based at least on the updated pronunciation dictionary. Improved speech recognition and higher-quality transcription can be provided.
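The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `PronunciationDictionary`, the functions `transcribe_entity` and `learn_from_correction`, the made-up phonetic string, and the names "Fortune" and "Fuchun" are all hypothetical, and the recognizer is replaced by a toy lookup that receives a phonetic string instead of real audio.

```python
from typing import Dict, List, Optional


class PronunciationDictionary:
    """Hypothetical store mapping entity names to phonetic pronunciations."""

    def __init__(self) -> None:
        self._by_entity: Dict[str, List[str]] = {}

    def update(self, entity_name: str, phonetic: str) -> None:
        """Associate a phonetic pronunciation with an entity name."""
        pronunciations = self._by_entity.setdefault(entity_name, [])
        if phonetic not in pronunciations:
            pronunciations.append(phonetic)

    def entity_for(self, phonetic: str) -> Optional[str]:
        """Return the entity name stored for this pronunciation, if any."""
        for entity_name, pronunciations in self._by_entity.items():
            if phonetic in pronunciations:
                return entity_name
        return None


def transcribe_entity(phonetic: str, dictionary: PronunciationDictionary,
                      fallback: str) -> str:
    """Toy stand-in for the recognizer: prefer a learned pronunciation,
    otherwise return the recognizer's own best guess (`fallback`)."""
    learned = dictionary.entity_for(phonetic)
    return learned if learned is not None else fallback


def learn_from_correction(dictionary: PronunciationDictionary, phonetic: str,
                          initial: str, corrected: str) -> bool:
    """Update the dictionary when the user's corrected transcription differs
    from the initial one. Returns True if an update was made."""
    if corrected == initial:
        return False
    dictionary.update(corrected, phonetic)
    return True


dictionary = PronunciationDictionary()
# Made-up phonetic string for the entity-name portion of the audio (assumption).
heard = "f uw ch uh n"

# Initial transcription, user correction, then a subsequent utterance.
initial = transcribe_entity(heard, dictionary, fallback="Fortune")
learn_from_correction(dictionary, heard, initial, corrected="Fuchun")
subsequent = transcribe_entity(heard, dictionary, fallback="Fortune")
print(initial, "->", subsequent)  # prints "Fortune -> Fuchun"
```

In this sketch the dictionary keeps every pronunciation learned for an entity rather than overwriting the previous one, which is one possible design choice; the source text does not say how multiple pronunciations are stored.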

Description

Technical field

[0001] This description generally relates to speech recognition.

Background

[0002] A user of a device can interact with the device in a number of different ways, including, for example, using a mouse or trackpad to make selections from a displayed set of items, entering characters via a keyboard, or speaking voice commands into a microphone. When processing a voice command, an automatic speech recognizer (ASR) may have difficulty accurately recognizing the voice command if the speaker uses a pronunciation of a particular word that deviates from the canonical pronunciation associated with the word in the pronunciation dictionary.

Summary of the invention

[0003] Aspects of the present disclosure may facilitate implementing a pronunciation dictionary that may store different, non-canonical pronunciations of entity names based on user interaction with a mobile device. In some cases, pronunciation dictionaries can adapt to unique features of a ...


Application Information

IPC(8): G10L15/22, G10L15/187
CPC: G10L15/187, G10L15/22, G10L2015/223, G10L15/065, G10L15/063, G10L15/26, G10L2015/0636, G10L2015/0635
Inventors: Antoine Jean Bruguier, Fuchun Peng, Françoise Beaufays
Owner GOOGLE LLC