Environment aware voice-assistant devices, and related systems and methods

A speech synthesis and speech output technology, applied in the field of voice-assistant systems, that addresses conditions affecting the perceived quality or intelligibility of synthesized speech in order to improve intelligibility and user experience.

Pending Publication Date: 2021-03-30
APPLE INC

AI Technical Summary

Problems solved by technology

Various conditions associated with the user's listening environment, as well as aspects of the synthesized speech itself, may adversely affect the quality or intelligibility of the synthesized speech as perceived by the user.

Examples

Embodiment 1

[0055] FIG. 1 shows an example of a first system 10 for adapting playback by a voice assistant device to the environment in order to preserve or improve the intelligibility of synthesized speech or other played-back speech. System 10 may include audio appliance 100. Audio appliance 100 may be embodied as, for example, a computing device. In some aspects, an audio appliance is embodied as a mobile communication device, such as a smartphone or tablet, or as a personal or home assistant device, such as a smart speaker. Audio appliance 100 may include a voice assistant application (not shown) configured to listen for, understand, and respond to a user's spoken requests or commands. Alternatively, the voice assistant application may reside on a different device, such as a network-connected device, or may be distributed among multiple network-connected devices.

[0056] The appliance 100 may be configured to listen for and respond to speech activation commands, e.g., ...

Embodiment 2

[0078] FIG. 2 shows an example of a second system 20 for adapting playback by a voice assistant device to the environment in order to preserve or improve the intelligibility of synthesized speech or other played-back speech. System 20 may include audio appliance 200, and may be similar in some respects to system 10 of FIG. 1. For example, system 20 may also use output DSP 150. However, whereas Embodiment 1 selects among different speech synthesis parameters, Embodiment 2 selects among different speech synthesis models. In system 20, speech synthesizer 240 may include a plurality of speech synthesis models, such as models 244-1, ..., 244-n. The plurality of speech synthesis models may include a first speech synthesis model specific to low-voice speech, a second speech synthesis model specific to Lombard-effect speech, and a third speech synthesis model specific to normal speech. More, fewer, or different speech synthesis models may be included. For example, there may be a single speech synt...
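The model selection described for system 20 can be illustrated with a minimal Python sketch. It is not the patented implementation: the mode names, the stub models, the `select_model` helper, and the fallback to the normal-speech model are all assumptions made for illustration, and the stub callables merely stand in for trained models such as 244-1, ..., 244-n.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in type: a "model" is any callable mapping text to output.
SynthesisModel = Callable[[str], List[str]]


def make_stub_model(label: str) -> SynthesisModel:
    """Return a placeholder model that tags its output with its own label."""
    def synthesize(text: str) -> List[str]:
        return [label, text]
    return synthesize


# One model per speech output mode named in the description (keys are assumed names).
MODELS: Dict[str, SynthesisModel] = {
    "low_voice": make_stub_model("low_voice"),
    "lombard": make_stub_model("lombard"),
    "normal": make_stub_model("normal"),
}


def select_model(output_mode: str,
                 models: Dict[str, SynthesisModel] = MODELS) -> SynthesisModel:
    """Pick the model for the selected output mode.

    Falling back to the 'normal' model for an unknown mode is an assumption
    of this sketch, not something the source specifies.
    """
    return models.get(output_mode, models["normal"])


if __name__ == "__main__":
    model = select_model("lombard")
    print(model("Your timer is set."))
```

Keeping the models in a dictionary keyed by output mode means that adding or removing a mode does not change the selection logic, which fits the statement that more, fewer, or different models may be included.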

Embodiment 3

[0084] FIG. 3 shows an example of a third system 30 for adapting playback by a voice assistant device to the environment in order to preserve or improve the intelligibility of synthesized speech or other played-back speech. System 30 may include audio appliance 300, and may be similar to systems 10 and 20 shown in FIG. 1 and FIG. 2, respectively. Embodiments 1 and 2 affect how the speech is synthesized with respect to the selected speech output mode, whereas in Embodiment 3 the synthesized speech is modified according to the speech output mode after synthesis. For example, system 30 may also use input DSP 132, speech classifier 126, ASR system 110, and output DSP 150 as in system 10, or may use input DSP 232, speech classifier 226, ASR system 210, and output DSP 150 as in system 20.

[0085] Speech synthesizer 340 may synthesize speech without input from decision component 334. In some cases, speech synthesizer 340 may operate remotely from audio appliance 300.
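To make the distinction from Embodiments 1 and 2 concrete, the following Python sketch modifies speech after it has been synthesized. The source does not say what the modification is, so a plain per-mode gain with clipping is used purely as a stand-in; `MODE_GAIN`, its values, and `modify_synthesized` are hypothetical names and numbers.

```python
from typing import Dict, List

# Hypothetical per-mode linear gains; the source does not specify the
# post-synthesis modification, so a simple gain is used as a stand-in.
MODE_GAIN: Dict[str, float] = {
    "low_voice": 0.25,
    "normal": 1.0,
    "lombard": 1.5,
}


def modify_synthesized(samples: List[float], output_mode: str) -> List[float]:
    """Scale already-synthesized samples for the chosen output mode.

    Samples are assumed to be floats in [-1.0, 1.0]; results are clipped
    back into that range. Unknown modes leave the signal unchanged.
    """
    gain = MODE_GAIN.get(output_mode, 1.0)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


if __name__ == "__main__":
    synthesized = [0.5, -0.8, 0.1]  # output of a mode-agnostic synthesizer
    print(modify_synthesized(synthesized, "lombard"))
```

Because the synthesizer in this arrangement runs without input from the decision component, the same synthesized samples can be reused for any output mode, and only this last stage depends on the selected mode.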

[0086] Inst...

Abstract

The invention discloses an appliance. The appliance can include a microphone transducer, a processor, and a memory storing instructions. The appliance is configured to receive an audio signal at the microphone transducer and to detect an utterance in the audio signal. The appliance is further configured to classify a speech mode based on the utterance and to determine conditions of an environment of the appliance. The appliance is further configured to select at least one of a playback volume or a speech output mode from a plurality of speech output modes based on the classification and on the conditions of the environment of the appliance. The appliance is further configured to adapt the playback volume and/or the mode of played-back speech according to the speech output mode. The appliance may be configured to synthesize speech according to the speech output mode, or to modify synthesized speech according to the speech output mode.
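The selection step summarized in the abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method: the `SpeechMode` values reuse the three modes named in Embodiment 2, while the `Environment` fields, the 70 dB noise threshold, the volume values, and the rule that noise overrides a low-voice request are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class SpeechMode(Enum):
    """Speech modes; the names follow the modes mentioned in Embodiment 2."""
    LOW_VOICE = "low_voice"
    NORMAL = "normal"
    LOMBARD = "lombard"


@dataclass
class Environment:
    """Hypothetical summary of the appliance's listening environment."""
    noise_db_spl: float  # estimated ambient noise level


@dataclass
class PlaybackDecision:
    output_mode: SpeechMode
    volume: float  # 0.0 (mute) to 1.0 (maximum)


def select_playback(user_mode: SpeechMode, env: Environment) -> PlaybackDecision:
    """Select an output mode and volume from the classified speech mode
    and the environment conditions. All thresholds are illustrative."""
    noisy = env.noise_db_spl >= 70.0  # assumed threshold
    if user_mode is SpeechMode.LOMBARD or noisy:
        return PlaybackDecision(SpeechMode.LOMBARD, 0.9)
    if user_mode is SpeechMode.LOW_VOICE:
        return PlaybackDecision(SpeechMode.LOW_VOICE, 0.2)
    return PlaybackDecision(SpeechMode.NORMAL, 0.5)


if __name__ == "__main__":
    print(select_playback(SpeechMode.LOW_VOICE, Environment(noise_db_spl=35.0)))
    print(select_playback(SpeechMode.NORMAL, Environment(noise_db_spl=80.0)))
```

The returned decision could then drive either path the abstract mentions: synthesizing speech according to the output mode, or modifying already-synthesized speech according to it.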

Description

Technical field

[0001] This patent application and the subject matter disclosed herein (collectively, the "disclosure") relate generally to voice assistant devices, including voice assistant devices capable of synthesized voice playback, and to related systems and methods. More specifically, but not exclusively, the present disclosure relates to systems, methods, and components for adapting synthesized speech, or the playback of synthesized speech, to cues observed in a listening environment in order to improve the intelligibility of the synthesized speech and the user experience.

Background

[0002] Audio appliances are increasingly able to perform multiple tasks in response to commands spoken by the user, and have generally made interacting with machines such as smart speakers, computers, mobile devices, navigation systems, automobiles, and other computing environments more natural than, for example, using tactile or keyed inputs. In principle, such appliances collect...

Application Information

IPC(8): G10L13/02, G10L13/027, G10L13/08
CPC: G10L13/02, G10L13/027, G10L13/08
Inventor: N·莱祖姆, S·J·乔瑟尔, R·鲍威尔, A·德什潘德, A·乔希
Owner APPLE INC