Environment aware voice-assistant devices, and related systems and methods

A technology for synthesized speech output, aimed at intelligibility and user experience, that addresses conditions degrading the perceived quality or intelligibility of synthesized speech.

Pending Publication Date: 2021-03-30

APPLE INC

Problems solved by technology

Various conditions associated with the user's listening environment, and aspects of the synthesized speech itself, may adversely affect the quality or intelligibility of the synthesized speech as perceived by the user.

Examples

Embodiment 1

[0055] FIG. 1 shows an example of a first system 10 for adapting playback by a voice assistant device to the environment, in order to preserve or improve the intelligibility of synthesized speech or other played-back speech. System 10 may include audio appliance 100. Audio appliance 100 may be embodied, for example, as a computing device. In some aspects, an audio appliance is embodied as a mobile communication device, such as a smartphone or tablet, or as a personal or home assistant device, such as a smart speaker. Audio appliance 100 may include a voice assistant application (not shown) configured to listen for, understand, and respond to a user's spoken requests or commands. Alternatively, the voice assistant application may reside on a different device, such as a network-connected device, or may be distributed among multiple network-connected devices.

[0056] The appliance 100 may be configured to listen for and respond to speech activation commands, e.g., ...

Embodiment 2

[0078] FIG. 2 shows an example of a second system 20 for adapting playback by a voice assistant device to the environment, in order to preserve or improve the intelligibility of synthesized speech or other played-back speech. System 20 may include audio appliance 200, and may be similar in some respects to system 10 of FIG. 1. For example, system 20 may also use output DSP 150. However, whereas Embodiment 1 selects from different speech synthesis parameters, Embodiment 2 selects from different speech synthesis models. In system 20, speech synthesizer 240 may include a plurality of speech synthesis models, such as models 244-1, . . . 244-n. The plurality of speech synthesis models may include a first speech synthesis model specific to low-voice speech, a second speech synthesis model specific to Lombard-effect speech, and a third speech synthesis model specific to normal speech. More, fewer, or different speech synthesis models may be included. For example, there may be a single speech synt...
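The model selection described for Embodiment 2 can be pictured with a small sketch. This is only an illustration under stated assumptions, not the patent's implementation: the mode names, the `MODELS` registry, the `make_stub_model` helper, and the fallback to the normal-speech model are all hypothetical, and the stub "models" return tagged text instead of audio.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a speech synthesis model: text in, "audio" out.
# A real model would return waveform samples; here it returns tagged text.
SynthesisModel = Callable[[str], str]


def make_stub_model(label: str) -> SynthesisModel:
    """Build a placeholder model that marks which model produced the output."""
    def synthesize(text: str) -> str:
        return f"<{label}>{text}</{label}>"
    return synthesize


# Hypothetical registry, loosely mirroring models 244-1 ... 244-n.
MODELS: Dict[str, SynthesisModel] = {
    "low_voice": make_stub_model("low_voice"),
    "lombard": make_stub_model("lombard"),
    "normal": make_stub_model("normal"),
}


def synthesize(text: str, output_mode: str) -> str:
    """Pick the model matching the speech output mode; assume 'normal' as fallback."""
    model = MODELS.get(output_mode, MODELS["normal"])
    return model(text)
```

Keeping the models behind a single lookup means that adding or removing a mode changes only the registry, which fits the statement that more, fewer, or different models may be included.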

Embodiment 3

[0084] FIG. 3 shows an example of a third system 30 for adapting playback by a voice assistant device to the environment, in order to preserve or improve the intelligibility of synthesized speech or other played-back speech. System 30 may include audio appliance 300, and may be similar to systems 10 and 20 shown in FIG. 1 and FIG. 2, respectively. Embodiments 1 and 2 affect how the speech is synthesized relative to the selected speech output mode, whereas in Embodiment 3 the synthesized speech is modified according to the speech output mode after synthesis. For example, system 30 may also use input DSP 132, speech classifier 126, ASR system 110, and output DSP 150 as in system 10, or may use input DSP 232, speech classifier 226, ASR system 210, and output DSP 150 as in system 20.

[0085] Speech synthesizer 340 may synthesize speech without input from decision component 334 . In some cases, speech synthesizer 340 may operate remotely from audio appliance 300 .

[0086] Inst...
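As a rough illustration of Embodiment 3, where speech is synthesized first and modified afterwards according to the speech output mode, the sketch below applies a per-mode gain to already-synthesized samples. The gain table, mode names, and `modify_speech` function are assumptions made for illustration. A real conversion to, say, Lombard-style speech would likely change more than level (for example spectral balance or timing); gain alone is used here only to keep the example short.

```python
from typing import Dict, List

# Hypothetical per-mode gains; the values are illustrative only.
MODE_GAIN: Dict[str, float] = {
    "low_voice": 0.25,
    "normal": 1.0,
    "lombard": 2.0,
}


def modify_speech(samples: List[float], output_mode: str) -> List[float]:
    """Scale synthesized samples for the chosen mode and clip to [-1.0, 1.0].

    The synthesizer is not consulted: the modification happens after synthesis.
    Unknown modes are assumed to leave the signal unchanged (gain 1.0).
    """
    gain = MODE_GAIN.get(output_mode, 1.0)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# Example: the same synthesized samples rendered for two output modes.
synthesized = [0.1, -0.6, 0.4]
louder = modify_speech(synthesized, "lombard")
quieter = modify_speech(synthesized, "low_voice")
```

Because the modification step takes only samples and a mode, the synthesizer can run without input from the decision component, or remotely, as paragraph [0085] allows.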

Abstract

An appliance is disclosed. The appliance can include a microphone transducer, a processor, and a memory storing instructions. The appliance is configured to receive an audio signal at the microphone transducer and to detect an utterance in the audio signal. The appliance is further configured to classify a speech mode based on the utterance, and to determine conditions of an environment of the appliance. The appliance is further configured to select at least one of a playback volume or a speech output mode from a plurality of speech output modes, based on the classification and the conditions of the environment of the appliance. The appliance is further configured to adapt the playback volume and/or mode of played-back speech according to the speech output mode. The appliance may be configured to synthesize speech according to the speech output mode, or to modify synthesized speech according to the speech output mode.
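The selection step in the abstract (choose a playback volume or speech output mode from the classified speech mode and the environment conditions) could be sketched roughly as follows. Everything specific here is an assumption for illustration: the `SpeechMode` names, the single `noise_level_db` environment measure, the thresholds, and the volume values do not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class SpeechMode(Enum):
    # Hypothetical mode names, loosely following the modes named in Embodiment 2.
    LOW_VOICE = "low_voice"
    NORMAL = "normal"
    LOMBARD = "lombard"


@dataclass
class Environment:
    # Hypothetical single-number summary of the listening conditions.
    noise_level_db: float


def select_output(user_mode: SpeechMode, env: Environment) -> Tuple[SpeechMode, float]:
    """Return (speech output mode, playback volume in the range 0..1).

    Thresholds and volumes are illustrative assumptions only.
    """
    if env.noise_level_db >= 70.0:
        # Noisy room: favor the most intelligible style regardless of user mode.
        return SpeechMode.LOMBARD, 0.9
    if user_mode is SpeechMode.LOW_VOICE and env.noise_level_db < 40.0:
        # User spoke quietly in a quiet room: answer quietly.
        return SpeechMode.LOW_VOICE, 0.2
    return SpeechMode.NORMAL, 0.5
```

For instance, `select_output(SpeechMode.LOW_VOICE, Environment(noise_level_db=30.0))` yields the low-voice mode at a low volume, while the same utterance in a 75 dB environment yields the Lombard mode at a high volume.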

Description

Technical field

[0001] This patent application and the subject matter disclosed herein (collectively, the "disclosure") relate generally to voice assistant devices, including voice assistant devices capable of synthesized speech playback, and to related systems and methods. More specifically, but not exclusively, the present disclosure relates to systems, methods, and components for adapting synthesized speech, or playback of synthesized speech, to cues observed in a listening environment in order to improve the intelligibility and user experience of synthesized speech.

Background

[0002] Audio appliances are increasingly able to perform multiple tasks in response to commands spoken by the user, and have generally made interacting with machines such as smart speakers, computers, mobile devices, navigation systems, automobiles, and other computing environments more natural than, for example, using tactile or keyed inputs. In principle, such appliances collect...

Application Information

IPC(8): G10L13/02, G10L13/027, G10L13/08

CPC: G10L13/02, G10L13/027, G10L13/08

Inventor: N·莱祖姆, S·J·乔瑟尔, R·鲍威尔, A·德什潘德, A·乔希

Owner: APPLE INC
