Voice input ending judgment method, device, equipment and system and storage medium

A voice input and caching technology, applied in voice analysis, voice recognition, instruments, and related fields, that addresses problems such as long waiting times, increased interaction delay, and a degraded user interaction experience.

Pending Publication Date: 2020-02-21
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

This solution is relatively simple to implement, but a higher time threshold needs to be set to ensure the accuracy of the judgment result and reduce the false interruption ra...


Examples

Specific Embodiment

[0100] The scheme of the present disclosure for judging the end of voice input can be applied to various voice interaction scenarios. For example, it can be applied to a voice query scenario to judge when a voice query has ended. When the voice interaction scenario is a multi-round voice dialogue, whether the user's voice input has finished may be judged in each round of interaction.

[0101] Figure 5 is a schematic structural diagram showing a voice interaction system according to an embodiment of the present disclosure.

[0102] As shown in Figure 5, the voice interaction system of this embodiment mainly includes a voice activity detection module 510, an automatic speech recognition module 520, a speech end prediction module 530, a cache module 540, and a natural language understanding module 550.
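The five modules listed above can be pictured as one small pipeline. The following Python sketch is illustrative only: this excerpt names the modules but does not fully describe how they connect, so the data flow, the class name `VoiceInteractionSystem`, the method `on_pause`, and every callable signature are assumptions rather than the disclosed design.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class VoiceInteractionSystem:
    """Hypothetical wiring of modules 510-550; all names are assumptions."""
    vad_silence_exceeded: Callable[[float], bool]   # stand-in for module 510
    recognize: Callable[[bytes], str]               # stand-in for module 520
    predict_end: Callable[[str], bool]              # stand-in for module 530
    understand: Callable[[str], dict]               # stand-in for module 550
    cache: List[str] = field(default_factory=list)  # stand-in for module 540

    def on_pause(self, audio: bytes, silence_s: float) -> Optional[dict]:
        """Handle one pause in the audio stream.

        Returns the understanding result once the input is judged finished,
        otherwise None (the system keeps listening).
        """
        # Assumed flow: only act when the silence gate from VAD fires.
        if not self.vad_silence_exceeded(silence_s):
            return None
        # Recognize the segment and keep it in the cache.
        self.cache.append(self.recognize(audio))
        text = " ".join(self.cache)
        # Ask the end-prediction stand-in whether the utterance is complete.
        if not self.predict_end(text):
            return None  # keep the cached text and wait for more speech
        self.cache.clear()
        return self.understand(text)
```

With stub callables, a first pause after "navigate to" would leave the text cached and return `None`; a second pause after "the airport" would pass the combined text on for understanding.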

[0103] The voice activity detection module 510 is mainly used to detect the user's voice input in real time, that is, to detect whether the user has inpu...


Abstract

The invention provides a method, apparatus, device, and storage medium for judging the end of voice input. The method comprises detecting the user's voice input in real time and, when the duration without voice input is detected to exceed a preset time threshold, analyzing text features and/or acoustic features of at least part of the previously detected voice input to determine whether the user has ended the voice input. The method can therefore be regarded as a graded judgment scheme that combines voice activity detection with voice analysis. As the first-level judgment, voice activity detection does not need to be highly accurate, so the time threshold it uses can be set to a small value. Text feature analysis and/or acoustic feature analysis then serves as a confirming second-level judgment that ensures the accuracy of the result. Accuracy is thus preserved while interaction delay is greatly reduced, improving the user's interaction experience.
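The graded judgment described in the abstract can be sketched as a two-stage check. This is a minimal illustration, not the patented implementation: the class `EndOfInputJudge`, the function `toy_looks_complete`, and the dangling-word heuristic are all hypothetical, and the source gives no concrete threshold value or feature model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EndOfInputJudge:
    """Two-stage end-of-input check; names and values are assumptions."""
    silence_threshold_s: float             # small first-stage threshold
    looks_complete: Callable[[str], bool]  # stand-in for text/acoustic analysis

    def input_ended(self, silence_s: float, transcript_so_far: str) -> bool:
        # Stage 1: silence gate. Until the silence exceeds the threshold,
        # the second stage is never consulted.
        if silence_s <= self.silence_threshold_s:
            return False
        # Stage 2: confirm using features of the input detected so far.
        return self.looks_complete(transcript_so_far)


def toy_looks_complete(text: str) -> bool:
    """Toy text feature: an utterance ending in a function word is unfinished."""
    dangling = {"to", "the", "a", "and", "of", "in", "for"}
    words = text.strip().lower().split()
    return bool(words) and words[-1] not in dangling
```

Because the second stage can veto a premature cut-off, the first-stage threshold can be kept short without interrupting a user who merely paused mid-sentence.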

Description

Technical field

[0001] The present disclosure relates to the technical field of voice interaction, and in particular to a method, apparatus, device, system, and storage medium for judging whether a user has finished voice input.

Background

[0002] Voice interaction is a form of human-computer interaction and a relatively cutting-edge interaction method. It is the process by which users give instructions to a machine in natural language to achieve their goals. During voice interaction, it is necessary to judge whether the user's voice input has finished, so that the complete voice input is obtained in time and the user's interaction experience is improved.

[0003] Currently, voice activity detection is mainly used to determine whether the user has ended the voice input. Simply put, the voice input is judged to have ended when it is detected that the user has had no voice input for ...


Application Information

IPC(8): G10L15/02, G10L15/08, G10L15/16, G10L15/26, G10L15/30, G10L25/30
CPC: G10L15/02, G10L15/08, G10L15/16, G10L15/30, G10L25/30, G10L15/26
Inventor: 郎皓, 吴丽娟, 于浩淼, 严念念
Owner: ALIBABA GRP HLDG LTD