Speech recognition method, device, electronic device and storage medium

A speech recognition technology operating on target speech information, applied in the field of deep learning, that addresses the problems of low speech recognition accuracy, highly restrictive detection methods, and incomplete acquisition of speech information.

Active Publication Date: 2021-09-14

BEIJING BAIDU NETCOM SCI & TECH CO LTD

4 Cites, 0 Cited by

Problems solved by technology

[0003] In related technologies, to obtain complete voice information, tail point detection is performed on the voice information; that is, the pause duration of the voice information, which can also be understood as the silence duration, is detected. When the pause duration reaches a fixed value, the complete voice information is considered to have been obtained. This method of determining whether the voice information is complete is highly restrictive, which may lead to incomplete acquisition of voice information and low accuracy of voice recognition.

Examples

Example 1

[0042] In this example, the correspondence between semantic completeness and monitoring duration is preset, and the preset correspondence is queried to obtain the monitoring duration corresponding to the semantic completeness.
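
The preset correspondence described in [0042] can be sketched as a simple lookup table. This is only an illustration: the source does not say how the correspondence is stored, so the bucket boundaries, the durations in milliseconds, the assumption that completeness lies in [0, 1], the assumption that a more complete utterance gets a shorter monitoring duration, and the names `PRESET_TABLE` and `lookup_monitoring_duration_ms` are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical preset correspondence: (lower bound of completeness, duration in ms).
# Bucket boundaries and durations are illustrative assumptions, not patent values.
PRESET_TABLE = [
    (0.0, 1200),
    (0.3, 900),
    (0.6, 600),
    (0.9, 300),
]
_LOWER_BOUNDS = [lower for lower, _ in PRESET_TABLE]


def lookup_monitoring_duration_ms(completeness: float) -> int:
    """Query the preset table for the duration matching a completeness score."""
    if not 0.0 <= completeness <= 1.0:
        raise ValueError("completeness is assumed to lie in [0, 1]")
    # Find the last bucket whose lower bound does not exceed the score.
    index = bisect_right(_LOWER_BOUNDS, completeness) - 1
    return PRESET_TABLE[index][1]
```

Under these assumed values, a score just below a boundary falls into the lower bucket, so `lookup_monitoring_duration_ms(0.59)` uses the 0.3 bucket rather than the 0.6 bucket.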

Example 2

[0044] In this example, the baseline semantic integrity corresponding to the monitoring duration baseline value is preset, where the monitoring duration baseline value can be understood as the preset default monitoring duration. The difference between the semantic integrity of the current target voice information and the baseline semantic integrity is calculated, and the monitoring duration adjustment value is determined according to this difference, wherein the semantic integrity difference is inversely proportional to the monitoring duration adjustment value. The sum of the monitoring duration adjustment value and the monitoring duration baseline value is then calculated and used as the monitoring duration.
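
The baseline-plus-adjustment calculation in [0044] might look like the sketch below. The source gives no formula, so several choices here are assumptions: "inversely proportional" is read as a negative linear relation (a larger integrity difference gives a smaller adjustment), the constants are invented for illustration, and the lower clamp `MIN_DURATION_MS` is an added safeguard that the source does not mention.

```python
# All constants below are hypothetical illustration values.
BASELINE_DURATION_MS = 600.0   # monitoring duration baseline value (preset default)
BASELINE_INTEGRITY = 0.5       # baseline semantic integrity tied to that default
SCALE_MS = 800.0               # assumed slope of the negative linear relation
MIN_DURATION_MS = 100.0        # assumed floor, not part of the source text


def monitoring_duration_ms(integrity: float) -> float:
    """Return baseline duration plus an adjustment derived from the integrity difference."""
    # Difference between the current semantic integrity and the baseline integrity.
    difference = integrity - BASELINE_INTEGRITY
    # Assumed reading of "inversely proportional": the adjustment falls as the difference rises.
    adjustment = -SCALE_MS * difference
    # The monitoring duration is the sum of adjustment and baseline, floored for safety.
    return max(MIN_DURATION_MS, BASELINE_DURATION_MS + adjustment)
```

With these assumed constants, an utterance at the baseline integrity keeps the default duration, a more complete utterance is monitored for less time, and a less complete one for longer.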

[0045] Step 104: if no voice information is detected within the monitoring duration, perform voice recognition according to the target voice information.

[0046] In this embodiment, if no voice information is detected within the monitoring duration, it indicates that the user has finish...
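
The check in step 104 can be sketched as a loop over voice-activity flags. This is a simplified illustration, not the patent's implementation: the per-frame boolean flags, the fixed frame length, and the names `silence_held` and `recognize_if_finished` are hypothetical, and the `recognize` callable stands in for whatever recognizer is actually used.

```python
from typing import Callable, Iterable, Optional


def silence_held(frames: Iterable[bool], frame_ms: int, monitoring_ms: int) -> bool:
    """Return True if no voice frame occurs within the monitoring duration.

    Each item in `frames` is True when voice is detected in that frame.
    Returns False if voice is detected, or if the stream ends before the
    full monitoring duration has elapsed.
    """
    elapsed = 0
    for has_voice in frames:
        if elapsed >= monitoring_ms:
            break  # the whole monitoring duration passed in silence
        if has_voice:
            return False  # the user is still speaking
        elapsed += frame_ms
    return elapsed >= monitoring_ms


def recognize_if_finished(
    target_voice: str,
    frames: Iterable[bool],
    frame_ms: int,
    monitoring_ms: int,
    recognize: Callable[[str], str],
) -> Optional[str]:
    """Run recognition on the target voice information only after sustained silence."""
    if silence_held(frames, frame_ms, monitoring_ms):
        return recognize(target_voice)
    return None  # caller would keep collecting voice information
```

For example, with 100 ms frames and a 500 ms monitoring duration, five consecutive silent frames are enough to trigger recognition, while a voiced frame among the first five is not.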

Abstract

The application discloses a speech recognition method, device, electronic equipment and storage medium, and relates to the field of deep learning technology and the field of speech technology within artificial intelligence. The method obtains the status information and context information of the application corresponding to the target voice information; calculates the semantic integrity of the target voice information according to the status information and context information; determines the monitoring duration corresponding to the semantic integrity and monitors for voice information within the monitoring duration; and, if no voice information is detected within the monitoring duration, performs voice recognition according to the target voice information. Thus, the semantic completeness of the acquired voice information is determined according to multi-dimensional parameters, and the duration of voice detection is flexibly adjusted according to the semantic completeness, so as to avoid truncating the voice information and improve the accuracy of voice recognition.

Description

Technical field

[0001] The present application relates to the field of deep learning technology and the field of speech technology within artificial intelligence, and in particular to a speech recognition method, device, electronic equipment and storage medium.

Background art

[0002] With the development of artificial intelligence technology, smart home products such as smart speakers and smart robots have also been developed. Users can control such products through voice input; for example, a speaker performs operations such as opening a music application.

[0003] In related technologies, to obtain complete voice information, tail point detection is performed on the voice information; that is, the pause duration of the voice information, which can also be understood as the silence duration, is detected. When the pause duration reaches a fixed value, the complete voice information is considered to have been obtained. Voice in...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L15/04, G10L15/183, G10L15/26, G10L15/16

CPC: G10L15/04, G10L15/183, G10L15/16, G10L15/22, G10L2015/228, G10L25/78, G10L2025/783, G10L15/02, G10L15/08, G10L15/1815

Inventor: 吴震, 周茂仁, 王知践, 崔亚峰, 吴玉芳, 瞿琴, 刘兵, 革家象

Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD
