Speech recognition method, device, electronic device and storage medium

A speech recognition technology, applied in the field of deep learning, that addresses low speech recognition accuracy, highly restrictive methods of judging completeness, and incomplete acquisition of speech information.

Active Publication Date: 2021-09-14
BEIJING BAIDU NETCOM SCI & TECH CO LTD
Cites: 4 | Cited by: 0

AI Technical Summary

Problems solved by technology

[0003] In related technologies, in order to obtain complete voice information, tail point detection is performed on the voice information, that is, the pause duration of the voice information (which can also be understood as the silence duration) is detected. When the pause duration reaches a fixed value, it is considered that the complete voice information has been obtained. This method of determining whether the voice information is complete is highly restrictive, which may lead to incomplete acquisition of voice information and low accuracy of voice recognition.
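To make the limitation concrete, here is a minimal sketch of fixed-threshold tail point detection. It is not taken from the patent: the frame representation (one boolean per frame), the 10 ms frame length, and the 300 ms pause threshold are all assumptions chosen for illustration.

```python
def fixed_tail_point(frames, frame_ms=10, pause_ms=300):
    """Return how many frames are kept before a fixed-length pause ends the utterance.

    frames: sequence of booleans, True where the frame contains speech (assumed input).
    frame_ms, pause_ms: illustrative values, not from the source.
    """
    silence_ms = 0
    heard_speech = False
    for i, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silence_ms = 0  # any speech resets the pause counter
        elif heard_speech:
            silence_ms += frame_ms
            if silence_ms >= pause_ms:
                return i + 1  # the utterance is considered complete here
    return len(frames)


# 200 ms of speech, a 400 ms hesitation, then 200 ms more speech
frames = [True] * 20 + [False] * 40 + [True] * 20
cut = fixed_tail_point(frames)
```

With these inputs the hesitation is longer than the fixed threshold, so the utterance is cut before the second burst of speech begins, which is the truncation problem the paragraph describes.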

Examples

Example 1

[0042] In this example, the correspondence between semantic completeness and monitoring duration is preset, and this preset correspondence is queried to obtain the monitoring duration corresponding to the semantic completeness.
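A preset correspondence of this kind could be sketched as a small lookup table. Everything concrete here is an assumption for illustration: the source does not say how semantic completeness is represented (a score in [0, 1] is assumed), and the thresholds, the durations, and the name `lookup_monitoring_duration_ms` are hypothetical.

```python
import bisect

# Hypothetical preset correspondence:
# (minimum semantic completeness, monitoring duration in ms).
# The more complete the utterance, the shorter the wait. Values are illustrative.
PRESET = [(0.0, 1500), (0.3, 1000), (0.6, 600), (0.9, 300)]
_THRESHOLDS = [threshold for threshold, _ in PRESET]


def lookup_monitoring_duration_ms(completeness):
    """Query the preset table for the duration matching a completeness score."""
    if not 0.0 <= completeness <= 1.0:
        raise ValueError("completeness is assumed to lie in [0, 1]")
    # Find the last table row whose threshold does not exceed the score.
    idx = bisect.bisect_right(_THRESHOLDS, completeness) - 1
    return PRESET[idx][1]
```

A table keeps the mapping easy to tune per product; a score of 0.5 falls in the second row and yields 1000 ms, while 0.95 falls in the last row and yields 300 ms.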

Example 2

[0044] In this example, a baseline semantic integrity corresponding to a monitoring duration baseline value is preset. The monitoring duration baseline value can be understood as the preset default monitoring duration. The difference between the semantic integrity of the current target voice information and the baseline semantic integrity is calculated, and the monitoring duration adjustment value is determined according to this difference, where the difference is inversely proportional to the monitoring duration adjustment value. The sum of the monitoring duration adjustment value and the monitoring duration baseline value is then calculated and used as the monitoring duration.
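The baseline-plus-adjustment calculation can be sketched as follows. The source only states that the adjustment moves opposite to the difference; modelling that as a linear relation with a negative slope is an assumption, as are the numeric defaults, the clamp at zero, and the name `adjusted_monitoring_duration_ms`.

```python
def adjusted_monitoring_duration_ms(
    completeness,
    baseline_completeness=0.5,  # assumed baseline semantic integrity
    baseline_ms=600.0,          # assumed default monitoring duration
    ms_per_unit=800.0,          # assumed slope of the adjustment
):
    """Return baseline duration plus an adjustment that shrinks as completeness grows."""
    difference = completeness - baseline_completeness
    # Higher completeness than the baseline gives a negative adjustment (shorter wait).
    adjustment_ms = -ms_per_unit * difference
    # Clamp so the duration never goes negative (an added safeguard, not from the source).
    return max(0.0, baseline_ms + adjustment_ms)
```

Under these assumed defaults, a completeness of 0.75 shortens the wait to 400 ms, and a completeness of 0.25 lengthens it to 800 ms.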

[0045] Step 104: if no voice information is detected within the monitoring duration, perform voice recognition according to the target voice information.

[0046] In this embodiment, if no voice information is detected within the monitoring duration, it indicates that the user has finish...
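Step 104 can be illustrated with a small control-flow sketch. The excerpt does not show what happens when speech is detected inside the window, so appending it to the target and re-evaluating the duration is an assumption. The segment representation and the names `collect_and_recognize`, `duration_for`, and `recognize` are hypothetical stand-ins, not a real API.

```python
def collect_and_recognize(target, later_segments, duration_for, recognize):
    """Wait for more speech within the monitoring duration, then recognize.

    target: text of the target voice information gathered so far.
    later_segments: list of (gap_ms, text) pairs, where gap_ms is the silence
        preceding each later segment (assumed representation).
    duration_for: callable mapping the current target to a monitoring duration in ms.
    recognize: callable standing in for the speech recognizer.
    """
    for gap_ms, text in later_segments:
        if gap_ms >= duration_for(target):
            break  # nothing was heard within the monitoring duration
        target = f"{target} {text}"  # assumed: speech inside the window extends the target
    return recognize(target)


def toy_duration(text):
    # Toy rule: treat an utterance ending in "music" as complete, so wait less.
    return 300 if text.endswith("music") else 1000


result = collect_and_recognize(
    "play", [(500, "some music"), (400, "louder")], toy_duration, str.upper
)
```

In this toy run the 500 ms gap falls inside the 1000 ms window for the incomplete "play", so "some music" is appended; the following 400 ms gap exceeds the 300 ms window, so recognition runs on "play some music".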

Abstract

The application discloses a speech recognition method, device, electronic device and storage medium, and relates to the field of deep learning technology and the field of speech technology within artificial intelligence. The method includes: obtaining status information and context information of the application corresponding to the target voice information; calculating the semantic integrity of the target voice information according to the status information and context information; determining the monitoring duration corresponding to the semantic integrity, and monitoring for voice information within the monitoring duration; and, if no voice information is detected within the monitoring duration, performing voice recognition according to the target voice information. In this way, the semantic completeness of the acquired voice information is determined from multi-dimensional parameters, and the duration for detecting voice information is flexibly adjusted according to that completeness, which avoids truncating the voice information and improves the accuracy of voice recognition.

Description

Technical field

[0001] The present application relates to the field of deep learning technology and the field of speech technology within artificial intelligence, and in particular to a speech recognition method, device, electronic device and storage medium.

Background

[0002] With the development of artificial intelligence technology, smart home products such as smart speakers and smart robots have also been developed. Users can control related products through voice input; for example, a smart speaker performs operations such as opening a music application.

[0003] In related technologies, in order to obtain complete voice information, tail point detection is performed on the voice information, that is, the pause duration of the voice information (which can also be understood as the silence duration) is detected. When the pause duration reaches a fixed value, it is considered that the complete voice information has been obtained. Voice in...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G10L15/04, G10L15/183, G10L15/26, G10L15/16
CPC: G10L15/04, G10L15/183, G10L15/16, G10L15/22, G10L2015/228, G10L25/78, G10L2025/783, G10L15/02, G10L15/08, G10L15/1815
Inventor: 吴震, 周茂仁, 王知践, 崔亚峰, 吴玉芳, 瞿琴, 刘兵, 革家象
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD