Speech recognition method, device, equipment and storage medium

A speech recognition and speech segment technology, applied in speech recognition, speech analysis, instruments, etc., that addresses the low accuracy of recognizing speech information and improves recognition accuracy.

Active Publication Date: 2020-06-02
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] The embodiments of the present application provide a speech recognition method, device and equipment to solve the problem in the related art of low accuracy when recognizing speech information through a weighted finite state machine network.



Embodiment approach

[0075] Please refer to Figure 3, which shows a flowchart of a speech recognition method provided in an exemplary embodiment of the present application. The method can be applied to the server 130 shown in Figure 1A and Figure 1B, or to a terminal, and can be an optional implementation of step 202 in the embodiment of Figure 2. The method includes:

[0076] In step 202a, the voice information is divided into frames to obtain multiple frames of speech segments.

[0077] Exemplarily, the server divides the voice information into frames by using a moving window to obtain multiple frames of speech segments. The moving window has a preset window length and step length, and each frame of speech segment has its own start and end positions and serial number index.

[0078] If the voice information is a time-domain function, the window length and step length are measured in a preset unit of time. As shown in Figure 4, the window length of t...
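The moving-window framing of step 202a can be illustrated with a minimal Python sketch. It is not the patent's implementation: the names `Frame` and `split_into_frames` are hypothetical, positions are counted in samples rather than in units of time, and dropping a trailing partial window is an assumption the source does not state.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Frame:
    index: int                # serial number index of the frame
    start: int                # start position (inclusive), in samples
    end: int                  # end position (exclusive), in samples
    samples: Sequence[float]  # the windowed slice of the signal


def split_into_frames(signal: Sequence[float],
                      window_length: int,
                      step_length: int) -> List[Frame]:
    """Slide a fixed-length window over `signal` with a fixed step."""
    if window_length <= 0 or step_length <= 0:
        raise ValueError("window_length and step_length must be positive")
    frames: List[Frame] = []
    start = 0
    # Assumption: a trailing window that does not fit entirely is dropped.
    while start + window_length <= len(signal):
        end = start + window_length
        frames.append(Frame(len(frames), start, end, signal[start:end]))
        start += step_length
    return frames
```

When the step length is shorter than the window length, consecutive frames overlap, and each frame keeps the start position, end position and index that later steps need to map a detection back to a position in the voice information.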


Abstract

The application discloses a speech recognition method, device and equipment, belonging to the field of speech recognition. The method includes: acquiring voice information; determining the start and end positions of a candidate speech segment in the voice information through a weighted finite state machine network; intercepting the candidate speech segment from the voice information according to those start and end positions; and inputting the candidate speech segment into a machine learning model, which detects whether the candidate speech segment contains a preset keyword. The application uses the machine learning model to verify the candidate speech segment coarsely located by the weighted finite state machine network and to determine whether it contains the preset keyword. This solves the problem in the related art that voice information without semantics is recognized as voice information with semantics, which leads to false wake-ups, and it improves the accuracy of speech recognition.
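The two-stage flow described in the abstract (coarse location, interception, verification) can be sketched as follows. This is an illustrative outline only: `recognize_keyword`, `locate_candidate` and `contains_keyword` are hypothetical names, and the two callables stand in for the weighted finite state machine network and the machine learning model, neither of which is implemented here.

```python
from typing import Callable, Optional, Sequence, Tuple

Span = Tuple[int, int]  # (start, end) positions, end exclusive


def recognize_keyword(
    voice: Sequence[float],
    locate_candidate: Callable[[Sequence[float]], Optional[Span]],
    contains_keyword: Callable[[Sequence[float]], bool],
) -> bool:
    """Coarsely locate a candidate segment, cut it out, then verify it."""
    # Stage 1: coarse location (stand-in for the WFST network).
    span = locate_candidate(voice)
    if span is None:
        return False  # no candidate segment, so nothing to verify
    start, end = span
    # Intercept the candidate segment according to its start and end positions.
    candidate = voice[start:end]
    # Stage 2: verification (stand-in for the machine learning model).
    return contains_keyword(candidate)
```

Only segments that pass both stages count as containing the keyword, which is how the second stage can reject false positives from the coarse stage.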

Description

Technical field

[0001] The present application relates to the field of speech recognition, in particular to a speech recognition method, device, equipment and storage medium.

Background

[0002] Voice wake-up, also known as keyword spotting (KWS), is a function in which an electronic device in a sleep or lock-screen state recognizes the user's voice and, upon determining that the voice contains a preset keyword, exits the sleep and/or lock-screen state and then starts voice interaction. In the voice wake-up process, speech recognition is a key step.

[0003] Typical speech recognition methods in the related art include: extracting features from the voice information, converting the voice information into corresponding text information through a weighted finite state machine (Weighted Finite State Transducer, WFST) network, and detecting whether the text information contains the preset keyword.

[0004] In the process of converting voice infor...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/02, G10L15/04, G10L15/08, G10L15/22, G10L25/30
CPC: G10L15/02, G10L15/04, G10L15/08, G10L15/22, G10L25/30, G10L2015/088, G10L2015/223, G10L15/16, G10L15/142, G10L2015/025, G06N7/01, G06N3/045, G10L15/05
Inventor: 林诗伦, 张玺霖, 麻文华, 刘博, 李新辉, 卢鲤, 江修才
Owner: TENCENT TECH (SHENZHEN) CO LTD