Speech recognition method, speech recognition device and electronic equipment based on sparse neural network

A neural network and speech recognition technology, applied in the information field, that addresses the large scale of neural networks, the difficulty of deploying them on embedded or mobile devices, and high cost, while shortening training time and reducing network scale.

Active Publication Date: 2021-07-30

FUJITSU LTD

4 Cites, 0 Cited by


Problems solved by technology

[0005] The inventors of the present application have found that, if speech recognition technology is to be applied more widely in real life, two urgent problems remain to be solved: first, neural-network-based speech recognition requires a great deal of time to adjust the structure and parameters of the neural network in order to train a suitable network; second, the neural networks currently in use are very large, which makes them difficult to apply to embedded devices or mobile devices.


Examples


Embodiment 1

[0031] Embodiment 1 of the present application provides a speech recognition method based on a sparse neural network, which is used to recognize a speech segment to be recognized, so as to determine a text corresponding to the speech segment to be recognized.

[0032] Figure 1 is a schematic diagram of the speech recognition method of Embodiment 1. As shown in Figure 1, the method includes:

[0033] S101. Process the speech segment to be recognized to obtain a feature vector of each speech frame in the speech segment to be recognized;

[0034] S102. Use a sparse neural network to recognize the feature vector to obtain a state label value (state id) corresponding to the feature vector, wherein the weight matrix W of the sparse neural network is obtained based on dimension transformation; and

[0035] S103. Use a decoding model to decode the state label value to obtain the text corresponding to the speech segment to be recognized.
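Steps S101 to S103 can be sketched as a three-stage pipeline. The following is only an illustrative sketch under stated assumptions, not the applicant's implementation: the log-power-spectrum front end, the single masked linear layer standing in for the sparse neural network, and the lookup-table decoder are all hypothetical, as are the names `extract_features`, `recognize_states` and `decode` and every numeric parameter. The source does not specify the feature type, the network architecture or the decoding model.

```python
import numpy as np

def extract_features(signal, frame_len=400, hop=160):
    """S101 (hypothetical front end): frame the signal and return one
    log-power-spectrum feature vector per speech frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    return np.log(power + 1e-10)

def recognize_states(features, W, b):
    """S102 (assumed single layer): W is sparse, i.e. most entries are zero.
    Returns one state label value (state id) per frame."""
    logits = features @ W.T + b
    return logits.argmax(axis=1)

def decode(state_ids, state_to_symbol):
    """S103 (toy stand-in for the decoding model): merge consecutive
    repeated states, then map each state id to a symbol."""
    out, prev = [], None
    for s in state_ids:
        if s != prev:
            out.append(state_to_symbol[int(s)])
        prev = s
    return "".join(out)

# Usage with random data; every value here is made up for illustration.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)            # 1 s of noise at 16 kHz
features = extract_features(signal)            # shape (98, 201)
n_states = 3
mask = rng.random((n_states, features.shape[1])) < 0.1   # keep about 10% of weights
W = rng.standard_normal(mask.shape) * mask     # sparse weight matrix
b = np.zeros(n_states)
state_ids = recognize_states(features, W, b)
text = decode(state_ids, {0: "a", 1: "b", 2: "c"})
```

In this sketch the sparsity is imposed by a random mask purely for demonstration; in the method described here the weight matrix W is instead obtained through training based on dimension transformation.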

[0036] In this embodiment, speech recognition is pe...

Embodiment 2

[0060] Embodiment 2 describes the method for training the weight matrix W based on dimension transformation. The weight matrix W obtained according to the method of this embodiment is used in the sparse neural network adopted in step S102 of Embodiment 1.

[0061] Figure 3 is a schematic diagram of the method for obtaining the weight matrix W through training in Embodiment 2. As shown in Figure 3, the method includes:

[0062] S301. For the first predetermined number of training speech frames, calculate the Hessian matrix of the feature vectors of each training speech frame and the first gradient of the feature vectors of each training speech frame in the first space, and, based on the first current weight matrix Wm of the sparse neural network in the first space, calculate the state label value corresponding to the feature vector of each training speech frame;

[0063] S302. Project the first current weight matrix Wm and the first gradient f...
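The quantities named in S301 (a gradient, a Hessian matrix, and a state label value computed from the current weight matrix) can be illustrated for a single training frame. This is a minimal sketch under assumptions the excerpt does not confirm: a single softmax layer, a cross-entropy loss, and derivatives taken with respect to the weights. The function names are hypothetical, and the projection of S302 is not shown because its description is truncated in the source.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())        # subtract the max for numerical stability
    return e / e.sum()

def loss(W, x, y):
    """Cross-entropy loss of one frame x with target state y (assumed loss)."""
    return -np.log(softmax(W @ x)[y])

def gradient_and_hessian(W, x, y):
    """Return the gradient (same shape as W), the Hessian with respect to
    the row-major flattening of W, and the predicted state label value."""
    p = softmax(W @ x)
    err = p.copy()
    err[y] -= 1.0
    grad = np.outer(err, x)                       # dL/dW
    h_logits = np.diag(p) - np.outer(p, p)        # d2L/dz2 for logits z = W x
    hess = np.kron(h_logits, np.outer(x, x))      # (K*D, K*D)
    state_id = int(p.argmax())
    return grad, hess, state_id

# Usage with made-up sizes: K = 4 states, D = 5 feature dimensions.
rng = np.random.default_rng(1)
Wm = rng.standard_normal((4, 5))                  # current weight matrix
x = rng.standard_normal(5)                        # feature vector of one frame
grad, hess, state_id = gradient_and_hessian(Wm, x, y=2)
```

The Hessian grows with the square of the number of weights, which is one reason a method of this kind might evaluate it only on a predetermined number of frames, although the source does not state that rationale.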

Embodiment 3

[0101] This embodiment provides a speech recognition device based on a sparse neural network, corresponding to the speech recognition methods in Embodiment 1 and Embodiment 2.

[0102] Figure 5 is a schematic diagram of the speech recognition device of this embodiment. As shown in Figure 5, the speech recognition device 500 includes a first processing unit 501, a first recognition unit 502 and a first decoding unit 503.

[0103] The first processing unit 501 is used to process the speech segment to be recognized, so as to obtain the feature vector of each speech frame in the speech segment to be recognized; the first recognition unit 502 uses a sparse neural network to recognize the feature vector, to obtain a state label value (state id) corresponding to the feature vector, wherein the weight matrix of the sparse neural network is obtained based on dimension transformation; the first decoding unit 503 uses a decoding model to decode the state label valu...


Abstract

Embodiments of the present application provide a speech recognition method, device, and electronic device based on a sparse neural network. The method includes: processing a speech segment to be recognized to obtain a feature vector of each speech frame in the speech segment to be recognized; using a sparse neural network to recognize the feature vector to obtain a state label value (state id) corresponding to the feature vector, wherein the weight matrix of the sparse neural network is obtained based on dimension transformation; and using a decoding model to decode the state label value to obtain the text corresponding to the speech segment to be recognized. According to the present embodiments, the scale of the sparse neural network for speech recognition is reduced, the training time of the sparse neural network is shortened, and the training result is improved.

Description

Technical field

[0001] The present application relates to the field of information technology, and in particular to a speech recognition method based on a sparse neural network, a speech recognition device and an electronic device.

Background

[0002] Speech recognition technology has been widely used in many fields, including voice dialing, call routing, home appliance control, voice search, simple data input, structured document preparation, voice-to-text and civil aviation applications.

[0003] Due to the development of deep learning technology and big data technology, the accuracy of speech recognition has improved significantly, laying the foundation for the large-scale application of speech recognition.

[0004] It should be noted that the above introduction to the technical background is only for the convenience of a clear and complete description of the technical solution of the present application, and for the convenience of understanding by tho...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L15/16, G10L15/02, G10L15/06

CPC: G10L15/02, G10L15/063, G10L15/16

Inventor: 石自强, 刘柳, 刘汝杰

Owner: FUJITSU LTD
