A Chinese-English mixed speech recognition method and device
A mixed Chinese-English speech recognition technique, applied in speech recognition, speech analysis, instruments, and related fields. It addresses problems such as poor speech recognition performance and a large number of network parameters, and achieves a smaller number of output classes, fewer parameters, and improved recognition performance.
Examples
Embodiment 1
[0047] As shown in Figure 1, the embodiment of the present invention provides a Chinese-English mixed speech recognition method, including but not limited to the following steps:
[0048] S101. Acquire speech training samples.
[0049] In step S101, the speech training samples are sampled from Chinese and English corpora, which include a Chinese corpus, an English corpus, and a Chinese-English mixed corpus.
[0050] In this embodiment, the Chinese and English corpora can be used as a data set. The speech training samples are drawn proportionally from the data set as a training set or a verification set, and are used to estimate the model, determine the model network structure, and determine the model parameters.
[0051] In practical applications, a test set can also be extracted from the data set to evaluate the robustness of the network model constructed from the training set or verification set in general application sce...
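The proportional split described in [0050] and [0051] can be illustrated with a minimal sketch. The 80/10/10 ratios, the fixed seed, the function name split_corpus, and the utterance identifiers are all assumptions made for illustration; the text above does not state the proportions or the sampling procedure actually used.

```python
import random

def split_corpus(utterances, train=0.8, valid=0.1, seed=0):
    """Shuffle utterance IDs and split them proportionally.

    Returns (training set, verification set, test set). The ratios are
    illustrative assumptions, not values taken from the patent text.
    """
    if train < 0 or valid < 0 or train + valid > 1:
        raise ValueError("train and valid must be non-negative and sum to at most 1")
    items = list(utterances)
    random.Random(seed).shuffle(items)  # deterministic shuffle for a given seed
    n_train = int(len(items) * train)
    n_valid = int(len(items) * valid)
    return (
        items[:n_train],
        items[n_train:n_train + n_valid],
        items[n_train + n_valid:],  # the remainder becomes the test set
    )

# Hypothetical data set mixing the three corpus types named in [0049].
corpus = (
    [f"zh_{i}" for i in range(50)]
    + [f"en_{i}" for i in range(30)]
    + [f"mix_{i}" for i in range(20)]
)
train_set, valid_set, test_set = split_corpus(corpus)
```

Shuffling before slicing keeps Chinese, English, and mixed utterances present in every subset on average; a stratified split per corpus type would guarantee it, at the cost of a little more code.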
Embodiment 2
[0081] An embodiment of the present invention provides a Chinese-English mixed speech recognition device 20, including:
[0082] A speech sample acquisition module 21, used to obtain speech training samples, which are sampled from Chinese and English corpora;
[0083] The Chinese and English corpora include a Chinese corpus, an English corpus, and a Chinese-English mixed corpus;
[0084] A model training module 22, used to train the LSTM-CTC end-to-end network with the speech training samples and to modify the softmax layer of the LSTM-CTC end-to-end network so that the characters output by the softmax layer are Unicode-encoded;
[0085] A speech recognition network model acquisition module 23, used to obtain the speech recognition network model according to the characters output by the softmax layer;
[0086] A speech recognition module 24, used to input the speech to be recognized into the speech recognition network model and to process the output of the speech recognition networ...
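The text above does not spell out how the softmax outputs are mapped to Unicode. As one possible reading, the sketch below assigns each distinct Unicode code point in the transcripts to a single label index, so that Chinese characters and English letters share one output layer, and then applies standard greedy CTC decoding (collapse repeated labels, drop blanks). The names BLANK, build_vocab, encode, and ctc_greedy_decode, as well as the sample transcripts, are hypothetical and chosen only for illustration.

```python
BLANK = 0  # label index reserved for the CTC blank symbol (assumed convention)

def build_vocab(transcripts):
    """Map every distinct Unicode code point to a label index starting at 1."""
    code_points = sorted({ord(ch) for text in transcripts for ch in text})
    return {cp: i + 1 for i, cp in enumerate(code_points)}

def encode(text, vocab):
    """Turn a transcript into the label sequence used as the CTC target."""
    return [vocab[ord(ch)] for ch in text]

def ctc_greedy_decode(frame_labels, vocab):
    """Collapse repeated labels and remove blanks, then map back to text."""
    inverse = {index: cp for cp, index in vocab.items()}
    chars = []
    prev = BLANK
    for label in frame_labels:
        if label != BLANK and label != prev:
            chars.append(chr(inverse[label]))
        prev = label
    return "".join(chars)

# Hypothetical mixed Chinese-English transcripts.
transcripts = ["打开 wifi", "play 音乐"]
vocab = build_vocab(transcripts)
targets = encode(transcripts[0], vocab)

# Simulated per-frame argmax of the softmax layer: each label is emitted
# twice and followed by a blank, as a CTC-trained network might do.
frames = []
for label in targets:
    frames.extend([label, label, BLANK])

decoded = ctc_greedy_decode(frames, vocab)
```

In this sketch the softmax layer would need len(vocab) + 1 outputs, one per code point plus the blank. Whether the patent reduces the class count further, for example by emitting code points in smaller units, cannot be determined from the excerpt.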