Multi-dialect mixed speech recognition method, device, system and storage medium

A multi-dialect mixed speech recognition technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the difficulty of guaranteeing high speech recognition accuracy for multi-dialect mixed speech, achieving effective recognition with high accuracy.

Active Publication Date: 2021-11-26

上海企创信息科技有限公司


Problems solved by technology

However, when performing speech recognition on speech files mixed with multiple dialects (including Mandarin Chinese, Chinese dialects, and even languages from different countries), it is difficult to ensure a high speech recognition accuracy rate.

[0003] Existing speech recognition technologies mostly target the speech of a single language. They either cannot recognize speech files mixed with multiple dialects or recognize them very poorly, so a high speech recognition accuracy cannot be guaranteed for multi-dialect mixed speech.

Examples

Embodiment 1

[0030] Figure 1 is a schematic flowchart of a multi-dialect mixed speech recognition method provided by Embodiment 1 of the present invention. This embodiment applies to cases where multi-dialect mixed speech is to be recognized effectively on the basis of the existing dialect recognition subsystem corresponding to each dialect. The method can be executed by a multi-dialect mixed speech recognition device, which can be implemented in software and/or hardware and can be integrated into a multi-dialect mixed speech recognition system.

[0031] Existing speech recognition technology mostly recognizes the speech of a single language, and can achieve high speech recognition accuracy in that setting. However, it cannot effectively recognize speech mixed from multiple dialects, let alone guarantee high speech recognition accuracy for it. The purpose of the present invention is to use the existing speech recognition technology for a s...

Embodiment 2

[0068] Figure 3 is a schematic flowchart of a multi-dialect mixed speech recognition method provided in Embodiment 2 of the present invention. This embodiment is a further refinement of Embodiment 1. Here, adding each semantic text and its timeline information to the historical word segmentation set of the corresponding dialect recognition subsystem is implemented as follows. For each semantic text, it is judged whether the target speech corresponding to the semantic text is the initial speech to be recognized. If it is, the semantic text is determined to be the first semantic text corresponding to the dialect recognition subsystem that generated it, and the binary information group composed of the first semantic text and timeline information is added to the historical word segmentation set corresponding to the dialect recognition subsystem that generates the fir...
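The one branch described above, where the target speech is the initial speech to be recognized, might be sketched as follows. This is only an illustration under stated assumptions: the class name `HistorySets`, the method `add_if_first`, and the representation of timeline information as a (start, end) pair in seconds are hypothetical and do not come from the patent text. The source is truncated before it describes the other branch, so the sketch simply declines to add anything in that case.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumption: timeline information is a (start, end) pair in seconds.
Timeline = Tuple[float, float]
# The "binary information group": (semantic text, timeline information).
InfoGroup = Tuple[str, Timeline]


@dataclass
class HistorySets:
    """Hypothetical container: one historical set per dialect subsystem."""

    by_subsystem: Dict[str, List[InfoGroup]] = field(default_factory=dict)

    def add_if_first(
        self,
        subsystem: str,
        semantic_text: str,
        timeline: Timeline,
        target_is_initial_speech: bool,
    ) -> bool:
        """Store the info group only when the target speech is the initial speech."""
        if not target_is_initial_speech:
            # The source text is cut off before this case is described,
            # so this sketch leaves the history untouched.
            return False
        self.by_subsystem.setdefault(subsystem, []).append((semantic_text, timeline))
        return True


history = HistorySets()
added = history.add_if_first("cantonese", "have you eaten", (0.0, 1.8), True)
skipped = history.add_if_first("mandarin", "good morning", (1.8, 3.0), False)
print(history.by_subsystem)
```

Keying the sets by subsystem name keeps each dialect's history separate, which matches the description of one historical word segmentation set per dialect recognition subsystem.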

Embodiment 3

[0096] Figure 4 is a structural schematic diagram of a multi-dialect mixed speech recognition device provided in Embodiment 3 of the present invention. This embodiment applies to cases where multi-dialect mixed speech is to be recognized effectively on the basis of the existing dialect recognition subsystem corresponding to each dialect. The device can be implemented in software and/or hardware, and specifically includes: a semantic acquisition module 301, a semantic addition module 302, an unprocessed acquisition module 303, a sequence formation module 304, and a result determination module 305. Wherein:

[0097] The semantic acquisition module 301 is configured to use the initial speech to be recognized as the target speech, and to obtain the semantic text produced when at least one dialect recognition subsystem processes the target speech, together with the timeline information corresponding to the semantic text, each of the dialect recognition The type of dialect c...

Abstract

The embodiment of the invention discloses a multi-dialect mixed speech recognition method, device, system and storage medium. Building on the existing dialect recognition subsystems corresponding to each dialect, the embodiment processes a multi-dialect mixed speech file in blocks to obtain the possible dialect combinations for the entire speech file. All of the dialect combinations are then input into the full-text discrimination subsystem, which scores them and selects the optimal one. In this way the speech recognition result of the multi-dialect mixed speech file is obtained, realizing effective recognition of multi-dialect mixed speech files and ensuring high speech recognition accuracy.
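The combine-then-score idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `best_combination`, the `Candidate` tuple, and especially `toy_score`, which merely counts words from a tiny made-up vocabulary and stands in for the full-text discrimination subsystem, whose actual scoring is not described in this excerpt. Exhaustive enumeration is also a simplification; the patent's block-wise procedure may build combinations differently.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

# One candidate per (block, dialect): the dialect label and the text a
# hypothetical dialect recognition subsystem produced for that block.
Candidate = Tuple[str, str]


def best_combination(
    blocks: Sequence[Sequence[Candidate]],
    score_full_text: Callable[[str], float],
) -> Tuple[List[str], str]:
    """Try one candidate per block and keep the highest-scoring full text."""
    best_score = float("-inf")
    best_dialects: List[str] = []
    best_text = ""
    for combo in product(*blocks):
        text = " ".join(candidate_text for _, candidate_text in combo)
        score = score_full_text(text)
        if score > best_score:
            best_score = score
            best_dialects = [dialect for dialect, _ in combo]
            best_text = text
    return best_dialects, best_text


def toy_score(text: str) -> float:
    """Placeholder scorer: counts words found in a tiny made-up vocabulary."""
    vocabulary = {"good", "morning", "have", "you", "eaten"}
    return float(sum(word in vocabulary for word in text.split()))


# Two blocks, each transcribed by two hypothetical dialect subsystems.
blocks = [
    [("mandarin", "good morning"), ("cantonese", "goo moaning")],
    [("mandarin", "half ewe eating"), ("cantonese", "have you eaten")],
]
dialects, text = best_combination(blocks, toy_score)
print(dialects, text)
```

With this toy data the selected combination uses the Mandarin candidate for the first block and the Cantonese candidate for the second. The number of combinations grows exponentially with the number of blocks, so a practical system would need to prune candidates rather than enumerate them all.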

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of speech recognition, and in particular, to a recognition method, device, system and storage medium for multi-dialect mixed speech.

Background

[0002] Speech recognition is an important application branch in the field of artificial intelligence, and the accuracy of speech recognition is an important evaluation index of speech recognition effect. However, when speech recognition is performed on speech files mixed with multiple dialects (including Mandarin Chinese, Chinese dialects, and even languages from different countries), it is difficult to ensure a high speech recognition accuracy.

[0003] Existing speech recognition technologies mostly perform targeted speech recognition on the speech of a single type of language, and cannot recognize speech files mixed with multiple dialects or the recognition effect is very poor, and it is impossible to guarantee a high performanc...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L15/26, G10L15/18, G10L15/00

CPC: G10L15/005, G10L15/1815, G10L15/1822, G10L15/26

Inventor: 顾欣欣, 陆文渊, 曾传名

Owner: 上海企创信息科技有限公司
