Method and system for eliminating channel difference in speech interaction, electronic equipment and medium

A technology for eliminating channel differences in voice interaction, applied in voice analysis, voice recognition, instruments, and similar fields. It addresses problems such as the degradation of back-end recognition performance, and improves recognition accuracy by eliminating channel differences.

Pending Publication Date: 2020-09-04
RDA MICROELECTRONICS TECH SHANGHAI CO LTD

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is to overcome a defect in the prior art: the training corpus does not match the channel environment in which speech is actually collected, which degrades back-end recognition performance. To this end, the invention provides a method, a system, electronic equipment, and a medium for eliminating channel differences in voice interaction.
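
The mismatch can be made concrete with a standard signal model. The derivation below is an illustrative sketch, not taken from the source: it assumes the channel behaves as a fixed linear filter h applied to the clean speech x, and the symbols y, c_x, c_y, c_h, and T (number of frames) are introduced here only for illustration.

```latex
\begin{align*}
y[n] &= x[n] * h[n]
  && \text{(channel as convolution in time)} \\
\log|Y(\omega)| &= \log|X(\omega)| + \log|H(\omega)|
  && \text{(product becomes a sum in the log spectrum)} \\
c_y[t] &= c_x[t] + c_h
  && \text{(constant offset in the cepstrum of frame } t\text{)} \\
\bar{c}_y &= \frac{1}{T}\sum_{t=1}^{T} c_y[t] = \bar{c}_x + c_h \\
c_y[t] - \bar{c}_y &= c_x[t] - \bar{c}_x
  && \text{(the channel term cancels)}
\end{align*}
```

Under these assumptions, a model trained on features carrying one offset c_h and used on features carrying another sees shifted inputs, while mean-subtracted features no longer depend on c_h.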

Examples

Embodiment 1

[0078] This embodiment provides a method for eliminating channel differences in voice interaction. As shown in Figure 2, the method includes:

[0079] During the training phase of the speech model:

[0080] Step S101: extract cepstral features from the training corpus in each scene.

[0081] Different scenes can be set for different places, such as offices, squares, homes, and subway stations. The training corpus can be recorded in each of these scenes.

[0082] Step S102: calculate the cepstral mean of the background environment signal in the corresponding scene from the cepstral features.

[0083] Step S103: subtract the cepstral mean of the background environment signal from the cepstral features of the speech signal in the training corpus to obtain a normalized cepstral sequence, and use the cepstral sequence to train the speech model. Here, the speech signal includes a background environment signal.
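
Steps S102 and S103 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes the cepstral features from step S101 are already available as lists of per-frame coefficient vectors, and it estimates the background cepstral mean of a scene as the plain average over all frames recorded in that scene (the source excerpt does not show the exact estimator). The names `cepstral_mean`, `subtract_mean`, and `normalize_corpus_by_scene` are hypothetical.

```python
from statistics import fmean


def cepstral_mean(frames):
    """Per-coefficient mean over a list of equal-length cepstral frames."""
    if not frames:
        raise ValueError("need at least one cepstral frame")
    # zip(*frames) groups the k-th coefficient of every frame together.
    return [fmean(column) for column in zip(*frames)]


def subtract_mean(frames, mean):
    """Subtract a cepstral mean vector from every frame."""
    return [[c - m for c, m in zip(frame, mean)] for frame in frames]


def normalize_corpus_by_scene(corpus):
    """Normalize each scene's frames with that scene's own cepstral mean.

    corpus maps a scene name to its list of cepstral frames.
    Returns (normalized frames per scene, estimated mean per scene).
    Assumption: the scene-wide average stands in for the cepstral mean
    of the background environment signal.
    """
    normalized, scene_means = {}, {}
    for scene, frames in corpus.items():
        mean = cepstral_mean(frames)
        scene_means[scene] = mean
        normalized[scene] = subtract_mean(frames, mean)
    return normalized, scene_means


# Toy corpus with two scenes and two cepstral coefficients per frame.
example_corpus = {
    "office": [[1.0, 2.0], [3.0, 6.0]],
    "subway station": [[10.0, 0.0], [14.0, 4.0]],
}
normalized_frames, means = normalize_corpus_by_scene(example_corpus)
```

The normalized frames of all scenes would then be pooled to train the speech model; keeping the per-scene means separate is a design choice of this sketch so that each scene's offset is removed independently.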

[0084] The process of o...

Embodiment 2

[0129] This embodiment provides a system 400 for eliminating channel differences in voice interaction. As shown in Figure 4, it includes a first extraction module 411, a first calculation module 412, and a first normalization module 413 for the speech model training stage, and a second extraction module 421, a second calculation module 422, and a second normalization module 423 for the speech model use stage.

[0130] The first extraction module 411 is used for extracting cepstral features for the training corpus in each scene.

[0131] The first calculation module 412 is configured to calculate the cepstrum mean value of the background environment signal in a corresponding scene according to the cepstrum feature.

[0132] The first normalization module 413 is used to subtract the cepstral mean value of the background environment signal from the cepstral feature of the speech signal in the training corpus to obtain a normalized cepstral sequence, and use ...

Embodiment 3

[0144] Figure 5 is a schematic structural diagram of the electronic device provided in this embodiment. The electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, the method for eliminating channel differences in voice interaction in Embodiment 1 is implemented. The electronic device 3 shown in Figure 5 is only an example and should not impose any limitation on the functions and scope of application of the embodiments of the present invention.

[0145] The electronic device 3 may take the form of a general-purpose computing device; for example, it may be a server device. Components of the electronic device 3 may include, but are not limited to, the at least one processor 4 and the at least one memory 5 mentioned above, and a bus 6 connecting the different system components (including the memory 5 and the processor 4).

[0146] The bus 6 includes a data bus, an address bus and a cont...

Abstract

The invention discloses a method and a system for eliminating channel difference in speech interaction, electronic equipment and a medium. The method for eliminating the channel difference in the speech interaction comprises the following steps: in a training stage of a speech model, extracting cepstrum features from training corpora in each scene; calculating a cepstrum mean value of the background environment signal in the corresponding scene according to the cepstrum features; subtracting the cepstrum mean value of the background environment signal from the cepstrum feature of the speech signal to obtain a normalized cepstrum sequence, and training a speech model by using the cepstrum sequence; in the use stage of the speech model, acquiring a user speech signal, and extracting cepstrum features; estimating a cepstrum mean value of the background environment signal according to the cepstrum features; and subtracting the cepstrum mean value of the background environment signal from the cepstrum features to obtain a normalized cepstrum sequence, and inputting the normalized cepstrum sequence into the speech model. The difference between speech channels in the speech model training and using stages is successfully eliminated, so the accuracy of back-end recognition is improved.
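
The effect claimed in the abstract can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the patented procedure: each channel is modeled as a constant additive offset on the cepstral coefficients, the background cepstral mean is estimated as the average over the frames of the utterance, and the same clean frames are used in both stages so that only the channel differs. The function names and the toy numbers are hypothetical.

```python
from statistics import fmean


def cepstral_mean(frames):
    """Per-coefficient mean over a list of equal-length cepstral frames."""
    return [fmean(column) for column in zip(*frames)]


def normalize(frames):
    """Estimate the cepstral mean from the frames and subtract it."""
    mean = cepstral_mean(frames)
    return [[c - m for c, m in zip(frame, mean)] for frame in frames]


def apply_channel(frames, channel_offset):
    """Simulate a fixed channel as a constant offset per coefficient.

    Assumption: a stationary channel adds the same vector to every
    cepstral frame.
    """
    return [[c + h for c, h in zip(frame, channel_offset)] for frame in frames]


# Same underlying speech, seen through two different channels.
clean = [[1.0, -2.0, 0.5], [3.0, 0.0, 1.5], [2.0, 2.0, -2.0]]
training_frames = apply_channel(clean, [0.25, 0.5, -1.0])  # training stage
usage_frames = apply_channel(clean, [-4.0, 2.0, 8.0])      # use stage

norm_train = normalize(training_frames)
norm_usage = normalize(usage_frames)

# Largest remaining difference between the two normalized sequences.
max_gap = max(
    abs(a - b)
    for frame_a, frame_b in zip(norm_train, norm_usage)
    for a, b in zip(frame_a, frame_b)
)
```

In this toy setup the raw training and use-stage features differ by the channel offsets, while the normalized sequences coincide up to rounding, so the speech model would receive matching inputs in both stages. With real speech the utterance mean also contains a contribution from the speech itself, so the cancellation is approximate.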

Description

Technical field

[0001] The invention relates to the field of voice processing, in particular to a method and system for eliminating channel differences in voice interaction, electronic equipment, and a medium.

Background

[0002] Since Amazon's Echo ignited the market for smart speakers as an artificial intelligence product, major speaker manufacturers and companies across the artificial intelligence field have begun to roll out intelligent audio interaction devices. Google's Google Home and Xiaomi's Xiaoai have launched one after another, using voice interaction as the vehicle for smart home control functions. Products are currently applied in various ways. One is speaker-centered control of home appliances over the network, which requires users to converse from within 5 meters of the speaker or even farther away, so that voice interaction is available anytime, anywhere. At the same time, voice conversations under specific products, such as voice ...

Application Information

IPC(8): G10L15/06, G10L25/24
CPC: G10L15/063, G10L25/24
Inventor: 陆成, 叶顺舟
Owner: RDA MICROELECTRONICS TECH SHANGHAI CO LTD