Method of voice recognition self-adaptive system based on heterogeneous dual-MIC in mobile environment

A speech recognition technology for mobile environments, applied in speech recognition, speech analysis, instruments, and related fields. It is simple to implement, eases the difficulty posed by complex acoustic scenes, and improves recognition accuracy.

Status: Inactive | Publication Date: 2017-03-15
深圳凡豆信息科技有限公司
7 Cites 2 Cited by

AI-Extracted Technical Summary

Problems solved by technology

Therefore, in addition to big data and deep learning, how to maintain good...

Abstract

The invention discloses a method for a self-adaptive voice recognition system based on heterogeneous dual MICs in a mobile environment. The method includes the following steps: setting a preferential recognition rule for a main microphone (MIC) and an auxiliary MIC according to their signal-to-noise characteristics; when a terminal enters recording mode, starting the recording channels of the main MIC and the auxiliary MIC at the same time and detecting voice endpoints in real time; if a voice signal is detected, selecting the data of the optimal audio channel according to the preferential rule and performing voice recognition on it; and finally, soft-controlling the hardware power amplifier (PA) of the main MIC 1 according to information in the current audio file, so as to adjust the PA dynamically. With this method, the auxiliary MIC 2, which has a relatively small pickup range, is preferred in noisy environments, reducing the influence of the environment on the recognition engine, while the main MIC 1 can detect the distance of the user and adjust the PA in real time. The result is a self-adaptive system that couples the recording front end with the recognition result, improving recognition performance and user experience.

Application Domain

Speech recognition

Technology Topic

Self adaptive, Speech identification

Examples

  • Experimental program(1)

Example Embodiment

[0028] The present invention will be further explained below in conjunction with the drawings:
[0029] As shown in Figure 1 and Figure 4, the mobile terminal of the present invention includes a PA binding module, a preference module, and an update module. The PA values of the omnidirectional main MIC1 and the directional sub MIC2 are set initially: the main MIC1 is bound to a dynamically adjustable PA, and the sub MIC2 is bound to a fixed PA value. After PA binding, the terminal enters the preference module, where the preferential recognition rules for the main and sub MICs are set first. When the terminal enters recording mode, the recording channels of both MICs are started at the same time and kept in the recording state. Both channels are monitored in real time for voice endpoint characteristics; if a voice start point is detected, the data of the optimal audio channel is selected according to the preference rules and passed to voice recognition until the voice end point is reached, and the recognition result is returned. Finally, the terminal enters the update module, which soft-controls the hardware PA of the main MIC1 according to the wav data just generated by the main MIC1, thereby dynamically adjusting the PA of the main MIC1 recording channel.
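The text does not say how voice endpoints are detected. As an illustration only, the following is a minimal sketch of a simple energy-threshold endpoint detector, which is one common approach and not necessarily the one used in the invention. The function names, the frame length, the threshold, and the `min_frames` parameter are all assumptions introduced here.

```python
# Hypothetical sketch: energy-threshold voice endpoint detection.
# The patent text does not specify a detection method; every name and
# parameter below is an assumption made for illustration.

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each complete, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_endpoints(energies, threshold, min_frames=2):
    """Return (start, end) frame indices of the first voiced run, else None.

    A run counts as voice only if at least `min_frames` consecutive frames
    exceed `threshold`. `end` is exclusive.
    """
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold:
            if start is None:
                start = i  # candidate front endpoint
        elif start is not None:
            if i - start >= min_frames:
                return (start, i)  # back endpoint found
            start = None  # run too short, treat it as noise
    if start is not None and len(energies) - start >= min_frames:
        return (start, len(energies))
    return None


if __name__ == "__main__":
    pcm = [0] * 4 + [10] * 8 + [0] * 4
    print(detect_endpoints(frame_energies(pcm, 4), threshold=50))
```

In a dual-MIC setup such a detector would run on both recording channels, and the per-channel energies around the detected start point would feed the channel preference step.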
[0030] The preference rules are shown in Figure 2. When the front endpoint (the start of voice) is detected, the voice energy, noise energy, signal-to-noise ratio, and related measures of the main MIC1 and sub MIC2 are used to determine which recording channel offers the higher voice clarity and recognizability:
[0031] IF Main_veng > Main_noise THEN Flag_channel = 2
[0032] ELSEIF Sub_veng > Sub_vmin THEN Flag_channel = 2
[0033] ELSEIF Main_veng > Main_vmax THEN Flag_channel = 2
[0034] ELSEIF Sub_veng
[0035] ELSEIF Main_veng > Main_vmin THEN Flag_channel = 1
[0036] ELSEIF Main_snr > Sub_snr THEN Flag_channel = 1
[0037] ELSE Flag_channel = 2
[0038] where:
[0039] Main_noise represents the noise energy threshold of the main MIC1;
[0040] Main_veng represents the voice energy value of the main MIC1;
[0041] Main_vmax represents the clipping energy threshold of the main MIC1;
[0042] Main_vmin represents the lowest voice energy threshold of the main MIC1;
[0043] Sub_veng represents the voice energy value of sub MIC2;
[0044] Sub_vmin represents the lowest voice energy threshold of sub MIC2;
[0045] Sub_mmax represents the highest mute energy threshold of the sub MIC2;
[0046] Main_snr represents the signal-to-noise ratio of the main MIC1;
[0047] Sub_snr represents the signal-to-noise ratio of the sub MIC2;
[0048] Flag_channel represents the preferred channel:
[0049] Flag_channel = 1 indicates that the main MIC1 is preferred,
[0050] Flag_channel = 2 indicates that the sub MIC2 is preferred.
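The rule chain above can be read as a first-match decision list. The sketch below mirrors the branches in the order and form in which they appear in the text; it is an illustration, not a verified reconstruction of Figure 2. The fourth branch (the one beginning with Sub_veng) is incomplete in the source, so it is deliberately left out rather than guessed, and Sub_mmax is therefore unused. The class and function names are assumptions introduced here.

```python
# Hypothetical sketch of the channel preference rules as listed in the text.
# The incomplete fourth branch of the source is intentionally omitted.
from dataclasses import dataclass

MAIN_MIC1 = 1  # Flag_channel = 1
SUB_MIC2 = 2   # Flag_channel = 2


@dataclass
class ChannelStats:
    main_veng: float  # voice energy of main MIC1
    main_snr: float   # signal-to-noise ratio of main MIC1
    sub_veng: float   # voice energy of sub MIC2
    sub_snr: float    # signal-to-noise ratio of sub MIC2


@dataclass
class Thresholds:
    main_noise: float  # noise energy threshold of main MIC1
    main_vmax: float   # clipping energy threshold of main MIC1
    main_vmin: float   # lowest voice energy threshold of main MIC1
    sub_vmin: float    # lowest voice energy threshold of sub MIC2


def select_channel(s: ChannelStats, t: Thresholds) -> int:
    """Return the preferred channel flag; the first matching rule wins."""
    if s.main_veng > t.main_noise:
        return SUB_MIC2
    if s.sub_veng > t.sub_vmin:
        return SUB_MIC2
    if s.main_veng > t.main_vmax:
        return SUB_MIC2
    # Branch starting with Sub_veng is incomplete in the source: omitted.
    if s.main_veng > t.main_vmin:
        return MAIN_MIC1
    if s.main_snr > s.sub_snr:
        return MAIN_MIC1
    return SUB_MIC2


if __name__ == "__main__":
    th = Thresholds(main_noise=100, main_vmax=200, main_vmin=10, sub_vmin=50)
    stats = ChannelStats(main_veng=50, main_snr=8, sub_veng=20, sub_snr=5)
    print(select_channel(stats, th))
```

Keeping the thresholds in a separate structure from the per-utterance measurements makes it easier to tune them per device without touching the rule order.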
[0051] The method by which the wav data generated by the main MIC1 soft-controls the hardware PA of the main MIC1, dynamically adjusting the PA of its recording channel, is shown in Figure 3. When the main MIC1 generates a wav, the wav is analyzed to determine whether the current PA value of the main MIC1 is appropriate. If the maximum energy value eng_max in the wav is greater than the preset clipping energy threshold eng_thresh1, the analog gain PA of the main MIC1 is reduced, so that PA drops quickly. If eng_max is less than the preset lowest speech energy threshold eng_thresh2, the analog gain PA of the main MIC1 is increased, so that PA rises slowly. When eng_max is very small, PA is increased rapidly. The implementation is as follows:
[0052]
[0053] where:
[0054] eng_max represents the maximum energy value in the wav of the main MIC1;
[0055] eng_thresh1 represents the clipping energy threshold of the main MIC1;
[0056] eng_thresh2 represents the lowest speech energy threshold of the main MIC1;
[0057] PA represents the adjusted PA of the main MIC1 to be used for the next recording;
[0058] step_down represents the step size used when PA is decreased;
[0059] step_up represents the step size used when PA is increased.
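The implementation listing itself is missing from this text, so the following is only a sketch of the update rule as the prose describes it: step down when eng_max exceeds eng_thresh1, step up slowly when it falls below eng_thresh2, and step up faster when it is very small. The "very small" cutoff (`eng_very_low`), the fast-increase multiplier, and the PA clamping range are assumptions introduced here; the source names none of them.

```python
# Hypothetical sketch of the PA update for main MIC1, based on the prose only.
# eng_very_low, fast_multiplier, pa_min and pa_max are assumed parameters;
# eng_very_low is assumed to be smaller than eng_thresh2.

def next_pa(pa, eng_max, eng_thresh1, eng_thresh2, step_down, step_up,
            eng_very_low, fast_multiplier=4.0, pa_min=0.0, pa_max=30.0):
    """Return the PA value to use for the next recording on main MIC1."""
    if eng_max > eng_thresh1:
        # Clipping risk: lower the gain.
        new_pa = pa - step_down
    elif eng_max < eng_very_low:
        # Signal is very weak: raise the gain quickly.
        new_pa = pa + fast_multiplier * step_up
    elif eng_max < eng_thresh2:
        # Signal is somewhat weak: raise the gain slowly.
        new_pa = pa + step_up
    else:
        # Energy is in the acceptable band: keep the gain.
        new_pa = pa
    # Keep the gain inside the range the hardware is assumed to support.
    return max(pa_min, min(pa_max, new_pa))


if __name__ == "__main__":
    print(next_pa(10, 900, eng_thresh1=800, eng_thresh2=200,
                  step_down=3, step_up=1, eng_very_low=50))
```

Choosing step_down larger than step_up would give the asymmetric behavior described above, where PA falls quickly after clipping but recovers gradually.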
[0060] The above-mentioned embodiments are only preferred examples of the present invention and are not intended to limit the scope of implementation of the present invention. Therefore, all equivalent changes or modifications made in accordance with the structure, features, and principles described in the scope of the patent application of the present invention shall be included in the scope of the patent application of the present invention.
