50 results about "Improve speech intelligibility" patented technology

Hearing aid system, a hearing aid and a method for processing audio signals

A composite hearing aid system comprises two hearing aids (11, 31) with respective microphones (12, 32) and electronic receivers (17, 37), a microphone (42) and a transmitter (41) adapted to transmit the signal from the microphone (42) to the electronic receivers. At least one of the hearing aids (11, 31) comprises means for inverting the phase of the signal received by the electronic receivers (17, 37). When the phase of the received signal is inverted in one of the hearing aids (11, 31), a release from masking is obtained, and the perceived signal-to-noise ratio is improved. The invention provides a composite hearing aid system, a hearing aid and a method for processing audio signals.
Owner:WIDEX AS
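
The phase inversion described in the abstract above can be illustrated with a minimal Python sketch. The function name `feed_receivers` and the list-of-samples representation are assumptions made for illustration; the abstract does not say how the inversion is implemented in the hearing aids.

```python
def feed_receivers(received, invert_right=True):
    """Return the (left, right) signals for the two hearing aid receivers.

    `received` is the transmitted microphone signal as a list of samples.
    Inverting one side flips the sign of every sample (a 180 degree shift).
    """
    left = list(received)
    right = [-s for s in received] if invert_right else list(received)
    return left, right


left, right = feed_receivers([0.0, 0.5, -0.25])
```

With the inversion enabled, the two ears receive the transmitted signal in antiphase, while locally picked-up noise is unaffected by this step.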

Speech enhancement through partial speech reconstruction

A system improves speech intelligibility by reconstructing speech segments. The system includes a low-frequency reconstruction controller programmed to select a predetermined portion of a time domain signal. The low-frequency reconstruction controller substantially blocks signals above and below the selected predetermined portion. A harmonic generator generates low-frequency harmonics in the time domain that lie within a frequency range controlled by a background noise modeler. A gain controller adjusts the low-frequency harmonics to substantially match the signal strength to the time domain original input signal.
Owner:BLACKBERRY LTD

Spectral enhancement using digital frequency warping

A frequency-warped processing system using either sample-by-sample or block processing is provided. Such a system can be used, for example, in a hearing aid to increase the dynamic-range contrast in the speech spectrum, thus improving ease of listening and possibly speech intelligibility. The processing system is comprised of a cascade of all-pass filters that provide the frequency warping. The power spectrum is computed from the warped sequence and then compression gains are computed from the warped power spectrum for the auditory analysis bands. Spectral enhancement gains are also computed in the warped sequence allowing a net compression-plus-enhancement gain function to be produced. The speech segment is convolved with the enhancement filter in the warped time-domain to give the processed output signal. Processing artifacts are reduced since the frequency-warped system has no temporal aliasing.
Owner:GN HEARING AS
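
The all-pass cascade mentioned in the abstract above can be sketched as a warped delay line. This is a minimal illustration, not the patented implementation: the function name `warped_taps`, the warping coefficient `a=0.5`, and the number of stages are assumptions. Each first-order section replaces a unit delay, so with `a=0` the structure reduces to an ordinary delay line.

```python
def warped_taps(samples, n_stages, a=0.5):
    """Feed `samples` through a cascade of first-order all-pass sections.

    Each section implements y[n] = -a*x[n] + x[n-1] + a*y[n-1].
    Returns the tap vector (the input plus each section output) as it
    stands after the last sample, i.e. one frame of the warped sequence.
    """
    prev_in = [0.0] * n_stages   # x[n-1] for each section
    prev_out = [0.0] * n_stages  # y[n-1] for each section
    taps = [0.0] * (n_stages + 1)
    for x in samples:
        taps[0] = x
        for k in range(n_stages):
            y = -a * taps[k] + prev_in[k] + a * prev_out[k]
            prev_in[k], prev_out[k] = taps[k], y
            taps[k + 1] = y  # becomes the input of the next section
    return taps


frame = warped_taps([1.0, 0.0, 0.0], n_stages=2, a=0.0)
```

A power spectrum computed over such a tap vector is sampled on a warped frequency axis, which is the quantity the abstract uses to derive the compression gains.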

Time-domain receive-side dynamic control

A system improves the speech intelligibility and the speech quality of a speech segment. The system includes a dynamic controller that detects a background noise from an input by modeling a signal. A variable gain amplifier adjusts the variable gain of the amplifier in response to an output of the dynamic controller. A shaping filter adjusts a speech signal by tilting portions of the speech signal of the dynamic controller.
Owner:BLACKBERRY LTD

Improved spectrum subtraction method based on human ear masking effect and Bayesian estimation

Inactive | CN108735225A | Quick response to changes | Overcoming the defect of inaccurate noise estimation | Speech analysis | Noise power spectrum | Noise estimation
The invention discloses an improved spectral subtraction method based on the human ear masking effect and Bayesian estimation. The method comprises the steps of: (1) adopting the improved minima controlled recursive averaging (IMCRA) algorithm to obtain a noise power spectrum estimate of the original noisy speech; (2) using the obtained noise power spectrum estimate to perform preliminary spectral subtraction on the noisy speech signal; (3) performing Bayesian estimation based on a weighted likelihood ratio distortion measure on the signal after preliminary spectral subtraction, and calculating the optimal estimated amplitude spectrum of the signal; (4) calculating a subtraction parameter for secondary spectral subtraction by utilizing the human ear masking effect; (5) performing IMCRA noise estimation again before secondary spectral subtraction, and carrying out secondary spectral subtraction to obtain a final enhanced speech signal; and (6) performing an inverse Fourier transform on the enhanced speech signal to obtain the final enhanced speech. The method better preserves the intelligibility of the speech while improving the noise elimination capability of the algorithm, thereby improving the overall effect of speech enhancement.
Owner:NANJING UNIV OF POSTS & TELECOMM
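
The subtraction step at the core of the abstract above can be sketched in a few lines of Python. This shows only a generic power spectral subtraction with a spectral floor; the IMCRA noise estimator, the Bayesian amplitude estimate, and the masking-based subtraction parameter are not reproduced. The function name and the parameters `alpha` (over-subtraction factor) and `beta` (floor) are assumptions for illustration, and the noise power is taken as already estimated.

```python
def spectral_subtract(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Subtract an estimated noise power spectrum bin by bin.

    alpha scales the noise estimate (over-subtraction factor); beta keeps
    a small fraction of the noisy power so that no bin goes negative.
    """
    cleaned = []
    for p, n in zip(noisy_power, noise_power):
        cleaned.append(max(p - alpha * n, beta * p))
    return cleaned


cleaned = spectral_subtract([4.0, 1.0], [1.0, 2.0])
```

In the second bin the noise estimate exceeds the observed power, so the floor `beta * p` is used instead of a negative value.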

Voice enhancement method and system based on phase compensation

The invention discloses a voice enhancement method and system based on phase compensation. The method comprises the following steps: a to-be-processed noisy voice signal is obtained; a short-time Fourier transform of the noisy voice signal is carried out to obtain the amplitude spectrum and the phase spectrum of the noisy voice signal; a phase spectrum compensation function is obtained, and a Sigmoid type function whose factor is correspondingly changed with the change of the noisy voice signal-to-noise ratio is compensated; the phase spectrum of the noisy voice signal is compensated according to the phase spectrum compensation function to obtain the compensated phase spectrum; the amplitude of a pure voice signal is obtained according to the amplitude spectrum of the noisy voice signal; and the compensated phase spectrum and the amplitude of the pure voice signal are reconstructed to obtain an enhanced voice signal. Compared with a traditional phase compensation voice enhancement method, the estimation of the noise is closer to the real noise power spectrum, the noise in an audio signal can be effectively suppressed, and the intelligibility of the voice signal is improved while the quality of the voice signal is enhanced.
Owner:TAIYUAN UNIV OF TECH

Glottal wave analog type artificial electronic throat with personal characteristics

The invention relates to a glottal wave analog type artificial electronic throat with personal characteristics, in particular to a real glottal wave analog type artificial electronic throat with personal characteristics, which comprises a wave shape generating and processing system with the characteristics of amplitude and frequency jitter, a power amplifying circuit 18 and a miniature electricity-force conversion system 19, wherein the electricity-force conversion system 19 can be regulated and controlled. The invention has a working form that: the wave shape generating and processing system can store glottal wave shapes with personal sounding characteristics; a wave shape generating module 12 generates initial glottal waves according to the stored glottal waves; an amplitude jitter module 14 adds amplitude jitter, and a frequency jitter generating module 13 adds frequency jitter; the generated wave shapes are converted into analog signals through a digital-to-analogue conversion module 17; the output signal wave shapes are converted into mechanical vibration by the electricity-force conversion device 19 after power amplification; the mechanical vibration is applied to the neck of a patient to generate glottal waves; and the glottal wave shapes are modulated by the tongue, the nasal cavity, the oral cavity, the lip, and other organs of the patient to form voice outside the lip. A wave shape frequency regulating module 15 and a wave shape amplitude regulating module 16 are respectively applied to the wave shape generating module 12 so as to regulate frequency and amplitude.
Owner:BEIHANG UNIV +1
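
The waveform generation described in the abstract above (a stored glottal wave replayed with amplitude and frequency jitter) can be sketched as follows. This is a minimal illustration under stated assumptions: the half-sine `template`, the jitter ranges, the nearest-neighbour resampling, and all names are hypothetical and not taken from the patent.

```python
import math
import random


def jittered_glottal_wave(template, f0, fs, n_cycles,
                          freq_jitter=0.01, amp_jitter=0.05, seed=0):
    """Replay a stored one-period glottal template with per-cycle jitter.

    Each cycle gets a slightly different fundamental frequency (frequency
    jitter) and a slightly different gain (amplitude jitter).
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n_cycles):
        f = f0 * (1.0 + rng.uniform(-freq_jitter, freq_jitter))
        gain = 1.0 + rng.uniform(-amp_jitter, amp_jitter)
        n = max(2, round(fs / f))  # samples in this cycle
        for i in range(n):
            j = int(i * len(template) / n)  # nearest-neighbour resampling
            out.append(gain * template[j])
    return out


# Hypothetical stored template: half-sine open phase, then a closed phase.
template = [math.sin(math.pi * k / 32) for k in range(32)] + [0.0] * 32
wave = jittered_glottal_wave(template, f0=100.0, fs=8000, n_cycles=5)
```

In the device, samples like these would go to the digital-to-analogue converter and power amplifier before driving the vibrator.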

Shiatsu type fundamental frequency adjustment electronic artificial larynx

The invention relates to a shiatsu type fundamental frequency adjustment electronic artificial larynx. The shiatsu type fundamental frequency adjustment electronic artificial larynx is characterized by changing the fundamental frequency of a glottal wave by a shiatsu switch button at any time so as to change voice tones, and mainly comprises a shiatsu sensing part, a waveform generating and processing system, a power amplification circuit and an electricity-force conversion system. A glottal waveform having individual voice characteristics is stored in the waveform generating and processing system; the fundamental frequency of the waveform is changed under the control of the switch / shiatsu button at any time during the process of waveform generation; the generated waveform is converted to an analog signal through a digital to analog conversion module in the system; and the signal waveform output by the digital to analog converter is applied to the electricity-force conversion system after power amplification. The amplified waveform is converted to mechanical vibration through an electricity-force energy converter of a high magnetic field, the vibration is applied to the neck of a patient through a vibration film to produce the glottal wave, and the waveform forms sound outside the lips after being modulated by the tongue, nasal cavity, oral cavity, lips, and the like of the patient.
Owner:BEIHANG UNIV +1

Method and test signal for measuring speech intelligibility

The invention generally relates to measurement of speech intelligibility in a mobile communication network component handling two-way communication between two ends, typically a near-end and a far-end. A basic idea is to simulate (S1) two-way speech communication based on test signals adapted for speech intelligibility measurements, detect double-talk (S2) during the simulated speech communication by using a double-talk detector, and perform (S3) speech intelligibility measurements only at periods of double-talk. In this way, it is for example possible to take the effects of echo into account in the speech intelligibility measurements, while avoiding undesirable effects from non-linear processing (and possible added comfort noise) in the signal path for which speech intelligibility is measured. Optionally, the operation of voice enhancement devices can be adjusted (S4) in response to the estimated speech intelligibility.
Owner:TELEFON AB LM ERICSSON (PUBL)
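
The gating idea in the abstract above (measure only during double-talk) can be sketched as follows. The per-frame `scores` and the two activity flags are hypothetical inputs; the abstract does not specify the intelligibility metric or the double-talk detector, so both are treated here as given.

```python
def intelligibility_during_double_talk(scores, near_active, far_active):
    """Average per-frame intelligibility scores over double-talk frames only.

    A frame counts as double-talk when both ends are active at once.
    Returns None if no such frame exists.
    """
    picked = [s for s, near, far in zip(scores, near_active, far_active)
              if near and far]
    return sum(picked) / len(picked) if picked else None


result = intelligibility_during_double_talk(
    scores=[0.2, 0.8, 0.6, 0.4],
    near_active=[True, True, True, False],
    far_active=[False, True, True, True],
)
```

Only the two middle frames have both ends active, so only their scores contribute to the average.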

Efficient noise reduction earphone with low power consumption and noise reduction system

The present invention provides an efficient noise reduction earphone with low power consumption. The earphone comprises an earphone housing, the earphone housing is internally provided with a play horn, and the play horn is connected with a signal processing chip; the earphone further comprises a sensor arranged at the front end of the earphone housing to allow the sensor to stretch into a wearer's ear canal to collect jaw bone vibration signals at an inner ear; and the earphone further comprises a call microphone, the sensor collects the jaw bone vibration signals at an inner ear for voice activity detection and determines whether there is a person talking or not so as to guide an uplink conversation access to perform noise estimation for the signals collected by the call microphone and perform noise inhibition for the voice signals of the call microphone. The efficient noise reduction earphone with low power consumption employs the sensor to collect the jaw bone vibration signals at the inner ear when the wearer is speaking, and does not need to perform echo elimination compared to a traditional earphone of an inner ear microphone which cannot collect music or conversation voice signals played by a loudspeaker so that the power consumption is lower.
Owner:BESTECHNIC SHANGHAI CO LTD

Air pressure type base frequency-adjusted electronic artificial throat

The invention discloses an air pressure type base frequency-adjusted electronic artificial throat, which is characterized in that: an air hood is arranged outside a throat fistula and the air pressure inside the air hood is detected; and the base frequency of a glottal wave is controlled according to the air pressure signal so as to change the voice tone. The air pressure type base frequency-adjusted electronic artificial throat comprises an air pressure sensing part (101), a waveform generating and processing part (102), a power amplifier circuit (103) and a miniature electricity-force conversion part (104), wherein the air pressure sensing part (101) is used for detecting the air pressure outside a throat fistula of a patient and generating an air pressure detection signal; the waveform generating and processing part (102) is used for generating a glottal wave signal so as to change the base frequency according to the air pressure detection signal; the power amplifier circuit (103) amplifies the glottal wave signal; and the miniature electricity-force conversion part (104) generates vibration under the driving of the glottal wave signal. The vibration is applied to the neck of the patient to generate the glottal wave signal, and the glottal wave signal is modulated by the tongue, the nasal cavity, the oral cavity, the lips and other organs of the patient to form voice outside the lips.
Owner:BEIHANG UNIV +1

Frequency-no-masking hearing-aid for double ears

This invention discloses a frequency-no-masking hearing aid for both ears. The novel binaural hearing aid uses a new method to compensate for a hearing-impaired user's reduced comprehension in noise. The method comprises the steps of providing at least one microphone audio signal (18, 20) in response to sound; and providing at least one of an estimate of the target signal (26) and an estimate of the noise signal (30) based on the at least one microphone audio signal (18, 20). At least one of the estimates of the target signal (26) and the noise signal (30) is modified so that the estimate of the target signal (26) and the estimate of the noise signal (30) are positioned within different frequency bands; the modified estimate of the target signal is sent to the eardrum of the user of the binaural hearing aid (10); and the modified estimate of the noise signal is sent to the eardrum of the user.
Owner:GN HEARING AS

Burst noise processing system and burst noise detection and suppression method and device

The invention relates to a burst noise processing system and a burst noise detection and suppression method and device. The burst noise detection method comprises the following steps: calculating frequency spectrum gradient corresponding to each frequency point according to amplitude spectra of the adjacent frequency points in frame-by-frame frequency spectral information of sound signals; determining gradient threshold upper and lower limits based on the frequency spectrum gradient; determining amplitude spectral gradient marks of the frequency points according to the frequency spectral gradient and the gradient threshold upper and lower limits, wherein the amplitude spectral gradient marks comprise a first gradient mark and a second gradient mark; alternatively searching out a first frequency point corresponding to the amplitude spectral gradient mark of the first gradient mark and a second frequency point corresponding to the amplitude spectral gradient mark of the second gradient mark as a group of frequency points; judging that the first frequency point and the second frequency point in the group of the frequency points are a frequency band corresponding to the burst noise if the difference between the second frequency point and the first frequency point in at least one group of the frequency points is smaller than the preset burst noise maximum frequency band. By adopting the scheme, the specific burst noise in a background environment can be effectively detected and suppressed.
Owner:SPREADTRUM COMM (SHANGHAI) CO LTD
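
The detection steps in the abstract above (spectral gradient, upper and lower thresholds, paired gradient marks, band-width check) can be sketched for a single frame as follows. This is a rough sketch under stated assumptions: deriving the thresholds as mean plus or minus `k` standard deviations, and the values of `k` and `max_width`, are illustrative choices not given in the abstract.

```python
import statistics


def find_burst_bands(magnitudes, k=2.0, max_width=4):
    """Return (first_bin, last_bin) pairs that look like narrow-band bursts.

    A steep rise (gradient above the upper threshold) is paired with the
    next steep fall (gradient below the lower threshold); the pair is
    reported only if the two points are closer than `max_width` bins.
    """
    grads = [b - a for a, b in zip(magnitudes, magnitudes[1:])]
    mean = statistics.mean(grads)
    spread = statistics.pstdev(grads)
    upper, lower = mean + k * spread, mean - k * spread

    bands = []
    rise = None  # gradient index of the pending rising edge, if any
    for i, g in enumerate(grads):
        if rise is None and g > upper:
            rise = i
        elif rise is not None and g < lower:
            if i - rise < max_width:
                bands.append((rise + 1, i))
            rise = None
    return bands


frame = [1.0] * 20
frame[8] = frame[9] = 10.0
bands = find_burst_bands(frame)
```

A wide plateau produces the same rising and falling marks but fails the width check, so it is not reported as burst noise.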

Phase-sensitive gated multi-scale dilated convolutional network speech enhancing method and system

The invention provides a phase-sensitive gated multi-scale dilated convolutional network speech enhancing method. The method comprises the following steps: constructing a mapping relationship between complex frequency spectrums of speech signals by using a neural network model, mapping a real and imaginary part frequency spectrum of noisy speech subjected to time-frequency analysis processing to obtain an enhanced real and imaginary part frequency spectrum, and recovering the spectrum into an enhanced time domain voice signal. The invention also provides a phase-sensitive gated multi-scale dilated convolutional network speech enhancing system. The method has the beneficial effects that: the method improves the speech enhancement effect, guarantees that the enhanced speech has good speech intelligibility, and better avoids the problem of speech distortion.
Owner:SHENZHEN INSTITUTE OF INFORMATION TECHNOLOGY

Remote switch type artificial electronic larynx

The invention relates to a remote switch type artificial electronic larynx; a switch (101) is separated from an equipment main body, the equipment main body is fixed at the neck part of a user, and the switch (101) is arranged on the finger or in a pocket of the user; the equipment comprises a remote switch, wireless transmitting / receiving devices (102, 103), a waveform generation and processing system (104, 10), a power amplifying circuit (105, 9) and a miniature electricity-power conversion system (106). The remote switch transmits a switch signal to the equipment main body wirelessly; after the equipment is started, the waveforms generated by the waveform generation and processing system are converted into an analog signal which is transmitted to the power amplifying circuit, and then the analog signal is converted into mechanical vibration by the electricity-power conversion system after being power-amplified; the vibration is applied to the neck part of a patient to generate glottal waves, and the waveforms form voice outside the lips through the modulation of the tongue, nasal cavity, oral cavity, lips and the like.
Owner:BEIHANG UNIV +1

Speech signal processing method and apparatus

A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Method for compensating for hearing loss in a telephone system and in a mobile telephone apparatus

The method makes it possible to extend functional possibilities, and to increase sound quality and the intelligibility of speech in mobile telephone apparatuses and communication systems for hearing-impaired subscribers. The mentioned technical result is achieved in that, in the method, personalized audio signals (A) for hearing-impaired users are generated on the basis of attributes thereof received from audiograms - frequency characteristics of the hearing of the hearing-impaired user stored in a database on the server of the communications network and linked to the telephone numbers of hearing-impaired users. A is processed on the server in a broadband frequency range on the basis of attributes of the hearing of the hearing-impaired user, the power of the processed audio signals is adjusted according to the attributes of the hearing-impaired user, and the adjusted personalized audio signals are transmitted from the communication server to the telephone apparatuses of the hearing-impaired users. The communications network used is a cellular network, and the telephone apparatus used is a mobile telephone apparatus (MTA). A mode which combines the function of a mobile telephone and of a hearing device is implemented.
Owner:A·Y·布莱帝希恩

Speech enhancement method based on constraint naive generative adversarial network

The invention discloses a speech enhancement method based on a constraint naive generative adversarial network. The method comprises the following steps of 1) performing noise data collection and marking; 2) performing voice framing and windowing; 3) performing amplitude compression; 4) inputting the constraint naive generative adversarial network for training; 5) performing amplitude decompression; 6) performing inverse short-time Fourier transform to generate an enhanced speech. The method has the advantages that by adversarial learning between a generative model and a discriminant model in the generative adversarial network, the sample generation capability of the generative model is continuously enhanced and finally the distribution of clean speech samples is obtained; no assumption is made about the statistical distribution of speech or noise; and a complex spectrum mapping method is adopted, so that phase information is added to the training samples. According to the method, the problem that speech and noise signal distribution is difficult to estimate is ingeniously solved, the speech intelligibility is improved, and phase distortion is avoided.
Owner:NANCHANG HANGKONG UNIVERSITY

Mobile communication device with hearing-aid function and method thereof for realizing hearing-aid of earphone

The invention relates to a mobile communication device with a hearing-aid function, which comprises a shell and an earphone, wherein the shell is internally provided with an audio digital signal processor, a memory, a central processor and an audio power amplifier; the earphone is internally provided with a microphone and a receiver; the microphone in the earphone is used for inputting an audio signal to the mobile communication device; and the audio signal is subjected to voice processing such as noise reduction, frequency conversion, and the like in the mobile communication device, and then the processed signal is output to a user with poor hearing through the receiver to assist the user to hear. The invention combines the mobile communication device and a hearing-aid into a whole and ensures that the user with the poor hearing can simply realize hearing-aid through the mobile communication device without additionally adding an independent hearing-aid, which brings convenience to the user with the poor hearing. In addition, the invention also provides a method for realizing hearing-aid of the earphone in the mobile communication device.
Owner:SHENZHEN FUZHI SOFTWARE TECH CO LTD

Active sound insulation earmuff with voice enhancement function

The invention discloses an active sound insulation earmuff with a voice enhancement function. The active sound insulation earmuff comprises an earmuff body with high passive sound insulation performance. A microphone pickup circuit is used for collecting external environment noise and voice signals and collecting error noise in the earmuff cavity; the microprocessor control circuit is used for extracting a voice signal and synthesizing an anti-noise signal; the audio driving and power amplifying circuit is used for carrying out reconstruction filtering and power amplification on the voice signal and the anti-noise signal; and the silencing loudspeaker is used for sending out anti-noise and outputting a voice signal. According to the invention, most of high-frequency noise is isolated by the passive sound insulation earmuff; low-frequency noise is suppressed near ears through an active noise reduction technology; meanwhile, a microphone array is arranged outside the earmuffs, an array voice enhancement algorithm is combined to extract voice signals, the voice signals are coupled into an active noise reduction system and input into earmuff cavities, more effective hearing protection is achieved, meanwhile, the voice intelligibility between wearers is improved, and normal voice communication is guaranteed.
Owner:NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

Headphone signal processing method, system and headphone

Active | CN111131947B | Improve call quality | Improve low-frequency signal-to-noise ratio | Microphones | Signal processing | Noise | Headphones
The embodiments of the present invention disclose an earphone signal processing method, a system and an earphone. The earphone signal processing method includes: acquiring a signal picked up by a first microphone of the earphone located outside the ear canal near the mouth, a signal picked up by a second microphone of the earphone located outside the ear canal far from the mouth, and a signal picked up by a third microphone of the earphone, the third microphone being located in the cavity formed by the earphone and the ear canal; the signal picked up by the first microphone and the signal picked up by the second microphone are subjected to double-microphone noise reduction to obtain the first intermediate signal; the signal picked up by the third microphone and the signal picked up by the third microphone are subjected to double-microphone noise reduction to obtain a second intermediate signal; the first intermediate signal and the second intermediate signal are fused to obtain a fused voice signal; and the fused voice signal is output. The solutions of the embodiments of the present invention can improve the call quality of the headset in a high-noise environment.
Owner:LITTLE BIRD CO LTD

Device voice noise reduction, electronic device and storage medium

Pending | CN114121031A | Active real-time call | Improve speech intelligibility | Speech analysis | Nerve network | Intelligibility (communication)
The invention discloses a device voice noise reduction method, an electronic device and a storage medium, and the device voice noise reduction method is characterized in that at least one noise microphone is arranged near a noise source in the device, and a voice microphone is arranged outside the device. The method comprises the following steps: performing echo cancellation on an initial voice signal based on a noise signal collected by a noise microphone and the initial voice signal collected by a voice microphone to obtain an echo cancellation voice signal; and inputting the echo cancellation voice signal into a pre-trained neural network model to suppress the equipment noise in the echo cancellation voice signal. And finally, clean user voice is obtained, and the voice intelligibility is greatly improved, so that effective real-time communication between users is established.
Owner:AISPEECH CO LTD

Generative adversarial network speech enhancement method based on sparse continuous constraint

The invention discloses a generative adversarial network speech enhancement method based on sparse continuous constraint. The method comprises the following steps of: 1) collecting and classifying data; 2) carrying out speech framing and windowing; 3) carrying out amplitude compression; 4) inputting sparsity constraint-based generative adversarial network training; 5) carrying out amplitude decompression; and 6) synthesizing an enhanced speech. The method has the advantages that distribution of clean voice samples is finally obtained through adversarial learning between the generative model and the discrimination model in the generative adversarial network; there is no assumption for statistical distribution of voice or noise; and sparsity and continuity constraints are added to a loss function of the generator, so that the obtained sparse matrix can better conform to speech spectrum distribution. According to the method, the problem that the voice and noise signal distribution is difficult to estimate is ingeniously solved, the voice intelligibility is improved, and the enhanced speech more conforming to the pure voice spectrum distribution is obtained.
Owner:NANCHANG HANGKONG UNIVERSITY

Method and system for suppressing communication noise of baseband voice signals

Active | CN106856623A | Suppress noise | The signal amplitude is close to the same | Power management | Speech analysis | Bandpass filtering | Low-pass filter
The invention discloses a method of suppressing communication noise of baseband voice signals. At a transmitting end, the baseband voice signal band is divided into two sub bands; low pass filter processing and automatic gain control (AGC) constant amplitude processing are carried out on original voice signals, and the obtained signals are transmitted in the low frequency sub band; an original voice signal waveform envelope is extracted during an AGC process, and envelope signals are obtained and are subjected to frequency modulation and are transmitted in the high frequency sub band; during the AGC constant amplitude processing process, voiced-segment signals are subjected to amplitude compression processing and silent-segment signals are subjected to amplitude expansion processing, the signal amplitudes tend to be consistent, and the dynamic range becomes small; at a receiving end, a voice signal envelope is extracted through bandpass filtering and frequency modulation, and according to the envelope, low pass constant amplitude voice signals are subjected to waveform recovery; and the voice waveform recovery is an AGC inverse process. As the silent-segment signals are compressed and the voiced-segment signals are expanded, the silent-segment noise is thoroughly suppressed, the voiced-segment noise is covered by the voice signals, and good voice intelligibility can be obtained.
Owner:鲁睿
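
The constant-amplitude processing and its inverse described in the abstract above can be sketched as an envelope-based compressor and expander. This is a simplified sketch: the peak-follower envelope, the `decay` and `floor` values, and the function names are assumptions, and the sub-band split and the frequency modulation of the envelope are left out.

```python
def compress(signal, decay=0.5, floor=1e-3):
    """Divide each sample by a running peak envelope (transmit side).

    Loud segments are scaled down and quiet segments scaled up, so the
    output amplitude stays within [-1, 1]. The envelope is returned too,
    since the receiver needs it to undo the scaling.
    """
    envelope, normalised = [], []
    level = 0.0
    for x in signal:
        level = max(abs(x), decay * level, floor)
        envelope.append(level)
        normalised.append(x / level)
    return normalised, envelope


def expand(normalised, envelope):
    """Multiply by the transmitted envelope (receive side, inverse AGC)."""
    return [n * e for n, e in zip(normalised, envelope)]


signal = [0.8, -0.4, 0.02, -0.01, 0.0]
normalised, envelope = compress(signal)
restored = expand(normalised, envelope)
```

Because the receiver multiplies by a small envelope during quiet segments, any noise picked up in the channel is attenuated there along with the signal.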

Fitting of sound processors using improved sounds

In one embodiment, a method for customising a sound processing device for an individual listener including presentation of one or more sounds to the listener directly from the sound processing device, each sound comprising a collection of two or more harmonically related tones, spectrally positioned about a frequency of interest, and having a temporal envelope consisting of a rise time, sustain time, and decay time, obtaining information from the listener, adjusting the level of the sounds, and using the adjusted levels to set up the sound processing device for the listener.
Owner:BLAMEY & SAUNDERS HEARING

Voice signal processing method and device, equipment and storage medium

Pending | CN113707162A | The enhanced effect is stable | Improve speech intelligibility | Speech analysis | Engineering | Speech sound
The invention provides a voice signal processing method and device, equipment and a storage medium, and belongs to the technical field of artificial intelligence. For a to-be-processed voice signal, a first power spectrum and phase information of the voice signal at each frequency point in a frequency domain are firstly obtained, then the first power spectrum is enhanced by obtaining a frequency band gain value corresponding to each frequency point, and a second power spectrum of each frequency point is obtained; and a target voice signal meeting a voice playing condition is generated according to the second power spectrum and the phase information of each frequency point. According to the processing mode, the power spectrum of each frequency point is enhanced in a targeted manner, so that the enhancement effect of the voice signal is more stable, the voice quality is effectively improved, and the voice intelligibility is enhanced; moreover, no matter whether the to-be-processed voice signal is subjected to cascade coding processing or not, the processing mode can be adopted to enhance the voice signal, and the application range is wide.
Owner:TENCENT TECH (SHENZHEN) CO LTD
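
The per-frequency-point processing in the abstract above (first power spectrum and phase, a gain per frequency point, then a second power spectrum recombined with the original phase) can be sketched as follows. The gains are hypothetical inputs, and treating them as multipliers on power rather than on amplitude is an assumption; how the gains are obtained is not shown.

```python
import cmath
import math


def enhance_bins(spectrum, gains):
    """Scale the power of each complex frequency bin, keeping its phase."""
    enhanced = []
    for x, g in zip(spectrum, gains):
        first_power = abs(x) ** 2        # first power spectrum
        phase = cmath.phase(x)           # phase information
        second_power = g * first_power   # enhanced (second) power spectrum
        enhanced.append(cmath.rect(math.sqrt(second_power), phase))
    return enhanced


out = enhance_bins([3 + 4j, 1j], [4.0, 1.0])
```

A power gain of 4 doubles the amplitude of the first bin, while a gain of 1 leaves the second bin unchanged; an inverse transform of the result would give the time-domain signal.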

Audio and video hybrid voice front-end processing method for service robot voice interaction

The invention discloses an audio-video mixed voice front-end processing method for voice interaction of a service robot. The specific steps are as follows: (1) capture the mouth movement information of the expected speaker through video processing means; (2) perform voice activity detection according to the mouth movement information of the expected speaker; (3) optimize the beam algorithm of the robot microphone array according to the voice activity detection results; (4) realize voice enhancement through the array microphone, suppress environmental noise, and improve the signal-to-noise ratio of the voice collected by the robot. The invention can effectively improve the signal quality of the voice collected by the robot in the complex sound field environment where the robot is located.
Owner:南京南大电子智慧型服务机器人研究院有限公司 +2