272 results about "Voice activity detection" patented technology

Voice activity detection (VAD), also known as speech activity detection or speech detection, is a speech-processing technique that detects the presence or absence of human speech. Its main uses are in speech coding and speech recognition. It can facilitate speech processing and can also be used to deactivate some processes during non-speech sections of an audio session: in Voice over Internet Protocol applications, it avoids unnecessary coding and transmission of silence packets, saving computation and network bandwidth.

Unified communications system and method

Active | US20160036962A1 | Minimal data consumption | Common problem | Public address systems | Substation equipment | Communications system | Voice communication

A unified communications system enables a user to simultaneously manage real-time voice communication with background audio streams such as music. In one embodiment, the system comprises a plurality of devices connected through a persistent voice over IP channel, where each device is playing background audio independently, and each device is operatively coupled to a microphone. When a connected user speaks, Voice Activity Detection results in the automatic adjustment of background audio on other connected devices, the adjustments based upon user-input settings on each device.

Owner:RAND JAMES S

Method and an apparatus for voice activity detection

Inactive | US20120232896A1 | Easy to adapt | Fast processing | Speech recognition | Decision combination | Speech sound

A voice activity detection apparatus (1) comprising: a signal condition analyzing unit (3) which analyses at least one signal parameter of an input signal to detect a signal condition SC of said input signal; at least two voice activity detection units (4-i) comprising different voice detection characteristics, wherein each voice activity detection unit (4-i) performs separately a voice activity detection of said input signal to provide a voice activity detection decision VADD; and a decision combination unit (5) which combines the voice activity detection decisions VADDs provided by said voice activity detection units (4-i) depending on the detected signal condition SC to provide a combined voice activity detection decision cVADD.

Owner:HUAWEI TECH CO LTD
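
The decision-combination idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the two condition labels, the SNR-based analysis, and the all/any combination rules are assumptions chosen only to show how a signal condition might select the rule that merges several detector decisions.

```python
from typing import Sequence

# Hypothetical signal conditions; the abstract does not enumerate them.
HIGH_SNR = "high_snr"
LOW_SNR = "low_snr"


def analyse_condition(snr_db: float, threshold_db: float = 10.0) -> str:
    """Map one signal parameter (an assumed SNR estimate) to a condition label."""
    return HIGH_SNR if snr_db >= threshold_db else LOW_SNR


def combine_decisions(decisions: Sequence[bool], condition: str) -> bool:
    """Combine per-detector decisions with a rule chosen by the signal condition."""
    if condition == HIGH_SNR:
        # Clean input: require agreement between detectors to limit false alarms.
        return all(decisions)
    # Noisy input: accept any detector's vote to avoid clipping speech.
    return any(decisions)


# Two detectors disagree; the combined decision depends on the condition.
print(combine_decisions([True, False], analyse_condition(20.0)))  # False
print(combine_decisions([True, False], analyse_condition(3.0)))   # True
```

A real system would likely use weighted or learned combination rules; the point here is only that the combiner takes the detected condition as an input.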

Method and apparatus for comfort noise generation in speech communication systems

Inactive | US20070050189A1 | Speech analysis | Wireless communication | Communications system | Voice communication

A method that may be used in a variety of electronic devices for generating comfort noise includes receiving (705) a plurality of information frames indicative of speech plus background noise, estimating (710) one or more background noise characteristics based on the plurality of information frames, and generating a comfort noise signal (715) based on the one or more background noise characteristics. The method may further include generating a speech signal (720) from the plurality of information frames, and generating an output signal (725) by switching between the comfort noise signal and the speech signal based on a voice activity detection.

Owner:GOOGLE TECH HLDG LLC
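
The switching step described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the only noise characteristic estimated is an RMS level, the comfort noise is white Gaussian noise, and the function names are hypothetical.

```python
import random
from typing import List, Sequence


def estimate_noise_rms(noise_frames: Sequence[Sequence[float]]) -> float:
    """Estimate one background-noise characteristic: the RMS level (assumed)."""
    samples = [s for frame in noise_frames for s in frame]
    return (sum(s * s for s in samples) / len(samples)) ** 0.5


def comfort_noise(length: int, rms: float, rng: random.Random) -> List[float]:
    """Generate white Gaussian noise at the estimated level (an assumption)."""
    return [rng.gauss(0.0, rms) for _ in range(length)]


def output_frame(speech_frame: Sequence[float], is_speech: bool,
                 noise_rms: float, rng: random.Random) -> List[float]:
    """Switch between the decoded speech and comfort noise on the VAD flag."""
    if is_speech:
        return list(speech_frame)
    return comfort_noise(len(speech_frame), noise_rms, rng)


rng = random.Random(0)
rms = estimate_noise_rms([[0.1, -0.1], [0.1, -0.1]])
frame = output_frame([0.0] * 160, False, rms, rng)  # 160 samples of comfort noise
```

Practical codecs usually shape the comfort noise spectrally as well; the sketch keeps only the level to stay short.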

Acoustic echo devices and methods

Inactive | US20060018460A1 | Two-way loud-speaking telephone systems | Frequency spectrum | Speech sound

Hands-free phones with voice activity detection using a comparison of frame power estimate with an adaptive frame noise power estimate, automatic gain control with fast adaptation and minimal speech distortion, echo cancellation updated in the frequency domain with stepsize optimization and smoothed spectral whitening, and echo suppression with adaptive talking-state transitions.

Owner:TEXAS INSTR INC
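
The comparison of a frame power estimate with an adaptive noise power estimate, as mentioned in the abstract above, can be sketched in a few lines. The threshold ratio, the smoothing factor, and the choice to initialise the noise estimate from the first frame are assumptions made for this example, not values from the patent.

```python
from typing import List, Sequence


def frame_power(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one non-empty frame."""
    return sum(s * s for s in frame) / len(frame)


def energy_vad(frames: Sequence[Sequence[float]],
               ratio: float = 4.0, alpha: float = 0.9) -> List[bool]:
    """Flag a frame as speech when its power exceeds ratio * noise estimate.

    The noise estimate is a recursive average that is updated only on
    frames classified as non-speech (assumed update rule).
    """
    decisions: List[bool] = []
    noise = None
    for frame in frames:
        power = frame_power(frame)
        if noise is None:
            noise = power  # assumption: the first frame contains only noise
        is_speech = power > ratio * noise
        if not is_speech:
            noise = alpha * noise + (1.0 - alpha) * power
        decisions.append(is_speech)
    return decisions


frames = [[0.01] * 4, [0.01] * 4, [1.0] * 4, [0.01] * 4]
print(energy_vad(frames))  # [False, False, True, False]
```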

System and method for winding audio content using a voice activity detection algorithm

Active | US20070112562A1 | Pleasant audio playback | Eliminate need | Speech analysis | Record information storage | Conductor Coil | Speech sound

A system and method for locating a preferable playback start location after a winding or rewinding action in an audio playing device. In response to an adjustment of the playing location for audio content to a desired playing position, the system determines whether at least one non-speech or silent period of at least a predetermined duration exists within the vicinity of the desired playing position. If at least one such non-speech or silent period exists within the vicinity of the desired playing position, the system adjusts the playing position to fall within one of the at least one non-speech period or silent period.

Owner:WSOU INVESTMENTS LLC
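
The position adjustment described in the abstract above can be sketched as a search over non-speech intervals. The interval list is assumed to come from a VAD pass that is not shown, and the minimum pause length, the vicinity window, and the choice of the pause midpoint are illustrative assumptions.

```python
from typing import Optional, Sequence, Tuple


def snap_to_pause(target_s: float,
                  pauses: Sequence[Tuple[float, float]],
                  min_pause_s: float = 0.3,
                  vicinity_s: float = 5.0) -> float:
    """Return a start position inside the nearest long-enough pause.

    `pauses` holds (start, end) times in seconds of non-speech periods.
    If no suitable pause lies within `vicinity_s` of the target, the
    target position is returned unchanged.
    """
    best_pos: Optional[float] = None
    best_dist: Optional[float] = None
    for start, end in pauses:
        if end - start < min_pause_s:
            continue  # pause is shorter than the required duration
        # Distance from the target to the interval (0 if the target is inside).
        dist = max(start - target_s, 0.0, target_s - end)
        if dist > vicinity_s:
            continue
        if best_dist is None or dist < best_dist:
            best_dist = dist
            inside = start <= target_s <= end
            best_pos = target_s if inside else (start + end) / 2.0
    return target_s if best_pos is None else best_pos


pauses = [(10.0, 10.1), (12.0, 13.0), (30.0, 31.0)]
print(snap_to_pause(11.0, pauses))  # 12.5
```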

Device And Method For Voice Activity Detection

Inactive | US20080091421A1 | Remove uncertainty | Microphones | Loudspeakers | Engineering | Speech sound

A device includes a sound signal analyser configured to determine whether a sound signal comprises speech. The device further includes a microphone system configured to discriminate sounds emanating from sources located in different directions from the microphone system so that sounds only emanating from a range of directions are included as signals possibly containing speech.

Owner:SONY CORP

Echo cancellation in telephones with multiple microphones

Inactive | US20060147063A1 | Improve performance | Reduce adverse effects | Two-way loud-speaking telephone systems | Transmission | Proximal point | Engineering

The present invention is directed to a telephone equipped with multiple microphones that provides improved performance during operation of the telephone in a speaker-phone mode. For example, the multiple microphones can be used to improve voice activity detection, which in turn, can improve echo cancellation. In addition, the multiple microphones can be configured as an adaptive microphone array and used to reduce the effects of (i) room reverberation, when a near-end user is speaking, and / or (ii) acoustic echo, when a far-end user is speaking.

Owner:AVAGO TECH WIRELESS IP SINGAPORE PTE

Method for adaptively adjusting sound effect and equipment thereof

Active | CN102436821A | Reduce the impact on use | Improve experience | Speech analysis | Environmental noise | Engineering

A method for adaptively adjusting a sound effect and equipment thereof are disclosed. The method comprises the following steps: acquiring an energy value of the current environmental noise; receiving a first trigger instruction and adjusting the current output volume according to the energy value of the current environmental noise; when the energy value of the current environmental noise is larger than a first threshold, carrying out treble enhancement processing; when the energy value of the current environmental noise is less than a second threshold, carrying out bass enhancement processing. In the method, sound data is collected and voice activity detection is performed on it. When the first trigger instruction is received, the current output volume can be adjusted according to the current environmental noise energy value, and the frequency response can be adjusted through treble enhancement or bass enhancement. A better sound effect can be obtained, and the method is easy to implement.

Owner:HYTERA COMM CORP

Voice activity detection and wake-up method and device

Active | CN108010515A | Lower latency | Reduce power consumption | Speech recognition | Feature extraction | Network output

The invention provides a voice activity detection and wake-up method and device, and relates to the technical field of machine learning speech recognition. The method includes the steps of acquiring voice activity detection data and wake-up data, and performing Fbank feature extraction on the voice activity detection data and wake-up data to obtain voice Fbank feature data; inputting the voice Fbank feature data to a binary neural network model to obtain binarized neural network output result data; and according to a preset backend evaluation strategy, processing the binarized neural network output result data, determining a voice start position and a voice end position of the voice activity detection data, and detecting wake-up word data in the wake-up data. The system framework of the invention can be applied to voice activity detection and voice wake-up technologies at the same time, and can implement accurate, fast, low-delay, small-model and low-power voice activity detection technologies and voice wake-up technologies.

Owner:TSINGHUA UNIV

Multi-band structure self-adaptive filter switching method for AEC (acoustic echo cancellation)

Active | CN106782593A | Achieving Convergence Speed Advantage | Overcome speed | Speech analysis | Multi band | Adaptive filter

The invention discloses a multi-band structure self-adaptive filter switching method for AEC (acoustic echo cancellation). First, a far-end voice signal is acquired; a voice endpoint is detected, and a VAD (voice activity detection) flag bit and an improved envelope decision threshold are output; the voice signal is fed into a loudspeaker to serve as a desired signal and is also input into a self-adaptive filter; the self-adaptive filter adopts a switchable multi-band structure and a corresponding self-adaptive algorithm, the parameters of the filter are adjusted using the least mean square criterion according to feedback information, and the optimal solution is obtained. With the provided switching method, voice characteristics are fully considered while the steady-state misadjustment is guaranteed, and the convergence rate and the algorithm complexity are balanced while the convergence-rate advantages of the algorithm are exploited. In practical echo cancellation applications, a single algorithm does not easily meet varied and changing demands. The switchable algorithm gives the user more options and is of great significance for self-adaptive echo cancellation applications.

Owner:CHONGQING UNIV OF POSTS & TELECOMM

Acoustic echo devices and methods

Active | US20060018459A1 | Two-way loud-speaking telephone systems | Computer science | Speech sound

Hands-free phones with voice activity detection using a comparison of frame power estimate with an adaptive frame noise power estimate, automatic gain control with fast adaptation and minimal speech distortion, echo cancellation updated in the frequency domain with stepsize optimization and smoothed spectral whitening, and echo suppression with adaptive talking-state transitions.

Owner:TEXAS INSTR INC

Signal presence detection using bi-directional communication data

Inactive | US20090043577A1 | Speech recognition | Special data processing applications | Data connection | Voice activity

A system and method for using bi-directional conversation data to improve signal presence detection are disclosed. The detector module is adapted to communicate with a signal enhancement module. The detector module collects data from a transmit direction of the connection and a receive direction of a data connection. The collected data from the transmit and the receive direction is used to classify at least one of data in the transmit direction and data in the receive direction. Responsive to the classification, the signal enhancement module enhances data in one of the transmit direction and the receive direction. Hence, data classification accuracy is improved by using data from both the transmit and receive directions. In one embodiment, the detector module applies a voice activity detection module (VAD) process to detect the presence or absence of voice data in the collected data.

Owner:DITECH NETWORKS

Method, device and electronic equipment for voice activity detection

Active | CN102044242A | Capable of self-adaptive adjustment | Improve the performance of voice activation detection | Speech analysis | Time domain | Operation mode

The embodiment of the invention discloses a method, device and electronic equipment for voice activity detection. The method comprises the following steps: acquiring time domain sorting parameters and frequency domain sorting parameters from audio frames; acquiring first distances between the time domain sorting parameters and the long-time sliding average value of the time domain sorting parameters in historical background noise frames; acquiring second distances between the frequency domain sorting parameters and the long-time sliding average value of the frequency domain sorting parameters in historical background noise frames; and determining whether the audio frames are foreground voice frames or background noise frames according to the first distances, the second distances and a determining polynomial group based on the first and second distances, wherein at least one coefficient in the determining polynomial group is a variable that changes with the operation mode of voice activity detection or the characteristics of the input signals. The technical scheme gives the determining criterion a self-adaptive regulation capability, thereby improving the performance of voice activity detection.

Owner:HUAWEI TECH CO LTD

Method and apparatus to facilitate voice activity detection and coexistence manager decisions

Active | US8825860B2 | Digital computer details | Time-division multiplex | Resource based | Speech sound

A system and method to facilitate voice activity detection and coexistence manager decisions are provided and include identifying a connection utilizing a first resource and a content stream corresponding to the connection, where the first resource conflicts with a second resource. The content of the content stream is classified into multiple levels based on a value of the content, and a priority is then assigned to the first and second resources based on the level of the content of the first resource.

Owner:QUALCOMM INC

Method and device for voice activity detection (VAD) and encoder

Active | CN102044243A | High judgment performance | Improve efficiency | Speech analysis | Self adaptive | Speech sound

The embodiment of the invention discloses a method and device for voice activity detection (VAD) and an encoder. The method for VAD comprises the following steps: when an input signal is a background noise, acquiring a fluctuation characteristic value of the background noise, wherein the fluctuation characteristic value is used for indicating the fluctuation amplitude of the background noise; carrying out self-adaptive regulation on relevant parameters of a determining criterion of VAD in accordance with the fluctuation characteristic value; and carrying out VAD determination on the input signal based on the relevant parameters of the determining criterion subject to self-adaptive regulation. The embodiment of the invention can be self-adapted to the fluctuation of the background noise so as to carry out VAD determination, thereby improving the performance of VAD determination, saving the limited channel bandwidth resources and realizing the high-efficiency utilization of the channel bandwidth.

Owner:HUAWEI TECH CO LTD

Single-channel voice enhancement method and system

Inactive | CN102157156A | Suppress noise | Improve signal-to-noise ratio | Speech analysis | Frequency Unit | Feature extraction

The invention provides a single-channel voice enhancement method and a single-channel voice enhancement system. The method comprises the following steps: extracting a noise signal from a noisy voice signal through voice activity detection; applying outer ear, middle ear and inner ear simulation processing to the noisy voice signal and to the noise signal through peripheral analysis; obtaining, through feature extraction, the energy difference in each time-frequency unit between the simulated noisy voice signal and the simulated noise signal; generating different masking values from the energy difference of each time-frequency unit and weighting the masking values to obtain a masking-processed signal; and reconstructing the voice signal from the masking-processed signal and the simulated noisy voice signal to obtain an enhanced voice signal. The invention can reduce damage to the target voice signal, achieve a better denoising effect and maintain higher voice quality in environments with multiple kinds of noise.

Owner:WUXI RES INST OF APPLIED TECH TSINGHUA UNIV +1

Noise suppressing multi-microphone headset

Active | US8340309B2 | Small and cheap physical package | Improve application performance | Ear treatment | Noise generation | Earcon | Headphones

A new type of headset that employs adaptive noise suppression, multiple microphones, a voice activity detection (VAD) device, and unique mechanisms to position it correctly on either ear for use with phones, computers, and wired or wireless connections of any kind is described. In various embodiments, the headset employs combinations of new technologies and mechanisms to provide the user a unique communications experience.

Owner:JAWBONE INNOVATIONS LLC

Method and system for speech processing for enhancement and detection

Inactive | US7343284B1 | Efficient and reliable | Speech analysis | Speech sound | Speech enhancement

A method for discriminating noise from signal in a noise-contaminated signal involves decomposing a frame of samples of the signal into decorrelated components, and using a difference between probability distributions of the noise contributions and the signal contributions to identify signal and noise. A Gaussian distribution is used to determine whether the components are only noise whereas a Laplacian distribution is used to determine whether the components contain the signal. Such discrimination may be used in speech enhancement or voice activity detection apparatus.

Owner:RPX CLEARINGHOUSE

Voice activity detection method in complex background noise

Active | CN102194452A | Differentiate voice | Distinguish background noise | Speech analysis | Background noise | Speech sound

The invention discloses a voice activity detection method in complex background noise. The method comprises the following steps in sequence: (1) performing a TEO (Teager Energy Operator) operation on the data; (2) pre-weighting the input data x(n); (3) performing band-pass filtering; (4) framing and windowing; (5) calculating an evolution value of the autocorrelation of each frame and its standard variance; (6) calculating Stati for the 20 frames at the initial stage, together with its mean, mean(Stati), and standard variance, std(Stati), and comparing std(Stati) with a preset threshold to judge whether voice is present; (7) calculating subsequent data; (8) calculating Stati for FrameN consecutive frames and performing a secondary determination according to mean(Stati) and std(Stati); (9) considering that the speech interval Speechmin is 100-200 ms and the duration Silencemin is 500-1,000 ms, judging that voice has started if Statusfinal is equal to 0 and status is equal to 1 for Ns consecutive frames (the value of Ns is related to FrameN), judging that voice has ended if Statusfinal is equal to 1 and status is equal to 0 for NE consecutive frames (the value of NE is also related to FrameN), and finally determining the actual end points of the voice.

Owner:西安烽火电子科技有限责任公司
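
Step (9) of the abstract above amounts to a small state machine over per-frame status flags. The sketch below shows one way to read it; it covers only that final step, and the names `n_start` and `n_end` (standing in for Ns and NE) and the exclusive end index are assumptions of this example.

```python
from typing import List, Sequence, Tuple


def find_endpoints(status: Sequence[int], n_start: int,
                   n_end: int) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) pairs, with end_frame exclusive.

    Speech starts after n_start consecutive frames with status 1 and ends
    after n_end consecutive frames with status 0. The boolean in_speech
    plays the role of Statusfinal in the abstract.
    """
    segments: List[Tuple[int, int]] = []
    in_speech = False
    run = 0    # length of the current run that contradicts the state
    start = 0
    for i, s in enumerate(status):
        if not in_speech:
            run = run + 1 if s == 1 else 0
            if run == n_start:
                in_speech = True
                start = i - n_start + 1  # first frame of the confirming run
                run = 0
        else:
            run = run + 1 if s == 0 else 0
            if run == n_end:
                in_speech = False
                segments.append((start, i - n_end + 1))
                run = 0
    if in_speech:
        segments.append((start, len(status)))  # speech runs to the end
    return segments


flags = [0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1]
print(find_endpoints(flags, n_start=3, n_end=3))  # [(3, 9)]
```

Isolated flags (frame 1) and short dropouts (frame 6) do not change the state, which is the purpose of requiring consecutive runs.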

Activity detection by joint human and object detection and tracking

Active | US20190180090A1 | Efficient network training | Limited data | Image enhancement | Medical data mining | Communication interface | Object detection

A computing device includes a communication interface, a memory, and processing circuitry. The processing circuitry is coupled to the communication interface and to the memory and is configured to execute the operational instructions to perform various functions. The computing device is configured to process a video frame of a video segment on a per-frame basis and based on joint human-object interactive activity (HOIA) to generate a per-frame pairwise human-object interactive (HOI) feature based on a plurality of candidate HOI pairs. The computing device is also configured to process the per-frame pairwise HOI feature to identify a valid HOI pair among the plurality of candidate HOI pairs and to track the valid HOI pair through subsequent frames of the video segment to generate a contextual spatial-temporal feature for the valid HOI pair to be used in activity detection.

Owner:FUTUREWEI TECH INC

System and method for reducing VOIP (voice over internet protocol) communication resource overhead

Active | CN105321525A | Reduce computational complexity | Reduce overhead | Speech recognition | Internet protocol suite | Communication quality

The invention discloses a system for reducing VOIP (voice over internet protocol) communication resource overhead, comprising an input layer, a convolution layer, a sampling sub-layer and an output layer, each layer being composed of characteristic spectra and each characteristic spectrum containing neurons. A method of using the system to reduce VOIP communication resource overhead specifically includes: 1, training a convolutional neural network; 2, initializing the convolutional neural network; 3, inputting the voice to be detected into a VAD (voice activity detection) system; 4, extracting the MFCC voice characteristic parameters and their first-order differential characteristic parameters from each frame in order; 5, composing the parameters of each frame into a one-dimensional characteristic map that is fed into the convolutional neural network system; 6, the convolutional neural network system outputting in order a result [x, y] for each frame of the voice to be detected, and the VAD system making a judgment and recording the results. The advantages of the system and method are that the convolutional neural network system is used for detection in the VAD system, the misjudgment rate of the VAD system is reduced, calculation time and bandwidth are saved, and VOIP voice resource overhead can be reduced on the premise of ensuring communication quality.

Owner:BEIJING UNIV OF POSTS & TELECOMM

Acoustic echo devices and methods

Active | US20060018458A1 | Two-way loud-speaking telephone systems | Computer science | Speech sound

Hands-free phones with voice activity detection using a comparison of frame power estimate with an adaptive frame noise power estimate, automatic gain control with fast adaptation and minimal speech distortion, echo cancellation updated in the frequency domain with stepsize optimization and smoothed spectral whitening, and echo suppression with adaptive talking-state transitions.

Owner:TEXAS INSTR INC

Voice acquiring method and device adopting plurality of microphones

Active | CN108597498A | Speech recognition | Computer science | Speech sound

The invention provides a voice acquiring method and a voice acquiring device adopting a plurality of microphones. The method comprises the following steps: carrying out voice acquiring by adopting the plurality of microphones, wherein the microphones correspond to different voice acquiring channels, and thus voice signals of each voice acquiring channel are obtained; carrying out analog-digital conversion on the voice signals, thus obtaining voice digital signals; carrying out framing processing on PCM binary data of the voice digital signals, thus obtaining short-time stable audio signals corresponding to each frame of PCM binary data; carrying out voice activity detection on the short-time stable audio signals in sequence according to the frames, and determining the frames corresponding to the short-time stable audio signals as voice frames or non-voice frames; carrying out voice quality detection on fragment audio files corresponding to the voice frames by adopting the preset frame number as the step size, and saving the fragment audio files with the qualified quality; and splicing the saved fragment audio files with the qualified quality for synthesizing the complete audio file.

Owner:SPEAKIN TECH CO LTD

Voice activity detection apparatus and method

Inactive | CN101080765A | Speech analysis | Statistical model | Speech sound

A voice activity detection method comprising the steps of (a) Estimating in a noise power estimator the noise power within a signal having a speech component and a noise component, and (b) Calculating a likelihood ratio for the presence of speech in the signal from the estimated power of noise signals from step (a) and a complex Gaussian statistical model.

Owner:KK TOSHIBA
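
For orientation, a likelihood ratio under a complex Gaussian model, as mentioned in the abstract above, is often written in the following textbook form. This is a sketch under stated assumptions and not necessarily the formula claimed in the patent: the symbols $X_k$ (the $k$-th spectral coefficient of a frame), $\lambda_{N,k}$ (noise variance, supplied by the noise power estimator) and $\lambda_{S,k}$ (speech variance) are introduced here for illustration only.

```latex
\begin{align}
H_0 &: X_k \sim \mathcal{CN}(0,\ \lambda_{N,k})
  && \text{(noise only)} \\
H_1 &: X_k \sim \mathcal{CN}(0,\ \lambda_{N,k} + \lambda_{S,k})
  && \text{(speech plus noise)} \\
\Lambda_k &= \frac{p(X_k \mid H_1)}{p(X_k \mid H_0)}
  = \frac{1}{1 + \xi_k}\,
    \exp\!\left(\frac{\gamma_k\,\xi_k}{1 + \xi_k}\right),
  \qquad
  \xi_k = \frac{\lambda_{S,k}}{\lambda_{N,k}},\quad
  \gamma_k = \frac{|X_k|^2}{\lambda_{N,k}}
\end{align}
```

Here $\xi_k$ and $\gamma_k$ are the a priori and a posteriori signal-to-noise ratios. A frame-level decision is typically obtained by averaging $\log \Lambda_k$ over the frequency bins and comparing the result with a threshold.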

Audio signal segmentation algorithm

Inactive | US20070271093A1 | Convenience to work | Improve abilities | Speech recognition | Feature extraction | Audio signal flow

The present invention discloses an audio signal segmentation algorithm comprising the following steps. First, an audio signal is provided. Then, an audio activity detection (AAD) step is applied to divide the audio signal into at least one noise segment and at least one noisy audio segment. Then, an audio feature extraction step is used on the noisy audio segment to obtain multiple audio features. Then, a smoothing step is applied. Then, multiple speech frames and multiple music frames are discriminated. The speech frames and the music frames compose at least one speech segment and at least one music segment. Finally, the speech segment and the music segment are segmented from the noisy audio segment.

Owner:NAT CHENG KUNG UNIV

Echo reduction system

Active | US20080031467A1 | Quality improvement | Reduce echo | Interconnection arrangements | Public address systems | Signal on | Microphone signal

The present invention relates to a method for reducing an echo in a microphone signal generated by a microphone, comprising echo compensating the microphone signal by subtracting an estimated echo signal from the microphone signal to generate an echo compensated signal, detecting a speech activity of a local speaker on the basis of the microphone signal and the estimated echo signal and suppressing a residual echo in the echo compensated signal on the basis of the detected speech activity to obtain an output signal. The invention further relates to a system for processing a microphone signal generated by a microphone, comprising echo compensation filtering means configured to receive and echo compensate the microphone signal to output an echo compensated signal based on the received microphone signal, a speech activity detection means configured to detect speech activity of a local speaker by receiving and analyzing the echo compensated signal and to output a detection signal and a residual echo suppressing means configured to receive the detection signal and to receive and filter the echo compensated signal on the basis of the detection signal to output an output signal.

Owner:CERENCE OPERATING CO

Intelligent voice mixing method and device for multi-party voice communication

Active | CN104539816A | Improve execution efficiency | Improve clarity | Hybrid switching systems | Special service for subscribers | Voice communication | CLARITY

The invention discloses an intelligent voice mixing method and device for multi-party voice communication, and belongs to the technical field of multimedia. The method comprises the following steps: during voice communication, the current frame data of all active voice channels except the home terminal are obtained; the voice activity detection results of the current frame data of all the active voice channels and the short-time average energy of all the active voice channels are obtained; the voice channels to be mixed are selected according to the voice activity detection results of the current frame data of all the active voice channels, the short-time average energy of all the active voice channels, the number of voice channels with effective voice and the gating identifiers corresponding to all the active voice channels; superposition voice mixing is performed on the current frame data of the selected voice channels, and the resulting mixed voice data are output. With the intelligent voice mixing method and device, the noise generated in multi-party voice communication is reduced, the clarity of voice in multi-party voice communication is improved, and the execution efficiency of multi-party voice communication is improved.

Owner:GUANGZHOU HUADUO NETWORK TECH
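
The channel selection and superposition mixing described in the voice mixing abstract above can be sketched as follows. The concrete selection rule (keep channels that are voice-active and gated in, then take the loudest `max_mixed` of them), the 16-bit clipping, and all names are assumptions for illustration; the abstract lists the inputs to the selection but not the exact rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChannelFrame:
    samples: List[int]   # current frame of 16-bit PCM samples
    vad_active: bool     # voice activity detection result for this frame
    avg_energy: float    # short-time average energy of the channel
    gated: bool          # gating identifier: True if the channel may be mixed


def select_channels(frames, max_mixed=3):
    """Pick the channels to mix (assumed rule: active, gated, loudest first)."""
    candidates = [f for f in frames if f.vad_active and f.gated]
    candidates.sort(key=lambda f: f.avg_energy, reverse=True)
    return candidates[:max_mixed]


def mix(frames, max_mixed=3):
    """Superpose the selected channels sample by sample, clipping to 16 bits."""
    selected = select_channels(frames, max_mixed)
    if not selected:
        return []
    length = min(len(f.samples) for f in selected)
    return [
        max(-32768, min(32767, sum(f.samples[i] for f in selected)))
        for i in range(length)
    ]


if __name__ == "__main__":
    frames = [
        ChannelFrame([1000, 2000], True, 5.0, True),
        ChannelFrame([30000, -30000], True, 9.0, True),
        ChannelFrame([5, 5], False, 1.0, True),    # no voice: skipped
        ChannelFrame([7, 7], True, 20.0, False),   # gated out: skipped
    ]
    print(mix(frames, max_mixed=2))
```

Dropping inactive channels before summation is what keeps background noise from silent participants out of the mix.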

Voiceprint identification method and system

Inactive | CN108766445A | Sharp angular boundaries | Improve accuracy | Speech analysis | Pattern recognition | Network model

The invention provides a voiceprint identification method and system. The method comprises the following steps: features of voiced frames in a training corpus set are extracted through voice activity detection (VAD); inter-class angle boundaries of the features of the voiced frames are expanded based on the A-softmax loss function, and the intra-class angle of the features of the voiced frames is limited, to train a neural network model; deep voiceprint features of a to-be-registered target are determined according to the trained neural network model, and the to-be-registered target and the deep voiceprint features are registered in a voiceprint database; the deep voiceprint features of the to-be-registered target are determined according to the trained neural network model; identification is carried out according to the similarity between each deep voiceprint feature in the voiceprint database and the deep voiceprint feature of the to-be-registered target. The invention further provides a voiceprint identification system. The method has the advantage that the A-softmax loss function limits the intra-class angle, so that the embedding vectors of different classes are separated by clear angle boundaries, which improves discriminability and identification accuracy.

Owner:AISPEECH CO LTD
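
The final identification step in the voiceprint abstract above, comparing a query embedding with every registered embedding, can be sketched as below. Cosine similarity is an assumption (the abstract only says "similarity"), although it is a natural fit for embeddings trained with an angular-margin loss. The `threshold` parameter and the function names are also hypothetical, and the neural network that produces the embeddings is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(database, query, threshold=0.5):
    """Return (best_matching_name_or_None, best_score).

    database maps a registered speaker name to its embedding vector.
    """
    best_name, best_score = None, -1.0
    for name, embedding in database.items():
        score = cosine_similarity(embedding, query)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


if __name__ == "__main__":
    db = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
    print(identify(db, [0.9, 0.1]))    # closest to alice
    print(identify(db, [-1.0, -1.0]))  # no registered speaker is close enough
```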

Voice activity detection based on far-end and near-end statistics

Inactive | US7263074B2 | Improve the level of | Lower level | Broadband local area networks | Time-division multiplex | Communications system | Proximal point

Methods and apparatus for managing a communication system, wherein a decision regarding the level of activity at a first end is made based at least in part on the level of activity at a second end. In one embodiment, the energy level of a first-end audio signal is measured. The first end is declared voice-active if the first-end energy level is greater than or equal to a first threshold value, and voice-inactive if it is less than the first threshold value. To set the first threshold value, the energy level of a second-end audio signal is measured. If the second-end energy level is greater than or equal to a second threshold value, the second end is declared voice-active, in which case the first threshold is maintained at a relatively high level. If the second-end energy level is less than the second threshold value, the second end is declared voice-inactive, in which case the first threshold is maintained at a relatively lower level.

Owner:AVAGO TECH WIRELESS IP SINGAPORE PTE
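
The two-threshold decision in the far-end and near-end abstract above maps directly onto a few lines of code. This is a minimal sketch; the class name, field names, and numeric values are illustrative assumptions, and the energy measure (mean squared amplitude) is one common choice that the abstract does not prescribe.

```python
from dataclasses import dataclass


def frame_energy(samples):
    """Mean squared amplitude of a frame (assumed energy measure)."""
    return sum(s * s for s in samples) / len(samples)


@dataclass
class FarEndAwareVad:
    second_end_threshold: float  # the "second threshold" from the abstract
    high_threshold: float        # first threshold while the second end is active
    low_threshold: float         # first threshold while the second end is inactive

    def first_end_active(self, first_energy, second_energy):
        """Declare the first end voice-active or voice-inactive."""
        second_active = second_energy >= self.second_end_threshold
        first_threshold = self.high_threshold if second_active else self.low_threshold
        return first_energy >= first_threshold


if __name__ == "__main__":
    vad = FarEndAwareVad(second_end_threshold=1.0, high_threshold=4.0, low_threshold=2.0)
    # Same first-end energy, different outcome depending on second-end activity.
    print(vad.first_end_active(first_energy=3.0, second_energy=0.5))
    print(vad.first_end_active(first_energy=3.0, second_energy=1.5))
```

Raising the first-end threshold while the other end is talking makes it less likely that echo of the second-end speech is mistaken for first-end voice.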

Multichannel voice detection in adverse environments

Inactive | CN1679083A | Speech analysis | Frequency spectrum | Fourier transform on finite groups

A multichannel source activity detection system, e.g., a voice activity detection (VAD) system, and method that exploits spatial localization of a target audio source is provided. The method includes the steps of receiving a mixed sound signal by at least two microphones (102, 104); Fast Fourier transforming each received mixed sound signal into the frequency domain (110); filtering the transformed signals to output a signal corresponding to a spatial signature of a source (120); summing an absolute value squared of the filtered signal over a predetermined range of frequencies (122); and comparing the sum to a threshold to determine if a voice is present (124). Additionally, the filtering step includes multiplying the transformed signals by an inverse of a noise spectral power matrix (132), a vector of channel transfer function ratios (130), and a source signal spectral power (128).

Owner:SIEMENS AG
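
The detection pipeline in the multichannel abstract above (transform each channel, apply a spatial filter, sum the squared magnitude over a frequency range, compare with a threshold) can be sketched as follows. The per-bin filter weights are taken as a given input here; building them from the inverse noise spectral power matrix, the channel transfer function ratios, and the source spectral power is not shown. A plain DFT stands in for the FFT to keep the example dependency-free, and all names are assumptions.

```python
import cmath


def dft(frame):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def multichannel_vad(frames, weights, bins, threshold):
    """Return (voice_present, statistic) for one frame per microphone.

    frames:  one list of time-domain samples per microphone
    weights: weights[k][c] is the (hypothetical) filter weight for bin k, channel c
    bins:    iterable of frequency bin indices to sum over
    """
    spectra = [dft(frame) for frame in frames]
    statistic = 0.0
    for k in bins:
        # Spatially filtered output for this bin.
        y = sum(
            complex(weights[k][c]).conjugate() * spectra[c][k]
            for c in range(len(frames))
        )
        statistic += abs(y) ** 2
    return statistic > threshold, statistic


if __name__ == "__main__":
    n = 8
    tone = [cmath.exp(2j * cmath.pi * t / n).real for t in range(n)]
    weights = [[0.5, 0.5] for _ in range(n)]  # simple averaging of two channels
    print(multichannel_vad([tone, tone], weights, range(1, 3), threshold=1.0))
    print(multichannel_vad([[0.0] * n, [0.0] * n], weights, range(1, 3), threshold=1.0))
```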
