
81 results about "Speaker recognition system" patented technology

System and method for speaker recognition on mobile devices

A speaker recognition system for authenticating a mobile device user includes an enrollment and learning software module, a voice biometric authentication software module, and a secure software application. Upon request by a user of the mobile device, the enrollment and learning software module displays text prompts to the user, receives speech utterances from the user, and produces a voice biometric print. The enrollment and learning software module determines when a voice biometric print has met at least a quality threshold before storing it on the mobile device. The secure software application prompts a user requiring authentication to repeat an utterance based at least on an attribute of a selected voice biometric print, receives a corresponding utterance, requests the voice biometric authentication software module to verify the identity of the user using the utterance, and, if the user is authenticated, imports the voice biometric print.
Owner:CIRRUS LOGIC INC

Speaker recognition system

A method automatically recognizes speech received through an input. The method accesses one or more speaker-independent speaker models. The method detects whether the received speech input matches a speaker model according to an adaptable predetermined criterion. The method creates a speaker model assigned to a speaker model set when no match occurs based on the input.
Owner:NUANCE COMM INC
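
The match-or-create behaviour described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how speakers are represented or scored, so fixed-length vectors, cosine similarity, the `threshold` parameter (standing in for the "adaptable predetermined criterion") and the function names are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recognize_or_enroll(models, embedding, threshold=0.8):
    """Return (speaker_id, created).

    `models` maps a speaker id to a reference vector (hypothetical layout).
    If no stored model scores at or above `threshold`, a new model is
    created from the input and added to the model set.
    """
    best_id, best_score = None, -1.0
    for speaker_id, reference in models.items():
        score = cosine(reference, embedding)
        if score > best_score:
            best_id, best_score = speaker_id, score
    if best_id is not None and best_score >= threshold:
        return best_id, False
    new_id = f"speaker_{len(models)}"
    models[new_id] = list(embedding)
    return new_id, True
```

Raising or lowering `threshold` trades off between creating duplicate models for one speaker and merging different speakers into one model.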

Methods and apparatus for statistical biometric model migration

In large-scale deployments of speaker recognition systems, the potential for legacy problems increases, as the evolving technology may require configuration changes in the system, thus invalidating existing user voice accounts. Unless the entire database of original speech waveforms were stored, users would need to re-enroll to keep their accounts functional, which may be expensive and commercially unacceptable. Model migration is defined as a conversion of obsolete models to new-configuration models without additional data and waveform requirements. The present disclosure investigates ways to achieve such a migration with minimum loss of system accuracy.
Owner:NUANCE COMM INC

Speaker recognition system and method

The invention discloses a speaker recognition system and a speaker recognition method. The speaker recognition system comprises a characteristic extraction unit, a background model generation unit, a registered speaker model generation unit, a metric value calculation unit and a recognition unit, wherein the characteristic extraction unit is configured to extract a characteristic vector of speech data of a speaker; the background model generation unit is configured to perform internal clustering on the characteristic vector of the speech data of a background speaker and generate a universal background model aiming at a normal speaker according to the result of the internal clustering; the registered speaker model generation unit is configured to adapt the universal background model by using the characteristic vector of the speech data of each registered speaker so as to generate a registered speaker model of each registered speaker; the metric value calculation unit is configured to calculate metric values of the characteristic vector of a tested speaker on the universal background model generated by the background model generation unit and on the registered speaker model of each registered speaker, which is generated by the registered speaker model generation unit; and the recognition unit is configured to recognize the tested speaker according to the metric values calculated by the metric value calculation unit.
Owner:SONY CORP
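
The metric value calculation and recognition steps in the abstract above resemble the common GMM-UBM log-likelihood ratio, which can be sketched as below. This is an assumption about the kind of metric involved, not the patent's stated formula. The sketch uses scalar features and one-dimensional Gaussian mixtures for brevity, takes the speaker models as already adapted from the background model, and all names are hypothetical.

```python
import math

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of scalar frames under a 1-D GMM.

    `gmm` is a list of (weight, mean, variance) tuples.
    """
    total = 0.0
    for x in frames:
        density = sum(
            w * math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
            for w, mu, var in gmm
        )
        total += math.log(density)
    return total / len(frames)

def identify(frames, ubm, speaker_models):
    """Score each registered model against the UBM; return (best_id, llr)."""
    ubm_ll = gmm_log_likelihood(frames, ubm)
    scores = {
        speaker_id: gmm_log_likelihood(frames, gmm) - ubm_ll
        for speaker_id, gmm in speaker_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A positive ratio means the test frames fit the registered speaker model better than the background model; a real system would compare it against a tuned threshold before accepting.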

Backend i-vector enhancement method for speaker recognition system

The invention discloses a backend i-vector enhancement method for a speaker recognition system. The method is based on a deep neural network; and a backend i-vector regression model for the speaker recognition system is built on the basis of the application of the deep neural network in speech enhancement, and a backend feature processor applicable to the speaker recognition system is obtained. Compared with a conventional front-end speech enhancement algorithm, the backend i-vector enhancement method of the invention can improve the anti-noise performance of the speaker recognition system and optimize the structure model of the speaker recognition system, so that the practicability of the speaker recognition system in a noise environment can be effectively improved.
Owner:NANJING UNIV OF POSTS & TELECOMM

Long window scaling factor-based playback voice attack detection algorithm

The invention discloses a long window scaling factor-based playback voice attack detection algorithm, developed to solve the problem that the rights and interests of a legitimate user are damaged when attackers use playback voice to enter a speaker identification system. The detection algorithm can be used for effectively identifying playback voice from different sources and is high in detection accuracy; after a module of the detection algorithm is loaded to a GMM-UBM system, playback voice attack resistant capability is improved, error probability of the identification system and the like is lowered by 32%, and a safety problem of the identification system is greatly alleviated.
Owner:NINGBO UNIV

Record replay attack detection method and system based on channel mode noise

The invention relates to the technical field of intelligent voice signal processing, mode recognition and artificial intelligence and in particular relates to a record replay attack detection method and system in a speaker recognition system based on a channel mode noise. The invention discloses a simpler and more efficient record replay attack detection method in a speaker recognition system. The method comprises the following steps: (1) inputting a to-be-recognized voice signal; (2) pre-processing the voice signal; (3) extracting the channel mode noise in the pre-processed voice signal; (4) extracting a long time statistic feature based on the channel mode noise; and (5) classifying the long time statistic feature according to a channel noise classifying judging model. By using the channel mode noise to perform the record replay attack detection, the extracted feature dimension is low, the computation complexity is low, and the recognition error rate is low, therefore, the safety performance of the speaker recognition system is greatly improved, and the method and system provided by the invention can be used in the reality more easily.
Owner:SOUTH CHINA UNIV OF TECH

Speaker recognition method, equipment and system

The embodiment of the invention relates to a speaker recognition method, speaker recognition equipment and a speaker recognition system. The speaker recognition method comprises the following steps of: receiving a speaker confirmation instruction which is sent by a media gateway controller; performing speaker confirmation operation according to the speaker confirmation instruction, and acquiring a result of the speaker confirmation operation; and then reporting the result of the speaker confirmation operation to the media gateway controller. A media gateway of the embodiment of the invention performs the speaker confirmation operation according to the speaker confirmation instruction which is sent by the media gateway controller, and then reports the result of the speaker confirmation operation to the media gateway controller so as to realize speaker recognition through a media gateway control protocol in a separated framework.
Owner:HUAWEI TECH CO LTD

Method and apparatus for active speaker selection using microphone arrays and speaker recognition

A method and apparatus for performing active speaker selection in teleconferencing applications illustratively comprises a microphone array module, a speaker recognition system, a user interface, and a speech signal selection module. The microphone array module separates the speech signal from each active speaker from those of other active speakers, providing a plurality of individual speaker's speech signals. The speaker recognition system identifies each currently active speaker using conventional speaker recognition / identification techniques. These identities are then transmitted to a remote teleconferencing location for display to remote participants via a user interface. The remote participants may then select one of the identified speakers, and the speech signal selection module then selects for transmission the speech signal associated with the selected identified speaker, thereby enabling the participants at the remote location to listen to the selected speaker and neglect the speech from other active speakers.
Owner:WSOU INVESTMENTS LLC

Speaker recognition method based on three-dimensional convolutional neural network text independence and system

The invention discloses a speaker recognition system based on three-dimensional convolutional neural network text independence. The speaker recognition system comprises a module I, namely a voice acquisition module, a module II, namely a voice preprocessing module, a module III, namely a speaker recognition model training module, and a module IV, namely a speaker recognition module, wherein the voice acquisition module is used for acquiring voice data; the voice preprocessing module is used for extracting mel-frequency cepstrum coefficient characteristics of original voice data and for removing non-voice data from the characteristics, so that the final training data are acquired; the speaker recognition model training module is used for training the speaker recognition model off-line; and the speaker recognition module is used for recognizing the identity of a speaker in real time. The invention further discloses a speaker recognition method based on three-dimensional convolutional neural network text independence. By adopting the speaker recognition method and the speaker recognition system based on three-dimensional convolutional neural network text independence, the purpose that registration of a user is independent from a recognized text is achieved, and thus the user experience can be improved.
Owner:SICHUAN CHANGHONG ELECTRIC CO LTD

Method for quickly recognizing speaker

The invention provides a method for quickly recognizing a speaker and belongs to speaker recognition methods. The method comprises the following steps of: combining a Gaussian mixture model, and taking the supervector of the Gaussian mixture model as the feature parameter of the speaker; taking the supervector of the Gaussian mixture model as input, designing a one-class support vector machine classifier; and training N classifiers corresponding to N speakers, thus obtaining a voice sample of one speaker from one classifier. By utilizing the method, the speaker recognition speed is increased; for every new registered speaker, only one one-class support vector machine classifier is trained for the new speaker, so that the speaker recognition system has good extensibility.
Owner:JILIN UNIV

Recording attack prevention voiceprint recognition method and device and access control system

CN108039176A (Active) | Real voice | Realize the judgment of the recording | Speech analysis | Individual entry/exit registers | Feature vector | Speaker recognition system
The invention discloses a recording attack prevention voiceprint recognition method and device and an access control system. The method comprises the following steps of obtaining an audio to be detected; extracting a first MFCC feature vector, a first GFCC feature vector and a first CQCC feature vector of the audio to be detected; synthesizing the first MFCC feature vector, the first GFCC feature vector and the first CQCC feature vector; obtaining the first acoustic feature vector of the audio to be detected; performing matching degree comparison on the first acoustic feature vector, a recording acoustic feature model obtained through the training by an SVM classifier in the preset training template base and a real voice acoustic feature model; judging whether the first matching degree of the first acoustic feature vector and the recording acoustic feature vector model is greater than or equal to the second matching degree of the first acoustic feature and the real voice acoustic feature vector model or not; if so, judging that the audio to be detected is the recording audio; if not, judging that the audio to be detected is the real voice audio. The technical problems that the existing speaker recognition system has low voice recognition accuracy and relies on the specific text are solved.
Owner:SPEAKIN TECH CO LTD
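
The fusion and decision steps in the abstract above can be sketched as follows. The feature extractors and the SVM-trained models are not reproduced here: the three input vectors are assumed to be precomputed, "synthesizing" is assumed to mean concatenation, and the two scoring callables are hypothetical stand-ins for the recording and real-voice acoustic feature models.

```python
def fuse(mfcc, gfcc, cqcc):
    """Concatenate the three per-utterance feature vectors into one."""
    return list(mfcc) + list(gfcc) + list(cqcc)

def is_recording(feature, recording_score, real_voice_score):
    """Apply the decision rule stated in the abstract.

    The audio is judged to be a recording when its matching degree against
    the recording model is greater than or equal to its matching degree
    against the real-voice model. Both scorers are caller-supplied.
    """
    return recording_score(feature) >= real_voice_score(feature)
```

Note that the rule resolves ties in favour of "recording", which is the conservative choice for an access control system.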

Text-related speaker recognition method based on infinite-state hidden Markov model

The invention discloses a text-related speaker recognition method based on an infinite-state hidden Markov model, which can be used for solving the problem that overfitting or underfitting data is easily generated in the traditional hidden Markov model. The text-related speaker recognition method disclosed by the invention comprises the following steps of: firstly, carrying out preprocessing and feature extraction on a voice signal set for training; then, describing the set for training in a training process by adopting the infinite-state hidden Markov model, wherein the model has an infinite state number before training data arrives and an output probability distribution function corresponding to each state is expressed by using a student's t mixed model; after the training data arrives, calculating to obtain a parameter value in the model and the distribution condition of random variables; and during recognition, calculating a likelihood value related to each trained speaker model on the basis of recognizable voices subjected to the processing and feature extraction, wherein a speaker corresponding to the maximal likelihood value is used as a recognition result. The method disclosed by the invention can be used for effectively improving the recognition accuracy rate of a text-related speaker recognition system, and in addition, the text-related speaker recognition system has better robustness for noises.
Owner:NANJING UNIV OF POSTS & TELECOMM

Speaker identifying method and system

The invention discloses a speaker identifying method and a system. The speaker identifying method comprises the following steps of: establishing a universal background model; establishing a to-be-identified speaker module; and using a training voice signal of the speaker to identify the speaker. Compared with the prior art, the invention has the following advantages that the high-performance speaker identifying system is disclosed by combining model space transformation and characteristic space transformation, and personal pronunciation features of the speaker are comprehensively reflected by the transformation of the two spaces. The transformation of the two spaces is calculated by using a self-adaptive algorithm based on the universal background model, so that good stability is achieved. Compared with the way of identifying the speaker by independently adopting model space transformation in the prior art, the identification rate of the system is greatly improved. Meanwhile, the system is more stable, and not easy to imitate.
Owner:镇江每科电子科技有限公司

Confidence levels for speaker recognition

The present invention relates to a system and method of making a verification decision within a speaker recognition system. A speech sample is gathered from a speaker over a period of time, and a verification score is then produced for said sample over the period. Once the verification score is determined, a confidence measure is produced based on frame score observations from said sample over the period, the confidence measure being calculated based on the standard Gaussian distribution. If the confidence measure indicates with a set level of confidence that the verification score is below the verification threshold, the speaker is rejected and the gathering process is terminated.
Owner:TORQX +1
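
One way to picture the early-rejection rule in the abstract above is a one-sided Gaussian bound on the mean frame score. The sketch below is an illustration under assumptions: the abstract does not give the exact confidence formula, frame scores are treated as independent (they are usually correlated in practice), and the function name and default confidence level are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def early_reject(frame_scores, threshold, confidence=0.99):
    """Return True when the mean frame score is below `threshold`
    with at least `confidence`, using a one-sided Gaussian bound."""
    n = len(frame_scores)
    if n < 2:
        return False  # not enough observations to estimate the spread
    standard_error = stdev(frame_scores) / n ** 0.5
    z = NormalDist().inv_cdf(confidence)
    upper_bound = mean(frame_scores) + z * standard_error
    return upper_bound < threshold
```

Calling this after each new batch of frames lets the system stop gathering speech as soon as rejection is statistically safe, rather than waiting for the full sample.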

Speaker recognition method based on Gaussian super vector and deep neural network

The invention discloses a speaker recognition method based on a Gaussian super vector and a deep neural network. The method comprises a speaker feature extraction stage, a deep neural network design stage, and a speaker identification and decision-making stage. According to the invention, the deep neural network is fused with a speaker recognition system model, and the obvious effect of a multilayer structure combining the Gaussian super vector and the deep neural network in the aspect of improving the characterization capability of an evaluation model is achieved. The speaker recognition method provided by the invention can effectively improve the recognition performance of a system in the environment of background noise, reduces the influence of the noise on the system performance, improves the robustness of the system noise, optimizes the system structure, and improves the competitiveness of a corresponding speaker recognition product.
Owner:HUBEI UNIV OF TECH

Dual-mode voice identity recognition method

The invention discloses a dual-mode voice identity recognition method which is applied to an identity recognition system comprising a sound acquisition device and an information processing device. The system also comprises a voice password feature library and a vocal print feature library. Password recognition and vocal print recognition are integrated in one identity recognition system. The technical effects of the dual-mode voice identity recognition method are that: the invention provides the dual-mode voice identity recognition method based on isolated word recognition, i.e. password recognition, and speaker recognition, i.e. vocal print recognition so that stability of a distinguishing system performing vocal print feature recognition of single speaker is greatly enhanced, practical value of similar systems is increased and security of the recognition system is enhanced under the premise of not increasing calculation amount. With dual judgment, a defect of mis-judgment of a speaker recognition system caused by simulation can be overcome to some extent, and the effect that passwords of a single voice password distinguishing system are liable to be leaked can also be overcome.
Owner:西安远声电子科技有限公司

Emotional speaker identification method based on reliability detection of fuzzy support vector machine

The invention discloses an emotional speaker identification method based on reliability detection of fuzzy support vector machine, which comprises the following steps of: extracting speech component characteristics and combining the speech component characteristics with corresponding weight in a universal background model (UBM) to form background model component characteristics; taking the obtained background model component characteristics as a fuzzy membership, and establishing a fuzzy support vector machine model in a general model component; carrying out a reliability detection by using the fuzzy support vector machine model to obtain the reliability characteristics; and calculating the reliability characteristics and identifying the speaker. The method provided by the invention improves the robustness of the speaker identification system and the performance for identifying a speaker.
Owner:ZHEJIANG UNIV

Speaker recognition systems

Speaker recognition (identification and / or verification) methods and systems, in which speech models for enrolled speakers consist of sets of feature vectors representing the smoothed frequency spectrum of each of a plurality of frames and a clustering algorithm is applied to the feature vectors of the frames to obtain a reduced data set representing the original speech sample, and wherein the adjacent frames are overlapped by at least 80%. Speech models of this type model the static components of the speech sample and exhibit temporal independence. An identifier strategy is employed in which modelling and classification processes are selected to give a false rejection rate substantially equal to zero. Each enrolled speaker is associated with a cohort of a predetermined number of other enrolled speakers and a test sample is always matched with either the claimed identity or one of its associated cohort. This makes the overall error rate of the system dependent only on the false acceptance rate, which is determined by the cohort size. The false error rate is further reduced by use of multiple parallel modelling and / or classification processes. Speech models are normalised prior to classification using a normalisation model derived from either the test speech sample or one of the enrolled speaker samples (most preferably from the claimed identity enrolment sample).
Owner:SECURIVOX
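
The cohort-based decision in the abstract above can be sketched as a closed-set match over the claimed identity plus its cohort. This is a minimal sketch: the distance function, the data layout of `cohorts`, and the function name are assumptions, and the modelling, normalisation and parallel classification steps are omitted.

```python
def verify_with_cohort(claimed_id, cohorts, distance_to):
    """Accept the claim only if the claimed speaker is the closest model
    among the claimed speaker and that speaker's cohort.

    `cohorts` maps a speaker id to its list of cohort speaker ids.
    `distance_to(speaker_id)` returns the distance between the test
    sample and that speaker's model (smaller is closer).
    """
    candidates = [claimed_id] + list(cohorts[claimed_id])
    best = min(candidates, key=distance_to)
    return best == claimed_id
```

Because the test sample is always assigned to some candidate, a genuine speaker is rejected only when a cohort member matches better. If an impostor were equally likely to land nearest any candidate, the chance of a false acceptance would be 1/(K+1) for a cohort of K; that uniformity is an idealization, not a figure from the abstract.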

Speaker identification method based on deep stack autoencoder network

The invention relates to a speaker identification method based on the deep stack autoencoder network. The method comprises the steps of S1, speaker feature extraction; S2, stack autoencoder network design; and S3, speaker identification and decision making. Compared with traditional speaker identification, the method has the following advantages: the deep stack autoencoder network is fused with a speaker identification system model, and the multi-layer structure of a stack autoencoder improves the characterization ability of an evaluation model, so that system identification performance in the presence of background noise can be effectively improved, the influence of the noise on the system performance is reduced, system noise robustness is improved, the system structure is optimized, and identification timeliness is effectively enhanced.
Owner:HUBEI UNIV OF TECH

Speaker recognition method through emotional model synthesis based on neighbors preserving principle

A speaker recognition method through emotional model synthesis based on Neighbors Preserving Principle is disclosed. The method includes the following steps: (1) training the reference speaker's and user's speech models; (2) extracting the neutral-to-emotion transformation / mapping sets of GMM reference models; (3) extracting the emotion reference Gaussian components mapped by or corresponding to several Gaussian neutral reference Gaussian components close to the user's neutral training Gaussian component; (4) synthesizing the user's emotion training Gaussian component and then synthesizing the user's emotion training model; (5) synthesizing all user's GMM training models; (6) inputting test speech and conducting the identification. This invention extracts several reference speeches similar to the neutral training speech of a user from a speech library by employing neighbor preserving principles based on KL divergence and combines an emotion training speech of the user using the emotion reference speech in the reference speech, improving the performance of the speaker recognition system in the situation where the training speech and the test speech are mismatched, and the robustness of the speaker recognition system is increased.
Owner:ZHEJIANG UNIV

Speaker gender automatic recognition method and system based on deep self-coding network

The invention belongs to the technical field of vocal print recognition, and discloses a speaker gender automatic recognition method and system based on deep self-coding network. The method comprises the following steps: a universal background model (UBM) is trained using voice signals unrelated to the registered speakers and channel; i-vectors of the registration data are extracted; i-vectors of the testing data are extracted; a deep self-coding network is trained; pattern matching and recognition are conducted; and model evaluation is conducted. The method and the system have the advantages that the deep self-coding network is applied to speaker gender recognition, the powerful learning capability of the deep self-coding network is used for characterizing speaker features of different genders, the re-extraction of the features is achieved, feature dimension is reduced, and therefore the complexity of classification computation is reduced; the method can be further popularized and used for speaker recognition to try to improve the robustness of a speaker recognition system.
Owner:HUAZHONG NORMAL UNIV

Estimation of reliability in speaker recognition

A method for estimating the reliability of a result of a speaker recognition system concerning a testing audio and a speaker model, which is based on one, two, three or more model audios, the method using a Bayesian Network to estimate whether the result is reliable. In estimating the reliability of the result of the speaker recognition system one, two, three, four or more than four quality measures of the testing audio and one, two, three, four or more than four quality measures of the model audio(s) are used.
Owner:AGNITIO

Speaker identification method based on simple direct tolerance learning algorithm

The invention provides a speaker identification method based on a simple direct tolerance learning algorithm. The method comprises the following steps: acquiring voice samples of multiple speakers, extracting i-vectors of all the samples, performing channel compensation processing by use of an LDA or WCCN method, performing length normalizing, and forming a training sample set; according to the i-vectors of the training sample set and speaker identity, constructing a similar sample pair set and a non-similar sample pair set; by use of a KISS algorithm, obtaining a tolerance matrix by performing training on the similar sample pair set and the non-similar sample pair set; and, for two pieces of new voice, first extracting their i-vectors, carrying out the channel compensation processing by use of the LDA or WCCN method, performing the length normalizing, and then, by use of the previously calculated tolerance matrix, calculating a Mahalanobis distance between the two i-vectors and comparing it with a threshold, thereby determining whether the two pieces of new voice belong to the same speaker. According to the invention, the obtained Mahalanobis distance tolerance matrix can more truly reflect similarities and distinctions of a sample space so as to improve the performance of a speaker identification system.
Owner:JIANGXI NORMAL UNIV
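
The pairwise training and scoring steps in the abstract above can be sketched in two dimensions as follows. Several things here are assumptions rather than statements from the patent: the "tolerance matrix" is read as the metric matrix of KISS metric learning, computed as the inverse covariance of similar-pair differences minus the inverse covariance of non-similar-pair differences; the vectors are tiny 2-D stand-ins for i-vectors; and the LDA or WCCN channel compensation step is omitted.

```python
import math

def length_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def diff_covariance(pairs):
    """2x2 covariance of pairwise differences (zero mean assumed)."""
    c = [[0.0, 0.0], [0.0, 0.0]]
    for a, b in pairs:
        d = [a[0] - b[0], a[1] - b[1]]
        for i in range(2):
            for j in range(2):
                c[i][j] += d[i] * d[j] / len(pairs)
    return c

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; the matrix is assumed to be non-singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def kiss_metric(similar_pairs, dissimilar_pairs):
    """M = inv(cov_similar) - inv(cov_dissimilar).

    M is not guaranteed to be positive semi-definite in general;
    no projection step is applied in this sketch.
    """
    s = inverse_2x2(diff_covariance(similar_pairs))
    d = inverse_2x2(diff_covariance(dissimilar_pairs))
    return [[s[i][j] - d[i][j] for j in range(2)] for i in range(2)]

def metric_distance(a, b, m):
    """Quadratic form (a - b)^T M (a - b), compared against a threshold."""
    d = [a[0] - b[0], a[1] - b[1]]
    return sum(d[i] * m[i][j] * d[j] for i in range(2) for j in range(2))
```

Two utterances would be judged to come from the same speaker when `metric_distance` falls below a threshold tuned on held-out pairs; in a full pipeline, `length_normalize` would be applied to each i-vector before pairs are formed.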

Voice spoofing attack detection method based on voice signal spectrum characteristics and deep learning

The invention discloses a voice spoofing attack detection method based on voice signal spectrum characteristics and deep learning. After a microphone of the electronic equipment receives a voice signal, signal processing is carried out on the voice, and specific features are then extracted; finally, the labeled features are input into a classifier based on the deep convolutional neural network SE-ResNet for training, the trained classifier is adopted to carry out voice liveness detection on the voice signal to be detected, and a result indicating whether the voice comes from a live human speaker or from a replayed voice attack is output. According to the invention, the voice spoofing attack represented by the replay attack for the speaker recognition system can be accurately and effectively detected.
Owner:ZHEJIANG UNIV

Method for reducing error identification rate of text irrelevant speaker identification system

CN102237089A (Active) | Reduce false recognition rate | Improve binding | Speech analysis | Female group | Male group
The invention discloses a method for reducing error identification rate of a text irrelevant speaker identification system, and relates to a method for reducing error identification rate of a speaker identification system. The method solves the problem that the error identification rate of the conventional text irrelevant speaker identification system is increased in open set test. The method comprises the following steps of: dividing speakers in a closed set into a male group and a female group by using an identification threshold value of known speakers in the closed set acquired by a reference speaker identification system, dividing the male group and the female group into a plurality of small groups in a form of threshold value subsection, and finding out the central distribution of each small group; adding a coarse screening module at the front end of the reference speaker identification system, judging the gender of tested voice, and then comparing the voice to be tested with the central distribution of the small group of the same gender to obtain a probability threshold value of the voice to be tested; and performing identification by using the voice frame of the probability threshold value. According to the method, the identification accuracy rate is improved by 2 to 3 percent compared with the conventional system, and the method can be used for the text irrelevant speaker identification system.
Owner:哈尔滨工业大学高新技术开发总公司