81 results about "Speaker recognition system" patented technology

Speaker recognition system and method

The invention discloses a speaker recognition system and a speaker recognition method. The speaker recognition system comprises a characteristic extraction unit, a background model generation unit, a registered speaker model generation unit, a metric value calculation unit and a recognition unit. The characteristic extraction unit is configured to extract a characteristic vector from the speech data of a speaker. The background model generation unit is configured to perform internal clustering on the characteristic vectors of the speech data of background speakers and to generate a universal background model for a normal speaker from the result of the internal clustering. The registered speaker model generation unit is configured to adapt the universal background model using the characteristic vectors of the speech data of each registered speaker, so as to generate a registered speaker model for each registered speaker. The metric value calculation unit is configured to calculate metric values of the characteristic vector of a tested speaker on the universal background model generated by the background model generation unit and on the registered speaker model of each registered speaker generated by the registered speaker model generation unit. The recognition unit is configured to recognize the tested speaker according to the metric values calculated by the metric value calculation unit.
Owner:SONY CORP
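The scoring and recognition steps described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which metric is used, so the sketch assumes diagonal-covariance Gaussian mixture models, an average log-likelihood as the metric value, and a decision based on the difference between each registered speaker's score and the universal background model (UBM) score. The function names and the `threshold` parameter are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def avg_loglik(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    gmm is a list of (weight, mean, var) tuples (assumed representation).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(comps)
        # log-sum-exp over mixture components for numerical stability
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)

def recognize(frames, ubm, speaker_models, threshold=0.0):
    """Return (best speaker or None, per-speaker score minus UBM score)."""
    ubm_score = avg_loglik(frames, ubm)
    scores = {
        name: avg_loglik(frames, gmm) - ubm_score
        for name, gmm in speaker_models.items()
    }
    best = max(scores, key=scores.get)
    # Assumption: reject the test speaker if no model beats the UBM by the threshold.
    return (best if scores[best] > threshold else None), scores

# Toy one-dimensional models, invented for illustration only.
ubm = [(0.5, [0.0], [4.0]), (0.5, [5.0], [4.0])]
models = {"alice": [(1.0, [0.0], [1.0])], "bob": [(1.0, [5.0], [1.0])]}
who, scores = recognize([[0.1], [-0.2], [0.3]], ubm, models)
```

With these toy models, frames near 0 score higher on the "alice" model than on the UBM, while frames far from every model fall below the UBM score and are rejected.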

Speaker recognition method based on three-dimensional convolutional neural network text independence and system

The invention discloses a speaker recognition system based on three-dimensional convolutional neural network text independence. The speaker recognition system comprises a module I, namely a voice acquisition module, a module II, namely a voice preprocessing module, a module III, namely a speaker recognition model training module, and a module IV, namely a speaker recognition module, wherein the voice acquisition module is used for acquiring voice data; the voice preprocessing module is used for extracting mel-frequency cepstrum coefficient features from the original voice data and for rejecting non-voice data in the features, so that the final training data are obtained; the speaker recognition model training module is used for training the speaker recognition model offline; and the speaker recognition module is used for recognizing the identity of a speaker in real time. The invention further discloses a speaker recognition method based on three-dimensional convolutional neural network text independence. With the speaker recognition method and system, user registration is independent of the text to be recognized, and thus the user experience can be improved.
Owner:SICHUAN CHANGHONG ELECTRIC CO LTD

Recording attack prevention voiceprint recognition method and device and access control system

Active | CN108039176A | Real voice | Realize the judgment of the recording | Speech analysis | Individual entry/exit registers | Feature vector | Speaker recognition system
The invention discloses a recording attack prevention voiceprint recognition method and device and an access control system. The method comprises the following steps of obtaining an audio to be detected; extracting a first MFCC feature vector, a first GFCC feature vector and a first CQCC feature vector of the audio to be detected; synthesizing the first MFCC feature vector, the first GFCC feature vector and the first CQCC feature vector; obtaining the first acoustic feature vector of the audio to be detected; performing matching degree comparison on the first acoustic feature vector, a recording acoustic feature model obtained through the training by an SVM classifier in the preset training template base and a real voice acoustic feature model; judging whether the first matching degree of the first acoustic feature vector and the recording acoustic feature vector model is greater than or equal to the second matching degree of the first acoustic feature and the real voice acoustic feature vector model or not; if so, judging that the audio to be detected is the recording audio; if not, judging that the audio to be detected is the real voice audio. The technical problems that the existing speaker recognition system has low voice recognition accuracy and relies on the specific text are solved.
Owner:SPEAKIN TECH CO LTD
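The feature fusion and decision rule in the abstract above might look like the sketch below. The patent obtains its matching degrees from models trained with an SVM classifier; that training is not reproduced here. As a loudly labeled stand-in, this sketch assumes fusion by concatenation and uses cosine similarity against two hypothetical template vectors as the "matching degree". Only the final comparison (recording if the first matching degree is greater than or equal to the second) follows the text directly.

```python
import math

def fuse(mfcc, gfcc, cqcc):
    """Combine the three feature vectors into one acoustic feature vector.

    Assumption: fusion is plain concatenation.
    """
    return list(mfcc) + list(gfcc) + list(cqcc)

def cosine(a, b):
    """Cosine similarity, used here as a stand-in matching degree."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def is_recording(feature, recording_template, real_template):
    """Decision rule from the abstract: recording if first degree >= second."""
    first = cosine(feature, recording_template)   # match against recording model
    second = cosine(feature, real_template)       # match against real-voice model
    return first >= second

# Toy vectors, invented for illustration only.
recording_template = [1, 0, 0, 1, 1, 1]
real_template = [0, 1, 1, 0, 0, 0]
replayed = fuse([1, 0], [0, 1], [1, 1])
live = fuse([0, 1], [1, 0], [0, 0])
```

Note that the rule treats a tie as a recording, which errs on the side of rejecting audio when the two models match equally well.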

Text-related speaker recognition method based on infinite-state hidden Markov model

The invention discloses a text-related speaker recognition method based on an infinite-state hidden Markov model, which addresses the tendency of the traditional hidden Markov model to overfit or underfit the data. The method comprises the following steps: first, preprocessing the set of training voice signals and extracting features from it; then, describing the training set during training with the infinite-state hidden Markov model, wherein the model has an infinite number of states before the training data arrive and the output probability distribution function of each state is expressed by a Student's t mixture model; after the training data arrive, calculating the parameter values of the model and the distributions of the random variables; and during recognition, calculating a likelihood value for each trained speaker model from the voice to be recognized after the same preprocessing and feature extraction, wherein the speaker corresponding to the maximum likelihood value is taken as the recognition result. The method effectively improves the recognition accuracy of a text-related speaker recognition system and, in addition, makes the system more robust to noise.
Owner:NANJING UNIV OF POSTS & TELECOMM

Speaker recognition method through emotional model synthesis based on neighbors preserving principle

A speaker recognition method through emotional model synthesis based on the Neighbors Preserving Principle is disclosed. The method includes the following steps: (1) training the reference speakers' and the user's speech models; (2) extracting the neutral-to-emotion transformation/mapping sets of the GMM reference models; (3) extracting the emotion reference Gaussian components mapped by or corresponding to the several neutral reference Gaussian components closest to the user's neutral training Gaussian component; (4) synthesizing the user's emotion training Gaussian component and then synthesizing the user's emotion training model; (5) synthesizing all of the user's GMM training models; (6) inputting test speech and conducting the identification. The invention extracts several reference speeches similar to the neutral training speech of a user from a speech library by employing the neighbor preserving principle based on KL divergence, and synthesizes the emotion training speech of the user from the emotion reference speech of those references. This improves the performance of the speaker recognition system when the training speech and the test speech are mismatched and increases the robustness of the system.
Owner:ZHEJIANG UNIV
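Steps (3) and (4) above can be illustrated with a small sketch. The abstract names KL divergence as the neighbor criterion but does not give the synthesis formula, so everything beyond the KL computation is an assumption: components are taken to be diagonal-covariance Gaussians stored as `(mean, var)` pairs, `emotion_refs[i]` is taken to be the emotion component mapped from `neutral_refs[i]`, and the user's emotion mean is synthesized as a plain average of the selected emotion means. All names are hypothetical.

```python
import math

def kl_diag_gauss(m0, v0, m1, v1):
    """KL(N0 || N1) for two diagonal-covariance Gaussians."""
    return 0.5 * sum(
        math.log(b / a) + (a + (p - q) ** 2) / b - 1.0
        for p, a, q, b in zip(m0, v0, m1, v1)
    )

def nearest_neutral(user_comp, neutral_refs, k):
    """Indices of the k neutral reference components closest to the user component."""
    m, v = user_comp
    ranked = sorted(
        range(len(neutral_refs)),
        key=lambda i: kl_diag_gauss(m, v, *neutral_refs[i]),
    )
    return ranked[:k]

def synthesize_emotion_mean(user_comp, neutral_refs, emotion_refs, k):
    """Assumed synthesis rule: average the means of the mapped emotion components."""
    idx = nearest_neutral(user_comp, neutral_refs, k)
    dim = len(user_comp[0])
    return [sum(emotion_refs[i][0][d] for i in idx) / k for d in range(dim)]

# Toy one-dimensional components, invented for illustration only.
user = ([0.0], [1.0])
neutral_refs = [([0.1], [1.0]), ([5.0], [1.0]), ([-0.2], [1.0])]
emotion_refs = [([1.0], [1.0]), ([9.0], [1.0]), ([3.0], [1.0])]
neighbors = nearest_neutral(user, neutral_refs, 2)
emotion_mean = synthesize_emotion_mean(user, neutral_refs, emotion_refs, 2)
```

The same selection would be repeated for every Gaussian component of the user's neutral model to build the full emotion model in step (4).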

Speaker identification method based on simple direct tolerance learning algorithm

The invention provides a speaker identification method based on a simple direct tolerance learning algorithm. The method comprises the following steps: acquiring voice samples of multiple speakers, extracting the i-vectors of all the samples, performing channel compensation with an LDA or WCCN method, performing length normalization, and forming a training sample set; constructing a set of similar sample pairs and a set of non-similar sample pairs from the i-vectors of the training sample set and the speaker identities; training on the two sets with a KISS algorithm to obtain a tolerance matrix; and, for two new pieces of voice, first extracting their i-vectors, performing channel compensation with the LDA or WCCN method and length normalization, then using the previously calculated tolerance matrix to compute a Mahalanobis distance between the two i-vectors and comparing it with a threshold, thereby determining whether the two pieces of voice belong to the same speaker. According to the invention, the resulting Mahalanobis distance tolerance matrix more faithfully reflects the similarities and differences in the sample space, improving the performance of a speaker identification system.
Owner:JIANGXI NORMAL UNIV
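The training and verification steps above can be sketched as follows, assuming the "KISS algorithm" is the usual KISS metric learning formulation, in which the matrix is the inverse covariance of similar-pair differences minus the inverse covariance of non-similar-pair differences. The abstract does not spell out the formula, so treat this as one plausible reading. i-vector extraction and LDA/WCCN channel compensation are omitted, the common projection of the matrix onto the positive semidefinite cone is skipped, and all function names are hypothetical.

```python
import math

def length_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def diff_cov(pairs):
    """Mean outer product of pairwise differences (x - y)."""
    n = len(pairs[0][0])
    cov = [[0.0] * n for _ in range(n)]
    for x, y in pairs:
        d = [a - b for a, b in zip(x, y)]
        for i in range(n):
            for j in range(n):
                cov[i][j] += d[i] * d[j] / len(pairs)
    return cov

def invert(m):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kiss_matrix(similar_pairs, dissimilar_pairs):
    """M = inv(cov of similar differences) - inv(cov of non-similar differences)."""
    inv_s = invert(diff_cov(similar_pairs))
    inv_d = invert(diff_cov(dissimilar_pairs))
    return [[a - b for a, b in zip(rs, rd)] for rs, rd in zip(inv_s, inv_d)]

def distance(x, y, M):
    """Squared Mahalanobis-style distance (x - y)^T M (x - y)."""
    d = [a - b for a, b in zip(x, y)]
    return sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))

def same_speaker(x, y, M, threshold):
    """Accept the pair as the same speaker if the distance is within the threshold."""
    return distance(x, y, M) <= threshold

# Toy two-dimensional "i-vectors", invented for illustration only.
similar = [([1.0, 0.0], [1.1, 0.05]), ([0.0, 1.0], [0.05, 1.1])]
dissimilar = [([1.0, 0.0], [0.0, 1.0]), ([1.0, 0.0], [-1.0, 0.0]), ([0.0, 1.0], [0.0, -1.0])]
M = kiss_matrix(similar, dissimilar)
```

In practice the threshold would be tuned on held-out pairs, and each vector would pass through `length_normalize` after channel compensation before any distance is computed.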

Method for reducing error identification rate of text irrelevant speaker identification system

Active | CN102237089A | Reduce false recognition rate | Improve binding | Speech analysis | Female group | Male group
The invention discloses a method for reducing error identification rate of a text irrelevant speaker identification system, and relates to a method for reducing error identification rate of a speaker identification system. The method solves the problem that the error identification rate of the conventional text irrelevant speaker identification system is increased in open set test. The method comprises the following steps of: dividing speakers in a closed set into a male group and a female group by using an identification threshold value of known speakers in the closed set acquired by a reference speaker identification system, dividing the male group and the female group into a plurality of small groups in a form of threshold value subsection, and finding out the central distribution of each small group; adding a coarse screening module at the front end of the reference speaker identification system, judging the gender of tested voice, and then comparing the voice to be tested with the central distribution of the small group of the same gender to obtain a probability threshold value of the voice to be tested; and performing identification by using the voice frame of the probability threshold value. According to the method, the identification accuracy rate is improved by 2 to 3 percent compared with the conventional system, and the method can be used for the text irrelevant speaker identification system.
Owner:哈尔滨工业大学高新技术开发总公司