A method and system for judging the number of speakers
The technology is applied in the field of speech signal processing. It addresses the problem of inaccurate judgment of the number of speakers, and achieves the effects of eliminating the step size limit, improving accuracy, and improving speech recognition.
Examples
Embodiment 1
[0078] Figure 2 shows a flowchart of a method for judging the number of speakers provided by an embodiment of the present invention. The method includes the following steps:
[0079] Step S01, receiving a voice signal.
[0080] In this embodiment, the voice signal is received through a device such as a microphone. The voice signal may be the speaker's live speech, a voice signal saved by a recording device, or a voice signal transmitted by communication equipment such as a mobile phone or a teleconferencing system.
[0081] In practical applications, endpoint detection must be performed on the received speech signal. Endpoint detection means determining the start point and end point of speech within a signal segment that contains speech. Effective endpoint detection not only minimizes processing time but also removes the noise interference of silent segments. In this embod...
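The endpoint detection described in [0081] can be illustrated with a minimal sketch. The source text is truncated before it names the detection method used in this embodiment, so the short-time energy threshold below is an assumption chosen for illustration, not the patent's method. The function name `detect_endpoints` and the parameters `frame_ms`, `hop_ms`, and `threshold_ratio` are hypothetical.

```python
def detect_endpoints(signal, sample_rate, frame_ms=25, hop_ms=10, threshold_ratio=0.1):
    """Return (start_sample, end_sample) of the speech region, or None.

    Assumption: a frame counts as speech when its short-time energy exceeds
    threshold_ratio times the largest frame energy in the signal.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    if frame_len <= 0 or hop_len <= 0 or len(signal) < frame_len:
        return None

    # Short-time energy of each overlapping frame.
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frame = signal[start:start + frame_len]
        energies.append(sum(s * s for s in frame))

    peak = max(energies)
    if peak == 0:
        return None  # the whole signal is digital silence

    threshold = threshold_ratio * peak
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None

    # Start of the first active frame, end of the last active frame.
    return active[0] * hop_len, active[-1] * hop_len + frame_len
```

Only the samples between the returned start and end points would be passed on to later steps, which is how the silent segments and their noise are discarded. A fixed ratio of the peak energy is fragile in noisy recordings; a practical detector would typically estimate a noise floor instead.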
Embodiment 2
[0115] A method for judging the number of speakers as described in Embodiment 1, except that, in order to eliminate the influence of channel interference on judging the similarity between speech signal classes, this embodiment uses probabilistic linear discriminant analysis (PLDA) to remove the channel interference information, thereby improving the accuracy of judging the similarity between speech signal classes.
[0116] Steps S11 to S15 are the same as in Embodiment 1 and are not described in detail here.
[0117] Step S16, calculation process: calculate and compare the similarity between different speech signal classes according to the speech signal features of each segmented signal segment in the speech signal class after re-segmentation.
[0118] In this embodiment, the PLDA technology is used to remove channel interference information. Specifically, the part representing the channel ...
Embodiment 3
[0132] A method for judging the number of speakers as described in Embodiment 2, except that, in order to further improve the accuracy of judging the similarity between speech signal classes, this embodiment uses probabilistic linear discriminant analysis (PLDA) to calculate a PLDA score between each pair of speech signal classes and judges the similarity between the classes by that score. The larger the PLDA score, the more likely it is that the speech signal features of the corresponding two speech signal classes are judged to belong to one class.
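The idea of a PLDA score as a similarity measure between two speech signal classes can be sketched as a log-likelihood ratio. The source does not give the scoring formula, so the sketch below assumes a common simplified form: a two-covariance model with zero-mean features and diagonal between-class and within-class variances. The function name `plda_llr` and its parameters are hypothetical, and each class is assumed to be represented by a single feature vector.

```python
import math


def plda_llr(x1, x2, between_var, within_var):
    """Log-likelihood ratio of 'same speaker' versus 'different speakers'.

    Assumptions (not from the source): features are zero-mean, and both the
    between-class and within-class covariances are diagonal, so the score is
    a sum of independent per-dimension terms.
    """
    if not (len(x1) == len(x2) == len(between_var) == len(within_var)):
        raise ValueError("all inputs must have the same length")

    score = 0.0
    for a, b, bv, wv in zip(x1, x2, between_var, within_var):
        total = bv + wv  # variance of a single observation
        # Same speaker: (a, b) are jointly Gaussian with covariance
        # [[total, bv], [bv, total]] because they share the speaker factor.
        det_same = total * total - bv * bv
        quad_same = (total * (a * a + b * b) - 2.0 * bv * a * b) / det_same
        # Different speakers: a and b are independent with variance total.
        quad_diff = (a * a + b * b) / total
        score += -0.5 * (
            math.log(det_same) - 2.0 * math.log(total) + quad_same - quad_diff
        )
    return score
```

Under these assumptions a positive score favors the hypothesis that the two classes come from one speaker, which matches the statement that a larger score means a higher likelihood of being judged as one class. How the score is thresholded or used to merge classes is not specified in the available text.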
[0133] Steps S11 to S15 are the same as in Embodiment 2 and are not described in detail here.
[0134] Step S16, calculation process: calculate and compare the similarity between different speech signa...