A real-time role-based transcription method, device and system
A role and speech-segment technology, applied in speech analysis, speech recognition, instruments, etc., that addresses problems such as speech being mistakenly attributed to the wrong side, an unfriendly user experience, and poor recognition results, and achieves the effect of reducing load.
Examples
Embodiment 1
[0062] As shown in Figure 1 and Figure 2, the present invention provides a real-time role-based transcription method, the method comprising the following steps:
[0063] S100: Install a sound collection device with a directional microphone to the side, between the two speakers, and collect the left channel sound signal and the right channel sound signal respectively;
[0064] S200: Detect whether the left channel sound signal and the right channel sound signal contain a speech segment; if a speech segment is detected, extract the left channel speech segment and the right channel speech segment corresponding to that speech segment;
[0065] As shown in Figure 3, the specific steps for detecting whether the left channel sound signal and the right channel sound signal contain a speech segment are: extracting the fundamental frequency and subband energy from the left channel sound signal and the right channel sound signal; according to the high-dimensiona...
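The feature-extraction part of this step can be illustrated with a minimal sketch. It assumes one short frame per channel, estimates the fundamental frequency with a plain autocorrelation search, and computes subband energies with a naive DFT. The function names, the 60 to 400 Hz pitch range, the band edges and the voicing threshold are all illustrative assumptions and are not taken from the patent; how the patent turns these features into a speech/non-speech decision is truncated in the text above and is not reproduced here.

```python
import cmath
import math


def subband_energies(frame, sample_rate, band_edges_hz):
    """Sum DFT power into the bands given by band_edges_hz (assumed edges)."""
    n = len(frame)
    energies = [0.0] * (len(band_edges_hz) - 1)
    for k in range(n // 2):  # bins strictly below the Nyquist frequency
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        power = abs(coeff) ** 2 / n
        for b in range(len(energies)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                energies[b] += power
                break
    return energies


def fundamental_frequency(frame, sample_rate, fmin=60.0, fmax=400.0,
                          min_strength=0.3):
    """Autocorrelation pitch estimate; returns 0.0 when no clear pitch is found."""
    n = len(frame)
    r0 = sum(x * x for x in frame)
    if r0 == 0.0:
        return 0.0  # silent frame
    lo = int(sample_rate / fmax)
    hi = min(int(sample_rate / fmin), n - 1)
    best_lag, best_r = 0, 0.0
    for lag in range(lo, hi + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    # min_strength is a hypothetical voicing threshold, not from the patent.
    if best_lag == 0 or best_r / r0 < min_strength:
        return 0.0
    return sample_rate / best_lag


def frame_features(left, right, sample_rate,
                   band_edges_hz=(0, 500, 1000, 2000, 4000)):
    """Extract the same two features independently for each channel."""
    return {
        side: {
            "f0": fundamental_frequency(frame, sample_rate),
            "subbands": subband_energies(frame, sample_rate, band_edges_hz),
        }
        for side, frame in (("left", left), ("right", right))
    }
```

A frame containing a 200 Hz tone on the left channel and silence on the right yields a left-channel pitch near 200 Hz with energy concentrated in the lowest band, and zero pitch and energy on the right. A production system would normally use an FFT rather than the naive DFT shown here.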
Embodiment 2
[0102] As shown in Figure 8, the present invention provides a real-time role-based transcription device, which includes a sound collection device, a voice activity detection (VAD) module, a single-sided speech judgment module, a clustering module, a separation module, and a sending module;
[0103] The sound collection device includes a directional microphone, which is used to collect the left channel sound signal and the right channel sound signal respectively. The directional microphone includes a left channel and a right channel, which diverge at an angle of 90 to 120 degrees with a spacing of 10 cm to 15 cm; the angle between each of the left and right channels and the vertical direction is 40 to 60 degrees.
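The geometric ranges stated in [0103] can be captured as a small configuration check. The following is only a sketch: the class and field names are hypothetical, and only the numeric ranges (90 to 120 degrees divergence, 10 to 15 cm spacing, 40 to 60 degrees from vertical) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicGeometry:
    """Hypothetical description of the directional microphone layout."""
    divergence_deg: float      # angle between left and right channels
    spacing_cm: float          # distance between the two channels
    vertical_angle_deg: float  # angle of each channel to the vertical


def geometry_violations(g: MicGeometry) -> list:
    """Return a message for every value outside the ranges given in [0103]."""
    checks = [
        ("divergence_deg", g.divergence_deg, 90.0, 120.0),
        ("spacing_cm", g.spacing_cm, 10.0, 15.0),
        ("vertical_angle_deg", g.vertical_angle_deg, 40.0, 60.0),
    ]
    return [
        f"{name}={value} outside [{low}, {high}]"
        for name, value, low, high in checks
        if not low <= value <= high
    ]
```

An empty result means the layout falls within all three stated ranges.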
[0104] The voice activity detection module is used to detect whether the left channel sound signal and the right channel sound signal contain a voice segment; if a voice segment is detected, t...
Embodiment 3
[0121] As shown in Figure 12, the present invention provides a real-time role-based transcription system, which includes a processor, a left-channel speech recognition engine, a right-channel speech recognition engine, a network card, and the real-time role-based transcription device provided in Embodiment 2. The processor is connected to the real-time role-based transcription device and to the network card, and the network card is connected to the left-channel speech recognition engine and the right-channel speech recognition engine respectively.
[0122] The real-time role-based transcription device sends the left-channel speech signal to the left-channel speech recognition engine and the right-channel speech signal to the right-channel speech recognition engine. Compared with previous transcription systems, which sent only one signal to the engine, the present invention inputs the different signals from the two sides into the speech recognition engines independently; both systems have t...
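The routing described in [0122], where each side's signal goes to its own recognition engine, might be sketched as follows. This is an assumption-laden illustration: the `RoleDispatcher` class, the role labels "Speaker A" and "Speaker B", and the engine interface (a callable taking samples and returning text) are all hypothetical and do not describe any real speech recognition API or the patent's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical engine interface: audio samples in, recognized text out.
Engine = Callable[[Sequence[float]], str]


@dataclass
class Utterance:
    role: str
    start_s: float
    text: str


class RoleDispatcher:
    """Send each side's speech segment to its own engine and label the result."""

    def __init__(self, left_engine: Engine, right_engine: Engine,
                 left_role: str = "Speaker A",
                 right_role: str = "Speaker B") -> None:
        self._engines: Dict[str, Engine] = {
            "left": left_engine, "right": right_engine}
        self._roles: Dict[str, str] = {
            "left": left_role, "right": right_role}
        self._utterances: List[Utterance] = []

    def send(self, side: str, start_s: float,
             samples: Sequence[float]) -> Utterance:
        if side not in self._engines:
            raise ValueError(f"unknown side: {side!r}")
        # Each side is recognized independently by its own engine.
        text = self._engines[side](samples)
        utterance = Utterance(self._roles[side], start_s, text)
        self._utterances.append(utterance)
        return utterance

    def transcript(self) -> List[str]:
        """Role-labelled lines ordered by segment start time."""
        ordered = sorted(self._utterances, key=lambda u: u.start_s)
        return [f"[{u.start_s:.1f}s] {u.role}: {u.text}" for u in ordered]
```

Because the two engines never see each other's audio, the role label follows directly from which side a segment arrived on, with no speaker identification step needed afterwards.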


