Weak supervision voice retrieval method and system based on attention

An attention-based weak supervision technology, applied in speech analysis, digital data information retrieval, natural language data processing and related fields. It addresses the limited ways in which attention mechanisms have been used and the shortage of labeled data, improves retrieval efficiency and accuracy, and has good application prospects.

Active Publication Date: 2021-04-20
PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU +1

AI Technical Summary

Problems solved by technology

Character-level annotation of speech is resource-intensive, so insufficient annotated data is a practical problem in speech retrieval. In addition, existing methods make only limited use of attention mechanisms.


Examples


Embodiment Construction

[0022] To make the purpose, technical solution and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and technical solutions.

[0023] The embodiment of the present invention provides an attention-based weakly supervised speech retrieval method, which includes the following: extracting text keywords and converting them into keyword feature vectors, and performing feature extraction on audio data to obtain audio feature vectors; using an attention mechanism to fuse the keyword feature vector and the audio feature vector into a speech retrieval feature vector; and sending the speech retrieval feature vector to a trained and optimized keyword recognition module, which detects whether the text keyword appears in the speech data.
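The steps in [0023] can be illustrated with a minimal sketch. The excerpt does not specify the form of the attention mechanism or of the keyword recognition module, so everything concrete here is an assumption made for illustration: scaled dot-product attention with the keyword vector as the query, concatenation as the fusion step, a logistic classifier as the recognition module, and the hypothetical names `attention_fuse` and `keyword_present`. The toy vectors are made up and stand in for real keyword and audio features.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(keyword_vec, audio_frames):
    """Fuse a keyword vector with per-frame audio vectors.

    Assumption: scaled dot-product attention, keyword as query,
    audio frames as keys and values. The patent excerpt does not
    state which attention variant is used.
    """
    scale = math.sqrt(len(keyword_vec))
    weights = softmax([dot(keyword_vec, f) / scale for f in audio_frames])
    dim = len(audio_frames[0])
    # Attention-weighted sum of the audio frames.
    context = [sum(w * f[d] for w, f in zip(weights, audio_frames))
               for d in range(dim)]
    # Assumption: the retrieval feature vector is the concatenation
    # of the keyword vector and the attended audio context.
    return keyword_vec + context, weights

def keyword_present(retrieval_vec, w, b, threshold=0.5):
    """Hypothetical recognition module: a logistic classifier.

    With weak supervision, its training label would only say whether
    the keyword occurs in the utterance, not where it occurs.
    """
    prob = 1.0 / (1.0 + math.exp(-(dot(w, retrieval_vec) + b)))
    return prob, prob >= threshold

# Toy data: one keyword vector and three audio frames (made up).
keyword_vec = [1.0, 0.0, 1.0]
audio_frames = [[0.9, 0.1, 0.8], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
fused, weights = attention_fuse(keyword_vec, audio_frames)

# Untrained placeholder parameters, for shape-checking only.
prob, present = keyword_present(fused, w=[0.1] * len(fused), b=0.0)
```

In this sketch the attention weights indicate which frames most resemble the keyword, which is one way a model trained only on utterance-level labels could still focus on the relevant part of the audio.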

[0024] By using the attention mechanism to obtain the speech retrieval feature vector that combines the tex...



Abstract

The invention belongs to the technical field of speech retrieval, and in particular relates to an attention-based weakly supervised speech retrieval method and system. The method comprises: extracting text keywords and converting them into keyword feature vectors, and performing feature extraction on audio data to obtain audio feature vectors; fusing the keyword feature vector and the audio feature vector using an attention mechanism to obtain a speech retrieval feature vector; and sending the speech retrieval feature vector to a trained and optimized keyword recognition module for recognition, so as to detect whether the text keyword appears in the speech data. The attention mechanism yields a speech retrieval feature vector that fuses the text feature vector and the audio feature vector, and the recognition model can be trained and optimized using weakly supervised annotation data, thereby improving retrieval efficiency and accuracy.

Description

Technical field

[0001] The invention belongs to the technical field of speech retrieval, and in particular relates to an attention-based weakly supervised speech retrieval method and system.

Background art

[0002] The main task of speech retrieval is to find keywords of interest in massive speech databases and return their positions. Common speech retrieval methods include keyword search based on large-vocabulary continuous speech recognition and keyword search based on neural networks. Speech retrieval using keyword search based on large-vocabulary continuous speech recognition requires two steps. The first step is to train a large-vocabulary continuous speech recognition system and use the trained system to decode the audio to be searched, generating the corresponding word lattice. The second step is to convert the word lattice of the audio library to be searched, as generated by the decoder, into an inverted index, so as to searc...
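The second step of the recognition-based approach described in the background, turning decoder output into an inverted index, can be sketched as follows. This is only an illustration under stated assumptions: the lattice is simplified to a flat list of hypothetical arcs `(word, start, end, posterior)` per utterance, and the names `build_inverted_index` and `search`, the posterior threshold, and the toy data are all made up. A real word lattice is a graph with competing paths, which this sketch does not model.

```python
from collections import defaultdict

def build_inverted_index(lattices):
    """Map each word to the places it was hypothesized.

    `lattices` maps an utterance id to a list of arcs, each a tuple
    (word, start_seconds, end_seconds, posterior). This flat format
    is an assumption made for the example.
    """
    index = defaultdict(list)
    for utt_id, arcs in lattices.items():
        for word, start, end, posterior in arcs:
            index[word].append((utt_id, start, end, posterior))
    return index

def search(index, keyword, min_posterior=0.5):
    """Return hits for `keyword`, most confident first."""
    hits = [h for h in index.get(keyword, []) if h[3] >= min_posterior]
    return sorted(hits, key=lambda h: h[3], reverse=True)

# Toy decoder output for two utterances (made up).
lattices = {
    "utt1": [("open", 0.0, 0.4, 0.9), ("door", 0.4, 0.9, 0.7),
             ("store", 0.4, 0.9, 0.2)],
    "utt2": [("door", 1.2, 1.6, 0.4), ("open", 2.0, 2.3, 0.6)],
}
index = build_inverted_index(lattices)
hits = search(index, "door")
```

Each hit carries the utterance id and time span, which corresponds to the "return their positions" requirement stated for speech retrieval.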


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/632, G06F16/683, G06F40/284, G10L25/30
Inventor: 张文林, 胡恒博, 闫红刚, 郝朝龙, 邱泽宇, 李喜坤, 贺晓年
Owner: PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU