A keyword detection method and system based on unlabeled keyword data
A keyword detection technique using unlabeled data, applicable to audio data retrieval, digital data information retrieval, and metadata-based audio data retrieval. It addresses the heavy cost in manpower, material resources, and time, as well as low classification accuracy that degrades the accuracy of the trained model. The stated effects are savings in manpower and time, a low false negative rate, and high accuracy.
Examples
Embodiment 1
[0055] As shown in Figure 1 and Figure 2, this embodiment discloses a keyword detection method based on unlabeled keyword data, comprising the following steps:
[0056] S100: Collect a large amount of unlabeled audio data, and add preset wake-up word audio and non-wake-up word audio to it to form a preprocessed audio library;
[0057] As shown in Figure 3, preset wake-up word audio and non-wake-up word audio are added to the unlabeled audio data. The types of wake-up word audio and non-wake-up word audio to add are chosen according to the specific situation and may be one type or several. N1 clips are added for each type of wake-up word and N2 clips for non-wake-up words; N1 and N2 are also set according to the specific situation, for example N1 between 50 and 200 and N2 between 0 and 100.
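The seeding step in S100 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Clip` record, the `build_preprocessed_library` function, the file paths, and the `"non_wake"` label are hypothetical names introduced here, not part of the patent text. The only details taken from the description are that N1 clips are added per wake-up word type and N2 clips for non-wake-up words.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Clip:
    """One audio file; seed_label is None for the unlabeled bulk data."""
    path: str
    seed_label: Optional[str] = None


def build_preprocessed_library(
    unlabeled_paths: List[str],
    wake_word_pools: Dict[str, List[str]],  # hypothetical: wake-word type -> known clips
    non_wake_pool: List[str],               # hypothetical: known non-wake-word clips
    n1: int,
    n2: int,
    rng: random.Random,
) -> List[Clip]:
    """Mix N1 seeded clips per wake-word type and N2 non-wake clips into the unlabeled data."""
    library = [Clip(path) for path in unlabeled_paths]
    for wake_type, pool in wake_word_pools.items():
        # rng.sample raises ValueError if the pool holds fewer than n1 clips.
        library += [Clip(p, seed_label=wake_type) for p in rng.sample(pool, n1)]
    library += [Clip(p, seed_label="non_wake") for p in rng.sample(non_wake_pool, n2)]
    rng.shuffle(library)  # seeded clips should not sit together at the end
    return library
```

Keeping the seed label on each clip means a later stage can still tell which clips were added deliberately, even after they are shuffled into the unlabeled data.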
[0058] S200: Classify the audio data in the preprocessed audio library based on an unsupervised deep learning classification method;
[0059] Set the total number of...
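As a rough illustration of step S200, the sketch below groups clips without labels and then names each group using the seeded clips that land in it. Several things here are assumptions rather than the patent's method: plain k-means with farthest-first initialization stands in for the unsupervised deep learning classifier, each clip is assumed to be already represented by a small numeric vector, and the idea of naming clusters by a majority vote over seeded clips is one plausible reading of why the seeded audio is added. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Tuple[float, ...]


def _sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points: Sequence[Vector], k: int) -> List[Vector]:
    """Deterministic init: start at points[0], then repeatedly take the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(_sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points: Sequence[Vector], k: int, iters: int = 50) -> List[int]:
    """Return one cluster index per point (stand-in for the unsupervised classifier)."""
    centroids = farthest_first_init(points, k)
    assignment = [-1] * len(points)
    for _ in range(iters):
        new_assignment = [
            min(range(k), key=lambda c: _sq_dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


def name_clusters(
    assignment: Sequence[int], seed_labels: Sequence[Optional[str]]
) -> Dict[int, str]:
    """Name each cluster by the most common seed label among its seeded clips."""
    votes: Dict[int, Counter] = {}
    for cluster, label in zip(assignment, seed_labels):
        if label is not None:
            votes.setdefault(cluster, Counter())[label] += 1
    return {cluster: counter.most_common(1)[0][0] for cluster, counter in votes.items()}
```

Clusters that contain no seeded clip are simply absent from the returned mapping, so they remain unnamed rather than being guessed.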
Embodiment 2
[0141] As shown in Figure 5, the present invention provides a keyword detection system based on unlabeled keyword data, including a preprocessing module, a classification module, a feature extraction module, a model training module and a keyword detection module;
[0142] The preprocessing module is used to collect a large amount of unlabeled audio data and add preset wake-up word audio and non-wake-up word audio to it to form a preprocessed audio library;
[0143] The classification module is used to classify the audio data in the preprocessed audio library based on an unsupervised deep learning classification method;
[0144] The feature extraction module is used to extract features from the classified audio data to generate feature data;
[0145] The model training module is used to input feature data into different types of neural network models for training to obtain multiple different keyword detection models;
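The model training module's role, feeding the same feature data to several model types to obtain several detectors, can be sketched as a small registry of trainers. This is a sketch under assumptions: the two trainers below are deliberately trivial stand-ins for the neural network models the description mentions, and every name (`Trainer`, `train_all`, `train_majority`, `train_nearest_centroid`) is hypothetical. The keyword detection module is not described in this excerpt, so it is left out.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Sequence, Tuple

Features = Tuple[float, ...]
Example = Tuple[Features, str]               # (feature vector, class label)
Model = Callable[[Features], str]            # maps features to a predicted label
Trainer = Callable[[Sequence[Example]], Model]


def train_majority(examples: Sequence[Example]) -> Model:
    """Stand-in model type 1: always predicts the most frequent training label."""
    most_common = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda features: most_common


def train_nearest_centroid(examples: Sequence[Example]) -> Model:
    """Stand-in model type 2: predicts the label whose mean feature vector is closest."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    centroids = {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

    def predict(features: Features) -> str:
        return min(
            centroids,
            key=lambda label: sum((x - c) ** 2 for x, c in zip(features, centroids[label])),
        )

    return predict


def train_all(trainers: Dict[str, Trainer], examples: Sequence[Example]) -> Dict[str, Model]:
    """Train every registered model type on the same feature data."""
    return {name: train(examples) for name, train in trainers.items()}
```

Because every trainer shares one signature, adding another model type only requires registering one more entry in the dictionary passed to `train_all`.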