A data screening method and device
A technique for screening speech data and text data, applied in the field of data processing, that addresses problems such as the inability to guarantee the effectiveness of acoustic models and language models.
Examples
Example 1
[0065] This embodiment introduces a voice data screening method. Beforehand, a training data set composed of a large amount of voice data can be built. With this method, the voice data that the acoustic model actually needs to learn can be screened out of the training data set and used to train the acoustic model. In this way, using the limited data resources selected (i.e., low resources), the acoustic model can learn acoustic features as comprehensively as possible, which improves both the training speed and the predictive performance of the acoustic model.
[0066] See Figure 1, which is a schematic flowchart of the voice data screening method provided in this embodiment. The method includes the following steps:
[0067] S101: Using the voice data of the first duration, train an acoustic model.
[0068] In this embodiment, in order to improve the data quality of the training data of the acoustic model, before this step S101,...
Example 2
[0108] This embodiment introduces a text data screening method. Beforehand, a training data set composed of a large amount of text can be built. With this method, the text data in a specific domain (such as the medical field) that the text domain classification model actually needs to learn can be screened out of the training data set and used to train the text domain classification model. In this way, using the limited data resources screened out (i.e., low resources), the text domain classification model can learn the text features of the specific domain as comprehensively as possible, which improves both the training speed of the model and its classification performance for that domain. Furthermore, the text domain classification model can be used to more accurately select the text data of the specific domain from the tra...
Example 3
[0146] This embodiment introduces a data screening method and a model building method, which can be implemented using the screening methods introduced in the first and second embodiments above.
[0147] See Figure 4, which is a schematic flowchart of the data screening method provided in this embodiment. The method includes:
[0148] Step S401: Based on the learning requirements for data features, use a preset screening strategy to screen the data set to be screened and obtain screened data, where the screened data carries data features that have not yet been learned.
[0149] In practical applications, especially in the field of deep learning, achieving a given functional goal requires collecting a large amount of sample data related to that goal to form a data set, with the expectation that the data features in the data set can be learned comprehensively. However, in order to achieve comprehen...
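The idea in step S401 of keeping only data that carries unlearned features can be illustrated with a minimal Python sketch. The visible text does not specify the preset screening strategy, so everything here is an assumption made for illustration: the function names `screen_unlearned` and `make_vocab_confidence`, the confidence threshold, and the toy rule that treats a sample as "already learned" when most of its tokens are in a known vocabulary. It is not the patent's method.

```python
from typing import Callable, List, Sequence, Set


def make_vocab_confidence(known_vocab: Set[str]) -> Callable[[str], float]:
    """Build a toy confidence function (hypothetical stand-in for a model).

    Confidence is the fraction of a sample's tokens that already appear
    in known_vocab, i.e. features assumed to be learned already.
    """
    def confidence(text: str) -> float:
        tokens = text.lower().split()
        if not tokens:
            # An empty sample carries nothing new, so treat it as learned.
            return 1.0
        known = sum(1 for token in tokens if token in known_vocab)
        return known / len(tokens)

    return confidence


def screen_unlearned(
    samples: Sequence[str],
    confidence: Callable[[str], float],
    threshold: float = 0.6,
) -> List[str]:
    """Keep samples whose confidence is below threshold.

    Low confidence is used here as an assumed proxy for "carries
    unlearned data features"; input order is preserved.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [sample for sample in samples if confidence(sample) < threshold]


if __name__ == "__main__":
    vocab = {"the", "patient", "has", "a"}
    data = [
        "the patient has a",
        "the patient has a fever",
        "acute myocardial infarction",
    ]
    print(screen_unlearned(data, make_vocab_confidence(vocab), threshold=0.9))
```

In a real system the confidence function would come from the model being trained (for example, an acoustic model or a text domain classifier), and the threshold would be tuned to the available data budget.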