Method and system for assisting in labeling model training data
A technique for assisting in the labeling of model training data, applied in the field of auxiliary labeling of model training data, that improves labeling efficiency and labeling quality and reduces labeling costs.
Examples
Embodiment 1
[0058] Referring to Figure 1, this embodiment of the present invention discloses a method for assisting in labeling model training data, including:
[0059] S1. Construct two training sets, wherein each training set includes a plurality of training data, and the training data are obtained by sampling from the data pool to be labeled and having an operator label the samples;
[0060] S2. Train two classifiers in one-to-one correspondence with the two training sets, and use each classifier to predict the categories of the training data in the other classifier's training set, obtaining the wrong data whose prediction results are inconsistent with the categories marked by the operator, while also obtaining the training data of undefined category, and extracting wrong features from the wrong data;
[0061] S3. Use the two classifiers to predict the data pool to be labeled at the same time, and filter and retain the contradictory data for which the prediction results of the two classifiers are inconsistent, a...
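The cross-checking idea in steps S2 and S3 can be illustrated with a minimal sketch. The patent text shown here does not specify the classifier type or the feature representation, so the toy nearest-centroid classifier, the helper names (`CentroidClassifier`, `cross_check`, `contradictory`) and the sample data below are all hypothetical. The extraction of wrong features and the handling of undefined-category data are not covered.

```python
from collections import defaultdict


class CentroidClassifier:
    """Toy nearest-centroid classifier (an assumed stand-in, not the patent's model)."""

    def fit(self, samples):
        # samples: list of (feature_vector, operator_label) pairs
        sums, counts = {}, defaultdict(int)
        for x, y in samples:
            if y not in sums:
                sums[y] = [0.0] * len(x)
            sums[y] = [s + v for s, v in zip(sums[y], x)]
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}
        return self

    def predict(self, x):
        # Return the label whose centroid is closest in squared Euclidean distance.
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda y: dist2(self.centroids[y]))


def cross_check(clf, other_set):
    """S2: items in the other training set whose prediction differs from the operator's label."""
    return [(x, y) for x, y in other_set if clf.predict(x) != y]


def contradictory(clf_a, clf_b, pool):
    """S3: unlabeled pool items on which the two classifiers disagree."""
    return [x for x in pool if clf_a.predict(x) != clf_b.predict(x)]


# Hypothetical operator-labeled training sets; ((0.1, 0.1), "dog") is a deliberate mislabel.
set_a = [((0.0, 0.0), "cat"), ((0.2, 0.1), "cat"), ((1.0, 1.0), "dog"), ((0.9, 1.1), "dog")]
set_b = [((0.1, 0.2), "cat"), ((1.1, 0.9), "dog"), ((0.1, 0.1), "dog"), ((2.0, 0.0), "dog")]
pool = [(0.05, 0.05), (1.0, 0.9), (0.6, 0.3)]

clf_a = CentroidClassifier().fit(set_a)
clf_b = CentroidClassifier().fit(set_b)

wrong_in_b = cross_check(clf_a, set_b)  # classifier A checks set B
wrong_in_a = cross_check(clf_b, set_a)  # classifier B checks set A
conflicts = contradictory(clf_a, clf_b, pool)
```

In this toy data, classifier A flags the mislabeled item in set B, and the pool item lying between the two clusters is retained as contradictory data because the two classifiers disagree on it.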
Embodiment 2
[0104] This embodiment provides a system for assisting in labeling model training data, including a sampling module, an error identification module, a data pool refresh module, a data export module and a data verification module, wherein,
[0105] The sampling module is used to construct two training sets and, after the data pool to be labeled is updated, to obtain multiple new training data from the new data pool to be labeled and distribute them to the two training sets; that is, during iteration, new training sets are constructed from the new data pool to be labeled; wherein each training set includes a plurality of training data, and the training data are obtained by sampling from the data pool to be labeled and having an operator label the samples;
[0106] The error identification module is used to train two classifiers in one-to-one correspondence with the two training sets, and to use each classifier to predict the categories of the training data in the other classifier's training set...
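As a rough illustration of the sampling module in [0105], the sketch below draws items from a pool, has them labeled, and distributes them to two training sets. The function name `sample_two_training_sets`, the `label_fn` callback standing in for the human operator, and the choice of disjoint, equally sized sets are assumptions made for this example; the text shown here does not state them.

```python
import random


def sample_two_training_sets(pool, n_per_set, label_fn, seed=0):
    """Sample 2 * n_per_set items from the pool, label them, and split them into two sets.

    label_fn is a hypothetical stand-in for the operator's manual labeling.
    Disjoint, equal-sized sets are an assumption of this sketch.
    """
    rng = random.Random(seed)
    picked = rng.sample(pool, 2 * n_per_set)  # sampling without replacement
    labeled = [(item, label_fn(item)) for item in picked]
    return labeled[:n_per_set], labeled[n_per_set:]


# Hypothetical pool of items to be labeled, with a trivial labeling rule.
pool = list(range(20))
label_fn = lambda item: "even" if item % 2 == 0 else "odd"

set_a, set_b = sample_two_training_sets(pool, 3, label_fn)

# After the pool is updated, the same call on the new pool yields new
# training data that can be appended to the two training sets.
new_pool = list(range(100, 120))
more_a, more_b = sample_two_training_sets(new_pool, 2, label_fn, seed=1)
set_a += more_a
set_b += more_b
```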