Classification model training method based on crowdsourcing technology

A classification model training method, applied in character and pattern recognition, instruments, computer parts, etc. It addresses the difficulty of ensuring the quality of labeling information and of measuring its accuracy, which affects the generalization performance of classification models. It overcomes interference with training, helps guarantee the application effect, and improves generalization performance.

Inactive Publication Date: 2017-10-13
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

Since the annotation information obtained through crowdsourcing comes from multiple online users, it is difficult to guarantee the quality of the collected annotation information.



Example Embodiment

[0036] The following examples describe the present invention in more detail.

[0037] According to the flowchart of the classification model training process based on crowdsourcing technology in the present invention, the specific steps are as follows:

[0038] 1) Randomly select n (n

[0039] 2) Determine the label information y_i corresponding to the training sample x_i: when [...], y_i = 1; otherwise, y_i = 0.

[0040] 3) Learn a classification model with parameter w from the training samples and their label information.

[0041] 4) On the training samples and their crowdsourced annotation set, estimate, according to the label information y_i, the annotation level set θ corresponding to the multiple users, where, given the set of annotation information provided by the j-th user on category c, that user's corresponding annotation level is:

[0042]

[0043] where [...] and [...] respectively represent the number of times the user gave correct labels and wrong labels. Taki...
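The formula for the annotation level is not reproduced in the text, so the following is only a minimal sketch of one plausible estimator: a smoothed ratio of correct labels to all labels, counted per user and per category. The function name, the data layout, and the smoothing constants alpha and beta are assumptions made for illustration and are not taken from the patent.

```python
from collections import defaultdict


def estimate_annotation_levels(annotations, reference_labels, alpha=1.0, beta=1.0):
    """Estimate a per-user, per-category annotation level.

    annotations: list of (user, sample_id, label) tuples from the crowd.
    reference_labels: dict sample_id -> label y_i (0 or 1) used as reference.
    alpha, beta: hypothetical smoothing constants (assumption, not from the source).
    Returns a dict (user, category) -> level strictly between 0 and 1.
    """
    correct = defaultdict(int)  # times the user agreed with the reference label
    wrong = defaultdict(int)    # times the user disagreed with it
    for user, sample_id, label in annotations:
        category = reference_labels[sample_id]
        if label == category:
            correct[(user, category)] += 1
        else:
            wrong[(user, category)] += 1

    levels = {}
    for key in set(correct) | set(wrong):
        a, b = correct[key], wrong[key]
        # Smoothed fraction of correct labels (assumed form of the level).
        levels[key] = (a + alpha) / (a + b + alpha + beta)
    return levels
```

With alpha = beta = 1, a user who has labeled nothing in a category would sit at 0.5, and each additional correct or wrong label moves the level toward 1 or 0.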


Abstract

The invention provides a classification model training method based on crowdsourcing technology. The method comprises: estimating each user's level of providing annotation information from the crowdsourced annotation information for a small number of samples; using the observed annotator levels as prior knowledge to determine the annotation information used by the training samples; training a classification model on the training samples and their annotation information; using the classification model to select the sample that minimizes the model's expected error, and predicting the category to which that sample belongs; adding the selected sample, together with the annotation information provided by the user with the highest annotation level in that category, to the training set; and iterating the above steps on the updated training set until the precision of the classification model or the number of training samples reaches a preset standard. The method avoids the adverse influence on classification model training of low-quality annotation information provided by users with a low annotation level, and helps ensure that a classification model with high generalization ability can be trained in a crowdsourcing environment.
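The iterative procedure in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: features are one-dimensional, the classifier is a small logistic regression, the loop stops on training-set size only, and sample selection uses uncertainty sampling as a simple stand-in for the expected-error criterion, which is not specified in the text. All function and parameter names are hypothetical.

```python
import math


def train_logistic(xs, ys, epochs=500, lr=0.5):
    """Fit a 1-D logistic regression (w, b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def predict_proba(model, x):
    """Probability of category 1 under the model (w, b)."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def crowd_training_loop(pool, crowd_labels, levels, seed, max_train_size):
    """Grow a training set from crowd labels, trusting the best annotator.

    pool: dict sample_id -> feature value (float).
    crowd_labels: dict sample_id -> dict user -> label (0 or 1).
    levels: dict (user, category) -> estimated annotation level.
    seed: dict sample_id -> label for the initial training set.
    max_train_size: stop once the training set reaches this size.
    """
    train = dict(seed)

    def fit():
        ids = list(train)
        return train_logistic([pool[i] for i in ids], [train[i] for i in ids])

    model = fit()
    while len(train) < max_train_size:
        candidates = [i for i in pool if i not in train]
        if not candidates:
            break
        # Stand-in selection rule (assumption): pick the most uncertain sample.
        pick = min(candidates, key=lambda i: abs(predict_proba(model, pool[i]) - 0.5))
        # Predict the category of the selected sample with the current model.
        category = 1 if predict_proba(model, pool[pick]) >= 0.5 else 0
        # Use the label from the user with the highest level in that category.
        users = crowd_labels[pick]
        best_user = max(users, key=lambda u: levels.get((u, category), 0.0))
        train[pick] = users[best_user]
        model = fit()  # retrain on the updated training set
    return model, train
```

A precision-based stopping test, as mentioned in the abstract, could be added to the loop condition if a held-out set with trusted labels is available.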

Description

Technical field

[0001] The invention relates to a classification model training method.

Background technique

[0002] Currently, under the supervised learning framework in machine learning, training a classification model requires collecting in advance a set of data samples with label information. The quantity and quality of the collected training data directly determine the generalization performance of the classification model. In the traditional training data collection process, experts with professional domain knowledge are required to provide a unique, correct label for each data sample, to ensure that the trained classification model has good generalization performance.

[0003] The challenge of this traditional approach is that, in real-world tasks, few people have the required professional background, and obtaining sample labeling information takes a great deal of time. Therefore, with the development of network technology and data storage ...


Application Information

IPC(8): G06K9/62
CPC: G06F18/2415, G06F18/214
Inventor: 吴伟宁
Owner: HARBIN ENG UNIV