A Quality Control Method for Crowdsourcing Classification Data Based on Self-paced Learning

A quality control method for crowdsourced classification data based on self-paced learning. It addresses the problem of workers who provide wrong, random, or useless labels, reduces the expenditure on crowdsourcing tasks, and achieves high accuracy.
CN107357763B · Active · Publication Date: 2020-08-14 · DALIAN UNIV OF TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: DALIAN UNIV OF TECH
Publication Date: 2020-08-14


Abstract

The invention discloses a quality control method for crowdsourced classification data based on self-paced learning, belonging to the field of data mining in computer science. The method infers the true class in multi-class crowdsourced annotation tasks and identifies malicious workers. First, the credibility of each sample is calculated from the properties of the initial dataset; second, samples are selected; third, the true labels and the workers' abilities are calculated; fourth, further samples are selected according to the updated abilities and true labels; fifth, once all samples have been selected, a further optimization is performed. The method finally yields the true answers to the annotation task, the ability of each worker, and the identification of malicious and passive workers at the same time. Experiments show that the method obtains better results than traditional methods.
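
The loop described in the abstract can be illustrated with a minimal sketch. It is not the patented formulation, whose formulas are not given in this text. It assumes that sample credibility is the ability-weighted share of votes for the leading label, that worker ability is a smoothed accuracy against the current label estimates, and that samples are admitted in equal-sized batches from easiest to hardest. The names `weighted_agreement`, `self_paced_truth_inference`, and `n_rounds` are hypothetical.

```python
from collections import Counter, defaultdict


def weighted_agreement(votes, ability):
    """Return (leading label, its share of the ability-weighted votes)."""
    scores = defaultdict(float)
    for worker, label in votes.items():
        scores[label] += ability[worker]
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())


def self_paced_truth_inference(labels, n_rounds=3):
    """labels maps sample id -> {worker id: class label}.

    Assumed scheme: easy (high-agreement) samples are admitted first,
    and worker ability is re-estimated after every admission round.
    """
    ability = defaultdict(lambda: 0.5)  # neutral prior for unseen workers
    remaining, selected, truth = set(labels), [], {}
    batch = -(-len(labels) // n_rounds)  # ceiling division

    def update():
        # Re-estimate labels of admitted samples, then worker abilities.
        for s in selected:
            truth[s] = weighted_agreement(labels[s], ability)[0]
        hits, total = Counter(), Counter()
        for s in selected:
            for worker, label in labels[s].items():
                total[worker] += 1
                hits[worker] += int(label == truth[s])
        for worker in total:
            # Laplace smoothing keeps every ability strictly in (0, 1).
            ability[worker] = (hits[worker] + 1) / (total[worker] + 2)

    while remaining:
        # Rank unselected samples by credibility under current abilities.
        ranked = sorted(
            remaining,
            key=lambda s: (-weighted_agreement(labels[s], ability)[1], str(s)),
        )
        easy = ranked[:batch]
        selected.extend(easy)
        remaining.difference_update(easy)
        update()

    update()  # one further pass once every sample has been admitted
    return truth, dict(ability)
```

Under these assumptions, a worker whose estimated ability stays well below that of the others (for example, one who always submits the same class) could be flagged as malicious or passive by thresholding the returned ability values; the threshold itself would be a further assumption.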

Description

Technical field

[0001] The invention belongs to the technical field of data mining in computer science, and relates to a method for controlling the quality of crowdsourced classification data based on self-paced learning.

Background technique

[0002] Crowdsourcing (also known as human computation or crowd wisdom) means that companies outsource tasks through an open call to an undefined (generally large) group of people. The underlying belief is that the "wisdom of the majority" is far more accurate than individual judgment. Crowdsourcing platforms distribute tasks to registered workers and then pay them according to the data they label. Data obtained by crowdsourcing is used in a large number of data mining, machine learning, and deep learning tasks, so its quality seriously affects the results of subsequent learning tasks. In the crowdsourcing distribution system, algorithms for...

Claims
