Data screening method and device, storage medium and electronic equipment

A data screening technique for labeled data samples, applied in the field of machine learning, which addresses the impact of labeling noise on machine learning.

Pending Publication Date: 2020-04-21
OPPO CHONGQING INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

However, when data samples are labeled manually, labeling noise will inevitably be generated, and this labeling noise has an impact on machine learning.



Embodiment Construction

[0026] It should be noted that the principles of the present application are implemented in an appropriate computing environment for illustration. The following description is based on illustrated specific embodiments of the present application, which should not be construed as limiting other specific embodiments of the present application that are not described in detail here.

[0027] Artificial Intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the nature of intelligence and produce a new kind of intelligent machine that can respond in a similar way to human intelligence. Artificial intelligence is to study the design principles...


Abstract

The embodiments of the invention disclose a data screening method and device, a storage medium, and electronic equipment. The method comprises: obtaining the sample identification of each data sample labeled with a category, and obtaining the identification vector corresponding to that sample identification; clustering the identification vectors, using the number of labeled categories as the number of clusters; for each cluster, obtaining the similarity between the cluster-center identification vector and each non-center identification vector; determining, in each cluster, the target non-center identification vectors whose similarity to the cluster-center identification vector does not reach a preset similarity, and judging the labeled category of the data samples they represent to be labeling noise; and finally, filtering out the data samples corresponding to the target non-center identification vectors in each cluster. This improves the labeling quality of the data samples and provides high-quality data samples for machine learning.
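The screening steps in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the visible text does not name the clustering algorithm, the similarity measure, or how identification vectors are produced. The sketch therefore assumes plain k-means with a naive initialization, cosine similarity, the cluster centroid standing in for the "cluster-center identification vector", and identification vectors supplied ready-made in a dictionary. The names `screen`, `kmeans`, `cosine`, and `min_similarity` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def sq_dist(u, v):
    """Squared Euclidean distance, used only for cluster assignment."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(vectors, k, iters=20):
    """Plain k-means. Assumption: centers start at the first k vectors."""
    centers = [list(v) for v in vectors[:k]]
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: sq_dist(v, centers[c]))
                      for v in vectors]
        if new_assign == assign:  # converged: centers already match assign
            break
        assign = new_assign
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign

def screen(id_vectors, labels, min_similarity=0.95):
    """Split sample ids into (kept, noisy) following the abstract's steps."""
    ids = list(id_vectors)
    vectors = [id_vectors[i] for i in ids]
    # The number of clusters equals the number of labeled categories.
    k = len({labels[i] for i in ids})
    centers, assign = kmeans(vectors, k)
    kept, noisy = [], []
    for sample_id, vec, c in zip(ids, vectors, assign):
        # A vector whose similarity to its cluster center does not reach
        # the preset similarity is treated as labeling noise.
        if cosine(vec, centers[c]) >= min_similarity:
            kept.append(sample_id)
        else:
            noisy.append(sample_id)
    return kept, noisy

# Toy data (invented for illustration): "x" sits between the two groups.
id_vectors = {
    "a1": [1.0, 0.0], "b1": [0.0, 1.0],
    "a2": [0.9, 0.1], "b2": [0.1, 0.9],
    "a3": [1.0, 0.1], "b3": [0.0, 0.9],
    "x":  [0.6, 0.4],
}
labels = {"a1": "cat", "a2": "cat", "a3": "cat",
          "b1": "dog", "b2": "dog", "b3": "dog", "x": "dog"}

kept, noisy = screen(id_vectors, labels)
print(noisy)  # ['x']
```

In this toy run, "x" is assigned to the same cluster as the "a" samples, but its cosine similarity to that cluster's centroid is about 0.91, below the assumed threshold of 0.95, so it is filtered out. The threshold is a free parameter here; the source only calls it a "preset similarity".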

Description

Technical field

[0001] The present application relates to the technical field of machine learning, and specifically to a data screening method, device, storage medium and electronic equipment.

Background

[0002] At present, in the field of machine learning, data samples are usually labeled manually, so that machine learning is performed based on the data samples and the corresponding identification data to obtain corresponding functional models. However, when data samples are labeled manually, labeling noise will inevitably be generated, and this labeling noise has an impact on machine learning.

Summary of the invention

[0003] The embodiments of the present application provide a data screening method, device, storage medium and electronic equipment, which can improve the labeling quality of data samples.

[0004] In a first aspect, an embodiment of the present application provides a data screening method, including:

[0005] Obtain the sample...


Application Information

IPC(8): G06K9/62, G06N20/00
CPC: G06N20/00, G06F18/23, G06F18/22, G06F18/214, G06F18/10
Inventor: 郭子亮
Owner: OPPO CHONGQING INTELLIGENT TECH CO LTD