Training data processing method and device and storage medium

A training data processing method, applied in character and pattern recognition, instrumentation, computing, etc. It addresses problems such as redundant training data, overlooked data repetition, and reduced validity of training data, so as to improve the effectiveness and balance of the training data and the generalization ability and performance of related models.

Status: Pending
Publication Date: 2022-04-12
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

Training data is usually accumulated by having technical staff label sample data according to their understanding and experience of the business. Because this approach depends on the staff's knowledge, it can easily produce redundant training data.
In addition, sample data often contains a large amount of highly similar data, and manual annotation ignores this repetition, which reduces the effectiveness of the training data and degrades the training results and application performance of related models.

Embodiment Construction

[0038] The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in those embodiments. The described embodiments are evidently only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in the present application without creative effort fall within the protection scope of the present application.

[0039] It should be noted that the terms "first" and "second" in the description and claims of the present application and in the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances such that the embodiments of the application described herein can be practiced in sequences other than those illustrated or des...

Abstract

The invention provides a training data processing method, device and storage medium. It relates to the technical field of artificial intelligence and can be applied to scenarios such as cloud technology, artificial intelligence, intelligent traffic and assisted driving. The method comprises the following steps: obtaining a pre-trained regression model and an initial training set; using the pre-trained regression model to obtain a clustering result for each piece of sample data in the candidate data set; performing update-constrained training on the pre-trained regression model based on the initial training set and the second loss function to obtain an intermediate model, and using the intermediate model to obtain the prediction confidence of each piece of sample data in the first difference set; sampling the sample data in the first difference set based on the prediction confidence and the clustering result to obtain an incremental training set; updating the initial training set with the incremental training set; and iterating in a loop on the updated initial training set, taking the initial training set obtained when the iteration end condition is satisfied as the target training set. The method improves the effectiveness of the training data.
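
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation, and several details are assumptions because the text above does not specify them: the "first difference set" is taken to be the candidate samples not yet in the training set; the sampling rule picks the lowest-confidence samples from each cluster; and the iteration ends when the difference set is empty or a round limit is reached. The names `select_increment`, `build_target_training_set` and the caller-supplied `train_and_score` callable (which stands in for training the intermediate model and scoring the difference set) are hypothetical.

```python
from collections import defaultdict


def select_increment(confidence, cluster_of, per_cluster):
    """Pick up to `per_cluster` lowest-confidence samples from each cluster.

    Assumption: low confidence marks a sample as informative, and drawing
    from every cluster keeps the selection balanced.
    """
    by_cluster = defaultdict(list)
    for sample_id, conf in confidence.items():
        by_cluster[cluster_of[sample_id]].append((conf, sample_id))
    chosen = set()
    for members in by_cluster.values():
        ranked = sorted(members, key=lambda m: m[0])  # lowest confidence first
        chosen.update(sample_id for _, sample_id in ranked[:per_cluster])
    return chosen


def build_target_training_set(candidate_ids, initial_ids, cluster_of,
                              train_and_score, per_cluster=1, max_rounds=10):
    """Grow the training set round by round and return the final set.

    `train_and_score(training, difference)` is a hypothetical callable: it
    trains an intermediate model on `training` and returns a dict mapping
    each id in `difference` to a prediction confidence.
    """
    training = set(initial_ids)
    for _ in range(max_rounds):
        # Assumed meaning of the difference set: candidates not yet selected.
        difference = set(candidate_ids) - training
        if not difference:
            break
        confidence = train_and_score(training, difference)
        increment = select_increment(confidence, cluster_of, per_cluster)
        if not increment:
            break
        training |= increment  # update the training set with the increment
    return training


if __name__ == "__main__":
    clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

    def toy_scorer(training, difference):
        # Toy stand-in for a real model: confidence grows with the id.
        return {i: i / 10 for i in difference}

    print(sorted(build_target_training_set(range(6), {0}, clusters,
                                           toy_scorer, max_rounds=1)))
```

In a real setting the cluster assignments would come from features produced by the pre-trained model, and `train_and_score` would wrap the actual training and inference code.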

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, and in particular to a training data processing method, device and storage medium.

Background

[0002] A key issue that deep learning schemes need to address is data availability. Training data is usually accumulated by having technicians label sample data according to their understanding and experience of the business. Because this approach depends on the technicians' knowledge, it can easily produce redundant training data. In addition, sample data often contains a large amount of highly similar data, and manual annotation ignores this repetition, which reduces the effectiveness of the training data and degrades the training results and application performance of related models. Therefore, it is necessary to provide a training data processing method capable of efficiently screening effecti...

Application Information

IPC(8): G06V40/16; G06V10/74; G06V10/762; G06V10/774; G06K9/62
Inventor: 康洋
Owner: TENCENT TECH (SHENZHEN) CO LTD