Data processing method, system, and electronic equipment

A data processing technology, applied in the field of structured data, that addresses problems of existing data augmentation methods, such as merely changing the weight of existing data, poor actual effect, and shuffled feature order, in order to improve the training effect.

Pending Publication Date: 2022-01-07
南京星云数字技术有限公司

AI Technical Summary

Problems solved by technology

One existing approach downsamples classes with many labeled samples or upsamples classes with few labeled samples, but this only changes the weight of the existing data and cannot provide genuinely new data for the model, so the data enhancement effect is poor. Another approach applies small perturbations to the features of existing data to form new data, but it is unsuitable for sensitive features and has no significant impact on insensitive features, so the actual effect is poor. A third approach shuffles the order of the features while the labels remain the same, but it is applicable only to specific scenarios.
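The first limitation can be illustrated with a minimal sketch of random upsampling. The function name `random_oversample`, the `label` key, and the toy rows are hypothetical and not taken from the application; the point is only that the balanced set contains more rows but no new distinct samples.

```python
import random
from collections import Counter


def random_oversample(rows, label_key="label", seed=0):
    """Duplicate minority-class rows until every class matches the largest one."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Pad the class with randomly chosen copies of its own rows.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Toy data: three samples of class 0, one sample of class 1.
rows = [
    {"x": 1.0, "label": 0},
    {"x": 2.0, "label": 0},
    {"x": 3.0, "label": 0},
    {"x": 9.0, "label": 1},
]
balanced = random_oversample(rows)
class_counts = Counter(r["label"] for r in balanced)
distinct = {(r["x"], r["label"]) for r in balanced}
print(class_counts, len(distinct))
```

The class counts become equal, yet the number of distinct samples stays at four: the minority class is reweighted rather than enriched.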



Examples


Embodiment 1

[0064] To realize the data processing method disclosed in this application, this embodiment provides a data processing method applied to a supervised classification model. As shown in Figure 1, taking a binary classification model as an example, applying the data processing method disclosed in this embodiment to enhance the training data includes:

[0065] S100. Acquire the acquisition time point and the label generation time point corresponding to the initial training data;

[0066] S110. According to the acquisition time point and the label generation time point, determine as training data the initial training data whose acquisition time point is earlier than its label generation time point, and generate a training data set to be processed from the determined training data;

[0067] Wherein, the initial training data may be data samples collected from actual business scenario...
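Steps S100 and S110 can be sketched as a simple time-based filter. This is a minimal illustration under assumptions: the `Sample` record, its field names, and the toy values are hypothetical, and the application does not prescribe any particular data layout.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Sample:
    """One initial training sample (hypothetical layout)."""
    features: dict
    label: int
    acquired_at: datetime   # acquisition time point (S100)
    labeled_at: datetime    # label generation time point (S100)


def select_training_data(initial):
    """S110: keep samples acquired strictly before their label was generated."""
    return [s for s in initial if s.acquired_at < s.labeled_at]


initial = [
    Sample({"balance": 120.0}, 1, datetime(2021, 1, 1), datetime(2021, 2, 1)),
    Sample({"balance": 80.0}, 0, datetime(2021, 3, 1), datetime(2021, 2, 15)),
]
pending = select_training_data(initial)
print(len(pending))
```

Only the first sample is kept here, because the second was acquired after its label was generated.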

Embodiment 2

[0145] Corresponding to the above embodiment, the present application provides a data processing method. As shown in Figure 2, the method includes:

[0146] 1000. Acquire a training data set to be processed, where the training data set to be processed includes training data;

[0147] Preferably, before the acquisition of the training data set to be processed, the data processing method further includes:

[0148] 1100. Obtain the acquisition time point and label generation time point corresponding to the initial training data;

[0149] 1120. According to the acquisition time point and the label generation time point corresponding to the initial training data, determine as the training data the initial training data whose acquisition time point is earlier than its label generation time point;

[0150] 1130. Generate the training data set to be processed according to the training data.

[0151] 2000. Generate target trainin...

Embodiment 3

[0175] Corresponding to Embodiment 1 and Embodiment 2, the present application provides a data processing system. As shown in Figure 3, the system includes:

[0176] An acquisition module 310, configured to acquire a training data set to be processed, the training data set to be processed includes training data;

[0177] A generating module 320, configured to generate target training data corresponding to the training data to be processed according to the target code corresponding to the training data and the acquisition time point;

[0178] The generating module 320 is further configured to determine the training data to be processed included in the training data set to be processed, and to generate the incremental time point corresponding to the training data to be processed according to the acquisition time point of the training data to be processed and the corresponding time processing rule;

[0179] The generating module 320 is further configured to gene...



Abstract

The invention provides a data processing method comprising the following steps: obtaining a training data set to be processed, which comprises training data; generating target training data corresponding to the training data according to the target code corresponding to the training data and the acquisition time point; determining the training data to be processed contained in the training data set to be processed, and generating an incremental time point corresponding to the training data to be processed according to the acquisition time point of the training data to be processed and a corresponding time processing rule; generating incremental training data corresponding to the training data to be processed according to the incremental time point, the training data to be processed, and the corresponding target code; and generating a target training data set according to the incremental training data and the target training data, so as to train the model to be trained using the target training data set, thereby achieving data enhancement based on newly added labels. The method and system can be applied to structured data, and the target training data obtained can improve the training effect of the model.
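The overall flow in the abstract can be sketched as follows. The excerpt does not define the target code, the selection criterion, or the time processing rule, so everything concrete here is an assumption made for illustration: the target code is treated as an integer label encoding, the selection is a caller-supplied predicate, the time processing rule is a caller-supplied function on the acquisition time, and an incremental record simply copies the original features. The names `Record` and `build_target_dataset` are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Record:
    features: Tuple[float, ...]
    target_code: int        # assumed: integer encoding of the label
    acquired_at: datetime   # acquisition time point


def build_target_dataset(
    dataset: List[Record],
    select: Callable[[Record], bool],
    time_rule: Callable[[datetime], datetime],
) -> List[Record]:
    """Combine target training data with time-shifted incremental data."""
    # Assumption: the target training data are the records themselves,
    # each already carrying its target code and acquisition time point.
    target = list(dataset)
    incremental = []
    for rec in dataset:
        if not select(rec):
            continue  # not part of the training data to be processed
        # Incremental time point from the acquisition time and the rule.
        t_inc = time_rule(rec.acquired_at)
        # Assumption: the incremental record keeps features and target code.
        incremental.append(replace(rec, acquired_at=t_inc))
    return target + incremental


dataset = [
    Record((1.0, 0.5), 1, datetime(2021, 1, 1)),
    Record((0.2, 0.9), 0, datetime(2021, 1, 2)),
    Record((0.7, 0.1), 0, datetime(2021, 1, 3)),
]
augmented = build_target_dataset(
    dataset,
    select=lambda r: r.target_code == 1,          # hypothetical criterion
    time_rule=lambda t: t + timedelta(days=1),    # hypothetical rule
)
print(len(augmented))
```

In this toy run one record is selected, so the target training data set holds the three original records plus one incremental record dated one day later. A real implementation would follow the rules given in the full description, which this excerpt truncates.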

Description

Technical field

[0001] The invention relates to the field of structured data, and in particular to a data processing method, system, and electronic equipment.

Background

[0002] Data augmentation has been widely used in the field of unstructured data, but in the field of structured data there are few effective methods for data augmentation. The fundamental difficulty lies in the inability to verify the accuracy of the labels of newly added data.

[0003] In the prior art, there are some data augmentation methods applied to structured data. One approach downsamples classes with many labeled samples or upsamples classes with few labeled samples, but this only changes the weight of the existing data and cannot provide genuinely new data for the model, so the data enhancement effect is poor. Another approach applies small perturbations to the features of existing data to form new data, but this method is not suitable for ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06T3/40
CPC: G06T3/4007, G06F18/241, G06F18/214
Inventor: 孙鹏, 吴贞书, 张硕
Owner: 南京星云数字技术有限公司