Data processing method, risk identification method, computer equipment and storage medium

A data processing technology, applied in the field of data processing, that addresses problems such as low recognition coverage and model overfitting when a sample set is too small to fit the data well, so as to improve coverage, efficiency, and recognition effect.

Pending Publication Date: 2020-12-01
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0004] The applicant has found through research that if the number of samples is too small, the hypothesis becomes overly strict in order to remain consistent with the training data; that is, one hypothesis fits the training data better than other hypotheses, but the assumptions outside ...

Examples

Example 1

[0225] The embodiments of the present application disclose a data processing method and device. Example 1 includes a data processing method, comprising:

[0226] obtaining a first sample set for training a first recognition model, and at least one second sample set, wherein the at least one second sample set is used for training a second recognition model;

[0227] determining that the similarity data between the first sample set and the at least one second sample set meets a preset requirement;

[0228] combining the first sample set and the at least one second sample set to obtain a third sample set, which replaces the first sample set as the input for training the first recognition model.
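The three steps of Example 1 can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how similarity is measured or what the preset requirement is, so the centroid-based `centroid_similarity` function, the `>= threshold` test, and all names here are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, ...]
SampleSet = Sequence[Sample]


def centroid(samples: SampleSet) -> Tuple[float, ...]:
    """Mean vector of a non-empty sample set."""
    n = len(samples)
    return tuple(sum(column) / n for column in zip(*samples))


def centroid_similarity(a: SampleSet, b: SampleSet) -> float:
    """Placeholder similarity (an assumption, not from the source):
    1.0 when the centroids coincide, approaching 0 as they move apart."""
    return 1.0 / (1.0 + math.dist(centroid(a), centroid(b)))


def build_third_sample_set(
    first_set: SampleSet,
    second_sets: Sequence[SampleSet],
    similarity: Callable[[SampleSet, SampleSet], float],
    threshold: float,
) -> List[Sample]:
    """Merge into the first set every second set whose similarity to it
    meets the (assumed) preset requirement `similarity >= threshold`."""
    third_set = list(first_set)
    for second_set in second_sets:
        if similarity(first_set, second_set) >= threshold:
            third_set.extend(second_set)
    return third_set


first = [(0.0, 0.0), (1.0, 1.0)]
near = [(0.5, 0.5)]        # close to the first set, so it is merged
far = [(100.0, 100.0)]     # far from the first set, so it is skipped
third = build_third_sample_set(first, [near, far], centroid_similarity, threshold=0.5)
```

The resulting `third` would then be passed to the training routine of the first recognition model in place of `first`.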

Example 2

[0229] Example 2 may include the method described in Example 1, wherein determining that the similarity data between the first sample set and the at least one second sample set meets the preset requirement includes:

[0230] clustering the first samples in the first sample set and the second samples in the second sample set;

[0231] determining the similarity data between the first sample set and the second sample set according to the clustering result;

[0232] determining that the similarity data meets the preset requirement.
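The clustering step of Example 2 could look like the sketch below. The source does not name a clustering algorithm; a minimal k-means with a deterministic initialisation is swapped in here purely for illustration, and `kmeans_labels`, `cluster_jointly`, and the parameter `k` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def kmeans_labels(points: Sequence[Point], k: int, iterations: int = 20) -> List[int]:
    """Minimal k-means; the first k points seed the centroids (assumed choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_jointly(
    first_set: Sequence[Point], second_set: Sequence[Point], k: int
) -> Tuple[List[int], List[int]]:
    """Cluster both sample sets together and return the labels of each set."""
    labels = kmeans_labels(list(first_set) + list(second_set), k)
    n_first = len(first_set)
    return labels[:n_first], labels[n_first:]


first_set = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
second_set = [(0.5, 0.4), (0.0, 0.2)]
first_labels, second_labels = cluster_jointly(first_set, second_set, k=2)
```

Clustering the two sets in one pass is what makes the labels comparable: a first sample and a second sample with the same label fell into the same cluster.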

Example 3

[0233] Example 3 may include the method described in Example 1 and/or Example 2, wherein determining the similarity data between the first sample set and the second sample set according to the clustering result includes:

[0234] calculating, as the similarity data, the ratio of the number of first samples that belong to the same cluster as a second sample in the clustering result to the total number of first samples.
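The ratio in [0234] can be computed from cluster labels roughly as follows. The sketch assumes one reading of the text: a first sample counts if its cluster contains at least one second sample. The function name and the label lists are hypothetical.

```python
from typing import Hashable, Sequence


def cluster_overlap_ratio(
    first_labels: Sequence[Hashable], second_labels: Sequence[Hashable]
) -> float:
    """Share of first samples whose cluster also contains a second sample."""
    if not first_labels:
        raise ValueError("the first sample set must not be empty")
    clusters_with_second = set(second_labels)
    matched = sum(1 for label in first_labels if label in clusters_with_second)
    return matched / len(first_labels)


# Four first samples in clusters 0, 0, 1, 1; both second samples in cluster 0.
ratio = cluster_overlap_ratio([0, 0, 1, 1], [0, 0])  # 2 of 4 first samples match -> 0.5
```

Under this reading, a ratio near 1 means the second samples cover almost every region occupied by the first samples, and the preset requirement could then be a minimum value for the ratio.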

Abstract

The embodiment of the invention discloses a data processing method. The method comprises: acquiring a first sample set used for training a first recognition model and at least one second sample set, the at least one second sample set being used for training a second recognition model; determining that similarity data between the first sample set and the at least one second sample set meets preset requirements; and combining the first sample set and the at least one second sample set to obtain a third sample set that replaces the first sample set. According to the method, the third sample set is used as the input for training the first recognition model, so the samples for training the first recognition model are supplemented, the over-fitting problem of the first recognition model caused by too few samples in the first sample set is avoided, and the recognition coverage rate and recognition effect of the first recognition model are improved.

Description

Technical field

[0001] The present application relates to the technical field of data processing, and in particular to a data processing method, a risk identification method, a computer device, and a computer-readable storage medium.

Background

[0002] With the development of computer technology, technologies such as artificial intelligence and machine learning are increasingly being applied in practice. Machine learning is a method in which a computer uses existing data to obtain a model through training, and then uses the model to make inferences about new instances. The training process therefore requires historical sample data, and the sample data has a great impact on the prediction effect of the final model.

[0003] In practice, for some businesses, historically existing samples are relatively scarce. For example, in e-commerce platforms, in order to prevent and control risks such as counterfeit goods, contraband, and fraud, and ensure busin...

Application Information

IPC(8): G06K9/62, G06Q10/06
CPC: G06Q10/0635, G06F18/22, G06F18/214
Inventor: 俞飞江, 王榕, 朱成生, 高阳, 姜喆
Owner: ALIBABA GRP HLDG LTD