Data classification method, device, electronic device and computer readable medium

A data classification technology, applied in the field of data processing, that addresses problems such as degraded performance on majority-class samples, models that are hard to generalize, and poor coverage and accuracy.
CN107169518A · Inactive · Publication Date: 2017-09-15 · JINGDONG TECH HLDG CO LTD

Patent Information

Authority / Receiving Office: CN · China
Current Assignee / Owner: JINGDONG TECH HLDG CO LTD
Publication Date: 2017-09-15
Estimated Expiration: Not applicable · inactive patent


Abstract

The disclosure provides a data classification method, device, electronic device, and computer-readable medium. The data classification method includes: modeling the full training data with a machine learning method to obtain an original model, where the full training data contains minority-class samples; screening new training data out of the full training data based on a minority-class proportion threshold, which is a critical value for the proportion of minority-class samples in the full training data; modeling the new training data with the machine learning method to obtain a new training model; applying the original model and the new training model to the new training data to obtain an original classification result and a new training classification result; and comparing the accuracy of the two results and taking the more accurate one as the final classification result. Because the model is retrained on new training data with a higher proportion of minority-class samples and the original model's result is updated accordingly, the accuracy of sample classification improves.
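The steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The abstract does not say how the screening is performed, so the hypothetical `screen` function simply undersamples the majority class until the minority proportion reaches the threshold. `train_stump` is a stand-in one-feature threshold learner, label 1 is assumed to mark the minority class, and ties keep the original model. All function names are invented for this example.

```python
import random
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, int]      # (feature, label); label 1 = minority class (assumption)
Model = Callable[[float], int]


def train_stump(data: List[Sample]) -> Model:
    """Stand-in learner: a 1-D cut point chosen to minimise training errors."""
    xs = sorted({x for x, _ in data})
    # Candidate cuts: below all points, between neighbours, above all points.
    cuts = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    best_cut = min(cuts, key=lambda c: sum((x > c) != bool(y) for x, y in data))
    return lambda x: int(x > best_cut)


def accuracy(model: Model, data: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(model(x) == y for x, y in data) / len(data)


def screen(data: List[Sample], threshold: float, rng: random.Random) -> List[Sample]:
    """Hypothetical screening step (assumption): undersample the majority
    class until the minority proportion is at least `threshold`."""
    minority = [s for s in data if s[1] == 1]
    majority = [s for s in data if s[1] == 0]
    if not minority or len(minority) / len(data) >= threshold:
        return list(data)
    # Largest majority count m such that n_min / (n_min + m) >= threshold.
    keep = int(len(minority) * (1 - threshold) / threshold)
    return minority + rng.sample(majority, min(keep, len(majority)))


def classify(full_data: List[Sample], threshold: float, seed: int = 0) -> Dict[str, object]:
    rng = random.Random(seed)
    original = train_stump(full_data)          # model on the full training data
    new_data = screen(full_data, threshold, rng)
    retrained = train_stump(new_data)          # model on the screened data
    acc_original = accuracy(original, new_data)
    acc_retrained = accuracy(retrained, new_data)
    # Keep whichever model is more accurate on the new training data
    # (ties keep the original model; this tie rule is an assumption).
    use_new = acc_retrained > acc_original
    best = retrained if use_new else original
    return {
        "new_data": new_data,
        "acc_original": acc_original,
        "acc_retrained": acc_retrained,
        "chosen": "retrained" if use_new else "original",
        "accuracy": max(acc_original, acc_retrained),
        "predictions": [best(x) for x, _ in new_data],
    }


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy imbalanced data: 90 majority samples, 10 minority samples.
    toy = [(rng.gauss(0.0, 1.0), 0) for _ in range(90)]
    toy += [(rng.gauss(1.5, 1.0), 1) for _ in range(10)]
    result = classify(toy, threshold=0.5)
    print(result["chosen"], result["acc_original"], result["acc_retrained"])
```

Both models are scored on the screened data because the abstract applies them to the new training data. In practice a held-out set would give a less optimistic comparison, but that goes beyond what the abstract describes.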

Description

Technical field

[0001] The present disclosure generally relates to the technical field of data processing, and in particular, relates to a data classification method, device, electronic device, and computer-readable medium.

Background

[0002] Machine learning is now widely used to classify samples. Commonly used algorithm models include logistic regression, decision trees, random forests, support vector machines, and neural networks. When training most of these algorithms, it is generally assumed that the classes in the training samples are roughly balanced in size and that a prediction error costs the same for every class. When the class sizes in the sample data do not differ much, machine learning usually achieves good classification results. In practice, however, the requirement of balanced sample data is often not met, and the data volume of each class may have a la...

Claims
