Method for testing categorical data set

A technique for classifying and testing data sets, applied in the field of multi-label classification, that addresses the problem of low classification accuracy.

Inactive Publication Date: 2015-11-25
CHINA UNIV OF GEOSCIENCES (WUHAN)
Cites: 2, Cited by: 14

AI Technical Summary

Problems solved by technology

[0004] The invention provides a method for testing the classification data ...

Examples

Embodiment Construction

[0053] The core idea of the present invention is as follows. When classifying data, the naive Bayes multi-label classification algorithm ignores the fact that different attributes have different importance for class label selection. To address this, a double-weighted naive Bayes multi-label classification method is proposed for classifying a classification data set. According to the importance of each attribute feature to the decision on each class label in the class label set, the edge between each attribute and each class label is weighted; that is, the weighting is applied doubly, over each attribute feature and each class label.
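The idea of one weight per attribute and class label pair can be sketched as follows. This is a minimal illustration, not the patent's formula: it assumes categorical attributes, one binary decision per label, Laplace smoothing, and a hypothetical weight matrix `weights[j][k]` that scales the log-likelihood of attribute `j` when deciding label `k`. The names `train_counts` and `predict` are made up for this example.

```python
import math
from collections import defaultdict

def train_counts(X, Y, n_labels):
    """Count label priors and attribute-value frequencies per label and label value."""
    n_attrs = len(X[0])
    values = [set(row[j] for row in X) for j in range(n_attrs)]
    prior = [[0, 0] for _ in range(n_labels)]   # prior[k][b] = rows with label k == b
    cond = defaultdict(int)                     # (k, b, j, value) -> count
    for x, y in zip(X, Y):
        for k in range(n_labels):
            b = y[k]
            prior[k][b] += 1
            for j, v in enumerate(x):
                cond[(k, b, j, v)] += 1
    return {"prior": prior, "cond": cond, "values": values, "n": len(X)}

def predict(model, x, weights):
    """Assumed scoring rule: log P(b) + sum_j weights[j][k] * log P(x_j | b)."""
    out = []
    for k in range(len(model["prior"])):
        scores = []
        for b in (0, 1):
            count_b = model["prior"][k][b]
            s = math.log((count_b + 1) / (model["n"] + 2))          # smoothed prior
            for j, v in enumerate(x):
                num = model["cond"].get((k, b, j, v), 0) + 1        # Laplace smoothing
                den = count_b + len(model["values"][j])
                s += weights[j][k] * math.log(num / den)
            scores.append(s)
        out.append(1 if scores[1] > scores[0] else 0)
    return out

# Toy data: label 0 follows attribute 0, label 1 follows attribute 1.
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
Y = [[1, 0], [1, 1], [0, 0], [0, 1]]
model = train_counts(X, Y, n_labels=2)
uniform = [[1.0, 1.0], [1.0, 1.0]]   # all weights 1 reduces to plain naive Bayes
prediction = predict(model, ("sunny", "mild"), uniform)
```

With all weights set to 1 the sketch behaves like an unweighted naive Bayes classifier; lowering `weights[j][k]` reduces the influence of attribute `j` on label `k` only.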

[0054] Specifically, the present invention adopts the niche cultural algorithm to learn and optimize the double weights in the double-weighted naive Bayes multi-label classifier, and obtains the optimal weight combination to be substituted into the current double weighted na...
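To make the notion of "learning the weights" concrete, the sketch below searches over a weight matrix to maximize a fitness function. It deliberately swaps in a plain stochastic hill climber, which is not the niche cultural algorithm named above; the function `hill_climb_weights`, the `[0, 1]` weight range, and the toy fitness are all assumptions made for illustration. In practice the fitness would be a classification accuracy measured on the training set.

```python
import random

def hill_climb_weights(fitness, n_attrs, n_labels, iters=200, step=0.1, seed=0):
    """Perturb one weight at a time and keep the change only if fitness improves."""
    rng = random.Random(seed)
    best = [[1.0] * n_labels for _ in range(n_attrs)]   # start from unweighted
    best_fit = fitness(best)
    for _ in range(iters):
        cand = [row[:] for row in best]
        j = rng.randrange(n_attrs)
        k = rng.randrange(n_labels)
        # Clamp to the assumed [0, 1] weight range.
        cand[j][k] = min(1.0, max(0.0, cand[j][k] + rng.uniform(-step, step)))
        f = fitness(cand)
        if f > best_fit:
            best, best_fit = cand, f
    return best, best_fit

# Toy fitness: closeness to a made-up target matrix (stands in for accuracy).
target = [[1.0, 0.0], [0.0, 1.0]]

def toy_fitness(w):
    return -sum((w[j][k] - target[j][k]) ** 2 for j in range(2) for k in range(2))

start_fit = toy_fitness([[1.0, 1.0], [1.0, 1.0]])
weights, fit = hill_climb_weights(toy_fitness, n_attrs=2, n_labels=2)
```

A population-based method would maintain many candidate weight matrices at once instead of a single one, but the interface (a fitness function in, a weight matrix out) stays the same.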

Abstract

The invention discloses a method for testing a categorical data set. After the categorical data set is obtained, it is standardized using the absolute standard deviation method if standardization is required. The data set is then divided into a training set and a test set. An ecological niche cultural algorithm is used to learn the dual weight values of a dual-weighted naive Bayes multi-label classifier, and training on the training set yields the optimized weight values. The optimized weight values are then applied to the test set for prediction. Compared with the traditional naive Bayes multi-label algorithm, a data training process is added before the categorical data set is predicted. As traditional data classification is improved through a particle swarm optimization algorithm, the improved algorithm can improve the classification accuracy.
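The preprocessing steps in the abstract can be sketched as below. This is a hedged illustration: it assumes that "absolute standard deviation" standardization means centering each numeric column on its mean and dividing by its mean absolute deviation, and the function names, split ratio, and seed are hypothetical.

```python
import random

def standardize_abs_dev(rows):
    """Column-wise z = (x - mean) / mean_absolute_deviation (assumed interpretation)."""
    n = len(rows)
    out = [list(r) for r in rows]
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        mad = sum(abs(v - mean) for v in col) / n
        for i in range(n):
            # A constant column has zero deviation; map it to 0.0.
            out[i][j] = (col[i] - mean) / mad if mad > 0 else 0.0
    return out

def train_test_split(rows, labels, test_fraction=0.3, seed=0):
    """Shuffle indices reproducibly and split rows and labels together."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(round(len(rows) * test_fraction)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([rows[i] for i in train_idx], [labels[i] for i in train_idx],
            [rows[i] for i in test_idx], [labels[i] for i in test_idx])

data = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [4.0, 10.0]]
labels = [[1, 0], [1, 1], [0, 0], [0, 1]]
scaled = standardize_abs_dev(data)
X_train, Y_train, X_test, Y_test = train_test_split(scaled, labels, test_fraction=0.25)
```

The weights would then be learned on `X_train` and `Y_train` only, and the resulting classifier evaluated once on `X_test` and `Y_test`.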

Description

Technical field

[0001] The present application relates to the technical field of multi-label classification, and in particular to a method for testing a classification data set.

Background technique

[0002] Multi-label learning originated in text classification problems, where each document may belong to several predefined topics, such as health and government. Today this type of problem is also widespread in real-life applications: in the field of video search, each audio clip can be assigned different emotional labels, such as "cheerful" and "joyful"; in gene function analysis, a gene may correspond to multiple functional labels, such as "tall" and "fair skin"; in the field of image attribution, an image may belong to several scene labels at the same time, such as "big tree" and "tall building". In short, multi-label classification appears in more and more practical applications, and studying it in greater depth will bring greater benefits to daily life. C...

Application Information

IPC(8): G06F17/30
CPC: G06F16/353
Inventor: 颜雪松
Owner: CHINA UNIV OF GEOSCIENCES (WUHAN)