Software defect prediction method based on class imbalance learning algorithm

A software defect prediction technology based on a learning algorithm, applied in neural learning methods, ensemble learning, computer components, etc. It addresses the class imbalance problem, avoids subjectivity, and reduces costs.

Pending Publication Date: 2021-03-09

HANGZHOU DIANZI UNIV


Problems solved by technology

In the sample weight adjustment stage, an adaptive cost matrix adjustment strategy assigns different misclassification costs to majority-class and minority-class samples through the cost matrix, which addresses the class imbalance problem at the algorithm level.
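As a rough illustration of how a cost matrix can translate into per-sample weights, here is a minimal sketch. The function name, the 0/1 label encoding, and the 4:1 cost ratio are all hypothetical; the source does not give the actual cost values or the exact weighting formula.

```python
def cost_weighted_sample_weights(labels, cost_matrix):
    """Return normalized sample weights scaled by misclassification cost.

    labels: list of 0 (majority, non-defective) or 1 (minority, defective).
    cost_matrix[true][pred]: cost of predicting `pred` for a sample of class `true`.
    """
    # The relevant cost for a sample is the cost of predicting the other class.
    raw = [cost_matrix[y][1 - y] for y in labels]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical cost matrix: missing a defect costs 4x as much as a false alarm.
cost_matrix = [[0.0, 1.0],
               [4.0, 0.0]]
labels = [0, 0, 0, 0, 1]
weights = cost_weighted_sample_weights(labels, cost_matrix)
```

With these assumed costs, the single minority sample carries as much total weight as the four majority samples combined, so a learner that minimizes weighted error can no longer ignore it.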

Embodiment Construction

[0041] The present invention is described in detail below with reference to Figure 2 and Figure 3, using the NASA defect prediction data set and the AEEEM defect prediction data set. The overall process of the invention is shown in Figure 1, and the specific steps are as follows:

[0042] Step 1. Use the SWIM oversampling method to synthesize minority class samples, and then combine the generated minority class samples with the original data to obtain a data set with a low imbalance rate.
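The idea behind Step 1 can be sketched as follows. This is a simplified, dependency-free approximation, not the patent's implementation: SWIM as usually described places synthetic minority samples at the same Mahalanobis distance from the majority class as an existing minority sample, using the full majority covariance. The sketch below assumes a diagonal covariance (per-feature standard deviation only), and the function names, the `alpha` spread parameter, and the toy data are hypothetical.

```python
import math
import random


def majority_stats(majority):
    """Per-feature mean and standard deviation of the majority class."""
    dims = range(len(majority[0]))
    mean = [sum(row[d] for row in majority) / len(majority) for d in dims]
    std = [math.sqrt(sum((row[d] - mean[d]) ** 2 for row in majority)
                     / len(majority)) or 1.0 for d in dims]
    return mean, std


def swim_like_oversample(majority, minority, n_new, alpha=0.25, seed=0):
    """Generate n_new synthetic minority samples (diagonal-covariance sketch)."""
    rng = random.Random(seed)
    dims = range(len(majority[0]))
    mean, std = majority_stats(majority)
    # Standardize minority samples with the majority statistics.
    z_min = [[(row[d] - mean[d]) / std[d] for d in dims] for row in minority]
    z_mean = [sum(z[d] for z in z_min) / len(z_min) for d in dims]
    z_std = [math.sqrt(sum((z[d] - z_mean[d]) ** 2 for z in z_min) / len(z_min))
             for d in dims]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(z_min)
        # Jitter each feature within a band set by the minority spread.
        cand = [base[d] + rng.uniform(-alpha * z_std[d], alpha * z_std[d])
                for d in dims]
        # Rescale so the candidate keeps the base sample's distance
        # from the majority mean in standardized space.
        base_norm = math.sqrt(sum(v * v for v in base))
        cand_norm = math.sqrt(sum(v * v for v in cand)) or 1.0
        scaled = [v * base_norm / cand_norm for v in cand]
        # Map back to the original feature space.
        synthetic.append([scaled[d] * std[d] + mean[d] for d in dims])
    return synthetic


majority = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [2.0, 4.0], [4.0, 2.0], [3.0, 1.0]]
minority = [[6.0, 6.0], [7.0, 5.0]]
synthetic = swim_like_oversample(majority, minority, n_new=6)
augmented_minority = minority + synthetic
```

Because each synthetic point sits on the same distance contour around the majority class as a real minority sample, the new samples do not drift into dense majority regions, which is the motivation usually given for this family of oversamplers.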

[0043] Step 2. Use ten-fold cross-validation to divide the data set from Step 1 into a training set and a test set, which are used to train the model and to test its prediction accuracy, respectively. Then use ten-fold cross-validation again to divide the training set into a training set and a validation set; the validation set is used to calculate the most suitable minority-class misclassification cost for the current data set.
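The two-level split in Step 2 can be sketched as below. This is a minimal illustration with hypothetical function names; it shuffles indices without stratifying by class, whereas a stratified split is a common choice for imbalanced data and may be what the authors used.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Shuffle `indices` and split them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_splits(n_samples, k=10, seed=0):
    """Yield (train, validation, test) index lists for each outer/inner fold pair."""
    outer = k_fold_indices(range(n_samples), k, seed)
    for i, test in enumerate(outer):
        # First division: everything outside the held-out fold is training data.
        outer_train = [idx for j, fold in enumerate(outer) if j != i for idx in fold]
        # Second division: split that training data again into train/validation.
        inner = k_fold_indices(outer_train, k, seed + 1 + i)
        for m, validation in enumerate(inner):
            train = [idx for j, fold in enumerate(inner) if j != m for idx in fold]
            yield train, validation, test


splits = list(nested_splits(n_samples=200))
```

The validation folds would be used only to choose the minority-class misclassification cost, and the test fold is touched only when measuring the final prediction accuracy.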

[0044] Step 3. Use the training set obtained from the second division ...

Abstract

The invention relates to a software defect prediction method based on a class imbalance learning algorithm. The method first synthesizes minority-class samples with the SWIM oversampling method, so that the data set changes from highly imbalanced to moderately imbalanced. It then uses the proposed adaptive cost matrix adjustment strategy to calculate the minority-class misclassification cost best suited to the current data set, and trains K weak classifiers on the training set to improve classification accuracy on the data set. During training, the sample weights are adjusted continuously: the weights of wrongly predicted samples are increased and the weights of correctly predicted samples are reduced. Finally, the K weak classifiers are combined into a composite classifier that predicts the category of the sample under test. The method addresses the low prediction accuracy for minority-class samples on imbalanced data sets, so defective modules can be predicted accurately, which helps test managers find software defects and reduces software development cost.
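The reweight-and-combine loop described in the abstract resembles boosting. The sketch below uses the standard AdaBoost update with one-dimensional decision stumps as the weak classifiers; this is an assumption made for illustration, since the abstract does not give the patent's exact update rule or say how the cost matrix enters it. All function names and the toy data are hypothetical. Labels are +1 (defective) and -1 (non-defective).

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: predict `polarity` at or above the threshold."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def update_weights(weights, ys, preds, alpha):
    """Raise weights of wrongly predicted samples, lower the rest, renormalize."""
    raw = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
    total = sum(raw)
    return [w / total for w in raw]


def adaboost(xs, ys, k):
    """Train k weak classifiers; return a list of (alpha, threshold, polarity)."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(k):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        preds = [stump_predict(x, threshold, polarity) for x in xs]
        weights = update_weights(weights, ys, preds, alpha)
    return ensemble


def predict(ensemble, x):
    """Composite classifier: sign of the alpha-weighted vote."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, k=3)
```

A cost-sensitive variant would typically scale the exponent or the weight by each sample's misclassification cost, but which variant the patent uses is not stated in the visible text.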

Description

Technical field

[0001] The present invention is a learning method for class-imbalanced data sets. It aims to find defect samples in defect data sets, which helps testers locate defects and allocate test resources more effectively, thereby reducing the cost of software testing. Specifically, it relates to a software defect prediction method based on a class imbalance learning algorithm.

Background

[0002] In software defect prediction, data sets have a natural class imbalance problem: in a given data set, the number of instances of the "defective" class is much smaller than the number of instances of the "non-defective" class. However, the defective class is the most important one, and the ultimate goal of the classifier is to correctly predict as many defective-class samples as possible. Due to under-representation of defect classes, classification techniques give less weight to in...

Application Information

Patent Type & Authority: Applications (China)

IPC (8): G06K9/62, G06N20/20, G06N3/08

CPC: G06N20/20, G06N3/08, G06F18/2453, G06F18/2415, G06F18/214

Inventors: 王兴起, 郑建明, 魏丹, 陈滨

Owner: HANGZHOU DIANZI UNIV
