Software defect prediction method based on feature set division and ensemble learning

A software defect prediction and ensemble learning technology, applied in software testing/debugging, genetic rules, genetic models, etc.

Active Publication Date: 2020-07-10
SHANGHAI MARITIME UNIVERSITY

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to provide a software defect prediction method based on feature set division and ensemble learning. On the one hand, the method removes redundant features from the defect prediction data set and reduces the search space of the algorithm, which effectively alleviates the high dimensionality of the features in historical software defect data; on the other hand, it uses ensemble learning to combine the classification results of different base classifiers, which effectively overcomes the low prediction accuracy for defective modules caused by data set imbalance.


Examples


Embodiment Construction

[0046] The present invention will be further elaborated below by describing a preferred specific embodiment in detail in conjunction with the accompanying drawings.

[0047] Figures 1 and 2 together show the framework diagram and the process flow of the software defect prediction method based on feature set division and ensemble learning according to the present invention. The method includes:

[0048] S1. Obtain an original data set D of software defect samples from historical software data, and divide the original data set D into a training data set (TS) and a testing data set (VS).

[0049] The original data set D = {(x_1, y_1), ..., (x_n, y_n)} is a collection of n software defect module samples, where x_n is the metric attribute vector of software module n, and each vector contains m metric attributes (also called metric elements), that is, x_n = (a_1, ..., a_m); y_n ∈ Y represents the category mark of the nth software module, and i...
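
Step S1 can be illustrated with a minimal Python sketch. The patent text shown here does not state how D is partitioned, so the random shuffle, the 30% test ratio, the seed, and the label encoding (1 = defective, 0 = non-defective) are assumptions made only for illustration; the helper name split_dataset and the toy data are hypothetical.

```python
import random
from typing import List, Tuple

# One sample (x, y): x is the metric attribute vector of a module,
# y is its category mark (ASSUMED here: 1 = defective, 0 = non-defective).
Sample = Tuple[List[float], int]


def split_dataset(dataset: List[Sample], test_ratio: float = 0.3,
                  seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle D and split it into a training set TS and a testing set VS.

    The random split and the default ratio are assumptions, not taken
    from the patent text.
    """
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be strictly between 0 and 1")
    rng = random.Random(seed)
    shuffled = list(dataset)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_ratio)))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy data set: n = 10 modules, m = 3 metric attributes each.
D = [([float(i), float(i % 3), float(i * 2)], i % 2) for i in range(10)]
TS, VS = split_dataset(D, test_ratio=0.3, seed=42)
print(len(TS), len(VS))
```

Fixing the seed keeps the split reproducible, which matters when different base classifiers are later compared on the same TS and VS.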


Abstract

The invention discloses a software defect prediction method based on feature set division and ensemble learning. The method comprises the following steps: dividing an original data set into a training data set and a test data set, and dividing the training data set into a plurality of feature subsets; selecting K base classifiers for ensemble learning, and synthesizing an ensemble classifier for each feature subset from the base classifiers and their corresponding weights; selecting the feature subset most similar to the input instance, performing defect prediction on the input instance using the ensemble classifier of that feature subset, and establishing a software defect prediction model; dividing the test data set and searching for the feature subset most similar to the input instance; and searching for optimal values of the centroid set and the weight set, and optimizing the software defect prediction model in combination with the most similar feature subset of the test data set. The method has the advantages that redundant features in the defect prediction data set can be removed, the search space of the algorithm is reduced, and the problem of high feature dimensionality in historical software defect data can be effectively alleviated.
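
The prediction step described in the abstract can be sketched roughly as follows. The abstract does not define "most similar" or the voting rule, so this sketch assumes Euclidean distance to a per-subset centroid and a weighted majority vote; the function names, thresholds, and toy classifiers are all hypothetical, and the search for optimal centroids and weights is not shown.

```python
import math
from typing import Callable, List, Sequence

# A base classifier maps a projected instance to 1 (defective) or 0.
Classifier = Callable[[Sequence[float]], int]


def project(x: Sequence[float], feature_idx: Sequence[int]) -> List[float]:
    """Keep only the attributes of x that belong to one feature subset."""
    return [x[j] for j in feature_idx]


def nearest_subset(x, subsets, centroids) -> int:
    """Index of the feature subset whose centroid is closest to x.

    ASSUMPTION: similarity is Euclidean distance in the subset's own space.
    """
    return min(range(len(subsets)),
               key=lambda k: math.dist(project(x, subsets[k]), centroids[k]))


def weighted_vote(x, classifiers: List[Classifier], weights: List[float]) -> int:
    """Predict 1 if the weight voting 'defective' reaches half the total weight."""
    score = sum(w for clf, w in zip(classifiers, weights) if clf(x) == 1)
    return 1 if score >= 0.5 * sum(weights) else 0


def predict(x, subsets, centroids, ensembles, weight_sets) -> int:
    """Route x to its most similar feature subset, then apply that ensemble."""
    k = nearest_subset(x, subsets, centroids)
    return weighted_vote(project(x, subsets[k]), ensembles[k], weight_sets[k])


# Hypothetical setup: two feature subsets over three metric attributes.
subsets = [[0, 1], [2]]
centroids = [[1.0, 1.0], [10.0]]
ensembles = [
    [lambda v: int(v[0] > 0.5), lambda v: int(v[1] > 5.0), lambda v: int(sum(v) > 2.0)],
    [lambda v: int(v[0] > 20.0)],
]
weight_sets = [[0.5, 0.3, 0.2], [1.0]]

print(predict([1.0, 2.0, 50.0], subsets, centroids, ensembles, weight_sets))
print(predict([0.0, 0.0, 9.0], subsets, centroids, ensembles, weight_sets))
```

In a full implementation the lambdas would be replaced by K trained base classifiers per feature subset, and the centroids and weights would come from the optimization step the abstract mentions.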

Description

Technical field

[0001] The invention relates to the technical field of software defect prediction, in particular to a software defect prediction method based on feature set division and ensemble learning.

Background

[0002] The main purpose of software defect prediction is to use historical software defect information to classify software modules as defective or non-defective, so software defect prediction is essentially a binary classification problem. Defect prediction can effectively identify defective modules, thereby reducing the risks and hazards caused by software defects. At present, many machine learning algorithms have been used to build predictive models. For example, the classification rules generated by the C4.5 decision tree algorithm are easy to understand and fast to learn, and the algorithm is often used as a benchmark for comparison in model building; Naive Bayesian algo...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36, G06K9/62, G06N3/12
CPC: G06F11/3608, G06N3/126, G06F18/241, G06F18/214
Inventor: 李璐璐, 任洪敏, 朱云龙, 卢晓喆
Owner: SHANGHAI MARITIME UNIVERSITY