Software defect prediction method based on genetic algorithm and random forest

A software defect prediction technique that combines a genetic algorithm with a random forest. It addresses the poor performance of machine learning algorithms on software defect data sets that differ widely and have unbalanced classes, with the effects of avoiding incompleteness and inconsistency in the data, reducing dimensionality, and reducing data size.

Status: Inactive
Publication Date: 2019-07-05
YANSHAN UNIV
Cites: 3, Cited by: 28

AI Technical Summary

Problems solved by technology

[0005] Because software defect data sets differ widely and their classes are extremely unbalanced, machine learning algorithms perform poorly under different evaluation metrics. At the same time, many existing techniques do not handle class imbalance well, so software defect prediction performance is poor.

Examples

Embodiment Construction

[0053] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them.

[0054] The present invention addresses software defect prediction; the detailed algorithm framework and flowchart are shown in Figure 1:

[0055] A software defect prediction method based on genetic algorithm and random forest, including the following steps:

[0056] Step S1: Preprocess the software defect data set through a data reduction method, including operations such as filling in missing values;

[0057] Step S2: Use a combination of a genetic algorithm and a random forest for defect feature selection and accuracy...
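
The missing-value handling mentioned in Step S1 could look roughly like the sketch below. The text available here does not say which filling strategy the method uses, so column-mean imputation is an assumption, and the function name `impute_column_means` and the sample metric rows are hypothetical.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Assumption: a column with no observed values falls back to 0.0.
        col_means.append(mean(observed) if observed else 0.0)
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical metric rows: (lines of code, cyclomatic complexity).
metrics = [[10.0, 2.0], [None, 4.0], [30.0, None]]
cleaned = impute_column_means(metrics)
print(cleaned)
```

Mean imputation keeps every module in the data set, which matters when defective modules are already rare; a median or model-based fill would slot into the same place.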

Abstract

The invention discloses a software defect prediction method based on a genetic algorithm and a random forest. The method comprises the following steps: preprocessing each subset of a software defect data set; performing feature selection based on the genetic algorithm and the random forest algorithm; constructing a random forest classifier; and carrying out software defect prediction. In the prediction step, the random forest classifier is trained on the processed software defect data set, a classifier with a better classification effect is obtained through repeated experiments, and the processed software defect test set is then input into the trained classifier to obtain the classification result for the test set. The method is well suited to software defect data sets that differ widely and are class-imbalanced. Combining the genetic algorithm and the random forest algorithm for feature selection achieves a very good dimensionality reduction effect. An ensemble algorithm based on decision trees lets each tree learn and predict independently, and the individual predictions are combined into a final prediction result.
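
As a rough illustration of the feature selection step, the sketch below runs a small genetic algorithm over 0/1 feature masks and scores each mask with a pluggable fitness function. The operators (tournament selection, single-point crossover, bit-flip mutation, elitism), all parameter values, and the names `ga_select_features`, `make_rf_fitness`, and `toy_fitness` are assumptions for illustration, not details taken from the patent. `make_rf_fitness` assumes scikit-learn and NumPy are installed and binary defect labels, and it uses cross-validated F1 as the score, which is also an assumption.

```python
import random


def ga_select_features(n_features, fitness, pop_size=20, generations=15,
                       crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:
            # An empty mask selects nothing, so give it the worst score.
            cache[key] = fitness(mask) if any(mask) else float("-inf")
        return cache[key]

    best = max(population, key=score)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: carry the best mask forward
        while len(next_population) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            child = p1[:]
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=score)
    return best, score(best)


def make_rf_fitness(X, y, n_estimators=50, cv=3):
    """Fitness from cross-validated random forest F1 (assumes scikit-learn)."""
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X = np.asarray(X)

    def fitness(mask):
        columns = [j for j, bit in enumerate(mask) if bit]
        model = RandomForestClassifier(n_estimators=n_estimators, random_state=0)
        return cross_val_score(model, X[:, columns], y, cv=cv, scoring="f1").mean()

    return fitness


# Toy fitness (hypothetical): features 0 and 2 are useful,
# and every selected feature costs 0.1.
def toy_fitness(mask):
    return mask[0] + mask[2] - 0.1 * sum(mask)


mask, best_score = ga_select_features(6, toy_fitness)
print(mask, best_score)
```

On a real defect data set, `ga_select_features(X.shape[1], make_rf_fitness(X, y))` would return the selected mask; the columns where the mask is 1 are then used to train the final random forest classifier. Keeping the fitness pluggable makes it easy to swap F1 for another metric when class imbalance calls for it.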

Description

Technical field

[0001] The present invention relates to the field of computers, in particular to a software defect prediction method based on a genetic algorithm and a random forest.

Background

[0002] Software defects are the source of software failures and an important factor affecting software reliability. In software engineering it is particularly important to predict defects as early as possible, so that testing and verification resources can be allocated reasonably and software quality can be ensured. Software defect prediction has become an important research direction in software engineering and is mainly divided into static prediction and dynamic prediction. Static prediction uses defect-related measurement data to predict the number and distribution of defects; dynamic prediction predicts the distribution of failures over time. Among them, static prediction technology is the more common technology used in early ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36, G06N3/00, G06N3/12
CPC: G06F11/3608, G06N3/006, G06N3/126
Inventors: 王倩, 李亚洲
Owner: YANSHAN UNIV