Software defect prediction method based on principal component analysis and combined sampling

The method applies principal component analysis to software defect prediction. It addresses the unbalanced distribution of data classes and the lack of defect-sample information, and aims to mitigate the problem of combination omission, improve prediction efficiency, and improve prediction accuracy.

Inactive Publication Date: 2019-06-25
YANSHAN UNIV

AI Technical Summary

Problems solved by technology

SMOTE (Synthetic Minority Oversampling Technique) oversampling is combined with stratified random sampling without replacement. This addresses the problem that, because defect samples are few, the class distribution is unbalanced and too little defect-sample information is available, so that defective modules are wrongly predicted as non-defective modules. At the same time, setting the sampling rate can reduce the loss cost and improve the efficiency of software defect prediction.
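
The combined sampling described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patented procedure: the function names `smote` and `stratified_sample`, the neighbour count `k`, the sampling rate of 0.5, the order of the two steps, and the toy data are all assumptions made for the example.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = rng or random.Random(0)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def stratified_sample(samples, labels, rate, rng=None):
    """Keep round(rate * class size) items of every class (stratum),
    drawn without replacement inside each stratum."""
    rng = rng or random.Random(0)
    strata = {}
    for x, y in zip(samples, labels):
        strata.setdefault(y, []).append(x)
    picked_x, picked_y = [], []
    for y, members in strata.items():
        n_keep = max(1, round(rate * len(members)))
        for x in rng.sample(members, n_keep):  # sample() never repeats an item
            picked_x.append(x)
            picked_y.append(y)
    return picked_x, picked_y


# Toy data (assumed): 12 non-defective modules, 4 defective ones.
rng = random.Random(42)
clean = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(12)]
defective = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(4)]

# Oversample the defective class until both classes have the same size.
new_defective = smote(defective, n_new=len(clean) - len(defective), rng=rng)
X = clean + defective + new_defective
y = [0] * len(clean) + [1] * (len(defective) + len(new_defective))

# Then keep half of each class, without replacement.
X_s, y_s = stratified_sample(X, y, rate=0.5, rng=rng)
```

Because each synthetic point lies on a segment between two real defect samples, the oversampled class stays inside the region the original defect samples occupy; the sampling rate then controls how much of the balanced data is passed on to training.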

Embodiment Construction

[0029] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention.

[0030] A software defect prediction method based on fusion feature selection and combined sampling of the present invention comprises the following steps:

[0031] Step S1: Apply fusion feature selection to the software defect data to reduce dimensionality and remove noise;

[0032] Step S2: Apply SMOTE oversampling combined with stratified random sampling to the dimensionality-reduced data. Oversampling increases the number of minority-class samples so that the classes in the data set are relatively balanced. Stratified random sampling Stra...
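
The dimensionality reduction in step S1 can be illustrated with plain principal component analysis, which the title names. How the method fuses PCA with feature selection is not given in this excerpt, so the sketch below is only an assumption-laden illustration: the function `pca`, the power-iteration approach, and the three made-up software metrics are hypothetical and use only the Python standard library.

```python
import math
import random


def pca(rows, n_components, n_iter=200, seed=0):
    """Return (components, projected) for the top principal axes,
    found by power iteration with deflation on the covariance matrix."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        v = [rng.random() + 0.1 for _ in range(d)]
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]  # unit-length direction
        # Variance captured along v (the eigenvalue estimate).
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
        components.append(v)
        # Deflation: remove this direction so the next pass finds the next one.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    projected = [[sum(c[j] * r[j] for j in range(d)) for c in components]
                 for r in centred]
    return components, projected


def variance(values):
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


# Toy data (assumed): the second metric is almost twice the first,
# so the two are redundant; the third is small independent noise.
rng = random.Random(1)
data = []
for _ in range(50):
    loc = rng.uniform(10, 500)  # hypothetical metric, e.g. lines of code
    data.append([loc, 2 * loc + rng.gauss(0, 5), rng.gauss(0, 1)])

components, reduced = pca(data, n_components=2)
var_first = variance([r[0] for r in reduced])
var_second = variance([r[1] for r in reduced])
```

In this toy data the first component absorbs the two correlated metrics, which is the sense in which PCA reduces dimensionality: three columns become two while most of the variance is kept.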

Abstract

The invention discloses a software defect prediction method based on principal component analysis and combined sampling. The method comprises the following steps: S1, applying fusion feature selection to the software defect data to reduce dimensionality and remove noise; S2, sampling the dimensionality-reduced data by combining SMOTE oversampling with stratified random sampling, where oversampling increases the number of minority-class samples so that the classes in the data set are relatively balanced, and stratified random sampling divides the data into layers by class and draws samples at random without replacement within each layer; and S3, selecting a classifier for the processed data and optimizing the classifier parameters. The method selects a random forest classifier and randomly selects the features of each feature subset, which further randomizes the trees and avoids overfitting of the classifier. As a result, software defect prediction performance and prediction efficiency are improved, and a sound theoretical and experimental basis is provided for predicting defective software in practice.
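
The random forest idea in step S3, where each tree sees a bootstrap sample and each split considers only a random subset of features, can be sketched in pure Python. This is a deliberately small stand-in rather than the classifier used in the invention: all function names, the tiny parameter grid, the validation split, and the toy data are assumptions for illustration.

```python
import random
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, n_features, max_depth, rng):
    """Grow one tree; every split looks at a random subset of features."""
    if max_depth == 0 or len(set(y)) == 1:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    best = None
    for f in rng.sample(range(len(X[0])), n_features):
        for t in sorted({row[f] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # chosen features are constant: no usable split
        return Counter(y).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       n_features, max_depth - 1, rng),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       n_features, max_depth - 1, rng))


def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are labels
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(X, y, n_trees=15, n_features=1, max_depth=3, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_features, max_depth, rng))
    return forest


def predict_forest(forest, row):
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


def accuracy(forest, X, y):
    return sum(predict_forest(forest, r) == t for r, t in zip(X, y)) / len(y)


def make_points(rng, n, centre):
    return [[rng.gauss(centre, 1), rng.gauss(centre, 1)] for _ in range(n)]


# Toy data (assumed): label 1 marks a defective module.
rng = random.Random(7)
X_train = make_points(rng, 30, 0) + make_points(rng, 30, 5)
y_train = [0] * 30 + [1] * 30
X_val = make_points(rng, 10, 0) + make_points(rng, 10, 5)
y_val = [0] * 10 + [1] * 10

# Parameter optimization as a small grid search scored on the validation split.
grid = [(nf, depth) for nf in (1, 2) for depth in (1, 3)]
best_params = max(grid, key=lambda p: accuracy(
    fit_forest(X_train, y_train, n_features=p[0], max_depth=p[1]), X_val, y_val))
forest = fit_forest(X_train, y_train,
                    n_features=best_params[0], max_depth=best_params[1])
val_acc = accuracy(forest, X_val, y_val)
```

Restricting each split to a random feature subset makes the trees differ from one another, and averaging their votes is what limits the overfitting a single deep tree would show.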

Description

Technical field

[0001] The invention relates to a defect prediction method, in particular to a software defect prediction method based on principal component analysis and combined sampling.

Background art

[0002] With the development of Internet technology, the reliability of software products has become a concern in the field of software engineering, and software defects inevitably appear during software development. Once software with latent defects is put into use, it can cause huge economic losses to companies and individuals. To address this problem effectively, the modules of the software that are likely to be defective must be predicted accurately and quickly, so as to improve the reliability of the software system.

[0003] Currently, related software defect prediction methods mainly use different types of machine learning techniques. Their main consideration is the prediction accuracy over the data as a whole. Althou...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36
Inventor: 何海涛, 任家东, 张旭, 胡昌振
Owner: YANSHAN UNIV