
Semi-supervised learning software defect prediction method based on spectral clustering

The technology combines software defect prediction with semi-supervised learning and is applied in software testing/debugging, error detection/correction, and related fields. It addresses problems such as wasted computing resources, degraded prediction model performance, and the lack of attention to feature selection.

Active Publication Date: 2020-12-29
SOUTH CHINA UNIV OF TECH +1

AI Technical Summary

Problems solved by technology

However, the data available in an actual project is usually a small amount of labeled data together with a large amount of unlabeled data. The above two schemes assume that the target project has no historical labeled data, which discards part of the available known information and reduces model performance.
[0004] In addition, existing research focuses mainly on constructing a usable model and pays little attention to feature selection.
However, using a large number of irrelevant or redundant features not only wastes computing resources but also degrades the performance of the prediction model.



Examples


Detailed Description of the Embodiments

[0038] The present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0039] As shown in Figures 1 and 2, a software defect prediction method based on semi-supervised learning with spectral clustering comprises the following steps:

[0040] 1) Obtain the original data from the database, perform data preprocessing operations, and obtain the processed feature matrix, as follows:

[0041] 1.1) Because the features have different value ranges, features with small absolute values could be overwhelmed by features with large absolute values. Z-score standardization is therefore applied so that each feature is treated equally by the classifier.

[0042] 1.2) After z-score standardization, each feature has zero mean and unit variance. For the missing data in the database, the average value of the exist...
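
Steps 1.1 and 1.2 can be illustrated with a minimal sketch. The function name `zscore_columns` and the use of `None` for missing entries are hypothetical choices made for this example. The source text on missing values is truncated, so replacing a missing entry with the mean of the observed values in its column is an assumption here, not the patent's confirmed rule.

```python
from statistics import fmean, pstdev

def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance.

    `rows` is a list of equal-length lists; missing entries are None.
    Assumption: missing entries are ignored when computing the column
    statistics and then replaced by the mean of the observed values,
    which is 0.0 on the standardized scale.
    Each column must contain at least one observed value.
    """
    n_cols = len(rows[0])
    out = [[0.0] * n_cols for _ in rows]
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        mu = fmean(observed)
        sigma = pstdev(observed, mu)  # population standard deviation
        for i, r in enumerate(rows):
            if r[j] is None or sigma == 0.0:
                # Missing value, or a constant column: use the column mean.
                out[i][j] = 0.0
            else:
                out[i][j] = (r[j] - mu) / sigma
    return out

# Example: the second feature has a much larger range and one missing value.
features = [[1.0, 100.0], [2.0, None], [3.0, 300.0]]
standardized = zscore_columns(features)
```

After this step both columns are on the same scale, so a distance-based method such as spectral clustering does not favor the feature with the larger raw range.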


Abstract

The invention discloses a semi-supervised learning software defect prediction method based on spectral clustering. The method comprises the following steps: 1) obtaining original data, preprocessing it, and obtaining a processed feature matrix; 2) determining whether the data in the feature matrix is labeled: unlabeled data is clustered by spectral clustering, and the resulting clusters are labeled by a heuristic rule of software defect prediction to obtain pseudo labels before proceeding to step 3), while labeled data proceeds directly to step 3); 3) calculating a feature deviation score according to the data distribution and performing feature selection, wherein the weight of the original labeled data is greater than the weight of the pseudo-labeled data; and 4) performing the clustering and labeling operations again on the new feature matrix to obtain a prediction result. The method reduces the influence of irrelevant and redundant features on the model result and makes use of the project's original labeled data, which can effectively improve the accuracy of the software defect prediction result and the applicability of the model.
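
Step 2 of the abstract (spectral clustering of unlabeled modules followed by heuristic labeling) can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the Gaussian similarity with a median-distance bandwidth, the two-way split on the sign of the second eigenvector, and the labeling rule "the cluster with the larger mean sum of metric values is defective" are all choices made for this example, and the function names are hypothetical. The abstract does not specify the similarity measure or the heuristic rule.

```python
import numpy as np

def spectral_bipartition(X):
    """Split the rows of X into two clusters (0/1) by spectral clustering."""
    n = len(X)
    # Pairwise squared Euclidean distances between modules.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    # Assumed similarity: Gaussian kernel, bandwidth = median pairwise distance.
    sigma = np.median(np.sqrt(sq_dist[np.triu_indices(n, k=1)]))
    if sigma == 0.0:
        sigma = 1.0
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Second-smallest eigenvector, rescaled; its sign gives the split.
    fiedler = d_inv_sqrt * eigvecs[:, 1]
    return (fiedler > 0).astype(int)

def label_clusters(X, clusters):
    """Assumed heuristic: the cluster with the larger mean row sum is defective (1)."""
    row_sums = X.sum(axis=1)
    means = [row_sums[clusters == c].mean() for c in (0, 1)]
    defective = int(means[1] > means[0])
    return (clusters == defective).astype(int)

# Toy metric matrix: three small modules and three large ones.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
pseudo_labels = label_clusters(X, spectral_bipartition(X))
```

The pseudo labels produced this way would then be combined with the original labels in step 3, where the abstract states that original labels carry more weight; that weighting is not shown here because the source does not give the scoring formula.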

Description

Technical field

[0001] The invention relates to the field of software defect prediction, in particular to a software defect prediction method based on semi-supervised learning with spectral clustering.

Background

[0002] Software defect prediction is the process of predicting whether a software entity is defective. As the scale of contemporary software continues to grow, software defect prediction has received increasing attention as a technology that can reduce the burden on software testers and optimize the allocation of developers and testers. Practice has shown that finding and fixing defects after development is completed costs much more than finding and fixing them during development. It is therefore very important to introduce software defect prediction early in the software life cycle. At present, however, software defect prediction is still rarely applied in industry. This is mainly because the cu...


Application Information

IPC(8): G06F11/36
CPC: G06F11/3608
Inventors: 陆璐, 周璇
Owner: SOUTH CHINA UNIV OF TECH