Cross-project software defect prediction method based on shared hidden layer auto-encoder

A software defect prediction and autoencoder technology, applied in software testing/debugging, instruments, computer components, etc. It addresses problems such as poor prediction performance, data distribution differences, and poor predictive capability, and achieves the effect of resolving data distribution differences.

Active Publication Date: 2020-05-26
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

However, the performance of cross-project defect prediction is still very poor. The main reason is the difference in data distribution between the source project and the target project: the smaller this difference, the better cross-project defect prediction works.
In addition, the data sets themselves are class-imbalanced: non-defective samples far outnumber defective ones. This imbalance reduces the prediction performance of the model, making it easier for the model to identify non-defective samples while showing poor predictive power for defective samples.
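
The effect of class imbalance described above can be illustrated with a small sketch. The data below are hypothetical (a 10% defect rate chosen only for illustration), and the "model" is a deliberately degenerate one that always predicts the non-defective class. It shows how overall accuracy can look acceptable while the model finds none of the defective samples.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall_defective(y_true, y_pred):
    """Fraction of truly defective samples (label 1) predicted as defective."""
    defective = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    if not defective:
        return 0.0
    return sum(1 for _, p in defective if p == 1) / len(defective)


# Hypothetical project: 10 defective modules (1) and 90 non-defective ones (0).
y_true = [1] * 10 + [0] * 90
# A degenerate model that always predicts "non-defective".
y_pred = [0] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("recall on defective class:", recall_defective(y_true, y_pred))
```

Because the majority class dominates the accuracy figure, recall on the defective class is the more informative number in this setting.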

Method used


Examples

Embodiment Construction

[0024] The present invention is described in further detail below with reference to the accompanying drawings. As shown in Figure 1, a cross-project software defect prediction method based on a shared hidden layer autoencoder includes the following steps:

[0025] Step 1. Divide the data into a training data set and a test data set, and preprocess the data. Specifically, first select the PROMISE data set, which has 20 basic metrics. These 20 metrics are not on the same order of magnitude, so the min-max normalization method is used to convert all metric values to the interval between 0 and 1. Given a feature x, its maximum and minimum values are denoted max(x) and min(x), respectively. After preprocessing, each value x_i of feature x can be expressed as follows:

[0026] $$x_i' = \frac{x_i - \min(x)}{\max(x) - \min(x)}$$
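
The min-max normalization in Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names and the toy data are hypothetical, and the handling of a constant feature (where max(x) equals min(x)) is an assumption, since the text does not cover that case.

```python
def min_max_normalize(values):
    """Scale a list of metric values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: the formula would divide by zero, so map to 0.0.
        # This choice is an assumption; the source does not address it.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_dataset(rows):
    """Normalize each column (metric) of a row-major data set independently."""
    columns = list(zip(*rows))
    scaled_columns = [min_max_normalize(list(col)) for col in columns]
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical toy data: 3 modules described by 2 metrics each.
modules = [[100.0, 2.0], [300.0, 4.0], [500.0, 10.0]]
normalized = normalize_dataset(modules)
print(normalized)
```

Each metric is scaled on its own, so metrics with very different magnitudes end up on the same 0 to 1 scale.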

[0027] Step 2. Use the improved autoencoder for feature extraction. We use an autoencoder with a shared hidden layer t...

Abstract

The invention discloses a cross-project software defect prediction method based on a shared hidden layer auto-encoder. The method comprises the following steps: first, preprocess the data set and divide it into a training set and a test set; second, perform feature extraction with an auto-encoder that uses a sharing mechanism, extracting deep features of the training set and the test set respectively; finally, introduce a focal loss function to train a classifier. The method addresses the feature distribution difference in cross-project software defect prediction and proposes, for the first time, a focal-loss-based shared hidden layer auto-encoder technique. Different data distributions are made more similar. Focal loss learning assigns different weights to samples of different classes to address class imbalance, and also assigns different weights to easy-to-classify and hard-to-classify samples, so that the classifier can better learn the hard-to-classify samples.
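
The loss described in the abstract appears to correspond to what is usually called focal loss. The sketch below assumes the commonly used binary formulation, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t); the source does not state the exact formula, and the values of `alpha` and `gamma` here are hypothetical. The class weight `alpha` addresses class imbalance, and the factor (1 - p_t)^gamma down-weights easy samples.

```python
import math


def focal_loss(p_defective, label, alpha=0.75, gamma=2.0):
    """Binary focal loss for one sample (assumed standard formulation).

    p_defective: predicted probability that the sample is defective.
    label: 1 for defective, 0 for non-defective.
    alpha: weight for the defective class (hypothetical value).
    gamma: focusing parameter; larger values down-weight easy samples more.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p_defective if label == 1 else 1.0 - p_defective
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy defective sample (confidently correct) vs. a hard one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print("easy sample loss:", easy)
print("hard sample loss:", hard)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half of the ordinary cross-entropy, which makes the role of the two extra factors easy to check.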

Description

Technical field

[0001] The invention belongs to the field of software engineering, and in particular relates to a cross-project software defect prediction method based on a shared hidden layer autoencoder.

Background

[0002] Software defect prediction is a research hotspot in the field of software engineering. Its main goal is to discover software defects early in the development process and thereby improve the quality of software products. Most previous research has focused on within-project defect prediction, which uses part of the historical data of a project to train a prediction model and the remaining data of the same project to test the model's ability to predict defects. For a newly started project, however, there is not enough historical data to train the model, so within-project defect prediction performs poorly. Therefore, when there is not enough historical defect data...

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36, G06K9/62
CPC: G06F11/3608, G06F18/214, Y04S10/50
Inventors: 荆晓远, 李娟娟, 吴飞, 孙莹
Owner: NANJING UNIV OF POSTS & TELECOMM