Cross-project defect prediction method based on feature distribution alignment and neighborhood instance selection

A cross-project defect prediction method based on feature distribution alignment, applied in character and pattern recognition, software testing/debugging, and error detection/correction, that reduces distribution differences between projects and improves defect prediction performance

Pending Publication Date: 2021-07-23
XUZHOU NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

Because the data distributions of the source project and the target project differ greatly, a model trained on the source project may not achieve good predictive performance on the target project.


Examples


Embodiment

[0035] The present invention comprises the following steps in practical application:

[0036] Step 1: Select source projects from the software defect dataset and merge all of them to form the source project set $D_S$; then select a target project $D_T$ from the software defect dataset. The target project $D_T$ and the source project set $D_S$ must share no module data, that is, they must not be datasets of different versions of the same project;
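Step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the project names, the `name-version` naming convention, and the helper `build_source_set` are all hypothetical, and the rule "drop every version of the target's own project" is one assumed way to satisfy the no-shared-modules condition.

```python
import numpy as np

def build_source_set(projects, target_name):
    """Stack all projects except any version of the target's own project.

    Assumes (hypothetically) that keys look like "<project>-<version>".
    """
    target_base = target_name.split("-")[0]
    parts = [
        data
        for name, data in projects.items()
        if name.split("-")[0] != target_base
    ]
    return np.vstack(parts)

# Hypothetical metric matrices: rows are modules, columns are features.
projects = {
    "alpha-1.0": np.zeros((5, 3)),
    "alpha-1.1": np.ones((4, 3)),
    "beta-2.0": np.full((6, 3), 2.0),
    "gamma-0.9": np.full((2, 3), 3.0),
}

D_T = projects["alpha-1.1"]                      # target project
D_S = build_source_set(projects, "alpha-1.1")    # merged source project set
```

Here both `alpha` versions are excluded from the source set, so only the `beta` and `gamma` rows are merged.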

[0037] Step 2: Calculate the covariance matrix $C_S$ of the source project set $D_S$ and the covariance matrix $C_T$ of the target project $D_T$. The corresponding formulas are:

[0038] $C_S = \mathrm{cov}(D_S)$

[0039] $C_T = \mathrm{cov}(D_T)$
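The two covariance matrices of Step 2 can be computed as in the sketch below. It assumes each dataset is a NumPy array with one row per module instance and one column per metric; the random data and the helper name `feature_covariance` are illustrative only.

```python
import numpy as np

def feature_covariance(data: np.ndarray) -> np.ndarray:
    """Covariance between feature columns (rows are module instances)."""
    # rowvar=False tells NumPy that each column is a variable.
    return np.cov(data, rowvar=False)

rng = np.random.default_rng(0)
D_S = rng.normal(size=(200, 4))  # hypothetical source project set
D_T = rng.normal(size=(80, 4))   # hypothetical target project

C_S = feature_covariance(D_S)
C_T = feature_covariance(D_T)
```

Both results are square matrices whose side equals the number of features, regardless of how many instances each project contains.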

[0040] Step 3: As shown in Figure 2, perform feature distribution alignment. First, eliminate the correlation between the features of the source project set $D_S$, which is calculated as:

[0041]

[0042] Then the feature correlations of the target project $D_T$ are filled into the source project set whose feature correlations have been eliminated...
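The formulas for Step 3 are not reproduced in this excerpt. The sketch below therefore assumes the commonly used correlation-alignment form: whiten the source data with $C_S^{-1/2}$, then re-color it with $C_T^{1/2}$. The regularization term `eps` added to each covariance matrix, and the function names, are assumptions rather than details taken from the text.

```python
import numpy as np

def _sym_matrix_power(mat: np.ndarray, power: float) -> np.ndarray:
    """Fractional power of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 1e-12, None)  # guard against tiny negative eigenvalues
    return (vecs * vals**power) @ vecs.T

def align_source_to_target(D_S, D_T, eps=1.0):
    """Remove source feature correlations, then apply the target's.

    eps * I is added to each covariance for numerical stability (assumed).
    """
    n_features = D_S.shape[1]
    C_S = np.cov(D_S, rowvar=False) + eps * np.eye(n_features)
    C_T = np.cov(D_T, rowvar=False) + eps * np.eye(n_features)
    whitened = D_S @ _sym_matrix_power(C_S, -0.5)   # decorrelate source features
    return whitened @ _sym_matrix_power(C_T, 0.5)   # fill in target correlations

rng = np.random.default_rng(0)
D_S = rng.normal(size=(200, 4))                                   # hypothetical source set
D_T = rng.normal(size=(80, 4)) * np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical target

# With eps=0 the aligned source covariance matches the target's exactly.
D_S_aligned = align_source_to_target(D_S, D_T, eps=0.0)
```

Setting `eps` above zero trades exact covariance matching for robustness when a covariance matrix is close to singular, which can happen when a project has few instances or highly redundant metrics.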


Abstract

A cross-project defect prediction method based on feature distribution alignment and neighborhood instance selection comprises the following steps: selecting source projects from a software defect dataset, merging all the source projects to form a source project set, and selecting a target project; calculating the covariance matrix of the source project set and the covariance matrix of the target project; eliminating the correlation between the features of the source project set and filling the feature correlation of the target project into the source project set; selecting, from the feature-aligned source project set, the instances most similar to the instances in the target project to form a training instance set TS; and training a Logistic model on the training instance set TS and using that model to classify each instance in the target project as defective or not. By choosing the model's training data through feature distribution alignment and neighborhood instance selection, the method reduces the differences between projects and between instances in cross-project software defect prediction and improves defect prediction performance.
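The last two steps of the abstract, neighborhood instance selection and Logistic model training, might look like the sketch below. Several points are assumptions, since this excerpt does not specify them: similarity is measured by Euclidean distance, each target instance contributes its `k` nearest source instances, "Logistic model" is read as logistic regression, and the model is fitted by plain gradient descent. All data and function names are hypothetical.

```python
import numpy as np

def select_neighbors(source_X, source_y, target_X, k=3):
    """Build the training set TS from the k nearest source instances
    (Euclidean distance, assumed) of every target instance."""
    # Pairwise distances, shape (n_target, n_source).
    diffs = target_X[:, None, :] - source_X[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    keep = np.unique(nearest)  # drop instances chosen by several targets
    return source_X[keep], source_y[keep]

def _add_bias(X):
    return np.hstack([X, np.ones((X.shape[0], 1))])

def train_logistic(X, y, lr=0.1, epochs=500):
    """Logistic regression fitted by batch gradient descent."""
    Xb = _add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_defective(w, X):
    """Return 1 for instances predicted defective, else 0."""
    return (_add_bias(X) @ w > 0).astype(int)

rng = np.random.default_rng(0)
source_X = rng.normal(size=(300, 4))          # hypothetical aligned source set
source_y = (source_X[:, 0] > 0).astype(int)   # hypothetical defect labels
target_X = rng.normal(size=(50, 4))           # hypothetical target project

TS_X, TS_y = select_neighbors(source_X, source_y, target_X, k=3)
weights = train_logistic(TS_X, TS_y)
predictions = predict_defective(weights, target_X)
```

Taking the union of neighbors keeps the training set small and close to the target distribution; the value of `k` controls how much source data is retained.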

Description

[0001] Technical field

[0002] The invention relates to the field of software engineering, in particular to a cross-project defect prediction method based on feature distribution alignment and neighborhood instance selection.

Background technique

[0003] As software becomes increasingly important and relied upon in many application fields, ensuring software reliability matters more and more. Predicting defects in software projects is critical to the software development process because the later a bug is discovered, the more expensive it is to fix. The purpose of software defect prediction is to help developers find and locate software defects early in project development, so that software testing resources can be allocated reasonably and software reliability improved.

[0004] Software defect prediction is a cutting-edge research topic that combines current machine learning technology with software testing ne...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36, G06K9/62
CPC: G06F11/3688, G06F11/3692, G06F18/22, G06F18/214
Inventors: 祝义, 赵宇
Owner: XUZHOU NORMAL UNIVERSITY