Cross-project defect prediction method based on data screening and data oversampling

A prediction method and data screening technology, applied in the fields of electrical digital data processing, error detection/correction, and software testing/debugging. It addresses problems such as the degraded performance of cross-project software defect prediction models and the class imbalance of cross-project historical software module data, and thereby improves model performance.

Status: Inactive
Publication Date: 2017-11-24
WUHAN UNIV
1 Cites, 21 Cited by

AI Technical Summary

Problems solved by technology

[0009] Compared with the existing cross-project software defect prediction methods at home and abroad, the present invention aims at the problem that a large number of irrelevant cross-project historical software module data in the cross-project software defect pre...



Embodiment Construction

[0024] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. The embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.

[0025] Referring to Figure 1, the cross-project defect prediction method based on data screening and data oversampling provided by the present invention comprises the following steps:

[0026] Step 1: Extract cross-project historical software modules;

[0027] When a project is newly under development, it has no historical software module data, so a defect prediction model cannot be trained on the project's own data. Cross-project historical software module data must be borrowed instead. Therefore, useful cross-project historical software modules are ex...

Abstract

The invention discloses a cross-project defect prediction method based on data screening and data oversampling. It designs data screening and class-imbalance handling strategies. First, a hierarchical clustering algorithm selects the cross-project historical software module data that are truly similar to the current project's module data, so that the cross-project software defect prediction model is protected from the influence of irrelevant cross-project historical software module data. Then an oversampling method adds defective software module data to obtain a new dataset with relatively balanced classes, so that the model is protected from the influence of an imbalanced training dataset. The technical scheme is simple and efficient, and it improves the performance of the cross-project software defect prediction model.
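The two stages named in the abstract can be illustrated with a minimal Python sketch. The listing does not show the patent's linkage criterion, cluster count, retention rule, or specific oversampling algorithm, so everything below is an assumption made for illustration: average-linkage agglomerative clustering with Euclidean distance, a rule that keeps source modules sharing a cluster with at least one target module, and plain random duplication of defective modules. The function names and the toy data are hypothetical.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative_clusters(points, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage).
    Returns a list of clusters, each a list of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > max(1, n_clusters):
        best = None  # (average distance, i, j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(euclidean(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                avg = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

def screen_source_modules(source, target, n_clusters=2):
    """Assumed retention rule: keep a labeled source module only if its
    cluster also contains at least one (unlabeled) target module."""
    points = [features for features, _ in source] + list(target)
    n_source = len(source)
    kept = []
    for cluster in agglomerative_clusters(points, n_clusters):
        if any(idx >= n_source for idx in cluster):
            kept.extend(source[idx] for idx in cluster if idx < n_source)
    return kept

def oversample_defective(modules, seed=0):
    """Random oversampling (assumed method): duplicate defective modules
    (label 1) until they are as numerous as clean modules (label 0)."""
    rng = random.Random(seed)
    defective = [m for m in modules if m[1] == 1]
    clean = [m for m in modules if m[1] == 0]
    if not defective:
        return list(modules)
    shortfall = max(0, len(clean) - len(defective))
    return list(modules) + [rng.choice(defective) for _ in range(shortfall)]

# Toy data: (metric vector, label) for source modules; target is unlabeled.
source = [((1.0, 1.0), 0), ((1.2, 0.9), 0), ((0.9, 1.1), 1),
          ((9.0, 9.0), 0), ((9.5, 8.5), 1)]
target = [(1.1, 1.0), (0.8, 1.2)]

kept = screen_source_modules(source, target)
balanced = oversample_defective(kept)
```

In this toy run the two source modules near (9, 9) share no cluster with any target module and are dropped, and the single remaining defective module is duplicated once to match the two clean ones. The pairwise search makes the clustering expensive on large datasets, so a real pipeline would more likely rely on a library implementation.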

Description

Technical field

[0001] The invention belongs to the technical field of software defect prediction, and in particular relates to a cross-project defect prediction method based on data screening and data oversampling.

Background

[0002] (1) Software defect prediction technology

[0003] Software has become an important factor affecting the national economy, military affairs, politics, and even social life. Highly reliable, complex systems depend on the reliability of the software they employ. Software defects are a potential source of system errors, failures, crashes, and even machine crashes. Academia and industry use many related terms and definitions for a defect, such as failure, defect, bug, and error. According to ISO 9000, a defect is the failure to meet a requirement related to an intended or specified use. A defect is a part of the software that already exists and can be eliminated by mo...


Application Information

IPC(8): G06F11/36
CPC: G06F11/3608
Inventor: 余啸, 刘进, 伍蔓, 崔晓晖, 张建升, 井溢洋
Owner: WUHAN UNIV