Software defect prediction data processing method and device and storage medium

A software defect prediction data processing technology, applied in electrical digital data processing, software testing/debugging, and error detection/correction. It addresses the problem that existing methods cannot effectively improve a model's ability to identify different defect samples, and it achieves improved recognition ability, improved overall prediction accuracy, and good application value.

Active Publication Date: 2020-10-16
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

AI Technical Summary

Problems solved by technology

This type of method can generate samples that are highly similar to the original samples and retain the original data features to the greatest extent. However, linear interpolation considers only local sample information, and the features constrain one another: a new sample can only lie on the line connecting its two parent samples, so once one feature is determined, none of the other features can vary. The generated samples are therefore too similar to the originals, and the processed data set cannot effectively improve the model's ability to identify different defect samples.
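The coupling between features under linear interpolation can be illustrated with a small sketch. It assumes a SMOTE-style interpolation between two parent samples; the function name `interpolate`, the parent values, and the variable `gap` are hypothetical and do not come from the patent text.

```python
import numpy as np

def interpolate(parent_a, parent_b, gap):
    """Return a point on the segment between two parents (SMOTE-style assumption)."""
    return parent_a + gap * (parent_b - parent_a)

# Two hypothetical parent defect samples with three features each.
parent_a = np.array([1.0, 10.0, 100.0])
parent_b = np.array([3.0, 20.0, 500.0])

rng = np.random.default_rng(0)
gap = rng.uniform(0.0, 1.0)           # a single scalar drives every feature
child = interpolate(parent_a, parent_b, gap)

# Knowing only the first feature of the child is enough to recover the gap,
# and with it every other feature: the features cannot vary independently.
recovered_gap = (child[0] - parent_a[0]) / (parent_b[0] - parent_a[0])
rebuilt = interpolate(parent_a, parent_b, recovered_gap)
```

Because one scalar determines the whole child sample, every generated point stays on the segment between its parents, which is the limitation described above.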



Embodiment Construction

[0039] The technical solutions of the present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0040] Referring to Figure 1, which shows a flow chart of the software defect prediction data processing method according to the present invention, the method includes the following steps:

[0041] Step 1: Input the labeled historical defect data D, which contains D_maj non-defective samples and D_min defective samples. In this example, there are 50 non-defective samples and 10 defective samples; each sample contains 10 common features and a label indicating whether it is defective or non-defective.

[0042] Step 2: Calculate the ratio of non-defective samples to defective samples and determine whether it exceeds the extreme-imbalance threshold. If it does, randomly delete some non-defective samples to reduce the ratio to the threshold; otherwise, proceed directly to the next step. In this example, the unbalan...
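Step 2 can be sketched as random undersampling of the non-defective class. This is a minimal illustration, not the patented implementation: the threshold value used in the embodiment is truncated in this text, so `MAX_RATIO = 4.0` is a hypothetical placeholder, and the function name `undersample_majority` and the random data are likewise assumptions.

```python
import numpy as np

def undersample_majority(majority, minority, max_ratio, rng):
    """Randomly drop majority rows until len(majority) / len(minority) <= max_ratio."""
    ratio = len(majority) / len(minority)
    if ratio <= max_ratio:
        return majority                      # already under the threshold
    keep = int(max_ratio * len(minority))    # number of majority rows to keep
    idx = rng.choice(len(majority), size=keep, replace=False)
    return majority[idx]

rng = np.random.default_rng(0)
majority = rng.normal(size=(50, 10))   # 50 non-defective samples, 10 features
minority = rng.normal(size=(10, 10))   # 10 defective samples, 10 features

MAX_RATIO = 4.0  # hypothetical threshold; the source value is truncated
reduced = undersample_majority(majority, minority, MAX_RATIO, rng)
unchanged = undersample_majority(majority, minority, 10.0, rng)
```

With the example sizes from Step 1 the ratio is 5:1, so a threshold of 4 keeps 40 of the 50 non-defective samples, while a threshold of 10 leaves the data untouched.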



Abstract

The invention discloses a software defect prediction data processing method. In this method, an independent feature distribution model is established for each feature of the defect samples, and some of the features are replaced by random variation to obtain a new defect sample. New samples are added continuously until the ratio of non-defect samples to defect samples is balanced, yielding a processed software defect prediction data set for subsequent model training. The invention further provides a software defect prediction data processing device based on the method and a machine-readable storage medium. The invention addresses the problem in the prior art that defect samples are poorly recognized because they are far fewer than non-defect samples, and it effectively improves software defect prediction accuracy.
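The oversampling idea in the abstract can be sketched as follows. Several details are assumptions, since the abstract does not specify them: each feature is modeled here as an independent Gaussian, "balanced" is taken to mean a 1:1 ratio, and `mutation_rate` (the chance that a given feature is replaced) is a hypothetical parameter. All function names are illustrative.

```python
import numpy as np

def fit_feature_models(minority):
    """Fit one independent model per feature (assumed Gaussian: mean and std)."""
    return minority.mean(axis=0), minority.std(axis=0)

def mutate_sample(base, means, stds, mutation_rate, rng):
    """Replace a random subset of features with draws from the per-feature models."""
    mask = rng.random(base.shape[0]) < mutation_rate
    draws = rng.normal(means, stds)          # one independent draw per feature
    return np.where(mask, draws, base)

def oversample_minority(majority, minority, mutation_rate, rng):
    """Add mutated defect samples until both classes have equal size (assumed 1:1)."""
    means, stds = fit_feature_models(minority)
    new_rows = []
    while len(minority) + len(new_rows) < len(majority):
        base = minority[rng.integers(len(minority))]   # pick an original defect sample
        new_rows.append(mutate_sample(base, means, stds, mutation_rate, rng))
    if not new_rows:
        return minority
    return np.vstack([minority, np.array(new_rows)])

rng = np.random.default_rng(0)
majority = rng.normal(size=(40, 10))
minority = rng.normal(loc=1.0, size=(10, 10))
balanced = oversample_minority(majority, minority, mutation_rate=0.3, rng=rng)
```

Because each replaced feature is drawn independently, a new sample is not confined to the line between two parents, which is the contrast with interpolation-based methods that the description draws.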

Description

Technical field

[0001] The invention relates to a supplementary data set generation method and device, and in particular to a software defect prediction data processing method, device, and storage medium.

Background

[0002] Software defect prediction can help developers locate defect-prone modules in a project before the software product enters the testing stage, allocate limited testing resources more reasonably, and improve the quality of software products. In software defect prediction, historical defect data is usually used to train a binary classifier that classifies the software modules to be predicted as defective or non-defective, and the classification results serve as the basis for judging each module's defect proneness. However, in software defect prediction data sets, the number of defect samples is often far smaller than the number of non-defect samples, so the generated model tends to be biased towards a large number of non-defect cl...


Application Information

IPC(8): G06F11/36
CPC: G06F11/3672
Inventors: 燕雪峰 (Yan Xuefeng), 张雨青 (Zhang Yuqing)
Owner: NANJING UNIV OF AERONAUTICS & ASTRONAUTICS