Method for processing imbalanced data in software defect prediction based on a principal component distribution function

A technology for software defect prediction using principal component distributions, applied in fields such as electrical digital data processing, software testing and debugging, and computer components.
CN106201897A · Active · Publication Date: 2016-12-07 · NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS
Publication Date
2016-12-07

Abstract

The invention discloses a method for processing imbalanced data in software defect prediction based on a principal component distribution function, and belongs to the technical field of software engineering applications. The method comprises the following steps: data acquired from a software data set are preprocessed to obtain an original sample set; dimension reduction is performed on the original sample set using the PCA algorithm to obtain a principal component data set comprising a defect-free sample set and a defect sample set; the defect-free sample set is undersampled, and its boundary samples and noise samples are removed; distribution fitting is performed on the principal component data of the defect sample set to generate a new defect sample set; the new defect sample set is screened to obtain a new sample set; and the Euclidean distances between the samples in the new sample set and the original sample set are calculated to remove noise samples from the new sample set. The method can effectively improve the accuracy of software defect prediction.
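The steps in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: the abstract does not name the fitted distribution, the boundary/noise rule, or the screening criterion, so the sketch assumes an independent normal fit per principal component, a k-nearest-neighbour rule for removing defect-free samples, and a single distance threshold (measured against the original defect samples in principal component space) that stands in for both the screening and the Euclidean noise filter. All function names and the parameters `k`, `max_dist`, and `n_components` are hypothetical.

```python
import numpy as np

def pca_project(X, n_components):
    """Center X and project it onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def undersample_majority(Z_major, Z_minor, k=5):
    """Assumed rule: drop defect-free samples whose k nearest neighbours
    are mostly defect samples (treated here as boundary or noise samples)."""
    Z_all = np.vstack([Z_major, Z_minor])
    is_minor = np.r_[np.zeros(len(Z_major), bool), np.ones(len(Z_minor), bool)]
    keep = []
    for i, z in enumerate(Z_major):
        d = np.linalg.norm(Z_all - z, axis=1)
        d[i] = np.inf  # exclude the sample itself
        nearest = np.argsort(d)[:k]
        keep.append(is_minor[nearest].sum() < k / 2)
    return Z_major[np.array(keep, dtype=bool)]

def synthesize_minority(Z_minor, n_new, rng):
    """Assumed fit: one normal distribution per principal component,
    sampled independently to generate new defect samples."""
    mu = Z_minor.mean(axis=0)
    sigma = Z_minor.std(axis=0, ddof=1)
    return rng.normal(mu, sigma, size=(n_new, Z_minor.shape[1]))

def filter_by_distance(Z_new, Z_ref, max_dist):
    """Keep generated samples whose nearest reference sample lies within
    max_dist (Euclidean); the rest are treated as noise."""
    d = np.linalg.norm(Z_new[:, None, :] - Z_ref[None, :, :], axis=2)
    return Z_new[d.min(axis=1) <= max_dist]

def balance(X, y, n_components=2, k=5, max_dist=1.0, seed=0):
    """Run the sketched pipeline; y uses 0 = defect-free, 1 = defect."""
    rng = np.random.default_rng(seed)
    Z = pca_project(X, n_components)
    Z_major, Z_minor = Z[y == 0], Z[y == 1]
    Z_major = undersample_majority(Z_major, Z_minor, k)
    n_new = max(len(Z_major) - len(Z_minor), 0)
    Z_new = synthesize_minority(Z_minor, n_new, rng)
    Z_new = filter_by_distance(Z_new, Z_minor, max_dist)
    Z_bal = np.vstack([Z_major, Z_minor, Z_new])
    y_bal = np.r_[np.zeros(len(Z_major), int),
                  np.ones(len(Z_minor) + len(Z_new), int)]
    return Z_bal, y_bal
```

Because the distance filter can discard generated samples, the result is not guaranteed to be exactly balanced; a caller wanting a fixed class ratio would need to generate more candidates than required and keep only as many as needed.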

Description

Technical field

[0001] The invention discloses a method for processing imbalanced data in software defect prediction based on a principal component distribution function, and belongs to the technical field of software engineering applications.

Background art

[0002] With the rapid development of information technology, computer software is being applied ever more widely. Efficient and safe software systems depend heavily on software reliability, and software defects that undermine reliability have become the root cause of system errors, failures, crashes, and even disasters. Accurate prediction of software defects can help reduce testing workload and cost. At present, software defect prediction faces a serious and unavoidable problem: data imbalance. Data imbalance means that the classes in a data set are not evenly distributed, so that one class dominates. The problem of data imbalance...
