Self-admitted technical debt detection and classification method based on multi-method ensemble learning

An ensemble learning and classification technique, applied to the detection and classification of self-admitted technical debt (SATD), that addresses the low classifier performance, poor flexibility, and low efficiency of existing approaches. It optimizes the prediction results, improves the detection metrics, and improves the performance of the trained classifiers.

Pending Publication Date: 2020-10-16
NORTHWESTERN POLYTECHNICAL UNIV

AI Technical Summary

Problems solved by technology

So far, some methods rely heavily on manual detection, while many more advanced methods use a single natural language processing method to identify SATD automatically. Manual detection is inefficient and has obvious drawbacks, and the single natural ...

Detailed Description of Embodiments

[0050] The present invention is further described below with reference to the embodiments and the accompanying drawings:

[0051] The invention is a self-admitted technical debt detection and classification method based on multi-method ensemble learning. The method comprises five core steps: (1) preprocessing the feature words; (2) selecting the top k most useful features to train the classifier; (3) training the corresponding sub-classifiers with two methods, Naive Bayes Multinomial and Simple Logistic (linear logistic regression); (4) integrating the predictions through the sub-classifier voting rule to obtain precision and recall, and then computing the F1 value (F1-score), which combines precision and recall, as the subsequent evaluation standard; and (5) clustering the features that appear frequently during the experiments and have high information gain, and using the resulting clusters to classify the detected technical debt.
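The voting and evaluation step can be sketched roughly as follows. This is only an illustration under stated assumptions: the source does not spell out the voting rule, so a strict majority vote is assumed here, and the sub-classifier outputs and ground-truth labels are made-up toy values. The function names `majority_vote` and `precision_recall_f1` are hypothetical.

```python
def majority_vote(votes):
    """Return 1 (SATD) if a strict majority of sub-classifiers voted 1, else 0.

    Assumption: the patent's actual voting rule is not given in this text;
    strict majority is used purely for illustration.
    """
    return 1 if 2 * sum(votes) > len(votes) else 0


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive (SATD) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (hypothetical): one row per sub-classifier, one column per comment.
sub_predictions = [
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1],
]
y_true = [1, 0, 1, 0, 0]

# Combine the votes column by column, then score the ensemble output.
y_pred = [majority_vote(column) for column in zip(*sub_predictions)]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(y_pred, precision, recall, f1)
```

F1 is the harmonic mean of precision and recall, which is why a single value can serve as the evaluation standard when the two metrics pull in different directions.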

[0052] ...

Abstract

The invention relates to a self-admitted technical debt detection and classification method based on multi-method ensemble learning. The method comprises the following five steps: preprocessing the feature words; selecting the top k most useful features to train a classifier; training the corresponding sub-classifiers with the Naive Bayes Multinomial method and the linear logistic regression method; integrating the predictions through the sub-classifier voting rule to obtain precision and recall, and then computing the F1 value, which combines precision and recall, as the subsequent evaluation standard; and finally clustering the features that appear frequently during the experiments and have high information gain values, so that the detected technical debt is classified.
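The top-k feature selection step can be illustrated with information gain, the score the text names when describing useful features. The sketch below is a minimal version under assumptions: each comment is reduced to a set of words (presence or absence only), the tiny corpus is invented, and the helper names `entropy`, `information_gain`, and `top_k_features` are hypothetical rather than taken from the patent.

```python
import math


def entropy(pos, neg):
    """Binary entropy (in bits) of a group with `pos` SATD and `neg` non-SATD items."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(docs, labels, word):
    """Reduction in label entropy from knowing whether `word` occurs in a comment."""
    n = len(docs)
    pos = sum(labels)
    neg = n - pos
    with_pos = sum(1 for d, y in zip(docs, labels) if word in d and y == 1)
    with_neg = sum(1 for d, y in zip(docs, labels) if word in d and y == 0)
    n_with = with_pos + with_neg
    n_without = n - n_with
    conditional = (n_with / n) * entropy(with_pos, with_neg) \
        + (n_without / n) * entropy(pos - with_pos, neg - with_neg)
    return entropy(pos, neg) - conditional


def top_k_features(docs, labels, k):
    """Return the k words with the highest information gain (ties broken alphabetically)."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary, key=lambda w: (-information_gain(docs, labels, w), w))
    return ranked[:k]


# Invented toy corpus: each comment is a set of preprocessed words; 1 = SATD.
docs = [
    {"todo", "fix", "later"},
    {"todo", "hack"},
    {"returns", "the", "value"},
    {"parse", "input"},
]
labels = [1, 1, 0, 0]

print(top_k_features(docs, labels, 1))
```

In this toy corpus, "todo" appears in every SATD comment and in no other comment, so it separates the two classes perfectly and ranks first.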

Description

Technical field

[0001] The invention belongs to the technical field of software development, and in particular relates to a method for detecting and classifying self-admitted technical debt based on multi-method ensemble learning.

Background

[0002] The document "Huang Q, Shihab E, Xia X, et al. Identifying self-admitted technical debt in open source projects using text mining [J]. Empirical Software Engineering, 2017." discloses a method for automatically detecting self-admitted technical debt with an ensemble classifier. The method uses source code comments from different software projects to analyze the comments to be classified. It first preprocesses the source files and filters the features through feature selection, then trains each classifier with Naive Bayes Multinomial, and finally the ensemble classifier composed of these classifiers predicts according to the voting rule to determine whether the statement has self-admitted ...
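To make the sub-classifier idea concrete, here is a small multinomial naive Bayes classifier written from scratch. It is a sketch under assumptions, not the implementation used in the cited work or in the invention: Laplace smoothing with `alpha=1.0` is assumed, words unseen in training are ignored at prediction time, and the class name `MultinomialNaiveBayes` and the training comments are made up for illustration.

```python
import math
from collections import Counter


class MultinomialNaiveBayes:
    """Minimal multinomial naive Bayes over tokenized comments (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {word for doc in docs for word in doc}
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            class_docs = [d for d, y in zip(docs, labels) if y == c]
            self.log_prior[c] = math.log(len(class_docs) / len(docs))
            counts = Counter(word for doc in class_docs for word in doc)
            total = sum(counts.values()) + self.alpha * len(self.vocab)
            self.log_likelihood[c] = {
                w: math.log((counts[w] + self.alpha) / total) for w in self.vocab
            }
        return self

    def predict(self, doc):
        # Sum log-probabilities; words never seen in training are skipped.
        scores = {
            c: self.log_prior[c]
            + sum(self.log_likelihood[c][w] for w in doc if w in self.vocab)
            for c in self.classes
        }
        return max(self.classes, key=lambda c: scores[c])


# Invented training comments from one hypothetical source project; 1 = SATD.
train_docs = [
    ["todo", "fix", "this", "hack"],
    ["hack", "remove", "later"],
    ["returns", "the", "parsed", "value"],
    ["the", "input", "stream"],
]
train_labels = [1, 1, 0, 0]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["todo", "hack", "cleanup"]))
print(model.predict(["the", "value"]))
```

In an ensemble of the kind described, one such classifier would be trained per source project, and their individual predictions would then be combined by the voting rule.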

Application Information

IPC(8): G06F16/35, G06K9/62
CPC: G06F16/35, G06F18/241
Inventors: 殷茗, 徐悦然, 田嘉毅, 朱奎宇, 马怀宇, 张小港, 薛禹坤, 吴瑜
Owner NORTHWESTERN POLYTECHNICAL UNIV