Code reviewer recommendation system and method based on random forest classifier

A recommendation technology based on random forests, applied in fields such as instruments, computer parts, and software testing/debugging. It addresses the problems of large feature dimension, unsatisfactory recommendation results, and difficulty of integration, and achieves the effect of saving communication costs.

Active Publication Date: 2020-07-17
NANJING UNIV

AI Technical Summary

Problems solved by technology

Rule-based methods can usually mine reviewer selection rules from only a single perspective, so their recommendations are usually not ideal. Machine-learning-based recommendation methods can analyze reviewer selection strategies comprehensively from multiple perspectives and thereby improve the recommendation effect, but the currently commonly used reviewer re



Examples


Embodiment 1

[0075] As shown in Figure 1, this embodiment of the present invention proposes a code reviewer recommendation system based on a random forest classifier, comprising an input module 310, a calculation module 320, a model training module 330, and a recommendation result output module 340.

[0076] In the technical solution of this embodiment, the reviewer recommendation problem is converted into a machine learning multi-classification task. The project's historical review log and historical code change log are analyzed to mine personnel information, code change information, and file path information, which are converted into personnel activity features, code change features, and file weight features. All review records contained in the project are converted into feature vectors and used as a data set, which is input into the random forest model to train a random forest classifier; extract features from the historical code review records of t...
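Because the problem is framed as multi-classification, each candidate reviewer corresponds to one class, and the abstract describes outputting the N classes with the highest probability as the recommendation. The following is a minimal sketch of that final selection step only, assuming the trained classifier has already produced a probability per reviewer. The function name `top_n_reviewers`, the reviewer names, and the probability values are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Tuple


def top_n_reviewers(class_probabilities: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n reviewers (classes) with the highest predicted probability."""
    if n <= 0:
        return []
    # Sort by descending probability; ties are broken by reviewer name
    # so that the result is deterministic (a choice made for this sketch).
    ranked = sorted(class_probabilities.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical classifier output for one code change (made-up values).
example_probabilities = {"alice": 0.15, "bob": 0.50, "carol": 0.25, "dave": 0.10}
recommended = top_n_reviewers(example_probabilities, n=2)
```

In this sketch the classifier itself is out of scope; any model that yields per-class probabilities could feed the same selection step.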

Embodiment 2

[0112] As shown in Figure 3, the technical solution of this embodiment proposes a code reviewer recommendation method based on a random forest classifier. The method recommends a suitable reviewer for the code to be reviewed according to the historical code review records, saving the communication time that would otherwise be needed without a recommendation. The specific steps are as follows:

[0113] Step 210: acquire the project's historical code review records, which include:

[0114] a code submission log matching the software project, containing the code submitter, submission time, branch, number of newly added code lines, number of deleted code lines, and file path set;

[0115] a code review log matching the software project, containing the reviewer and review time;

[0116] Step 220, according to the historical code review records of the project, mining personnel activity characteristics, cod...
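The fields listed under step 210 could be modeled as plain records before any feature mining takes place. The sketch below is only one possible representation: the class and attribute names are hypothetical, and pairing each submission with exactly one review is an assumption, since the text does not say how the two logs are joined.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet


@dataclass(frozen=True)
class CodeSubmission:
    """One entry of the code submission log (fields as listed in step 210)."""
    submitter: str
    submitted_at: datetime
    branch: str
    lines_added: int
    lines_deleted: int
    file_paths: FrozenSet[str]


@dataclass(frozen=True)
class CodeReview:
    """One entry of the code review log (fields as listed in step 210)."""
    reviewer: str
    reviewed_at: datetime


@dataclass(frozen=True)
class ReviewRecord:
    """A submission paired with its review (the one-to-one pairing is assumed)."""
    submission: CodeSubmission
    review: CodeReview


# Hypothetical example record; all values are made up for illustration.
record = ReviewRecord(
    submission=CodeSubmission(
        submitter="alice",
        submitted_at=datetime(2019, 3, 4, 10, 30),
        branch="main",
        lines_added=42,
        lines_deleted=7,
        file_paths=frozenset({"src/app/login.py", "src/app/session.py"}),
    ),
    review=CodeReview(reviewer="bob", reviewed_at=datetime(2019, 3, 5, 9, 0)),
)
```

In the multi-classification framing, the `reviewer` field of such a record would serve as the classification label, while the remaining fields are the raw material for the features.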

Embodiment 3

[0136] Referring to Figure 4, this embodiment provides a time series validation model to evaluate the effect of the code reviewer recommendation method based on the random forest classifier described in Embodiment 2.

[0137] In this embodiment, the method is verified with a time series model. The processed historical code review records are used as a data set, which is divided into a training set and a test set in time order. As shown in Figure 4, the data set is first sorted by review time and divided into N time slices, where N depends on the time interval between the project's first code review and data collection. For example, if the first code review of project A took place in January 2019 and the data was collected in November 2019, that is, 10 months after the first submission, then with "month" as the time slice all records of project A are divided into 10 time slices. Then, based on the divided ...
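The time slicing described in paragraph [0137] can be sketched as follows. Bucketing by calendar month, folding any records from the collection month into the last slice, and the forward-chaining way of forming training and test sets (train on the first k slices, test on slice k+1) are all assumptions of this sketch, because the source text is cut off before it explains how the slices are used. All function names are hypothetical.

```python
from datetime import datetime
from typing import Iterator, List, Sequence, Tuple

# A record is reduced here to (review_time, reviewer) for illustration.
Record = Tuple[datetime, str]


def month_index(first: datetime, current: datetime) -> int:
    """Number of calendar months between two timestamps."""
    return (current.year - first.year) * 12 + (current.month - first.month)


def split_into_time_slices(records: Sequence[Record], collected_at: datetime) -> List[List[Record]]:
    """Sort records by review time and bucket them into monthly slices."""
    ordered = sorted(records, key=lambda r: r[0])
    if not ordered:
        return []
    first = ordered[0][0]
    # N is the number of months between the first review and data collection.
    n_slices = max(month_index(first, collected_at), 1)
    slices: List[List[Record]] = [[] for _ in range(n_slices)]
    for rec in ordered:
        # Assumption: records from the collection month join the last slice.
        idx = min(month_index(first, rec[0]), n_slices - 1)
        slices[idx].append(rec)
    return slices


def forward_folds(slices: List[List[Record]]) -> Iterator[Tuple[List[Record], List[Record]]]:
    """Assumed scheme: train on slices 1..k, test on slice k+1."""
    for k in range(1, len(slices)):
        train = [rec for s in slices[:k] for rec in s]
        yield train, list(slices[k])


# Project A from the text: first review in January 2019, collected November 2019.
records = [(datetime(2019, m, 15), f"reviewer{m}") for m in range(1, 11)]
slices = split_into_time_slices(records, collected_at=datetime(2019, 11, 1))
folds = list(forward_folds(slices))
```

With the made-up records above, the ten months from January to October each form one slice, matching the 10 time slices given for project A.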



Abstract

The invention discloses a code reviewer recommendation system and method based on a random forest classifier. The system comprises an input module, a calculation module, a model training module, and a recommendation result output module. The method comprises the following steps: inputting the historical code review records of a project; mining personnel activity, code change, and file weight features from the historical code review records; taking the reviewer of each historical review record as the classification label, taking the calculated feature vectors as a data set, and inputting a training set into the random forest model to train a classification model; and extracting features from the code change to be reviewed, inputting the features into the classifier, and outputting the N categories with the highest probability as the recommended reviewers. According to the invention, in a large-scale project, an appropriate reviewer is recommended for the code change to be reviewed according to the historical review records, a reference basis is provided for the selection of the reviewer, and the communication cost is saved.

Description

Technical field

[0001] The invention relates to the technical field of software development, in particular to a code reviewer recommendation system and method based on a random forest classifier.

Background

[0002] As an important means of ensuring code quality, code review has become increasingly prominent in most software companies. However, selecting reviewers usually takes a certain amount of communication time, and finding the right reviewer in time has become an important problem in code review practice. Sometimes the selection of reviewers is not reasonable enough, which may cause many problems for subsequent delivery. Appropriate code reviewers need relevant knowledge and familiarity with the submitted code. At the same time, large-scale projects usually involve a large number of developers, so how to select suitable reviewers from a large number of candidates has become a problem in software development practice. a ...


Application Information

IPC(8): G06F16/9535, G06F11/36, G06F8/77, G06K9/62
CPC: G06F16/9535, G06F11/3616, G06F8/77, G06F18/24323
Inventor: 马瑾瑜, 张贺, 杨岚兴, 荣国平, 邵栋
Owner NANJING UNIV