A Relevant Patch Recommendation Method Based on Heterogeneous Data

A recommendation method based on heterogeneous data, applied in the field of code review. It addresses the high labor and time cost caused by the large number of code iterations, and it improves work efficiency, saves cost, and improves the reliability and stability of recommendations.

Active Publication Date: 2022-02-22
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

However, because each project has a large number of code iterations and many code files, manually finding other submissions related to the current problem submission, or manually marking in advance whether submissions are related, incurs high labor and time costs.


Examples

[0092] 1. Use a web crawler to automatically obtain the review records and the associated files or data of multiple projects on Gerrit. The crawled data is heterogeneous and of many types, including at least the following three:

[0093] 1) The basic information of the patch (the source of the patch meta-features): discrete data such as the submitter and the reviewer. The crawler collects the submission information of each patch, including the name of the patch's reviewer, the name of its submitter, the name of its author, the name of the project to which the patch belongs, the name of the project branch to which the patch belongs, the time when the patch was submitted, the number of people participating in the patch review, the number of modified code files, etc.

[0094] 2) The brief description of each patch submission, which is text data.

[0095] 3) The patch is crawled by means of a craw...
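
As a rough illustration of how the first two data types listed above (discrete meta-features and the text description) might be spliced into a single patch feature vector, here is a minimal Python sketch. The field names, the hashed one-hot and bag-of-words encodings, the block size `N_BUCKETS`, and the sample record are all assumptions made for illustration; the source does not specify how features are encoded, and the third data type is truncated in the source, so it is not modeled here.

```python
import zlib
from typing import Dict, List

N_BUCKETS = 8  # hypothetical width of each hashed feature block


def bucket(value: str, n: int = N_BUCKETS) -> int:
    """Map a string to a stable bucket index (crc32 is deterministic across runs)."""
    return zlib.crc32(value.encode("utf-8")) % n


def one_hot(value: str, n: int = N_BUCKETS) -> List[float]:
    """Encode one discrete value (e.g. a submitter name) as a hashed one-hot block."""
    vec = [0.0] * n
    vec[bucket(value, n)] = 1.0
    return vec


def bag_of_words(text: str, n: int = N_BUCKETS) -> List[float]:
    """Encode a short text description as hashed token counts."""
    vec = [0.0] * n
    for token in text.lower().split():
        vec[bucket(token, n)] += 1.0
    return vec


def patch_vector(patch: Dict) -> List[float]:
    """Splice discrete, numeric, and text features into one flat vector."""
    discrete: List[float] = []
    for key in ("submitter", "author", "project", "branch"):
        discrete += one_hot(patch[key])
    numeric = [float(patch["num_reviewers"]), float(patch["num_files"])]
    text = bag_of_words(patch["description"])
    return discrete + numeric + text


# Hypothetical crawled record; the keys are illustrative, not Gerrit's schema.
example = {
    "submitter": "alice",
    "author": "bob",
    "project": "demo/project",
    "branch": "main",
    "num_reviewers": 3,
    "num_files": 5,
    "description": "Fix null pointer in parser",
}
vector = patch_vector(example)
```

Hashing keeps the vector length fixed regardless of how many distinct names or words appear, at the price of occasional collisions; a learned vocabulary or embedding would be an equally plausible choice.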

Abstract

The invention discloses a method for recommending related patches based on heterogeneous data. The method crawls multiple kinds of heterogeneous code review data, cleans the data, splices the features of the heterogeneous data into patch feature vectors, and forms patch pairs, where a pair whose patch is related to the predicted patch is a positive sample and a pair whose patch is unrelated to the predicted patch is a negative sample. After the positive and negative samples are marked with binary classification labels, the data is divided into a training set and a validation set, and the training set is used to train three models separately: logistic regression, random forest, and LightGBM. Each model yields a probability and a prediction label, and the accuracy of each model is calculated from its prediction labels. Finally, a prediction score is constructed as the weighted sum of the fusion weights and the corresponding probabilities, and the optimal prediction score is obtained. The invention uses machine learning to evaluate the correlation of data submitted to the code review system in order to obtain the optimal recommendation, which improves the reliability and stability of the recommendation and saves labor cost.
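
The final fusion step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented procedure: the abstract does not say how the fusion weights are derived, so the sketch assumes each weight is the model's validation accuracy divided by the sum of all accuracies. The 0.5 decision threshold, the function names, and the sample probabilities are likewise hypothetical.

```python
from typing import Dict, List, Sequence


def to_labels(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn predicted probabilities into binary prediction labels."""
    return [1 if p >= threshold else 0 for p in probs]


def accuracy(pred: Sequence[int], true: Sequence[int]) -> float:
    """Fraction of prediction labels that match the true labels."""
    return sum(int(p == t) for p, t in zip(pred, true)) / len(true)


def fusion_weights(accuracies: Sequence[float]) -> List[float]:
    """Assumed rule: weights proportional to accuracy, normalized to sum to 1."""
    total = sum(accuracies)
    return [a / total for a in accuracies]


def fused_scores(model_probs: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Weighted sum of the per-model probabilities for each patch pair."""
    n_pairs = len(model_probs[0])
    return [sum(w * probs[i] for w, probs in zip(weights, model_probs))
            for i in range(n_pairs)]


# Hypothetical validation-set outputs for four patch pairs (1 = related).
true_labels = [1, 0, 1, 0]
probs_by_model: Dict[str, List[float]] = {
    "logistic_regression": [0.9, 0.4, 0.3, 0.2],
    "random_forest": [0.8, 0.6, 0.7, 0.1],
    "lightgbm": [0.7, 0.2, 0.9, 0.3],
}

accs = [accuracy(to_labels(p), true_labels) for p in probs_by_model.values()]
weights = fusion_weights(accs)
scores = fused_scores(list(probs_by_model.values()), weights)

# Rank candidate pairs from most to least likely to be related.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In practice the three probability lists would come from the trained logistic regression, random forest, and LightGBM models, and the highest-scoring pairs would be presented as the recommended related patches.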

Description

Technical field

[0001] The invention relates to the field of code review, and in particular to a related patch recommendation method based on heterogeneous data.

Background

[0002] Code review is an important basis for the smooth iteration of software engineering projects. It consists of many complex small tasks, including revising code to meet coding standards, adding supplementary code comments, etc. At present, the field of software engineering widely relies on manual review to update code and manage versions, and the labor cost is high.

[0003] At present, the software engineering industry generally uses git, Gerrit, and other similar systems for code management and review. Each code update on these systems is called a code modification or a patch. For each patch submission, these systems provide roles such as reviewer and committer to handle queries or comments. The basic process for patch review is:

[0004] 1) The programmer, that is, the author ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F8/658, G06F40/284
CPC: G06F8/658
Inventor: 郑子彬, 陈志豪, 李全忠
Owner: SUN YAT SEN UNIV