A Method of Code Digest Generation Based on Maximum Entropy Model

A code summarization technology based on a maximum entropy model, applied in the fields of code compilation, program code conversion, etc. It addresses problems such as comments losing consistency with the code, developers being unable to specify terms of interest, and inaccurate code summaries, and it achieves a reduced workload and good scalability.

Active Publication Date: 2018-07-06
FUJIAN UNIV OF TECH
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] The vast majority of existing code summaries are created manually. Writing them takes considerable developer effort, and maintaining them is costly. Academia and industry have proposed some code summarization techniques based on word frequency, but these techniques usually consider only the number and frequency of occurrences of different terms and ignore where each term appears.
Many studies have shown that the importance of a term in code is closely related to the type of code element (class, method, variable, etc.) to which it belongs. For example, terms in class names are often far more important than terms in comments. In addition, existing technical solutions do not let developers specify terms to emphasize or ignore. For example, in older legacy code the comments may have long since lost consistency with the code, yet existing techniques still treat comments as being as important as the code and may extract obsolete words from them into the code summary.
The closest existing approach is the word-frequency-based code summarization technique proposed by Haiduc and other researchers at Wayne State University in the United States, whose code summaries are inaccurate.


Examples

Embodiment Construction

[0022] As shown in Figure 1, the code summary generation method based on the maximum entropy model of the present invention includes the following steps:

[0023] Step 1. According to the limited sample template, use the abstract syntax tree to parse the code and collect training samples;
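Step 1 can be illustrated with a minimal sketch. It assumes Python source and Python's standard `ast` module; the source does not name a target language. The sample layout (an element type paired with a small context dictionary) and the choice of context fields are hypothetical, chosen only to show how an abstract syntax tree yields labeled samples.

```python
import ast

# Hypothetical example input; any parseable Python source would do.
EXAMPLE_SOURCE = """
class Account:
    def deposit(self, amount):
        balance = amount
"""

def collect_samples(source):
    """Walk the AST and return (element_type, context) training samples.

    The context here is just the element's name and its parent node kind;
    a real sample template would likely record more information.
    """
    samples = []

    def visit(node, parent):
        label, name = None, None
        if isinstance(node, ast.ClassDef):
            label, name = "class", node.name
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            label, name = "method", node.name
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            label, name = "variable", node.id
        if label is not None:
            context = {"name": name, "parent": type(parent).__name__}
            samples.append((label, context))
        for child in ast.iter_child_nodes(node):
            visit(child, node)

    tree = ast.parse(source)
    visit(tree, tree)
    return samples

SAMPLES = collect_samples(EXAMPLE_SOURCE)
```

Each sample pairs a code element type with the context in which the element appears, which is the form a classifier in Step 2 would consume.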

[0024] Step 2. According to the training samples, a general iterative algorithm is used to construct a code element classifier;

[0025] For the classification problem, let A denote the set of all possible code element types and B the set of context information in which a code element is located. A binary function with values in {0,1} can then be defined to represent a feature:

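[0026] $$f(a,b)=\begin{cases}1, & \text{if } (a,b)\in(A,B) \text{ and the limiting conditions are satisfied}\\ 0, & \text{otherwise}\end{cases}$$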

[0027] Here, f(a,b)=1 if (a,b)∈(A,B) and the limiting conditions are satisfied; otherwise, f(a,b)=0;
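A binary feature of this kind can be written as a small indicator function. The sketch below is illustrative only: the helper `make_feature`, the dictionary-shaped context, and the example condition ("the parent node is a module") are assumptions, not taken from the source.

```python
def make_feature(target_type, condition):
    """Build a binary feature f(a, b) with values in {0, 1}.

    f(a, b) is 1 when the element type a equals target_type and the
    context b satisfies the given condition; otherwise it is 0.
    """
    def f(a, b):
        return 1 if a == target_type and condition(b) else 0
    return f

# Hypothetical feature: "a is a class and its parent node is the module".
f_class_at_top_level = make_feature(
    "class", lambda b: b.get("parent") == "Module"
)

FIRES = f_class_at_top_level("class", {"parent": "Module"})
WRONG_TYPE = f_class_at_top_level("method", {"parent": "Module"})
WRONG_CONTEXT = f_class_at_top_level("class", {"parent": "ClassDef"})
```

The feature fires only when both the element type and the context condition match, so each feature ties one type to one piece of contextual evidence.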

[0028] If determining the type a∈A of a code element is regarded as an event, and the context information of the code element is regarded as the condition b∈B under which the event occurs, then t...
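As a rough illustration of Step 2, the sketch below fits a conditional maximum entropy model of the standard form p(a|b) ∝ exp(Σ λ_i f_i(a,b)) over binary features. It is a sketch under stated assumptions: plain gradient ascent on the log-likelihood is swapped in for the iterative algorithm the source mentions, and the labels, features, learning rate, and toy samples are all hypothetical.

```python
import math

LABELS = ["class", "method", "variable"]

def indicator(target_type, parent_kind):
    """Binary feature: 1 if a == target_type and b's parent matches."""
    return lambda a, b: 1 if a == target_type and b["parent"] == parent_kind else 0

FEATURES = [
    indicator("class", "Module"),
    indicator("method", "ClassDef"),
    indicator("variable", "Assign"),
]

def train_maxent(samples, labels, features, epochs=200, lr=0.5):
    """Fit feature weights by gradient ascent (an assumed stand-in optimizer)."""
    weights = [0.0] * len(features)

    def probs(b):
        scores = [sum(w * f(a, b) for w, f in zip(weights, features)) for a in labels]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        z = sum(exps)
        return [e / z for e in exps]

    for _ in range(epochs):
        grad = [0.0] * len(features)
        for a_true, b in samples:
            p = probs(b)
            for i, f in enumerate(features):
                expected = sum(p_a * f(a, b) for p_a, a in zip(p, labels))
                grad[i] += f(a_true, b) - expected  # observed minus expected
        for i in range(len(weights)):
            weights[i] += lr * grad[i] / len(samples)

    def classify(b):
        p = probs(b)
        return labels[max(range(len(labels)), key=p.__getitem__)]

    return weights, classify

TOY_SAMPLES = [
    ("class", {"parent": "Module"}),
    ("method", {"parent": "ClassDef"}),
    ("variable", {"parent": "Assign"}),
]
WEIGHTS, CLASSIFY = train_maxent(TOY_SAMPLES, LABELS, FEATURES)
PREDICTED = CLASSIFY({"parent": "ClassDef"})
```

The gradient for each weight is the observed feature count minus the count expected under the current model, which is the quantity any maximum entropy training procedure drives to zero.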


Abstract

The present invention provides a code summary generation method based on a maximum entropy model. The method collects training samples according to a limited sample template; constructs a code element classifier based on a maximum entropy model from the training samples; inputs the source code to be analyzed into the classifier, identifies the code elements in it, and obtains the terms contained in each code element; denoises the acquired terms; assigns a weight to each term according to the type of code element to which the term belongs; evaluates the importance of each term according to its weight and number of occurrences; and generates a code summary based on the importance evaluation results and user-specified summary constraints, making the resulting code summaries more accurate.
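The term-weighting and summary-generation steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented procedure: the weight table, the stop-word list, the identifier-splitting rule, and the constraint parameters `max_terms` and `ignore` are all assumptions.

```python
import re
from collections import defaultdict

# Hypothetical weights per code element type (higher means more important).
TYPE_WEIGHTS = {"class": 3.0, "method": 2.0, "variable": 1.0, "comment": 0.5}
# Hypothetical noise words removed during denoising.
STOP_WORDS = {"get", "set", "the", "a", "of", "to"}

def split_terms(text):
    """Split camelCase, snake_case, or plain text into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", text)
    return [p.lower() for p in parts]

def summarize(elements, max_terms=3, ignore=()):
    """Rank terms by (type weight x occurrences) and return the top ones.

    elements: iterable of (element_type, text) pairs.
    max_terms, ignore: user-specified summary constraints.
    """
    scores = defaultdict(float)
    for element_type, text in elements:
        for term in split_terms(text):
            if term in STOP_WORDS or term in ignore or len(term) < 2:
                continue  # denoise: drop stop words and ignored terms
            scores[term] += TYPE_WEIGHTS.get(element_type, 1.0)
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:max_terms]

ELEMENTS = [
    ("class", "AccountManager"),
    ("method", "getAccountBalance"),
    ("variable", "balance"),
    ("comment", "the legacy ledger balance"),
]
SUMMARY = summarize(ELEMENTS)
SUMMARY_WITHOUT_MANAGER = summarize(ELEMENTS, ignore={"manager"})
```

Because each occurrence adds the weight of its element type, a term in a class name outranks the same number of occurrences in comments, and the `ignore` set lets a user exclude terms they consider obsolete.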

Description

Technical field

[0001] The invention relates to a code summary generation method based on a maximum entropy model.

Background art

[0002] At each stage of the software life cycle, developers need to spend a lot of time reading program code. During this time, developers tend to avoid understanding the entire system and choose to focus only on the piece of code relevant to their task. To achieve this, developers often skim the code (e.g. reading only method signatures). When the knowledge acquired through skimming is not enough to understand the code snippet, they have to spend effort reading the code in detail (such as the content of the method body). Although skimming is efficient, it easily misses useful information in the code, while detailed reading is too time-consuming, and the knowledge gained by skimming the code is difficult to share with other developers.

[0003] As a common alternative to skimming, developers often also read code summar...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F8/41
Inventor: 王金水, 郑建生, 邹复民, 赵钊林, 薛醒思, 黄丽丽, 唐郑熠, 杨荣华, 聂明星
Owner: FUJIAN UNIV OF TECH