Low-frequency word-aware source code annotation generation method and tool

A technology concerning source code and low-frequency words, applied in the computer field. It addresses the weak prediction of low-frequency words and the poor performance of the retrieval fusion process, both of which reduce annotation efficiency. It achieves an optimized calculation method, stronger prediction of low-frequency words, and improved efficiency.

Pending Publication Date: 2022-04-12
BEIHANG UNIV

AI Technical Summary

Problems solved by technology

[0009] In addition, fusion-based annotation generation methods predict combined low-frequency words poorly, and their retrieval fusion process performs poorly, which seriously reduces the efficiency with which the whole model generates annotations.
[0010] Moreover, existing methods for automatically generating code comments target English comments only; research on Chinese comments is lacking.


Examples

Embodiment Construction

[0032] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in combination with specific embodiments and with reference to the accompanying drawings. It should be understood that these descriptions are exemplary only, and are not intended to limit the scope of the present invention. Also, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concept of the present invention.

[0033] To address the defects of fusion-based annotation generation methods, the present invention proposes an efficient, low-frequency word-aware source code annotation generation method that predicts low-frequency words more effectively and makes the retrieval fusion process more efficient. It uses BPE (byte pair encoding) word splitting to convert low-frequency words in annotations, optimizes the calculation method of the retrieva...
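The BPE word splitting mentioned in [0033] can be illustrated with a toy sketch. This is not the patent's implementation: the function names (`learn_bpe`, `apply_merge`, `segment`, `merge_subwords`), the `</w>` end-of-word marker, and the tiny corpus are all assumptions made for illustration. The sketch learns merge rules from frequent adjacent symbol pairs, splits a rare word into more frequent sub-words, and joins sub-words back into words.

```python
from collections import Counter

END = "</w>"  # assumed end-of-word marker, a common BPE convention


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy training loop)."""
    vocab = Counter(tuple(w) + (END,) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken lexicographically.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged_vocab = Counter()
        for symbols, freq in vocab.items():
            merged_vocab[apply_merge(symbols, best)] += freq
        vocab = merged_vocab
    return merges


def segment(word, merges):
    """Split a word into sub-words by replaying the learned merges in order."""
    symbols = tuple(word) + (END,)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


def merge_subwords(subwords):
    """Join a generated sub-word sequence back into space-separated words."""
    return "".join(subwords).replace(END, " ").strip()


corpus = ["lower", "lowest", "newer", "newest", "low", "new"]
merges = learn_bpe(corpus, 10)
pieces = segment("widest", merges)  # "widest" never appears in the corpus
print(pieces, "->", merge_subwords(pieces))
```

Because `segment` starts from individual characters, the same sketch applies to Chinese text, where each character is one initial symbol. A word that never appears in the training corpus is still expressed through sub-words that do, which is the property the method relies on for low-frequency words.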


Abstract

Various embodiments of the disclosure relate to a low-frequency word-aware source code annotation generation method and tool. The method comprises the following steps. First, characters or character strings in the annotations of source code are merged on the basis of a byte pair encoding (BPE) model, so that the annotations are segmented into sub-words. Second, a fusion weight is obtained from the edit distance between a first character or character string and a second character or character string, while semantic retrieval is performed on the second character or character string through Faiss; the first character or character string belongs to a source code retrieved based on bit-parallel computing, and the second character or character string belongs to the to-be-tested source code. Third, based on the fusion weight, the retrieved source code information is fused into the to-be-tested source code, a sequence of sub-words is generated, and the sub-words are merged to obtain an annotation of the to-be-tested source code. In this way, combined low-frequency words in code annotations can be predicted effectively, source code annotations can be generated efficiently, and the method can be extended to Chinese annotations.
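The abstract pairs edit distance with bit-parallel computing but this excerpt does not give the formulas. As a hedged illustration, the sketch below computes Levenshtein distance with the well-known Myers/Hyyrö bit-vector technique, then turns it into a weight in [0, 1]. The function names and the normalization `1 - distance / max_length` are assumptions for illustration, not the patent's definition of its weight.

```python
def bit_parallel_edit_distance(a, b):
    """Levenshtein distance between a and b using Myers/Hyyro bit-vector updates.

    Each column of the dynamic-programming table is encoded as two bit masks
    (positive and negative vertical deltas), so one column is updated with a
    constant number of integer operations.
    """
    m = len(a)
    if m == 0:
        return len(b)
    # peq[ch] has bit i set when a[i] == ch.
    peq = {}
    for i, ch in enumerate(a):
        peq[ch] = peq.get(ch, 0) | (1 << i)
    mask = (1 << m) - 1   # keep Python's unbounded ints to m bits
    high = 1 << (m - 1)   # bit for the last row of the table
    pv, mv, score = mask, 0, m
    for ch in b:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = (mv | ~(xh | pv)) & mask
        mh = pv & xh
        # The horizontal delta in the last row updates the running distance.
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph = (ph << 1) | 1   # the "| 1" encodes the top boundary row D[0][j] = j
        mh = mh << 1
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv & mask
    return score


def similarity_weight(candidate, query):
    """Hypothetical weight in [0, 1]: 1 means identical, 0 means maximally distant."""
    longest = max(len(candidate), len(query))
    if longest == 0:
        return 1.0
    return 1.0 - bit_parallel_edit_distance(candidate, query) / longest


retrieved = "int add(int a, int b) { return a + b; }"
under_test = "int sub(int a, int b) { return a - b; }"
print(bit_parallel_edit_distance(retrieved, under_test))
print(round(similarity_weight(retrieved, under_test), 3))
```

Python integers are unbounded, so the sketch handles strings of any length with a single mask; a fixed-width implementation in C or similar would process the pattern in machine-word blocks instead.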

Description

Technical field

[0001] The invention relates to the field of computers, in particular to a method and tool for generating low-frequency word-aware source code annotations.

Background art

[0002] The goal of automatic source code annotation is to automatically generate annotations that accurately describe the source code. These annotations include descriptions in the code and related content in development documents, which are crucial to understanding and maintaining software systems. With these annotations, developers can understand and maintain software systems more effectively and quickly, free themselves from the tedious task of writing annotations, read code faster, grasp the semantics of source code quickly and accurately, quickly locate specific code, and determine which parts of the source code need detailed analysis, so that they can develop more efficiently.

[0003] At present, the research on automatic source code annotation is mainly divided into three...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F8/73, G06F40/216, G06F40/242, G06F40/289, G06F40/30, G06F16/335
Inventors: 王旭, 唐宇, 黄元才, 张建, 刘旭东
Owner: BEIHANG UNIV