Duplicated code detection method and device based on abstract syntax tree

A duplicated code detection technique based on abstract syntax trees, applied in the field of duplicated code detection. It addresses the problem that irrelevant information such as whitespace and comments reduces detection accuracy.

Inactive Publication Date: 2016-09-28
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT
4 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0006] To solve the problem that irrelevant information such as whitespace and comments reduces detection accuracy during duplicated code detection, the present invention provides a method and device for detecting duplicated code based on an abstract syntax tree.
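Why an abstract syntax tree removes this noise can be illustrated with a minimal sketch. It uses Python's standard `ast` module as a stand-in parser; the patent excerpt does not name a language or parser, so that choice, the helper name `structure`, and the two sample snippets are assumptions made for illustration only.

```python
import ast

# Two versions of the same function. They differ only in comments,
# blank lines and indentation width, which is the "irrelevant information".
original = '''
def total(items):
    # sum the prices
    result = 0
    for item in items:
        result += item.price
    return result
'''

reformatted = '''
def total(items):

    result = 0          # accumulator


    for item in items:
            result += item.price   # wider indent, extra comment
    return result
'''


def structure(source: str) -> str:
    """Return a textual dump of the AST (hypothetical helper).

    ast.dump omits line and column attributes by default, so layout
    does not appear in the result.
    """
    return ast.dump(ast.parse(source))


same_structure = structure(original) == structure(reformatted)
print(same_structure)
```

A plain text comparison of the two strings would report them as different, while the tree comparison treats them as the same code.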

Examples

Detailed Description of the Embodiments

[0055] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0056] As shown in Figure 1, the duplicated code detection method based on the abstract syntax tree provided in the embodiment of the present invention specifically includes the following steps:

[0057] Step 1, construct the abstract syntax trees of the code to be tested and the sample code respectively;

[0058] Step 2, classify the subtrees of the two abstract syntax trees according to the type of the root node;

[0059] Step 3, compare the subtrees ...
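Steps 1 and 2 can be sketched roughly as follows. This is an illustrative sketch only: it assumes Python source and the standard `ast` module, and the function name `classify_subtrees` and both sample snippets are hypothetical. Every node of the tree is treated as the root of one subtree, and subtrees are grouped by the class name of that root node.

```python
import ast
from collections import defaultdict


def classify_subtrees(source: str) -> dict[str, list[ast.AST]]:
    """Build the AST (Step 1) and group subtrees by root node type (Step 2)."""
    tree = ast.parse(source)
    groups: dict[str, list[ast.AST]] = defaultdict(list)
    for node in ast.walk(tree):  # each node is the root of one subtree
        groups[type(node).__name__].append(node)
    return dict(groups)


code_under_test = "def f(a, b):\n    if a > b:\n        return a\n    return b\n"
sample_code = "def g(x):\n    if x > 0:\n        return x\n    return -x\n"

groups_test = classify_subtrees(code_under_test)
groups_sample = classify_subtrees(sample_code)

# Only root types present in both trees need to be compared later on.
shared_types = sorted(set(groups_test) & set(groups_sample))
print(shared_types)
```

Grouping first means a later comparison never has to pair, for example, an `If` subtree with a `Return` subtree, which cuts down the number of candidate pairs.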

Abstract

The invention discloses a duplicated code detection method and device based on an abstract syntax tree. The method comprises the following steps: constructing abstract syntax trees for the code to be detected and for a sample code respectively; classifying the subtrees of the two abstract syntax trees according to their root node types; comparing the subtrees of the two abstract syntax trees that have the same root node type and judging whether a common subtree exists; and, when a common subtree exists, obtaining the code corresponding to the common subtree and judging that code to be duplicated code. With this method, the influence of irrelevant information such as spacing, line breaks, indentation and comments on the similarity judgment can be completely avoided, and duplicated code can be detected quickly.
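The whole procedure in the abstract can be sketched as below, under several stated assumptions. The excerpt does not give the rule for deciding that two subtrees are "common", so this sketch assumes exact structural equality of `ast.dump` output. It also assumes Python source, restricts itself to statement-level subtrees, and skips subtrees with fewer than `min_nodes` nodes so that trivial fragments are not reported. All function names and the threshold are hypothetical. `ast.get_source_segment` requires Python 3.8 or later.

```python
import ast
from collections import defaultdict


def statement_subtrees(source, min_nodes):
    """Group statement-level subtrees by root node type (hypothetical helper)."""
    groups = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        size = sum(1 for _ in ast.walk(node))  # number of nodes in this subtree
        if isinstance(node, ast.stmt) and size >= min_nodes:
            groups[type(node).__name__].append(node)
    return groups


def find_duplicates(code_under_test, sample_code, min_nodes=8):
    """Return source fragments of code_under_test that also occur in sample_code."""
    test_groups = statement_subtrees(code_under_test, min_nodes)
    sample_groups = statement_subtrees(sample_code, min_nodes)
    duplicates = []
    for root_type, test_nodes in test_groups.items():
        # Compare only against sample subtrees with the same root node type.
        sample_dumps = {ast.dump(n) for n in sample_groups.get(root_type, [])}
        for node in test_nodes:
            if ast.dump(node) in sample_dumps:  # assumed matching rule
                duplicates.append(ast.get_source_segment(code_under_test, node))
    return duplicates


code_under_test = '''
def clamp(value, low, high):
    # keep value inside the range
    if value < low:
        value = low
    if value > high:
        value = high
    return value
'''

sample_code = '''
def limit(value, low, high):
    if value<low:   value = low
    return value
'''

duplicates = find_duplicates(code_under_test, sample_code)
for fragment in duplicates:
    print(fragment)
```

In this example the `if value < low` statement is reported even though the two files lay it out differently. Because the assumed rule is exact equality, a fragment with renamed variables would not be reported; handling such cases would need a looser comparison than the one sketched here.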

Description

Technical field

[0001] The invention relates to the technical field of software applications, in particular to a method and device for detecting duplicated code based on an abstract syntax tree.

Background

[0002] When new functions are added to a large software system, or when it is maintained, duplicated code is very commonly introduced into the program. In this situation developers are often not very familiar with the structure of the code, so to avoid introducing new bugs they usually copy a code segment that implements a similar function and modify it slightly, or even copy it unchanged, reusing code to meet the current system design requirements.

[0003] Code reuse results in many similar pieces of code (called "duplicated code") in the final software product. Studies have shown that typical software systems include as much as 7% to 23% of duplicated code. In the process of software maintenance and evo...

Application Information

IPC(8): G06F11/36
CPC: G06F11/3616
Inventor: 易立, 杜翠兰, 任彦, 李鹏霄, 钮艳, 刘晓辉, 查奇文, 佟玲玲
Owner: NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT