C language-oriented source code clone detection method

A C language-oriented source code clone detection method in the field of source code analysis. It addresses problems such as a high false positive rate, a low recall rate, and difficulty in detecting Type-3 code clones.
CN110209425B · Active · Publication Date: 2022-03-15 · UNIV OF ELECTRONICS SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA
Publication Date: 2022-03-15


Abstract

The invention discloses a C language-oriented source code clone detection method. The technical solution is as follows: a context-free grammar is used to define the grammar of the C language in order to parse the source program and generate its parse tree; the whole parse tree is then transformed, and the source code is restored in text form from the transformed parse tree. The restored source code text is additionally formatted and normalized. Clone detection is performed on the resulting C functions using the LCS algorithm to obtain the clone function detection result for the function currently under detection. During clone detection, only functions whose code sequence length falls within the clone function length range allowed for the current function are used as its comparison objects. The invention can detect Type3 clones while keeping the amount of computation required for detection under control to a certain extent.
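
The LCS comparison and the length pre-filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: it reads LCS as longest common subsequence, treats each normalized source line as one sequence element, and assumes a similarity measure of LCS length divided by the longer sequence length. The function names (lcs_length, length_in_range, is_clone) and the threshold parameter are hypothetical. Under that assumed measure, the LCS can never exceed the shorter length, so a pair whose shorter-to-longer length ratio is below the threshold can be skipped without running the LCS computation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length of the longest common subsequence of two sequences of lines.
 * Uses two rolling DP rows: O(n*m) time, O(m) space.
 * Returns 0 if memory allocation fails (simplification for this sketch). */
size_t lcs_length(const char *const *a, size_t n,
                  const char *const *b, size_t m)
{
    assert((a != NULL || n == 0) && (b != NULL || m == 0));
    size_t *prev = calloc(m + 1, sizeof *prev);
    size_t *curr = calloc(m + 1, sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return 0;
    }
    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            if (strcmp(a[i - 1], b[j - 1]) == 0)
                curr[j] = prev[j - 1] + 1;
            else
                curr[j] = prev[j] > curr[j - 1] ? prev[j] : curr[j - 1];
        }
        size_t *tmp = prev;   /* the finished row becomes "prev" */
        prev = curr;
        curr = tmp;
    }
    size_t result = prev[m];
    free(prev);
    free(curr);
    return result;
}

/* Assumed pre-filter: compare two functions only if the shorter length
 * is at least threshold times the longer length. */
int length_in_range(size_t n, size_t m, double threshold)
{
    size_t lo = n < m ? n : m;
    size_t hi = n < m ? m : n;
    if (hi == 0)
        return 0;
    return (double)lo >= threshold * (double)hi;
}

/* Assumed decision rule: LCS length / longer length >= threshold. */
int is_clone(const char *const *a, size_t n,
             const char *const *b, size_t m, double threshold)
{
    if (!length_in_range(n, m, threshold))
        return 0;             /* skipped without computing the LCS */
    size_t hi = n > m ? n : m;
    return (double)lcs_length(a, n, b, m) >= threshold * (double)hi;
}
```

For example, a four-line function compared with a five-line variant that adds one statement shares an LCS of four lines, giving an assumed similarity of 4/5, so the pair would be reported as a clone at a threshold of 0.7, while a one-line function would be rejected by the length filter alone.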

Description

Technical field

[0001] The invention belongs to the technical field of clone code detection, and in particular relates to C language-oriented source code clone detection.

Background art

[0002] Current research on clone code detection generally divides cloned code into three categories:

[0003] (1) Type1: the cloned code is identical except for whitespace and comments;

[0004] (2) Type2: the cloned code is syntactically identical, but identifiers, constants, and types have been modified;

[0005] (3) Type3: on the basis of Type2, statements are further modified, for example by adding, removing, or changing statements, to produce the cloned code.
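
The three categories can be illustrated with a small hand-written C example. The functions below are hypothetical and are not taken from the patent; the _t1, _t2, and _t3 suffixes are added only so that all four versions compile in one file.

```c
/* Original function. */
int sum_array(const int *a, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Type1 clone: only whitespace and comments differ from the original. */
int sum_array_t1(const int *a, int n)
{
    int s = 0;                      /* running total */
    for (int i = 0; i < n; i++) { s += a[i]; }
    return s;
}

/* Type2 clone: same syntax, but identifiers and a type are changed. */
long total_t2(const int *values, int count)
{
    long acc = 0;
    for (int k = 0; k < count; k++) {
        acc += values[k];
    }
    return acc;
}

/* Type3 clone: on top of the Type2 changes, one statement is added. */
long total_t3(const int *values, int count)
{
    long acc = 0;
    for (int k = 0; k < count; k++) {
        if (values[k] < 0) continue;    /* added statement */
        acc += values[k];
    }
    return acc;
}
```

The Type1 and Type2 versions compute the same result as the original, whereas the added statement in the Type3 version also changes behavior by skipping negative values. A Type3 clone does not have to change behavior; this one was chosen to make the inserted statement easy to see.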

[0006] The earliest approach to code clone detection was very intuitive: the code was treated directly as plain text (a string), and the similarity of the code was judged by the similarity of the text. A representative technique is Baker's Dup, which, like a general web page similarity detection tool,...

Claims
