
Detection and extraction method of repeated fragments in software codes

A method for detecting and extracting repeated fragments in software code, applied in the field of computer programs. It addresses problems of existing approaches, such as heavy computation, insufficient stability, and poor robustness, and reduces the amount of computation required.

Status: Inactive. Publication date: 2017-01-04
UNIV OF SHANGHAI FOR SCI & TECH

AI Technical Summary

Problems solved by technology

This algorithm is slow, its results are not very good, and it has few practical applications.
[0008] 5. Method based on code quality metrics (K. Kontogiannis, R. DeMori, E. Merlo, M. Galler, M. Bernstein, Pattern matching for clone and concept detection, Journal of Automated Software Engineering 3(1–2) (1996) 77–108). This method is inefficient and computationally intensive, so it is rarely used in practice.
[0009] 6. Index-based (also known as inverted-index) approach (Benjamin Hummel, Elmar Juergens, Lars Heinemann, and Michael Conradt. Index-based code clone detection: Incremental, distributed, scalable. In the International Conference on Software Maintenance, pages 1–9, Sept. 2010). Index-based methods are very efficient, but the only index-based scheme currently available, which uses a sliding window, has poor performance and detection quality, is not stable enough, and lacks robustness.
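To make the criticized baseline concrete, the following is a minimal sketch of a fixed-size sliding window combined with an inverted index. It is an illustration only, not the implementation from the cited paper: the function names `window_index` and `repeated_windows`, the whitespace-only normalization, and the window size of 3 lines are all assumptions made for this example.

```python
from collections import defaultdict

def window_index(files, window=3):
    """Map every run of `window` consecutive non-blank lines to its locations."""
    index = defaultdict(list)
    for name, text in files.items():
        # Assumed normalization: strip whitespace and drop blank lines.
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        for i in range(len(lines) - window + 1):
            key = "\n".join(lines[i:i + window])
            index[key].append((name, i))  # i counts non-blank lines from 0
    return index

def repeated_windows(index):
    """Keep only the windows that occur at more than one location."""
    return {key: locs for key, locs in index.items() if len(locs) > 1}

files = {
    "a": "x = 1\ny = 2\nz = x + y\nprint(z)\n",
    "b": "q = 0\nx = 1\ny = 2\nz = x + y\n",
}
dups = repeated_windows(window_index(files))
```

Because the window size is fixed and ignores syntax, a window can start or end in the middle of a statement block, which is one plausible reason for the detection-quality problems described above.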



Examples


Embodiment

[0034] In this embodiment, the source code of two files to be checked, a and b, is used to extract repeated segments.



Abstract

The invention provides a method for detecting and extracting repeated fragments in software code. The method comprises steps 1 to 8 shown in the figures. Code fragments are extracted using the hierarchical information in a syntax tree, and the grammatical information within each fragment is taken into account, so that the extracted fragments are meaningful. In addition, the extraction process is controlled by a duplicate-checking mechanism based on an inverted index: if a repetition is found at a higher level of the tree, no extraction is performed at the lower levels. Most current techniques check the smallest fragments for repetition and then merge them; compared with that approach, this method greatly reduces the amount of computation. During the process, the size of the detection window is adjusted automatically according to whether a repeated context actually exists, which improves performance; detection is fast, and the method can be applied to real-world scenarios. Because the method also uses grammatical structure information, its misjudgment rate is very low.
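The top-down idea in the abstract can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the patented steps 1 to 8 (which are not reproduced on this page): it uses Python's standard `ast` module as the syntax tree, treats statements as the fragment unit, and fingerprints a subtree with `ast.dump`. The names `fingerprint`, `statements`, `build_index`, `extract_repeats`, and the `min_nodes` threshold are hypothetical, and the automatic window-size adjustment is not modeled.

```python
import ast
from collections import defaultdict

def fingerprint(node):
    # ast.dump omits line/column attributes by default, so position is ignored.
    return ast.dump(node, annotate_fields=False)

def statements(node):
    """Statements nested one level below `node`."""
    out = []
    for field in ("body", "orelse", "finalbody"):
        value = getattr(node, field, [])
        if isinstance(value, list):
            out.extend(c for c in value if isinstance(c, ast.stmt))
    return out

def build_index(tree, name, index):
    """Inverted index: subtree fingerprint -> list of (file, line)."""
    for node in ast.walk(tree):
        if isinstance(node, ast.stmt):
            index[fingerprint(node)].append((name, node.lineno))

def extract_repeats(tree, index, min_nodes=5):
    """Walk top-down; once a repeated fragment is found, skip its children."""
    found = []
    stack = statements(tree)[::-1]
    while stack:
        node = stack.pop()
        key = fingerprint(node)
        size = sum(1 for _ in ast.walk(node))
        if size >= min_nodes and key in index:
            found.append((node.lineno, index[key]))
            continue  # high-level repetition found: no low-level extraction
        stack.extend(statements(node)[::-1])
    return found

a_src = "def f(x):\n    total = 0\n    for i in range(x):\n        total += i * 2\n    return total\n"
b_src = "def g(y):\n    x = y + 1\n    for i in range(x):\n        total += i * 2\n    print(total)\n"

index = defaultdict(list)
build_index(ast.parse(b_src), "b", index)
repeats = extract_repeats(ast.parse(a_src), index)
```

In this example the two functions differ, so the walk descends into `f`; the `for` loop matches an indexed subtree of file b as a whole, and the statement inside the loop is never checked separately. That skipping is where the computational saving described in the abstract would come from in this simplified model.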

Description

Technical field

[0001] The invention belongs to the field of computer programs, and in particular relates to a method for detecting repeated segments in software code.

Background technique

[0002] Code duplication detection is very important in software development. First, it improves the maintainability of software. If duplicated code is allowed to spread throughout a code base, then whenever one copy needs to evolve or have a defect repaired, the copies elsewhere must evolve or be repaired as well, which harms maintainability. By using duplication detection to find repeated code, the repeated parts can be extracted into functions in good time. Second, it reduces legal risk in software development. Software is developed under different licenses. If a developer negligently copies code under an infectious license (such as the GNU license), it will bring serious harm to ...


Application Information

IPC(8): G06F11/36
CPC: G06F11/3608
Inventor: 张刚
Owner: UNIV OF SHANGHAI FOR SCI & TECH