Method and system for analyzing source code similarity and electronic equipment

A source code similarity technology, applied in the field of source code similarity analysis, that addresses the difficulty development teams have in identifying code with similar logic as code size and complexity increase, and the resulting impact on scenarios such as code refactoring and business merging.

Pending Publication Date: 2020-10-23
北京思特奇信息技术股份有限公司

AI Technical Summary

Problems solved by technology

[0002] When a development team works together on large-scale software, the business requirements are iterated continually, the code grows with each business change, and code size and complexity gradually increase.

Embodiment Construction

[0033] As shown in Figure 1, a method for analyzing source code similarity according to an embodiment of the present invention includes the following steps:

[0034] S1. Generate an abstract syntax tree in JSON format according to the program source code;

[0035] S2. Map the abstract syntax tree into a space vector according to the vocabulary in the abstract syntax tree;

[0036] S3. Using the cosine similarity calculation method and the space vector, calculate the cosine similarity that represents the source code similarity.
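
For reference, the standard definition of cosine similarity between two n-dimensional vectors is given below. The excerpt does not spell out the formula, so the symbols A, B and n are assumptions introduced here for illustration.

```latex
\[
\cos\theta
  = \frac{\mathbf{A}\cdot\mathbf{B}}{\lVert\mathbf{A}\rVert\,\lVert\mathbf{B}\rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}
\]
```

For example, A = (1, 2, 0) and B = (2, 4, 0) point in the same direction and give a similarity of 1, while A = (1, 0) and B = (0, 1) are orthogonal and give 0. If the vector components are non-negative counts, the value always lies between 0 and 1.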

[0037] First, an abstract syntax tree in JSON format is generated from the program source code; the abstract syntax tree is then mapped into a space vector according to the vocabulary in the abstract syntax tree; finally, the cosine similarity that represents the source code similarity is calculated from the space vector using the cosine similarity calculation method. This can assist the development team in identifying source code with repeated or...
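
The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the excerpt does not say which programming language is analyzed or how the vocabulary is defined, so the sketch assumes Python source parsed with the standard `ast` module, takes the AST node-type names as the vocabulary, and uses hypothetical names (`ast_to_json`, `vocabulary_counts`, `to_vectors`, `cosine_similarity`, `source_similarity`, and the `node_type` key).

```python
import ast
import json
import math
from collections import Counter


def ast_to_json(source: str) -> str:
    """S1 (sketch): parse Python source and serialize its AST as JSON."""

    def to_obj(node):
        if isinstance(node, ast.AST):
            # "node_type" is a hypothetical key chosen for this sketch.
            obj = {"node_type": type(node).__name__}
            for field, value in ast.iter_fields(node):
                obj[field] = to_obj(value)
            return obj
        if isinstance(node, list):
            return [to_obj(item) for item in node]
        if node is None or isinstance(node, (str, int, float, bool)):
            return node
        return repr(node)  # e.g. bytes or complex constants

    return json.dumps(to_obj(ast.parse(source)))


def vocabulary_counts(ast_json: str) -> Counter:
    """Count node-type names in the JSON tree (assumed vocabulary)."""
    counts = Counter()
    stack = [json.loads(ast_json)]
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            counts[item["node_type"]] += 1
            stack.extend(item.values())
        elif isinstance(item, list):
            stack.extend(item)
    return counts


def to_vectors(a: Counter, b: Counter):
    """S2 (sketch): map both trees onto one shared vocabulary axis."""
    vocab = sorted(set(a) | set(b))
    return [a[term] for term in vocab], [b[term] for term in vocab]


def cosine_similarity(u, v) -> float:
    """S3: cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def source_similarity(src_a: str, src_b: str) -> float:
    """Run S1 to S3 on two source strings."""
    counts_a = vocabulary_counts(ast_to_json(src_a))
    counts_b = vocabulary_counts(ast_to_json(src_b))
    return cosine_similarity(*to_vectors(counts_a, counts_b))


if __name__ == "__main__":
    first = "def add(a, b):\n    return a + b\n"
    second = "def plus(x, y):\n    return x + y\n"
    print(source_similarity(first, second))
```

Because this sketch counts only node types, the two functions in the usage example, which differ only in identifier names, score approximately 1.0. Including identifiers or constants in the vocabulary would make the measure sensitive to naming as well as structure; which choice the patent makes is not stated in this excerpt.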

Abstract

The invention relates to a method and a system for analyzing source code similarity, and to electronic equipment. The method comprises the following steps: first, generating an abstract syntax tree in JSON format from the program source code; next, mapping the abstract syntax tree into a space vector according to the vocabulary in the abstract syntax tree; and finally, calculating, from the space vector and using the cosine similarity calculation method, the cosine similarity that represents the source code similarity. This assists a development team in identifying source code with repeated or similar logic and provides a basis for decisions in scenarios such as code refactoring and business merging.

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a method, a system and electronic equipment for analyzing source code similarity.

Background

[0002] When a development team works together on large-scale software, the business requirements are iterated continually, the code grows with each business change, and code size and complexity gradually increase. Because code with similar logic may appear in different business code components, it is difficult for the development team to identify it, which affects the implementation of application scenarios such as code refactoring and business merging.

Summary of the invention

[0003] The technical problem to be solved by the present invention is to provide a method, a system and electronic equipment for analyzing source code similarity, in view of the deficiencies of the ...

Application Information

IPC(8): G06F8/75, G06F40/194
CPC: G06F8/75, G06F40/194
Inventor: 张睿
Owner: 北京思特奇信息技术股份有限公司