A method and system for similarity detection based on code files of an electric power information system

The technology is applied in the field of similarity detection based on power information system code files. It addresses the problems of poor detection performance, complex methods, long execution times, and high error rates, so as to reduce the detection error rate and achieve comprehensive, high-accuracy detection.

Active Publication Date: 2022-07-19
NANJING NARI GROUP CORP +2
2 Cites, 0 Cited by
AI Technical Summary

Problems solved by technology

[0008] Purpose of the invention: To overcome the deficiencies of the prior art, the present invention provides a similarity detection method and system based on the code files of an electric power information system. It addresses problems encountered when detecting code similarity: poor detection performance, complex execution and long run times in some methods, and high error rates in some cases.

Examples

Embodiment Construction

[0058] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

[0059] Lucene tokenizer: Lucene is a sub-project within the Apache Software Foundation's project group. It is an open-source full-text search toolkit: not a complete full-text search engine, but a full-text search engine architecture that provides a complete query engine and indexing engine, plus part of a text analysis engine (for two Western languages, English and German). Excellent tokenizers are available for Lucene, such as IKAnalyzer, which ...

Abstract

The invention discloses a method and a system for detecting similarity based on code files of an electric power information system. A text semantic word vector is obtained from the TF-IDF values of the words. Starting from the function call entry point of the first text and of the second text, the function call tree structure of each text is found, and the first text structure vector and the second text structure vector are calculated. An intermediate semantic word vector is calculated from the text semantic word vector, and a first intermediate structure vector and a second intermediate structure vector are calculated after taking the union of the first and second text structure vectors. The similarity between the first text and the second text is then obtained. The invention first applies a preprocessing function to simplify the code files, which improves detection efficiency and reduces the detection error rate.
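The abstract does not give the formulas, so the following is only a minimal sketch of how the two signals could be combined, under stated assumptions: each code file is already tokenized, the call graph is available as a dictionary, TF-IDF uses a smoothed IDF over the two files, the structure vector counts (depth, number of callees) pairs along the call tree, vectors are aligned over the union of their keys, and the final score is a weighted average of two cosine similarities. The function names, the `entry` default, and the weight `alpha` are hypothetical and not taken from the patent.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {token: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1
        # Smoothed IDF (an assumption) so tokens shared by both files keep weight.
        vectors.append({
            tok: (count / total) * (math.log((1 + n) / (1 + df[tok])) + 1)
            for tok, count in tf.items()
        })
    return vectors


def structure_vector(call_graph, entry):
    """Walk the call tree from `entry`, counting (depth, fan-out) pairs."""
    features = Counter()
    seen = set()
    stack = [(entry, 0)]
    while stack:
        fn, depth = stack.pop()
        if fn in seen:  # guard against recursion and shared callees
            continue
        seen.add(fn)
        callees = call_graph.get(fn, [])
        features[(depth, len(callees))] += 1
        stack.extend((callee, depth + 1) for callee in callees)
    return features


def cosine(a, b):
    """Cosine similarity of two sparse vectors over the union of their keys."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity(tokens1, graph1, tokens2, graph2, entry="main", alpha=0.5):
    """Weighted average of semantic and structural similarity (hypothetical)."""
    v1, v2 = tfidf_vectors([tokens1, tokens2])
    semantic = cosine(v1, v2)
    structural = cosine(structure_vector(graph1, entry),
                        structure_vector(graph2, entry))
    return alpha * semantic + (1 - alpha) * structural
```

Using (depth, fan-out) features rather than function names keeps the structural score unchanged when functions are merely renamed; whether the patent's structure vector has this property is not stated in the text above.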

Description

Technical field

[0001] The invention relates to the technical field of code similarity detection, in particular to a similarity detection method and system based on code files of a power information system.

Background technique

[0002] Code similarity detection is currently used mainly for code plagiarism detection, an important and widely used task in computer software development and maintenance. It can help teachers detect plagiarism in students' programming work, and it also has practical value for identifying software copyright.

[0003] In "Metrics based plagiarism monitoring," a paper presented at the 6th Annual CCSC Northeastern Conference (Middlebury, VT, 2001), Jones summarized ten plagiarism methods: (1) copy verbatim; (2) change the comment statements; (3) change the white space; (4) rename the identifiers; change the ...
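To illustrate why preprocessing helps against the simplest disguises listed above (changed comments and changed white space), here is a minimal sketch of a normalization step. It is not the patent's preprocessing function, whose details are not given here. It assumes C-style comment syntax, and the function name `normalize` is hypothetical. The regular expressions are naive: they would also strip comment markers that appear inside string literals.

```python
import re


def normalize(source):
    """Strip C-style comments and collapse whitespace (illustrative only)."""
    # Remove /* ... */ block comments, including multi-line ones.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    # Remove // line comments up to the end of the line.
    source = re.sub(r"//[^\n]*", "", source)
    # Collapse every run of whitespace into a single space.
    return " ".join(source.split())


original = "int a = 1; // set a\n"
disguised = "int   a = 1;  /* init */"
print(normalize(original) == normalize(disguised))
```

After normalization the two snippets compare equal, so edits of types (2) and (3) no longer affect any later comparison. Renamed identifiers, type (4), are not handled by this sketch.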

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F11/36, G06F40/194, G06F40/284, G06F40/30, G06K9/62
CPC: G06F11/3604, G06F18/22
Inventor: 钱琳, 俞俊, 朱广新, 庞恒茂, 任晓龙, 胡鑫, 许明杰, 王琳, 梅竹, 陈海洋
Owner: NANJING NARI GROUP CORP