Similarity detection method and system based on power information system code file

This technology, applied to similarity detection for power information system code files, addresses the poor detection performance, long execution time, and high error rate of existing methods. It achieves a lower detection error rate, a more comprehensive comparison, and higher detection accuracy.

Active Publication Date: 2019-11-19
NANJING NARI GROUP CORP +2

AI Technical Summary

Problems solved by technology

[0008] Purpose of the invention: To overcome the deficiencies of the prior art, the present invention provides a similarity detection method and system based on code files of an electric power information system. It addresses problems encountered when detecting code similarity: poor detection performance, the long execution time of some complex methods, and a high error rate in some cases.


Embodiment Construction

[0058] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0059] Lucene tokenizer: Lucene is a sub-project of the Apache Software Foundation. It is an open-source full-text search toolkit: not a complete full-text search engine, but a full-text search engine architecture. It provides a complete query engine and index engine, and part of a text analysis engine (for two Western languages, English and German). Excellent tokenizers are available for Lucene, such as IKAnalyzer, an open source...
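To make the tokenization step concrete, here is a minimal Python sketch of splitting code text into word tokens. It is not Lucene's or IKAnalyzer's API; the function name `tokenize_code` and the splitting rules (identifiers broken at underscores and camelCase boundaries, then lowercased) are assumptions chosen for illustration.

```python
import re

# Hypothetical helper, NOT part of Lucene or IKAnalyzer.
# Matches identifier-like runs in source text.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Splits one identifier chunk into acronym, word, and digit parts.
_CAMEL_PARTS = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def tokenize_code(text):
    """Return lowercase word tokens extracted from identifiers in `text`."""
    tokens = []
    for ident in _IDENTIFIER.findall(text):
        # Break snake_case first, then camelCase inside each chunk.
        for chunk in ident.split("_"):
            tokens.extend(part.lower() for part in _CAMEL_PARTS.findall(chunk))
    return tokens


if __name__ == "__main__":
    print(tokenize_code("int loadCount = getLoad_value(x1);"))
```

Splitting identifiers into words lets two files that use `loadCount` and `load_count` share tokens, which matters once term weights are compared.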


Abstract

The invention discloses a similarity detection method and system based on power information system code files. The method comprises the following steps: obtaining a first code file and a second code file whose similarity is to be judged, and preprocessing them to obtain a first text and a second text respectively; obtaining text semantic word vectors according to the TF-IDF values of the words; searching the function call tree structures of the first text and the second text, starting from their respective function call entrances, and calculating a first text structure vector and a second text structure vector; calculating intermediate semantic word vectors from the text semantic word vectors, taking the union of the first text structure vector and the second text structure vector, and calculating a first intermediate structure vector and a second intermediate structure vector; and obtaining the similarity between the first text and the second text. Because a preprocessing function first simplifies the code in each file, detection efficiency is improved and the detection error rate is reduced.
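The semantic half of the method can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented procedure: the smoothed IDF formula, the use of cosine similarity, and the names `tfidf_vectors` and `cosine` are all choices made here, since the abstract does not give the exact formulas. The structure vectors built from function call trees are not covered.

```python
import math
import re
from collections import Counter


def _tokens(text):
    # Assumed tokenization: lowercase identifier-like runs.
    return re.findall(r"[a-z_][a-z0-9_]*", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per input text."""
    tokenized = [_tokens(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of texts containing each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        total = len(doc) or 1
        # Assumed smoothed IDF; keeps weights positive for shared terms.
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors over the union of their terms."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    first = "int total = add(load, price);"
    second = "int sum = add(load, cost);"
    vec_a, vec_b = tfidf_vectors([first, second])
    print(round(cosine(vec_a, vec_b), 3))
```

Terms missing from one vector contribute zero to the dot product, so the comparison is effectively made over the union of both vocabularies.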

Description

Technical field

[0001] The invention relates to the technical field of code similarity detection, and in particular to a similarity detection method and system based on code files of an electric power information system.

Background

[0002] Code similarity detection is currently used mainly for code plagiarism detection, an important task in software development and maintenance. It is widely applied in source code plagiarism detection, software component library query, software defect detection, program understanding, and other fields. It can help teachers detect plagiarism in students' programming assignments, and it also has practical significance for identifying software copyright.

[0003] The paper "Metrics-based plagiarism monitoring", published at the 6th Annual CCSCN Northeastern Conference, Middlebury, VT, 2001, describes means of plagiarism, respectively: (1) copying verbatim; (2) changing the comment sta...
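Comment changes of the kind listed above are one reason a preprocessing step helps before comparison. The sketch below is a hypothetical Python preprocessing function, not the one defined in the patent: it assumes C/Java-style comments, and it is deliberately naive in that it does not recognize comment markers inside string literals.

```python
import re

# Assumed comment syntax: /* ... */ blocks and // line comments.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"//[^\n]*")


def simplify_code(source):
    """Strip comments, collapse whitespace, and drop empty lines."""
    without_blocks = _BLOCK_COMMENT.sub(" ", source)
    without_comments = _LINE_COMMENT.sub("", without_blocks)
    # Normalize runs of spaces and tabs within each line.
    lines = (" ".join(line.split()) for line in without_comments.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    original = "int x = 1; // counter\n/* header */\nint  y = 2;"
    copied = "int x = 1;\nint y = 2; // changed comment"
    print(simplify_code(original) == simplify_code(copied))
```

After this step, two files that differ only in comments and spacing reduce to the same text, so later vector comparisons are not distracted by those edits.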


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/36, G06F17/22, G06F17/27, G06K9/62
CPC: G06F11/3604, G06F18/22
Inventor: 钱琳, 俞俊, 朱广新, 庞恒茂, 任晓龙, 胡鑫, 许明杰, 王琳, 梅竹, 陈海洋
Owner: NANJING NARI GROUP CORP