A Multithreaded Program Plagiarism Detection Method Based on Frequent Pattern Mining

A detection method based on frequent pattern mining, applied in the field of program/content distribution protection.

Active Publication Date: 2021-04-30
XIAN UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

The nondeterminism of thread interleaving makes the behavior of multithreaded programs highly nondeterministic as well, so traditional dynamic birthmark techniques produce largely random results when applied to multithreaded programs.


Examples


Embodiment 1

[0069] Embodiment 1: Assume that program p has been executed twice under some input $I_1$, yielding two execution traces after interference-item filtering. Here $I_1^1$ and $I_1^2$ denote the first and second executions of the program under input $I_1$, respectively.

[0070] Assume that τ = 3 (chosen only for ease of exposition; in practice a value between 8 and 10 is preferred). Processing the trace of the first execution in step S303 then yields the following pattern candidate set (repeated elements allowed):

[0071]

[0072] Similarly, the pattern candidate set for the trace of the second execution is:

[0073]

[0074] The pattern candidate set (repeated elements allowed) of program p under input $I_1$ is then:

[0075]
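The candidate sets above can be illustrated with a minimal Python sketch. It assumes that step S303 takes every contiguous window of length τ from a trace, which is an assumption for illustration and not confirmed by the text. The function names and the event names in the traces are hypothetical and are not the patent's data.

```python
from collections import Counter

def pattern_candidates(trace, tau):
    """Multiset of all contiguous length-tau windows of one trace (assumed reading of S303)."""
    return Counter(tuple(trace[i:i + tau]) for i in range(len(trace) - tau + 1))

def candidates_for_input(traces, tau):
    """Merge the per-execution multisets; repeated elements are kept as counts."""
    total = Counter()
    for trace in traces:
        total += pattern_candidates(trace, tau)
    return total

# Hypothetical traces of two executions under the same input.
run1 = ["open", "read", "write", "lock", "close"]
run2 = ["open", "read", "write", "close", "lock"]
candidates = candidates_for_input([run1, run2], tau=3)
```

A `Counter` is used because the candidate set allows repeated elements, so each pattern carries its number of occurrences across both executions.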

[0076] Step S103: use the frequent pattern mining algorithm to process the candidat...

Embodiment 2

[0083] Embodiment 2: Assume that, after the pattern candidate set of program p from Embodiment 1 under input $I_1$ is processed according to step S401, the resulting frequent pattern set is:

[0084]
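As a rough illustration of how a frequent pattern set might be derived from several executions, the sketch below uses plain support counting over fixed-length windows. This is a stand-in and not the mining algorithm of the method, which is not specified in this extract. The names `windows`, `frequent_patterns`, and `min_support`, as well as the traces, are hypothetical.

```python
from collections import Counter

def windows(trace, tau):
    """All contiguous length-tau windows of a trace, as tuples."""
    return [tuple(trace[i:i + tau]) for i in range(len(trace) - tau + 1)]

def frequent_patterns(traces, tau, min_support):
    """Keep patterns that occur in at least min_support executions (assumed criterion)."""
    support = Counter()
    for trace in traces:
        # A set counts each pattern at most once per execution.
        support.update(set(windows(trace, tau)))
    return {pattern for pattern, count in support.items() if count >= min_support}

# Hypothetical traces of two executions under the same input.
run1 = ["open", "read", "write", "lock", "close"]
run2 = ["open", "read", "write", "close", "lock"]
frequent = frequent_patterns([run1, run2], tau=3, min_support=2)
```

Patterns that survive in every execution are the ones least affected by thread interleaving, which is the motivation for mining across repeated runs of the same input.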

[0085] Following the flow of steps S401-S405, the thread-aware birthmark of program p under input $I_1$ is then obtained:

[0086]
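The hashing step can be sketched as follows. The sketch assumes that the birthmark is a mapping from a hashed pattern to its occurrence count; the exact structure produced by steps S401-S405 is not shown in this extract, so this layout, the choice of SHA-1, the 16-character key length, and all names are hypothetical.

```python
import hashlib

def pattern_hash(pattern):
    """Stable short hash of a pattern; the separator keeps ("a", "b") distinct from ("ab",)."""
    joined = "\x1f".join(pattern).encode("utf-8")
    return hashlib.sha1(joined).hexdigest()[:16]

def build_birthmark(frequent_counts):
    """Map each frequent pattern's hash to its occurrence count (assumed birthmark layout)."""
    birthmark = {}
    for pattern, count in frequent_counts.items():
        key = pattern_hash(pattern)
        birthmark[key] = birthmark.get(key, 0) + count
    return birthmark

# Hypothetical frequent patterns with their occurrence counts.
frequent_counts = {
    ("open", "read", "write"): 2,
    ("read", "write", "lock"): 2,
}
birthmark = build_birthmark(frequent_counts)
```

`hashlib` is used instead of the built-in `hash` because Python randomizes string hashing per process, which would make birthmarks from separate runs incomparable.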

[0087] Step S104: Given the thread-aware birthmarks of the plaintiff program p and the defendant program q under input I, the birthmark similarity is calculated with formula (1):

[0088]
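Formula (1) is not reproduced here, so the following sketch uses a weighted Jaccard index purely as a stand-in to show what a birthmark similarity in the range [0, 1] could look like. It assumes that each birthmark is a mapping from a hashed pattern to a count; the function name, the keys, and the counts are hypothetical.

```python
def birthmark_similarity(bp, bq):
    """Weighted Jaccard index of two key-to-count birthmarks (stand-in, not formula (1))."""
    keys = set(bp) | set(bq)
    numerator = sum(min(bp.get(k, 0), bq.get(k, 0)) for k in keys)
    denominator = sum(max(bp.get(k, 0), bq.get(k, 0)) for k in keys)
    if denominator == 0:
        return 0.0  # two empty birthmarks give no evidence of similarity
    return numerator / denominator

# Hypothetical birthmarks of a plaintiff program p and a defendant program q.
birthmark_p = {"h1": 2, "h2": 1}
birthmark_q = {"h1": 1, "h3": 1}
similarity = birthmark_similarity(birthmark_p, birthmark_q)
```

Any measure that is symmetric, equals 1 for identical birthmarks, and equals 0 for disjoint ones would fit the same slot in the pipeline.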

Embodiment 3

[0089] Embodiment 3: For another program q, assume that the frequent patterns extracted under input $I_1$ according to steps S101-S103 yield the following thread-aware birthmark:

[0090] The birthmark similarity of program q and program p from Embodiment 2 under input $I_1$ is then:

[0091]

[0092] Step S105: A dynamic birthmark depends on the input. It abstracts only the part of the program's semantics and behavior that is exercised under a specific input, so a judgment based on a single input is not reliable. Multiple different inputs are therefore provided, and steps S101-S104 are repeated for each of them to obtain the birthmark similarity of the plaintiff and defendant programs under that input. The average of these similarities is taken as the similarity of the two programs. Specifically, formula (2) is used to calculate the program similarity:

[0093]
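The averaging described in step S105, together with the threshold decision mentioned in the abstract, can be sketched as follows. The per-input similarity values and the threshold of 0.7 are hypothetical, and comparing with `>=` rather than `>` is an assumption.

```python
from statistics import mean

def program_similarity(per_input_similarities):
    """Average of the birthmark similarities obtained under each input."""
    return mean(per_input_similarities)

def is_plagiarism(per_input_similarities, threshold):
    """Flag plagiarism when the mean similarity reaches the threshold (assumed comparison)."""
    return program_similarity(per_input_similarities) >= threshold

# Hypothetical similarities of programs p and q under three different inputs.
similarities = [1.0, 0.5, 0.75]
overall = program_similarity(similarities)
verdict = is_plagiarism(similarities, threshold=0.7)
```

Averaging over several inputs reduces the influence of any single input that happens to exercise only shared library code or only unrelated code paths.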


Abstract

The invention discloses a multithreaded program plagiarism detection method based on frequent pattern mining, which includes: 1) acquiring, through dynamic monitoring, multiple execution traces of a program from repeated executions under the same input; 2) preprocessing the set of execution traces to generate a pattern candidate set; 3) processing the pattern candidate set with a frequent pattern mining algorithm to generate a frequent pattern set, then hashing it to construct a thread-aware birthmark; 4) calculating the similarity of the plaintiff and defendant program birthmarks under a specific input; 5) making a plagiarism judgment based on the mean birthmark similarity over multiple inputs and a given threshold, and outputting the detection result. The invention analyzes the executable program directly and does not require program source code. It uses frequent pattern mining to extract behavior patterns from the execution traces of multiple runs of the program under the same input and generates thread-aware birthmarks, which greatly reduces the interference caused by nondeterministic thread interleaving.

Description

Technical field

[0001] The invention belongs to the technical field of program execution trace analysis and software plagiarism detection, and in particular relates to a multithreaded program plagiarism detection method based on frequent pattern mining.

Background

[0002] In recent years, with the vigorous development of open source software communities such as GitHub and SourceForge, the software industry has achieved unprecedented prosperity. The problem of software plagiarism has also become increasingly serious, and misuse of other people's code is not uncommon. On the one hand, there is no lack of premeditated plagiarism driven by economic interests, such as the recent "Redcore" incident, in which the Redcore browser, claimed to be built on an independently developed domestic kernel, turned out to be merely a simple repackaging of the Google Chrome browser; in addition, many large software companies often integrate some software components from upstream companies in ...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F21/14
CPC: G06F21/14
Inventor: 田振洲, 王清, 高聪, 王忠民, 陈彦萍, 张恒山
Owner: XIAN UNIV OF POSTS & TELECOMM