Method for detecting and locating plagiarized content in electronic text

A technology for electronic text, applied to detecting whether an electronic text contains plagiarized content and to locating that content. It addresses the problems that existing methods do not store the complete text, cannot locate the plagiarized content, and cannot present the plagiarized text together with the detection result.

Status: Inactive
Publication Date: 2009-04-08
XI AN JIAOTONG UNIV
1 Cites, 22 Cited by

AI Technical Summary

Problems solved by technology

[0010] However, because the text feature library of this method does not store the complete text content, the method cannot give the specific content of the plagiarized text; that is, it cannot locate the specific plagiarized passages.
In other words, for texts detected as plagiarized, it cannot present conclusive evidence of plagiarism along with the detection result.

Examples

Embodiment Construction

[0031] The basic idea of the method for detecting and locating plagiarism in electronic text content of the present invention is as follows. First, detect whether the two texts share a certain number of identical characters. If they share no identical text, there can be no plagiarism; otherwise, proceed to the next step. Second, detect whether the shared characters appear in the same order in both texts and form sentences, that is, whether the texts contain similar sentences. If there are no identical sentences, there is no plagiarism; otherwise, proceed to the next step. Note: "identical sentences" does not mean that the two sentences match character for character. Individual words in the sentence may differ, but the main frame of the sentence should be the same. Finally, if the identical sentences exceed a certain range, the text can be judged as plagiarism. Identical sentences are evidence of plagiaris...
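The three-stage idea in paragraph [0031] can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the three thresholds, and the use of `difflib.SequenceMatcher` as a stand-in for "same main frame, a few words may differ" are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# All three thresholds are illustrative assumptions, not values from the patent.
MIN_SHARED_CHARS = 10
MIN_SENTENCE_RATIO = 0.8
MIN_SIMILAR_SENTENCES = 2


def split_sentences(text):
    """Split on common Western and CJK sentence terminators."""
    parts = re.split(r"[.!?。！？]+", text)
    return [p.strip() for p in parts if p.strip()]


def shared_char_count(a, b):
    """Stage 1: number of distinct non-whitespace characters both texts share."""
    return len((set(a) & set(b)) - set(" \n\t"))


def similar_sentences(a, b, min_ratio=MIN_SENTENCE_RATIO):
    """Stage 2: pair sentences whose characters line up in the same order.

    SequenceMatcher is order-sensitive, so a pair survives only when most
    of the sentence matches, while a few differing words are tolerated.
    """
    pairs = []
    for sa in split_sentences(a):
        for sb in split_sentences(b):
            if SequenceMatcher(None, sa, sb).ratio() >= min_ratio:
                pairs.append((sa, sb))
    return pairs


def detect(a, b):
    """Return (is_plagiarism, evidence) following the three stages."""
    if shared_char_count(a, b) < MIN_SHARED_CHARS:
        return False, []          # stage 1 failed: too little shared text
    evidence = similar_sentences(a, b)
    if not evidence:
        return False, []          # stage 2 failed: no similar sentences
    # Stage 3: enough similar sentences counts as plagiarism.
    return len(evidence) >= MIN_SIMILAR_SENTENCES, evidence
```

Calling `detect(source_text, suspect_text)` returns both the verdict and the matched sentence pairs, so the same data that drives the decision also serves as the evidence to display. Comparing every sentence pair is quadratic in the number of sentences; the sketch favors clarity over speed.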

Abstract

The invention discloses a method for detecting and locating plagiarized content in electronic text by means of a computer system. The computer system comprises at least an electronic text input module, a text feature extraction module, a plagiarism evidence extraction module, a text plagiarism judgment module, a detection result display module and a plagiarized content locating module. The detection method comprises the following steps: first, features are extracted according to the structural and semantic information of the text to obtain a sequence of items to be detected; second, all items in that sequence are processed in order to obtain suspected plagiarism queues; third, all suspected plagiarism queues are examined to obtain the plagiarism evidence and generate a plagiarism evidence sheet; finally, the resemblance of the text is calculated from the evidence sheet, and the text is judged for plagiarism. If the resemblance is greater than or equal to a certain threshold, the detected text is considered to contain plagiarism; otherwise, it is not. For a text judged to contain plagiarism, the corresponding plagiarism evidence is extracted from the evidence sheet and passed to the display module, which shows the specific plagiarized content.
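The final judgment step of the abstract can be sketched as follows. The abstract does not give the resemblance formula, so this example assumes a simple one: the fraction of the detected text covered by evidence spans. The `Evidence` record, its fields, and the default threshold of 0.3 are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    start: int       # offset of the matched span in the detected text
    end: int         # exclusive end offset
    source_id: str   # hypothetical identifier of the matching library text


def resemblance(text_length, sheet):
    """Fraction of the detected text covered by evidence (assumed metric)."""
    if text_length == 0:
        return 0.0
    covered = 0
    last_end = 0
    for ev in sorted(sheet, key=lambda e: e.start):
        start = max(ev.start, last_end)   # do not double-count overlaps
        if ev.end > start:
            covered += ev.end - start
            last_end = ev.end
    return covered / text_length


def judge(text, sheet, threshold=0.3):
    """Return (has_plagiarism, score, located_passages)."""
    score = resemblance(len(text), sheet)
    if score >= threshold:                # "greater than or equal to" as in the abstract
        located = [(ev.source_id, text[ev.start:ev.end]) for ev in sheet]
        return True, score, located
    return False, score, []
```

Because each evidence entry keeps its offsets, the passages handed to the display step are sliced directly from the detected text, which is what allows the plagiarized content to be located rather than merely flagged.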

Description

Technical field

[0001] The invention belongs to the field of intelligent information processing and computer technology. It relates to a method for detecting whether an electronic text contains plagiarized content, and in particular to a method for detecting and locating content plagiarism in an electronic text, which can accurately locate the plagiarized content in the detected electronic text and give conclusive evidence of plagiarism.

Background technique

[0002] With the rapid development and popularization of the network, electronic text published on the Internet has become a focus of current intellectual property protection. Because electronic texts are easy to copy and download, many people study and cite them. Cases in which electronic texts are copied wholesale, which amounts to plagiarism, occur from time to time. At present, there are mainly two types of electronic text protection measures on the Internet: one is the "blocking" method, and th...

Application Information

IPC(8): G06F17/30, G06F21/00, G06F21/10
Inventor: 鲍军鹏, 冯中慧
Owner: XI AN JIAOTONG UNIV