Article duplicate checking method and system

The technology, applied in the field of computer search, addresses the problem that the correctness of plagiarism-check judgment results is difficult to guarantee. It makes articles easier to revise, reduces workload, and improves accuracy.

Pending Publication Date: 2019-06-21
重庆誉存科技有限公司

AI Technical Summary

Problems solved by technology

This method relies on computing natural-language similarity. Because of the complexity of the Chinese language, the correctness of judgment results based on semantic knowledge is difficult to guarantee.
[0007] With current plagiarism-checking technology, if the author of an article draws on as many documents as possible within a single paragraph, extracting only a few clauses from each reference, the article plagiarism-checking system will not quickly detect it.



Examples


Embodiment 1

[0040] Figure 1 shows a schematic flowchart of the article plagiarism checking method provided by Embodiment 1 of the present invention. The method includes:

[0041] Step S1, obtaining relevant text information according to the keyword of the uploaded article;

[0042] Step S2, selecting comparison samples from the text information;

[0043] Step S3, respectively decomposing the article and the comparison sample;

[0044] Step S4, calculating text similarity according to the decomposed article and the comparison sample.
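Steps S1 to S4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not specify the decomposition unit or the similarity formula, so the sketch assumes sentence-level decomposition and the share of article sentences that also occur in the sample. The in-memory `corpus` stands in for crawled text, and all function names are hypothetical.

```python
import random
import re


def get_text_information(keyword, corpus):
    """S1: keep the texts that mention the keyword (stand-in for crawling)."""
    return [text for text in corpus if keyword in text]


def select_sample(texts, rng=random):
    """S2: pick one comparison sample from the retrieved texts."""
    return rng.choice(texts)


def decompose(text):
    """S3: split a text into sentences (assumed decomposition unit)."""
    parts = re.split(r"[.!?。！？]+", text)
    return [part.strip() for part in parts if part.strip()]


def text_similarity(article_parts, sample_parts):
    """S4: share of article sentences also found in the sample (assumed metric)."""
    if not article_parts:
        return 0.0
    sample_set = set(sample_parts)
    matched = sum(1 for part in article_parts if part in sample_set)
    return matched / len(article_parts)


def check_article(article, keyword, corpus, rng=random):
    """Run S1 to S4 in order and return a similarity score in [0, 1]."""
    texts = get_text_information(keyword, corpus)
    if not texts:
        return 0.0
    sample = select_sample(texts, rng)
    return text_similarity(decompose(article), decompose(sample))
```

Exact sentence matching is the simplest possible choice for S4; a practical system would likely need a more tolerant measure so that lightly edited sentences still count as matches.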

[0045] The specific technical scheme of Embodiment 1 of the present invention is as follows:

[0046] Step S1, obtaining relevant text information according to the keyword of the uploaded article.

[0047] Preferably, the text information containing the keyword is obtained on a relevant website through a web crawler.
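The keyword-based retrieval in [0047] could look roughly like the sketch below, which uses only the Python standard library. The source does not say how pages are fetched or parsed, so everything here is an assumption: `TextExtractor`, `html_to_text`, `fetch_html`, and `pages_containing` are hypothetical names, and the list of URLs to visit is left to the caller.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Reduce an HTML document to its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def fetch_html(url, timeout=10):
    """Download one page; which URLs to visit is decided by the caller."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def pages_containing(keyword, html_pages):
    """Keep the visible text of pages that mention the keyword."""
    texts = (html_to_text(html) for html in html_pages)
    return [text for text in texts if keyword in text]
```

Separating `fetch_html` from `pages_containing` keeps the filtering logic testable without network access.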

[0048] Step S2, selecting comparison samples from the text information.

[0049] A piece of information is randomly selected from the text ...

Embodiment 2

[0070] Corresponding to the above embodiment of the present invention, Figure 2 shows a schematic structural diagram of an article plagiarism checking system provided by an embodiment of the present invention. The system includes an information acquisition module 101, a sample selection module 102, a decomposition module 103, a similarity calculation module 104, and a result marking module 105.
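One way to picture how the five modules fit together is the Python sketch below. It is only a structural illustration: the class name `DuplicateCheckSystem`, the field names, and the callable signatures are hypothetical, and the behavior assumed for the result marking module 105 (returning the article parts to flag) is a guess based on its name, since the source text does not describe it here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class DuplicateCheckSystem:
    # Each field is a pluggable stand-in for one numbered module.
    acquire: Callable[[str], List[str]]                  # module 101
    select: Callable[[List[str]], str]                   # module 102
    decompose: Callable[[str], List[str]]                # module 103
    similarity: Callable[[List[str], List[str]], float]  # module 104
    mark: Callable[[List[str], List[str]], List[str]]    # module 105 (assumed role)

    def run(self, article: str, keyword: str) -> Tuple[float, List[str]]:
        """Pass the article through the modules in numerical order."""
        texts = self.acquire(keyword)
        if not texts:
            return 0.0, []
        sample = self.select(texts)
        article_parts = self.decompose(article)
        sample_parts = self.decompose(sample)
        score = self.similarity(article_parts, sample_parts)
        return score, self.mark(article_parts, sample_parts)
```

Passing each module in as a callable lets any one of them be replaced, for example swapping the similarity measure, without touching the others.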

[0071] The information acquisition module 101 is used for uploading articles and obtaining text information from related websites according to keywords in the articles. The information acquisition module is a web crawler module, which can automatically grab information on the Internet according to certain rules. In this embodiment of the invention, the rules can be set to grab information containing keywords in the article. The crawler module can collect t...


Abstract

The invention discloses an article duplicate checking method and system. The method comprises the steps of: obtaining related text information according to a keyword of an uploaded article; selecting a comparison sample from the text information; decomposing the article and the comparison sample respectively; and calculating text similarity according to the decomposed article and comparison sample. The method and system provided by the invention effectively reduce the workload of university teachers and improve the accuracy of article duplicate checking results.

Description

Technical field

[0001] The invention relates to the field of computer search, in particular to a method and system for screening repeated information on the Internet.

Background technique

[0002] At present, college students must complete a qualified graduation essay in order to graduate. Tutors need to check and evaluate the graduation essay. In this process, plagiarism checking is the most important task, and it also involves a heavy workload.

[0003] At present, there are three main methods for article plagiarism checking: methods based on string matching, methods based on document fingerprints, and methods based on semantic knowledge.

[0004] The method based on string matching is a method based on mathematical statistics. It first uses a string matching algorithm to count the strings in the document to be detected that match the document in the database, and then uses the similarity calculation form...
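The string-matching approach described in [0004] can be illustrated with a minimal Python sketch. The source text is cut off before it gives the similarity formula, so this sketch assumes the ratio of matched strings to total strings; treating word trigrams as the "strings" is likewise an assumption, and the function names are hypothetical.

```python
def word_ngrams(text, n=3):
    """Split text into overlapping runs of n whitespace-separated words."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def string_match_similarity(document, reference, n=3):
    """Fraction of the document's n-grams that also occur in the reference."""
    candidates = word_ngrams(document, n)
    if not candidates:
        return 0.0
    reference_grams = set(word_ngrams(reference, n))
    matches = sum(1 for gram in candidates if gram in reference_grams)
    return matches / len(candidates)
```

The sketch relies on whitespace-separated words. Chinese text is not whitespace-delimited, so character n-grams or a word segmenter would be needed there.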


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/33, G06F16/9535
Inventor: 刘德彬, 陈玮, 孙世通
Owner: 重庆誉存科技有限公司