Plagiarism source retrieval sorting model construction method and plagiarism source retrieval sorting method

A plagiarism source retrieval sorting and model construction technology, applied in the field of information retrieval, that addresses the inaccurate filtering of source retrieval results and yields more accurate retrieval results.

Active Publication Date: 2018-11-16
HEILONGJIANG INST OF TECH

AI Technical Summary

Problems solved by technology

However, currently only Williams et al. use classification-based machine learning methods (Williams K, Chen H H, Giles C L. Classifying and Ranking Search Engine Results as Potential Sources of Plagiarism [C]. Proceedings of the 2014 ACM Symposium on Docum



Embodiment Construction

[0033] Exemplary embodiments of the present invention will be described below with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in this specification. It should be understood, however, that in developing any such practical embodiment, many implementation-specific decisions must be made in order to achieve the developer's specific goals, such as meeting system-related and business-related constraints, and that those constraints may vary from implementation to implementation. Moreover, it should also be understood that such development work, while potentially complex and time-consuming, would be a routine undertaking for those skilled in the art having the benefit of this disclosure.

[0034] It should also be noted that, to avoid obscuring the present invention with unnecessary detail, only the device structure and/or processing steps closely related to the ...


Abstract

The invention provides a method for constructing a plagiarism source retrieval sorting model and a plagiarism source retrieval sorting method. In the model construction method, training samples are used to train a predetermined ranking logistic regression model by order-pair-based (pairwise) learning to rank, based on the degree of aggregation between a reference document and each of its plagiarism source documents, until a predetermined loss function reaches its minimum. The loss function comprises a first and a second sub-loss function. The first represents the loss caused by misordered pairs formed from plagiarism source documents and non-plagiarism source documents of the reference document; the second represents the loss caused by misordered pairs formed from plagiarism source documents with different degrees of aggregation. The sorting method uses the resulting model to re-sort the retrieval results of suspicious documents. The technique can thus sort the source retrieval results of suspicious documents in plagiarism detection more accurately.
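As a rough illustration of the two-part pairwise loss described in the abstract, the following Python sketch trains a linear scoring function on order pairs and then re-sorts candidate documents. It is only a sketch under stated assumptions: the exact loss form, the feature set, and the way the degree of aggregation is computed are not given in this excerpt, so the logistic pairwise loss, the mixing weight `second_weight`, the fixed number of gradient steps, and all function names here are hypothetical.

```python
import numpy as np


def build_pairs(is_source, aggregation):
    """Build (better, worse) index pairs for the two sub-losses."""
    n = len(is_source)
    # First sub-loss: a source document should outrank a non-source document.
    source_vs_other = [(i, j) for i in range(n) for j in range(n)
                       if is_source[i] and not is_source[j]]
    # Second sub-loss: among sources, higher aggregation should rank higher.
    source_vs_source = [(i, j) for i in range(n) for j in range(n)
                        if is_source[i] and is_source[j]
                        and aggregation[i] > aggregation[j]]
    return source_vs_other, source_vs_source


def pairwise_loss(w, features, pairs):
    """Logistic loss over order pairs and its gradient (assumed loss form)."""
    if not pairs:
        return 0.0, np.zeros_like(w)
    idx = np.asarray(pairs)
    diff = features[idx[:, 0]] - features[idx[:, 1]]
    margin = diff @ w
    loss = np.logaddexp(0.0, -margin).sum()        # sum of log(1 + exp(-m))
    sig = np.exp(-np.logaddexp(0.0, margin))       # 1 / (1 + exp(m))
    grad = -(sig[:, None] * diff).sum(axis=0)
    return loss, grad


def total_loss(w, features, pairs_a, pairs_b, second_weight=1.0):
    """First sub-loss plus weighted second sub-loss (weight is an assumption)."""
    loss_a, grad_a = pairwise_loss(w, features, pairs_a)
    loss_b, grad_b = pairwise_loss(w, features, pairs_b)
    return loss_a + second_weight * loss_b, grad_a + second_weight * grad_b


def train(features, is_source, aggregation, second_weight=1.0,
          learning_rate=0.1, steps=200):
    """Plain gradient descent for a fixed number of steps (a simplification)."""
    features = np.asarray(features, dtype=float)
    pairs_a, pairs_b = build_pairs(is_source, aggregation)
    w = np.zeros(features.shape[1])
    for _ in range(steps):
        _, grad = total_loss(w, features, pairs_a, pairs_b, second_weight)
        w -= learning_rate * grad
    return w


def rerank(w, features):
    """Return candidate indices sorted by descending model score."""
    scores = np.asarray(features, dtype=float) @ w
    return np.argsort(-scores)
```

In this sketch, `train` would be run on labeled retrieval results for reference documents, and `rerank` would then be applied to the feature vectors of the retrieval results of a suspicious document. A production setup would more likely minimize the loss with a convergence criterion instead of a fixed step count.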

Description

Technical field

[0001] The invention relates to information retrieval technology, and in particular to a method for constructing a plagiarism source retrieval sorting model and a plagiarism source retrieval sorting method.

Background art

[0002] In the general source retrieval process of plagiarism detection, source retrieval algorithms usually filter the retrieval results to obtain the plagiarism source documents that are finally text-aligned with the suspicious documents. Filtering performance is crucial to overall source retrieval performance, which makes filtering an indispensable step of source retrieval.

[0003] At present, existing source retrieval filtering techniques mainly adopt heuristic methods. However, heuristic methods make it difficult to incorporate additional effective features, and their performance gains depend on expert experience and on the discovery of effective filtering features.

[0004] Compared with heuristic methods, mac...


Application Information

IPC(8): G06F17/30
Inventor: 孔蕾蕾, 韩中元, 齐浩亮
Owner: HEILONGJIANG INST OF TECH