Text duplicate checking processing method, device, computer equipment and computer storage medium

A text processing technology, applied in computer parts, text database query, calculation, etc., that can solve the low efficiency, poor reliability and low accuracy of existing duplicate checking methods

Pending Publication Date: 2020-10-09
PINGAN INT SMART CITY TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] In view of this, the embodiments of the present application provide a text duplicate checking processing method, device, terminal, and computer storage medium to solve the problems of low efficiency, low accuracy, and poor reliability of duplicate checking methods in the prior art.



Embodiment Construction

[0072] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0073] Referring to Figure 1, Figure 1 is an implementation flowchart of a text duplicate checking processing method provided in the first embodiment of the present application. The details are as follows:

[0074] Step S11: Obtain the word score table corresponding to the text to be checked for duplication. The word score table contains all the target words of the text to be checked and the word score value corresponding to each target word, wherein the target words represent the content information of the text to be checked, and the word score value ...
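The word score table of Step S11 can be pictured as a mapping from each target word to a numeric score. The sketch below is only an illustration: the source text is truncated before it defines how the score is computed, so a smoothed inverse document frequency over the historical texts is used here as a stand-in assumption. The function name `build_word_score_table`, the regex tokenizer, and the scoring formula are all hypothetical and are not taken from the application.

```python
import math
import re
from collections import Counter


def build_word_score_table(text, historical_texts):
    """Map each target word of `text` to a score (hypothetical scoring).

    Assumption: words that are rare across the historical texts carry more
    specific information, so they receive a higher score (smoothed IDF).
    """
    def tokenize(s):
        # Naive tokenizer for illustration; Chinese text would need a
        # real word segmenter instead.
        return re.findall(r"\w+", s.lower())

    n_docs = len(historical_texts)
    doc_freq = Counter()
    for doc in historical_texts:
        # Count each word at most once per historical document.
        doc_freq.update(set(tokenize(doc)))

    table = {}
    for word in set(tokenize(text)):
        # Smoothed IDF: common words approach 1.0, rare words score higher.
        table[word] = math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0
    return table
```

With this stand-in, a word that appears in every historical text (such as "the") gets the minimum score of 1.0, while a word absent from all of them gets the maximum.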


Abstract

The invention belongs to the technical field of artificial intelligence and provides a text duplicate checking processing method, a device, computer equipment and a computer storage medium. The method comprises the steps of: obtaining a word score table corresponding to a text to be checked for duplication; obtaining the similarity between the text to be checked and a historical text by combining the word score values corresponding to the target words in the word score table and performing a word-level comparison calculation between the text to be checked and the historical texts stored in a historical text database; comparing the similarity with a similarity threshold; and evaluating whether the text to be checked is a duplicate text according to the comparison result. In this method, the similarity between texts is calculated based on the word score values corresponding to words, so that words containing special information have a large influence on the text similarity evaluation and general words have a small influence. The judgment of whether two texts are duplicates is therefore strongly related to the words they contain, which improves the accuracy and reliability of text duplicate checking.
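The comparison described in the abstract can be sketched as a score-weighted overlap followed by a threshold test. This is a minimal illustration under stated assumptions, not the claimed method: the similarity formula (the share of the total word score that is matched by the historical text), the function names `weighted_similarity` and `is_duplicate`, and the default threshold of 0.8 are all hypothetical.

```python
def weighted_similarity(word_scores, historical_words):
    """Share of the total word score that also occurs in a historical text.

    `word_scores` maps each target word of the text under check to its
    score; `historical_words` is the set of words of one historical text.
    """
    total = sum(word_scores.values())
    if total == 0:
        return 0.0
    shared = sum(score for word, score in word_scores.items()
                 if word in historical_words)
    return shared / total


def is_duplicate(word_scores, historical_texts, threshold=0.8):
    """Flag the text when any historical text reaches the threshold."""
    return any(
        # Whitespace splitting is an assumption made for brevity.
        weighted_similarity(word_scores, set(text.lower().split())) >= threshold
        for text in historical_texts
    )


# Hypothetical scores: specific words weigh more than general ones.
scores = {"graphene": 3.0, "battery": 2.0, "the": 0.5, "project": 0.5}
print(is_duplicate(scores, ["a graphene battery plan"]))  # matches the heavy words
print(is_duplicate(scores, ["the project"]))              # matches only general words
```

Because the overlap is weighted, a historical text that shares only general words contributes little to the similarity, while one that shares the high-score words can cross the threshold even if the surrounding wording differs.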

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, and in particular to a text duplicate checking processing method, device, computer equipment and computer storage medium.

Background technique

[0002] At present, project declaration is the process by which enterprises or other research units apply for a series of preferential policies offered by government agencies. In order to obtain more incentive funds, some enterprises declare the same project to different government departments, or declare the same project in the name of different enterprises. Moreover, when the same project is declared a second time, the wording of the text file is often adjusted so that the two text files are not exactly the same, changing the words without changing the meaning. This behavior undoubtedly increases the difficulty of duplicate checking.

[0003] At present, the existing duplicate checking method ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/194, G06F16/33, G06K9/62
CPC: G06F40/194, G06F16/3331, G06F18/22
Inventor: 肖丹, 陈翔
Owner: PINGAN INT SMART CITY TECH CO LTD