Patent duplicate checking method and system based on complex network text language intention coding mode

A complex-network coding technology, applied to patent duplicate checking based on a complex network text semantic coding method. It addresses shortcomings of existing text encodings, such as unreasonable assumptions, unclear physical meaning, and difficulty in associating full-text information, and yields an encoding that is easy to calculate, clearly expressive, and physically meaningful.

Status: Inactive; Publication Date: 2019-12-24
DATAGRAND TECH INC

AI Technical Summary

Problems solved by technology

However, no satisfactory encoding has so far been available for long texts. Four kinds of approach exist, each with clear shortcomings. The first considers only the co-occurrence frequency of long texts and words and completely ignores the relationships between words, which is obviously unreasonable. The second uses an N-gram probability model, but once too much context is considered the calculation becomes very complex, and it remains difficult to associate full-text information. The third is Sequence to Sequence encoding trained with deep learning, which requires a large amount of training data; in addition, its physical meaning is unclear and its accuracy cannot be trusted, so it is difficult to apply in practical scenarios. The fourth covers coding methods in which documents differ in length and express the same concept in different wording, so the codes must be aligned, which is also a difficult problem to solve.
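The first two shortcomings can be illustrated with a minimal sketch. The sentences and helper names below are illustrative assumptions, not taken from the patent: a pure word-count encoding cannot tell two texts apart when only the word order differs, while ordered word pairs (the raw counts behind a 2-gram model) can.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count word occurrences, discarding word order entirely."""
    return Counter(text.lower().split())


def bigrams(text: str) -> Counter:
    """Count ordered pairs of adjacent words (raw 2-gram counts)."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))


# Two hypothetical sentences with the same words in a different order.
a = "dog bites man"
b = "man bites dog"

same_bag = bag_of_words(a) == bag_of_words(b)  # word counts are identical
same_bigrams = bigrams(a) == bigrams(b)        # ordered pairs differ

print(same_bag, same_bigrams)
```

Extending the pair window to longer contexts makes the number of distinct N-grams grow quickly, which is the calculation cost the passage refers to.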

Embodiment Construction

[0038] In order to better understand the technical solutions of the present invention, the embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0039] It should be clear that the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0040] The present application will be described in further detail below through specific embodiments and in conjunction with the accompanying drawings.

[0041] The embodiment of the present invention provides a patent duplicate checking method and system based on a complex network text semantic coding mode.

[0042] The patent plagiarism checking system based on the complex network text semantic meaning encoding method includes a prep...

Abstract

The invention discloses a patent duplicate checking method and system based on a complex network text language intention coding mode. The method comprises the following steps: preprocessing a public data set obtained from the Internet into corpus data; inputting the corpus data into a Word2Vector model for training and testing to generate a word vector model; inputting the long text to be checked and the public long text into the word vector model to obtain a word vector set for each; constructing a multi-dimensional complex directed graph for the long text to be checked and another for the public long text; obtaining a tensor for each of the two long texts; and judging the degree of text similarity by calculating the similarity between the two tensors. The complex network text language intention coding mode provided by the invention fully represents the relationships between characters and words in the document and the weights of those relationships, and it is clear in expressive meaning, definite in physical significance, and easy to calculate.
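The steps in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented construction: the word vectors are a hypothetical hand-written table standing in for the trained word vector model, the directed graph links each word to the word that follows it, the "tensor" is assumed to be the edge-weighted sum of outer products of the two word vectors on each edge, and cosine similarity is assumed as the similarity measure.

```python
import math
from collections import defaultdict

# Hypothetical 3-dimensional word vectors; in the abstract these would come
# from the trained word vector model.
WORD_VECTORS = {
    "patent": [0.9, 0.1, 0.0],
    "text": [0.1, 0.9, 0.0],
    "coding": [0.3, 0.3, 0.4],
    "graph": [0.0, 0.2, 0.9],
}


def directed_graph(tokens):
    """Weighted directed graph: edge (u, v) counts how often v follows u."""
    edges = defaultdict(int)
    for u, v in zip(tokens, tokens[1:]):
        edges[(u, v)] += 1
    return edges


def text_tensor(tokens, vectors):
    """Assumed encoding: sum over edges of weight * outer(vec[u], vec[v]).

    The result is a dim x dim matrix, so its shape does not depend on the
    length of the text and no alignment between documents is needed.
    """
    dim = len(next(iter(vectors.values())))
    tensor = [[0.0] * dim for _ in range(dim)]
    known = [t for t in tokens if t in vectors]  # skip out-of-vocabulary words
    for (u, v), weight in directed_graph(known).items():
        for i in range(dim):
            for j in range(dim):
                tensor[i][j] += weight * vectors[u][i] * vectors[v][j]
    return tensor


def cosine(a, b):
    """Cosine similarity between two tensors, compared element by element."""
    flat_a = [x for row in a for x in row]
    flat_b = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(x * x for x in flat_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = text_tensor("patent text coding graph".split(), WORD_VECTORS)
same = text_tensor("patent text coding graph".split(), WORD_VECTORS)
reordered = text_tensor("graph coding text patent".split(), WORD_VECTORS)

sim_same = cosine(query, same)
sim_reordered = cosine(query, reordered)
print(sim_same, sim_reordered)
```

Because the edges are directed, the reordered text produces a different tensor even though it contains exactly the same words, which is the kind of word-relationship information a plain co-occurrence count would lose.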

Description

Technical field

[0001] The present invention relates to the technical field of text similarity matching, in particular to a method and system for patent duplicate checking based on complex network text language intent coding.

Background technique

[0002] Text matching mainly studies the similarity between two pieces of text. The similarity problem has two layers: one is how to represent the two pieces of text so that a computer can process them easily; the other is how to define similarity as an optimization goal, such as semantic matching similarity, click relationship similarity, or user behavior similarity, a choice that is closely tied to the business scenario.

[0003] For the representation of text, the current mainstream methods mainly stay at the word level and give little consideration to the relationships between words across the entire text. Around 1970, the mainstream representation methods were vector space algorithms such as TF-IDF and BM25. This type of method is to express the w...
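As a rough illustration of the word-level vector space representation mentioned in the background, the sketch below computes TF-IDF vectors and compares them with cosine similarity. The documents, the helper names, and the particular smoothed IDF formula are assumptions chosen for the example and are not taken from the patent.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vocab = sorted(df)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append([
            # Term frequency times a smoothed inverse document frequency.
            (tf[w] / len(tokens)) * (math.log((1 + n_docs) / (1 + df[w])) + 1)
            for w in vocab
        ])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical documents for illustration only.
docs = [
    "patent text similarity matching",
    "text similarity for patent search",
    "steel bridge welding process",
]
vocab, vecs = tf_idf_vectors(docs)

sim_related = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
print(sim_related, sim_unrelated)
```

Each vector position corresponds to a single word, so the representation carries no information about how words relate to one another within a text.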

Application Information

IPC(8): G06F16/9032, G06F16/951, G06F17/22
CPC: G06F16/90332, G06F16/951
Inventor: 周明星, 陈运文, 江永青, 桂洪冠, 边一雄, 纪达麒
Owner DATAGRAND TECH INC