Similar document searching method and device, electronic equipment and storage medium

A similar-document search technology, applied in network data retrieval, other database retrieval, electronic digital data processing, etc. It addresses the low efficiency and accuracy of existing similar-document search, and yields more accurate similar documents and more precise ranking.

Pending Publication Date: 2022-07-15
BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD
Cites: 0; Cited by: 5
AI Technical Summary

Problems solved by technology

[0005] The purpose of the embodiments of the present invention is to provide a similar document search method, device, electronic device and storage medium, so as to solve the problem of low efficiency and accuracy in similar document search in the prior art.

Examples

Detailed Description of the Embodiments

[0077] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art based on the present invention fall within the protection scope of the present invention.

[0078] The terms involved in the present invention are first introduced below.

[0079] BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained language representation model. It is pre-trained to predict a masked word from its left and right context, and/or to predict whether one sentence follows another. Therefore, with only one additional output layer, pretrained BERT representations can be f...

Abstract

The embodiments of the invention provide a similar document search method and device, an electronic device and a storage medium, relating to the field of data processing and in particular to the technical field of document retrieval. The method comprises: determining a target document for which similar documents are to be searched; recalling a plurality of candidate documents from a document library using the target document; for each candidate document, calculating the similarity between the candidate document and the target document at multiple granularities, and fusing the calculated similarities to obtain the document similarity between the candidate document and the target document, where the granularities comprise at least two of a character level, a sentence level and a semantic level; and selecting, based on the determined document similarities, the similar documents of the target document from the plurality of candidate documents. This scheme can improve the accuracy of similar document search.
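
The steps in the abstract (recall, per-granularity similarity, fusion, selection) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how recall, the individual similarities, or the fusion are computed, so the shared-token recall filter, the character-bigram and sentence Jaccard scores, the bag-of-words cosine (standing in for a semantic model such as BERT embeddings), the weighted-average fusion, the threshold, and every function name and weight below are hypothetical. The sketch uses all three granularities, although the abstract only requires at least two.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    """Lowercased word tokens."""
    return re.findall(r"\w+", text.lower())


def char_similarity(a: str, b: str) -> float:
    """Character level (assumed metric): Jaccard overlap of character bigrams."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def sentence_similarity(a: str, b: str) -> float:
    """Sentence level (assumed metric): Jaccard overlap of distinct sentences."""
    def split(text: str) -> set[str]:
        return {s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()}
    sents_a, sents_b = split(a), split(b)
    if not sents_a or not sents_b:
        return 0.0
    return len(sents_a & sents_b) / len(sents_a | sents_b)


def semantic_similarity(a: str, b: str) -> float:
    """Semantic level stand-in: cosine of word-count vectors.
    A real system would more likely compare learned embeddings."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(target: str, library: list[str]) -> list[str]:
    """Recall step (assumed): keep documents sharing at least one token."""
    target_tokens = set(tokens(target))
    return [doc for doc in library if target_tokens & set(tokens(doc))]


def document_similarity(target: str, candidate: str,
                        weights: tuple[float, float, float] = (0.3, 0.3, 0.4)) -> float:
    """Fusion step (assumed): weighted average of the three similarities."""
    scores = (
        char_similarity(target, candidate),
        sentence_similarity(target, candidate),
        semantic_similarity(target, candidate),
    )
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def find_similar(target: str, library: list[str],
                 threshold: float = 0.5) -> list[tuple[float, str]]:
    """Score recalled candidates and keep those at or above the threshold,
    ordered from most to least similar."""
    scored = [(document_similarity(target, doc), doc) for doc in recall(target, library)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, doc) for score, doc in scored if score >= threshold]
```

Calling `find_similar(target, library)` returns `(score, document)` pairs sorted by fused score. Fusing several granularities means a document that shares most characters and words with the target but differs in one sentence still scores high, without being treated as identical.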

Description

Technical field

[0001] The present invention relates to the field of data processing, in particular to the technical field of document retrieval, and more specifically to a similar document search method, device, electronic device and storage medium.

Background technique

[0002] Artificial intelligence (AI) refers to the ability of an engineered (i.e., designed and manufactured) system to perceive the environment and to acquire, process, apply and represent knowledge. Natural Language Processing (NLP) is an important direction in the field of computer science and artificial intelligence. It refers to the use of computers to operate on and process the form, sound, meaning and other information of natural language, including input, output, recognition, analysis, understanding and generation. It studies the theories and methods that enable effective communication between humans and computers using natural language. The specific manife...

Application Information

IPC(8): G06F16/953, G06F40/194, G06F40/258, G06F40/279, G06F40/30
CPC: G06F16/953, G06F40/194, G06F40/258, G06F40/279, G06F40/30
Inventor 弓源, 李长亮
Owner BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD