Document retrieval method and device and computer readable storage medium

Document retrieval and computer program technology, applied to computer components, unstructured text data retrieval, and computing. It addresses the limited reading length of reading comprehension models, the resulting loss of semantic information, and low document retrieval accuracy. Its effects are to reduce the reading difficulty, narrow the scope of the query, and improve accuracy.

Active Publication Date: 2020-05-15
CLOUDMINDS SHANGHAI ROBOTICS CO LTD

AI Technical Summary

Problems solved by technology

[0003] The inventors have found at least the following problem in the prior art: documents usually consist of multiple sentences or multiple paragraphs of text. A reading comprehension model has a limited reading length (that is, it can only read and comprehend documents within a preset word count). If the retrieved candidate documents are input directly into the model, candidate documents with many words not only increase the difficulty of machine reading but may also lose some semantic information because of the length limit. This directly affects the overall performance of reading comprehension and results in inaccurate document retrieval.




Detailed Description of Embodiments

[0022] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the implementation modes of the present invention are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that many technical details are given in each implementation mode so that readers can better understand the present invention. The technical solution claimed in the present invention can nevertheless be realized without these technical details, and with various changes and modifications based on the following implementation modes.

[0023] Unless the context clearly requires otherwise, throughout the specification and claims, "comprises", "includes", and similar words should be interpreted in an inclusive sense rather than an exclusive or exhaustive sense; that is, in the sense of "including but not limited to".

[0024] In the descri...



Abstract

The embodiment of the invention relates to the field of natural language processing, and discloses a document retrieval method and device and a computer readable storage medium. The document retrieval method comprises the following steps: acquiring candidate documents, wherein the candidate documents are determined by a query statement input by a user; judging whether the text word count of a candidate document is less than or equal to a preset word count; if not, segmenting the candidate document into a plurality of sentences; calculating the similarity between each of the sentences and the query statement; deleting some of the sentences according to the similarity until the total word count of the remaining sentences is less than or equal to the preset word count; and inputting the remaining sentences and the query statement into a preset machine reading model to obtain an answer to the query statement. The document retrieval method and device and the computer readable storage medium provided by the invention can reduce the reading difficulty of the machine reading model while improving document retrieval accuracy.
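The pruning step in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the visible text does not say how sentences are split, how similarity is computed, or in what order sentences are deleted. Here the hypothetical helpers `split_sentences`, `tokens`, and `cosine` use a punctuation-based splitter and bag-of-words cosine similarity, and `prune_document` deletes the least similar sentences first. The call to the machine reading model is omitted.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter on '.', '!' and '?' (assumption; the source names none)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text):
    """Lowercased word tokens; also used here as the word-count unit."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity (assumed measure, not from the source)."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def prune_document(document, query, max_words):
    """Return the document unchanged if it fits within max_words; otherwise
    delete the sentences least similar to the query until it fits."""
    if len(tokens(document)) <= max_words:
        return document
    sentences = split_sentences(document)
    kept = list(range(len(sentences)))
    total = sum(len(tokens(s)) for s in sentences)
    # Visit sentences from least to most similar to the query.
    for i in sorted(kept, key=lambda i: cosine(sentences[i], query)):
        if total <= max_words:
            break
        kept.remove(i)
        total -= len(tokens(sentences[i]))
    # Remaining sentences keep their original order.
    return " ".join(sentences[i] for i in kept)


doc = (
    "The cat sat on the mat. Paris is the capital of France. "
    "Dogs bark loudly at night."
)
pruned = prune_document(doc, "What is the capital of France?", max_words=8)
print(pruned)
```

The pruned text and the query would then be passed to the reading model. Keeping the surviving sentences in document order is a design choice of this sketch, intended to preserve local context for the reader.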

Description

Technical field

[0001] Embodiments of the present invention relate to the field of natural language processing, and in particular to a document retrieval method, device, and computer-readable storage medium.

Background

[0002] Document retrieval refers to retrieving the first few documents most relevant to the query from the retrieval database as the document candidate set; document reading refers to obtaining the answer to the query by machine reading the query and the documents. The document candidate set is mainly obtained by calculating the similarity between the query and each document in the text library, and then sorting according to the similarity.

[0003] The inventors have found that there are at least the following problems in the prior art: documents are usually composed of multiple sentences or multi-paragraph texts. If the retrieved candidate documents are directly input into the reading comprehension model, due to the limitation of the reading...
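The candidate-set step described in [0002] (score every document in the text library against the query, sort, keep the first few) might look like the sketch below. The similarity measure is an assumption: the source does not name one, so Jaccard overlap of word sets stands in for it. The names `jaccard`, `retrieve_candidates`, and `top_k` are hypothetical.

```python
import re


def jaccard(a, b):
    """Jaccard overlap of lowercased word sets (assumed similarity measure)."""
    sa = set(re.findall(r"\w+", a.lower()))
    sb = set(re.findall(r"\w+", b.lower()))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def retrieve_candidates(query, library, top_k=2):
    """Rank every document by similarity to the query and keep the top_k."""
    ranked = sorted(library, key=lambda doc: jaccard(query, doc), reverse=True)
    return ranked[:top_k]


library = [
    "Paris is the capital of France.",
    "Dogs bark at night.",
    "The capital of Italy is Rome.",
]
candidates = retrieve_candidates("capital of France", library, top_k=2)
print(candidates)
```

A production system would more likely use an inverted index or dense embeddings than a linear scan, but the ranking logic is the same.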


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/33, G06F16/35, G06F40/295, G06K9/62
CPC: G06F16/3334, G06F16/35, G06F18/22, Y02D10/00
Inventor: 付霞
Owner: CLOUDMINDS SHANGHAI ROBOTICS CO LTD