Multi-document machine reading comprehension method with multi-granularity answer sorting

A multi-document reading comprehension technology, applied in the field of machine reading comprehension, that addresses problems such as poor model representation and generalization ability, the inability to integrate multi-granularity question-answer correlation, and limited model input length.

Active Publication Date: 2020-01-03
BEIJING INSTITUTE OF TECHNOLOGY
5 Cites, 36 Cited by

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to overcome the technical defects of existing models, namely that the input length is limited or that the correlation between questions and answers at multiple granularities cannot be integrated, which leads to poor model representation and generalization capabilities. To this end, it proposes a multi-document machine reading comprehension method with multi-granularity answer sorting.

Examples

Embodiment 1

[0073] Figure 1 is a flow chart of the multi-document machine reading comprehension method with multi-granularity answer sorting according to the present invention and of this embodiment;

[0074] As can be seen from Figure 1, the present invention comprises the following steps:

[0075] Step A: joint feature representation of the question and documents;

[0076] Specifically, obtain the question and multiple documents, then perform document splitting, input sequence vectorization, and text semantic representation;

[0077] In this embodiment, step A corresponds to steps 1 to 5 in the summary of the invention;

[0078] Obtaining the question and multiple documents and splitting the documents means, specifically, splitting each document into sequences short enough to be input to the model, according to a predefined sliding window length and sliding distance; this corresponds to steps 1 to 2 in the summary of the invention;
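The splitting described in [0078] can be illustrated with a minimal sketch. It assumes the document is already tokenized; the function names, the parameter names `window_len` and `stride`, and the `[CLS]`/`[SEP]` markers (a common convention for pre-trained models) are illustrative assumptions and are not taken from the patent text.

```python
from typing import List


def split_with_sliding_window(tokens: List[str], window_len: int, stride: int) -> List[List[str]]:
    """Split a token sequence into overlapping fragments of at most window_len tokens.

    window_len plays the role of the predefined sliding window length and
    stride the role of the sliding distance (both names are assumptions).
    """
    if window_len <= 0 or stride <= 0:
        raise ValueError("window_len and stride must be positive")
    if not tokens:
        return []
    fragments = []
    start = 0
    while True:
        fragments.append(tokens[start:start + window_len])
        # Stop once the current window reaches the end of the document.
        if start + window_len >= len(tokens):
            break
        start += stride
    return fragments


def build_inputs(question: List[str], document: List[str],
                 window_len: int, stride: int) -> List[List[str]]:
    """Concatenate the question with every document fragment.

    The [CLS]/[SEP] markers are a hypothetical input layout, not the
    patent's stated format.
    """
    return [
        ["[CLS]"] + question + ["[SEP]"] + fragment + ["[SEP]"]
        for fragment in split_with_sliding_window(document, window_len, stride)
    ]
```

With a stride smaller than the window length, adjacent fragments overlap, so an answer that straddles a window boundary still appears whole in at least one fragment, provided it is no longer than the overlap.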

[0079] Input sequence vectorization, that is, to obtain...

Embodiment 2

[0091] This embodiment takes the question "Is a gecko a beneficial insect?" together with three documents as an example to describe in detail the specific operation steps of the multi-document machine reading comprehension method with multi-granularity answer sorting of the present invention. Document 1: "Gecko is a beneficial insect. It eats mosquitoes, flies and insects. It looks ugly but is actually a beneficial insect and does not bite people." Document 2: "First of all, it is harmless to humans. You should not be afraid when you see it in the house; it will not bite, and it also eats mosquitoes and bugs. Geckos have no harmful characteristics, and a few species may be poisonous." Document 3: "Geckos are reptiles that belong to the lizards. They have a lot of medicinal value, are beneficial insects, are nocturnal, and like to hunt flies in places with lights at night. They are not harmful to people, and they are national second-class protected animals."

[0092] The processing flow of a multi-...

Embodiment 3

[0111] To further verify the effectiveness of the multi-document machine reading comprehension method with multi-granularity answer sorting of the present invention, this embodiment uses the corpus of 270,000 question-multi-document pairs used in Embodiment 2. The corpus comes from DuReader, a large-scale Chinese multi-document machine reading comprehension dataset released by Baidu in 2017, in which each question corresponds to 5 documents from the Baidu search engine or the Baidu Zhidao community. The dataset has a document answer label for each question, and the 5 documents are in the order in which the search engine actually returned them.

Abstract

The invention discloses a multi-document machine reading comprehension method with multi-granularity answer sorting, and belongs to the technical field of machine reading comprehension applications. The method is based on a pre-trained deep learning model. Documents are split into text fragments through a sliding window, and the text fragments are spliced with the question. The candidate answers generated from the multiple documents are then sorted by a multi-granularity answer sorting that fuses statistical information, shallow semantic information, deep semantic information and answer ending word information, so that semantic information of different granularities is fully utilized to capture the correlation between the question and the candidate answers. By using a pre-trained deep learning model, the method improves on the text representation capability and generalization capability of traditional machine reading comprehension models. It also overcomes the limited input length of existing models in multi-document scenarios, and improves the answer quality of multi-document machine reading comprehension by modeling the correlation between questions and answers at different granularities.
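The fusion of several granularities of evidence into one ordering can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the four kinds of information are computed or combined, so the per-candidate scores are taken as given inputs, and the weighted sum, the equal default weights, and all names (`Candidate`, `fused_score`, `rank_candidates`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """One candidate answer with one precomputed score per granularity."""
    text: str
    statistical: float       # statistical information score
    shallow_semantic: float  # shallow semantic information score
    deep_semantic: float     # deep semantic information score
    ending_word: float       # answer ending word information score


# Hypothetical equal weights; the source does not specify a combination rule.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "statistical": 0.25,
    "shallow_semantic": 0.25,
    "deep_semantic": 0.25,
    "ending_word": 0.25,
}


def fused_score(candidate: Candidate, weights: Dict[str, float]) -> float:
    """Weighted sum of the four scores (an assumed fusion, not the patent's)."""
    return sum(weight * getattr(candidate, name) for name, weight in weights.items())


def rank_candidates(candidates: List[Candidate],
                    weights: Dict[str, float] = DEFAULT_WEIGHTS) -> List[Candidate]:
    """Return the candidates ordered from highest to lowest fused score."""
    return sorted(candidates, key=lambda c: fused_score(c, weights), reverse=True)
```

A linear combination is used here only because it is the simplest way to show several signals contributing to one ordering; a learned ranking model could take its place without changing the interface.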

Description

Technical field

[0001] The present invention relates to a multi-document machine reading comprehension method with multi-granularity answer sorting, in particular to a method that sorts answers at multiple granularities by fusing statistical information, shallow semantic information, deep semantic information and answer ending word information, and belongs to the technical field of machine reading comprehension applications.

Background technique

[0002] In recent years, the performance of Machine Reading Comprehension (MRC) models on multiple machine reading comprehension tasks has improved significantly, and the machine reading comprehension model based on the attention mechanism is considered to be the most classic method in machine reading comprehension. It first mathematically models the question and the document, and then fuses the question and document information based on the attention mechanism to form an answer probability model i...

Application Information

IPC(8): G06F16/35, G06K9/62
CPC: G06F16/35, G06F18/2113, G06F18/2411, G06F18/2415, G06F18/25, G06F18/214
Inventors: 史树敏, 刘宏玉, 黄河燕
Owner: BEIJING INSTITUTE OF TECHNOLOGY