Multi-document machine reading comprehension method with multi-granularity answer sorting

A multi-document reading comprehension technology, applied in the field of machine reading comprehension, that addresses problems such as poor model representation and generalization ability, inability to integrate multi-granularity question-answer correlation, and limited model input length.

Active Publication Date: 2020-01-03

BEIJING INSTITUTE OF TECHNOLOGY

5 Cites, 36 Cited by

Problems solved by technology

[0005] The purpose of the present invention is to overcome the technical defects of existing models, namely that the input length is limited or that the correlation between questions and answers at multiple granularities cannot be integrated, which leads to poor model representation and generalization capabilities. To this end, the invention proposes a multi-document machine reading comprehension method with multi-granularity answer sorting.

Examples

Embodiment 1

[0073] Figure 1 is a flow chart of the multi-document machine reading comprehension method for multi-granularity answer sorting according to the present invention, as used in this embodiment.

[0074] As can be seen from Figure 1, the present invention comprises the following steps:

[0075] Step A: joint feature representation of the question and the documents.

[0076] Specifically, obtain the question and multiple documents, then perform document splitting, input sequence vectorization, and text semantic representation.

[0077] In this embodiment, Step A corresponds to steps 1 to 5 in the summary of the invention.

[0078] Obtaining the question and multiple documents and splitting the documents means splitting each document, according to a predefined sliding window length and sliding distance, into sequences short enough to be input to the model. This corresponds to steps 1 to 2 in the summary of the invention.
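The document splitting described in [0078] can be illustrated with a minimal sketch. The function name `split_with_sliding_window`, the parameter names `window` and `stride`, and the use of a plain token list are all assumptions made for illustration; the patent text shown here does not specify the tokenization or the concrete window values.

```python
from typing import List


def split_with_sliding_window(tokens: List[str], window: int, stride: int) -> List[List[str]]:
    """Split a token sequence into overlapping fragments.

    window: maximum fragment length (hypothetical name for the
            "sliding window length" in the text).
    stride: how far the window moves each step (hypothetical name
            for the "sliding distance").
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if not tokens:
        return []

    fragments = []
    start = 0
    while True:
        fragments.append(tokens[start:start + window])
        # Stop once the current window already reaches the end of the document.
        if start + window >= len(tokens):
            break
        start += stride
    return fragments


if __name__ == "__main__":
    # Illustrative input only: a 10-token "document".
    doc = list("abcdefghij")
    for fragment in split_with_sliding_window(doc, window=4, stride=2):
        print("".join(fragment))
```

Choosing a stride smaller than the window makes neighbouring fragments overlap, so an answer that straddles a fragment boundary still appears whole in at least one fragment, provided it is no longer than the overlap.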

[0079] Input sequence vectorization, that is, to obtain...

Embodiment 2

[0091] This embodiment uses the following question and documents as an example to describe in detail the specific operation steps of the multi-document machine reading comprehension method for multi-granularity answer sorting of the present invention.

Question: "Is a gecko a beneficial insect?"

Document 1: "The gecko is a beneficial insect. It eats mosquitoes, flies and insects. It looks ugly but is actually a beneficial insect and does not bite people."

Document 2: "First of all, it is harmless to humans. You should not be afraid when you see it in the house; it will not bite, and it also eats mosquitoes and bugs. Geckos have no harmful characteristics, although a few species may be poisonous."

Document 3: "Geckos are reptiles belonging to the lizards. They have considerable medicinal value, are beneficial insects, are nocturnal, and like to hunt flies near lights at night. They are not harmful to people, and they are national second-class protected animals."

[0092] The processing flow of a multi-...

Embodiment 3

[0111] To further verify the effectiveness of the multi-document machine reading comprehension method for multi-granularity answer sorting of the present invention, this embodiment uses the corpus of 270,000 question-multi-document pairs used in Embodiment 2. The corpus comes from DuReader, a large-scale Chinese multi-document machine reading comprehension dataset released by Baidu in 2017. Each question corresponds to 5 documents from the Baidu search engine or the Baidu Zhidao (Baidu Knows) community. The dataset provides a document answer label for each question, and the 5 documents are listed in the order in which the search engine actually returned them.

Abstract

The invention discloses a multi-document machine reading comprehension method with multi-granularity answer sorting, and belongs to the technical field of machine reading comprehension applications. The method is based on a pre-trained deep learning model. Each document is split into text fragments through a sliding window, and the text fragments are spliced with the question. The candidate answers generated from the multiple documents are then sorted by a multi-granularity answer sorting step that fuses statistical information, shallow semantic information, deep semantic information and answer ending word information, so that semantic information of different granularities is fully used to capture the correlation between the question and the candidate answers. By using a pre-trained deep learning model, the method improves on the text representation capability and generalization capability of traditional machine reading comprehension models. It also overcomes the limited input length of existing models in the multi-document scenario, and it improves the answer quality of multi-document machine reading comprehension by modeling the correlation between questions and answers at different granularities.
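The answer sorting step in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract names four kinds of information but does not say how they are computed or fused, so the weighted sum, the `Candidate` class, the granularity keys, the equal weights, and every score value below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# The four kinds of information named in the abstract (key names are assumptions).
GRANULARITIES = ("statistical", "shallow_semantic", "deep_semantic", "ending_word")


@dataclass
class Candidate:
    text: str
    scores: Dict[str, float]  # one precomputed score per granularity


def fuse(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-granularity scores with a weighted sum (assumed fusion rule)."""
    return sum(weights[g] * scores[g] for g in GRANULARITIES)


def rank_candidates(candidates: List[Candidate], weights: Dict[str, float]) -> List[Candidate]:
    """Return candidates sorted from highest to lowest fused score."""
    return sorted(candidates, key=lambda c: fuse(c.scores, weights), reverse=True)


if __name__ == "__main__":
    # Equal weights and all score values are made up for illustration.
    weights = {g: 0.25 for g in GRANULARITIES}
    candidates = [
        Candidate("it will not bite", dict(zip(GRANULARITIES, (0.5, 0.7, 0.4, 0.0)))),
        Candidate("The gecko is a beneficial insect.", dict(zip(GRANULARITIES, (0.8, 0.6, 0.9, 1.0)))),
        Candidate("Geckos are reptiles", dict(zip(GRANULARITIES, (0.9, 0.2, 0.3, 1.0)))),
    ]
    for c in rank_candidates(candidates, weights):
        print(c.text)
```

Keeping the per-granularity scores separate until the final fusion makes it easy to inspect which kind of information drives a given ranking, or to swap the weighted sum for a learned ranker.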

Description

Technical field

[0001] The present invention relates to a multi-document machine reading comprehension method with multi-granularity answer sorting, in particular to a method that sorts answers by fusing statistical information, shallow semantic information, deep semantic information and answer ending word information, and belongs to the technical field of machine reading comprehension applications.

Background technique

[0002] In recent years, the performance of Machine Reading Comprehension (MRC) models on multiple machine reading comprehension tasks has improved significantly. The attention-based machine reading comprehension model is considered the most classic method in machine reading comprehension: it first mathematically models the question and the document, and then fuses the question and document information based on the attention mechanism to form an answer probability model i...

Application Information

IPC(8): G06F16/35, G06K9/62

CPC: G06F16/35, G06F18/2113, G06F18/2411, G06F18/2415, G06F18/25, G06F18/214

Inventors: 史树敏, 刘宏玉, 黄河燕

Owner: BEIJING INSTITUTE OF TECHNOLOGY
