Machine reading understanding method for guiding attention based on knowledge

A reading comprehension and attention technology, applied in fields such as instruments, digital data processing and natural language data processing, that addresses problems such as the inability to fully understand content.

Active Publication Date: 2020-06-05
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

When people read and understand an article, reasoning is involved at almost every step. Without reasoning, people cannot fully understand the content, and the same is true for machines.

Examples

Embodiment

[0135] Taking the CNN/Daily Mail dataset as an example, the above method is applied to the reading comprehension task. The specific parameters and practices in each step are as follows:

[0136] 1. The original CNN/Daily Mail dataset stores each example in a separate file. To simplify later processing, the files are merged and redundant fields are removed, keeping only (Question, Context, Answer) triples. Natural language processing tools then split the articles and questions into sentences and words. The vocabulary size is 118,497 / 208,045 for CNN / Daily Mail, and articles in both contain about 26 entities on average;
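The preprocessing in step 1 can be sketched roughly as follows. This is a minimal illustration, not the patent's tooling: the record field names (`url`, `context`, `question`, `answer`), the regex-based sentence splitter and the tokenizer are all assumptions standing in for the unspecified "natural language processing tools".

```python
import re

def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    # Words (optionally prefixed with '@', as in entity markers) or single punctuation marks.
    return re.findall(r"@?\w+|[^\w\s]", sentence)

def to_triple(record):
    # Keep only the three fields of interest; any other field is dropped.
    return {
        "question": tokenize(record["question"]),
        "context": [tokenize(s) for s in split_sentences(record["context"])],
        "answer": record["answer"],
    }

def merge_records(records):
    # Merge per-file records (already loaded as dicts) into one list of triples.
    return [to_triple(r) for r in records]

# Hypothetical record, as it might look after loading one file.
raw = [{
    "url": "http://example.invalid/story",  # redundant field, removed below
    "context": "@entity1 visited @entity2 . He left on Monday.",
    "question": "@placeholder visited @entity2",
    "answer": "@entity1",
}]
dataset = merge_records(raw)
```

Each item in `dataset` then holds a tokenized question, a list of tokenized sentences for the context, and the answer string.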

[0137] 2. Use the pre-trained 300-dimensional Stanford GloVe vectors (the 600 million set) together with the vocabulary from step 1 to build 300-dimensional word vectors. To train the model, the word frequencies in the training set are counted and sorted in descending order, and the first 50k word...
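A frequency-ranked vocabulary with pre-trained vector lookup, as described in step 2, might look like the sketch below. The excerpt is cut off before it says how the remaining words are handled, so the `<pad>` and `<unk>` tokens and the random initialisation of words missing from the pre-trained set are assumptions, as are all function names.

```python
from collections import Counter
import random

def build_vocab(token_lists, max_size, specials=("<pad>", "<unk>")):
    # Count word frequencies over the training set.
    counts = Counter(tok for toks in token_lists for tok in toks)
    # Sort by descending frequency; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    words = list(specials) + [w for w, _ in ranked[:max_size]]
    return {w: i for i, w in enumerate(words)}

def build_embeddings(vocab, pretrained, dim, seed=0):
    # One row per vocabulary entry; '<pad>' stays all zeros.
    rng = random.Random(seed)
    matrix = [[0.0] * dim for _ in vocab]
    for word, idx in vocab.items():
        if word in pretrained:
            matrix[idx] = list(pretrained[word])
        elif word != "<pad>":
            # Assumption: small random vectors for words without a pre-trained vector.
            matrix[idx] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return matrix

def encode(tokens, vocab):
    # Map tokens to ids, sending out-of-vocabulary words to '<unk>'.
    unk = vocab["<unk>"]
    return [vocab.get(t, unk) for t in tokens]

# Toy usage with 2-dimensional vectors instead of 300 and a top-2 cut-off instead of 50k.
train_tokens = [["the", "cat", "sat"], ["the", "cat"], ["the"]]
vocab = build_vocab(train_tokens, max_size=2)
embeddings = build_embeddings(vocab, {"the": [1.0, 2.0]}, dim=2)
ids = encode(["the", "sat"], vocab)
```

In a real setup the `pretrained` dictionary would be loaded from the GloVe text file, with `dim=300` and `max_size=50000`.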

Abstract

The invention discloses a machine reading comprehension method based on knowledge-guided attention. The method comprises the following steps: (1) obtaining word vectors for a sequence by using a pre-trained word embedding matrix; (2) modeling the context information of each word in the text by using a bidirectional GRU network; (3) inputting the context representation of the question into a one-way GRU network as its initial hidden state, where the GRU network iteratively executes a search step using an attention-based look-back mechanism to collect information in the article that may be useful for predicting the answer; (4) taking external knowledge as long-term memory and adding it to the look-back mechanism, so that it guides the focus of attention during look-back and the model redistributes its attention scores; and (5) obtaining the predicted answer at the output of the one-way GRU network through a pointer network. The method is an end-to-end model and requires no data preprocessing beyond word vectors pre-trained on an unlabeled corpus, so it can be widely applied to reading comprehension in different languages and fields.
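Steps (3) to (5) can be illustrated with a single attention step. The abstract does not give the scoring functions, so this sketch makes several assumptions: dot-product attention between the GRU state and the context vectors, external knowledge expressed as an additive per-token bias on the attention logits, and a pointer-style readout that sums attention mass over each candidate's positions. The GRU update itself is omitted.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def guided_attention(state, context_vecs, knowledge_bias):
    # Dot-product score per context token, shifted by a knowledge bias (assumed form),
    # then normalised so the attention scores are redistributed over the article.
    logits = [
        sum(s * c for s, c in zip(state, vec)) + bias
        for vec, bias in zip(context_vecs, knowledge_bias)
    ]
    return softmax(logits)

def point_to_answer(attention, context_tokens, candidates):
    # Pointer-style readout: add up the attention on every occurrence of each candidate.
    mass = {c: 0.0 for c in candidates}
    for a, tok in zip(attention, context_tokens):
        if tok in mass:
            mass[tok] += a
    return max(mass, key=mass.get), mass

# Toy article with hand-made 2-dimensional token representations.
tokens = ["@entity1", "met", "@entity2", "@entity1"]
vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
state = [1.0, 0.0]
candidates = ["@entity1", "@entity2"]

plain = guided_attention(state, vecs, [0.0, 0.0, 0.0, 0.0])
guided = guided_attention(state, vecs, [0.0, 0.0, 3.0, 0.0])
answer_plain, _ = point_to_answer(plain, tokens, candidates)
answer_guided, _ = point_to_answer(guided, tokens, candidates)
```

Without a bias the repeated `@entity1` collects the most attention; with a bias favouring the third position, the attention shifts and `@entity2` is selected instead, which is the kind of redistribution step (4) describes.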

Description

Technical field

[0001] The invention relates to natural language processing, in particular to a machine reading comprehension method based on knowledge-guided attention.

Background

[0002] Natural language processing (NLP) is an interdisciplinary subject integrating linguistics and computer science. Reading comprehension is a fundamental task in natural language processing in which a system is typically asked to answer questions by inferring the answers from a given text or context. With the advent of the Internet age, the amount of information online has exploded, including text data in many languages and forms, such as news from Sina and the Daily Mail, articles from Baidu and Wikipedia, and answers from question-and-answer communities such as Zhihu and Quora. These corpora form the basis for constructing large-scale machine reading comprehension datasets. Teaching machines to read, process and understand human language is one of the core tasks of natu...

Application Information

IPC(8): G06F40/205, G06F40/289, G06F40/30, G06N3/04
CPC: G06N3/045
Inventors: 庄越挺, 浦世亮, 汤斯亮, 谭洁, 郝雷光, 吴飞
Owner: ZHEJIANG UNIV