Hypothetical semi-supervised learning based open domain question answering method

An open-domain question answering technique based on hypothetical semi-supervised learning. It addresses oversimplified article retrieval, information loss, and the lack of semantic analysis, and helps avoid omitting relevant information.

Active Publication Date: 2018-10-30
ZHEJIANG UNIV
Cites: 9 | Cited by: 13

AI Technical Summary

Problems solved by technology

However, such an algorithm uses a simple information retrieval system to extract documents and then passes the results indiscriminately to the reading comprehension step. The retrieval is too simple and performs no semantic analysis, so many sentences that are synonymous with the question lose the chance of being matched to an article containing the answer.



Embodiment Construction

[0036] Specific embodiments of the present invention will be described below in conjunction with the accompanying drawings, so that those skilled in the art can better understand the present invention.

[0037] Figure 1 shows the open-domain question answering method based on hypothetical semi-supervised learning, and Figure 2 is a structural diagram of the present invention. The concrete steps of the method are as follows:

[0038] S01: use information retrieval to extract articles related to the question q from the corpus. All documents are represented as bags of words weighted by term frequency-inverse document frequency (TF-IDF), and phrases are represented with a bigram (2-gram) model. An inverted index is used to match relevant articles, and the number of articles matched for each question is set to 5.
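The retrieval step S01 can be illustrated with a minimal sketch in plain Python. It is not the patent's implementation: the class name `TfidfRetriever`, the whitespace tokenizer, the smoothed IDF formula, and the length normalization are all assumptions chosen for illustration. Only the TF-IDF weighting, the bigram features, the inverted index, and the default of 5 articles per question come from the text.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Unigrams plus bigrams of a lower-cased, whitespace-tokenized text."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


class TfidfRetriever:
    """Hypothetical helper: TF-IDF bag of words over an inverted index."""

    def __init__(self, docs):
        n = len(docs)
        counts = {doc_id: Counter(features(text)) for doc_id, text in docs.items()}
        # Document frequency: in how many documents each term occurs.
        df = Counter(term for c in counts.values() for term in c)
        # Smoothed IDF (an assumed formula; the source does not specify one).
        self.idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
        # Inverted index: term -> {doc_id: TF-IDF weight}.
        self.index = defaultdict(dict)
        self.norm = {}
        for doc_id, c in counts.items():
            weights = {t: tf * self.idf[t] for t, tf in c.items()}
            self.norm[doc_id] = math.sqrt(sum(w * w for w in weights.values()))
            for t, w in weights.items():
                self.index[t][doc_id] = w

    def retrieve(self, question, k=5):
        """Return the ids of the k best-matching articles (k=5 as in S01)."""
        scores = defaultdict(float)
        for t, tf in Counter(features(question)).items():
            if t not in self.index:
                continue  # term never seen in the corpus
            q_weight = tf * self.idf[t]
            for doc_id, w in self.index[t].items():
                scores[doc_id] += q_weight * w
        # Normalize by document length so long articles are not favored.
        ranked = sorted(scores, key=lambda d: scores[d] / self.norm[d], reverse=True)
        return ranked[:k]


docs = {
    "a": "paris is the capital of france",
    "b": "berlin is the capital of germany",
    "c": "bananas are yellow",
}
retriever = TfidfRetriever(docs)
top = retriever.retrieve("capital of france")
```

Only documents that share at least one unigram or bigram with the question are ever scored, which is the practical benefit of the inverted index on a large corpus.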

[0039] S02, assuming that the given article P_g that comes with the question answering training set is the only p...


Abstract

The invention discloses an open-domain question answering method based on hypothetical semi-supervised learning. The method includes the following steps: (1) extracting articles related to a question from a corpus by using an information retrieval technology; (2) assuming that the articles that come with a given question answering training set are the unique positive labels and that the articles extracted from the corpus are negative labels; (3) constructing a deep learning model, learning features of the positive labels by training an article scorer, and training a reader to select correct answers from the articles; (4) ranking the articles by relevance, sending the first n articles with high relevance to the scorer, and re-labeling the articles according to their scores; (5) repeating steps (3) and (4) until the model converges; and (6) finishing the model training and applying the model to open-domain question answering. The method can greatly improve the quality of article extraction and the accuracy of answers in a conventional open-domain question answering system without relying on additional manual annotation or external knowledge.
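The labeling loop in steps (2) to (5) can be sketched as follows. This is a toy illustration under stated assumptions, not the patented model: the deep learning scorer is replaced by a bag-of-words centroid with cosine similarity, the reader is omitted, and the names `hypothetical_self_training`, `top_n`, `threshold`, and `max_rounds`, the fixed score threshold, and the "labels stop changing" convergence test are all hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector of a lower-cased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def train_scorer(articles, labels):
    """Toy stand-in for the article scorer: centroid of positive articles."""
    centroid = Counter()
    for art_id, text in articles.items():
        if labels[art_id] == 1:
            centroid.update(bow(text))
    return centroid


def hypothetical_self_training(gold_id, articles, top_n=2, threshold=0.5,
                               max_rounds=10):
    # Step (2): the article shipped with the training set is the only
    # positive label; every retrieved article starts as negative.
    labels = {art_id: int(art_id == gold_id) for art_id in articles}
    for _ in range(max_rounds):
        # Step (3): fit the scorer on the current (hypothetical) labels.
        scorer = train_scorer(articles, labels)
        scores = {a: cosine(scorer, bow(t)) for a, t in articles.items()}
        # Step (4): rank by relevance and re-label the first n articles.
        ranked = sorted(articles, key=lambda a: scores[a], reverse=True)
        new_labels = dict(labels)
        for art_id in ranked[:top_n]:
            # Assumption: the gold article always stays positive.
            new_labels[art_id] = int(art_id == gold_id
                                     or scores[art_id] >= threshold)
        # Step (5): stop once the labels no longer change (assumed criterion).
        if new_labels == labels:
            break
        labels = new_labels
    return labels


articles = {
    "gold": "paris is the capital of france",
    "r1": "the capital of france is paris",
    "r2": "bananas are yellow fruit",
}
final_labels = hypothetical_self_training("gold", articles)
```

In this toy run the retrieved article `r1` is promoted to a positive label because it scores like the gold article, while the unrelated `r2` stays negative. This is the effect the method relies on to recover answer-bearing articles that the initial labeling treated as negatives.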

Description

Technical field

[0001] The invention relates to the field of natural language processing, in particular to an open domain question answering method based on hypothetical semi-supervised learning.

Background technique

[0002] In recent years, open-domain question answering has become a very popular and difficult problem in natural language processing. In this task, given a corpus and a question, the algorithmic system returns an answer from the corpus. The biggest difference between it and machine reading comprehension is that it adds the process of finding articles from the corpus in addition to answering questions based on the article. Open domain question answering systems are widely used, because traditional search engines can only meet the needs of a small number of people and most of the returned answers are just web page links rather than a specific answer. A question answering system that can extract articles from a large corpus and give ideal answers can ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06N3/08
CPC: G06N3/08
Inventor: 潘博远, 蔡登, 姜兴华, 陈哲乾, 赵洲, 何晓飞
Owner: ZHEJIANG UNIV