Question generation method, device, and storage medium

A technology for question generation models, applied in the fields of question generation methods, devices, and computer-readable storage media. It addresses problems such as the inability to understand text deeply, the inability to judge answer boundaries, and the poor generalization of retrieval.

Active Publication Date: 2019-05-07
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0008] (1) General question answering systems based on retrieval or on a knowledge base cannot meet customized needs.
[0009] (2) Question answering by indexing document content has several drawbacks. First, not all of the content is question-and-answer content, so storing the entire text wastes storage space. Second, the accuracy of questions obtained this way is low, because a word hit does not mean that the matched content is the answer. In addition, the answer boundary cannot be determined, and a visual FAQ document cannot be produced.
Current technology can neither understand the text deeply nor generate good text.
[0010] (3) Question retrieval based on synonym matching or word matching generalizes poorly and has a low recall rate.



Examples


Embodiments

[0171] Approach 1: use a Seq2Seq (sequence-to-sequence) model, that is, the first encoder-decoder model mentioned above, to generate a question. The features used by this model include lexical and syntactic features, the start and end positions of the answer predicted by the sequence labeling model, and word features. The input to the model is a paragraph of the document to be processed, and the output is the question generated for that input. For example, if the text of the document to be processed is "Beijing is the capital of China.", the sequence labeling model can label "Beijing" as the answer. "Beijing" and "Beijing is the capital of China" are then fed together to the Seq2Seq model, which generates the question "Where is the capital of China?".
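The input preparation described in [0171] can be sketched roughly as follows. This is only an illustration under stated assumptions: the B/I/O tag scheme, the function names `answer_spans` and `build_model_input`, and the dictionary layout are hypothetical and are not taken from the patent, which only says that a sequence labeling model predicts the start and end positions of the answer.

```python
from typing import Dict, List, Tuple


def answer_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Convert B/I/O tags (an assumed scheme) into (start, end) token spans.

    The end index is exclusive. A stray "I" with no preceding "B" is ignored.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))  # close the previous span
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def build_model_input(tokens: List[str], span: Tuple[int, int]) -> Dict[str, list]:
    """Pair the labeled answer with its paragraph, plus a per-token position flag."""
    start, end = span
    return {
        "answer": tokens[start:end],
        "paragraph": tokens,
        # 1 where the token lies inside the answer span, 0 elsewhere
        "answer_position": [1 if start <= i < end else 0 for i in range(len(tokens))],
    }


# The example sentence from the text, with "Beijing" labeled as the answer.
tokens = ["Beijing", "is", "the", "capital", "of", "China", "."]
tags = ["B", "O", "O", "O", "O", "O", "O"]
spans = answer_spans(tags)
example = build_model_input(tokens, spans[0])
```

Here `example` holds the answer tokens, the full paragraph, and a position feature marking where the answer sits; a question generation model would consume something of this shape alongside its lexical and syntactic features.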

[0172] The Seq2Seq model, also known as the encoder-decoder model, is an important variant of the RNN model. The...
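As a rough illustration of the encoder-decoder idea, the sketch below runs a plain RNN encoder over an input sequence to obtain a context vector, then unrolls a plain RNN decoder from that vector. Everything here is an assumption made for illustration: the weights are random and untrained, the sizes are arbitrary, decoding is greedy, and none of the names come from the patent. The output tokens are therefore meaningless; only the data flow is of interest.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_step(w_in: Matrix, w_h: Matrix, x: Vector, h: Vector) -> Vector:
    """One vanilla RNN step: h' = tanh(W_in x + W_h h)."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_in[r], x))
            + sum(w * hi for w, hi in zip(w_h[r], h))
        )
        for r in range(len(h))
    ]


def encode(inputs: List[Vector], w_in: Matrix, w_h: Matrix, hidden: int) -> Vector:
    """Read the whole input sequence and return the final hidden state."""
    h = [0.0] * hidden
    for x in inputs:
        h = rnn_step(w_in, w_h, x, h)
    return h  # the context vector handed to the decoder


def decode(context: Vector, w_in: Matrix, w_h: Matrix, w_out: Matrix,
           steps: int, vocab: int) -> List[int]:
    """Greedily emit `steps` token ids, starting from the context vector."""
    h = context
    prev = [0.0] * vocab  # all-zero vector stands in for a start symbol
    output: List[int] = []
    for _ in range(steps):
        h = rnn_step(w_in, w_h, prev, h)
        scores = [sum(w * hi for w, hi in zip(row, h)) for row in w_out]
        token = max(range(vocab), key=scores.__getitem__)
        output.append(token)
        prev = [1.0 if i == token else 0.0 for i in range(vocab)]  # feed back
    return output


def one_hot(index: int, size: int) -> Vector:
    return [1.0 if i == index else 0.0 for i in range(size)]


VOCAB, HIDDEN = 5, 4
rng = random.Random(0)
enc_in, enc_h = rand_matrix(HIDDEN, VOCAB, rng), rand_matrix(HIDDEN, HIDDEN, rng)
dec_in, dec_h = rand_matrix(HIDDEN, VOCAB, rng), rand_matrix(HIDDEN, HIDDEN, rng)
dec_out = rand_matrix(VOCAB, HIDDEN, rng)

context = encode([one_hot(t, VOCAB) for t in [0, 3, 1]], enc_in, enc_h, HIDDEN)
generated = decode(context, dec_in, dec_h, dec_out, steps=3, vocab=VOCAB)
```

The input and output lengths are independent of each other, which is what makes this structure suitable for mapping a paragraph to a question.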



Abstract

Embodiments of the invention provide a question generation method, a device, and a computer-readable storage medium. The question generation method comprises: identifying the text type of a document to be processed according to its text structure; selecting a generation model corresponding to the text type, where the generation model comprises at least one of an explicit question generation model, a structured and semi-structured question generation model, and a natural language question generation model; and generating questions for the document to be processed using the selected generation model. By taking the characteristics of different text types into account, the embodiments select the most suitable generation model for the whole document or for each part of its text, which improves the accuracy of question generation.
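The three steps in the abstract (identify the text type, select a model, generate questions) amount to a dispatch on text type, which might be sketched as below. The type-detection heuristics, the type names, and all function names are hypothetical placeholders chosen for illustration; the patent does not specify them here. The structured and natural language generators are left as empty stubs where real models would be plugged in.

```python
from typing import Callable, Dict, List


def identify_text_type(text: str) -> str:
    """Guess the text type from surface structure (hypothetical heuristics)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if any(ln.endswith("?") for ln in lines):
        return "explicit"  # the text already contains explicit questions
    if any("|" in ln or "\t" in ln or ln.startswith(("-", "*", "#")) for ln in lines):
        return "structured"  # tables, lists, or headings
    return "natural_language"


def explicit_questions(text: str) -> List[str]:
    """Return the lines that are already phrased as questions."""
    return [ln.strip() for ln in text.splitlines() if ln.strip().endswith("?")]


def structured_questions(text: str) -> List[str]:
    """Stub: a structured / semi-structured generation model would go here."""
    return []


def natural_language_questions(text: str) -> List[str]:
    """Stub: a natural language generation model (e.g. Seq2Seq) would go here."""
    return []


# Mapping from text type to the generation model selected for it.
GENERATORS: Dict[str, Callable[[str], List[str]]] = {
    "explicit": explicit_questions,
    "structured": structured_questions,
    "natural_language": natural_language_questions,
}


def generate_questions(text: str) -> List[str]:
    """Identify the text type, select the matching model, and generate questions."""
    return GENERATORS[identify_text_type(text)](text)


faq_text = "How do I reset my password?\nClick 'Forgot password' on the login page."
questions = generate_questions(faq_text)
```

Because the abstract allows a model to be chosen for the whole document or for each part of it, the same `generate_questions` call could equally be applied per section rather than once per document.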

Description

Technical field

[0001] The present invention relates to the field of information technology, and in particular to a question generation method, a device, and a computer-readable storage medium.

Background

[0002] FAQ (Frequently Asked Questions, a question answering system) is currently the main means of providing online help on the web. It organizes likely frequently asked question-and-answer pairs in advance and publishes them on a web page to provide a consulting service for users.

[0003] FAQ implementation methods in the prior art mainly include the following types:

[0004] (1) General question answering systems: question answering services based on retrieval or on a knowledge base.

[0005] (2) Customized retrieval: building an index over document content that has been split into segments and word-segmented; or obtaining question-answer pairs through document structuring or manual screening.

[0006] (3) Question retrieval based on word matching or synonym matching.

[0007] The ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F16/33, G06F16/35, G06F17/27
Inventors: 孙兴武, 刘璟
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD