Question and answer corpus generation method and system

This technology, applied to question-and-answer corpus generation, addresses the slow query speed of knowledge graphs and the difficulty of obtaining paired corpus, and it improves both retrieval recall and retrieval efficiency.

Active Publication Date: 2020-04-17
AISPEECH CO LTD

AI Technical Summary

Problems solved by technology

[0007] The invention aims to solve at least the following problems in the prior art that arise when generating paired question-and-answer corpus: knowledge graph queries are slow, fuzzy search cannot be used, and paired dialogue question-and-answer corpus is difficult to obtain.

Examples

Embodiment approach

[0069] In one implementation, sequentially detecting each character of the regular expression includes:

[0070] Judging each character of the regular expression one by one with a recursive algorithm.

[0071] This embodiment shows that introducing regular expressions into the algorithm imposes certain constraints. Adjusting the regular expressions avoids these constraints and improves the stability of the method.
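The text does not say what is checked for each character, so the following is only one possible reading of [0070]: a minimal recursive matcher that inspects the regular expression one character at a time. The function names and the supported syntax (literals, `.`, and `*`) are assumptions made for illustration, not the patented procedure.

```python
def match_here(pattern: str, text: str) -> bool:
    """Recursively match `pattern` against the beginning of `text`.

    Assumed syntax: literal characters, '.' (any single character),
    and '*' (zero or more repetitions of the preceding character).
    """
    if not pattern:
        return True  # every pattern character has been judged
    if len(pattern) >= 2 and pattern[1] == "*":
        return match_star(pattern[0], pattern[2:], text)
    if text and pattern[0] in (".", text[0]):
        # current pattern character accepted; recurse on the remainder
        return match_here(pattern[1:], text[1:])
    return False


def match_star(ch: str, rest: str, text: str) -> bool:
    """Handle 'ch*': try zero repetitions first, then one more each time."""
    if match_here(rest, text):
        return True
    if text and ch in (".", text[0]):
        return match_star(ch, rest, text[1:])
    return False


def search(pattern: str, text: str) -> bool:
    """Return True if `pattern` matches anywhere in `text`."""
    return any(match_here(pattern, text[i:]) for i in range(len(text) + 1))
```

Each recursive call consumes exactly one pattern element, so the pattern is judged strictly left to right. In practice a library regex engine would replace this hand-written matcher.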

[0072] FIG. 2 is a schematic structural diagram of a question and answer corpus generation system provided by an embodiment of the present invention. The system can execute the question and answer corpus generation method described in any of the above embodiments and is configured in a terminal.

[0073] The question and answer corpus generation system provided in this embodiment includes: a corpus receiving program module 11, an information determination program module 12, a regular ...

Abstract

An embodiment of the invention provides a question and answer corpus generation method. The method comprises: receiving a corpus text; detecting the text quantity of the corpus text and, when the text quantity is smaller than a preset threshold, determining the entities and attributes of the corpus text for the knowledge graph; querying a regular expression that matches the corpus text based on the entities and attributes; determining a fuzzy statement of the corpus text based on the regular expression, inputting the fuzzy statement into the knowledge graph, and determining the text corresponding to the corpus text according to an inverted index; and performing corpus generation on the corpus text and the corresponding text through the regular expression to construct a plurality of paired question-answer dialogue corpora. An embodiment of the invention further provides a question and answer corpus generation system. Because fuzzy search is used in the knowledge graph, retrieval recall is improved. Because an inverted index is used in knowledge graph retrieval, retrieval efficiency is improved. As a result, a plurality of paired question-answer dialogue corpora can be generated from texts and text segments.
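The steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the triples, the `MAX_CHARS` threshold, and all function names are hypothetical, fuzzy matching is approximated by character-bigram overlap, and plain string templates stand in for the regular-expression-based generation step.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph of (entity, attribute, value) triples.
TRIPLES = [
    ("Acme Phone", "battery life", "12 hours"),
    ("Acme Phone", "screen size", "6 inches"),
    ("Acme Watch", "battery life", "3 days"),
]
MAX_CHARS = 40  # assumed stand-in for the "preset threshold" on text quantity


def bigrams(text):
    """Character bigrams of the lowercased text with spaces removed."""
    s = text.lower().replace(" ", "")
    return {s[i:i + 2] for i in range(len(s) - 1)} or {s}


def build_index(triples):
    """Inverted index: bigram -> ids of triples whose entity+attribute contain it."""
    index = defaultdict(set)
    for i, (entity, attribute, _value) in enumerate(triples):
        for gram in bigrams(entity + attribute):
            index[gram].add(i)
    return index


def fuzzy_lookup(query, index, triples):
    """Return the triple sharing the most bigrams with the query, or None."""
    scores = defaultdict(int)
    for gram in bigrams(query):
        for i in index.get(gram, ()):
            scores[i] += 1
    if not scores:
        return None
    best = max(scores, key=lambda i: (scores[i], -i))  # ties go to the lowest id
    return triples[best]


def make_pairs(triple):
    """Build several question-answer pairs from one triple (assumed templates)."""
    entity, attribute, value = triple
    answer = f"The {attribute} of {entity} is {value}."
    questions = [
        f"What is the {attribute} of {entity}?",
        f"Tell me the {attribute} of {entity}.",
    ]
    return [(question, answer) for question in questions]


def generate(text, index, triples):
    """Short texts are looked up in the graph and expanded into Q&A pairs."""
    if len(text) >= MAX_CHARS:
        return []  # the handling of longer texts is not sketched here
    triple = fuzzy_lookup(text, index, triples)
    return make_pairs(triple) if triple else []
```

With this sketch, `generate("acme fone battery", build_index(TRIPLES), TRIPLES)` still retrieves the "Acme Phone" battery-life triple despite the misspelling, because only index entries that share a bigram with the query are touched. That is the recall and efficiency argument the abstract makes for combining fuzzy search with an inverted index.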

Description

Technical field

[0001] The invention relates to the field of knowledge graph question answering, and in particular to a question and answer corpus generation method and system.

Background

[0002] The answer quality of a reading-comprehension question-answering language model depends on a large amount of high-quality paired question-answer corpus. Corpus generation methods are usually used to obtain such high-quality dialogue corpus.

[0003] In the process of realizing the present invention, the inventors found at least the following problems in the related art:

[0004] Existing corpus generation methods have difficulty generating paired corpus such as dialogue questions and answers, because the training text that is relatively easy to obtain does not come in pairs but as single sentences. It is difficult to generate a conversational question-answer paired corpus from such individual sentences:

[0005] 1. If you wa...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/31, G06F16/332, G06F16/36
CPC: G06F16/319, G06F16/3329, G06F16/367
Inventor: 许建伟
Owner: AISPEECH CO LTD