Question and answer pair extraction method and system based on BiLSTM-CRF model and storage medium

The technology concerns question-answer pair extraction models and is applied to computer-readable storage media and to the field of question-answer pair extraction. It addresses two problems: building a knowledge base by hand is time-consuming and labor-intensive, and manually written questions rarely achieve the diversity found in real dialogue data.

Status: Inactive
Publication Date: 2019-08-30
厦门快商通信息咨询有限公司
Cites: 4 | Cited by: 11

AI Technical Summary

Problems solved by technology

Existing question-and-answer knowledge bases are usually built manually, which is time-consuming and laborious, and manually written questions rarely capture the diversity of expression found in real dialogue data.

Examples

Embodiment 1

[0047] A question-answer pair extraction method based on the BiLSTM-CRF model, as shown in Figure 1, includes the following steps:

[0048] The dialogue data is labeled. Labeling takes the sentence as the smallest unit of a piece of dialogue data; that is, each sentence is treated as one element and is assigned one label;
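The sentence-level labeling described in [0048] can be pictured as one label per sentence. The sketch below is only an illustration: the tag names Q, A, and O, the helper `label_dialogue`, and the sample dialogue are assumptions, since the visible text does not list the actual label set.

```python
from typing import List, Sequence, Tuple

# Hypothetical tag set (assumption, not from the source):
# Q = question, A = answer, O = any other sentence.
TAGS = ("Q", "A", "O")


def label_dialogue(sentences: Sequence[str],
                   labels: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair every sentence of one dialogue with exactly one label."""
    if len(sentences) != len(labels):
        raise ValueError("each sentence needs exactly one label")
    unknown = [tag for tag in labels if tag not in TAGS]
    if unknown:
        raise ValueError(f"unknown labels: {unknown}")
    return list(zip(sentences, labels))


# Invented sample dialogue, labeled one sentence at a time.
dialogue = label_dialogue(
    ["Hello.", "What are your opening hours?", "We open at nine."],
    ["O", "Q", "A"],
)
```

Keeping the sentence, rather than the word, as the labeled element means the later sequence model sees a dialogue as a short sequence of sentence vectors.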

[0049] Model training is performed on the labeled, preprocessed dialogue data. The model training comprises a first model training and a second model training:

[0050] The first model training uses the PV-DM model to train sentence vectors for the dialogue data; the second model training feeds these sentence vectors into the BiLSTM-CRF model for training;

[0051] The BiLSTM-CRF model includes a BiLSTM layer and a CRF layer;

[0052] Each sentence of a piece of dialogue data is vectorized using a pre-trained sentence vector as the input of the BiLSTM layer; the BiLSTM layer is used to perform fe...
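The excerpt is cut off before it explains how the CRF layer turns the BiLSTM outputs into final labels. A common choice for a linear-chain CRF is Viterbi decoding, sketched below in plain Python. This is an assumption about the decoding step, not the patent's stated procedure; the function name `viterbi_decode`, the Q/A/O tag set, and all scores are hypothetical.

```python
from typing import Dict, List, Sequence


def viterbi_decode(emissions: Sequence[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   tags: Sequence[str]) -> List[str]:
    """Return the highest-scoring tag sequence for one dialogue.

    emissions[t][tag]      : score of `tag` for sentence t (in a real model
                             these would come from the BiLSTM layer).
    transitions[prev][cur] : score of moving from `prev` to `cur` (in a real
                             model these would be learned by the CRF layer).
    """
    if not emissions:
        return []
    # best[tag] = score of the best path that ends in `tag` so far.
    best = {tag: emissions[0][tag] for tag in tags}
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointers = {}, {}
        for cur in tags:
            # Pick the predecessor that maximizes the path score into `cur`.
            prev = max(tags, key=lambda p: best[p] + transitions[p][cur])
            new_best[cur] = best[prev] + transitions[prev][cur] + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Walk the backpointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

The transition scores are what let the CRF layer prefer label sequences that make sense as a whole, for example an answer following a question, even when a single sentence's own score is ambiguous.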

Embodiment 2

[0109] This embodiment of the present invention discloses a question-answer pair extraction system based on the BiLSTM-CRF model. The system includes a memory, a processor, and computer instructions that are stored in the memory and run on the processor. When the processor executes the computer instructions, the steps of the BiLSTM-CRF-based question-answer pair extraction method are carried out. That method is the one described in Embodiment 1, and its specific implementation is not repeated here.

Embodiment 3

[0111] This embodiment of the present invention discloses a computer-readable storage medium on which a computer program is stored. When a processor runs the computer program, the steps of the BiLSTM-CRF-based question-answer pair extraction method are carried out. That method is the one described in Embodiment 1 and is not repeated here.

[0112] The computer-readable storage medium may be, for example, flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (Pr...

Abstract

The invention discloses a question-answer pair extraction method and system based on a BiLSTM-CRF model, and a storage medium. The method comprises: labeling dialogue data; performing model training on the labeled, preprocessed dialogue data, where the model training comprises a first model training and a second model training; the first model training uses a PV-DM model to train sentence vectors for the dialogue data; the second model training imports the sentence vectors into a BiLSTM-CRF model for training; and importing the current dialogue data into the trained model and performing QA extraction on the prediction result. With this method, corresponding questions and answers can be matched accurately and quickly from a large number of disordered real dialogues, and high-quality question-answer pairs are extracted by the algorithm.
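The final step, QA extraction on the prediction result, can be illustrated as a pass over the predicted sentence labels. The sketch below is a minimal example under stated assumptions: the Q/A/O tags, the function `extract_qa_pairs`, and the pairing rule (an answer attaches to the most recent unanswered question) are all hypothetical, since the abstract does not spell out how predictions are turned into pairs.

```python
from typing import List, Optional, Sequence, Tuple


def extract_qa_pairs(sentences: Sequence[str],
                     tags: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair each predicted answer with the latest unanswered question.

    Assumed tags: "Q" = question, "A" = answer, anything else is ignored.
    """
    pairs: List[Tuple[str, str]] = []
    pending_question: Optional[str] = None
    for sentence, tag in zip(sentences, tags):
        if tag == "Q":
            # A newer question replaces one that never got an answer.
            pending_question = sentence
        elif tag == "A" and pending_question is not None:
            pairs.append((pending_question, sentence))
            pending_question = None
    return pairs


# Invented dialogue with invented predicted labels.
sentences = ["Hello.", "What are your opening hours?", "One moment.",
             "We open at nine.", "Thanks."]
predicted = ["O", "Q", "O", "A", "O"]
qa_pairs = extract_qa_pairs(sentences, predicted)
```

In this example the filler sentence between the question and its answer is skipped, which is the kind of disorder in real dialogues that the abstract refers to.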

Description

Technical field

[0001] The present invention relates to the field of question-and-answer dialogue, and in particular to a question-answer pair extraction method and system based on a BiLSTM-CRF model, and a computer-readable storage medium.

Background technique

[0002] Building a dialogue system often requires extracting a large number of highly specialized questions and answers from raw dialogue data, so extracting correctly matched question-answer pairs from the raw dialogue data is a necessary step in building a professional question-answering knowledge base. Existing question-and-answer knowledge bases are usually built manually, which is time-consuming and laborious, and manually written questions rarely capture the diversity of expression found in real dialogue data.

[0003] Sequence labeling is currently used. Put simply, given a sequence, each element in the sequence is given a mark, that is, a label. This is a relatively broad concept. Basic NLP ta...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F16/33, G06F17/27
CPC: G06F16/3329, G06F16/3344, G06F40/205
Inventors: 刘俊, 肖龙源, 蔡振华, 李稀敏, 刘晓葳, 谭玉坤
Owner: 厦门快商通信息咨询有限公司