Multi-round dialogue omission recovery method based on gated copying and masking

A recovery method using masking technology, applied in fields such as neural learning methods, instruments, and biological neural network models, that addresses problems such as the difficulty of mining semantic information, error propagation, and semantic deviation.

Pending Publication Date: 2021-02-26
中国科学院电子学研究所苏州研究院

AI Technical Summary

Problems solved by technology

However, existing omission recovery methods have several problems. For example, existing techniques model the semantic information of text in multi-round dialogue in a relatively simple way, yet short sentences in multi-round dialogue are more casual than standard text, so their semantic information is harder to mine. Existing techniques also decode with a sequence-to-sequence text generation scheme, which suffers from error propagation and semantic deviation: incorrect earlier generation results affect later predictions.



Embodiment Construction

[0064] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0065] As shown in Figure 1, the multi-round dialogue omission recovery method based on gated copying and masking includes the following steps:

[0066] Step 1, text acquisition

[0067] Obtain the original elliptical sentence and its context text. In a multi-round dialogue scenario, the original elliptical sentence is the elliptical sentence in the current dialogue turn that needs to be completed, and the context text is the set of dialogue sentences comprising the current turn and all previous turns.
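As a minimal sketch of how the inputs in Step 1 might be organized, the Python snippet below pairs the current elliptical sentence with a context that contains the current turn and all previous turns. The names `OmissionSample` and `build_sample`, as well as the example utterances, are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OmissionSample:
    """One input to the recovery method (hypothetical container)."""
    elliptical_sentence: str  # sentence of the current turn that needs completing
    context: List[str]        # current turn plus all previous turns, in order


def build_sample(turns: List[str]) -> OmissionSample:
    """Treat the last turn as the elliptical sentence and all turns as context."""
    if not turns:
        raise ValueError("at least one dialogue turn is required")
    return OmissionSample(elliptical_sentence=turns[-1], context=list(turns))


# Invented utterances, for illustration only.
turns = [
    "Can you recommend a Korean restaurant?",
    "LittleSeoul is a popular choice.",
    "What is the address?",
]
sample = build_sample(turns)
```

Here `sample.elliptical_sentence` is the last turn, and `sample.context` keeps every turn, including the current one, as the paragraph above describes.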

[0068] Step 2, text preprocessing

[0069] First, denoi...


Abstract

The invention provides a multi-round dialogue omission recovery method based on gated copying and masking. The method comprises: obtaining the original elliptical sentence and its context text; segmenting the text with a word segmentation tool and mapping the word sequence to a numeric ID sequence with a dictionary; representing words with a pre-trained word vector file; semantically encoding the elliptical-sentence word vector sequence and the context word vector sequence with a gated encoder that uses a gating mechanism to fuse multi-head self-attention information with Bi-GRU information; calculating soft mask features of the elliptical sentence with a soft mask mechanism; calculating a probability distribution over the vocabulary with a mask decoder; calculating scores for the context words and normalizing them with a Softmax function to obtain the context probability distribution; and adding the vocabulary probability distribution and the context probability distribution through a gating unit to obtain the final omitted-word probability distribution, from which the filling content for the elliptical sentence is selected. The method improves the accuracy of omission recovery results.
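The final step of the abstract, combining the vocabulary distribution with the context (copy) distribution through a gating unit, can be illustrated with a small pure-Python sketch. It makes several assumptions that the text does not confirm: the gate is a scalar in [0, 1], the combination is a convex mixture, and every context word has an ID in the vocabulary. The function names and the toy numbers are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable Softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_copy_distribution(
    vocab_logits: List[float],     # decoder scores, one per vocabulary word
    context_scores: List[float],   # copy scores, one per context word
    context_token_ids: List[int],  # vocabulary ID of each context word
    gate: float,                   # assumed scalar weight on generation
) -> List[float]:
    """Mix the vocabulary and context distributions (assumed convex form)."""
    p_vocab = softmax(vocab_logits)
    p_context = softmax(context_scores)
    final = [gate * p for p in p_vocab]
    # Scatter-add copy probabilities onto the vocabulary entries they refer to.
    for token_id, p in zip(context_token_ids, p_context):
        final[token_id] += (1.0 - gate) * p
    return final


# Toy vocabulary and scores, invented for illustration.
vocab = ["<pad>", "the", "restaurant", "LittleSeoul", "address"]
dist = gated_copy_distribution(
    vocab_logits=[0.0, 1.0, 0.5, 0.2, 0.3],
    context_scores=[2.0, 0.1],
    context_token_ids=[3, 2],
    gate=0.3,
)
best_word = vocab[max(range(len(dist)), key=dist.__getitem__)]
```

Because both inputs are normalized and the weights sum to one, `dist` is itself a probability distribution. With a low gate value the copy scores dominate, so an entity that appears in the context can be selected even when the decoder alone rates it poorly.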

Description

Technical field

[0001] The invention relates to the field of natural language processing, and in particular to a multi-round dialogue omission recovery method based on gated copying and masking.

Background

[0002] To avoid repetition in spoken expression, speakers omit parts of sentences very frequently in multi-round dialogue. Humans can easily infer intent and recover the omitted content from the dialogue scene and the dialogue history, but this is very difficult for current dialogue models, especially in task-oriented multi-round dialogue. Table 1 gives an example of a multi-turn dialogue about restaurant recommendations. In the example, both Human2 and Human3 omit the restaurant name LittleSeoul. The example shows that, unlike multi-round small-talk dialogue, the content omitted in task-oriented multi-round dialogue is mostly entity information, such as LittleSeoul in the example. These entity conten...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/289, G06F40/30, G06F40/216, G06N3/04, G06N3/08
CPC: G06F40/289, G06F40/30, G06F40/216, G06N3/08, G06N3/047, G06N3/048, G06N3/045
Inventor: 郑杰包兴王迪费涛段贺顾爽
Owner: 中国科学院电子学研究所苏州研究院