Multi-round dialogue omission recovery method based on gated copying and masking

The recovery method and masking technique, applied to neural learning methods, instruments, biological neural network models, and similar fields, address problems such as difficulty in mining semantic information, error propagation, and semantic deviation.
CN112417864A · Pending · Publication Date: 2021-02-26 · 中国科学院电子学研究所苏州研究院 (Suzhou Research Institute, Institute of Electronics, Chinese Academy of Sciences)

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: 中国科学院电子学研究所苏州研究院 (Suzhou Research Institute, Institute of Electronics, Chinese Academy of Sciences)
Publication Date: 2021-02-26

Abstract

The invention provides a multi-round dialogue omission recovery method based on gated copying and masking. The method comprises: obtaining the original sentence containing an omission and its context text; segmenting the text with a word segmentation tool and mapping each word sequence to a sequence of integer indices using a dictionary; representing the words with a pre-trained word vector file; semantically encoding the word vector sequence of the omitted sentence and that of the context with a gated encoder, which uses a gating mechanism to fuse multi-head self-attention information with a Bi-GRU; calculating soft mask features of the omitted sentence based on a soft mask mechanism; calculating a probability distribution over the vocabulary with a mask decoder; calculating scores for the context words and normalizing them with a Softmax function to obtain a context probability distribution; and combining the vocabulary distribution and the context distribution through a gating unit to obtain the final probability distribution of the omitted word, from which the content that fills the omission is selected. The method improves the accuracy of omission recovery results.
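The last three steps of the abstract (dictionary lookup, Softmax over context scores, and the gated combination of the two distributions) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the toy dictionary, the scalar `gate` value, and the convex-combination form `gate * P_vocab + (1 - gate) * P_context` are all assumptions made for illustration. In the method itself the gate, the vocabulary logits, and the context scores would come from the encoder and mask decoder, which are not modeled here.

```python
import math


def softmax(scores):
    """Normalize raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def words_to_ids(words, dictionary, unk_id=0):
    """Map a segmented word sequence to integer indices via a dictionary."""
    return [dictionary.get(w, unk_id) for w in words]


def gated_copy_distribution(vocab_logits, context_scores, context_ids, gate):
    """Combine a vocabulary distribution with a context (copy) distribution.

    Assumption: the gating unit yields a scalar gate in [0, 1] and the two
    distributions are mixed as gate * P_vocab + (1 - gate) * P_context.
    Context probabilities are added onto the vocabulary slot of each context
    word, so repeated context words accumulate probability mass.
    """
    p_vocab = softmax(vocab_logits)
    p_context = softmax(context_scores)
    final = [gate * p for p in p_vocab]
    for pos, word_id in enumerate(context_ids):
        final[word_id] += (1.0 - gate) * p_context[pos]
    return final


# Hypothetical toy data, loosely following the restaurant example.
dictionary = {"<unk>": 0, "book": 1, "a": 2, "table": 3, "at": 4, "LittleSeoul": 5}
id_to_word = {i: w for w, i in dictionary.items()}

context_ids = words_to_ids(["LittleSeoul", "a", "table"], dictionary)
vocab_logits = [0.0] * len(dictionary)   # uninformative vocabulary scores
context_scores = [2.0, 0.0, 0.0]         # context scorer favors the entity
final = gated_copy_distribution(vocab_logits, context_scores, context_ids, gate=0.5)

best_id = max(range(len(final)), key=final.__getitem__)
print(id_to_word[best_id], round(sum(final), 6))
```

Scattering the context probabilities into vocabulary slots keeps the result a single distribution over word indices, so selecting the filling content reduces to an argmax. Context words missing from the dictionary fall back to the `<unk>` index in this sketch; how the actual method treats them is not stated in the abstract.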

Description

Technical Field

[0001] The invention relates to the field of natural language processing, and in particular to a multi-round dialogue omission recovery method based on gated copying and masking.

Background Art

[0002] Because speakers habitually avoid repeating themselves, omission is very frequent in multi-round dialogue. Humans can easily infer intent and recover the omitted content from the dialogue scene and the dialogue history, but this remains very difficult for current dialogue models, especially in task-oriented multi-round dialogue. Table 1 gives an example of a multi-turn dialogue about restaurant recommendations, in which both Human2 and Human3 omit the restaurant name LittleSeoul. The example shows that, unlike multi-round small-talk dialogue, the content omitted in task-oriented multi-round dialogue is more often entity information, such as LittleSeoul in the example. These entity conten...

Claims
