Dialogue generation method based on proximal policy optimization ("near-end strategy optimization") and adversarial learning

A reinforcement learning and optimization technique, applied in biological neural network models, special data processing applications, instruments, etc. It addresses problems such as low utilization of rewards, reduced training efficiency, and insufficient sample complexity, with the effect of improving reward utilization and training efficiency.

Active Publication Date: 2020-03-06
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] When a generative adversarial network is used for dialogue generation, the discriminative model can only evaluate the quality of an entire sentence, that is, it yields a single reward for the whole sentence. To make better use of this reward when training the generative model, a reward for each intermediate (partial) dialogue is also needed. Monte Carlo sampling is a commonly used method to ge...
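
The idea of obtaining intermediate rewards by Monte Carlo sampling can be illustrated with a minimal sketch. It assumes the common rollout scheme: each prefix of the generated sentence is completed several times by the generator, every completion is scored by the discriminator, and the scores are averaged. The function `monte_carlo_word_rewards` and the toy generator and discriminator are hypothetical stand-ins for illustration, not the models described in the patent.

```python
import random
from typing import Callable, List, Sequence


def monte_carlo_word_rewards(
    sentence: Sequence[str],
    sample_next: Callable[[List[str], random.Random], str],
    discriminator: Callable[[Sequence[str]], float],
    n_rollouts: int = 8,
    seed: int = 0,
) -> List[float]:
    """Estimate one reward per word by completing each prefix n_rollouts times."""
    rng = random.Random(seed)
    length = len(sentence)
    rewards = []
    for t in range(1, length + 1):
        prefix = list(sentence[:t])
        if t == length:
            # The full sentence needs no rollout: score it directly.
            rewards.append(discriminator(prefix))
            continue
        total = 0.0
        for _ in range(n_rollouts):
            completed = list(prefix)
            # Let the generator finish the sentence from this prefix.
            while len(completed) < length:
                completed.append(sample_next(completed, rng))
            total += discriminator(completed)
        rewards.append(total / n_rollouts)
    return rewards


# Toy stand-ins (assumptions): a uniform random generator and a discriminator
# that returns the fraction of words equal to "good".
VOCAB = ["good", "bad"]


def toy_sample_next(prefix: List[str], rng: random.Random) -> str:
    return rng.choice(VOCAB)


def toy_discriminator(tokens: Sequence[str]) -> float:
    return sum(tok == "good" for tok in tokens) / len(tokens)


rewards = monte_carlo_word_rewards(
    ["good", "good", "bad"], toy_sample_next, toy_discriminator
)
print(rewards)
```

In this sketch a sentence of length T costs (T - 1) × `n_rollouts` generator completions and discriminator calls, which shows why rollout-based rewards are expensive to compute.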



Examples


Detailed Description of the Embodiments

[0038] In order to have a clearer understanding of the model structure, purpose, and effects of the present invention, specific implementations of the present invention will now be described with reference to the accompanying drawings.

[0039] Figure 1 is the flow chart of the method of the present invention:

[0040] The first step: pre-training the generative model.

[0041] The generative model uses an encoder-decoder architecture with an attention mechanism. Both the encoder and the decoder are recurrent neural networks. The encoder encodes the input dialogue into a vector representation, and the attention mechanism estimates how strongly each word of the input dialogue influences the word about to be generated during decoding. The decoder then generates the output conditioned on this information.
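
The role of the attention mechanism can be sketched in a few lines. The source does not state which scoring function is used, so this sketch assumes simple dot-product scoring between the current decoder state and each encoder state; the function name `attention` and the example vectors are illustrative only.

```python
import math
from typing import List, Sequence, Tuple


def attention(
    decoder_state: Sequence[float],
    encoder_states: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Return (weights, context) for one decoding step.

    Assumption: dot-product scoring; the patent text does not specify it.
    """
    # One score per input word: similarity between decoder and encoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states
    ]
    # Numerically stable softmax turns the scores into weights summing to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [x / norm for x in exps]
    # The context vector is the weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Three input words encoded as 2-dimensional states (made-up values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention([2.0, 0.0], encoder_states)
print(weights, context)
```

Each weight can be read as the influence of one input word on the next generated word; the decoder would consume `context` together with its own state when predicting that word.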

[0042] The objective of the generative model is to maximize the probability that each output matches the true reply:

[0043]

[0044] In formula (1), θ represents...
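
Formula (1) is not reproduced in this text. As a rough illustration of the stated objective only, the sketch below assumes the standard maximum-likelihood form: the sum of the log-probabilities that the model assigns to each word of the true reply. The function `sequence_log_likelihood` and the probability tables are hypothetical and are not taken from the patent.

```python
import math
from typing import Dict, Sequence


def sequence_log_likelihood(
    step_probs: Sequence[Dict[str, float]],
    target: Sequence[str],
) -> float:
    """Sum of log p(target word) over decoding steps.

    step_probs[t] maps each vocabulary word to the model's probability of
    emitting it at step t, given the input and the previous true words.
    """
    if len(step_probs) != len(target):
        raise ValueError("one probability table is needed per target word")
    return sum(math.log(probs[tok]) for probs, tok in zip(step_probs, target))


# Made-up model outputs for a two-word reply.
step_probs = [
    {"hi": 0.5, "no": 0.5},
    {"there": 0.25, "no": 0.75},
]
log_likelihood = sequence_log_likelihood(step_probs, ["hi", "there"])
print(log_likelihood)
```

Pre-training under this assumption would adjust the model parameters to increase this value over the training dialogues.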


Abstract

The invention relates to a dialogue generation method based on proximal policy optimization (PPO, "near-end strategy optimization") and adversarial learning, and belongs to the field of computer natural language processing. The method first pre-trains the generative model and the discriminative model of a generative adversarial network. It then uses Monte Carlo sampling to compute a reward for each word in the generated sentence, where the size of the reward value represents the quality of the generated word. Next, it treats the training of the generative adversarial network as a reinforcement learning process and trains the network with the proximal policy optimization algorithm, so that the rewards obtained from the discriminative model guide the generative model, and the dialogues produced by the generative model guide the training of the discriminative model. Finally, it trains the generative model with teacher forcing ("forced guidance"). The method improves training efficiency by adaptively controlling multiple iterations of the generative model, improves sample complexity through the proximal policy optimization algorithm, and further improves the quality of the generated dialogue, so that dialogue closer to human conversation can be generated.
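
To make the proximal policy optimization step more concrete, here is a minimal sketch of the widely used clipped surrogate objective. Whether the patent uses this clipped variant or another PPO variant is not stated in this text, so the choice is an assumption; the function name, the clipping range `clip_eps`, and the example numbers are illustrative. In the dialogue setting, each entry would correspond to one generated word, with the advantage derived from its per-word reward.

```python
import math
from typing import Sequence


def ppo_clipped_objective(
    new_logps: Sequence[float],
    old_logps: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,
) -> float:
    """Mean clipped surrogate objective (to be maximized).

    Assumption: the standard clipped PPO objective; not confirmed by the source.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the updated and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the two estimates.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# Two generated words: the first became more likely, the second less likely.
old_logps = [math.log(0.5), math.log(0.5)]
new_logps = [math.log(0.75), math.log(0.25)]
advantages = [1.0, -1.0]
objective = ppo_clipped_objective(new_logps, old_logps, advantages)
print(objective)
```

Because the clipping bounds how far one update can move the policy, the same batch of sampled dialogues and rewards can be reused for several gradient steps, which is one way multiple iterations of the generative model per batch can be supported.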

Description

Technical field

[0001] The invention relates to a dialogue generation method based on proximal policy optimization and adversarial learning, and belongs to the field of computer natural language processing.

Background technique

[0002] Dialogue generation is one of the key research directions of natural language processing and is the main technology for training chatbots. Chatbots such as Microsoft Xiaoice and Xiao Ai have gradually become part of daily life, and generating dialogue closer to human conversation can improve the user experience of such software. The first breakthrough in dialogue generation was the application of the sequence-to-sequence model with an attention mechanism, but this approach lacks good dialogue evaluation metrics, which affects the quality of the generated dialogue. The generative adversarial network can use the discriminative model to evaluate the quality of th...


Application Information

IPC(8): G06F16/332, G06N3/04
CPC: G06F16/3329, G06N3/045
Inventors: 游进国, 蔡钺
Owner: KUNMING UNIV OF SCI & TECH