Task-oriented dialogue strategy generation method

A task-oriented dialogue technology, applied in neural learning methods, biological neural network models, instruments, etc., that addresses problems such as model collapse, falling into a local optimum, and non-policy-gradient methods failing to benefit from learned reward functions, and that has wide application value.

Pending Publication Date: 2021-06-11
网经科技(苏州)有限公司
Cites: 0; Cited by: 1

AI Technical Summary

Problems solved by technology

The existing approach is limited to policy-gradient-based algorithms that alternately update the dialogue policy and the reward model; non-policy-gradient methods cannot benefit from the self-learned reward function.
In addition, alternating updates of the dialogue policy and the reward model can easily fall into a local optimum or lead to model collapse.



Examples


Detailed Description of the Embodiments

[0054] In order to have a clearer understanding of the technical features, purposes and effects of the present invention, specific implementations are now described in detail.

[0055] As shown in Figure 1, the specific steps of the task-oriented dialogue strategy generation method are as follows:

[0056] S101) Establish a dialogue state tracker, and determine the dialogue state space, the action space, and their formal representations;

[0057] The dialogue state tracker records the slot-filling status of the dialogue, including the inform slots given by the user and the request slots representing the user's requests. Each slot in each domain maintains and updates a confidence vector;
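The tracker described in [0057] can be pictured with a small sketch. This is only an illustration under stated assumptions, not the patent's implementation: the class names `SlotBelief` and `DialogueStateTracker`, the domain-to-slot-to-values ontology layout, and the interpolation-style update rule with its `weight` parameter are all hypothetical, since the source does not say how the confidence vector is updated.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SlotBelief:
    """Confidence vector over the candidate values of one slot (assumed layout)."""
    values: List[str]
    confidence: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.confidence:
            # Start from a uniform belief over the candidate values.
            n = len(self.values)
            self.confidence = [1.0 / n] * n

    def update(self, observed: str, weight: float = 0.8) -> None:
        """Shift probability mass toward the observed value (assumed rule)."""
        idx = self.values.index(observed)
        self.confidence = [(1.0 - weight) * c for c in self.confidence]
        self.confidence[idx] += weight
        total = sum(self.confidence)
        self.confidence = [c / total for c in self.confidence]

    def best(self) -> str:
        """Return the candidate value with the highest confidence."""
        top = max(range(len(self.values)), key=self.confidence.__getitem__)
        return self.values[top]


class DialogueStateTracker:
    """Keeps one belief per (domain, slot) plus the set of requested slots."""

    def __init__(self, ontology: Dict[str, Dict[str, List[str]]]) -> None:
        # ontology maps domain -> slot -> candidate values (hypothetical format).
        self.inform: Dict[Tuple[str, str], SlotBelief] = {
            (domain, slot): SlotBelief(list(values))
            for domain, slots in ontology.items()
            for slot, values in slots.items()
        }
        self.request: Set[Tuple[str, str]] = set()

    def observe_inform(self, domain: str, slot: str, value: str) -> None:
        """Record a value the user gave for an inform slot."""
        self.inform[(domain, slot)].update(value)

    def observe_request(self, domain: str, slot: str) -> None:
        """Record a slot the user asked about."""
        self.request.add((domain, slot))


if __name__ == "__main__":
    tracker = DialogueStateTracker({"hotel": {"area": ["north", "south", "centre"]}})
    tracker.observe_inform("hotel", "area", "south")
    tracker.observe_request("hotel", "phone")
    print(tracker.inform[("hotel", "area")].best(), sorted(tracker.request))
```

Keeping a full confidence vector per slot, rather than a single chosen value, lets later turns revise an earlier belief instead of overwriting it.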

[0058] At each time step t in the dialogue, the information collected by the dialogue state tracker forms a structured representation, the dialogue state S_t, which is a high-dimensional binary vector whose content includes the following:

[0059] 1) The embedded ...


Abstract

The invention relates to a task-oriented dialogue strategy generation method comprising the following steps: establishing a dialogue state tracker and determining the dialogue state space, the action space, and their formal representations; simulating dialogue states with a variational autoencoder; simulating dialogue actions with a multi-layer perceptron and Gumbel-Softmax; adversarially training the simulated-sample generator and a discriminator; and finally training a dialogue strategy with a reinforcement learning method. First, the simulated-sample generator is used to learn the reward function, so the loss from the discriminator can be fed back directly to the generator for optimization. Second, the trained discriminator serves as the dialogue reward in the reinforcement learning process and guides dialogue strategy learning, so the dialogue strategy can be updated with any reinforcement learning algorithm. By distinguishing human-generated dialogues from machine-generated ones, the method can infer the common information contained in high-quality human dialogues and then use that learned information to guide dialogue strategy learning in a new domain through transfer learning.
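The abstract names Gumbel-Softmax as the mechanism for simulating discrete dialogue actions. The sketch below shows the standard Gumbel-Softmax relaxation in plain Python. It assumes the `logits` would come from the multi-layer perceptron mentioned above; the function names, the temperature parameter `tau`, and the seeded random generator are illustrative choices, not details taken from the patent.

```python
import math
import random
from typing import List, Optional


def gumbel_softmax(
    logits: List[float],
    tau: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a relaxed one-hot sample over actions.

    Adds Gumbel(0, 1) noise to each logit and applies a softmax with
    temperature tau. Lower tau pushes the output closer to one-hot.
    """
    rng = rng or random.Random()
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(
    logits: List[float],
    tau: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Discretize the relaxed sample by taking its argmax."""
    probs = gumbel_softmax(logits, tau, rng)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical logits for three dialogue actions.
    print(gumbel_softmax([2.0, 0.5, -1.0], tau=0.5, rng=rng))
    print(sample_action([2.0, 0.5, -1.0], tau=0.5, rng=rng))
```

In an autodiff framework the relaxed sample is differentiable with respect to the logits, which is what allows the discriminator's loss to be fed back to the generator even though dialogue actions are discrete.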

Description

Technical field

[0001] The invention relates to a method for generating a task-oriented dialogue strategy and belongs to the technical field of natural language processing.

Background

[0002] A task-oriented dialogue system is designed to help users complete specific tasks, such as booking a hotel or buying a movie ticket. Such a system requires a dialogue strategy in order to select the most appropriate dialogue action in each dialogue round according to the context of the current dialogue.

[0003] The development of reinforcement learning in robotics and other fields has brought new inspiration to dialogue strategy learning. Once the state space and action space are defined, the goal of a task-oriented dialogue system is to maximize positive feedback from users. The dialogue policy learning method based on reinforcement learning is suitable for training with user simulators instead of real people in order to...
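The reinforcement learning framing above (fixed state and action spaces, a policy trained to maximize feedback) can be illustrated with a single tabular Q-learning update, a value-based method that does not use policy gradients. This is a minimal sketch under stated assumptions, not the method claimed in the patent: `q_update`, `RewardFn`, and the hyperparameters `alpha` and `gamma` are hypothetical names, and `reward_fn` is a placeholder for whatever supplies the reward, which in the abstract is the trained discriminator.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Hashable, Tuple

# Placeholder reward signature: (state, action) -> scalar reward.
RewardFn = Callable[[Hashable, int], float]
QTable = DefaultDict[Tuple[Hashable, int], float]


def q_update(
    q: QTable,
    state: Hashable,
    action: int,
    next_state: Hashable,
    done: bool,
    reward_fn: RewardFn,
    n_actions: int,
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """Apply one Q-learning update and return the new Q(state, action)."""
    reward = reward_fn(state, action)
    # Bootstrap from the best next action unless the dialogue has ended.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q: QTable = defaultdict(float)

    def toy_reward(state: Hashable, action: int) -> float:
        # Stand-in reward: prefers action 1 in every state.
        return 1.0 if action == 1 else 0.0

    print(q_update(q, "s0", 1, "s1", True, toy_reward, n_actions=2))
```

Because the reward enters only through `reward_fn`, the same update works with a hand-written reward or a learned one.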


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/08, G06K9/62
CPC: G06N3/08, G06F18/2415
Inventor: 孟亚磊, 刘继明, 金宁, 陈浮, 赵经纬
Owner: 网经科技(苏州)有限公司