
Online Realization Method of Dialogue Policy Based on Multi-task Learning

A multi-task learning and dialogue strategy technology, applied in speech analysis, speech recognition, instruments, etc. It addresses the problems of an unstable learning rate, rules that are difficult to extend, and labor-intensive design, and achieves improved maintainability, a stable training process, and strong modeling ability.

Active Publication Date: 2020-09-01
AISPEECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] The present invention addresses the shortcomings of the prior art: it is labor-intensive, its hand-designed rules are difficult to extend, it does not apply well to domains with complex information structures, its initial training process is unstable, and its learning rate is difficult to guarantee. The invention proposes an online implementation method for a dialogue strategy based on multi-task learning. The method adopts a reinforcement learning framework and optimizes the dialogue strategy through online learning. It does not require rules and strategies to be designed manually for each domain, and it can adapt to domain information structures of different complexity and to data of different scales. To improve the stability of the training process, the invention decomposes the original task of optimizing a single cumulative reward value and uses multi-task learning to optimize the resulting tasks simultaneously, which learns a better network structure and reduces the variance of the training process.



Examples


Embodiment Construction

[0032] As shown in Figure 1, this embodiment proceeds through the following steps:

[0033] Step 101, acquire the corpus of the man-machine dialogue in real time from the online dialogue system.

[0034] In this embodiment, the online spoken dialogue system acquires the man-machine dialogue corpus in real time through the process shown in Figure 2. A complete dialogue flow includes the following steps:

[0035] Step 201: Speech recognition, converting the user's voice into a text format;

[0036] Step 202: Semantic understanding, parsing the text of the user's speech into semantics in the form of "slot-value pairs";

[0037] Step 203: dialogue state tracking, updating current user state according to current information and historical information;

[0038] Step 204: Reply action generation, taking the user's current state and user action as input and generating a system reply action according to the dialogue strategy;

[0039] Step 205: generating natural lan...
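Steps 201 to 204 above can be sketched as a small pipeline. This is only an illustrative outline, not the patent's implementation: the function names, the toy slot vocabulary, and the hand-written rule standing in for the learned dialogue strategy are all assumptions. Step 205 is truncated in the source and is not sketched here.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


def recognize_speech(audio: str) -> str:
    """Step 201 stand-in: a real system would run a speech recognizer.
    In this sketch the 'audio' is already a transcript string."""
    return audio.lower().strip()


def understand(text: str) -> Dict[str, str]:
    """Step 202 stand-in: parse text into slot-value pairs by keyword lookup."""
    vocabulary = {  # hypothetical slots and values
        "food": ["chinese", "italian"],
        "area": ["north", "south"],
    }
    words = text.split()
    return {
        slot: value
        for slot, values in vocabulary.items()
        for value in values
        if value in words
    }


@dataclass
class DialogueState:
    slots: Dict[str, str] = field(default_factory=dict)
    turns: int = 0


def track_state(state: DialogueState, semantics: Dict[str, str]) -> DialogueState:
    """Step 203: combine the current semantics with the dialogue history."""
    return DialogueState(slots={**state.slots, **semantics}, turns=state.turns + 1)


def choose_action(state: DialogueState, user_action: Dict[str, str]) -> str:
    """Step 204 stand-in: a fixed rule in place of the learned strategy.
    The learned strategy would also condition on user_action; this rule ignores it."""
    for slot in ("food", "area"):
        if slot not in state.slots:
            return f"request({slot})"
    return "inform(restaurant)"


def run_turn(state: DialogueState, audio: str) -> Tuple[DialogueState, str]:
    """Run one user turn through steps 201 to 204."""
    text = recognize_speech(audio)
    semantics = understand(text)
    state = track_state(state, semantics)
    return state, choose_action(state, semantics)
```

Calling `run_turn(DialogueState(), "I want Chinese food")` fills the `food` slot and asks for the missing `area` slot; a second turn that supplies the area leads to an inform action.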


Abstract

The invention discloses an online implementation method for a dialogue strategy based on multi-task learning. The method acquires the corpus of a man-machine dialogue in real time, extracts the current user state features and user action features, and constructs the training input from them. It then splits the single accumulated reward value used in dialogue strategy learning into a dialogue round number reward value and a dialogue success reward value, which serve as training annotations, and optimizes two different value models simultaneously through multi-task learning during online training. Finally, the two reward values are merged and the dialogue strategy is updated. The method adopts a reinforcement learning framework and optimizes the dialogue strategy through online learning, so rules and strategies do not need to be designed manually for each domain, and the method can adapt to domain information structures of different complexity and to data of different scales. Because the original task of optimizing a single accumulated reward value is split and the resulting tasks are optimized simultaneously with multi-task learning, a better network structure is learned and the variance of the training process is lowered.
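The reward split and joint optimization described above might look roughly like the following sketch. It is not the patent's network. The reward constants, the tiny linear model with one shared layer and two heads, the squared-error loss, and the additive merge of the two value estimates are all assumptions chosen to keep the example short and dependency-free.

```python
import random
from typing import List, Tuple

TURN_PENALTY = -0.25   # hypothetical cost per dialogue round
SUCCESS_BONUS = 1.0    # hypothetical reward for a successful dialogue


def split_reward(num_turns: int, success: bool) -> Tuple[float, float]:
    """Split one accumulated reward into a round-number part and a success part."""
    return TURN_PENALTY * num_turns, (SUCCESS_BONUS if success else 0.0)


class TwoHeadValueModel:
    """Shared linear layer feeding two linear value heads (multi-task)."""

    def __init__(self, n_inputs: int, n_hidden: int = 4, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.shared = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                       for _ in range(n_hidden)]
        self.turn_head = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.success_head = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]

    def _hidden(self, x: List[float]) -> List[float]:
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.shared]

    def predict(self, x: List[float]) -> Tuple[float, float]:
        h = self._hidden(x)
        q_turn = sum(u * hi for u, hi in zip(self.turn_head, h))
        q_success = sum(v * hi for v, hi in zip(self.success_head, h))
        return q_turn, q_success

    def merged_value(self, x: List[float]) -> float:
        # Assumption: the two value estimates are merged by simple addition.
        q_turn, q_success = self.predict(x)
        return q_turn + q_success

    def train_step(self, x: List[float], turn_target: float,
                   success_target: float, lr: float = 0.05) -> float:
        """One gradient step on the summed squared errors of both heads."""
        h = self._hidden(x)
        q_turn, q_success = self.predict(x)
        e_turn = q_turn - turn_target
        e_success = q_success - success_target
        # Both task errors flow back into the shared layer.
        grad_h = [e_turn * u + e_success * v
                  for u, v in zip(self.turn_head, self.success_head)]
        self.turn_head = [u - lr * e_turn * hi
                          for u, hi in zip(self.turn_head, h)]
        self.success_head = [v - lr * e_success * hi
                             for v, hi in zip(self.success_head, h)]
        self.shared = [[w - lr * g * xj for w, xj in zip(row, x)]
                       for row, g in zip(self.shared, grad_h)]
        return 0.5 * (e_turn ** 2 + e_success ** 2)
```

Here `x` stands for the training input built from the user state and user action features. A dialogue that succeeds after three rounds would give the targets `split_reward(3, True)`, and `merged_value(x)` is the quantity a strategy update would consume under the additive-merge assumption.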

Description

Technical field

[0001] The invention relates to a technology in the field of speech input, in particular to an online implementation method of a dialogue strategy based on multi-task learning for task-based dialogue systems.

Background technique

[0002] With the development of artificial intelligence technology, the dialogue system, as a system that can communicate naturally with humans, has gradually become a research hotspot because of its good application prospects. At present, this technology is widely used in automatic customer service, voice assistants, chat robots, and other scenarios, where it greatly improves the human-computer interaction experience. A typical dialogue system includes five modules: speech recognition, semantic understanding, dialogue management, natural language generation, and speech synthesis. In terms of function, dialogue systems can be divided into chat-based dialogue systems and task-based dialogue systems. The former aims to chat with users unint...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/332, G10L15/06, G10L15/16, G10L15/183
CPC: G06F16/3326, G06F16/3329, G10L15/063, G10L15/16, G10L15/183
Inventor: 俞凯, 常成, 杨闰哲, 陈露, 周翔
Owner: AISPEECH CO LTD