End-to-end dialogue oriented data enhancement method based on statement rewriting

A sentence-rewriting data enhancement technique for end-to-end dialogue. It addresses the high cost of collecting and labeling dialogue data, the lack of dialogue context in existing text enhancement methods, and their resulting unsuitability for expanding dialogue text data, with the aim of improving accuracy.

Active Publication Date: 2020-08-11
STATE GRID ZHEJIANG ELECTRIC POWER CO MARKETING SERVICE CENT +1

AI Technical Summary

Problems solved by technology

[0008] (1) Most existing end-to-end dialogue systems require a large amount of labeled dialogue text data for specific tasks in a specific field to train their dialogue generation models, but manually collecting and labeling such data is difficult and costly.
[0009] (2) Existing text data enhancement methods do not consider the context of a sentence within a dialogue, so they are not suitable for expanding dialogue text data.
[0010] (3) Existing data enhancement methods based on sentence rewriting rely on training multiple models separately in multiple stages. They do not train the models jointly or form an end-to-end system, so the overall system is difficult to optimize.

Embodiment Construction

[0062] The technical solution of the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0063] The present invention proposes an end-to-end dialogue-oriented data enhancement method based on sentence rewriting, called PARG, which expands the training data of the dialogue generation model by constructing and training a sentence rewriting model. First, the method constructs training references for the sentence rewriting model by defining the dialogue function of user sentences. Next, the method adopts a sequence-to-sequence (seq2seq) framework and uses two decoders to decode, in sequence, the previous round of system dialogue actions and the rewritten user sentence. The previous round of system dialogue actions provides the dialogue history background for rewriting the user sentence, so that the generated rewritten sentences better fit the dialogue context. By filtering the rewri...
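One way to picture how a dialogue function could yield training references for the rewriting model is sketched below. The visible text does not define the dialogue function, so the key used here (domain, previous system action, and the set of slot names) is an assumption, and the field names `domain`, `prev_sys_act`, `slots`, and `user` are hypothetical. User sentences that share the same key are treated as rewrites of one another.

```python
from collections import defaultdict
from itertools import permutations


def dialogue_function(turn):
    # Assumed definition (not taken from the patent text): two user sentences
    # play the same role if they share domain, previous system action, and slots.
    return (turn["domain"], turn["prev_sys_act"], tuple(sorted(turn["slots"])))


def build_rewrite_references(turns):
    """Return (source, reference) sentence pairs for training a rewriting model."""
    groups = defaultdict(list)
    for turn in turns:
        groups[dialogue_function(turn)].append(turn["user"])

    pairs = []
    for utterances in groups.values():
        unique = list(dict.fromkeys(utterances))  # drop duplicates, keep order
        pairs.extend(permutations(unique, 2))     # every ordered pair in the group
    return pairs


# Hypothetical toy dialogue turns.
turns = [
    {"domain": "billing", "prev_sys_act": "request(account_id)",
     "slots": ["account_id"], "user": "My account number is 1024."},
    {"domain": "billing", "prev_sys_act": "request(account_id)",
     "slots": ["account_id"], "user": "It is 1024."},
    {"domain": "outage", "prev_sys_act": "greet",
     "slots": ["address"], "user": "The power is out at 5 Elm Road."},
]

pairs = build_rewrite_references(turns)
for source, reference in pairs:
    print(source, "->", reference)
```

Including the previous system action in the key is what ties each reference to its dialogue context: two sentences are paired only if they answer the same system move.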

Abstract

The invention discloses an end-to-end dialogue oriented data enhancement method based on statement rewriting, and relates to a data processing method. At present, existing data enhancement methods have difficulty achieving overall optimization at the system level. In the disclosed method, the training data of a dialogue generation model is expanded by constructing and training a statement rewriting model, and the training references of the statement rewriting model are constructed by defining a dialogue function of user statements. A sequence-to-sequence framework is adopted, in which two decoders sequentially decode the previous round of system dialogue actions and a rewritten user statement. The previous round of system dialogue actions provides a dialogue history background for rewriting the user statement, so that the generated rewritten statements better conform to the dialogue context. An attention mechanism is added between the decoders of the statement rewriting model and the dialogue generation model, so that an end-to-end dialogue system is built: the rewritten statements can directly assist dialogue generation, and the dialogue generation result can in turn supervise the training of the statement rewriting model.
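The attention link between the two decoders can be illustrated with a minimal sketch. The abstract does not say which form of attention is used, so dot-product attention is assumed here, and the names `attend`, `rewrite_states`, and `response_state` are hypothetical. A state of the dialogue generation decoder acts as the query, and the hidden states of the rewriting decoder act as the memory.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Dot-product attention (an assumed form, not specified by the abstract).

    query:  one hidden state of the dialogue generation decoder.
    memory: hidden states of the rewriting decoder, one per rewritten token.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in memory]
    weights = softmax(scores)
    context = [
        sum(w * state[i] for w, state in zip(weights, memory))
        for i in range(len(query))
    ]
    return weights, context


# Hypothetical 2-dimensional hidden states for three rewritten tokens.
rewrite_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
response_state = [2.0, 0.0]

weights, context = attend(response_state, rewrite_states)
print(weights)
print(context)
```

Every step in this computation is differentiable. In a trained network, the loss on the generated response could therefore propagate back through the attention weights into the rewriting decoder, which is consistent with the abstract's statement that the dialogue generation result supervises the rewriting model.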

Description

Technical field

[0001] The invention relates to a data processing method, in particular to an end-to-end dialogue-oriented data enhancement method based on sentence rewriting.

Background technique

[0002] Building an intelligent dialogue system that communicates with humans in natural language is an important research goal of artificial intelligence. There are various types of dialogue systems. Among them, task-based dialogue systems can assist humans in completing specific tasks in specific fields. They therefore have broad application prospects in electronic customer service, personal assistants, self-service terminals, etc., and have attracted attention from the research community and industry. Generally speaking, a task-based dialogue system needs to build and train a dialogue generation model oriented to one or more specific domains, which generates task-related system responses for input user sentences. With the maturity of deep learning, neura...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F16/33, G06F16/36, G06N3/04
CPC: G06F16/3329, G06F16/3344, G06F16/36, G06N3/045, Y02D10/00
Inventor: 胡若云, 王正国, 沈然, 吕诗宁, 江俊军, 丁麒, 朱斌, 孙钢, 金良峰, 汪一帆, 谷泓杰
Owner STATE GRID ZHEJIANG ELECTRIC POWER CO MARKETING SERVICE CENT