Generative dialogue summarization method incorporating commonsense knowledge

A generative, knowledge-based technology, applied to biological neural network models, natural language data processing, special data processing applications, etc.; it addresses problems such as low abstraction and inaccuracy in generated dialogue summaries

Active Publication Date: 2020-12-29
HARBIN INST OF TECH


Problems solved by technology

[0005] The present invention aims to solve the problem that existing generative dialogue summarization methods do not use commonsense knowledge, which results in generated dialogue summaries that are inaccurate and insufficiently abstractive.


Examples


Specific Embodiment 1

[0032] Specific Embodiment 1: In this embodiment, a generative dialogue summarization method incorporating commonsense knowledge includes:

[0033] Step 1: Obtain the large-scale common sense knowledge base ConceptNet and the dialogue summary data set SAMSum.

[0034] Step 11. Obtain ConceptNet, a large-scale common sense knowledge base:

[0035] Obtain the large-scale commonsense knowledge base ConceptNet from http://conceptnet.io/; the commonsense knowledge it contains exists in the form of tuples, i.e., tuple knowledge, which can be expressed as:

[0036] R = (h, r, t, w),

[0037] Here, R denotes a piece of tuple knowledge; h denotes the head entity; r denotes the relation; t denotes the tail entity; and w denotes the weight, which expresses the confidence of the relation. For example, R = (call, related, contact, 10) means that the relationship between "call" and "contact" is "related", with weight 10; throug...
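As a sketch, the tuple form R = (h, r, t, w) described above can be mirrored by a small Python structure; the class name and the example tuple are illustrative, not part of ConceptNet's API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TupleKnowledge:
    """One piece of commonsense knowledge R = (h, r, t, w)."""
    head: str      # head entity h
    relation: str  # relation r
    tail: str      # tail entity t
    weight: float  # weight w: confidence of the relation

# The example from the text: "call" is related to "contact" with weight 10.
R = TupleKnowledge(head="call", relation="related", tail="contact", weight=10.0)
```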

Specific Embodiment 2

[0046] Embodiment 2: This embodiment differs from Embodiment 1 in that Step 2 uses the obtained large-scale commonsense knowledge base ConceptNet to introduce tuple knowledge into the dialogue summarization dataset SAMSum and construct a heterogeneous dialogue graph. The specific process is:

[0047] Step 21. Obtain knowledge relevant to the dialogue. For a given dialogue, the present invention first retrieves a series of related tuple knowledge from ConceptNet according to the words in the dialogue, then eliminates noise knowledge, and finally obtains the set of tuple knowledge relevant to the given dialogue, as in Figure 4;
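Step 21 can be sketched as a lookup-and-filter pass. The in-memory knowledge base, the head-word matching rule, and the weight threshold used as the noise filter below are all illustrative assumptions, not the patent's exact procedure:

```python
# Hypothetical sketch of Step 21: collect tuples whose head entity matches a
# dialogue word, then drop low-weight ("noise") tuples.
KB = [
    ("call", "related", "contact", 10.0),
    ("call", "related", "phone", 8.0),
    ("the", "related", "word", 0.5),   # low-weight noise
]

def relevant_tuples(dialogue_words, kb, min_weight=1.0):
    """Return tuples whose head entity appears in the dialogue and whose
    weight is at least min_weight (the assumed noise threshold)."""
    words = set(w.lower() for w in dialogue_words)
    return [t for t in kb if t[0] in words and t[3] >= min_weight]

tuples = relevant_tuples(["I", "will", "call", "you"], KB)
```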

[0048] Step 22. Construct the sentence-knowledge map:

[0049] For the relevant tuple knowledge obtained in Step 21, suppose there are sentence A and sentence B, with word a belonging to sentence A and word b belonging to sentence B. If the tail entity t of the knowledge related to a is the same as that of the knowledge related to b, then sentence A and sentence B are both connected to that tail entity...
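The edge construction of Step 22 can be sketched as follows: two sentence nodes become linked through a knowledge node when tuples retrieved for their words share the same tail entity. The data and the (sentence, tail-entity) edge format are illustrative assumptions:

```python
def build_sentence_knowledge_edges(sentence_tuples):
    """sentence_tuples: {sentence_id: [(head, relation, tail, weight), ...]}.
    Returns the sentence-to-knowledge edges of the heterogeneous graph as
    (sentence_id, tail_entity) pairs; sentences sharing a tail entity end up
    connected through that knowledge node."""
    edges = set()
    for sid, tuples in sentence_tuples.items():
        for _head, _rel, tail, _w in tuples:
            edges.add((sid, tail))
    return edges

edges = build_sentence_knowledge_edges({
    "A": [("call", "related", "contact", 10.0)],
    "B": [("reach", "related", "contact", 7.0)],
})
# Both sentences attach to the shared knowledge node "contact".
```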

Specific Embodiment 3

[0058] Embodiment 3: This embodiment differs from Embodiment 1 or 2 in that Step 31 constructs a node encoder and uses a bidirectional long short-term memory network (Bi-LSTM) to obtain the node initialization representation and the word initialization representation. The specific process is:

[0059] For the heterogeneous dialogue graph proposed by the present invention in Step 2, each node v_i contains |v_i| words; the word sequence is {w_{i,1}, ..., w_{i,|v_i|}}, where w_{i,n} denotes the nth word of node v_i, n ∈ [1, |v_i|]. A bidirectional long short-term memory network (Bi-LSTM) processes the word sequence to generate a forward hidden sequence and a backward hidden sequence, where x_n denotes the word vector representation of w_{i,n}. The initial representation of the node is obtained by concatenating the last hidden state of the forward sequence with the first hidden state of the backward...
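The node-initialization step can be sketched numerically. Random arrays stand in for the real Bi-LSTM hidden sequences, and the dimensions are illustrative; only the concatenation of the last forward state with the first backward state follows the description above:

```python
import numpy as np

# Stand-ins for the forward and backward hidden sequences over one node's
# words (a real Bi-LSTM would produce these from the word vectors x_n).
rng = np.random.default_rng(0)
num_words, hidden = 5, 8            # |v_i| words, hidden size per direction
h_fwd = rng.standard_normal((num_words, hidden))  # forward hidden sequence
h_bwd = rng.standard_normal((num_words, hidden))  # backward hidden sequence

# Node initialization: concatenate the last forward hidden state with the
# first backward hidden state, giving a vector of size 2 * hidden.
node_init = np.concatenate([h_fwd[-1], h_bwd[0]])
```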



Abstract

The invention discloses a generative dialogue summarization method incorporating commonsense knowledge, and belongs to the field of natural language processing. The invention solves the problems of inaccurate and insufficiently abstractive generated dialogue summaries caused by conventional generative dialogue summarization methods not utilizing commonsense knowledge. The method comprises the following steps: acquiring the commonsense knowledge base ConceptNet and the dialogue summarization dataset SAMSum; introducing tuple knowledge into the dialogue summarization dataset SAMSum by utilizing the obtained commonsense knowledge base ConceptNet, and constructing a heterogeneous dialogue graph; and training the heterogeneous dialogue neural network model constructed in Step 3, so that a final summary of a given dialogue is generated by the trained model. The method is applied to the generation of dialogue summaries.

Description

Technical Field

[0001] The invention relates to the field of natural language processing, and in particular to a generative dialogue summarization method incorporating commonsense knowledge.

Background Technique

[0002] Abstractive Dialogue Summarization is a task under automatic text summarization (Automatic Summarization) [1] (Title: Constructing literature abstracts by computer: techniques and prospects, author: Chris D Paice, year: 1990, cited from Information Processing & Management) in natural language processing: given the transcript of a multi-person dialogue, it generates a short textual description containing the key information of the conversation, as in Figure 1, which shows a multi-person conversation and its corresponding reference summary.

[0003] For dialogue summarization, most existing work focuses on abstractive methods, which allow the final summary to contain novel words and phrases ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F16/332; G06F40/295; G06N3/04
CPC: G06F16/3329; G06F40/295; G06N3/045; G06N3/044
Inventor: 冯骁骋, 冯夏冲, 秦兵, 刘挺
Owner HARBIN INST OF TECH