Data-to-text generation method based on fine-grained topic modeling

A data-to-text generation method based on fine-grained topic modeling, applied in the fields of electrical digital data processing, natural language data processing, and biological neural network models. It addresses the problem that existing models ignore the modeling of topical consistency between text and data, and it improves the quality of the generated text.

Active Publication Date: 2020-12-11
STATE GRID TIANJIN ELECTRIC POWER +1

AI Technical Summary

Problems solved by technology

Existing models mainly rely on the neural network's own representation learning ability to improve the quality of the generated text, while ignoring the modeling of topical consistency between text and data.


Examples

Embodiment Construction

[0020] The present invention will be described in further detail below in conjunction with specific examples. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0021] As shown in Figures 1 and 2, the data-to-text generation method based on fine-grained topic modeling of the present invention comprises the following steps:

[0022] Step 1: In the encoding layer, learn the semantic representation of each data record in the structured data table using a bidirectional long short-term memory (BiLSTM) network;

[0023] Step 1.1: Given a data table record set $s$, first convert $s$ into a data record sequence $s_q = \{r_1, r_2, \ldots, r_{|r|}\}$, and map each of the three attributes contained in each data record $r_j$ to a low-dimensional, dense feature vector space, yielding three feature vectors, where $d_r$ represents the dimension of each feature vector; by spli...
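The embedding lookup in Step 1.1 can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the patent: the three attributes are taken to be an entity, a record type, and a value; the embedding tables are randomly initialised instead of learned; and the three vectors are simply concatenated, since the text above is cut off before it says how they are combined. The names `make_embedding_table` and `embed_record` are hypothetical.

```python
import random


def make_embedding_table(symbols, dim, rng):
    """Assign every symbol a random dense vector of length `dim`.

    In a trained model these vectors would be learned; random values
    are used here only to show the shapes involved.
    """
    return {sym: [rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for sym in sorted(symbols)}


def embed_record(record, tables):
    """Look up one vector per attribute and concatenate them (assumed)."""
    vec = []
    for attr, table in zip(record, tables):
        vec.extend(table[attr])
    return vec


rng = random.Random(0)
d_r = 4  # dimension of each attribute's feature vector

# Toy record sequence s_q; each record is (entity, type, value) -- assumed layout.
records = [
    ("Lakers", "PTS", "102"),
    ("Lakers", "AST", "25"),
    ("Celtics", "PTS", "99"),
]

# One embedding table per attribute position.
tables = [make_embedding_table({r[i] for r in records}, d_r, rng)
          for i in range(3)]

# One vector of length 3 * d_r per record; these would feed the BiLSTM encoder.
encoder_inputs = [embed_record(r, tables) for r in records]
print(len(encoder_inputs), len(encoder_inputs[0]))
```

Records that share an attribute value (the two "Lakers" records above) share the corresponding slice of their input vectors, which is what lets the encoder relate them.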

Abstract

The invention discloses a data-to-text generation method based on fine-grained topic modeling. The method comprises the following steps: in the encoding layer, learning the semantic representation of each data record using a bidirectional long short-term memory network; learning the topic distribution corresponding to each data record and the word distribution corresponding to each topic using non-negative matrix factorization, to obtain a topic word table corresponding to each data record; in the decoding layer, based on the semantic representation of each data record, generating text using a long short-term memory network, an attention mechanism, and the fine-grained topic representation, in combination with the topic word table; and training the model to obtain the optimal text generation result. The method uses non-negative matrix factorization to mine the topic distribution of the data and the word distribution corresponding to each topic, which constrains the topic consistency between the generated text and the data table and guides the model to learn more accurate word usage patterns. A copy mechanism is introduced into the text generation process to ensure that the model can accurately generate numerical descriptions.
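The topic-modeling step named in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's procedure: it factorises a toy record-by-word count matrix V into W (record-to-topic weights) and H (topic-to-word weights) using the standard multiplicative-update rules for non-negative matrix factorization, then builds a per-record topic word table by taking the top words of each record's strongest topic. The toy matrix, the vocabulary, the `top_k` cutoff, and the function names are all hypothetical.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Approximate V ~= W H with non-negative W, H (multiplicative updates)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n)]
    return w, h


def topic_word_table(w, h, vocab, top_k=2):
    """For each record, list the top_k words of its highest-weight topic."""
    table = []
    for row in w:
        topic = max(range(len(row)), key=row.__getitem__)
        ranked = sorted(range(len(vocab)), key=lambda j: h[topic][j], reverse=True)
        table.append([vocab[j] for j in ranked[:top_k]])
    return table


# Toy data: rows are data records, columns are word counts (assumed).
vocab = ["points", "scored", "assists", "passed"]
V = [[3, 2, 0, 0],
     [2, 3, 0, 0],
     [0, 0, 3, 2],
     [0, 0, 2, 3]]

W, H = nmf(V, n_topics=2)
tables = topic_word_table(W, H, vocab)
print(tables)
```

Rows of W play the role of each record's topic distribution and rows of H the word distribution of each topic; normalising them to sum to one would turn the weights into proper distributions if that is needed downstream.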

Description

Technical Field

[0001] The invention relates to the field of computer application technology, in particular to a data-to-text generation method based on fine-grained topic modeling.

Background Technique

[0002] With the development of information technology, industry data accumulated in various fields is growing rapidly, for example, financial statements accumulated in the financial field, live game data accumulated in the sports field, etc. In order to solve the problem of information overload brought by massive data, the task of data-to-text generation has attracted more and more attention from researchers. The data-to-text generation task aims to describe the main information contained in structured data in natural language, and then help people better grasp the specific meaning behind massive data.

[0003] Early research work mainly split this task into three independent subtasks: content planning, sentence planning, and surface realization, and constructed a series o...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F40/30, G06F40/216, G06F40/284, G06F40/126, G06N3/04
CPC: G06F40/30, G06F40/216, G06F40/284, G06F40/126, G06N3/049
Inventor: 王旭强
Owner: STATE GRID TIANJIN ELECTRIC POWER