A Chinese-Vietnamese news document summarization method based on feature association attention mechanism

A document summarization technology using an attention mechanism, applied in neural learning methods, computer components, biological neural network models, etc.

Active Publication Date: 2020-08-21
KUNMING UNIV OF SCI & TECH
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a Chinese-Vietnamese news document summary generation method based on an element association attention mechanism, addressing the problem of summary generation quality for Chinese-Vietnamese news documents.

Examples

Embodiment 1

[0036] Embodiment 1: As shown in Figures 1-2, a Chinese-Vietnamese news document summary generation method based on the feature association attention mechanism proceeds through the following specific steps:

[0037] a1. Collection of Chinese-Vietnamese bilingual news documents: A dataset of 20,000 documents was constructed through a combination of manual inspection and machine labeling, of which 12,000 are Chinese news documents and 8,000 are Vietnamese news documents. The documents cover hot news of common concern to China and Vietnam in recent years, including policy topics such as the Belt and Road Initiative as well as tourism, studying abroad, etc. Each news set contains at least two documents, one in Chinese and one in Vietnamese. For each event set, 4 sentences per language are selected as the reference summary.
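The constraints stated in a1 (each news set holds at least one Chinese and one Vietnamese document, with 4 reference sentences per language) can be sketched as a small data structure. This is only an illustrative sketch: the class names `NewsDocument` and `NewsSet`, the field names, and the language codes `"zh"` and `"vi"` are hypothetical and do not come from the patent, which does not describe a storage schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# From a1: 4 reference sentences are selected per language for each event set.
REFERENCE_SENTENCES_PER_LANGUAGE = 4
# Assumed language codes (not specified in the patent).
LANGUAGES = ("zh", "vi")


@dataclass
class NewsDocument:
    language: str  # "zh" for Chinese, "vi" for Vietnamese (assumed codes)
    text: str


@dataclass
class NewsSet:
    """One news event: documents in both languages plus reference summaries."""
    documents: List[NewsDocument]
    # Maps a language code to its list of reference summary sentences.
    reference_summary: Dict[str, List[str]] = field(default_factory=dict)

    def is_valid(self) -> bool:
        # The set must contain at least one document in each language.
        present = {doc.language for doc in self.documents}
        has_both_languages = all(lang in present for lang in LANGUAGES)
        # Each language must have exactly 4 reference sentences.
        references_ok = all(
            len(self.reference_summary.get(lang, []))
            == REFERENCE_SENTENCES_PER_LANGUAGE
            for lang in LANGUAGES
        )
        return has_both_languages and references_ok
```

A check like `is_valid()` could be run over every collected set before preprocessing, so that monolingual sets or sets with incomplete reference summaries are caught early.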

[0038] a2. Preprocessing of Chinese-Vietnamese bilingual news documents: including steps such as document segmentation, word segmentation, and...

Abstract

The invention relates to a Chinese-Vietnamese news document summary generation method based on an element association attention mechanism, and belongs to the technical field of natural language processing. The invention first constructs Chinese-Vietnamese bilingual word vectors and maps the word vectors of the two languages into the same semantic space. Then, a multi-feature fusion vector is constructed, in which statistical features such as the co-occurrence degree of bilingual news elements, word frequency, sentence position, and sentence correlation are integrated into the bilingual word vectors. Finally, an LSTM neural network model based on the element association attention mechanism is constructed to calculate an importance score for each sentence. Using a correlation analysis algorithm, sentences with higher scores are selected and redundant information is removed to generate the summary. The invention achieves a good summary generation effect on Chinese-Vietnamese bilingual news document sets.
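The final step, choosing high-scoring sentences while dropping redundant ones, can be illustrated with a minimal sketch. The abstract does not specify the correlation analysis algorithm, so the greedy cosine-similarity filter below is an assumption rather than the patented method. The function names, the `sim_threshold` value, and the default of 4 sentences are likewise hypothetical. The `scores` argument stands in for the importance scores that the LSTM attention model would produce.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_summary(
    scores: Sequence[float],
    vectors: Sequence[Sequence[float]],
    max_sentences: int = 4,      # hypothetical default
    sim_threshold: float = 0.8,  # hypothetical redundancy cutoff
) -> List[int]:
    """Greedily pick high-scoring sentences, skipping near-duplicates.

    scores[i] is the importance score of sentence i (assumed to come from
    the scoring model); vectors[i] is its vector in the shared bilingual
    semantic space. Returns the chosen indices in document order.
    """
    # Visit sentences from highest to lowest importance score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen: List[int] = []
    for i in ranked:
        if len(chosen) == max_sentences:
            break
        # Keep the sentence only if it is not too similar to any already chosen.
        if all(cosine(vectors[i], vectors[j]) < sim_threshold for j in chosen):
            chosen.append(i)
    return sorted(chosen)
```

Because both languages are assumed to share one semantic space, the same similarity test can compare a Chinese sentence with a Vietnamese one, which is what allows cross-lingual duplicates to be filtered in a single pass.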

Description

Technical field

[0001] The invention relates to a Chinese-Vietnamese news document summary generation method based on an element association attention mechanism, and belongs to the technical field of natural language processing.

Background technique

[0002] With the rapid growth of information in the new era, a large number of hot news events are published on the Internet in different languages. How to quickly grasp hot news from different countries and its main content has become a problem of widespread concern. To address this issue, document information from various sources needs to be summarized, and a concise yet informative response needs to be provided to the user. This concern has led to the development of multilingual text summarization systems, which take multilingual document sets as input and produce a concise and fluent summary that reflects the gist of the original document set in refined text. With the increasingl...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G06N3/08, G06F40/20, G06F40/40
CPC: G06N3/04, G06F40/205, G06F40/40, G06F18/253, G06F18/214
Inventors: 余正涛, 宋燃, 高盛祥, 黄于欣, 吴瑾娟, 郭军军, 赖华
Owner: KUNMING UNIV OF SCI & TECH