Multi-feature fusion Vietnamese keyword generation method

A multi-feature fusion keyword generation technology, applied in fields such as neural learning methods, semantic analysis, and biological neural network models. It addresses inaccurate keyword generation caused by corpus scarcity and by insufficient features extracted by the model, and achieves a good semantic representation of the title.

Pending Publication Date: 2021-11-09
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0006] The present invention provides a multi-feature fusion Vietnamese keyword generation method for generating keywords for Vietnamese documents. It addresses the problem of inaccurate keyword generation caused by the scarcity of corpora for Vietnamese and other low-resource languages and by the insufficient features extracted by the model.

Examples

Embodiment 1

[0057] Embodiment 1: As shown in Figures 1-5, the Vietnamese keyword generation method based on multi-feature fusion first fuses part-of-speech information, named entity information and position information in the encoding process; secondly, a bidirectional attention mechanism is used to enhance the guiding role of the title information in the generation process; finally, the feature vector fused with the various semantic information is sent to the decoding layer, which outputs the final predicted probability distribution used to generate Vietnamese keywords.
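The fusion of part-of-speech, named entity and position information in the encoding process can be illustrated with a minimal sketch. It assumes that each feature has its own embedding table and that fusion is done by concatenation per token; the source does not state the fusion operator, so concatenation is an assumption, as are the function names, the tag labels, the embedding sizes and the random vectors standing in for learned embeddings.

```python
import random

def make_embedding_table(symbols, dim, rng):
    """Map each symbol to a small random vector (stand-in for a learned embedding)."""
    return {s: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for s in symbols}

def fuse_features(tokens, pos_tags, ner_tags, tables, max_len):
    """Concatenate word, POS, named-entity and position embeddings for each token.

    Concatenation is an assumed fusion operator, not one confirmed by the source.
    """
    word_tab, pos_tab, ner_tab, position_tab = tables
    fused = []
    for i, (tok, pos, ner) in enumerate(zip(tokens, pos_tags, ner_tags)):
        position = min(i, max_len - 1)  # clip positions beyond the table size
        fused.append(
            word_tab[tok] + pos_tab[pos] + ner_tab[ner] + position_tab[position]
        )
    return fused

# Hypothetical toy input: a segmented Vietnamese sentence with illustrative tags.
rng = random.Random(0)
tokens = ["Hà_Nội", "công_bố", "kế_hoạch"]
pos_tags = ["Np", "V", "N"]
ner_tags = ["B-LOC", "O", "O"]
MAX_LEN = 8

tables = (
    make_embedding_table(tokens, 6, rng),            # word embeddings
    make_embedding_table(["Np", "V", "N"], 3, rng),  # part-of-speech embeddings
    make_embedding_table(["B-LOC", "O"], 2, rng),    # named-entity embeddings
    make_embedding_table(range(MAX_LEN), 2, rng),    # position embeddings
)

fused = fuse_features(tokens, pos_tags, ner_tags, tables, MAX_LEN)
print(len(fused), len(fused[0]))  # one fused vector per token, width 6 + 3 + 2 + 2
```

In this sketch each token ends up with a single vector whose width is the sum of the four embedding sizes, which is the kind of fused representation an encoder could then consume.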

[0058] Further, the specific steps of the method are as follows:

[0059] Step1. Use crawlers based on the Scrapy framework to crawl Vietnamese news documents and keywords in eight fields including politics, economy, culture, society, and technology from Vietnamese news websites Dantri, Vnexpress, and VietNamNet;

[0060] Step2. Filter and screen Vietnamese news documents and keywords, delete documents with a charact...

Abstract

The invention relates to a multi-feature fusion Vietnamese keyword generation method, and belongs to the field of natural language processing. Vietnamese keyword generation predicts keywords for a Vietnamese news text so as to obtain keywords that concisely summarize the news text. In this method, part-of-speech information, named entity information and position information are first fused in the encoding process; secondly, the guiding effect of the title information in the generation process is enhanced by using a bidirectional attention mechanism; and finally, the feature vector fused with the various semantic information is sent to the decoding layer, which outputs the final predicted probability distribution, from which the Vietnamese keywords are generated. The method achieves good results in Vietnamese keyword generation and provides support for subsequent text classification and information retrieval.
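The bidirectional attention between title and document mentioned above can be sketched as follows. The source does not give the exact formulation, so this is an assumed BiDAF-style variant: a dot-product similarity matrix, document-to-title attention per document token, and title-to-document attention over the row-wise maxima. All function names and the toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] as a single vector."""
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(dim)]

def bidirectional_attention(doc, title):
    """Assumed BiDAF-style attention between document and title vectors."""
    # sim[i][j]: similarity between document token i and title token j.
    sim = [[dot(d, t) for t in title] for d in doc]
    # Document-to-title: each document token attends over the title tokens.
    doc2title = [weighted_sum(softmax(row), title) for row in sim]
    # Title-to-document: weight each document token by its best title match.
    weights = softmax([max(row) for row in sim])
    title2doc = weighted_sum(weights, doc)
    return doc2title, title2doc, weights

# Toy encoder outputs: three document tokens and two title tokens, dimension 2.
doc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
title = [[1.0, 0.0], [0.5, 0.5]]
doc2title, title2doc, weights = bidirectional_attention(doc, title)
print(weights)
```

Document tokens that match the title more strongly receive larger title-to-document weights, which is one way the title can guide what the decoder attends to.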

Description

Technical field

[0001] The invention relates to a multi-feature fusion Vietnamese keyword generation method, which belongs to the field of natural language processing.

Background

[0002] In natural language processing tasks, keywords are usually multi-word units that summarize the main ideas of a document in a short text. Vietnamese keyword generation provides important support for downstream tasks such as Chinese-Vietnamese bilingual text summarization, cross-language information retrieval, and public opinion analysis in Southeast Asia.

[0003] Good progress has been made in keyword generation in the English environment. Meng Rui et al. proposed the CopyRNN network, which uses an encoder-decoder structure, an attention mechanism and a copy mechanism, and trained a generative model on a large-scale corpus. Bidirectional RNNs with gated recurrent units are not as good as non-deep learning methods at extracting keywords that already appear in documents in most da...

Application Information

IPC(8): G06F40/284, G06F40/295, G06F40/30, G06K9/62, G06N3/04, G06N3/08
CPC: G06F40/284, G06F40/295, G06F40/30, G06N3/084, G06N3/045, G06F18/22, G06F18/253, G06F18/214
Inventor: 高盛祥, 陈瑞清, 余正涛, 毛存礼, 王振晗
Owner: KUNMING UNIV OF SCI & TECH