Keyword extraction method based on fusion of network high-order structure and topic model

A keyword extraction technique that fuses a topic model with the high-order structure of a word network, applied to unsupervised automatic keyword extraction from news text. It addresses the problem that existing methods do not actually consider the topics of words, and it improves accuracy while keeping computational complexity low.

Inactive Publication Date: 2020-06-02
CHENGDU SOBEY DIGITAL TECH CO LTD
Cites: 5; Cited by: 0

AI Technical Summary

Problems solved by technology

Although their computational efficiency has been improved, KSMT and KSMQ are still essentially statistical models and do not actually consider the topics of words.



Examples


Embodiment Construction

[0042] To make the object, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here serve only to explain the present invention and are not intended to limit it; that is, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the figures may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. Based on the embodiments of the present ...


Abstract

The invention discloses a keyword extraction method based on the fusion of a network high-order structure and a topic model. The method comprises the following steps: (1) perform word segmentation on a news text D; (2) remove stop words from the segmentation result to generate a word sequence; (3) construct a word co-occurrence network G from the word sequence; (4) assign a weight to each edge of the word co-occurrence network G based on the network's high-order structure, yielding a weighted adjacency matrix M; (5) calculate the topic expression capability of each word in G with respect to the target text; and (6) calculate a final importance score for each word in G from the weighted adjacency matrix M obtained in step 4 and the topic expression capability obtained in step 5, sort the words by final importance score in descending order, and select the top k words as the keywords of the news text D. The method has low computational complexity and, because it incorporates the topics of words, improves the accuracy of keyword extraction from news text.
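The six steps can be illustrated with a minimal Python sketch. The abstract does not say which high-order structure is used, how the topic expression capability is computed, or how the two are combined, so everything concrete here is an assumption: edges are weighted by the number of triangles they take part in, the topic scores are passed in as a precomputed dictionary (for example, from a topic model), and the fusion is a topic-biased weighted PageRank. Steps 1 and 2 are assumed to be done already, so the input is the stop-word-free word sequence. All function names and parameters (`window`, `damping`, `iters`) are hypothetical.

```python
from collections import defaultdict


def build_cooccurrence(words, window=3):
    """Step 3: undirected co-occurrence network G over a sliding window."""
    adj = defaultdict(set)
    for i, w in enumerate(words):
        for v in words[i + 1:i + window]:
            if v != w:
                adj[w].add(v)
                adj[v].add(w)
    return adj


def triangle_weights(adj):
    """Step 4 (assumed form): weight of edge (u, v) is 1 plus the
    number of triangles that contain it; returns M as a dict of dicts."""
    M = defaultdict(dict)
    for u in adj:
        for v in adj[u]:
            M[u][v] = 1 + len(adj[u] & adj[v])
    return M


def rank(M, topic_score, damping=0.85, iters=50):
    """Step 6 (assumed form): weighted PageRank whose teleport
    distribution is the normalized topic score of each word."""
    nodes = list(M)
    total = sum(topic_score.get(n, 0.0) for n in nodes) or 1.0
    prior = {n: topic_score.get(n, 0.0) / total for n in nodes}
    score = {n: 1.0 / len(nodes) for n in nodes}
    out = {n: sum(M[n].values()) for n in nodes}
    for _ in range(iters):
        new = {}
        for n in nodes:
            # M is symmetric, so the neighbors of n are also its in-links.
            inflow = sum(score[u] * M[u][n] / out[u] for u in M[n])
            new[n] = (1 - damping) * prior[n] + damping * inflow
        score = new
    return score


def extract_keywords(words, topic_score, k=3, window=3):
    """Steps 3 to 6: return the k highest-scoring words."""
    adj = build_cooccurrence(words, window)
    M = triangle_weights(adj)
    scores = rank(M, topic_score)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Calling `extract_keywords(word_sequence, topic_score, k=5)` returns the five highest-scoring words. Supplying the topic scores as an input keeps the sketch independent of any particular topic model.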

Description

Technical field

[0001] The invention belongs to the field of automatic news keyword extraction, and in particular relates to a keyword extraction method based on the fusion of a network high-order structure and a topic model, suitable for unsupervised automatic keyword extraction from news text.

Background

[0002] The development of network technology and the rise of financial media have led to a sharp increase in the volume of news. Major news platforms (such as Toutiao) generate a large amount of news data every day. Enabling audiences to quickly obtain information from such a large and wide-ranging body of news documents is a major challenge.

[0003] As two basic tasks of natural language processing, text classification and keyword extraction can obtain key information related to the content of news documents, so that audiences can quickly understand that content. Class...


Application Information

IPC(8): G06F40/289
Inventor: 朱婷婷, 杨瀚, 温序铭, 王炜, 谢超平
Owner: CHENGDU SOBEY DIGITAL TECH CO LTD