Extraction type document automatic abstracting method based on context semantic perception

An automatic summarization technique based on contextual semantics, applied in semantic analysis, neural network learning methods, and natural language data processing, that addresses problems such as narrow, one-sided summaries and a lack of understanding of contextual semantic relationships.

Active Publication Date: 2020-01-24
HUBEI UNIV OF TECH
4 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to address the lack of understanding of contextual semantic relationships in automatic document summarization, which leads to narrow and one-sided summaries. To this end, it proposes an extractive automatic document summarization method based on contextual semantic awareness.


Examples

Embodiment

[0076] Step 1: Select two short documents:

[0077] "On September 6, at the IFA2019 conference in Berlin, Germany, Huawei officially released the Kirin 990 5G chip. In comparison, among the main competitors of the Kirin 990 chip, the Snapdragon 865 has not yet been released, and it remains to be seen how it will perform."

[0078] "The 5G chip is the world's first flagship 5G SoC and the smallest 5G mobile phone chip solution in the industry. For the majority of users, the most noticeable benefits are faster speed and more beautiful images, but more importantly, its powerful AI computing power will bring intelligence to more everyday scenarios, and I believe that this year, the first year of 5G commercial use, will bring you the best application experience."

[0079] The number of topics for the document is specified as 2, the number of topic words as 3, and the hyperparameters are set. After word segmentation, sentence segmentation, and stop-word removal, the results are as follow...
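The preprocessing named in this step (sentence segmentation, word segmentation, stop-word removal) can be sketched roughly as follows. This is only an illustration under stated assumptions: it works on the English translation with regular expressions, whereas the original Chinese text would need a dedicated word segmenter; the names `STOP_WORDS`, `split_sentences`, `tokenize`, and `preprocess` and the tiny stop-word list are hypothetical and do not come from the patent.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real system would load a full list for the document's language.
STOP_WORDS = {
    "the", "a", "an", "of", "in", "on", "at", "is", "it", "and",
    "to", "has", "not", "yet", "been", "how", "will", "be",
}


def split_sentences(text):
    """Split text after sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters and digits as tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def preprocess(text):
    """Return one list of non-stop-word tokens per sentence."""
    return [
        [word for word in tokenize(sentence) if word not in STOP_WORDS]
        for sentence in split_sentences(text)
    ]


if __name__ == "__main__":
    sample = (
        "Huawei officially released the Kirin 990 5G chip. "
        "The Snapdragon 865 has not yet been released."
    )
    for tokens in preprocess(sample):
        print(tokens)
```

The per-sentence token lists are the kind of input a topic model would then consume; the topic count (2) and topic-word count (3) stated above would be parameters of that later step, not of this one.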

Abstract

The invention discloses an extractive automatic document summarization method based on contextual semantic awareness. It mainly addresses the inability of traditional algorithms to distinguish sentences in different contexts. The method first uses an LDA topic model to calculate the topic probability distribution in a document and determines the similarity between each sentence and the topic words. It then extracts semantic features of sentences with a CNN model and calculates the similarity between each sentence and those features. Finally, the topic similarity and feature similarity of each sentence are added to obtain the final sentence score, and an appropriate number of sentences is taken as the summary according to the score ranking. By introducing a topic model and a deep learning model, the method establishes a topic-based summarization approach, analyzes sentence meaning in different contexts more accurately, and provides a computational reference for other automatic document summarization methods.
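The final scoring step described in the abstract (add each sentence's topic similarity to its feature similarity, rank, and take the top sentences) can be sketched as below. This is a minimal sketch, not the patented implementation: the similarity values are assumed to be precomputed elsewhere (by the LDA and CNN stages), the names `combine_scores` and `select_summary` are hypothetical, and returning the chosen sentences in document order is a design choice of this sketch rather than something the abstract states.

```python
def combine_scores(topic_sims, feature_sims):
    """Final sentence score = topic similarity + feature similarity."""
    if len(topic_sims) != len(feature_sims):
        raise ValueError("both similarity lists need one value per sentence")
    return [t + f for t, f in zip(topic_sims, feature_sims)]


def select_summary(sentences, topic_sims, feature_sims, k):
    """Pick the k highest-scoring sentences.

    Assumption of this sketch: the selected sentences are returned in their
    original document order so that the summary reads naturally.
    """
    if len(sentences) != len(topic_sims):
        raise ValueError("need one similarity value per sentence")
    scores = combine_scores(topic_sims, feature_sims)
    # Rank sentence indices by descending score.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Keep the top k, then restore document order.
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sentences = ["s0", "s1", "s2", "s3"]
    topic_sims = [0.2, 0.9, 0.4, 0.1]    # made-up values for illustration
    feature_sims = [0.3, 0.5, 0.7, 0.2]  # made-up values for illustration
    print(select_summary(sentences, topic_sims, feature_sims, k=2))
```

Because the two similarities are simply added, they contribute equally only if they are on comparable scales; whether and how the method normalizes them is not stated in the abstract.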

Description

Technical field

[0001] The invention belongs to the field of natural language processing and relates to an extractive automatic document summarization method based on contextual semantic awareness. It applies the LDA topic model and deep learning methods to automatic text summarization, and addresses the lack of understanding of semantic information in current automatic document summarization.

Background technique

[0002] With the continuous development of modern Internet technology, the amount of data generated every day is growing explosively, and extracting useful information from massive data has become an urgent need. Automatic text summarization uses computers to distill a large amount of content into a concise summary that can stand in for the content of an entire document. According to the type of algorithm, mainstream technologies are divided into traditional algorithms based on word frequency statistics and algo...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F40/289, G06F40/30, G06F16/34, G06N3/04, G06N3/08
CPC: G06F16/345, G06N3/08, G06N3/045, Y02D10/00
Inventor: 熊才权, 沈力, 王壮, 周磊, 陈曦
Owner: HUBEI UNIV OF TECH