Automatic text abstracting method and device based on global semantics, medium and equipment

A technology for automatic text summarization, applied in semantic analysis, neural learning methods, unstructured text data retrieval, and related fields.

Active Publication Date: 2020-08-21
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

Generative text summarization still has many problems, such as incoherent semantics, grammatical errors, and repeated words in context.

Examples

Embodiment 1

[0075] This embodiment provides an automatic text summarization method based on global semantics. As shown in Figure 1, the process includes the following steps:

[0076] S1. Preprocess the content of the original text: write a script that divides the original text into bytes and replaces uppercase letters with lowercase letters to obtain the text information.
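
As a rough illustration of step S1, the following minimal sketch lowercases the input and splits it into bytes. It reads "bytes" literally as UTF-8 byte values, which is an assumption; the function name `preprocess` is hypothetical and does not come from the patent.

```python
def preprocess(text: str) -> list[int]:
    """Lowercase the text, then split it into UTF-8 byte values.

    Assumption: "divide into bytes" is read literally as UTF-8 encoding.
    The patent text does not specify the encoding or the unit in detail.
    """
    return list(text.lower().encode("utf-8"))
```

For example, `preprocess("Hi A")` yields one integer per byte of the lowercased string `"hi a"`.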

[0077] S2. Input the preprocessed text information to the encoder. The encoder globally encodes the text information based on a convolutional neural network and a self-attention mechanism, and filters the result through a control unit to obtain the final encoded output, as shown in Figure 2.

[0078] Specifically, step S2 includes the following sub-steps:

[0079] S21. Sequentially receive the word embedding of each word in the text information, feed it into a bidirectional LSTM network, and output the result at each time node t_i (i = 0, 1, 2, ..., n), where n is the number of encoded information items.

[0080] output result f...
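
The gating idea in step S2 can be sketched in plain Python. This is only an illustration under stated assumptions, not the patented encoder: the recurrent hidden states are taken as given, the convolution uses a fixed hypothetical kernel, the self-attention uses identity projections (no learned weights), and the "control unit" is assumed to be a sigmoid gate that scales each hidden state element-wise. All function names are hypothetical.

```python
import math

def conv1d_same(seq, kernel):
    """1-D convolution over time with zero padding; each feature is filtered independently."""
    n, d, pad = len(seq), len(seq[0]), len(kernel) // 2
    out = []
    for t in range(n):
        row = [0.0] * d
        for j, w in enumerate(kernel):
            s = t + j - pad
            if 0 <= s < n:  # positions outside the sequence count as zeros
                for c in range(d):
                    row[c] += w * seq[s][c]
        out.append(row)
    return out

def self_attention(seq):
    """Scaled dot-product self-attention with identity projections (Q = K = V = seq)."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        m = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(d)])
    return out

def gated_encode(hidden, kernel=(0.25, 0.5, 0.25)):
    """Scale each hidden state by sigmoid(global features) (assumed form of the control unit)."""
    global_feats = self_attention(conv1d_same(hidden, kernel))
    return [
        [h / (1.0 + math.exp(-g)) for h, g in zip(h_row, g_row)]
        for h_row, g_row in zip(hidden, global_feats)
    ]
```

Because the gate lies strictly between 0 and 1, each output value has the same sign as the corresponding hidden-state value and a smaller magnitude, so the gate can only attenuate information, never amplify it.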

Embodiment 2

[0132] This embodiment provides a storage medium that stores a computer program. When the computer program is executed by a processor, the processor carries out the automatic text summarization method based on global semantics described in Embodiment 1.

Embodiment 3

[0134] This embodiment provides a computing device that includes a processor and a memory for storing a program executable by the processor. When the processor executes the program stored in the memory, it implements the automatic text summarization method based on global semantics described in Embodiment 1.

Abstract

The invention provides an automatic text summarization method and device based on global semantics, a medium and equipment. The method comprises the following steps: preprocessing the content of an original text by dividing it into bytes and replacing the uppercase letters in the original text with lowercase letters to obtain text information; by an encoder, globally encoding the text information based on a convolutional neural network and a self-attention mechanism, and filtering the result through a control unit to obtain a final encoded output; and by a decoder, decoding the encoded output based on a repetition penalty mechanism to generate a text summary. In this method, a convolution filter is added to the encoder, and the repetition penalty mechanism is used in the decoder to further suppress repeated words. This improves the fluency of the summary's semantics: the repetition penalty mechanism strongly suppresses words that have already appeared, which reduces repetition in the generated summary and thereby improves its readability.
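
The effect of a repetition penalty during decoding can be illustrated with a toy greedy decoder. The abstract does not give the penalty formula, so the rule below (divide positive scores, multiply negative scores, for any token already generated) is an assumption chosen for illustration, and the names `penalize_repeats` and `greedy_decode` are hypothetical.

```python
def penalize_repeats(scores, generated, penalty=1.5):
    """Return a copy of scores in which already-generated tokens are made less likely.

    Assumed rule: positive scores are divided by the penalty and
    non-positive scores are multiplied by it, so both move downward.
    """
    adjusted = dict(scores)
    for tok in set(generated):
        if tok in adjusted:
            s = adjusted[tok]
            adjusted[tok] = s / penalty if s > 0 else s * penalty
    return adjusted

def greedy_decode(step_scores, penalty=1.5):
    """Pick the highest-scoring token at each step after applying the penalty."""
    generated = []
    for scores in step_scores:
        adjusted = penalize_repeats(scores, generated, penalty)
        generated.append(max(adjusted, key=adjusted.get))
    return generated
```

With `penalty=1.0` the decoder reduces to plain greedy decoding and can emit the same high-scoring word at every step; a penalty above 1 lowers the score of words that have already appeared, so later steps favor new words.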

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, and more specifically to a method, device, medium and equipment for automatic text summarization based on global semantics.

Background

[0002] With the rapid development of the Internet, the need to read large amounts of information quickly and accurately has made in-depth study of automatic text summarization technology necessary. As a technology that can alleviate information overload, automatic text summarization has been widely used in practice, for example to automatically generate summaries of news articles and technical articles, to automatically generate snapshots of search engine retrieval results, and in automatic writing robots.

[0003] Automatic text summarization technology uses a computer to automatically extract the central idea and key content from the original article, and performs semantic analysis and processing to gen...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/34, G06F40/30, G06N3/08
CPC: G06F16/345, G06F40/30, G06N3/08, G06N3/044, G06N3/045, Y02D10/00
Inventors: 姜小波, 杨博睿
Owner: SOUTH CHINA UNIV OF TECH