Chinese text abstract generation method, computer-readable storage medium and computer device

A text summarization technology, applied in the field of text data processing, that addresses the problem of inaccurate summaries by reinforcing key information and thereby improving summary accuracy.

Active Publication Date: 2019-01-18
SOUTH CHINA NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

When a new text contains important words that are outside the vocabulary, the encoding-decoding model relies only on the parameters learned from the training set and, at generation time, must predict which word in the vocabulary sh...



Examples


Embodiment Construction

[0049] See Figure 1, which is a flow chart of the text abstract generation method of the present invention. The method comprises the following steps:

[0050] Step S1: Perform word segmentation on the texts and on the abstracts in the training set separately, obtaining the word set of the text and the word set of the abstract.

[0051] Step S2: Calculate the term frequency-inverse document frequency (TF-IDF) of each word in the word set of the text and in the word set of the abstract, and use these scores to obtain the text vocabulary and the abstract vocabulary. Then vectorize the words in each vocabulary to obtain a fusion vector for each word in the text vocabulary and a fusion vector for each word in the abstract vocabulary.

[0052] The text vocabulary and the abstract vocabulary are obtained according to term frequency-inverse document frequency, so that some words t...
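The vocabulary selection in Step S2 can be illustrated with a minimal sketch. It assumes the documents have already been segmented as in Step S1. The function name `build_vocabulary`, the plain `log(N / df)` form of the inverse document frequency, and the choice to rank each word by its highest TF-IDF score across documents are all assumptions made for illustration; the source text does not specify them.

```python
import math
from collections import Counter


def build_vocabulary(docs, size):
    """Select up to `size` words ranked by TF-IDF (hypothetical helper).

    docs: list of token lists, one per document, already word-segmented.
    Each word is scored by its maximum TF-IDF over all documents
    (an assumption; other aggregations are possible).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents does each word occur?
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    best = {}
    for tokens in docs:
        if not tokens:
            continue
        tf = Counter(tokens)
        for word, count in tf.items():
            # Term frequency times inverse document frequency.
            score = (count / len(tokens)) * math.log(n_docs / df[word])
            best[word] = max(best.get(word, 0.0), score)

    # Highest score first; ties broken by the word itself for determinism.
    ranked = sorted(best, key=lambda w: (-best[w], w))
    return ranked[:size]


# Toy corpus: "的" is the most frequent word but occurs in every document.
docs = [
    ["股市", "上涨", "的"],
    ["球队", "获胜", "的"],
    ["的", "天气", "晴朗", "的"],
]
vocab = build_vocabulary(docs, 4)
```

In this toy corpus a vocabulary built from raw frequency would rank "的" first, whereas the TF-IDF ranking gives it a score of zero and keeps the rarer topic words instead.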


Abstract

The invention relates to a method for generating text abstracts, a computer-readable storage medium and a computer device. The method comprises the steps of: obtaining a word set of the text and a word set of the abstract; calculating the term frequency-inverse document frequency (TF-IDF) of each word in the word set of the text and in the word set of the abstract to obtain a text vocabulary and an abstract vocabulary; obtaining a fusion vector for each word in the text vocabulary and for each word in the abstract vocabulary; obtaining the word set of the text to be processed; obtaining a fusion vector for each word in that word set; generating summary word vectors; and, according to the mapping between each word in the abstract vocabulary and its fusion vector, obtaining the word corresponding to each summary word vector and outputting those words as the abstract. Because the vocabularies are built with TF-IDF, some low-frequency words that reflect the subject of the text are retained, which reduces the out-of-vocabulary problem and lets the generated abstract express the meaning of the text more accurately.
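The final step, turning generated summary word vectors back into words through the abstract vocabulary, can be sketched as a lookup. The source only states that a mapping between words and fusion vectors is used. The nearest-neighbour search by cosine similarity below, the function names `cosine` and `decode_summary`, and the two-dimensional toy vectors are assumptions for illustration, not the patented procedure.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def decode_summary(summary_vectors, vocab_vectors):
    """Map each generated vector to the closest word in the abstract vocabulary.

    vocab_vectors: dict of word -> fusion vector (hypothetical toy values here).
    """
    words = []
    for vec in summary_vectors:
        closest = max(vocab_vectors, key=lambda w: cosine(vec, vocab_vectors[w]))
        words.append(closest)
    return words


# Toy abstract vocabulary with made-up 2-dimensional fusion vectors.
vocab_vectors = {"股市": [1.0, 0.0], "上涨": [0.0, 1.0]}
summary = decode_summary([[0.9, 0.1], [0.2, 0.8]], vocab_vectors)
```

A model that outputs a probability distribution over the vocabulary would not need this lookup; the sketch only makes the word-to-vector mapping concrete.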

Description

Technical field

[0001] The invention relates to the field of text data processing, in particular to a method for generating text summaries, a computer-readable storage medium and a computer device.

Background art

[0002] With the explosive growth of data, and of text data in particular, people can no longer browse and understand all the texts of interest in time, yet missing important text data causes considerable losses to organizations and applications. Text summaries, which condense the important information in a text, have therefore become a focus of attention, and how to generate summaries automatically from text data has become a hot research topic.

[0003] At present, existing methods for automatically generating text summaries mainly use the encoding-decoding model from machine learning. Specifically, the model first uses a Recurrent Neural Network (RNN) as an encoder to encode in...


Application Information

IPC(8): G06F16/34, G06F17/27
CPC: G06F40/284, G06F40/289
Inventors: 曾碧卿, 周才东
Owner: SOUTH CHINA NORMAL UNIVERSITY