Text abstract intelligent extraction method and device, computer equipment and storage medium

An extraction algorithm and computer program technology, applied in the fields of unstructured text data retrieval, text database browsing/visualization, and special data processing applications. Effects: quality, comprehensive content, and avoiding extraction...

Pending Publication Date: 2020-01-10
CHINA PING AN PROPERTY INSURANCE CO LTD

AI Technical Summary

Problems solved by technology

[0003] The summarization extracted by the traditional multi-text summarization algorithm has high redundancy and cannot reflect the overall structure and con...



Examples


Embodiment 1

[0048] Referring to Figure 1, this embodiment proposes an intelligent extraction method for text summaries, which specifically includes the following steps:

[0049] S1: Obtain multiple feature sentences from multiple texts, and divide each feature sentence into feature words to obtain multiple feature words.

[0050] The invention is especially suitable for intelligently extracting a summary from multiple texts. For example, suppose there are three texts: the first text, the second text and the third text. The invention first divides the three texts into feature sentences, and then divides each feature sentence into feature words. Suppose the first, second and third texts contain a, b and c feature sentences respectively; these feature sentences are labeled, for example, as feature sentences 11, 12, ..., 1a, 21, 22, ..., 2b, 31, 32, ..., 3c. Each feature senten...
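Step S1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the regex-based sentence splitter, the alphabetic tokenizer, the sample texts and all function names are assumptions made for illustration, and a real system (particularly for Chinese text) would use a proper sentence and word segmenter. Labels are stored as (text index, sentence index) pairs, which correspond to the labels 11, 12, ..., 3c in paragraph [0050] without becoming ambiguous when a text has more than nine sentences.

```python
import re

def split_feature_sentences(text):
    """Split a text into sentences on '.', '!' or '?' (a simplifying assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def split_feature_words(sentence):
    """Lowercase and keep alphabetic tokens; a stand-in for a real word segmenter."""
    return re.findall(r"[a-z]+", sentence.lower())

def label_sentences(texts):
    """Return {(text_index, sentence_index): (sentence, words)} with 1-based indices."""
    labeled = {}
    for i, text in enumerate(texts, start=1):
        for j, sentence in enumerate(split_feature_sentences(text), start=1):
            labeled[(i, j)] = (sentence, split_feature_words(sentence))
    return labeled

# Hypothetical sample input: three short texts.
texts = [
    "Rain damaged the roof. The claim was filed.",
    "The insurer reviewed the claim.",
    "Payment was approved. The roof was repaired. The case closed.",
]
labeled = label_sentences(texts)
```

Here `labeled[(1, 2)]` holds the second feature sentence of the first text together with its feature words.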


Abstract

The invention provides an intelligent text summary extraction method and device, computer equipment, and a storage medium. The method comprises: obtaining a plurality of feature sentences from a plurality of texts and dividing each feature sentence into feature words to obtain a plurality of feature words; classifying the feature words into different clusters through clustering analysis; classifying the feature sentence to which each feature word belongs into the corresponding cluster; and extracting a fixed number of feature sentences from each cluster to form an overall summary of the plurality of texts. The clustering analysis comprises: representing each feature word as a word vector to obtain a plurality of feature vectors; weighting each feature vector according to its importance to obtain a plurality of weighted vectors; calculating the similarity between every two weighted vectors; and performing a clustering operation according to the similarity to obtain the number of cluster centers, then dividing the feature words into clusters according to that number.
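The pipeline in the abstract can be sketched end to end in Python. The abstract does not name the embedding model, the importance weights, the similarity measure or the clustering algorithm, so every concrete choice below is an assumption: the 2-D vectors in `VECTORS` are invented toy values, importance is approximated by relative word frequency, similarity is 1 / (1 + Euclidean distance), clustering is a simple leader-style threshold scheme, each sentence joins the cluster that holds most of its words, and the first `per_cluster` sentences of each cluster are extracted.

```python
import math
from collections import Counter, defaultdict

# Hypothetical 2-D word vectors; a real system would use trained embeddings.
VECTORS = {
    "roof": (1.0, 0.1), "rain": (0.9, 0.2), "storm": (0.95, 0.15),
    "claim": (0.1, 1.0), "payment": (0.2, 0.9), "insurer": (0.15, 0.95),
}

def similarity(u, v):
    """Map Euclidean distance into (0, 1]; an assumed similarity measure."""
    return 1.0 / (1.0 + math.dist(u, v))

def cluster_words(weighted, threshold):
    """Leader-style clustering: a word joins the first cluster whose center is
    similar enough, otherwise it becomes a new center. The number of centers
    follows from the similarities instead of being fixed in advance."""
    centers, assignment = [], {}
    for word, vec in weighted.items():
        for idx, center in enumerate(centers):
            if similarity(vec, center) >= threshold:
                assignment[word] = idx
                break
        else:  # no existing center was similar enough
            centers.append(vec)
            assignment[word] = len(centers) - 1
    return assignment, len(centers)

def summarize(sentences, per_cluster=1, threshold=0.6):
    """sentences: list of word lists. Returns ({cluster: sentence indices}, n_clusters)."""
    counts = Counter(w for s in sentences for w in s if w in VECTORS)
    total = sum(counts.values())
    # Assumed importance weight: 1 + relative frequency of the word.
    weighted = {
        w: tuple((1.0 + counts[w] / total) * x for x in VECTORS[w]) for w in counts
    }
    assignment, n_clusters = cluster_words(weighted, threshold)
    clusters = defaultdict(list)
    for i, words in enumerate(sentences):
        votes = Counter(assignment[w] for w in words if w in assignment)
        if votes:
            # Majority vote over the sentence's words; ties go to the lowest index.
            best = min(votes, key=lambda c: (-votes[c], c))
            clusters[best].append(i)
    selected = {c: idx[:per_cluster] for c, idx in sorted(clusters.items())}
    return selected, n_clusters

# Hypothetical feature sentences, already divided into feature words.
sentences = [
    ["rain", "roof"],
    ["storm", "roof"],
    ["claim", "insurer"],
    ["payment", "claim"],
]
summary, n_clusters = summarize(sentences)
```

A distance-based similarity is used here because cosine similarity is unchanged when a vector is multiplied by a positive scalar, so a scalar importance weight would have no effect under it. Whether the patent weights vectors in this way is not stated in the visible text.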

Description

Technical field

[0001] The invention relates to the technical field of data mining, in particular to an intelligent extraction method, device, computer equipment and storage medium for text summaries.

Background

[0002] Automatic text summarization is a relatively difficult task in natural language processing. In essence, text summarization is a kind of information filtering: the output text is much shorter than the input text, but it contains the main information. According to the number of texts, text summarization can be divided into single-text summarization and multi-text summarization. The former is the basis of the latter, but the latter is not just a simple superposition of the results of the former. The former is often used to filter news information, while the latter has great potential in search engines, and its difficulty increases accordingly.

[0003] The summarization extracted by the traditional multi-text summarization algorithm has high redunda...


Application Information

IPC(8): G06F16/34
CPC: G06F16/345
Inventor: 杨春春
Owner: CHINA PING AN PROPERTY INSURANCE CO LTD