Document summary extracting method based on data reconstruction

A document summarization technique based on data reconstruction, applied in the fields of electrical digital data processing, special data processing applications, and instruments. It addresses the difficulty of producing a summary that covers a document's central idea with little redundant information, and it improves browsing speed.

Active Publication Date: 2012-12-26
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

It is difficult for existing methods to ensure that a summary contains the least redundant information while still covering the central idea of the document.

Embodiment Construction

[0023] The invention is further described below with reference to the accompanying drawings:

[0024] A method for extracting document summaries based on data reconstruction, comprising the following steps:

[0025] 1) Obtain a document from the document database as the target document whose summary is to be extracted;

[0026] 2) For each target document, extract every sentence in the document to form the candidate sentence library for the document summary;

[0027] 3) Compute the weight of each keyword across all documents, and represent each sentence in the candidate sentence library as a vector;

[0028] 4) Use the data reconstruction algorithm to select, from the candidate sentence library, the optimal summary sentences that capture the central idea of the document while containing the least redundant information;

[0029] 5) Extract the selected sentences to form the summary of the target document.
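
The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the text here does not specify the keyword weighting scheme, so smoothed TF-IDF is assumed, and the data reconstruction algorithm is replaced by a simplified greedy stand-in in which each sentence is approximated by a scaled copy of a single selected sentence. All function names (`split_sentences`, `tfidf_vectors`, `residual`, `select_summary`, `summarize`) are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Step 2: split a document into candidate sentences (naive punctuation rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_vectors(sentences, corpus):
    """Step 3: weight keywords over all documents, then vectorize each sentence.

    Assumption: smoothed TF-IDF; the source does not name the weighting scheme.
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    vectors = []
    for s in sentences:
        counts = Counter(tokenize(s))
        vectors.append({
            w: c * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1.0)
            for w, c in counts.items()
        })
    return vectors


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a
    return sum(x * b.get(w, 0.0) for w, x in a.items())


def residual(v, s):
    """Squared error left after approximating v by the best scalar multiple of s."""
    ss = dot(s, s)
    if ss == 0.0:
        return dot(v, v)
    return max(dot(v, v) - dot(v, s) ** 2 / ss, 0.0)


def select_summary(vectors, k):
    """Step 4 (simplified): greedily pick k sentences minimizing total error."""
    # best_err[i] is the error of sentence i given the sentences selected so far.
    best_err = [dot(v, v) for v in vectors]
    selected = []
    for _ in range(min(k, len(vectors))):
        best_choice, best_total, best_new = None, None, None
        for j, cand in enumerate(vectors):
            if j in selected:
                continue
            new_err = [min(e, residual(v, cand)) for e, v in zip(best_err, vectors)]
            total = sum(new_err)
            if best_total is None or total < best_total:
                best_choice, best_total, best_new = j, total, new_err
        selected.append(best_choice)
        best_err = best_new
    return sorted(selected)


def summarize(target, corpus, k=2):
    """Steps 1, 2, 3, 4 and 5 for one target document taken from `corpus`."""
    sentences = split_sentences(target)
    chosen = select_summary(tfidf_vectors(sentences, corpus), k)
    return [sentences[i] for i in chosen]
```

In this sketch a sentence that duplicates an already selected one barely lowers the total error, so the greedy loop prefers a sentence covering different content. For a target document such as "Cats chase mice. Cats chase mice. Dogs bark loudly." with `k=2`, the repeated sentence is chosen only once.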

[0030] The weight information of the keywords described in step 3) in all documents, and use ...

Abstract

The invention discloses a document summary extraction method based on data reconstruction. The method comprises the steps of: obtaining a document from a document database as the target document whose summary is to be extracted; for each target document, extracting all sentences of the document to form a candidate sentence library for the document summary; computing the weight of each keyword across all documents, and representing each sentence in the candidate sentence library as a vector; using a data reconstruction algorithm to select, from the candidate sentence library, the optimal summary sentences that capture the main idea of the document while containing the least redundant information; and extracting the selected sentences to form the summary of the target document. The advantage of the method is that a short summary helps users, particularly users with visual impairments, quickly understand the main content of the original document.
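
The selection criterion in the abstract can be given a formal reading. The formulation below is an assumption offered for illustration, not an objective stated in this text: let $v_1, \dots, v_n$ be the candidate sentence vectors, $S$ the index set of selected sentences, and $k$ the summary length (all symbols are hypothetical).

```latex
\min_{S \subseteq \{1,\dots,n\},\ |S| = k}\;
\sum_{i=1}^{n}\;
\min_{a_i \in \mathbb{R}^{k}}
\Bigl\lVert v_i - \sum_{j \in S} a_{ij}\, v_j \Bigr\rVert_2^{2}
```

Under this reading, a set $S$ that reconstructs every sentence with small error covers the main content of the document, and a sentence nearly parallel to one already in $S$ adds almost nothing to the span of the selected vectors, so selecting it does not lower the error. This is how a single objective can favor both coverage and low redundancy.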

Description

Technical field

[0001] The invention relates to the technical field of document summary extraction, and in particular to a document summary extraction method based on data reconstruction.

Background

[0002] There are about 30 million blind people in the world, about 5 million of them in China, accounting for 18% of the world's total. With the high popularity of the Internet and its increasing importance in daily life, how to help blind people quickly obtain information from the Internet will become an important issue in building accessibility. Because blind people cannot receive information through vision, the problem of obtaining text content is particularly prominent. Traditionally, blind people rely on screen-reading software to read the text content of web pages word by word, which greatly limits the speed at which they can obtain text information from web pages. Moreover, while the content of the...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30, G06F17/27
Inventor: 陈纯, 卜佳俊, 何占盈, 王灿, 李平
Owner: ZHEJIANG UNIV