Multiple-document summarization using document clustering

Document clustering technology, applied to systems that generate multi-document summaries, addresses problems such as incomplete summaries and improves summary quality.

Inactive Publication Date: 2010-03-24
NEC LAB AMERICA

AI Technical Summary

Problems solved by technology

Therefore this summary is not comprehensive



Examples


Embodiment Construction

[0016] Figure 1 shows the framework of an exemplary multi-document summarization system. First, a set of documents is received (10). The documents are preprocessed (20) by removing formatting characters and stop words. A unigram language model is then used to obtain the document-term and sentence-term matrices. If the task is query-focused summarization, the sentence-term matrix is projected into a subspace in which each candidate sentence is relevant to the query. Given the two matrices, the system then performs nonnegative factorization on the documents and simultaneously clusters the documents and sentences into latent topics (30). The sentences with high probability in each topic are used to form the summary (40).
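
The steps in paragraph [0016] can be illustrated with a minimal Python sketch. It is not the patented method: it swaps in plain nonnegative matrix factorization (Lee-Seung multiplicative updates for squared error) on the document-term matrix, followed by a separate sentence-scoring step, whereas the text describes clustering documents and sentences simultaneously. The names `STOP_WORDS`, `tokenize`, `split_sentences`, `term_matrix`, `nmf`, and `summarize`, the tiny stop-word list, the naive sentence splitter, and the sample documents are all assumptions made for illustration.

```python
import re
import numpy as np

# Tiny illustrative stop-word list (an assumption; real systems use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "to", "for", "as"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def split_sentences(doc):
    """Naive splitter on '.', '!' and '?' (an assumption for this sketch)."""
    return [s.strip() for s in re.split(r"[.!?]+", doc) if s.strip()]

def term_matrix(texts, vocab):
    """Rows are texts, columns are terms; each nonempty row is a unigram distribution."""
    index = {t: j for j, t in enumerate(vocab)}
    m = np.zeros((len(texts), len(vocab)))
    for i, text in enumerate(texts):
        for w in tokenize(text):
            m[i, index[w]] += 1.0
    return m / np.maximum(m.sum(axis=1, keepdims=True), 1.0)

def nmf(a, k, iters=200, seed=0, eps=1e-9):
    """Factor a (n x m) into w (n x k) and h (k x m) with nonnegative entries."""
    rng = np.random.default_rng(seed)
    w = rng.random((a.shape[0], k)) + eps
    h = rng.random((k, a.shape[1])) + eps
    for _ in range(iters):
        h *= (w.T @ a) / (w.T @ w @ h + eps)
        w *= (a @ h.T) / (w @ h @ h.T + eps)
    return w, h

def summarize(docs, k=2):
    """Return one top-scoring sentence per topic and a topic label per document."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    sentences = [s for d in docs for s in split_sentences(d)]
    doc_term = term_matrix(docs, vocab)         # document-term matrix
    sent_term = term_matrix(sentences, vocab)   # sentence-term matrix
    doc_topic, topic_term = nmf(doc_term, k)
    sent_topic = sent_term @ topic_term.T       # sentence affinity to each topic
    picked = []
    for t in range(k):
        best = sentences[int(np.argmax(sent_topic[:, t]))]
        if best not in picked:
            picked.append(best)
    return picked, doc_topic.argmax(axis=1)

docs = [
    "Cats chase mice. Cats sleep all day.",
    "Dogs chase cats. Dogs bark loudly.",
    "Stocks fell sharply today. Investors sold shares.",
    "Markets dropped as investors sold stocks.",
]
summary, clusters = summarize(docs, k=2)
print(summary, clusters)
```

Scoring sentences against the topic-term factor is only one simple way to connect the two matrices; a query-focused variant would first restrict or reweight `sent_term` toward the query terms before scoring.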

[0017] Figure 2 shows an exemplary process for summarizing multiple documents. In Figure 1B, a number of documents are provided as input in block 101. In block 102, the process obtains a language model fo...


Abstract

The invention relates to multiple-document summarization using document clustering. Systems and methods are disclosed for summarizing multiple documents by generating a model of the documents as a mixture of document clusters, each document in turn having a mixture of sentences, wherein the model simultaneously represents summarization information and document cluster structure; and determining a loss function for evaluating the model and optimizing the model.
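
The abstract mentions a loss function for evaluating and optimizing the model, but this excerpt does not specify it. Purely as an illustration, the sketch below evaluates a generalized Kullback-Leibler divergence between a nonnegative matrix and its low-rank reconstruction, and reduces it with the standard Lee-Seung multiplicative updates for that divergence. The choice of divergence, the function names `kl_loss` and `fit`, and the random input matrix are assumptions, not details taken from the patent.

```python
import numpy as np

def kl_loss(a, w, h, eps=1e-12):
    """Generalized KL divergence D(a || w @ h) for nonnegative matrices."""
    r = w @ h
    return float(np.sum(a * np.log((a + eps) / (r + eps)) - a + r))

def fit(a, k, iters=100, seed=0, eps=1e-12):
    """Reduce kl_loss with multiplicative updates; returns factors and loss history."""
    rng = np.random.default_rng(seed)
    w = rng.random((a.shape[0], k)) + 0.1
    h = rng.random((k, a.shape[1])) + 0.1
    history = [kl_loss(a, w, h)]
    for _ in range(iters):
        # Update h given w, then w given the new h.
        h *= (w.T @ (a / (w @ h + eps))) / (w.sum(axis=0)[:, None] + eps)
        w *= ((a / (w @ h + eps)) @ h.T) / (h.sum(axis=1)[None, :] + eps)
        history.append(kl_loss(a, w, h))
    return w, h, history

# Hypothetical stand-in for a document-term matrix (6 documents, 8 terms).
a = np.random.default_rng(1).random((6, 8))
w, h, history = fit(a, k=2)
print(history[0], history[-1])
```

Tracking the loss per iteration, as `history` does here, is a simple way to check that an optimization step is actually improving the model before any summary sentences are selected from it.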

Description

[0001] This application claims priority to Provisional Application Serial No. 61/056,595, filed May 28, 2008, the contents of which are incorporated herein by reference.

Technical field

[0002] The present application relates to systems and methods for generating multi-document summaries.

Background

[0003] Multi-document summarization is the process of producing generic or topic-focused summaries by reducing the documents in size while retaining their main characteristics. Since one cause of the data overload problem is that many documents share the same or similar topics, automatic multi-document summarization has gained much attention in recent years. The explosion of documents on the Internet has fueled the need for summarization applications. For example, generating informative snippets in web search can help users explore the results further, and in question/answer systems, it is often necessary to provide the information queried...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30
CPC: G06F17/30719, G06F16/345
Inventor: S. Zhu, D. Wang, Y. Chi, Y. Gong
Owner: NEC LAB AMERICA