The present invention relates to a method for generating document summaries. The method includes:
performing sentence segmentation on a document set to obtain a sentence set, and representing the
sentence set with a vector space model; determining, according to a preset similarity threshold,
the similar sentences of each sentence and the number of those similar sentences, and calculating
the corresponding importance score; taking each sentence in the sentence set in turn as the current
sentence, comparing the number of similar sentences of the current sentence with the numbers of
similar sentences of all of its similar sentences, finding the maximum value, and adding the
corresponding sentence to a diversity reference set; calculating the diversity score and
comprehensive score of each sentence; and finally sorting and screening all the sentences in the
sentence set to form a
document summary. A device for generating document summaries is also provided. The method and device take into account both the information within each sentence and the
global information in the document set, thereby reducing the overall redundancy of the document summary.
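
The steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not give the scoring formulas, so the term-frequency vectors, the cosine similarity, the importance, diversity and comprehensive score formulas, and all names and default parameters (`split_sentences`, `vectorize`, `cosine`, `summarize`, `threshold`, `top_k`, `weight`) are assumptions chosen only for illustration. Only the order of the steps and the construction of the diversity reference set follow the description.

```python
import math
import re
from collections import Counter


def split_sentences(documents):
    """Naive segmentation on '.', '!' and '?' (a stand-in for a real segmenter)."""
    sentences = []
    for doc in documents:
        sentences += [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
    return sentences


def vectorize(sentence):
    """Bag-of-words term-frequency vector: one simple vector space model."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def summarize(documents, threshold=0.3, top_k=2, weight=0.5):
    sentences = split_sentences(documents)
    vectors = [vectorize(s) for s in sentences]
    n = len(sentences)

    # Similar sentences of i: every other sentence at or above the preset threshold.
    similar = [
        [j for j in range(n) if j != i and cosine(vectors[i], vectors[j]) >= threshold]
        for i in range(n)
    ]
    counts = [len(neighbours) for neighbours in similar]

    # ASSUMPTION: importance is the normalised number of similar sentences.
    importance = [c / (n - 1) if n > 1 else 0.0 for c in counts]

    # Diversity reference set: for each current sentence, compare its count with
    # the counts of all its similar sentences and keep the sentence with the maximum.
    reference = set()
    for i in range(n):
        candidates = [i] + similar[i]
        reference.add(max(candidates, key=lambda j: counts[j]))

    # ASSUMPTION: diversity is 1 minus the highest similarity to any other
    # sentence in the diversity reference set.
    diversity = []
    for i in range(n):
        others = [cosine(vectors[i], vectors[r]) for r in reference if r != i]
        diversity.append(1.0 - max(others, default=0.0))

    # ASSUMPTION: the comprehensive score is a weighted sum of the two scores.
    combined = [weight * importance[i] + (1 - weight) * diversity[i] for i in range(n)]

    # Sort by comprehensive score, keep the top_k, and restore document order.
    ranked = sorted(range(n), key=lambda i: combined[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_k])]
```

Calling `summarize(["First document text.", "Second document text."])` returns at most `top_k` sentences in their original order. Under these assumed formulas, a sentence that closely repeats a reference sentence receives a low diversity score, which is one way the redundancy reduction described above could be realised.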