An automated text summarization evaluation method based on pre-trained language model and information theory

An evaluation method built on a pre-trained language model, applied in natural language data processing, semantic analysis, instruments, etc. It addresses the limited usage scenarios of existing automated metrics and their divergence from human judgment on informativeness and relevance, and it reduces evaluation costs while guiding the design and training of summarization systems.

Active Publication Date: 2022-04-05
SHANGHAI JIAOTONG UNIV
Cites: 7 | Cited by: 0

AI Technical Summary

Problems solved by technology

However, these automated evaluation methods require manually written reference summaries when evaluating system summaries, which greatly limits their usage scenarios.
[0004] In addition, automated evaluation methods still differ substantially from human subjective judgment, particularly on indicators such as informativeness and relevance.

Examples

Embodiment Construction

[0062] The embodiments of the present invention are described in detail below. This embodiment is implemented on the premise of the technical solution of the present invention and provides detailed implementation methods and specific operating procedures. Those skilled in the art can make modifications and improvements without departing from the concept of the present invention, and all of these fall within the protection scope of the present invention.

[0063] An embodiment of the present invention provides an automatic text summarization evaluation method based on a pre-trained language model and information theory. It offers a convenient, low-cost method.

[0064] Assume that the system input text and the generated summary are D and S respectively. After segmentation, the input text and the generated summary each form a sequence of semantic units [w_1, w_2, ..., w_i, ...]. Then the evaluation method includes the following...

Abstract

The present invention provides an automatic text summarization evaluation method based on a pre-trained language model and information theory, including: using a pre-trained language model to calculate the probability of each semantic unit based on the input text and the generated summary; using information theory to calculate the information content of each semantic unit; summing the unit information to obtain the total information of the summary; using mutual information to calculate the relevance between the input text and the summary; modeling summary redundancy by subtracting the total information of the summary from the maximum information content; and using the weighted average of total information, relevance, and redundancy as a comprehensive evaluation index. A corresponding system, terminal, and storage medium are also provided. The invention uses the pre-trained language model to help information theory estimate text probability more accurately and calculate the amount of information in a text. The three resulting automatic indicators (information amount, relevance, and redundancy) are more consistent with human evaluation standards and can replace manual evaluation, reducing the evaluation cost of automated summarization systems.
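The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names are hypothetical, the per-unit probabilities are passed in as plain lists rather than produced by a real pre-trained language model, relevance is assumed to be the drop in information when the summary is conditioned on the input text, and the maximum information content is assumed to be that of a uniform distribution over a vocabulary of a given size. The abstract does not specify the weights, so equal weights are used as a placeholder.

```python
import math
from typing import Sequence


def total_information(probs: Sequence[float]) -> float:
    """Sum the self-information -log2(p) of each semantic unit, in bits."""
    return sum(-math.log2(p) for p in probs)


def relevance(probs_summary: Sequence[float],
              probs_summary_given_doc: Sequence[float]) -> float:
    """Mutual-information style estimate (assumption): I(S) - I(S | D)."""
    return (total_information(probs_summary)
            - total_information(probs_summary_given_doc))


def redundancy(probs_summary: Sequence[float], vocab_size: int) -> float:
    """Maximum information minus total information of the summary.

    Assumption: the maximum is reached when every unit is drawn uniformly
    from a vocabulary of `vocab_size` entries.
    """
    max_info = len(probs_summary) * math.log2(vocab_size)
    return max_info - total_information(probs_summary)


def combined_score(info: float, rel: float, red: float,
                   weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted average of the three indicators (weights are placeholders)."""
    values = (info, rel, red)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


if __name__ == "__main__":
    # Toy probabilities standing in for language-model outputs.
    p_summary = [0.5, 0.25, 0.125]        # p(w_i) without the input text
    p_summary_given_doc = [0.5, 0.5, 0.5]  # p(w_i | D)
    info = total_information(p_summary)
    rel = relevance(p_summary, p_summary_given_doc)
    red = redundancy(p_summary, vocab_size=16)
    print(info, rel, red, combined_score(info, rel, red))
```

In practice the two probability lists would come from scoring the summary with a pre-trained language model, once on its own and once with the input text as context. Whether redundancy should enter the weighted average with a positive or negative weight is left to the caller here, since the abstract only states that a weighted average is taken.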

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, in particular to an automatic text summarization evaluation method based on a pre-trained language model and information theory, and provides a corresponding system.

Background technique

[0002] Text summarization is an important means of quickly acquiring knowledge from massive amounts of text, and it is becoming more important in the era of information explosion. The design and training of automated text summarization systems depend largely on the accuracy of evaluation metrics. A good evaluation metric should reflect human subjective judgment of the summary.

[0003] There are currently two evaluation methods: one is manual evaluation, and the other is automated evaluation that simulates manual evaluation. Human evaluation is the gold standard for summarization evaluation techniques, for example, the recall-oriented pyramid evaluation me...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/126, G06F40/216, G06F40/30
CPC: G06F40/126, G06F40/216, G06F40/30
Inventor: 金耀辉, 何浩, 肖力强, 陈文清, 田济东
Owner: SHANGHAI JIAOTONG UNIV