Automatic writing method based on extractive multi-document summarization

An extractive multi-document summarization technology, applied in the fields of instrumentation, computing, and electrical digital data processing, that addresses problems such as partial repetition, inability to process long texts, and poor results.

Pending Publication Date: 2019-08-30
SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

At present, generative (abstractive) summarization based on deep learning is still far from practical deployment. First, it cannot handle long texts, especially when the input consists of multiple documents. Second, it generalizes poorly: a model trained and tested on one data set often performs badly on other data sets. Finally, the generated text itself often has language problems, such as partial repetition.

Examples

Embodiment Construction

[0021] The following embodiments of the present invention construct an automatic news writing system based on extractive multi-document summarization. The system aims not only to meet the general requirement of a summarization system, namely finding the important information, but also to ensure that the information is complete and coherent. Its basic function is as follows: the user gives a topic, the system automatically collects relevant data, and it outputs a corresponding complete article in a form suitable for human reading. This method can help users quickly grasp the overall picture of a complex news event. It can also be used to track news topics of interest continuously, so that the different stages of an event can be described as the event develops. Finally, it can serve as an auxiliary tool for news editors: the system can write about current news hotspots to help editors understand the latest developments or to...

Abstract

The invention relates to an automatic writing method based on extractive multi-document summarization, which comprises the following steps: A1, user input and data preprocessing: receiving a keyword entered by a user, retrieving related data on a data retrieval platform, and performing preliminary processing on the retrieved data; A2, graph ranking: taking a plurality of documents as input, the system first identifies all sentences and then scores the importance of each sentence; A3, redundancy removal: if two or more sentences have a similarity exceeding a preset threshold, only one of them is retained, and an ordered sentence list with the redundant sentences removed is output; and A4, construction and output: selecting the most important sentences, from front to back, from the ordered sentence list provided by the previous stage, subject to the text length limit, reordering the selected sentences, and outputting a manuscript formed by the ordered sentences.
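Steps A2 to A4 can be illustrated with a minimal sketch. The abstract does not specify the ranking algorithm, the similarity measure, or the reordering rule, so the choices below are assumptions made only for illustration: a PageRank-style power iteration over a bag-of-words cosine similarity graph for A2, the same cosine measure for the A3 threshold test, and reordering by original document position for A4. All function names and default parameter values (`damping`, `threshold`, `max_words`) are hypothetical, and step A1 (retrieval and preprocessing) is left out.

```python
import math
import re
from collections import Counter


def split_sentences(documents):
    """Split each document into sentences, keeping (doc index, position, text)."""
    sentences = []
    for d, text in enumerate(documents):
        for p, s in enumerate(re.split(r"(?<=[.!?])\s+", text.strip())):
            if s:
                sentences.append((d, p, s))
    return sentences


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_sentences(vectors, damping=0.85, iterations=50):
    """A2 (assumed variant): PageRank-style scores over the sentence similarity graph."""
    n = len(vectors)
    if n == 0:
        return []
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / out_weight[j] * scores[j] for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores


def remove_redundant(order, vectors, threshold=0.7):
    """A3: walk the ranked list; drop a sentence whose similarity to a kept one exceeds the threshold."""
    kept = []
    for i in order:
        if all(cosine(vectors[i], vectors[k]) <= threshold for k in kept):
            kept.append(i)
    return kept


def build_article(documents, max_words=60, threshold=0.7):
    """A2-A4: rank, deduplicate, select front to back within a word budget, then reorder."""
    sentences = split_sentences(documents)
    vectors = [Counter(re.findall(r"\w+", s.lower())) for _, _, s in sentences]
    scores = rank_sentences(vectors)
    order = sorted(range(len(sentences)), key=lambda i: -scores[i])
    ranked = remove_redundant(order, vectors, threshold)
    chosen, used = [], 0
    for i in ranked:
        length = sum(vectors[i].values())
        if used + length > max_words:
            break  # stop at the first sentence that would exceed the budget
        chosen.append(i)
        used += length
    # Assumed reordering rule: restore original (document, position) order.
    chosen.sort(key=lambda i: (sentences[i][0], sentences[i][1]))
    return " ".join(sentences[i][2] for i in chosen)
```

Calling `build_article(docs, max_words=60)` on a list of document strings returns a single text in which near-duplicate sentences appear only once. Stopping at the first sentence that overflows the budget is one reading of "from front to back"; skipping that sentence and continuing with shorter ones would be an equally plausible variant.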

Description

Technical field

[0001] The invention belongs to the fields of computer applications, computer systems, the Internet, information processing, and related technical products.

Background art

[0002] Automated news writing refers to the application of intelligent algorithms together with computer software systems and big data resources. The writing system completes a news article through data collection, sorting, analysis, and integration. Since 2010, foreign media such as the Associated Press and domestic media such as Xinhua News Agency and Tencent News have launched their own writing robots. Current automated writing mainly covers finance, sports, and emergencies such as extreme weather and earthquakes. These automatic writing systems share two characteristics: the format of the information source is fixed and highly refined, and the output manuscripts are generally short. The realization method is mostly by using a template m...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
CPC: G06F40/211, G06F40/30
Inventors: 韩旭旺, 郑海涛, 赵从志
Owner: SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV