Intelligent extraction method for individual announcement abstract

An extraction method for announcement abstracts, applied in the field of computer software. It addresses the high cost, technical defects, and low efficiency of existing approaches, and aims for high accuracy, fast processing, and strong scalability.

Pending Publication Date: 2019-07-30
武汉优品楚鼎科技有限公司
AI Technical Summary

Problems solved by technology

[0005] The purpose of this method is to overcome the technical defects, high cost, and low efficiency of current methods by providing a way to generate customized summaries directly, quickly, and effectively.

Examples

Embodiment Construction

[0019] The system architecture of this method is shown in Figure 1. The function of each module is described below:

[0020] 1: Configure the crawling source URL and crawling rules;

[0021] 2: Crawl announcements according to the configured crawl source URL and crawl rules;

[0022] 3: Use the PDF2HTML open source library to convert the captured announcements into HTML format;

[0023] 4: Clean up redundant tags, styles, etc. in HTML;

[0024] 5: Extract the Table tag in HTML and store it in the form of tableList;

[0025] 6: Extract the plain text of the HTML and split it at the configured punctuation marks, storing the result as sentenceList;

[0026] 7: Structure each table, extract the entries and their data in the table, and store them in the form of ;

[0027] 8: Using the preset summary keyword template, extract data from the tableList by keyword and fill in the template. For the situation that cannot be extra...
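Steps 4 to 6 above can be sketched with the Python standard library. This is a minimal illustration, not the patent's implementation: the class `AnnouncementParser`, the function `split_html`, and the default delimiter set are hypothetical names and choices, nested tables are not handled, and the input is assumed to be HTML already produced by the PDF conversion in step 3.

```python
import re
from html.parser import HTMLParser


class AnnouncementParser(HTMLParser):
    """Collects tables as lists of rows and keeps all other text separately."""

    def __init__(self):
        super().__init__()
        self.table_list = []  # one entry per <table>: rows of cell strings
        self.text_parts = []  # text found outside tables
        self._table = None
        self._row = None
        self._cell = None
        self._skip = 0        # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip += 1
        elif tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("style", "script"):
            self._skip = max(0, self._skip - 1)
        elif tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                # Collapse whitespace inside the cell
                self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr":
            if self._row is not None and self._table is not None:
                self._table.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table":
            if self._table is not None:
                self.table_list.append(self._table)
            self._table = None
            self._row = None
            self._cell = None

    def handle_data(self, data):
        if self._skip:
            return  # drop style and script content
        if self._cell is not None:
            self._cell.append(data)
        elif self._table is None:
            self.text_parts.append(data)


def split_html(html, delimiters="。；;!！?？"):
    """Return (table_list, sentence_list) for one announcement's HTML."""
    parser = AnnouncementParser()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.text_parts).split())
    pieces = re.split("[" + re.escape(delimiters) + "]", text)
    sentence_list = [p.strip() for p in pieces if p.strip()]
    return parser.table_list, sentence_list


html = (
    "<html><style>p{color:red}</style>"
    "<p>Revenue grew。Profit fell；</p>"
    "<table><tr><th>Item</th><th>Value</th></tr>"
    "<tr><td>Net profit</td><td>1.5</td></tr></table>"
    "<p>Dividend declared。</p></html>"
)
tables, sentences = split_html(html)
```

The ASCII period is left out of the default delimiters so that decimal figures such as `1.5` are not split; a real configuration would choose its own punctuation set.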

Abstract

The invention discloses a method for extracting summaries of individual stock announcements through table extraction and text paragraph similarity. The method first separates an announcement's tables from its plain text and later merges the results: the tables are structured, and the plain text is divided into paragraphs. Keyword index data is then extracted from the structured tables and filled into a predefined summary template (keyword template). The top N paragraphs most similar to the template are selected from the divided paragraphs as summary candidates; if a keyword cannot be matched in the structured tables, the most similar candidate paragraph is used as a sub-summary. According to the disclosure, the method greatly improves summary accuracy and editors' efficiency, improves extraction accuracy through continuous feedback, and ultimately achieves true automation.
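The template-filling and fallback logic described in the abstract can be sketched as follows. The source does not name a similarity measure or the stored form of a structured table, so two assumptions are made here: similarity is a Jaccard overlap of character bigrams, and a structured table is a plain dict from entry name to value. All function names are hypothetical.

```python
def bigrams(text):
    """Set of character bigrams; needs no word segmentation."""
    s = "".join(text.lower().split())
    return {s[i:i + 2] for i in range(len(s) - 1)}


def similarity(a, b):
    """Jaccard overlap of character bigrams (an assumed, simple measure)."""
    x, y = bigrams(a), bigrams(b)
    return len(x & y) / len(x | y) if x and y else 0.0


def top_n(query, paragraphs, n):
    """Return the n paragraphs most similar to query, best first."""
    return sorted(paragraphs, key=lambda p: similarity(query, p), reverse=True)[:n]


def build_summary(keywords, table, paragraphs, n=3):
    """Fill each template keyword from the table, else from a candidate paragraph."""
    # Candidate paragraphs: the top N most similar to the whole template
    candidates = top_n(" ".join(keywords), paragraphs, n)
    summary = {}
    for kw in keywords:
        if kw in table:
            summary[kw] = table[kw]                     # matched in the table
        elif candidates:
            summary[kw] = top_n(kw, candidates, 1)[0]   # sub-summary fallback
        else:
            summary[kw] = None
    return summary


keywords = ["net profit", "dividend plan"]
table = {"net profit": "1.5 billion"}  # assumed structured-table form
paragraphs = [
    "The board approved the dividend plan for the year",
    "The company relocated its office",
    "Net profit rose on strong sales",
]
summary = build_summary(keywords, table, paragraphs, n=2)
```

Restricting the fallback search to the top N candidates, rather than all paragraphs, mirrors the two-stage selection in the abstract and keeps unrelated paragraphs out of the sub-summaries.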

Description

Technical field

[0001] The invention relates to the field of computer software, and in particular to extracting summary information from individual stock announcements issued by listed companies.

Background technique

[0002] At present, there are many types of individual stock announcements, each type describes different key events, and the number of announcements of each type is large. Investors need to keep abreast of the announcements disclosed by listed companies in order to protect their own interests. However, the announcements are numerous and redundant. Investors only want to understand the core events and data (that is, the summary), rather than spend a lot of time and effort downloading and browsing each announcement.

[0003] The existing technical solution to this problem is to use event frame-based event information extraction, based on a set of regular expressions (expert ru...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/34, G06F17/22, G06F17/27, G06Q40/06
CPC: G06Q40/08, G06F40/151, G06F40/205, G06F16/345
Inventor: 方明, 陈平
Owner: 武汉优品楚鼎科技有限公司