Announcement text key information extraction method and device

A key information extraction method, applied to announcement texts, that addresses the difficulty of extracting key information from such texts quickly and accurately. It reduces the time needed to extract data, improves the efficiency of reading and analysis, and avoids inconsistent extraction standards.
CN109933796A · Active · Publication Date: 2019-06-25 · 厦门商集网络科技有限责任公司

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
厦门商集网络科技有限责任公司
Publication Date
2019-06-25


Abstract

The invention relates to a method for extracting key information from announcement texts, comprising the following steps: converting an announcement text into an HTML file, wherein the HTML file comprises DIV controls and each DIV control represents one row of characters; extracting text information and table information according to the style description of each DIV control, merging adjacent semantically associated rows into paragraphs during extraction, and letting each row that has no semantic association with its adjacent rows form a paragraph on its own, to obtain structured text; establishing a key information form containing keywords; and obtaining key information through feature engineering and writing it into the key information form, completing key information extraction for the announcement text. The method analyzes announcement texts in depth, converts unstructured data into structured text, and extracts key information quickly and accurately. It greatly shortens manual data extraction time, improves the efficiency and accuracy of investment research, and adds value to the analysis process.
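The steps in the abstract can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation. It assumes a flat HTML file (no nested DIVs) in which each DIV carries an inline `style` with a `left` offset. The rule for "semantic association" (same `left` offset and no sentence-ending punctuation on the previous row) and the regex lookup that stands in for feature engineering are both hypothetical simplifications, as are the names `DivRowParser`, `merge_rows`, `fill_form`, and `extract`. Table extraction is omitted.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: one DIV per row, layout given by inline style.
SAMPLE_HTML = """
<div style="left:72px; top:10px">Announcement on the 2018 annual</div>
<div style="left:72px; top:24px">performance forecast of Example Co.</div>
<div style="left:72px; top:52px">Net profit: 120 million yuan.</div>
<div style="left:72px; top:66px">Reporting period: 2018 fiscal year.</div>
"""

# Assumed sentence-ending punctuation (Latin and full-width forms).
TERMINAL = (".", "。", "!", "?", "!", "?")


def parse_style(style):
    """Turn 'left:72px; top:10px' into {'left': '72px', 'top': '10px'}."""
    pairs = (item.split(":", 1) for item in style.split(";") if ":" in item)
    return {key.strip().lower(): value.strip() for key, value in pairs}


class DivRowParser(HTMLParser):
    """Collect (style, text) for each DIV, treating one DIV as one row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # list of (style_dict, text) in document order
        self._style = None  # style of the DIV currently open, if any
        self._chunks = []   # text fragments seen inside the open DIV

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self._style = parse_style(dict(attrs).get("style") or "")
            self._chunks = []

    def handle_data(self, data):
        if self._style is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._style is not None:
            text = "".join(self._chunks).strip()
            if text:
                self.rows.append((self._style, text))
            self._style = None


def merge_rows(rows):
    """Merge adjacent rows into paragraphs using the assumed heuristic."""
    paragraphs = []
    prev_left = None
    for style, text in rows:
        left = style.get("left")
        continues = (
            bool(paragraphs)
            and left == prev_left
            and not paragraphs[-1].endswith(TERMINAL)
        )
        if continues:
            paragraphs[-1] += " " + text
        else:
            paragraphs.append(text)  # row stands alone as a new paragraph
        prev_left = left
    return paragraphs


def fill_form(paragraphs, keywords):
    """Fill a keyword form from 'keyword: value' patterns (a stand-in only)."""
    form = {keyword: None for keyword in keywords}
    for paragraph in paragraphs:
        for keyword in keywords:
            if form[keyword] is not None:
                continue
            pattern = re.escape(keyword) + r"\s*[::]\s*([^.;。;]+)"
            match = re.search(pattern, paragraph)
            if match:
                form[keyword] = match.group(1).strip()
    return form


def extract(html, keywords):
    """Run the three steps: parse rows, merge paragraphs, fill the form."""
    parser = DivRowParser()
    parser.feed(html)
    return fill_form(merge_rows(parser.rows), keywords)


if __name__ == "__main__":
    print(extract(SAMPLE_HTML, ["Net profit", "Reporting period"]))
```

In this sketch the first two rows merge into one paragraph because the first does not end a sentence, while the last two rows each stand alone. A real system would replace both the merge heuristic and the regex lookup with whatever association rules and features the full description specifies.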

Description

Technical field

[0001] The invention relates to a method and device for extracting key information from an announcement text, belonging to the field of natural language processing.

Background

[0002] An announcement text, taking the announcement of a listed company as an example, is a document in which the listed company publishes relevant company information to the public through a designated platform in accordance with the requirements of the China Securities Regulatory Commission. In stock market investment research, the announcements and disclosures of listed companies are an important reference for investors. For professional institutional researchers in particular, mining important information from announcements is a necessary part of daily investment research. However, most announcement texts are written in unstructured natural language, and their description patterns and phrasing vary greatly, making manual processing difficult, and some...
