Announcement text key information extraction method and device
A key information extraction technique applied to announcement texts. It addresses the problem that prior approaches cannot extract key information from announcement texts quickly and accurately and do not disclose how the information is output. Its effects are to improve the efficiency of reading and analysis, avoid inconsistent standards, and reduce the time needed to extract data.
Examples
Embodiment 1
[0034] Referring to Figure 2, a method for extracting key information from an announcement text comprises the following steps: converting the announcement text into an HTML file, wherein the HTML file includes DIV controls and each DIV control corresponds to one line of text; extracting text information and table information according to the description style of the DIV controls and, during extraction, merging adjacent semantically related lines into paragraphs, while each line that has no semantic relationship with its adjacent lines forms a paragraph of its own, so as to obtain structured text; establishing a key information form containing keywords (as shown in Figure 8); and obtaining the key information through feature engineering and writing it into the key information form, which completes the key information extraction of the announcement text, as shown in Figure 9. The invention discloses a method and device for extracting ...
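The line-merging step can be illustrated with a minimal sketch. The source does not state how semantic relatedness between lines is decided, so the rule below is an assumption made only for illustration: a line continues the previous one when both share a font size, the line is not indented further than the previous one, and the previous line does not end with a sentence terminator. The names `DivLineParser`, `is_continuation`, and `extract_paragraphs`, and the `left` and `font-size` style keys, are hypothetical; table extraction is not covered.

```python
import re
from html.parser import HTMLParser


def parse_style(style):
    """Turn 'left:72px; font-size:12px' into {'left': '72px', 'font-size': '12px'}."""
    out = {}
    for part in style.split(";"):
        if ":" in part:
            key, value = part.split(":", 1)
            out[key.strip().lower()] = value.strip()
    return out


def px(style, key):
    """Read a numeric style value such as '72px'; missing values count as 0."""
    match = re.match(r"-?\d+(?:\.\d+)?", style.get(key, ""))
    return float(match.group()) if match else 0.0


class DivLineParser(HTMLParser):
    """Collect (style, text) per DIV, assuming one DIV per line of text."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._style = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self._style = parse_style(dict(attrs).get("style") or "")
            self._buf = []

    def handle_data(self, data):
        if self._style is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._style is not None:
            text = "".join(self._buf).strip()
            if text:
                self.lines.append((self._style, text))
            self._style = None


def is_continuation(prev, cur):
    """Assumed heuristic standing in for the 'semantically related' test."""
    (prev_style, prev_text), (cur_style, _) = prev, cur
    same_font = px(prev_style, "font-size") == px(cur_style, "font-size")
    not_indented = px(cur_style, "left") <= px(prev_style, "left")
    open_sentence = not prev_text.endswith((".", "。", ":", "："))
    return same_font and not_indented and open_sentence


def extract_paragraphs(html):
    """Merge related adjacent lines; unrelated lines become their own paragraph."""
    parser = DivLineParser()
    parser.feed(html)
    parser.close()
    paragraphs = []
    for i, line in enumerate(parser.lines):
        if i > 0 and is_continuation(parser.lines[i - 1], line):
            paragraphs[-1] += " " + line[1]
        else:
            paragraphs.append(line[1])
    return paragraphs


SAMPLE = (
    '<div style="left:72px; font-size:16px">Announcement on Share Pledge</div>'
    '<div style="left:96px; font-size:12px">The company received a notice from its</div>'
    '<div style="left:72px; font-size:12px">controlling shareholder on 1 March.</div>'
    '<div style="left:96px; font-size:12px">The pledge covers 1,000,000 shares.</div>'
)

for paragraph in extract_paragraphs(SAMPLE):
    print(paragraph)
```

On the made-up `SAMPLE`, the title stays a paragraph of its own, the two body lines that form one sentence are merged, and the final indented line starts a new paragraph.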
Embodiment 2
[0047] A device for extracting key information from an announcement text comprises a memory and a processor. The memory stores instructions that are suitable for being loaded by the processor to perform the following steps: converting the announcement text into an HTML file, wherein the HTML file contains DIV controls and each DIV control corresponds to one line of text; extracting text information and table information according to the description style of the DIV controls and, during extraction, merging adjacent semantically related lines into paragraphs, while each line that has no semantic relationship with its adjacent lines forms a paragraph of its own, so as to obtain structured text; establishing a key information form containing keywords; and obtaining the key information through feature engineering and writing the key information into the key information form.
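The form-filling step can be sketched as follows. The source does not describe which features the feature engineering uses, so this example swaps in plain regular-expression rules, one per keyword, as a stand-in. The keywords `pledgor`, `shares`, and `date`, the patterns, and the function `fill_form` are all hypothetical and chosen only to make the example runnable.

```python
import re

# Hypothetical key information form: keyword -> pattern whose first group is the value.
# These hand-written patterns are an assumption, not the patent's feature engineering.
FORM_PATTERNS = {
    "pledgor": re.compile(r"shareholder\s+([A-Z][\w ]*?)\s+pledged"),
    "shares": re.compile(r"pledged\s+([\d,]+)\s+shares"),
    "date": re.compile(r"on\s+(\d{4}-\d{2}-\d{2})"),
}


def fill_form(paragraphs, patterns=FORM_PATTERNS):
    """Write the first match for each keyword into the form; unmatched keys stay None."""
    form = {keyword: None for keyword in patterns}
    for paragraph in paragraphs:
        for keyword, pattern in patterns.items():
            if form[keyword] is None:
                match = pattern.search(paragraph)
                if match:
                    form[keyword] = match.group(1)
    return form


# Made-up structured text, one string per paragraph.
paragraphs = [
    "Announcement on Share Pledge",
    "The shareholder Acme Holdings pledged 1,000,000 shares on 2024-03-01.",
]
print(fill_form(paragraphs))
```

Keeping the form as a mapping from keyword to value means a keyword with no match stays visibly empty (`None`) instead of being dropped, which makes gaps in the extracted form easy to spot.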