Method and system for intelligently extracting document structure
A document structure extraction technology, applied in fields such as special data processing applications, instruments, and electrical digital data processing. It addresses the problems that unstructured documents are difficult to process and that non-paragraph-style document fragments cannot be correctly extracted, and it achieves strong flexibility.
Examples
Example 1
[0025] Figure 1 is a flow chart of the method for intelligently extracting document structure according to the first embodiment of the present invention. Referring to Figure 1, the method includes the following steps:
[0026] Step S1: small-sample analysis
[0027] In this step, based on the content and key attributes of each part of a sample of the document to be extracted, an extraction rule is established for each part, together with the corresponding structured keywords and the hierarchical relationships between those keywords. That is, the extraction rule and structured keyword established for each part should reflect the content and/or key attributes of that part.
[0028] The key attributes may include, but are not limited to, font style, paragraph style, text attributes, and title level. The extraction rules can be set according to the text content of each part of the sample, and can also be, but are not limited to...
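As an illustration of Step S1, the following minimal Python sketch shows one way extraction rules, structured keywords, and their hierarchy might be represented. The names `Fragment`, `ExtractionRule`, `RULES`, and `classify`, as well as the example keywords ("chapter", "section", "clause"), are assumptions made for this sketch and do not appear in the source.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Fragment:
    """One document fragment carrying the key attributes (hypothetical shape)."""
    text: str
    font_style: str = "normal"      # e.g. "bold"
    paragraph_style: str = "body"   # e.g. "heading" or "body"
    title_level: int = 0            # 0 means "not a title"


@dataclass
class ExtractionRule:
    """Associates a structured keyword with a predicate over a fragment."""
    keyword: str                         # structured keyword for this part
    parent: Optional[str]                # parent keyword in the hierarchy, if any
    matches: Callable[[Fragment], bool]  # tests content and/or key attributes


# Rules as they might be derived from a small sample; all values are illustrative.
RULES: List[ExtractionRule] = [
    ExtractionRule("chapter", None, lambda f: f.title_level == 1),
    ExtractionRule("section", "chapter", lambda f: f.title_level == 2),
    ExtractionRule(
        "clause",
        "section",
        lambda f: f.paragraph_style == "body" and f.text.startswith("Article"),
    ),
]


def classify(fragment: Fragment) -> Optional[str]:
    """Return the keyword of the first matching rule, or None if none matches."""
    for rule in RULES:
        if rule.matches(fragment):
            return rule.keyword
    return None
```

Here each rule can test either the text content or a key attribute such as title level, and the `parent` field records the hierarchical relationship between structured keywords.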
Example 2
[0045] This embodiment differs from the first embodiment in that the sample or document is first converted into a logical tree as an intermediate result, and a single unified method is then applied to that logical tree, which follows a consistent specification, to structure it. In this way, documents of any format can be structured in a uniform manner.
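The source does not specify how the logical tree is built. As a rough sketch only, assuming each fragment has already been assigned a nesting level and a structured keyword, a stack-based pass over the fragments in document order could produce such a tree. The names `Node` and `build_logical_tree` and the tuple layout `(level, keyword, text)` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class Node:
    """A node of the intermediate logical tree (hypothetical shape)."""
    keyword: str
    text: str
    level: int
    children: List["Node"] = field(default_factory=list)


def build_logical_tree(fragments: Iterable[Tuple[int, str, str]]) -> Node:
    """Build a tree from (level, keyword, text) tuples given in document order.

    Levels are assumed to be positive integers, with 1 as the top level.
    """
    root = Node("document", "", 0)
    stack = [root]  # path from the root to the most recently added node
    for level, keyword, text in fragments:
        # Pop back up until the top of the stack is a valid parent.
        while len(stack) > 1 and stack[-1].level >= level:
            stack.pop()
        node = Node(keyword, text, level)
        stack[-1].children.append(node)
        stack.append(node)
    return root
```

Because the tree has the same shape regardless of the input format, any later structuring step only needs to walk `Node` objects rather than handle each source format separately.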
[0046] Figure 4 is a flow chart of the method for intelligently extracting document structure according to the second embodiment of the present invention. Referring to Figure 4, the method includes the following steps:
[0047] Step S41: small-sample analysis