A document processing method and device
A document processing method and device in the field of document processing technology, classified under instruments, calculation, and character and pattern recognition. It addresses the low efficiency of document data processing, improving efficiency and reducing the amount of calculation.
Examples
Embodiment 1
[0073] This embodiment provides a document processing method. As shown in Figure 1, the method includes steps 100 to 120.
[0074] Step 100, acquiring a feature template for expressing style features of the target document.
[0075] The above feature template includes at least business features.
[0076] The document data processed in this application are local chronicles, ancient books, and other documents with clear style characteristics, and are generated by scanning and recognition. The document data usually records the text of each text block together with the format of that text block, from front to back in the order in which the text blocks appear in the document.
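The document data described above can be pictured as an ordered list of text blocks, each carrying its text and its format. The following is a minimal sketch under stated assumptions: the names `TextBlock`, `font_size`, `indent`, and `load_blocks` are hypothetical and do not come from the application, which does not specify a storage layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class TextBlock:
    """One recognized text block (hypothetical structure)."""
    text: str         # recognized text of the block
    font_size: float  # assumed format attribute
    indent: int       # assumed format attribute: leading indent in character widths


def load_blocks(records: List[Dict[str, Union[str, float, int]]]) -> List[TextBlock]:
    """Build text blocks, preserving the front-to-back order of the records."""
    return [
        TextBlock(
            text=str(r["text"]),
            font_size=float(r["font_size"]),
            indent=int(r["indent"]),
        )
        for r in records
    ]


# Usage: the list order mirrors the order of appearance in the document.
blocks = load_blocks([
    {"text": "Volume One", "font_size": 18.0, "indent": 0},
    {"text": "Geography", "font_size": 12.0, "indent": 2},
])
```

Keeping the blocks in a plain list means that later steps can rely on position alone to recover the reading order.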
[0077] The style features mentioned in the embodiments of this application refer to the writing format of a document and comprise two parts: format features and business features. Among them, format features such as large characters in the top grid, reversed whit...
Embodiment 2
[0155] Correspondingly, this application also discloses a document processing device. As shown in Figure 7, the device includes:
[0156] A feature template acquisition module 710, configured to acquire a feature template for expressing style features of the target document, the feature template including business features;
[0157] A text identification module 720, configured to perform text identification on the text file describing the target document according to the above feature template, and to determine the feature values of the business features of the target document;
[0158] A document information output module 730, configured to output the document information of the target document in a preset format according to the determined feature values of the business features and the above feature template.
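The three modules listed above form a pipeline: acquire a template, identify feature values in the text, and output the result in a preset format. The sketch below illustrates that flow only. It assumes, purely for illustration, that the template maps each business feature name to a regular expression and that the preset format is JSON; the application states neither, and the function names and sample patterns are hypothetical.

```python
import json
import re
from typing import Dict, List


def acquire_feature_template() -> Dict[str, str]:
    """Stand-in for module 710: return a hypothetical template.

    Assumption: each business feature is described by a regex pattern.
    """
    return {
        "title": r"^Volume\s+\S+",
        "year": r"\b\d{4}\b",
    }


def identify_text(lines: List[str], template: Dict[str, str]) -> Dict[str, List[str]]:
    """Stand-in for module 720: collect feature values line by line."""
    values: Dict[str, List[str]] = {name: [] for name in template}
    for line in lines:
        for name, pattern in template.items():
            match = re.search(pattern, line)
            if match:
                values[name].append(match.group(0))
    return values


def output_document_info(values: Dict[str, List[str]], template: Dict[str, str]) -> str:
    """Stand-in for module 730: emit values in template order as JSON."""
    ordered = {name: values.get(name, []) for name in template}
    return json.dumps(ordered, ensure_ascii=False)


# Usage on two sample lines of recognized text.
template = acquire_feature_template()
values = identify_text(["Volume One Geography", "Compiled in 1735"], template)
info = output_document_info(values, template)
```

Passing the template to both the identification and output steps mirrors the description, in which module 730 uses the template as well as the determined feature values.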
[0159] Optionally, as shown in Figure 8, before acquiring the feature template for expressing the style features of the target document, ...