Literature processing method and device
A document processing method and technology, applied in the fields of instruments, character and pattern recognition, computer components, etc., that can solve the problem of low document data processing efficiency.
Active Publication Date: 2019-04-16
HANVON CORP
AI Technical Summary
Problems solved by technology
[0004] The embodiments of the present application provide a document processing method and device that identify and match document data through feature templates, solving the problem of low document data processing efficiency.
Examples
Embodiment 1
[0073] This embodiment provides a document processing method. As shown in Figure 1, the method includes steps 100 to 120.
[0074] Step 100, acquiring a feature template for expressing style features of the target document.
[0075] The feature template includes at least business features.
[0076] The document data processed in this application comes from local chronicles, ancient books, and other documents with clear style characteristics, and is generated by scanning and recognition. The document data generally records the text of each text block and the format of each text block, from front to back, in the order in which the text blocks appear in the document.
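The ordered record of text blocks described in [0076] can be pictured with a minimal Python sketch. The patent text shown here does not specify a storage layout, so the `TextBlock` class, its `font_size` and `indent` fields, and the `load_document` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TextBlock:
    """One recognized text block (hypothetical layout, not from the patent)."""
    text: str         # recognized text of the block
    font_size: float  # assumed format attribute
    indent: int       # assumed format attribute (0 = no indentation)


def load_document(raw_blocks: List[Dict]) -> List[TextBlock]:
    """Build the document data, keeping the blocks in their order of appearance."""
    return [TextBlock(**raw) for raw in raw_blocks]


raw = [
    {"text": "Volume 1 Geography", "font_size": 24.0, "indent": 0},
    {"text": "The county lies east of the river.", "font_size": 12.0, "indent": 2},
]
document = load_document(raw)
```

Keeping the blocks in a plain list preserves the front-to-back order, which later steps can rely on when matching blocks against a template.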
[0077] The style features mentioned in the embodiments of this application refer to the writing format of a document and comprise two parts: format features and business features. Format features include, for example, large characters in the top grid, reversed wh...
Embodiment 2
[0155] Correspondingly, this application also discloses a document processing device. As shown in Figure 7, the device includes:
[0156] A feature template acquisition module 710, configured to acquire a feature template for expressing style features of the target document, where the feature template includes business features;
[0157] A text identification module 720, configured to perform text identification on the text file describing the target document according to the feature template, and to determine the feature values of the business features of the target document;
[0158] A document information output module 730, configured to output the document information of the target document in a preset format according to the determined feature values of the business features and the feature template.
[0159] Optionally, as shown in Figure 8, before acquiring the feature template for expressing the style features of the target document, ...
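The three modules 710, 720, and 730 can be read as a three-stage pipeline. The following Python sketch illustrates that reading under stated assumptions and is not the patented implementation: the matching rule (a minimum font size plus a regular expression), the class and function names, and the sample data are all hypothetical, since the text shown here does not say how a template encodes its features.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TextBlock:
    text: str
    font_size: float  # assumed format attribute


@dataclass
class BusinessFeature:
    name: str             # e.g. "volume_title" (hypothetical)
    min_font_size: float  # assumed format cue
    pattern: str          # assumed regular expression for the block text


@dataclass
class FeatureTemplate:
    business_features: List[BusinessFeature]
    output_fields: List[str]  # assumed "preset format": ordered field names


def acquire_template() -> FeatureTemplate:
    """Role of module 710: return a (here hard-coded) feature template."""
    return FeatureTemplate(
        business_features=[
            BusinessFeature("volume_title", 20.0, r"^Volume \d+"),
            BusinessFeature("body", 0.0, r".+"),
        ],
        output_fields=["volume_title", "body"],
    )


def identify_text(blocks: List[TextBlock], template: FeatureTemplate) -> Dict[str, List[str]]:
    """Role of module 720: assign each block to the first feature it matches."""
    values: Dict[str, List[str]] = {f.name: [] for f in template.business_features}
    for block in blocks:
        for feature in template.business_features:
            if block.font_size >= feature.min_font_size and re.search(feature.pattern, block.text):
                values[feature.name].append(block.text)
                break  # a block contributes to at most one feature
    return values


def output_document_info(values: Dict[str, List[str]], template: FeatureTemplate) -> Dict[str, List[str]]:
    """Role of module 730: emit the feature values in the template's field order."""
    return {name: values.get(name, []) for name in template.output_fields}


blocks = [
    TextBlock("Volume 1 Geography", 24.0),
    TextBlock("The county lies east of the river.", 12.0),
]
template = acquire_template()
info = output_document_info(identify_text(blocks, template), template)
```

In this sketch each block is tested only against cheap format and pattern rules, which mirrors the stated motivation of avoiding semantic recognition over large amounts of data.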
Abstract
The invention provides a document processing method, belonging to the field of document processing, that solves the problem of low document data processing efficiency in the prior art. The method comprises the following steps: obtaining a feature template for expressing style features of a target document; performing text recognition on the text file describing the target document according to the feature template, and determining feature values of the business features of the target document; and outputting document information of the target document in a preset format according to the determined feature values of the business features and the feature template. Because the document processing method disclosed by the embodiments of the invention extracts document data based on the feature template, semantic recognition of a large amount of data is not needed, the amount of computation is effectively reduced, and the efficiency of document data extraction is improved.
Description
Technical field
[0001] The present application relates to the field of document processing, and in particular to a document processing method and device.
Background
[0002] Ancient books and documents are an important basis for studying the natural, social, political, economic, cultural, and other aspects of a certain period and/or a certain region. For example, local chronicles are documents that comprehensively record the natural, social, political, economic, cultural, and other aspects of a certain region in a certain period. To facilitate research and access to the information they contain, structuring ancient documents is particularly important. In the process of structuring ancient documents, the usual practice is to first obtain the words in the fragmented documents through scanning and recognition, and then, through semantic recognition of those words, to classify or index the content of the fragmented documents. ...