Document layout recognition method and device, electronic equipment and storage medium

A document layout recognition method, applied in neural learning methods, character and pattern recognition, instruments, etc. It addresses the problem of inaccurate document layout recognition results and yields more accurate and more comprehensive recognition results.

Pending Publication Date: 2022-01-07
SHANGHAI GOLDWAY INTELLIGENT TRANSPORTATION SYST CO LTD
0 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

[0004] Because current document layout analysis methods recognize at the text-line level, the positions and categories of the identified elements are also coarse-grained, text-line-level results, so the document layout recognition result is inaccurate.

Examples

Embodiment approach

[0156] As an implementation of an embodiment of the present invention, the above-mentioned feature extraction unit may include:

[0157] A filling subunit, configured to fill the text content into the document to be recognized according to the text position, to obtain a semantic feature map;

[0158] A first extraction subunit, configured to input the image to be recognized and the semantic feature map into a pre-established first convolutional neural network, and obtain the visual features and semantic features output by the first convolutional neural network; or,

[0159] A second extraction subunit, configured to input the image to be recognized into a pre-established second convolutional neural network and obtain the visual features output by the second convolutional neural network, and to input the semantic feature map into a pre-established third convolutional neural network and obtain the semantic features output by the third c...
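The filling step in [0157] can be illustrated with a minimal sketch. It assumes each recognized character arrives with a pixel-aligned bounding box, and it writes an integer character id into every pixel the box covers. The function name `build_semantic_map`, the box format, the `vocab` mapping, and the id conventions (0 for background, 1 for unknown characters) are all hypothetical choices for illustration; the excerpt does not say how the patent encodes the text content, and a real system would more likely store an embedding vector per position.

```python
from typing import Dict, List, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def build_semantic_map(
    chars: List[Tuple[str, Box]],
    height: int,
    width: int,
    vocab: Dict[str, int],
) -> List[List[int]]:
    """Return a height x width grid with a character id at each covered pixel.

    Assumed conventions (not from the patent): 0 marks background and
    1 marks a character that is missing from vocab.
    """
    grid = [[0] * width for _ in range(height)]
    for ch, (left, top, right, bottom) in chars:
        idx = vocab.get(ch, 1)
        # Clip the box to the page so out-of-range boxes cannot raise.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                grid[y][x] = idx
    return grid


# Toy input: three characters on the top two pixel rows of a 3 x 6 page.
vocab = {"a": 2, "=": 3}
chars = [("a", (0, 0, 2, 2)), ("=", (2, 0, 4, 2)), ("b", (4, 0, 6, 2))]
semantic_map = build_semantic_map(chars, height=3, width=6, vocab=vocab)
```

Because the grid has the same spatial layout as the page image, it can be fed to a convolutional network alongside the image, which is the property the two extraction variants in [0158] and [0159] rely on.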

Abstract

An embodiment of the invention provides a document layout recognition method and device, electronic equipment, and a storage medium. The method comprises: obtaining a document to be recognized; extracting visual features and semantic features of the document, where the visual features characterize the overall typesetting of the image corresponding to the document, and the semantic features comprise at least character-level features and text-line-level features; fusing the visual features with the semantic features to obtain multi-modal document features; and, based on the multi-modal document features, identifying the position and category of each element in the document. Character-level semantic features make it possible to extract text-level elements such as formulas embedded in the text, while the visual features capture visual elements such as graphs. The multi-modal document features therefore cover both visual elements such as graphs and character-level elements within text lines such as formulas, so the document layout recognition result is more comprehensive and its accuracy is improved.
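The fusion step can be pictured with a small sketch. The abstract does not state how the two feature sets are combined, so the channel-wise concatenation below is an assumption chosen for simplicity, not the patent's method. The name `fuse_features` and the nested-list layout `[height][width][channels]` are likewise hypothetical.

```python
from typing import List

# Feature map laid out as [height][width][channels].
FeatureMap = List[List[List[float]]]


def fuse_features(visual: FeatureMap, semantic: FeatureMap) -> FeatureMap:
    """Concatenate visual and semantic channels at every spatial position.

    Assumption: both maps share the same height and width.
    """
    same_rows = len(visual) == len(semantic)
    if not same_rows or any(len(v) != len(s) for v, s in zip(visual, semantic)):
        raise ValueError("feature maps must share the same spatial size")
    return [
        [v_cell + s_cell for v_cell, s_cell in zip(v_row, s_row)]
        for v_row, s_row in zip(visual, semantic)
    ]


# Toy input: a 1 x 2 grid with 2 visual channels and 1 semantic channel.
visual = [[[0.1, 0.2], [0.3, 0.4]]]
semantic = [[[1.0], [2.0]]]
fused = fuse_features(visual, semantic)
```

Each position of `fused` then carries both kinds of evidence, which is what lets a downstream detector assign a position and category to graphs as well as to formulas embedded in text lines.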

Description

Technical field

[0001] The present invention relates to the technical field of document processing, and in particular to a document layout recognition method and device, electronic equipment, and a storage medium.

Background

[0002] Document layout analysis refers to analyzing and recognizing a document to obtain the position and category of the elements it contains. Document layout analysis technology is widely used in document understanding, document compression, document digitization, and other application scenarios, and has broad application value. Elements in a document can include figures, tables, text, headings, and more.

[0003] In the current document layout analysis method, the text information and image information of multiple text lines in the document are first extracted, and then the text information and image information of the multiple text lines are encoded and decoded to obtain the position of the element in the documen...

Application Information

IPC(8): G06K9/00, G06K9/62, G06N3/04, G06N3/08, G06F40/30
CPC: G06F40/30, G06N3/08, G06N3/045, G06F18/253
Inventor 张鹏
Owner SHANGHAI GOLDWAY INTELLIGENT TRANSPORTATION SYST CO LTD