Document element identification method and device, equipment and storage medium

A document element recognition technology, applied in character and pattern recognition, instruments, biological neural network models, etc. It addresses the low recognition accuracy of document elements and the poor generalization of recognition methods: increasing the amount of training data improves the recognition ability and the recognition accuracy.

Active Publication Date: 2022-04-12
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] The embodiments of the present application provide a document element recognition method, device, equipment and storage medium to solve the problems of poor generalization of the recognition method and low recognition accuracy of document elements.



Embodiment Construction

[0071] To make the purpose, technical solutions and advantages of the embodiments of the application clearer, the technical solutions of the application are described clearly and completely below with reference to the accompanying drawings in the embodiments of the application. The described embodiments are some, but not all, of the embodiments of the technical solution. All other embodiments obtained by persons of ordinary skill in the art from the embodiments described in the application documents, without creative effort, fall within the protection scope of the technical solutions of the present application.

[0072] Some of the terms used in the embodiments of the present application are explained below to aid the understanding of those skilled in the art.

[0073] 1. Artificial Intelligence (AI):

[0074] Artificial intelligence is the theory, method, technology and application system that uses digital computers or machines co...


Abstract

The invention relates to the field of computers, in particular to the field of artificial intelligence, and provides a document element recognition method, device, equipment and storage medium. The embodiments of the invention can be applied to various scenarios such as cloud technology, artificial intelligence, intelligent traffic and assisted driving. In the method, new element data is filled into the element display areas of a corresponding document image template to obtain a large number of document images that resemble real ones, which addresses the poor generalization of the model. In each round of training, two document images from the training sample set that have the same document content but different image sizes are used for multi-scale prediction. This combines the wide field of view of the small scale with the boundary localization of the large scale and yields a predicted recognition result for the first document image. The model parameters are then adjusted using the predicted recognition result and the corresponding processed labeling result, which addresses inaccurate boundary recognition and improves the document element recognition accuracy of the model.
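The two ideas in the abstract, template filling and multi-scale prediction, can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the template layout (`regions`, `type`, `box`), the helper names, and in particular the fusion rule. The abstract does not say how the two scales are combined, so plain averaging after nearest-neighbour upsampling is used only as a stand-in.

```python
import random

def fill_template(template, element_pool, rng):
    """Fill every element display area of a template with new element data.

    The boxes stay where the template puts them, so each synthetic
    sample keeps its region labels without manual annotation.
    """
    sample = []
    for region in template["regions"]:
        content = rng.choice(element_pool[region["type"]])
        sample.append({"type": region["type"], "box": region["box"], "content": content})
    return sample

def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def fuse_multiscale(small_scores, large_scores):
    """Average per-pixel scores from a small and a large rendering of one page.

    Assumption: averaging is a placeholder for whatever fusion the model uses.
    """
    factor = len(large_scores) // len(small_scores)
    up = upsample_nearest(small_scores, factor)
    return [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(up, large_scores)]

# Hypothetical template with two element display areas (x0, y0, x1, y1).
template = {"regions": [
    {"type": "title", "box": (0, 0, 100, 20)},
    {"type": "paragraph", "box": (0, 30, 100, 90)},
]}
pool = {
    "title": ["Quarterly report", "Meeting notes"],
    "paragraph": ["First body text.", "Second body text."],
}
sample = fill_template(template, pool, random.Random(0))

# Toy score maps for the same page at two sizes (1x2 and 2x4).
small = [[0.2, 0.8]]
large = [[0.0, 0.4, 0.6, 1.0], [0.2, 0.2, 1.0, 1.0]]
fused = fuse_multiscale(small, large)
```

Because the template fixes the box positions, drawing different contents from the pool produces many labeled samples from a single layout. The fused map has the resolution of the large input, so boundaries are still placed on the finer grid.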

Description

Technical field

[0001] The present application relates to the field of computers, in particular to the field of artificial intelligence, and provides a document element identification method, device, equipment and storage medium.

Background technique

[0002] In daily life, and especially in office work, downloaded files are often in image format. To obtain the files in text format, a document format conversion tool is needed to convert the images to text.

[0003] At present, the element display area of each document element on the image is identified by the document element recognition algorithm of the document format conversion tool, and the text content in the corresponding area is then recognized by methods such as Optical Character Recognition (OCR) and natural language understanding.

[0004] However, due to the low accuracy of the document element extraction method, when proc...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06V30/40, G06V30/148, G06V10/774, G06V10/80, G06V10/82, G06K9/62, G06N3/04
Inventor: 徐士戈, 胡益清, 吴云飞, 刘兵, 姜德强
Owner: TENCENT TECH (SHENZHEN) CO LTD