Document element identification method, device, equipment and storage medium

A document element recognition technology, applied in character and pattern recognition, instruments, biological neural network models, etc. It addresses the low recognition accuracy of document elements and the poor generalization of recognition methods, with the effect of increasing the amount of training data and improving recognition accuracy and recognition capability.

Active Publication Date: 2022-06-21
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] The embodiments of the present application provide a document element recognition method, apparatus, device, and storage medium to solve the problems of poor generalization of the recognition method and low recognition accuracy of document elements.

Detailed Description of Embodiments

[0071] To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are some, but not all, embodiments of the technical solutions of the present application. All other embodiments obtained by persons of ordinary skill in the art without creative work, based on the embodiments recorded in the present application, fall within the protection scope of the technical solutions of the present application.

[0072] Some terms in the embodiments of the present application are explained below to facilitate understanding by those skilled in the art.

[0073] 1. Artificial Intelligence (AI):

[0074] Artificial intelligence is a theory, method, technology and application system that us...

Abstract

This application relates to the field of computers, in particular the field of artificial intelligence, and provides a method, apparatus, device, and storage medium for identifying document elements. Embodiments of this application can be applied to scenarios such as cloud technology, artificial intelligence, intelligent transportation, and assisted driving. In the method, a large number of document images similar to real documents are obtained by filling the element display areas of a corresponding document image template with new element data, which addresses the poor generalization of the model. In each round of training, two document images from the training sample set that have the same document content but different image sizes are used for multi-scale prediction, combining the wide field of view of the small scale with the boundary localization of the large scale to obtain the predicted recognition result of the first document image. The predicted recognition result and the corresponding processed labeling result are then used to adjust the model parameters, which addresses inaccurate boundary recognition and improves the accuracy of the model's document element recognition.
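The two ideas in the abstract, template filling and multi-scale prediction, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the names `fill_template`, `upsample_nearest`, and `fuse_multiscale`, the dictionary layout of a template, and in particular the fusion rule. The abstract does not say how the two scales are combined, so the sketch simply takes a weighted average of per-pixel class scores after resizing the small-scale output to the large-scale grid.

```python
import random


def fill_template(template, element_pool, rng):
    """Build one synthetic sample by filling every element display area
    of a template with new element data of the matching type.

    The template layout ({"areas": [{"type", "box"}, ...]}) is hypothetical.
    """
    sample = []
    for area in template["areas"]:
        content = rng.choice(element_pool[area["type"]])
        sample.append({"type": area["type"], "box": area["box"], "content": content})
    return sample


def upsample_nearest(grid, out_h, out_w):
    """Resize a 2-D grid of per-class score lists by nearest-neighbour sampling."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse_multiscale(scores_large, scores_small, weight_large=0.5):
    """Fuse per-pixel class scores predicted at two image sizes.

    scores_large: H x W grid of per-class scores from the larger image.
    scores_small: h x w grid from the smaller image of the same document.
    Weighted averaging is an assumed fusion rule, not the patent's.
    Returns an H x W grid of predicted class indices.
    """
    out_h, out_w = len(scores_large), len(scores_large[0])
    small_up = upsample_nearest(scores_small, out_h, out_w)
    labels = []
    for row_large, row_small in zip(scores_large, small_up):
        label_row = []
        for px_large, px_small in zip(row_large, row_small):
            fused = [
                weight_large * a + (1.0 - weight_large) * b
                for a, b in zip(px_large, px_small)
            ]
            # Pick the class with the highest fused score.
            label_row.append(max(range(len(fused)), key=fused.__getitem__))
        labels.append(label_row)
    return labels
```

In this sketch, `fill_template` keeps each area's type and box unchanged, so the labeling of a synthetic sample comes for free from the template. `fuse_multiscale` lets the coarse prediction contribute context while the fine prediction contributes boundary detail. A trained model would supply the score grids; here they are plain nested lists.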

Description

Technical field

[0001] The present application relates to the field of computers, in particular to the field of artificial intelligence, and provides a method, apparatus, device, and storage medium for identifying document elements.

Background

[0002] In daily life, especially in work and office settings, downloaded files are often in image format. To obtain files in text format, a document format conversion tool is needed to convert images to text.

[0003] At present, the document element recognition algorithm of a document format conversion tool identifies the element display area of each document element on the image, and then optical character recognition (OCR), natural language understanding, and other methods are used to recognize the text content of the corresponding area.

[0004] However, due to the low precision of the document element extraction method, when process...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06V30/40, G06V30/148, G06V10/774, G06V10/80, G06V10/82, G06K9/62, G06N3/04
Inventor: 徐士戈, 胡益清, 吴云飞, 刘兵, 姜德强
Owner TENCENT TECH (SHENZHEN) CO LTD