OCR Layout Analysis Method Based on Gini Impurity
A layout analysis technique based on Gini impurity, applied in fields such as instrumentation, computing, and electrical digital data processing. It addresses the inability to determine the typesetting information of text in an image, and improves accuracy.
Embodiment Construction
[0038] OCR is a common application in the field of image processing. Current OCR models based on deep learning can accurately locate and recognize text in images. However, how to obtain the typesetting information of the text, and then extract the effective text information, is still an open problem in OCR recognition.
[0039] The invention proposes an OCR layout analysis method based on Gini impurity. The method finds the dividing line with the smallest Gini impurity in the image, then judges the typesetting direction of the text from the position and direction of that dividing line. Invalid recognition results can be filtered out based on the typesetting direction, so that the final, effective OCR text information is obtained.
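The search for a dividing line with the smallest Gini impurity can be illustrated with a minimal sketch. The excerpt does not state what is being classified or how candidate lines are enumerated, so the following are assumptions made only for illustration: the image is reduced to a binary mask (1 = text pixel, 0 = background), candidate lines are axis-aligned, and each candidate is scored by the size-weighted Gini impurity of the two regions it creates, as in a decision-tree split. The names `gini` and `best_dividing_line` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple


def gini(pos: int, total: int) -> float:
    """Gini impurity of a two-class set: 1 - p^2 - (1 - p)^2."""
    if total == 0:
        return 0.0
    p = pos / total
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def best_dividing_line(mask: List[List[int]]) -> Tuple[str, int, float]:
    """Return (orientation, index, score) of the lowest-impurity split.

    Assumption: mask is a rectangular binary grid (1 = text, 0 = background).
    A 'horizontal' line at index i separates rows [0, i) from rows [i, H);
    a 'vertical' line at index j separates columns [0, j) from [j, W).
    If no candidate line exists (a 1x1 grid), the score stays at infinity.
    """
    h, w = len(mask), len(mask[0])
    row_sums = [sum(row) for row in mask]
    col_sums = [sum(mask[y][x] for y in range(h)) for x in range(w)]
    total_pos = sum(row_sums)

    best: Tuple[str, int, float] = ("horizontal", 0, float("inf"))
    candidates = (
        ("horizontal", row_sums, h, w),  # lines between rows
        ("vertical", col_sums, w, h),    # lines between columns
    )
    for orientation, sums, n_lines, cells_per_line in candidates:
        first_pos = 0  # text pixels on the first side of the line
        for i in range(1, n_lines):
            first_pos += sums[i - 1]
            first_n = i * cells_per_line
            second_n = (n_lines - i) * cells_per_line
            # Size-weighted impurity of the two regions.
            score = (
                first_n * gini(first_pos, first_n)
                + second_n * gini(total_pos - first_pos, second_n)
            ) / (first_n + second_n)
            if score < best[2]:
                best = (orientation, i, score)
    return best


if __name__ == "__main__":
    # Text fills the top half, background the bottom half.
    demo = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
    print(best_dividing_line(demo))
```

In this toy mask the lowest-impurity line is horizontal and sits between the text rows and the background rows. How the patent maps the line's position and direction to a typesetting direction is not described in this excerpt, so that step is left out of the sketch.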
[0040] The technical solutions of the present invention are further described below with reference to specific embodiments and the accompanying drawings. Figure 1 shows a schematic flow chart of the OCR l...


