Segmentation-based multi-scale feature pyramid text detection method

A multi-scale feature pyramid technology, applied in character and pattern recognition, instruments, computer components, etc. It addresses poor detection of small targets, poor separation of the boundaries between adjacent text instances, and poor localization of arbitrarily shaped text, and achieves reduced computing overhead, high accuracy, and good detection performance.

Active Publication Date: 2020-07-28
SOUTH CHINA UNIV OF TECH +1

AI Technical Summary

Problems solved by technology

[0006] Anchor-based methods struggle because scene text, unlike general objects with roughly fixed aspect ratios, varies widely in size; this makes the network insensitive to text size and lowers accuracy. In addition, most existing anchor-based text detectors are built on quadrilaterals or rotated rectangles, so they cannot localize text of arbitrary shape well.
Methods based on pixel segmentation are easily limited by the receptive field, which hurts small-target detection, and they cannot clearly separate the boundaries of text instances that lie close together.
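
As background for the receptive-field limit mentioned above, the following minimal sketch computes the theoretical receptive field of a stack of convolution layers using the standard recurrence (each layer adds `(kernel - 1) * jump` input pixels, and the jump grows by the stride). The function name and the layer configurations are illustrative assumptions, not values taken from the patent.

```python
def receptive_field(layers):
    """Theoretical receptive field, in input pixels, of the last layer.

    layers: list of (kernel_size, stride) tuples, first layer first.
    The configurations used below are illustrative, not from the patent.
    """
    rf, jump = 1, 1  # a single input pixel sees itself; spacing of 1 pixel
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # widen by the kernel extent at the current spacing
        jump *= stride             # later layers step over more input pixels
    return rf


three_convs = [(3, 1)] * 3        # three 3x3 convolutions, stride 1
with_downsampling = [(3, 2)] * 3  # three 3x3 convolutions, stride 2

print(receptive_field(three_convs))        # 7
print(receptive_field(with_downsampling))  # 15
```

The comparison shows why stacking a few stride-1 layers sees only a small window, while strided (downsampling) stages enlarge the window quickly at the cost of spatial resolution.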

Examples

Embodiment

[0051] A segmentation-based multi-scale feature pyramid text detection method. The overall framework of the network is shown in Figure 1, and the method mainly includes the following steps:

[0052] S1: Acquire data. In this embodiment, public text detection datasets widely used in academia, including ICDAR2015, CTW1500, and RCTW17, are used for training and testing. The ICDAR2015 dataset contains 1000 training images and 500 test images; the CTW1500 dataset contains 1000 training images and 500 test images; the RCTW17 dataset contains 8034 training images and 4229 test images.

[0053] S2: Build a pyramid feature extraction model (PEFM), whose network structure is shown in Figure 2, and extract features from the acquired data. This specifically includes the following steps:

[0054] S2.1: The input image passes through several convolution layers of the backbone to extract features, yielding a feature pyramid;

[0055] In this embodiment, the bac...
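
The idea behind step S2.1, a stack of progressively lower-resolution feature maps, can be sketched as follows. This is only a shape-level illustration under stated assumptions: 2x2 average pooling stands in for the learned convolution stages of the backbone, and the helper names `avg_pool_2x2` and `build_pyramid` are hypothetical, not part of the patent.

```python
def avg_pool_2x2(grid):
    """Halve each spatial dimension by averaging non-overlapping 2x2 blocks.

    Assumption: this pooling is a stand-in for a learned backbone stage.
    """
    h, w = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            (grid[2 * r][2 * c] + grid[2 * r][2 * c + 1]
             + grid[2 * r + 1][2 * c] + grid[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_pyramid(image, levels):
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(avg_pool_2x2(pyramid[-1]))
    return pyramid


# Toy 8x8 single-channel "image" whose pixel value is its raster index.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, levels=3)
shapes = [(len(level), len(level[0])) for level in pyramid]
print(shapes)  # [(8, 8), (4, 4), (2, 2)]
```

Coarser levels summarize larger image regions per cell, which is what lets a pyramid represent both small and large text.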

Abstract

The invention discloses a segmentation-based multi-scale feature pyramid text detection method. The method comprises: acquiring data; constructing a pyramid feature extraction model and extracting features from the acquired data; resampling the input data to obtain input images at different scales, feeding each of them into the pyramid feature extraction model to extract text features, fusing and processing the text features of the differently scaled input images through a multi-scale detection network to obtain a feature map, and making predictions on the feature map; and processing the prediction result to obtain the contour boundary line of each text region. The method is highly robust, can be applied directly to detecting text of any shape in natural scenes, and achieves high accuracy, recall, and F-measure.
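
The multi-scale flow described in the abstract (resample the input, score each scale, bring the results back to one resolution, fuse, then threshold) can be sketched roughly as below. Everything here is an assumption made for illustration: `toy_text_score` is a hypothetical placeholder for the pyramid feature extraction model, plain averaging replaces the patent's multi-scale detection network, and the scales and threshold are arbitrary.

```python
def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to out_h x out_w."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def toy_text_score(grid):
    """Hypothetical stand-in for the feature extractor: pixel value = text score."""
    return [row[:] for row in grid]


def multi_scale_predict(image, scales=(0.5, 1.0, 2.0), threshold=0.5):
    """Average per-scale score maps at the input resolution, then binarize."""
    h, w = len(image), len(image[0])
    fused = [[0.0] * w for _ in range(h)]
    for s in scales:
        sh, sw = max(1, int(h * s)), max(1, int(w * s))
        scaled = resize_nearest(image, sh, sw)   # resampled input
        score = toy_text_score(scaled)           # per-scale prediction
        restored = resize_nearest(score, h, w)   # back to input size
        for r in range(h):
            for c in range(w):
                fused[r][c] += restored[r][c] / len(scales)
    return [[1 if v >= threshold else 0 for v in row] for row in fused]


# Toy 4x4 input with a 2x2 "text" block in the centre.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
mask = multi_scale_predict(image)
for row in mask:
    print(row)
```

In this toy case the half-resolution pass shifts the block slightly, but the fused and thresholded mask still recovers the original 2x2 region, which illustrates how combining scales can smooth out the errors of any single scale.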

Description

Technical field

[0001] The invention belongs to the field of image text analysis, and in particular relates to a segmentation-based multi-scale feature pyramid text detection method.

Background technique

[0002] With the development of computer vision technology, image understanding is being applied more and more widely. As an important part of images, text carries rich semantic information and is key to image understanding. Accurate text detection is the first step in extracting key information from images. Text detection in natural scene images faces many challenges because of diverse backgrounds and uncertain size and orientation: (1) diversity of text formats and text line arrangements; (2) diversity of text orientations; (3) diversity of text sizes and scales; (4) diversity of text backgrounds.

[0003] There are currently two main approaches to text box detection using deep learning:

[0004] (1) Using the anchor-based...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/32, G06K9/46, G06K9/62
CPC: G06V10/255, G06V20/62, G06V10/44, G06F18/24, G06F18/253
Inventor: 高学, 韩思怡
Owner: SOUTH CHINA UNIV OF TECH