Semantic segmentation-based method for detecting scene text of arbitrary shape

A text detection technology for text of arbitrary shape, applied in the field of computer vision. It addresses the inability of existing methods to accurately locate text of arbitrary shape and their low detection accuracy, and achieves fast speed, strong generalization ability, and high detection accuracy.

Pending Publication Date: 2020-08-18
FOSHAN NANHAI GUANGDONG TECH UNIV CNC EQUIP COOP INNOVATION INST +1

AI Technical Summary

Problems solved by technology

Widely used scene text detection methods currently face two major difficulties. First, most existing methods use quadrilateral bounding boxes, which cannot accurately locate text of arbitrary shape. Second, in many scenes the text lines are very close to each other, which lowers detection accuracy and causes adjacent text lines to be recognized as a single line.
Segmentation-based methods traditionally alleviate the first difficulty well but often fail to solve the second.

Examples

Embodiment approach

[0046] In a preferred embodiment of the present invention, the scene text detection network model in step S1 is constructed as follows:

[0047] S101: use a feature pyramid network for feature extraction and multi-feature fusion. Referring to Figure 2, the feature pyramid network is built on a deep residual convolutional neural network and consists of a bottom-up pathway, a top-down pathway, and lateral connections. The feature pyramid network extracts low-level, high-resolution features and high-level, semantically rich features from the input dataset images and fuses them. First, the training dataset images are fed into the bottom-up structure of the feature pyramid network, that is, the forward pass of the network. During the forward pass, the size of the feature map changes after some layers and stays the same after others. The convolutional...
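
The top-down fusion named in S101 can be illustrated with a minimal sketch. It assumes the standard feature pyramid design: the coarser map is upsampled 2x by nearest neighbour and added element-wise to the lateral feature of the next finer level. The source text is truncated before it gives these details, so this is not confirmed as the patent's exact configuration. For brevity the maps are single-channel nested lists, the lateral 1x1 convolution is treated as the identity, and all function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D grid (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def add_maps(a, b):
    """Element-wise sum of two grids of identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(bottom_up):
    """Fuse bottom-up maps, ordered from highest to lowest resolution.

    Each map is assumed to be half the height and width of the previous one.
    The lateral connection is the identity here (an assumption of this sketch).
    """
    fused = [bottom_up[-1]]  # the coarsest level is passed through unchanged
    for lateral in reversed(bottom_up[:-1]):
        fused.insert(0, add_maps(lateral, upsample2x(fused[0])))
    return fused


# Toy bottom-up features: 4x4, 2x2 and 1x1 maps.
c2 = [[1] * 4 for _ in range(4)]
c3 = [[1, 2], [3, 4]]
c4 = [[10]]
p2, p3, p4 = top_down_fuse([c2, c3, c4])
```

Each fused map keeps the resolution of its bottom-up input while carrying information from every coarser level, which is the low-level and high-level fusion the paragraph describes.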

Abstract

The invention discloses a semantic segmentation-based method for detecting scene text of arbitrary shape. The method comprises the following steps: S1, constructing a semantic segmentation-based network model for detecting scene text of arbitrary shape; S2, minimizing the overall target loss function using the back-propagation algorithm with stochastic gradient descent, and iteratively training the model designed in S1; S3, performing scene text detection and recognition with the model trained in S2, using a step-by-step scale expansion method. For text instances in natural scenes that are arbitrarily shaped and close to each other, the invention provides a semantic segmentation-based method that uses multi-kernel step-by-step scale expansion to detect text, so that the positions of text blocks are detected more accurately.
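
The multi-kernel step-by-step scale expansion in S3 can be sketched as follows. The abstract does not spell out the procedure, so this assumes a common formulation: text instances are seeded from the connected components of the smallest kernel map, then grown breadth-first into each larger kernel map, with a contested pixel going to whichever instance reaches it first. The kernel maps are plain binary grids, 4-connectivity is an arbitrary choice, and all names are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)


def label_components(mask):
    """Label 4-connected foreground regions of a binary grid; 0 is background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and labels[r][c] == 0:
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def scale_expansion(kernels):
    """Grow instance labels from the smallest kernel map to the largest.

    kernels: binary grids of equal shape, ordered smallest to largest.
    """
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Every labelled pixel is a seed; BFS keeps the growth level by level.
        queue = deque((r, c) for r in range(h) for c in range(w) if labels[r][c])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[y][x]  # first arrival wins
                    queue.append((ny, nx))
    return labels


# Two text instances whose full-size masks touch but whose small kernels do not.
small = [[0, 0, 0, 0, 0, 0, 0],
         [0, 1, 0, 0, 0, 1, 0],
         [0, 0, 0, 0, 0, 0, 0]]
large = [[0, 0, 0, 0, 0, 0, 0],
         [1, 1, 1, 1, 1, 1, 1],
         [0, 0, 0, 0, 0, 0, 0]]
result = scale_expansion([small, large])
```

Labelling `large` directly would merge both instances into one component. Seeding from `small` keeps them apart, which is how this kind of expansion separates text lines that are close to each other.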

Description

Technical field

[0001] The invention relates to the field of computer vision, in particular to a semantic segmentation-based method for detecting scene text of arbitrary shape.

Background

[0002] With the development of convolutional neural networks, scene text detection has advanced rapidly and is now widely used in geolocation, real-time translation, and assistance for the blind. Scene text detection differs from traditional optical character recognition (OCR) and is more challenging, because scene text can be multi-oriented, curved, or laid out in ways that do not form lines. Widely used scene text detection methods currently face two major difficulties. First, most existing methods use quadrilateral bounding boxes, which cannot accurately locate text of arbitrary shape. Second, in many scenes the text lines are very close to each other, resulting in a low detection ac...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/32, G06K9/34, G06K9/62, G06N3/04
CPC: G06V20/62, G06V30/153, G06V30/10, G06N3/045, G06F18/253
Inventor: 杨海东, 罗哲, 黄坤山, 彭文瑜, 林玉山
Owner: FOSHAN NANHAI GUANGDONG TECH UNIV CNC EQUIP COOP INNOVATION INST