Method for detecting and segmenting literature layout area

A detection technique applied to document layout area detection and segmentation. It addresses the difficulty of dividing diverse layouts by fixed rules and achieves automatic detection and segmentation with high accuracy.

Status: Pending
Publication Date: 2020-05-08
福建两岸信息技术有限公司

AI Technical Summary

Problems solved by technology

Because document layouts are highly diverse, it is difficult to divide a layout by fixed rules, and there is no corresponding technology that achieves automatic layout segmentation.

Examples

Embodiment 1

[0039] Referring to Figure 1, a method for document layout region detection and segmentation comprises the following steps:

[0040] S1. Acquiring document pictures and establishing a training data set;

[0041] Step S1 is specifically:

[0042] Obtain document pictures in different formats and establish a first detection data set.

[0043] Step S1 also includes:

[0044] The pictures in the first detection data set are marked to obtain a second detection data set.
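The marking step in S1 can be illustrated with a minimal sketch. The source does not specify an annotation format or a label set, so everything here is an assumption: the class names in `CLASSES`, the `Annotation` record, and the choice of the normalized text label layout commonly used with YOLO-family models (`class x_center y_center width height`, each coordinate divided by the image size).

```python
from dataclasses import dataclass
from typing import List

# Hypothetical label set. The source mentions content areas, border areas,
# headers and footers, but does not fix class names or their order.
CLASSES = ["content", "border", "header", "footer"]


@dataclass
class Annotation:
    """One marked region, in pixel coordinates of the document picture."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_yolo_line(ann: Annotation, img_w: int, img_h: int) -> str:
    """Convert a pixel-space box to a normalized YOLO-style label line."""
    class_id = CLASSES.index(ann.label)
    x_center = (ann.x_min + ann.x_max) / 2 / img_w
    y_center = (ann.y_min + ann.y_max) / 2 / img_h
    width = (ann.x_max - ann.x_min) / img_w
    height = (ann.y_max - ann.y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def label_page(annotations: List[Annotation], img_w: int, img_h: int) -> List[str]:
    """Build the label lines for one picture of the marked data set."""
    return [to_yolo_line(a, img_w, img_h) for a in annotations]
```

Normalizing by the image size keeps the labels valid when pictures of different resolutions are resized to a common network input size.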

[0045] S2. Create a first detection model and train it on the training data set to obtain a trained second detection model;

[0046] Step S2 is specifically:

[0047] Create a first neural network YOLO V3 detection model and train it on the training data set to obtain a trained second neural network YOLO V3 detection model.

[0048] The described first neural network YOLO V3 detection model is trained specif...

Abstract

The invention provides a method for detecting and segmenting document layout areas, comprising the following steps: obtaining document pictures and establishing a training data set; creating a first detection model and training it on the training data set to obtain a trained second detection model; and detecting and segmenting the document picture to be processed according to the second detection model. The method thereby achieves automatic detection and segmentation of document pictures with high accuracy.
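The final step, segmenting a picture according to the model's detections, might look like the sketch below. It is not the patented procedure, whose details are not given in this excerpt. It assumes the trained model returns labeled boxes with confidence scores; the `Detection` record, the `segment_page` helper, and the 0.5 confidence threshold are all hypothetical, and the page is represented as a plain list of pixel rows to avoid depending on any image library.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Detection(NamedTuple):
    """Hypothetical detector output: a labeled box in pixel coordinates."""
    label: str
    confidence: float
    x_min: int
    y_min: int
    x_max: int  # exclusive
    y_max: int  # exclusive


def segment_page(
    page: Sequence[Sequence[int]],
    detections: Sequence[Detection],
    min_confidence: float = 0.5,  # assumed threshold, not from the source
) -> List[Tuple[str, List[List[int]]]]:
    """Crop every sufficiently confident detected region out of the page."""
    height = len(page)
    width = len(page[0]) if height else 0
    regions = []
    for det in detections:
        if det.confidence < min_confidence:
            continue
        # Clamp the box to the page so slightly oversized boxes still crop.
        x0, y0 = max(0, det.x_min), max(0, det.y_min)
        x1, y1 = min(width, det.x_max), min(height, det.y_max)
        if x0 >= x1 or y0 >= y1:
            continue  # box lies entirely outside the page
        crop = [list(row[x0:x1]) for row in page[y0:y1]]
        regions.append((det.label, crop))
    return regions
```

Each returned pair holds a region label and its cropped pixels, so a later stage such as OCR can process headers, footers, and content areas separately.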

Description

Technical field

[0001] The invention relates to the technical field of image detection, in particular to a method for detecting and segmenting document layout regions.

Background technique

[0002] At present, OCR technology usually first recognizes all the text in the entire picture and then analyzes the content to extract useful information. When OCR technology is used to digitize documents into e-books, it is necessary not only to detect and recognize the text but also to follow the layout of the original book. This requires determining the effective content areas, border (such as black frame) areas, headers, footers, and so on in the layout. Due to the diversity of document layouts, it is difficult to divide the layout by fixed rules, and there is no corresponding technology to achieve automatic layout segmentation.

[0003] Therefore, there is a need for a method for document layout region detection and...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/20, G06K9/34, G06N3/04, G06N3/08
CPC: G06N3/08, G06V30/416, G06V10/225, G06V30/153, G06N3/045
Inventor: 张雄
Owner: 福建两岸信息技术有限公司