
Academic-literature semantic restructuring method based on image processing and sequence labeling

A technology combining sequence labeling and image processing, applied in fields such as instruments, computation, and character and pattern recognition. It addresses problems such as the low efficiency of structure labeling and improves the utilization efficiency of academic literature.

Active Publication Date: 2016-01-20
WUHAN UNIV
Cites: 3; Cited by: 25

AI Technical Summary

Problems solved by technology

At present, documents can be structurally labeled by hand, but manual labeling is too inefficient and cannot keep up with the large volume of academic documents.



Examples


Embodiment Construction

[0021] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. The embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.

[0022] Referring to Figure 1, the method for semantic restructuring of academic documents based on image processing and sequence labeling provided by the present invention includes the following steps:

[0023] Step 1: Preprocess the academic documents, convert them into image form, and perform layout analysis on them. First, the image is converted to grayscale and binarized, its contours are extracted, the outer contours are taken, and an R-tree spatial index is built. The spatial index is then used to merge the text blocks that overlap each other, and finally the text block t...
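The merging part of Step 1 can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes the bounding boxes of outer contours have already been extracted, and it replaces the R-tree spatial index with a plain pairwise scan, which gives the same merged blocks on small inputs but does not scale the way an index does. The names `to_gray`, `binarize`, `overlaps`, `union`, and `merge_overlapping`, as well as the threshold of 128, are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

# Axis-aligned box: (x_min, y_min, x_max, y_max). Hypothetical representation.
Box = Tuple[int, int, int, int]


def to_gray(pixel: Tuple[int, int, int]) -> int:
    """Convert one RGB pixel to a gray level using the common luma weights."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def binarize(gray_rows: Sequence[Sequence[int]], threshold: int = 128) -> List[List[int]]:
    """Mark dark pixels (ink) as 1 and light pixels (paper) as 0.

    The fixed threshold is an assumption; the source does not name one.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray_rows]


def overlaps(a: Box, b: Box) -> bool:
    """True when two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a: Box, b: Box) -> Box:
    """Smallest box that covers both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_overlapping(boxes: Sequence[Box]) -> List[Box]:
    """Repeatedly fuse overlapping boxes until no two remaining boxes overlap.

    A pairwise scan stands in for the R-tree lookup described in the text.
    """
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        result: List[Box] = []
        for box in merged:
            for i, kept in enumerate(result):
                if overlaps(box, kept):
                    result[i] = union(box, kept)
                    changed = True
                    break
            else:
                result.append(box)
        merged = result
    return merged


if __name__ == "__main__":
    contour_boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (30, 30, 40, 40), (18, 18, 25, 25)]
    print(merge_overlapping(contour_boxes))
```

In a real pipeline the boxes would come from a contour-extraction step on the binarized page image, and a spatial index would limit each overlap query to nearby candidates instead of scanning every kept box.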


Abstract

The invention discloses a method for semantic restructuring of academic literature based on image processing and sequence labeling. The method preprocesses an academic document, converts it into image form, and performs layout analysis on the image; uses OCR (optical character recognition) to recognize each text block that conforms to the logical structure of academic literature, converting images and similar content into machine-readable plain text; applies a sequence labeling model from natural language processing to convert the processed document content into a label sequence; and compares the logical-structure results obtained from layout analysis and from sequence labeling, optimizing them to obtain the final logical structure of the document. Semantic labels are added to the document automatically to assist reading. The document is thereby converted into structured content to a certain degree, which improves the utilization efficiency of academic literature.
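To make the idea of a label sequence concrete, the following minimal sketch shows how per-line labels could be collapsed into logical blocks. It assumes a sequence labeling model has already assigned one label to each recognized text line. The label names (`title`, `author`, `abstract`) and the function `group_by_label` are hypothetical and do not come from the patent, and the sketch does not cover the comparison with the layout-analysis result.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def group_by_label(lines: Sequence[str], labels: Sequence[str]) -> List[Tuple[str, str]]:
    """Collapse runs of consecutive lines that share a label into one block.

    `labels[i]` is assumed to be the label predicted for `lines[i]`.
    """
    if len(lines) != len(labels):
        raise ValueError("lines and labels must have the same length")
    blocks: List[Tuple[str, str]] = []
    # groupby only groups adjacent items, so document order is preserved.
    for label, group in groupby(zip(labels, lines), key=lambda pair: pair[0]):
        blocks.append((label, " ".join(text for _, text in group)))
    return blocks


if __name__ == "__main__":
    ocr_lines = ["A Study of X", "Jane Doe", "We propose", "a method."]
    predicted = ["title", "author", "abstract", "abstract"]
    for label, text in group_by_label(ocr_lines, predicted):
        print(f"{label}: {text}")
```

Each resulting `(label, text)` pair corresponds to one candidate unit of logical structure that could then be checked against the blocks found by layout analysis.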

Description

Technical field

[0001] The invention belongs to the technical field of information processing, and in particular relates to a method for semantic restructuring of existing academic documents in the field of semantic publishing.

Background

[0002] The continuous development of information technology has largely changed how society produces, disseminates, and consumes information, driving the evolution from traditional publishing to digital publishing. As an advanced form of digital publishing, semantic publishing can not only enrich the semantics of academic documents but also facilitate their automatic acquisition and link them to semantically relevant content. However, semantic publishing currently faces a large body of existing documents. How to process such a large volume of existing documents to improve the quality and depth of their information is an important issue. Furthermore, with the explosive...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/00
CPC: G06V30/416, G06V30/413
Inventor: 陆伟, 丁恒, 方龙
Owner: WUHAN UNIV