Method for conducting title and text logic connection for newspaper pages

A logical association technology for titles and body text, applied in the fields of instruments, computing, and electrical digital data processing. It addresses unsatisfactory title matching, achieving high matching accuracy and improved matching performance.

Active Publication Date: 2005-04-06
PEKING UNIV FOUNDER R & D CENT +1
0 Cites, 28 Cited by

AI Technical Summary

Problems solved by technology

[0005] To address the unsatisfactory title matching on newspaper pages in the prior art, the purpose of the present invention is to provide a method for logically associating the titles and body text of a newspaper page. The method can extract the article structure of the page and can greatly improve title matching performance.


Examples

Embodiment Construction

[0019] The present invention is further described below with reference to the accompanying drawings. The flow chart of the invention is shown in Figure 1:

[0020] (1) Read in the newspaper document after layout analysis. Newspaper documents include paper newspapers that have been scanned and OCR-recognized, PDFs, and documents generated by professional typesetting software such as Founder Feiteng. Layout analysis divides the page from the bottom up into block regions and physically classifies each region as a text block or an image block. Each text block is then classified as a body-text block or a non-body-text block according to its font style and the number of lines in the block. As shown in Figure 2, solid-line rectangles represent body-text blocks and dashed-line rectangles represent non-body-text blocks. The adjacency relationships of the text blocks are expressed as a directed graph, which is split and converted into a weighted bipartite graph, and the bipartite gra...
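The first part of step (1) can be illustrated with a minimal Python sketch. The source does not give the classification thresholds or the adjacency rule, so the `TextBlock` fields, the `body_font_size` and `min_lines` defaults, and the rule that a title sits directly above its body text are all assumptions made for illustration only. The sketch stops at candidate edges and does not reproduce the directed-graph splitting described in the source.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    name: str
    x0: float  # left edge
    y0: float  # top edge
    x1: float  # right edge
    y1: float  # bottom edge
    font_size: float
    line_count: int


def is_body(block, body_font_size=10.0, min_lines=3):
    # Hypothetical rule: body text uses a small font and runs over several lines.
    return block.font_size <= body_font_size and block.line_count >= min_lines


def horizontally_overlaps(a, b):
    # True when the two blocks share some horizontal extent.
    return min(a.x1, b.x1) > max(a.x0, b.x0)


def candidate_edges(blocks, max_gap=20.0):
    """Pair each non-body block with body blocks that start just below it."""
    bodies = [b for b in blocks if is_body(b)]
    others = [b for b in blocks if not is_body(b)]
    edges = []
    for t in others:
        for b in bodies:
            gap = b.y0 - t.y1  # vertical distance from title bottom to body top
            if horizontally_overlaps(t, b) and 0 <= gap <= max_gap:
                edges.append((t.name, b.name))
    return edges


# Made-up page layout with two articles side by side.
blocks = [
    TextBlock("headline A", 0, 0, 200, 30, 24, 1),
    TextBlock("body A", 0, 35, 200, 300, 10, 40),
    TextBlock("headline B", 210, 0, 400, 25, 18, 1),
    TextBlock("body B", 210, 30, 400, 300, 10, 38),
]
edges = candidate_edges(blocks)
```

Real newspaper layouts place titles beside or inside body text as well as above it, so a practical adjacency test would need to cover more directions than this one.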

Abstract

The invention belongs to intelligent text and graphic information processing technology, and in particular relates to a method for logically associating titles and body text on newspaper pages. The method comprises the following steps: a mathematical model is first established using graph theory; a bipartite graph matching model is used to express the one-to-one nature of matching between non-body-text blocks and body-text blocks; a weighted bipartite graph is built from the spatial relationships between blocks; natural language processing techniques are used to compute the edge weights of the bipartite graph; and the saturated vertex pairs of the optimal matching are taken as the successfully associated titles and body text.
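The matching step in the abstract can be sketched as a maximum-weight one-to-one matching on a small weighted bipartite graph. The example below uses plain exhaustive search, which is only practical for a handful of blocks and is not claimed to be the patent's algorithm; a production system would more likely use a polynomial-time method such as Kuhn-Munkres. The function name `best_matching` and all weights are hypothetical; in the source the weights come from spatial relationships and natural language processing.

```python
def best_matching(weights):
    """Return (total_weight, pairs) for the best one-to-one matching.

    weights[t][b] is the edge weight between title block t and body block b,
    or None when the two blocks are not adjacent (no edge).
    """
    n_bodies = len(weights[0]) if weights else 0
    best = (0, [])

    def search(t, used, total, pairs):
        nonlocal best
        if t == len(weights):
            if total > best[0]:
                best = (total, list(pairs))
            return
        # Option 1: leave title t unmatched.
        search(t + 1, used, total, pairs)
        # Option 2: match title t with any free, adjacent body block.
        for b in range(n_bodies):
            w = weights[t][b]
            if w is not None and b not in used:
                used.add(b)
                pairs.append((t, b))
                search(t + 1, used, total + w, pairs)
                pairs.pop()
                used.remove(b)

    search(0, set(), 0, [])
    return best


# Made-up weights: rows are title blocks, columns are body blocks.
weights = [
    [9, 4, None],
    [8, 1, None],
    [None, 3, 7],
]
total, pairs = best_matching(weights)
```

With these weights a greedy choice would pair title 0 with body 0 because 9 is the largest single weight, but the best overall matching pairs title 0 with body 1 and title 1 with body 0. This is why the association is treated as a global optimization over the whole page.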

Description

Technical field

[0001] The invention belongs to intelligent text and graphic information processing technology, and in particular relates to a method for logically associating titles and body text on newspaper pages.

Background

[0002] Newspaper titles play an important role in content management tasks such as classification and retrieval. Both Dublin Core and NewsML treat titles as important metadata. In cross-media publishing especially, titles are important elements of metadata and XML message structures. Whether a title is correctly associated with its body text directly affects the reuse and further processing of information in a digital asset management system, such as retrieval, republishing, and hyperlinking. Logical association means logically classifying each text block laid out on the two-dimensional space of the newspaper page as title, body, header, quotation, etc. according to its semantic function, and then associates the title and text represen...


Application Information

IPC(8): G06F17/21
Inventor: 贾娟, 陈晓鸥, 陈堃銶
Owner: PEKING UNIV FOUNDER R & D CENT