Vision-based document segmentation

A document segmentation technology, applied in permanent visual display devices, unstructured text data retrieval, text database browsing/visualization, etc., that addresses problems such as reduced accuracy of the search process.
CN1577328A · Inactive · Publication Date: 2005-02-09 · MICROSOFT CORP

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: MICROSOFT CORP
Publication Date: 2005-02-09
Estimated Expiration: Not applicable · inactive patent

Abstract

Vision-based document segmentation identifies one or more portions of semantic content of a document. The one or more portions are identified by identifying a plurality of visual blocks in the document, and detecting one or more separators between the visual blocks of the plurality of visual blocks. A content structure for the document is constructed based at least in part on the plurality of visual blocks and the one or more separators, and the content structure identifies the one or more portions of semantic content of the document. The content structure obtained using the vision-based document segmentation can optionally be used during document retrieval.
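
The abstract names three steps (identify visual blocks, detect separators between them, construct a content structure) without saying how each is carried out. The following is a minimal sketch of that pipeline under simplifying assumptions that do not come from the patent text: each visual block is reduced to a label and a vertical extent, separators are only horizontal whitespace gaps, and the structure is built by splitting at the widest gap first. The names `Block`, `Node`, `detect_separators`, `build_structure`, and `outline` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Block:
    """A visual block, simplified to a label and a vertical extent (assumption)."""
    label: str
    top: int
    bottom: int


@dataclass
class Node:
    """A node of the content structure; a node without children is a leaf."""
    blocks: list
    children: list = field(default_factory=list)


def detect_separators(blocks):
    """Sort blocks top to bottom and list the whitespace gaps between neighbours.

    Returns (ordered_blocks, separators), where each separator is a pair
    (gap_height, i) meaning the gap lies between ordered[i] and ordered[i + 1].
    """
    ordered = sorted(blocks, key=lambda b: b.top)
    separators = []
    for i in range(len(ordered) - 1):
        gap = ordered[i + 1].top - ordered[i].bottom
        if gap > 0:
            separators.append((gap, i))
    return ordered, separators


def build_structure(blocks):
    """Recursively split the blocks at the widest separator(s).

    Widest-gap-first splitting is an illustrative heuristic, not a rule
    taken from the patent.
    """
    ordered, separators = detect_separators(blocks)
    node = Node(blocks=ordered)
    if not separators:
        return node
    widest = max(gap for gap, _ in separators)
    cuts = [i for gap, i in separators if gap == widest]
    start = 0
    for i in cuts:
        node.children.append(build_structure(ordered[start:i + 1]))
        start = i + 1
    node.children.append(build_structure(ordered[start:]))
    return node


def outline(node, depth=0):
    """Render the content structure as indented lines of block labels."""
    lines = ["  " * depth + ", ".join(b.label for b in node.blocks)]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


page = [
    Block("header", 0, 50),
    Block("title", 90, 120),
    Block("body", 130, 400),
    Block("footer", 460, 500),
]
print("\n".join(outline(build_structure(page))))
```

In this toy page the widest gap separates the footer from everything above it, so the footer becomes its own top-level portion, and the remaining blocks are split again at the next widest gap. A real implementation would work on rendered two-dimensional layout and would likely weigh separators by more than their height.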

Description

Technical field

[0001] The present invention relates to segmenting documents, and more particularly to vision-based document segmentation.

Background art

[0002] People have access to vast amounts of information. However, finding the specific information they need in any given situation can be quite difficult. For example, through the Internet, a vast amount of information is accessible to people in the form of web pages. The number of such web pages may be on the order of 1 million or more. In addition, the available web pages are constantly changing, with some pages being added, others being deleted, and others being modified.

[0003] Thus, when one wants to find certain information, such as an answer to a question, the ability to extract specific information from this large source of information becomes very important. Processes and technologies have been developed to allow users to search for information over the Internet, and are generally made available to...
