Method and device for forming merge tree for generating document template

This document template and subtree technology, applied in the computer field, addresses the time-consuming and labor-intensive generation and long-term maintenance of templates. It makes template production automatic and simple, increases flexibility, and improves accuracy.

Status: Inactive
Publication Date: 2012-03-14
FUJITSU LTD
3 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

[0005] Moreover, the original template generation is usually done manually, but due to the large number of sites and the cha

Embodiment Construction

[0024] Embodiments of the present invention will be described below with reference to the drawings. Elements and features described in one drawing or one embodiment of the present invention may be combined with elements and features shown in one or more other drawings or embodiments. It should be noted that representation and description of components and processes that are not related to the present invention and known to those of ordinary skill in the art are omitted from the drawings and descriptions for the purpose of clarity.

[0025] Figure 1 is a simplified flowchart showing a method 100 for forming a merge tree for generating a document template according to an embodiment of the present invention. As shown in Figure 1, the method starts at step S110. In the similarity calculation step S120, when each tree among the multiple trees parsed from multiple pages is compared with another tree, the similarity of the subtrees at the same level in the two compared trees is...
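The same-level comparison in step S120 could be sketched roughly as follows. The text available here does not say how similarity is measured, so this example assumes a Jaccard overlap of root-to-node tag paths; the `Node` class and the functions `tag_paths`, `similarity`, and `level_similarities` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical parsed-page node: a tag name plus child nodes."""
    tag: str
    children: list["Node"] = field(default_factory=list)


def tag_paths(node, prefix=""):
    """Return the set of root-to-node tag paths in the subtree at `node`."""
    path = f"{prefix}/{node.tag}"
    paths = {path}
    for child in node.children:
        paths |= tag_paths(child, path)
    return paths


def similarity(a, b):
    """Assumed measure: Jaccard overlap of the two subtrees' tag paths."""
    pa, pb = tag_paths(a), tag_paths(b)
    return len(pa & pb) / len(pa | pb)


def level_similarities(tree_a, tree_b):
    """Score every pair of subtrees that sit directly under the two roots."""
    return [
        (x.tag, y.tag, similarity(x, y))
        for x in tree_a.children
        for y in tree_b.children
    ]


# Two toy pages whose <div> blocks are similar but not identical.
page_a = Node("body", [Node("div", [Node("h1"), Node("p")]),
                       Node("ul", [Node("li")])])
page_b = Node("body", [Node("div", [Node("h1"), Node("p"), Node("span")]),
                       Node("table")])

scores = level_similarities(page_a, page_b)
```

With a first threshold of, say, 0.6, only the `div`/`div` pair in this toy input would be kept as similar subtrees; the threshold value itself is an illustrative choice, not one given in the source.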


Abstract

The invention relates to a method and a device for forming a merge tree for generating a document template. The method comprises three steps. In a similarity calculation step, each tree of a plurality of trees parsed from a plurality of pages is compared with another tree, and the similarity of the subtrees at the same level in the two compared trees is calculated, so as to extract from the two trees the similar subtrees whose similarity is greater than or equal to a predetermined first threshold, together with the common root node of those similar subtrees; the required features can be extracted from the nodes of the trees. In a merging step, an initial merge tree is formed from the similar subtrees extracted from all the trees, the root node of the initial merge tree being the common root node of the similar subtrees of all the trees. In a post-processing step, the initial merge tree is post-processed to obtain the merge tree by removing the invalid subtrees of the initial merge tree.
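The three steps in the abstract can be illustrated with a minimal sketch. Several points are assumptions rather than the patented method: trees are plain `(tag, children)` tuples, similarity is a Jaccard overlap of tag sets, every pair of trees is compared, all pages share one root tag, and a subtree counts as "invalid" when it is rooted at a tag in a hypothetical `invalid_tags` set. The abstract does not define any of these details.

```python
from itertools import combinations

# A tree is a (tag, children) tuple, where children is a tuple of trees.


def tags(tree):
    """Collect the set of tags that occur anywhere in a tree."""
    tag, children = tree
    found = {tag}
    for child in children:
        found |= tags(child)
    return found


def similarity(a, b):
    """Assumed placeholder measure: Jaccard overlap of tag sets."""
    ta, tb = tags(a), tags(b)
    return len(ta & tb) / len(ta | tb)


def extract_similar(tree_a, tree_b, threshold):
    """Similarity calculation step for one pair of trees."""
    kept = []
    for x in tree_a[1]:
        for y in tree_b[1]:
            if similarity(x, y) >= threshold:
                kept.extend([x, y])
    return kept


def merge(trees, threshold):
    """Merging step: hang the extracted subtrees under the common root."""
    root_tag = trees[0][0]  # assumption: all pages share one root tag
    children = []
    for a, b in combinations(trees, 2):
        for subtree in extract_similar(a, b, threshold):
            if subtree not in children:  # drop exact structural duplicates
                children.append(subtree)
    return (root_tag, tuple(children))


def post_process(tree, invalid_tags=frozenset({"script", "style"})):
    """Post-processing step: prune subtrees rooted at an assumed invalid tag."""
    tag, children = tree
    kept = tuple(
        post_process(child, invalid_tags)
        for child in children
        if child[0] not in invalid_tags
    )
    return (tag, kept)


div_short = ("div", (("h1", ()), ("p", ())))
div_long = ("div", (("h1", ()), ("p", ()), ("span", ())))
script = ("script", ())
pages = [
    ("body", (div_short, script, ("ul", (("li", ()),)))),
    ("body", (div_long, script, ("table", ()))),
    ("body", (div_short, script)),
]

initial_tree = merge(pages, threshold=0.6)
merge_tree = post_process(initial_tree)
```

In this toy input the `ul` and `table` blocks never reach the initial merge tree because no other page has a similar subtree, and the shared `script` block is extracted but then pruned during post-processing. A fuller implementation would likely fold near-duplicate subtrees such as the two `div` variants into one node instead of keeping both.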

Description

Technical field

[0001] The present invention generally relates to the computer field, and more specifically to a method and an apparatus for forming a merge tree for generating document templates.

Background

[0002] With the rapid development of the Internet and electronic technology, people are no longer restricted by region and can conveniently exchange all kinds of information on the Internet. With the participation of a large number of users, the web pages of websites (such as forums, blogs, and product catalog websites) contain a large amount of useful information, which is valuable not only to individuals but also to enterprises.

[0003] To obtain this useful information, it is necessary to download the multiple web pages included in a website for further analysis and extraction.

[0004] Most web pages of the same website have a similar structure and composition. If you use the templates of these pages, it ...


Application Information

IPC(8): G06F17/30
Inventors: 王新文, 夏迎炬, 孟遥, 于浩
Owner: FUJITSU LTD