System and method for self-adaptively locating dynamic web page elements

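A technology for dynamic web pages and web page elements, applied in fields such as special-purpose data processing applications, instruments, and computing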

Status: Inactive
Publication Date: 2009-12-02
IBM CN
Cites: 0; Cited by: 34

AI Technical Summary

Problems solved by technology

[0007] Therefore, to extract the required data and functionality from web pages as they change dynamically, a major challenge is to accurately locate unstructured or semi-structured data.

Embodiment Construction

[0024] Exemplary embodiments of the present invention will be described below with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in this specification. It should be appreciated, however, that in developing any such practical embodiment, many implementation-specific decisions must be made to achieve the developer's specific goals, such as compliance with system-related and business-related constraints, which vary from one implementation to another. Furthermore, it should also be understood that such development, while potentially complex and time-consuming, would nevertheless be a routine undertaking for those skilled in the art having the benefit of this disclosure.

[0025] In addition, it should be noted that, to avoid obscuring the present invention with unnecessary details, only the device structure and/or processing steps closely related to the ...

Abstract

The invention relates to a system and a method for self-adaptively locating dynamic web page elements. The system comprises an XPath generalizer and an enhanced XPath parsing engine. The XPath generalizer progressively generalizes the complete XPath expression of a web page element, based on an HTML knowledge base that describes the degree of association between HTML tags and the importance of their attributes. The enhanced XPath parsing engine uses the generalized XPath expression to search for the web page element in the HTML DOM tree of a target web page. By combining XPath with the HTML knowledge base, the invention allows the required Web content to be located even when the content of a dynamic web page changes in various ways.
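The idea of progressive generalization can be illustrated with a minimal Python sketch. It is not the patented method: the `ATTRIBUTE_IMPORTANCE` table is a hypothetical stand-in for the HTML knowledge base, the function names `generalize`, `drop_weak_attributes`, and `locate` are invented for illustration, and the order of the generalization steps is an assumption. It also uses the limited XPath subset of the standard library's `xml.etree.ElementTree` on well-formed markup instead of a real HTML DOM.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the HTML knowledge base: a higher score means
# the attribute is assumed to be more stable across page changes.
ATTRIBUTE_IMPORTANCE = {"id": 3, "class": 2, "width": 0}

POSITION = re.compile(r"\[\d+\]")
ATTRIBUTE = re.compile(r"\[@(\w+)='[^']*'\]")


def drop_weak_attributes(step, threshold):
    """Remove attribute predicates whose importance is below the threshold."""
    def keep_or_drop(match):
        score = ATTRIBUTE_IMPORTANCE.get(match.group(1), 0)
        return match.group(0) if score >= threshold else ""
    return ATTRIBUTE.sub(keep_or_drop, step)


def generalize(steps):
    """Yield candidate paths, from the complete one to ever looser ones."""
    yield "./" + "/".join(steps)
    # 1. Drop positional predicates, which break when siblings are inserted.
    steps = [POSITION.sub("", step) for step in steps]
    yield "./" + "/".join(steps)
    # 2. Drop attribute predicates that the table rates as unimportant.
    steps = [drop_weak_attributes(step, threshold=2) for step in steps]
    yield "./" + "/".join(steps)
    # 3. Shorten the path from the root and search among all descendants.
    for start in range(1, len(steps)):
        yield ".//" + "/".join(steps[start:])


def locate(root, steps):
    """Return the first candidate path that matches, with its elements."""
    for path in generalize(steps):
        matches = root.findall(path)
        if matches:
            return path, matches
    return None, []
```

For example, if the steps `["body", "div[1]", "table[1]", "tr[1]", "td[@class='price'][@width='80']"]` were recorded on one version of a page, `locate` first tries the complete path on the changed page and falls back to looser candidates only when the stricter ones find nothing. Trying the strictest candidate first keeps the match as specific as the changed page allows.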

Description

Technical field

[0001] The present invention generally relates to data query and collection, and in particular to a system and method for adaptively locating dynamic web page elements.

Background art

[0002] With the vigorous development of the World Wide Web (WWW), the content of the Web has become increasingly abundant. In the era of Web 2.0, the Web is estimated to contain about 15-30 billion web pages in total. For a user, manually visiting web pages of interest one by one and locating the relevant content in them is therefore becoming burdensome. As a result, many websites provide REST, SOAP, WSDL, FEED, and other web services for machine access. However, these Web services have developed much more slowly than web pages and their content have grown. Most information on the Web is still accessible only to people browsing it. Although a web page can be well designed for access, it only focuses on presentation structure or typesetting for...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F17/30938, G06F16/8373
Inventors: 高伟, 赵石顽, 俞益琴, 付荣耀
Owner: IBM CN