
Webpage element identification method based on XPath

A web page element identification method, applied in the Internet field, that addresses cases where a target element cannot be located accurately because it has no ID, has a dynamic ID, or has a non-unique name attribute. The method narrows the search range and increases the success rate and accuracy of locating elements.

Pending Publication Date: 2020-07-03
苏州数字力量教育科技有限公司

AI Technical Summary

Problems solved by technology

However, this approach cannot handle elements that have no ID or a dynamic ID.
[0008] In practice, most web page elements are located through HTML attributes such as id, name, and class. Because ids can be dynamic and name attributes are not unique, it is often impossible to find the target element accurately.

Examples

Embodiment Construction

[0040] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0041] As shown in Figures 1-2, the present embodiment is an XPath-based method for identifying web page elements, comprising the following steps:

[0042] (1) Generate XPaths from the attributes of the bottom-level element;

[0043] S101 Extract the attributes of the element selected by the programmer from the bottom layer of the web page structure. The attributes include id, name, and class in HTML, and the number of attributes is ≥ 2;

[0044] S102 Generate XPath according...
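Step S102 is truncated in this listing, so the following is only a minimal sketch of how step (1) might be realized, not the patented procedure. It assumes that each candidate XPath is built from one attribute or a combination of the extracted attributes; the function name `attribute_xpaths` and its parameters are hypothetical.

```python
from itertools import combinations


def attribute_xpaths(tag, attrs, keys=("id", "name", "class")):
    """Build candidate XPaths from a bottom-level element's attributes.

    Assumption (not from the source): every non-empty combination of the
    extracted attributes yields one candidate expression.
    """
    # S101: keep only the attributes that are present and non-empty.
    present = [(k, attrs[k]) for k in keys if attrs.get(k)]
    if len(present) < 2:
        # The source requires the number of attributes to be >= 2.
        raise ValueError("need at least two of id, name, class")

    xpaths = []
    for size in range(1, len(present) + 1):
        for combo in combinations(present, size):
            # Values containing a single quote would need escaping; omitted here.
            predicates = "".join(f"[@{k}='{v}']" for k, v in combo)
            xpaths.append(f"//{tag}{predicates}")
    return xpaths


candidates = attribute_xpaths("input", {"id": "user", "name": "username"})
```

For an `input` element with `id="user"` and `name="username"`, this produces one candidate per attribute plus one that requires both.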


Abstract

The invention discloses an XPath-based web page element identification method. The method has three parts: the first part finds the XPath set T1 of the attributes of the bottom-level element; the second part finds the most unique hierarchy level from which the target element can be found and generates the XPath set T2 of that level's attributes; the third part combines the XPaths in T1 with the XPaths in T2. The invention provides a new XPath generation method that narrows the search range for the target element and increases the success rate and accuracy of finding it. It also improves robustness in operations such as web page testing, process automation, and data capture.
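The combination of T1 and T2 can be illustrated with a short sketch. The abstract does not say how the two sets are joined, so this example assumes that each hierarchy-level XPath in T2 is used as a prefix and each element-level XPath in T1 as a descendant step. The sample page, `combine_xpaths`, and `unique_xpaths` are all hypothetical, and Python's standard `xml.etree.ElementTree` stands in for a browser's XPath engine.

```python
import xml.etree.ElementTree as ET
from itertools import product


def combine_xpaths(t2, t1):
    """Prefix every element-level XPath in T1 with every hierarchy XPath in T2."""
    return [ancestor + target for ancestor, target in product(t2, t1)]


def unique_xpaths(root, xpaths):
    """Keep only the XPaths that match exactly one element under root."""
    # ElementTree needs a relative path, so a leading "." is added here.
    return [xp for xp in xpaths if len(root.findall("." + xp)) == 1]


# Hypothetical page: two forms contain inputs with identical attributes.
PAGE = """
<html><body>
  <div id="login"><input name="user" class="field"/></div>
  <div id="signup"><input name="user" class="field"/></div>
</body></html>
"""

root = ET.fromstring(PAGE)
t1 = ["//input[@name='user']", "//input[@class='field']"]  # element level
t2 = ["//div[@id='login']"]                                # hierarchy level
combined = combine_xpaths(t2, t1)

ambiguous_alone = unique_xpaths(root, t1)      # T1 alone matches two inputs
unique_combined = unique_xpaths(root, combined)
```

On this sample page neither XPath in T1 is unique by itself, while both combined expressions resolve to the single input inside the login container, which is the range-narrowing effect the abstract describes.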

Description

Technical field

[0001] The invention relates to the technical field of the Internet, in particular to an XPath-based web page element identification method.

Background

[0002] Web page element positioning has important applications in crawling web page data, developing automated processes, and writing web page test scripts. However, the low accuracy of web page element positioning limits the development of these technologies and easily causes data capture to fail or automated processes to be interrupted. Currently, there are several ways to find web page elements:

[0003] 1) Machine vision technology: computers simulate human visual functions, extracting information from images of real objects, processing and interpreting it, and finally using it for detection, measurement, and control. However, if the image on the web page changes, element recognition is likely to fail.

[0004] 2) Link positioning: looking for a spe...

Application Information

IPC(8): G06F16/958, G06F16/951
CPC: G06F16/986, G06F16/951, Y02D10/00
Inventors: 龚燕玲, 潘宇, 汪玉林
Owner: 苏州数字力量教育科技有限公司