Webpage element obtaining method and apparatus

A web page element acquisition technology, applied in the Internet field, that addresses the low accuracy of parsing target page elements and improves that accuracy.

Active Publication Date: 2018-05-25
BEIJING GRIDSUM TECH CO LTD
Cites: 4; Cited by: 3
AI Technical Summary

Problems solved by technology

[0005] The main purpose of this application is to provide a method and device for obtaining web page elements, so as to solve the problem in the related art of low accuracy when parsing target page elements.

Examples

Embodiment Construction

[0020] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0021] To enable those skilled in the art to better understand the solution of the present application, the technical solution in the embodiments of the application is described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without creative effort shall fall within the scope of protection of this application.

[0022] It should be noted that the terms "first" and "second...

Abstract

The invention discloses a webpage element obtaining method and apparatus. The method comprises: obtaining multiple target webpages from a target website, wherein the target webpages contain target page elements; determining the types of the target webpages; obtaining the path expression corresponding to the target page elements under each type of target webpage, thereby obtaining a path expression set that comprises multiple path expressions; parsing to-be-analyzed webpages according to the path expressions in the path expression set to obtain multiple analysis results, wherein the to-be-analyzed webpages are webpages crawled from the target website according to business demands; and obtaining the target page elements of the to-be-analyzed webpages from the multiple analysis results. The method and apparatus solve the problem in the related art of relatively low accuracy when parsing webpages to obtain target page elements.
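The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the page sources, the page type names, the path expressions, and the helpers `parse_with_all` and `pick_target` are all hypothetical. It uses the limited XPath subset in Python's standard `xml.etree.ElementTree`, and it assumes that the target element is simply the first non-empty analysis result, a selection rule the abstract does not specify.

```python
import xml.etree.ElementTree as ET

# Hypothetical path expression set for the "release time" element,
# one expression per page type found on the target website.
PATH_EXPRESSIONS = {
    "news": ".//div[@id='meta']/span[@class='time']",
    "video": ".//div[@id='info']/em[@class='date']",
}

# Hypothetical to-be-analyzed pages (well-formed XHTML keeps the sketch simple).
NEWS_PAGE = (
    "<html><body><div id='meta'>"
    "<span class='time'>2018-05-25</span>"
    "</div></body></html>"
)
VIDEO_PAGE = (
    "<html><body><div id='info'>"
    "<em class='date'>2018-05-24</em>"
    "</div></body></html>"
)


def parse_with_all(page_source, expressions):
    """Apply every path expression in the set; return one result per expression."""
    root = ET.fromstring(page_source)
    results = []
    for page_type, path in expressions.items():
        node = root.find(path)
        results.append((page_type, node.text if node is not None else None))
    return results


def pick_target(results):
    """Assumed selection rule: return the first non-empty analysis result."""
    for _page_type, value in results:
        if value is not None and value.strip():
            return value.strip()
    return None


if __name__ == "__main__":
    for page in (NEWS_PAGE, VIDEO_PAGE):
        print(pick_target(parse_with_all(page, PATH_EXPRESSIONS)))
```

Because every expression in the set is tried on every page, the caller does not need to know a page's type in advance; a real system would likely need a more careful rule for choosing among conflicting results.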

Description

Technical field

[0001] The present application relates to the technical field of the Internet, and in particular to a method and device for acquiring web page elements.

Background

[0002] When network information is obtained in batches, crawler technology is generally used to crawl a large number of web pages, and the crawled pages are then analyzed. When performing customized analysis on a website's content pages (text, video, news, etc.), it is often necessary to obtain specific elements such as release time, number of comments, number of likes, and number of reads. In this process, the XML Path Language (XPath) can be used to locate these specific elements on the page.

[0003] This method of parsing may cause element parsing failures or element parsing conflicts because multiple pages exist under the same website; that is, the same path expression can be parsed normally on one page, but not on another pa...
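The failure mode described in the background can be reproduced with a small sketch. The two page layouts, the expression `NEWS_TIME_XPATH`, and the helper `extract` are hypothetical, and the example relies on the XPath subset supported by Python's standard `xml.etree.ElementTree` rather than a full XPath engine.

```python
import xml.etree.ElementTree as ET

# Two hypothetical layouts from the same website; both carry a release time,
# but under different markup.
NEWS_PAGE = (
    "<html><body><div id='meta'>"
    "<span class='time'>2018-05-25</span>"
    "</div></body></html>"
)
VIDEO_PAGE = (
    "<html><body><div id='info'>"
    "<em class='date'>2018-05-24</em>"
    "</div></body></html>"
)

# A single path expression written against the news layout only.
NEWS_TIME_XPATH = ".//div[@id='meta']/span[@class='time']"


def extract(page_source, path):
    """Return the text of the first node matching path, or None if nothing matches."""
    node = ET.fromstring(page_source).find(path)
    return node.text if node is not None else None


if __name__ == "__main__":
    print(extract(NEWS_PAGE, NEWS_TIME_XPATH))   # matches on the news layout
    print(extract(VIDEO_PAGE, NEWS_TIME_XPATH))  # no match on the video layout
```

The same expression succeeds on one layout and silently yields nothing on the other, which is why a single fixed path expression per element is fragile on sites with several page layouts.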

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 吕现彪
Owner: BEIJING GRIDSUM TECH CO LTD