
Method for analyzing web page form object nodes

A technique involving object nodes and table nodes, applied in special data processing applications, instruments, electrical digital data processing, etc.

Status: Inactive; Publication Date: 2009-11-11
北京瑞佳晨科技有限公司
Cites: 0; Cited by: 22

AI Technical Summary

Problems solved by technology

The expansion of information on the Internet has made information analysis and processing more challenging.



Examples


Embodiment Construction

[0021] In an embodiment of the present invention, a method for parsing web page form object nodes uses the Internet to acquire web pages, locate the data in a specific region of each page, and then classify, analyze, and compare that data. The method comprises the following steps:

[0022] Step 1. Define a three-dimensional data table whose first dimension is the web page address, whose second dimension is the field column of the data object, and whose third dimension is the field value of the data object;

[0023] Step 2. Obtain the queue of target web page addresses;

[0024] Step 3. For each address obtained in step 2, check whether it already exists in the first dimension of the three-dimensional data table defined in step 1. If it does not exist, add it to the first dimension of the table; if it exists, delete it and check the next address;

[0025] Step 4. Download the web pages according to the web page address queue, and store the downloaded web pages in the ...
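Steps 1 to 3 can be illustrated with a minimal Python sketch. It is only one possible reading of the text: the class name `ThreeDimTable`, the function `filter_queue`, the nested-dictionary layout, and the example addresses are all hypothetical and do not come from the patent. The sketch also assumes that "delete it" in step 3 means dropping the address from the pending queue.

```python
from collections import deque


class ThreeDimTable:
    """Hypothetical layout: address -> field column -> list of field values."""

    def __init__(self):
        # First dimension: web page address.
        self._data = {}

    def has_address(self, address):
        return address in self._data

    def add_address(self, address):
        # Second dimension (field columns) starts out empty.
        self._data.setdefault(address, {})

    def add_value(self, address, column, value):
        # Third dimension: the field values stored under a column.
        self._data.setdefault(address, {}).setdefault(column, []).append(value)

    def values(self, address, column):
        return list(self._data.get(address, {}).get(column, []))


def filter_queue(addresses, table):
    """Step 3: register unseen addresses in the table and skip known ones."""
    pending = deque()
    for address in addresses:
        if table.has_address(address):
            # Already in the first dimension: drop it, check the next address.
            continue
        table.add_address(address)
        pending.append(address)
    return pending


table = ThreeDimTable()                      # step 1
table.add_address("http://example.com/a")    # an address seen on an earlier run
queue = filter_queue(                        # steps 2 and 3
    ["http://example.com/a", "http://example.com/b", "http://example.com/b"],
    table,
)
print(list(queue))
```

Because each new address is written into the table as soon as it is seen, a repeated address later in the same queue is filtered out as well.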


Abstract

The invention relates to a method for analyzing web page form object nodes, comprising: step 1, defining a three-dimensional data table; step 2, acquiring a queue of target web page addresses; step 3, checking whether each address acquired in step 2 exists in the first dimension of the three-dimensional data table defined in step 1; step 4, downloading web pages according to the web page address queue and storing the downloaded web pages in a temporary web page storage area; step 5, performing form object detection on the web pages in the temporary web page storage area and extracting the web pages that contain form nodes. The invention provides a programmed, automated method for analyzing web page form node data. Through this process, the data in web page form nodes can be acquired effectively and then analyzed and compared, which in particular makes value-added data services possible. The method can help users collect and organize large amounts of network information and has broad application prospects in the field of Internet information collection.
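Step 5 of the abstract (detecting pages that contain form nodes) might look like the following sketch, which uses Python's standard `html.parser` module. The names `NodeDetector`, `has_target_node`, and `extract_pages_with_nodes`, the dictionary used as the temporary storage area, and the sample pages are all hypothetical. Because the translated text uses both "form" and "table" for the node type, the sketch assumes either `<form>` or `<table>` may be intended and leaves the tag set configurable.

```python
from html.parser import HTMLParser


class NodeDetector(HTMLParser):
    """Records whether any start tag in target_tags occurs in the page."""

    def __init__(self, target_tags):
        super().__init__()
        self.target_tags = set(target_tags)
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in self.target_tags:
            self.found = True


def has_target_node(html, target_tags=("form", "table")):
    detector = NodeDetector(target_tags)
    detector.feed(html)
    detector.close()
    return detector.found


def extract_pages_with_nodes(temp_storage, target_tags=("form", "table")):
    """Keep only the stored pages that contain at least one target node."""
    return {
        address: html
        for address, html in temp_storage.items()
        if has_target_node(html, target_tags)
    }


# Assumed stand-in for the temporary web page storage area: address -> HTML.
temp_storage = {
    "http://example.com/a": "<html><body><p>No data here</p></body></html>",
    "http://example.com/b": "<html><body><table><tr><td>1</td></tr></table></body></html>",
}
selected = extract_pages_with_nodes(temp_storage)
print(sorted(selected))
```

A tolerant, event-based parser is used here because downloaded pages are often not well-formed; a stricter XML parser would reject many real pages.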

Description

Technical field

[0001] The invention relates to the field of web page data analysis, and in particular to a method for analyzing object nodes of a web page table.

Background technique

[0002] The rapid development of the Internet has made it the most important source of information for people. However, the expansion of information has brought challenges to information analysis and processing. How to effectively extract relevant information from web pages written in Hypertext Markup Language (HTML) or Extensible Markup Language (XML) has become an important research topic in Internet information services. The Internet is an open public information platform, and more and more companies publish their product and service information on it through a website server (Web server), or move their entire business to the Web. Collecting and categorizing this dynamic information, followed by comparative analysis, can provide critical data for many value-added...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 孙晨
Owner: 北京瑞佳晨科技有限公司