A visual acquisition tool for web page code

A collection tool and web technology, applied in the fields of network data indexing, network data retrieval, and data processing input/output. It addresses the lack of rapid configuration and accurate collection in existing data collection tools, with the effect of reducing the skill threshold and improving compatibility.

Active Publication Date: 2022-02-01
YANTAI JIERUI NETWORK TRADING
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Non-professional users must spend considerable time learning web code technologies before they can operate existing tools, so these tools still carry a skill threshold. As a result, they do not give practitioners with data collection needs rapid configuration and accurate collection.



Examples


Detailed description of embodiments

[0028] Figure 1 shows an example of a web page to be collected; News 1-5 in Figure 1 are the data regions that need to be collected. Conventional collection methods extract region data with regular expressions over HTML tags, or by intercepting the content between an HTML tag prefix and suffix, so users still need some coding background to use them. Other techniques collect data by acquiring and comparing node features, but they are prone to misjudgment when similar node groups such as "Section", "News", and "Feature" coexist on the page: the content obtained may not be that of the "News" node group that is actually needed.
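The two conventional techniques named in this paragraph can be sketched as follows. This is a minimal illustration, not code from the patent: the page fragment, the class names, and the helper names `extract_between` and `collect_by_regex` are all assumptions made for the example.

```python
import re

# Hypothetical page fragment; the markup is an assumption for illustration.
HTML = """
<div class="section"><a href="/s/1">Section 1</a></div>
<ul class="news">
  <li><a href="/n/1">News 1</a></li>
  <li><a href="/n/2">News 2</a></li>
</ul>
<div class="feature"><a href="/f/1">Feature 1</a></div>
"""


def extract_between(text, prefix, suffix):
    """Prefix/suffix interception: return the text between two literal markers."""
    start = text.find(prefix)
    if start == -1:
        return ""
    start += len(prefix)
    end = text.find(suffix, start)
    return text[start:end] if end != -1 else ""


def collect_by_regex(text):
    """Tag-based regular expression: capture (href, title) pairs from list items."""
    pattern = re.compile(r'<li><a href="([^"]+)">([^<]+)</a></li>')
    return pattern.findall(text)


# First cut the page down to the news region, then match items inside it.
news_block = extract_between(HTML, '<ul class="news">', "</ul>")
news_items = collect_by_regex(news_block)
print(news_items)
```

Both helpers depend on the exact markup of one site: the literal prefix and the regular expression must be rewritten whenever the page structure changes, which is the coding burden the paragraph describes.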

[0029] In addition, some websites use one HTML tag as the list node label for data, while other websites use a variety of different tags, so that the code labels of different websites cannot be unified, and even if the features are matched, there will be a situation where the sample is missi...
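The mismatch between list tags on different sites can be illustrated with a small sketch. The two page fragments and the tag names used here (`li`, `dd`) are assumptions chosen for illustration, and `collect_list_items` is a hypothetical helper, not part of the described tool.

```python
import xml.etree.ElementTree as ET

# Two hypothetical sites that present the same list with different tags.
SITE_A = "<div><ul><li>News 1</li><li>News 2</li></ul></div>"
SITE_B = "<div><dl><dd>News 1</dd><dd>News 2</dd></dl></div>"


def collect_list_items(markup, item_tag):
    """Collect the text of every element with the given tag name."""
    root = ET.fromstring(markup)
    return [element.text for element in root.iter(item_tag)]


# A rule written for site A finds nothing on site B.
a_items = collect_list_items(SITE_A, "li")
b_items = collect_list_items(SITE_B, "li")
print(a_items, b_items)
```

A rule keyed to one tag name silently returns an empty result on a site that uses another, so a tag-specific configuration cannot be reused across sites.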


Abstract

The present invention relates to the technical field of electrical digital data processing, in particular to a visual acquisition tool for web page code. The tool includes a web client comprising a visual operation page and a task configuration page, which can be loaded on any web page. The visual operation page includes: a new-task button; a display area for the web page to be collected; several candidate collection areas that are highlighted when the mouse hovers over the displayed web page; a data preview area that is shown after a candidate collection area is clicked; a save button for saving the proposed collection result if the user is satisfied with it; and a cancel button for discarding the proposed collection result if the user is not. The data preview area is divided into a text preview area, a URL preview area, and an XPath preview area. Compared with the prior art, the invention can effectively reduce the skill threshold for users.
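The three-part data preview (text, URL, XPath) can be sketched for a selected region as follows. This is only an illustration under stated assumptions: the sample page is hypothetical and well-formed, the functions `build_xpath` and `preview` are invented for the example, and the positional XPath format is one possible choice rather than the format the tool actually produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page used only for this sketch.
PAGE = """<html><body>
<div id="news">
  <ul>
    <li><a href="/n/1">News 1</a></li>
    <li><a href="/n/2">News 2</a></li>
  </ul>
</div>
</body></html>"""


def build_xpath(root, target):
    """Return a positional XPath from the root element down to the target."""
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parents[node]
        same_tag = [child for child in parent if child.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))


def preview(root, target):
    """Build the text, URL, and XPath previews for a selected region."""
    return {
        "text": [t.strip() for t in target.itertext() if t.strip()],
        "urls": [a.get("href") for a in target.iter("a") if a.get("href")],
        "xpath": build_xpath(root, target),
    }


root = ET.fromstring(PAGE)
selected = root.find("./body/div/ul")  # stands in for the region the user clicked
result = preview(root, selected)
print(result)
```

In a real client the selected element would come from the user's click in the browser; here a fixed path stands in for that interaction so the sketch runs on its own.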

Description

Technical field

[0001] The invention relates to the technical field of electrical digital data processing, in particular to a visual acquisition tool for web page code.

Background

[0002] With the popularization of information technology, data collection and analysis has become routine work for practitioners in fields such as search engines, data analysis, and self-media.

[0003] Current collection methods and tools collect page content by specifying a URL, writing code to set regular expressions for the content region, intercepting the prefix and suffix of HTML tags, and so on, then setting up paging or adding scheduled tasks in the code, and finally collecting the content of the whole site. Because web page code differs between websites, continuous analysis and code adjustment are required, and the data collection efficiency is low; at the same time, due to the irregularity of webpage codes, conventiona...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F3/0483, G06F16/951, G06F16/955
CPC: G06F3/0483, G06F16/951, G06F16/955
Inventor: 朱春华, 王涛, 刘超, 曾繁诚, 张恒振
Owner: YANTAI JIERUI NETWORK TRADING