Configurable webpage data acquisition method and system

A webpage data and database technology, applied in the field of network communication, that addresses the single-purpose design and poor practicality of existing collection systems and makes webpage data collection convenient.

Active Publication Date: 2015-03-25
SHENZHEN LAN YOU TECH

AI Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the present invention is to provide a configurable, widely applicable webpage data collection system, aiming at the defects that the e




Embodiment Construction

[0061] To make the object, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.

[0062] As shown in Figure 1, the flow chart of the first preferred embodiment of the configurable webpage data collection method of the present invention, the method starts at step S100. After step S100, it proceeds to step S110, in which the configuration information for webpage data collection is obtained from the database. The configuration information includes: configuration collection website classification information, configuration collection theme template information, configuration collection content page template information and configuration d...
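
Step S110 can be illustrated with a minimal sketch, assuming the configuration is held in a single relational table. The table name `collection_config`, the column names, and the template strings are hypothetical and are not taken from the patent. Only the three configuration items that are fully named in the text are modeled; the fourth item is truncated in the source and is left out.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class CollectionConfig:
    """Configuration for one collection job (field names are hypothetical)."""
    site_category: str          # website classification information
    theme_template: str         # template used to locate themes on a site
    content_page_template: str  # template used to locate content pages


def load_configs(conn):
    """Step S110: read the collection configuration rows from the database."""
    rows = conn.execute(
        "SELECT site_category, theme_template, content_page_template "
        "FROM collection_config ORDER BY id"
    ).fetchall()
    return [CollectionConfig(*row) for row in rows]


def demo_connection():
    """Build an in-memory database holding one illustrative configuration row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE collection_config ("
        "id INTEGER PRIMARY KEY, site_category TEXT, "
        "theme_template TEXT, content_page_template TEXT)"
    )
    conn.execute(
        "INSERT INTO collection_config "
        "(site_category, theme_template, content_page_template) "
        "VALUES (?, ?, ?)",
        ("forum", "/board/{board_id}", "/thread/{thread_id}"),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for cfg in load_configs(demo_connection()):
        print(cfg)
```

Because the configuration lives in the database rather than in code, changing what is collected would only require editing rows, which matches the stated goal of supporting a collection mode that is constantly updated.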



Abstract

The invention relates to a configurable webpage data acquisition method and system, especially suited to situations in which the webpage data acquisition mode needs to be continuously updated. The method comprises the following steps: S1, configuration information for webpage data acquisition is obtained from a database; S2, the required classified websites are obtained and logged in to according to the configuration information; S3, the themes to be acquired from the websites are obtained according to the website information after login; S4, the required website contents are acquired according to the configuration information and the acquired themes; S5, the required information is extracted from the acquired content pages through a regular expression in a configured data table or according to a certain rule; S6, the extracted table data are stored in the database. With this method and system, a user can freely configure the webpage data to be acquired, acquire the relevant data from across the whole network according to the configured acquisition scheme, and thereby achieve flexible and convenient webpage data acquisition.
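
Steps S5 and S6 can be sketched as follows, assuming the configured data table maps each output column to a regular expression with one capture group. The patterns, the `collected` table, its two columns, and the sample page are hypothetical illustrations, not the patent's actual schema. Login and page fetching (S2 to S4) are omitted, so the sketch works on an HTML string that has already been acquired.

```python
import re
import sqlite3

# Hypothetical configured data table: output column -> regular expression.
# In the described method these rules would be read from the database (S1).
FIELD_PATTERNS = {
    "title": r"<h1>(.*?)</h1>",
    "author": r'<span class="author">(.*?)</span>',
}


def extract_fields(html, patterns):
    """Step S5: apply each configured regular expression to a content page."""
    record = {}
    for column, pattern in patterns.items():
        match = re.search(pattern, html, flags=re.DOTALL)
        # A field whose pattern does not match is recorded as None (NULL).
        record[column] = match.group(1).strip() if match else None
    return record


def store_record(conn, record):
    """Step S6: store one extracted record in the database."""
    conn.execute(
        "INSERT INTO collected (title, author) VALUES (:title, :author)",
        record,
    )
    conn.commit()


def run_demo():
    """Extract from a sample page, store the result, and read it back."""
    page = '<html><h1> Example thread </h1><span class="author">alice</span></html>'
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE collected (title TEXT, author TEXT)")
    store_record(conn, extract_fields(page, FIELD_PATTERNS))
    return conn.execute("SELECT title, author FROM collected").fetchall()


if __name__ == "__main__":
    print(run_demo())
```

Keeping the patterns as data rather than code is what would let a user change the extracted fields without modifying the collector itself. The abstract also allows extraction "according to a certain rule" other than a regular expression, which this sketch does not cover.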

Description

Technical field

[0001] The present invention relates to the technical field of network communication and, more specifically, to a configurable webpage data collection method and system suited to situations in which the collection mode for webpage data is constantly updated.

Background

[0002] With the rapid development of web technology and web applications and the advent of the big data era, the monitoring of web application websites (especially social platforms), corporate public opinion monitoring, user data collection, and big data mining are increasingly widely used. All walks of life depend more and more heavily on the Internet and on Internet information. However, the data on the Internet is massive, so how can the data we need be extracted?

[0003] At present, there are only collection systems for a certain website or several websites on the market, and there is no configurable webpage data collection method...


Application Information

IPC(8): G06F17/30
CPC: G06F16/9577
Inventor: 吴正辉
Owner: SHENZHEN LAN YOU TECH