
Data collection method and data collection device based on script engine

A script-engine-based data collection technique in the field of computer technology. It addresses the low extraction efficiency and limited applicability of existing methods, lowers the expertise required of users, eases wider adoption, and improves collection efficiency.

Status: Inactive
Publication date: 2013-05-08
BEIJING 58 INFORMATION TECH
8 cites, 32 cited by

AI Technical Summary

Problems solved by technology

[0007] In summary, existing template-based data extraction methods share a drawback: much of the extracted data must undergo secondary cleaning, processing, and conversion before it yields the desired target data, which lowers extraction efficiency. In addition, some extraction methods demand specialized expertise, which hinders wide adoption.



Examples


Specific embodiments

[0034] Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure may be embodied in various forms and is not limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough understanding of the disclosure and to fully convey its scope to those skilled in the art.

[0035] To lower the expertise required for data collection and to improve collection efficiency, embodiments of the present invention provide a script engine-based data collection method and device. The method and device use scripts to extract, clean, process, and convert data together during collection, which addresses the technical problems described above.
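To illustrate what extracting, cleaning, and converting together in one script might look like, here is a minimal Python sketch. The function name `collect_price`, the `price` markup, and the choice of Python are assumptions made for illustration; the source does not specify them.

```python
import re
from html import unescape
from typing import Optional


def collect_price(page_html: str) -> Optional[float]:
    """Hypothetical script method: extract, clean, and convert in one call."""
    # Extract: locate the raw field in the page (markup is an assumed example).
    match = re.search(r'<span class="price">(.*?)</span>', page_html, re.S)
    if match is None:
        return None
    raw = unescape(match.group(1))
    # Clean: drop currency symbols, thousands separators, and whitespace.
    cleaned = re.sub(r"[^\d.]", "", raw)
    # Convert: return a typed value instead of a raw string.
    return float(cleaned) if cleaned else None


if __name__ == "__main__":
    print(collect_price('<span class="price"> &yen;1,299.50 </span>'))
```

Because the three stages live in a single method, no separate post-processing pass over the extracted strings is needed in this sketch.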

[0036] Before specifica...


Abstract

The invention discloses a script engine-based data collection method and device. The method comprises the following steps: loading a preconfigured collection configuration file corresponding to the current collection task, and parsing it to obtain the collection rules for the target data; initializing script engines that each support a different scripting language, and loading preconfigured script files composed of script methods for collecting the target data; downloading web page data, looking up the collection rules for the target data defined for that web page, and sending the downloaded web page data together with the script method name configured in each collection rule to the script engine for the corresponding scripting language; and invoking and executing the corresponding script method through the script engine according to the script method name, thereby collecting the target data from the web page data. Extraction, cleaning, processing, and conversion are performed together during collection by means of scripts, which addresses the technical problems identified.
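The steps in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the patented implementation: the JSON configuration layout, the names `PythonScriptEngine`, `run_task`, and `collect_title`, and the use of a single Python engine are all hypothetical, and the download step is replaced by a string passed in by the caller.

```python
import json
from typing import Any, Dict


class PythonScriptEngine:
    """Toy engine for one scripting language (hypothetical, for illustration)."""

    language = "python"

    def __init__(self) -> None:
        self._namespace: Dict[str, Any] = {}

    def load_script(self, source: str) -> None:
        # Execute the script file's contents so its functions are registered.
        exec(source, self._namespace)

    def invoke(self, method_name: str, page_data: str) -> Any:
        # Look up the script method by name and run it on the page data.
        method = self._namespace.get(method_name)
        if not callable(method):
            raise KeyError(f"unknown script method: {method_name}")
        return method(page_data)


# Assumed configuration format: one rule per target field.
CONFIG_JSON = """
{"task": "demo",
 "rules": [{"field": "title", "language": "python", "method": "collect_title"}]}
"""

# Assumed script file contents holding the collection methods.
SCRIPT_SOURCE = '''
import re

def collect_title(page):
    m = re.search(r"<title>(.*?)</title>", page, re.S)
    return m.group(1).strip() if m else None
'''


def run_task(config_json: str, script_source: str, page_data: str) -> Dict[str, Any]:
    # Step 1: load and parse the collection configuration to get the rules.
    rules = json.loads(config_json)["rules"]
    # Step 2: initialize one engine per supported language and load the scripts.
    engines = {PythonScriptEngine.language: PythonScriptEngine()}
    for engine in engines.values():
        engine.load_script(script_source)
    # Steps 3 and 4: for each rule, send the page data and the method name
    # to the engine for that rule's language, and collect the result.
    collected: Dict[str, Any] = {}
    for rule in rules:
        engine = engines[rule["language"]]
        collected[rule["field"]] = engine.invoke(rule["method"], page_data)
    return collected


if __name__ == "__main__":
    page = "<html><title> Demo page </title></html>"
    print(run_task(CONFIG_JSON, SCRIPT_SOURCE, page))
```

Keying the engines by language name means a rule only has to name its language and method; supporting another scripting language in this sketch would amount to registering another engine object with the same `load_script` and `invoke` methods.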

Description

Technical field

[0001] The invention relates to the field of computer technology, and in particular to a script engine-based data collection method and device.

Background

[0002] The industry already has many mature targeted-collection software products, most of which are implemented on the basis of template configuration. These template-based data extraction methods generally include the regular-expression matching method, the tag interception method, the XPath extraction method, and the plug-in customization method.

[0003] Regular-expression matching method: some extraction results may need secondary cleaning, processing, and conversion to obtain the target data, and the method is specialized, requiring proficiency in regular expressions.

[0004] Tag interception method: some extraction results may need secondary cleaning, processing, and conversion to obtain the target data.

[0005] R...


Application Information

Patent type and authority: Application (China)
IPC(8): G06F17/20, G06F40/00
Inventor: 侯赋文
Owner: BEIJING 58 INFORMATION TECH