Crawler method for configuring and collecting APP information and storage medium

A configuration-based technique applied in the field of crawlers. It addresses low development efficiency, high development cost, and the inability of existing approaches to collect APP information data, and it achieves high development efficiency with low coding and development cost.

Pending Publication Date: 2019-06-14
福建省天奕网络科技有限公司
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

None of the existing technologies mentioned above provides a data screening strategy for APP information data in which known APP market domain names are specified for analysis through configuration and matched against a customized APP structure in that configuration. As a result, they do not solve the problems of low development efficiency and high development cost.



Examples


Embodiment 1

[0049] Referring to Figure 1, this embodiment provides a crawler method for configuring and collecting APP information. The method can adapt to various application markets when collecting APP information, reducing development cost while improving development efficiency.

[0050] The method includes the following steps:

[0051] 1. Crawler site configuration

[0052] S1: Construct a content parsing dictionary. The content parsing dictionary includes a first-level parsing configuration key and its corresponding first-level parsing configuration item, and a second-level parsing configuration key and its corresponding second-level parsing configuration item;

[0053] The content parsing dictionary consists of key/item associations. Specifically, the content parsing dictionary in this embodiment records the association between each first-level parsing configuration key and its corresponding first-level parsing configuration item, and the association between the second-level parsing config...
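
The key/item association described in S1 can be pictured as a plain mapping. The following is only a minimal sketch under stated assumptions: the key `detail-parse`, the `level` field, and both item strings are hypothetical placeholders, and only the key name `list-parse` appears in the specification.

```python
# Hypothetical content parsing dictionary: each parsing configuration key
# maps to the parsing configuration item used to interpret a page.
# Only the key name "list-parse" comes from the specification; everything
# else here is an illustrative placeholder.
CONTENT_PARSING_DICT = {
    # First-level key -> item applied to pages reached from a path address.
    "list-parse": {"level": 1, "item": "<rule for locating detail page addresses>"},
    # Second-level key -> item applied to application detail pages.
    "detail-parse": {"level": 2, "item": "<rule for locating detail fields>"},
}


def lookup_parsing_item(key: str) -> str:
    """Return the parsing configuration item registered for a key."""
    if key not in CONTENT_PARSING_DICT:
        raise KeyError(f"no parsing configuration item for key {key!r}")
    return CONTENT_PARSING_DICT[key]["item"]
```

Keeping the rules in data rather than in code is what would let a new application market be supported by adding entries instead of writing a new crawler.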

Embodiment 2

[0076] This embodiment corresponds to Embodiment 1, and provides a specific application scenario:

[0077] Take a crawler collecting APP information from the App Store as an example:

[0078] 1. Configure the initial domain name address of the App Store application market as "https://itunes.apple.com/cn", and configure multiple relative path addresses under it. Taking the "action games" and "adventure games" categories as examples, configure the corresponding relative paths and set the corresponding parsing configuration key to "list-parse". The configuration is as follows:

[0079]

[0080]
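
The original configuration listing is not reproduced in this text. As a rough illustration only, a site configuration of the kind described in step 1 might look like the sketch below. The field names and both relative paths are hypothetical placeholders, not the values used in the patent; only the initial address and the key `list-parse` come from the description.

```python
# Hypothetical site configuration. The relative paths are placeholders,
# NOT the real App Store paths from the patent's own listing.
SITE_CONFIG = {
    "initial_address": "https://itunes.apple.com/cn",
    "paths": [
        {"category": "action games",
         "relative_path": "/placeholder-action-path",
         "parse_key": "list-parse"},
        {"category": "adventure games",
         "relative_path": "/placeholder-adventure-path",
         "parse_key": "list-parse"},
    ],
}


def build_path_address_list(config):
    """Join the initial address with each relative path.

    Returns a list of (absolute address, parsing configuration key) pairs.
    """
    base = config["initial_address"].rstrip("/")
    return [
        (base + "/" + entry["relative_path"].lstrip("/"), entry["parse_key"])
        for entry in config["paths"]
    ]
```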

[0081] 2. The crawler downloads the page content of "action games" and "adventure games" respectively, then reads the parsing configuration item 'items.app_info'.name' corresponding to the parsing configuration key (list-parse), that is, the content in the first {} after list-parse in the configuration. According to that content, the APP name in the list can be extracted a...

Embodiment 3

[0090] This embodiment corresponds to Embodiment 1 and Embodiment 2, and provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, it implements all the steps of the crawler method for configuring and collecting APP information provided by Embodiment 1 or Embodiment 2. The specific steps are not repeated here; refer to the description of Embodiment 1 or Embodiment 2 for details.

[0091] The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.

[0092] To sum up, the crawler method and storage medium for configuring and collecting APP information provided by the present invention are applicable to crawling application markets with different layouts, do not require a large amount of coding, and do not need cu...


Abstract

The invention provides a crawler method for configuring and collecting APP information, and a storage medium. The method comprises the steps of: S3, configuring a first-level parsing configuration key corresponding to each path address, and configuring a second-level parsing configuration key corresponding to each application detail page; S4, downloading the webpage content corresponding to one path address in the path address list; S5, parsing the downloaded webpage content according to the first-level parsing configuration item corresponding to the first-level parsing configuration key, and screening out an application detail page address list; S6, parsing each downloaded application detail page according to the second-level parsing configuration item corresponding to the second-level parsing configuration key to obtain the application detail fields; and S7, if parsing shows that a paging address exists in an application detail page, downloading the corresponding webpage content and returning to S5. The method is suitable for crawling application markets with different layouts, and has the advantages of high development efficiency, low development cost, and easy maintenance.
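
The control flow of steps S4 to S7 can be sketched as a small loop. This is a minimal sketch, not the patented implementation: it assumes the parsing configuration items are regular expressions, that `download` is a caller-supplied function returning page text, and that the paging address is looked up in each application detail page as the abstract words it. The function and parameter names are hypothetical.

```python
import re
from collections import deque
from typing import Callable, Dict, List


def crawl(path_addresses: List[str],
          first_level_item: str,
          second_level_item: Dict[str, str],
          paging_pattern: str,
          download: Callable[[str], str]) -> List[Dict[str, str]]:
    """Collect application detail fields, following steps S4 to S7.

    Assumptions (not from the source): every pattern is a regular
    expression with exactly one capture group, and `download` maps an
    address to its page text.
    """
    results: List[Dict[str, str]] = []
    visited = set()
    pending = deque(path_addresses)
    while pending:
        address = pending.popleft()
        if address in visited:
            continue
        visited.add(address)
        content = download(address)  # S4, or the re-download of S7
        # S5: screen out the application detail page address list.
        for detail_address in re.findall(first_level_item, content):
            if detail_address in visited:
                continue
            visited.add(detail_address)
            detail_page = download(detail_address)
            # S6: extract each configured application detail field.
            fields: Dict[str, str] = {}
            for name, pattern in second_level_item.items():
                match = re.search(pattern, detail_page)
                if match:
                    fields[name] = match.group(1)
            results.append(fields)
            # S7: if a paging address exists, queue it and return to S5.
            paging = re.search(paging_pattern, detail_page)
            if paging:
                pending.append(paging.group(1))
    return results
```

Passing `download` in as a parameter keeps the sketch independent of any particular HTTP library and lets the loop be exercised against an in-memory dictionary of pages.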

Description

Technical field

[0001] The invention relates to the field of crawlers, in particular to a crawler method and a storage medium for configuring and collecting APP information.

Background art

[0002] Collecting APP information data means dealing with complex and varied market page layouts, and the page levels to be captured also differ. A common approach is to write a specific crawler for each application market's pages to parse and capture the data. This approach requires a large amount of coding to cover all application market pages, and custom crawler development for each application market incurs high development cost.

[0003] In the prior art, application No. 201310654507.4, titled "Method and Device for Discovering Active IP Addresses in IPv6 Network", provides data collection for active changes of IPv6 addresses; another application, No. 201410655647.8, the name of th...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/9535, G06F9/445
Inventor: 刘德建, 伍张发, 林琛
Owner: 福建省天奕网络科技有限公司