Crawler algorithm for capturing webpage in online shopping mall

A crawler algorithm for web pages in online shopping malls. It addresses the proliferation of counterfeit and inferior products, which harms the long-term development of online stores, and is described as having a sound program design and performing well in use.

Inactive
Publication Date: 2013-03-20
FUJIAN NORMAL UNIV
1 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

Online stores' vetting of sellers is often very limited, which allows counterfeit and shoddy products to proliferate.
In the long run, this draws many negative reviews for the online store and harms its long-term development.



Examples


Embodiment Construction

[0021] The crawler algorithm of the present invention for capturing web pages in an online shopping mall, as shown in Figure 1, comprises the following steps:

[0022] Step 1: Set the crawl width, depth and total. The width is the number of unrelated page links the crawler is allowed to visit. The depth is how far the crawler may follow links. The total is the upper limit S on the number of web pages visited. Enter the initial link;

[0023] Step 2: Establish a url queue to store the initial links to be crawled, and add the url seed set to the url queue. The domain names of several online shopping malls can be used as url entry points to obtain the url seed set;

[0024] Step 3: If the number of visited pages is less than the upper limit S on the total number of visited web pages, or the length of the url queue is not zero (that is, the url queue is not empty), then download the corresponding page according to the initial lin...
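
Steps 1 to 3 above can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: Step 3 is truncated in this text, so the loop condition is read here as "fewer than S pages visited and the url queue not empty", which is an assumption. The names `CrawlLimits`, `crawl` and `fetch_links` are hypothetical, and `width` is stored but not enforced because the visible text does not say how it is applied.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CrawlLimits:
    width: int  # number of unrelated page links allowed (not enforced in this sketch)
    depth: int  # how far the crawler may follow links
    total: int  # upper limit S on the number of visited pages


def crawl(seeds, limits, fetch_links):
    """Visit pages from the url queue until it is empty or S pages are visited.

    fetch_links is a hypothetical callable mapping a url to its outgoing links.
    """
    url_queue = deque()
    seen = set()
    # Step 2: add the url seed set to the url queue (seeds start at depth 0).
    for url in seeds:
        if url not in seen:
            seen.add(url)
            url_queue.append((url, 0))

    visited = []
    # Step 3, as assumed here: budget left AND queue not empty.
    while len(visited) < limits.total and url_queue:
        url, depth = url_queue.popleft()
        visited.append(url)
        if depth >= limits.depth:
            continue  # do not follow links beyond the configured depth
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                url_queue.append((link, depth + 1))
    return visited
```

In practice `fetch_links` would download the page and extract its links; for a dry run it can be a lookup in an in-memory dictionary of pages.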



Abstract

The invention relates to a crawler algorithm for capturing web pages in an online shopping mall, comprising the following steps: acquiring a page in the online shopping mall according to an initial link and adding the seed set in the page to a url queue; downloading the page according to the initial link, adding new links to a list queue, and computing the degree of correlation of the page; setting a corresponding link value according to the depth of the page and the degree of correlation between the page and the topic; for each url present in both the list queue and the url queue, comparing its potential coefficient in the url queue with that in the list queue and updating the potential coefficient in the url queue; for each url present in the list queue but not in the url queue, inserting the url into the url queue according to its potential coefficient; and finally, setting the depth according to the degree of correlation of the current page. The algorithm helps capture topic-related web pages in the online shopping mall precisely, is soundly designed, and runs well.
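
The queue update described in the abstract can be sketched as below. The abstract does not state the update rule or the ordering, so this sketch assumes that the larger potential coefficient wins and that the url queue is kept in descending order of coefficient. The function name `merge_into_url_queue` and the `(url, coefficient)` pair layout are hypothetical.

```python
def merge_into_url_queue(url_queue, list_queue):
    """Merge newly found links into the url queue by potential coefficient.

    Both queues are lists of (url, coefficient) pairs. Returns a new url
    queue sorted by descending coefficient (ordering is an assumption).
    """
    best = dict(url_queue)
    for url, coeff in list_queue:
        if url in best:
            # url is in both queues: compare and update the stored coefficient.
            # Assumption: the larger coefficient is kept.
            best[url] = max(best[url], coeff)
        else:
            # url is only in the list queue: insert it with its coefficient.
            best[url] = coeff
    return sorted(best.items(), key=lambda item: item[1], reverse=True)
```

A heap would avoid re-sorting on every merge in a long crawl; a sorted list is used here only to keep the comparison and insertion rules easy to read.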

Description

Technical field

[0001] The invention relates to the technical field of web page search, and in particular to a crawler algorithm for capturing web pages in an online shopping mall.

Background

[0002] An online shopping mall uses the Internet as its operating medium and relies on Internet resources and various e-commerce methods to form a virtual store covering the whole process from buying to selling. This reduces intermediate links, eliminates transportation costs and the price markup taken by agents, and brings huge room for development to ordinary consumption and to greater market circulation.

[0003] Customers of an online shopping mall can browse and purchase goods 24 hours a day and can contact customer service at any time during working hours to resolve difficulties encountered while shopping. The large amount of information available lets customers learn more and widens their choice. Its customers are unlimited: anyone in ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 陈志德
Owner: FUJIAN NORMAL UNIV