Webpage data crawling method and device and webpage login method and device

A technology for web page data crawling and web page login, applied in the network field. It addresses the problem that crawling web page data easily fails, and avoids repeated logins.
CN110968760A · Pending · Publication Date: 2020-04-07 · BEIJING GRIDSUM TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING GRIDSUM TECH CO LTD
Publication Date
2020-04-07


Abstract

The invention discloses a web page data crawling method and device and a web page login method and device. It relates to the technical field of networks and mainly aims to solve the problem that crawling web page data from websites fails very easily. The method comprises the following steps: when a crawling request is received, identity certificate information is obtained, where the identity certificate information is generated from the user name and password when a user logs in to the web page; the web page is logged in to using the identity certificate information; and after login, the web page data in the web page is crawled. The invention is suitable for crawling data in a website with a crawler.
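The flow in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: `CredentialStore`, `issue_credential`, `crawl`, and the `login` and `fetch` callables are hypothetical names, and the HMAC derivation merely stands in for whatever the real site issues at login (typically a session cookie or token).

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class CredentialStore:
    """Hypothetical in-memory store mapping a site to its identity credential."""
    _tokens: Dict[str, str] = field(default_factory=dict)

    def get(self, site: str) -> Optional[str]:
        return self._tokens.get(site)

    def put(self, site: str, token: str) -> None:
        self._tokens[site] = token


def issue_credential(username: str, password: str, secret: bytes) -> str:
    """Assumed stand-in for the login step: derive a credential from
    the user name and password. Real sites issue their own tokens."""
    message = f"{username}:{password}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def crawl(
    site: str,
    store: CredentialStore,
    login: Callable[[], str],
    fetch: Callable[[str, str], str],
) -> str:
    """On a crawl request, obtain the stored credential, logging in with
    user name and password only when none exists, then fetch the page."""
    token = store.get(site)
    if token is None:
        token = login()          # full login happens once
        store.put(site, token)   # later requests reuse the credential
    return fetch(site, token)
```

With a store shared across crawl requests, the user name and password are used only for the first request; later requests reuse the stored credential, which is how repeated logins are avoided in this sketch.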

Description

technical field

[0001] The present invention relates to the field of network technologies, and in particular, to a method and device for crawling web page data, and a method and device for logging in to a web page.

Background technique

[0002] As the number of Internet users gradually increases, the number of visits to different websites on the network also increases. To obtain the data in a website more comprehensively, many users crawl that data with a crawler. A crawler, also called a web crawler, web spider, or web robot, is a program or script that automatically grabs information from the World Wide Web according to certain rules.
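As a small illustration of "grabbing information according to certain rules", the sketch below extracts the links from a page and keeps only those that satisfy a caller-supplied rule. It uses only the Python standard library; `LinkExtractor`, `extract_links`, and the example rule are hypothetical names chosen for this illustration and do not come from the patent.

```python
from html.parser import HTMLParser
from typing import Callable, List
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags that satisfy a rule."""

    def __init__(self, base_url: str, rule: Callable[[str], bool]) -> None:
        super().__init__()
        self.base_url = base_url
        self.rule = rule
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                url = urljoin(self.base_url, value)  # resolve relative links
                if self.rule(url):
                    self.links.append(url)


def extract_links(html: str, base_url: str, rule: Callable[[str], bool]) -> List[str]:
    """Parse the HTML text and return the links the rule accepts, in page order."""
    parser = LinkExtractor(base_url, rule)
    parser.feed(html)
    return parser.links
```

A rule such as `lambda url: url.startswith("https://example.test/")` would keep the crawl on a single site; a full crawler would repeat this step for each accepted link.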

[0003] At present, many websites have user identity verification mechanisms that require users to log in with an account name and password. When a crawler crawls web page data from such a website, it generally needs the account name and password to ...
