Data collection method based on virtual login

A data collection technology based on registration information, applied in fields such as digital data authentication, data exchange networks, and digital transmission systems. It addresses the problem of incomplete collection results, improves the efficiency and quality of data collection, and avoids manual collection.

Inactive Publication Date: 2016-02-03
INSPUR QILU SOFTWARE IND

AI Technical Summary

Problems solved by technology

[0004] Existing crawler software cannot collect content that is closed to non-users, and the web page information it does obtain is only partial text, so the collection results are incomplete.




Embodiment Construction

[0018] The content of the present invention is described in more detail below:

[0019] 1. HttpClient virtual login

[0020] Functional design: Establish a connection with the target website, obtain its output stream, and write the user's login information to it. After a successful login, the identity information is written into a cookie, and the cookie is sent when visiting other pages of the website to obtain access rights. The detailed process is as follows:

[0021] 1) Using the URL of the target webpage, simulate a visit to the page through the client (webClient) encapsulated by the system, establishing a communication channel between the local machine and the target webpage;

[0022] 2) Use the communication channel to obtain website information, read the saved user name and password for this website from the local database, and create a cookie, which serves as the "ID card" for visiting the target website;

[0023] 3) When visiting the target webpage, submit the cookie...
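
The login flow described in steps 1) to 3) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the text names HttpClient and an encapsulated webClient, whereas this sketch substitutes Python's standard-library urllib and http.cookiejar. The function names (make_session, virtual_login, fetch_page) and the form field names "username" and "password" are hypothetical, and the credentials are passed in directly rather than read from a local database.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return (opener, jar): an opener that stores and resends cookies."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def virtual_login(opener, login_url, username, password,
                  user_field="username", pass_field="password"):
    """POST the saved credentials to the (assumed) login URL.

    Any Set-Cookie header in the response is captured by the cookie jar
    attached to the opener. Field names are assumptions; a real site
    defines its own. A 4xx/5xx reply raises urllib.error.HTTPError.
    """
    form = urllib.parse.urlencode({user_field: username,
                                   pass_field: password})
    with opener.open(login_url, data=form.encode("utf-8"),
                     timeout=10) as resp:
        return resp.status


def fetch_page(opener, url):
    """GET a page; the opener attaches the stored cookies automatically."""
    with opener.open(url, timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Keeping the cookie jar inside one opener means every later request reuses the identity obtained at login, which mirrors the "ID card" role the text assigns to the cookie. Sites that also require a verification code would need an extra step before the POST, which this sketch does not cover.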


Abstract

The invention provides a data collection method based on virtual login and relates to the technical field of data collection. The method comprises the following steps: 1) HttpClient virtual login; and 2) identification and submission of a verification code. User information is stored in a cookie to simulate website login and obtain content that only registered users can access, so that webpage content is collected without registration.

Description

Technical field

[0001] The invention relates to data collection technology, in particular to a data collection method based on virtual login.

Background technique

[0002] Because it is open, free, and vast, Internet data has greatly promoted the development of the big data industry. Data collection, cleaning, filtering, mining, and analysis have become important components of that industry. However, to protect data or reduce the load on their servers, some websites restrict access by non-users. For such sites, data can only be accessed after registration and login. How to log in to these websites through system simulation has therefore become one of the most important problems for data collectors to solve.

[0003] Nowadays, for security reasons, many websites restrict non-user visitors from accessing some key data. Only after the user logs in to the website can the user access all data ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L12/26, H04L29/06, G06F21/36
Inventor: 王贵友, 崔乐乐, 王传超
Owner: INSPUR QILU SOFTWARE IND