Method and device for identifying search engine crawler, and method and device for processing search engine crawler

A search engine crawler identification technique, applied in the field of crawler identification, that addresses the problems of low accuracy and reliability and achieves accurate and reliable identification.

Active Publication Date: 2016-11-23
ALIBABA GRP HLDG LTD
2 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

[0006] Embodiments of the present invention provide a method and device for identifying and processing search engine crawlers, so as to at least solve the technical problems of low accuracy and reliability in the related art due to missed reports and false detections in the identification of search engine crawlers.

Examples

Embodiment 1

[0040] According to an embodiment of the present invention, an embodiment of a method for identifying search engine crawlers is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

[0041] The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, Figure 1 is a hardware structure block diagram of a computer terminal for a search engine crawler identification method according to an embodiment of the present invention. As shown in Figure 1, the computer terminal 10 may include one or more (only one is shown in ...

Embodiment 2

[0076] According to an embodiment of the present invention, a device for implementing the above search engine crawler identification method is also provided. As shown in Figure 3, the device includes:

[0077] The obtaining module 30 is configured to obtain the statistical data collected after the client visits multiple websites within each statistical time period of the statistical cycle, wherein the statistical time periods together constitute the statistical cycle;

[0078] Here, the statistical cycle can be set according to the actual situation, for example, it can be set as one month or one quarter. The statistical time period can also be flexibly set according to the actual situation, such as 24 hours or 48 hours.
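
The relationship between the statistical cycle and its time periods can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(client_id, website, timestamp)`, the function name `count_sites_per_period`, and the choice of "number of distinct websites visited per period" as the statistic are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def count_sites_per_period(records, cycle_start,
                           period=timedelta(hours=24), num_periods=30):
    """Count distinct websites visited by each client in each time period.

    records: iterable of (client_id, website, timestamp) tuples (hypothetical layout).
    cycle_start: datetime at which the statistical cycle begins.
    period: length of one statistical time period (e.g. 24 or 48 hours).
    num_periods: number of periods that make up the cycle (e.g. 30 for about a month).
    """
    # sites[client][i] holds the set of websites seen in period i
    sites = defaultdict(lambda: [set() for _ in range(num_periods)])
    for client_id, website, ts in records:
        idx = (ts - cycle_start) // period  # timedelta // timedelta gives an int
        if 0 <= idx < num_periods:          # ignore records outside the cycle
            sites[client_id][idx].add(website)
    return {client: [len(s) for s in per_period]
            for client, per_period in sites.items()}
```

Calling it with `period=timedelta(hours=48)` and `num_periods=15` would cover the same 30-day cycle with 48-hour periods.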

[0079] In an optional implementation manner, statistics may be made on the number of websites visited by the client every day in a month. In an optional implementation manner, there are multiple specific implemen...

Embodiment 3

[0098] According to an embodiment of the present invention, an embodiment of a processing method for a search engine crawler is also provided. The method may run on the computer terminal shown in Figure 1. Figure 5 is a schematic diagram of a processing method for a search engine crawler according to an embodiment of the present invention. As shown in Figure 5, the method includes the following processing steps:

[0099] Step S502: acquire the statistical data collected after the client visits multiple websites within each statistical time period of the statistical cycle, wherein the statistical time periods together constitute the statistical cycle;

[0100] Here, the statistical cycle can be set according to the actual situation, for example, it can be set as one month or one quarter. The statistical time period can also be flexibly set according to the actual situation, such as 24 hours or 48 hours.

[0101] In an optional implementation manner, statistics may be made on the number of websi...

Abstract

The invention discloses a method and a device for identifying a search engine crawler, and a method and a device for processing a search engine crawler. The method for identifying a search engine crawler comprises: obtaining the statistical data collected after a client accesses a plurality of websites in each statistical time period of a statistical cycle, wherein the statistical time periods together form the statistical cycle; preprocessing the data corresponding to a designated statistical parameter in the statistical data to obtain statistical values, wherein the designated statistical parameter reflects a common statistical characteristic of the statistical data; and, when every statistical value corresponding to the designated statistical parameter is larger than a preset threshold, determining that the client's access to the websites is search engine crawler access. The method and the device solve the technical problem in the prior art that accuracy and reliability are low because of missed reports and false detections in search engine crawler identification.
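
The decision rule in the abstract can be sketched as below. This is a hedged illustration only: the function name `is_search_engine_crawler` and the `preprocess` callable are hypothetical, and the abstract does not say what the preprocessing step actually computes, so it defaults here to passing each value through unchanged.

```python
def is_search_engine_crawler(period_values, threshold, preprocess=lambda v: v):
    """Return True when every per-period statistical value exceeds the threshold.

    period_values: raw values of the designated statistical parameter,
        one per statistical time period of the cycle.
    threshold: the preset threshold.
    preprocess: placeholder for the unspecified preprocessing step
        (assumption: identity by default).
    """
    stats = [preprocess(v) for v in period_values]
    # An empty cycle gives no evidence, so it is not classified as a crawler.
    return bool(stats) and all(s > threshold for s in stats)
```

For example, `is_search_engine_crawler([120, 150, 130], 100)` is true because all three periods exceed the threshold, while a single period at or below the threshold makes the result false.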

Description

Technical field

[0001] The invention relates to the field of crawler identification, and in particular to a method and device for identifying and processing search engine crawlers.

Background

[0002] Cloud computing is currently developing rapidly and is increasingly known and accepted by the public. Enterprises are gradually migrating their applications, websites, and services to the cloud computing environments provided by cloud service providers. At the same time, it is more and more common to obtain data from the Internet by accessing web pages through crawler programs.

[0003] Crawlers include both those from traditional search engines and those from other channels. Although many websites in the cloud environment want to admit search engine crawlers in order to raise their visibility and attract more user visits, crawlers from other channels are also present. For example, some crawlers visit only for their own purposes and do not contribute to the web...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 任宏伟
Owner: ALIBABA GRP HLDG LTD