Network specific content digging method and device and electronic equipment

A technology for mining network-specific content, applied in the Internet field. It addresses problems such as a complex execution process, high demands on manpower and material resources, and the inability to accurately obtain network-specific content, and it achieves comprehensive and accurate mining.

Active Publication Date: 2015-02-25
BEIJING QIHOO TECH CO LTD
Cites: 4; Cited by: 6

AI Technical Summary

Problems solved by technology

[0007] However, this method involves complex technologies such as crawl scheduling, webpage parsing, and data update and storage. The execution process is complex and demands considerable manpower and material resources.
[0008] As a result, current methods cannot quickly and accurately acquire network-specific content from a website.



Examples


Embodiment 1

[0079] Referring to Figure 1, a flow chart of the steps of a method for mining network-specific content in Embodiment 1 of the present invention is shown. In this embodiment, the method for mining network-specific content may include the following steps:

[0080] Step 100: extract, from each of multiple browser logs, a first URL and a second URL redirected from the first URL.

[0081] The webpages of some websites contain many inserted URLs (Uniform Resource Locators). If a user is interested in the content associated with such a URL, the user clicks the inserted URL and visits the webpage that the URL points to. In the embodiment of the present invention, URLs that are inserted into the webpages of a website and then clicked are considered possible network hotspot URLs, and therefore network-specific content is mined based on these clicked URLs.

[0082] The browser can provide a lo...
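Step 100 can be illustrated with a minimal sketch. The source does not give the browser log format, so the sketch assumes a hypothetical tab-separated record of the form `timestamp<TAB>first_url<TAB>second_url`; the function name `extract_url_pairs` is likewise an illustration, not part of the patent.

```python
from typing import Iterable, List, Tuple


def extract_url_pairs(log_lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Collect (first_url, second_url) pairs from browser log lines.

    Assumed (hypothetical) record layout: timestamp, first URL, second URL,
    separated by tabs. Lines that do not fit this layout are skipped.
    """
    pairs: List[Tuple[str, str]] = []
    for line in log_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # malformed record under the assumed layout
        _timestamp, first_url, second_url = fields
        if first_url and second_url:
            pairs.append((first_url, second_url))
    return pairs
```

Each returned pair records one jump: the page the user was on (the first URL) and the page reached by clicking an inserted link (the second URL).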

Embodiment 2

[0094] Referring to Figure 2, a flow chart of the steps of a method for mining network-specific content in Embodiment 2 of the present invention is shown. In this embodiment, the method for mining network-specific content may include the following steps:

[0095] Step 200: extract, from each of multiple browser logs, a first URL and a second URL redirected from the first URL.

[0096] In the embodiment of the present invention, the generation time of a log can also be recorded in the browser log. Multiple browser logs within a certain time period can therefore be selected by generation time as the actual situation requires, for example the browser logs within a certain number of hours, a certain day, a certain week, or a certain month. The embodiment of the present invention does not limit the specific time period.
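Selecting logs by generation time can be sketched as follows. The `LogRecord` type, its field names, and the half-open window `[end - window, end)` are assumptions made for illustration; the source only states that logs carry a generation time and that the period is not limited.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple


class LogRecord(NamedTuple):
    """Hypothetical parsed browser log record."""
    generated_at: datetime
    first_url: str
    second_url: str


def logs_in_window(
    records: Iterable[LogRecord], end: datetime, window: timedelta
) -> List[LogRecord]:
    """Keep records generated in the half-open interval [end - window, end)."""
    start = end - window
    return [r for r in records if start <= r.generated_at < end]
```

Passing `timedelta(hours=6)`, `timedelta(days=1)`, or `timedelta(weeks=1)` as `window` corresponds to the hour, day, and week examples mentioned above.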

[0097] After the browser logs are obtained, the first URL and the second URL redirected from the first URL can be respectively e...

Embodiment 3

[0149] Referring to Figure 3, a structural block diagram of an apparatus for mining network-specific content in Embodiment 3 of the present invention is shown. In this embodiment, the apparatus for mining network-specific content may include the following modules:

[0150] The extracting module 300 is adapted to respectively extract a first URL and a second URL redirected from the first URL from a plurality of browser logs;

[0151] A determination module 302, adapted to determine a first URL that matches the identification information of the specified website;

[0152] A screening module 304, adapted to screen URLs originating from the specified website out of the second URLs redirected from the first URL that matches the identification information of the specified website;

[0153] The search module 306 is adapted to search for a network hotspot URL from the URLs originating from the specified website, and determine the web page content corresponding to the network hotspot URL as network-specific c...
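The flow through modules 302, 304, and 306 can be sketched as one function that consumes the URL pairs produced by the extracting module. Several points are assumptions, since the visible text does not specify them: the identification information is taken to be the site's domain name, a URL is treated as originating from the website when its host equals that domain or is a subdomain of it, and a URL counts as a hotspot when it was clicked at least `min_clicks` times. The names `host_matches` and `mine_hotspot_urls` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def host_matches(url: str, site_domain: str) -> bool:
    """True if the URL's host is site_domain or one of its subdomains (assumed rule)."""
    host = (urlsplit(url).hostname or "").lower()
    site_domain = site_domain.lower()
    return host == site_domain or host.endswith("." + site_domain)


def mine_hotspot_urls(
    pairs: Iterable[Tuple[str, str]], site_domain: str, min_clicks: int = 2
) -> List[str]:
    # Determination (module 302): keep pairs whose first URL matches the site.
    matched = [(first, second) for first, second in pairs
               if host_matches(first, site_domain)]
    # Screening (module 304): keep second URLs that originate from the site.
    candidates = [second for _first, second in matched
                  if host_matches(second, site_domain)]
    # Search (module 306): assumed hotspot criterion is a click-count threshold.
    counts = Counter(candidates)
    return [url for url, n in counts.most_common() if n >= min_clicks]
```

The webpage content behind each returned URL would then be fetched and treated as network-specific content; that retrieval step is left out of the sketch.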



Abstract

The invention discloses a method and device for mining network-specific content, and an electronic device. The method includes: extracting, from multiple browser logs, a first URL and a second URL redirected from the first URL; determining the first URL that matches identification information of a specified website; screening URLs originating from the specified website out of the second URLs redirected from the first URL that matches the identification information of the specified website; and searching the URLs originating from the specified website for a network hotspot URL, the webpage content corresponding to the network hotspot URL being taken as network-specific content. Network-specific content can thus be mined quickly and accurately, and the content obtained is more comprehensive.

Description

Technical field

[0001] The invention relates to the technical field of the Internet, in particular to a method and device for mining specific content on a network, and an electronic device.

Background technique

[0002] With the rapid development of the Internet, the Internet, as a medium of information dissemination, has become an important channel for people to obtain information and exchange information. It has the advantage of faster dissemination of information, and is more and more favored by the majority of netizens. A large number of netizens flock to some websites that provide interactive services to express their opinions and break news. Thousands of topics are generated from the Internet every day. How to obtain network-specific content more quickly from the massive information of relevant websites will play a guiding role in understanding the social development situation and grasping the dynamics of public opinion.

[0003] There are two main methods for obtaini...


Application Information

IPC(8): G06F17/30
CPC: G06F16/958
Inventor: 罗维
Owner: BEIJING QIHOO TECH CO LTD