Method and device for recognizing network resource entity content page

A network resource identification technology, applied to identifying the directory pages of network resource entities, that addresses the poor scalability of rule-based filtering and achieves efficient, accurate, and highly scalable recognition.

Active Publication Date: 2014-02-26
BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD
4 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

However, the existing text-based judgment method requires rules to be established in advance. If the text in a web page does not meet the preset rules, the page is filtered out.
In fact, a web page whose text does not meet the preset rules may still be a catalog page.
The scalability of the existing technology is therefore relatively poor.


Examples

Embodiment Construction

[0079] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention belong to the protection scope of the present invention.

[0080] In the embodiment of the present invention, the directory page of a network resource entity is discovered from the web page jump paths generated while users access web pages. The principle is that, when accessing network resource entities, users usually click from the catalog page to a specific chapter page but rarely return from a specific chapter page to the catalog page. Based on this feature, statistics can be used to analyze the user's access path to a...
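
The asymmetry described in [0080] can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `likely_directory_page`, the `(source, target)` jump format, and the scoring rule (outgoing jumps minus incoming jumps) are all assumptions made for illustration.

```python
from collections import Counter

def likely_directory_page(jumps):
    """Return the page whose outgoing jumps most outnumber its incoming jumps.

    jumps: iterable of (source_url, target_url) pairs observed while users
    browse one network resource entity (hypothetical input format).
    """
    outgoing = Counter()
    incoming = Counter()
    for source, target in jumps:
        outgoing[source] += 1
        incoming[target] += 1
    pages = set(outgoing) | set(incoming)
    # Assumed score: a catalog page is left often and returned to rarely.
    return max(pages, key=lambda page: outgoing[page] - incoming[page])

# Hypothetical jump records for one serialized work.
sample_jumps = [
    ("/book/1/", "/book/1/ch1"),
    ("/book/1/", "/book/1/ch2"),
    ("/book/1/ch1", "/book/1/ch2"),
    ("/book/1/", "/book/1/ch3"),
    ("/book/1/ch2", "/book/1/"),   # a rare return to the catalog page
]
candidate = likely_directory_page(sample_jumps)
```

With these sample records, `/book/1/` has three outgoing jumps and one incoming jump, so it scores highest. No text rules about the page content are involved, which is the scalability point made in the summary.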

Abstract

The invention discloses a method and a device for recognizing the content page (the directory, or catalog, page) of a network resource entity. The method includes: acquiring, while a user browses web pages, the jump-process information of entity resource web pages related to network resource entities; restoring, from that information, the user's entity access track for a specific network resource entity; and acquiring the starting-point web page address on the entity access track and determining the content page of the specific network resource entity according to that address. The method improves scalability in recognizing content pages.
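
The three steps in the abstract (acquire jump records, restore per-user access tracks, take the starting-point address) can be sketched as follows. This is a hedged illustration only: the `Jump` record layout, the assumption that each record already carries an entity identifier, and the majority vote across users are not taken from the patent text.

```python
from collections import Counter, defaultdict
from typing import NamedTuple

class Jump(NamedTuple):
    """One observed page jump (hypothetical record layout)."""
    user: str
    entity: str   # assumed: the entity is already known for each record
    time: int
    src: str
    dst: str

def restore_tracks(jumps):
    """Rebuild each user's ordered access track for each entity."""
    grouped = defaultdict(list)
    for jump in jumps:
        grouped[(jump.user, jump.entity)].append(jump)
    tracks = {}
    for key, records in grouped.items():
        records.sort(key=lambda j: j.time)
        # The track is the first source page followed by every target page.
        tracks[key] = [records[0].src] + [j.dst for j in records]
    return tracks

def content_pages(jumps):
    """Pick, per entity, the most common starting-point address (assumed rule)."""
    votes = defaultdict(Counter)
    for (_user, entity), track in restore_tracks(jumps).items():
        votes[entity][track[0]] += 1
    return {entity: c.most_common(1)[0][0] for entity, c in votes.items()}

# Hypothetical log: two users start at /a/, one starts at a chapter page.
sample_log = [
    Jump("u1", "novelA", 1, "/a/", "/a/1"),
    Jump("u1", "novelA", 2, "/a/1", "/a/2"),
    Jump("u2", "novelA", 5, "/a/", "/a/3"),
    Jump("u3", "novelA", 7, "/a/2", "/a/3"),
]
result = content_pages(sample_log)
```

Here `/a/` is the starting point of two of the three tracks, so it is chosen for `novelA`. Aggregating over many users keeps a single session that begins on a chapter page from deciding the result.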

Description

Technical field

[0001] The invention relates to the technical field of web page identification, in particular to a method and device for identifying a network resource entity catalog page.

Background art

[0002] A web browser is software that displays files on a web server or file system and allows users to interact with them. It can be used to display text, images, and other information on the World Wide Web or a local area network. These texts or images may be hyperlinks to other websites, and users can browse various information by clicking the hyperlinks.

[0003] Among the abundant network resources there is a special kind that is serialized and periodically updated in units of episodes, chapters, or sections. For example, a TV series may be updated with two episodes every day, a manga with one episode every week, and so on. For such network resources, each specific entity generally corresponds to a directory page, and the browsing...

Application Information

IPC(8): G06F17/30
CPC: G06F16/955
Inventor: 崔华, 肖镜辉
Owner: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD