Method and system for quickly recognizing webpage types through links
A technique for recognizing web page types, applied in the field of network communication. It addresses problems such as low versatility and high system resource occupation, and achieves the effect of improving work efficiency.
Embodiment Construction
[0031] To give a more specific understanding of the technical content, characteristics, and effects of the present invention, it is described in detail below in conjunction with the illustrated embodiment:
[0032] The present invention first constructs a link normalization dictionary that records the link (url) normalization methods required by each web page type. The specific method is as follows:
[0033] First, for each website to be crawled, analyze the url naming rules of the types of web pages to be crawled. For example, the urls of all book display pages (contentpage) of Boku.com (www.bookuu.com) have the form:
[0034] http://www.bookuu.com/kgsm/ts/2010/07/13/1786270.shtml
[0035] http://www.bookuu.com/kgsm/ts/2010/09/21/1827795.shtml
[0036] http://www.bookuu.com/kgsm/ts/2009/12/08/1644478.shtml
[0037] That is, the urls share the same prefix, and only some parts vary (in the above example, the final number string).
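The idea of a shared prefix with varying number strings can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the normalization method of the invention, whose remaining steps are not reproduced in this text. The names `normalize_url`, `recognize_page_type`, and `LINK_NORMALIZATION_DICT`, the `{num}` placeholder, and the choice to replace every run of digits in the path are all hypothetical.

```python
import re
from urllib.parse import urlsplit

def normalize_url(url):
    """Collapse a url to a normalized form (assumed scheme, for illustration).

    The scheme is dropped, and every run of digits in the path is replaced
    with the placeholder '{num}', so urls that differ only in their number
    strings map to the same normalized form.
    """
    parts = urlsplit(url)
    return parts.netloc + re.sub(r"\d+", "{num}", parts.path)

# Hypothetical link normalization dictionary: normalized url -> web page type.
# The single entry is derived from the three example urls in the text.
LINK_NORMALIZATION_DICT = {
    "www.bookuu.com/kgsm/ts/{num}/{num}/{num}/{num}.shtml": "contentpage",
}

def recognize_page_type(url):
    """Return the page type recorded for the url's normalized form, or None."""
    return LINK_NORMALIZATION_DICT.get(normalize_url(url))

if __name__ == "__main__":
    examples = [
        "http://www.bookuu.com/kgsm/ts/2010/07/13/1786270.shtml",
        "http://www.bookuu.com/kgsm/ts/2010/09/21/1827795.shtml",
        "http://www.bookuu.com/kgsm/ts/2009/12/08/1644478.shtml",
    ]
    for url in examples:
        print(url, "->", recognize_page_type(url))
```

In this sketch the page type is obtained from the link alone with one dictionary lookup, without downloading or parsing the page. A url whose normalized form is not in the dictionary yields `None`.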
[0038] Then, ac...