System and method for judging repeated web pages
A technology concerning web pages and web page content, applied to systems for judging repeated web pages, that addresses problems such as high time complexity and time-consuming calculations.
Embodiment Construction
[0041] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.
[0042] Figure 1 is a flow chart of the method of the present invention for judging repeated web pages.
[0043] In step 10, multiple web pages are obtained. In this step, a web crawler (spider) may be used to crawl a large number of web pages from the Internet.
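As an illustration of step 10, the following is a minimal sketch of a breadth-first crawler, not the implementation described in the patent. The names `crawl`, `LinkParser`, `fetch`, and `max_pages` are assumptions introduced here; `fetch` is a caller-supplied function that returns the HTML of a URL (or `None` on failure), so the sketch does not depend on any particular HTTP library.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href values of <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl starting from seed_urls.

    fetch(url) is assumed to return the page HTML as a string,
    or None if the page could not be retrieved.
    Returns a dict mapping each crawled URL to its HTML.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop the #fragment part.
            link, _fragment = urldefrag(urljoin(url, href))
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

In practice `fetch` might wrap an HTTP client with timeouts and politeness rules; passing a dictionary's `get` method is enough to exercise the sketch offline.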
[0044] In step 11, the text of each web page is extracted. Many methods can be used to extract the text from a web page; a specific embodiment of step 11 is described in detail below with reference to Figure 2.
[0045] Figure 2 is a sub-flow chart of step 11 in Figure 1.
[0046] In step 111, the web page is divided into blocks. As shown in Figure 4, the web page content displayed by the browser can be divided into multiple content blocks, including: navigation block, web page location block,...
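The excerpt does not state how the blocks are delimited, so the following is only a rough sketch of step 111 under one assumption: that block-level HTML tags mark the boundaries between content blocks. The names `BlockSplitter`, `split_into_blocks`, `BLOCK_TAGS`, and `SKIP_TAGS` are hypothetical, and the sketch does not label blocks as navigation, location, or otherwise.

```python
from html.parser import HTMLParser

# Assumption: these tags start or end a content block.
BLOCK_TAGS = {"div", "table", "tr", "td", "p", "ul", "ol", "li",
              "h1", "h2", "h3", "h4", "h5", "h6"}
# Text inside these tags is never shown to the reader.
SKIP_TAGS = {"script", "style"}


class BlockSplitter(HTMLParser):
    """Splits a page's visible text into blocks at block-level tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._current = []
        self._skip_depth = 0

    def _flush(self):
        # Join the collected text and collapse runs of whitespace.
        text = " ".join("".join(self._current).split())
        if text:
            self.blocks.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._current.append(data)

    def close(self):
        super().close()   # delivers any buffered trailing text
        self._flush()


def split_into_blocks(html):
    """Returns the list of non-empty text blocks found in html."""
    splitter = BlockSplitter()
    splitter.feed(html)
    splitter.close()
    return splitter.blocks
```

A later step could then classify each returned block (for example as navigation or body text) before the page text is assembled; that classification is outside this sketch.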