Web data similarity detection method based on two-stage filtration of structure and content
A similarity detection method using two-stage (secondary) filtering, applied to similarity detection of Web data structure and content. It addresses problems such as the difficulty of efficiently finding approximate content blocks and the failure to fully exploit the characteristics of Web data distribution areas.
Embodiment Construction
[0097] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the implementation examples described here serve only to illustrate and explain the present invention and are not intended to limit it.
[0098] Referring to Figure 1, the technical solution adopted by the present invention is a Web data similarity detection method based on two-stage (secondary) filtering of structure and content. Building on traditional general-purpose similarity detection methods, it exploits the characteristics of Web data structure and content distribution to perform two-stage filtering on the document set under detection. The present invention holds that documents containing similar Web data should first be similar in structure, and if there is a big difference in the structure ...
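The two-stage idea in paragraph [0098] can be illustrated with a minimal Python sketch. It is not the patented procedure, whose details are truncated in this excerpt. Everything concrete here is an assumption made for illustration: structure is represented as the set of root-to-node tag paths, content as 3-word shingles, both stages use Jaccard similarity, and the names `TagPathCollector`, `profile`, `two_stage_filter` and the 0.5 thresholds are hypothetical.

```python
from html.parser import HTMLParser


class TagPathCollector(HTMLParser):
    """Collects root-to-node tag paths (e.g. 'html/body/div') and the page's words."""

    # Elements that never have a closing tag, so they are not pushed on the stack.
    VOID = {"br", "img", "hr", "meta", "link", "input"}

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()
        self.words = []

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def shingles(words, k=3):
    """Set of k-word shingles; short texts yield a single shingle."""
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def profile(html):
    """Return (structure features, content features) for one document."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths, shingles(collector.words)


def two_stage_filter(query_html, candidates,
                     structure_threshold=0.5, content_threshold=0.5):
    """Stage 1 drops structurally dissimilar documents (assumed metric);
    stage 2 compares content only for the survivors."""
    q_paths, q_shingles = profile(query_html)
    results = []
    for doc_id, html in candidates.items():
        paths, doc_shingles = profile(html)
        if jaccard(q_paths, paths) < structure_threshold:
            continue  # filtered out by structure; content is never compared
        score = jaccard(q_shingles, doc_shingles)
        if score >= content_threshold:
            results.append((doc_id, score))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

In this sketch the structural stage acts as a pre-filter, so the content comparison is only applied to documents that pass it. A production version would likely compute the content features lazily, after the structural check, rather than profiling every candidate in full.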