DOM tree-based page partitioning method, apparatus and device, and storage medium
A DOM tree-based page partitioning technology, applicable to database clustering/classification, special data processing applications, network data browsing optimization, and similar fields. It saves time, improves the speed and efficiency of information extraction, and improves accuracy.
Embodiment Construction
[0046] It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.
[0047] The solution of the embodiment of the present invention is mainly as follows: the webpage to be divided is denoised, and a DOM tree is generated from the denoised webpage; the node path of each node on the DOM tree is obtained, and the similarity between the node paths is calculated; the nodes are clustered according to the similarity, and a clustering result is generated; the webpage to be divided is then divided into blocks according to the clustering result. This reduces the influence of noise content on webpage information extraction and improves the accuracy of page information extraction. The method can adapt to webpages with different structures, has strong versatility and adaptability, saves information extraction time, and improves the speed and efficiency of in...
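The steps in paragraph [0047] can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the text does not specify the denoising rule, the similarity measure, or the clustering algorithm. Here denoising is assumed to mean skipping `script` and `style` elements, similarity is assumed to be the `difflib.SequenceMatcher` ratio over root-to-node tag paths, and clustering is assumed to be a greedy single pass with a threshold. The names `PathCollector`, `path_similarity`, `cluster_paths`, `partition_page`, `NOISE_TAGS`, and `VOID_TAGS` are hypothetical.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

NOISE_TAGS = {"script", "style"}  # assumed noise set; the source does not define one
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # elements with no end tag


class PathCollector(HTMLParser):
    """Walks the markup and records the root-to-node tag path of each element."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tags currently open, root first
        self.paths = []      # one tuple of tags per retained node
        self.in_noise = False

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.in_noise = True          # denoising: skip the whole element
            return
        if self.in_noise:
            return
        if tag in VOID_TAGS:
            self.paths.append(tuple(self.stack + [tag]))
            return
        self.stack.append(tag)
        self.paths.append(tuple(self.stack))

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS:
            self.in_noise = False
            return
        if self.in_noise or tag in VOID_TAGS:
            return
        if tag in self.stack:             # tolerate unclosed inner tags
            while self.stack.pop() != tag:
                pass


def path_similarity(a, b):
    """Assumed measure: ratio of matching tags between two node paths (0.0 to 1.0)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_paths(paths, threshold=0.8):
    """Greedy clustering: join the first cluster whose first path is similar enough."""
    clusters = []
    for path in paths:
        for cluster in clusters:
            if path_similarity(cluster[0], path) >= threshold:
                cluster.append(path)
                break
        else:
            clusters.append([path])
    return clusters


def partition_page(html, threshold=0.8):
    """Denoise, collect node paths, cluster them; each cluster is one candidate block."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return cluster_paths(collector.paths, threshold)
```

Calling `partition_page(html)` on a page string returns a list of clusters, each a list of node paths that would be treated as one block. The threshold of 0.8 is an arbitrary default chosen for the sketch; a lower value merges more nodes into each block.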


