Website information merging and deduplication method
A method for merging and deduplicating website information, applied in the Internet field. It addresses the problems that users cannot enjoy the convenience of the Internet to the greatest extent and that time and labor are wasted, and it achieves timeliness and convenience.
Examples
Embodiment 1
[0030] A method for merging and deduplicating website information, comprising the following steps:
[0031] (1) Obtain the data of the multiple target websites to be analyzed, compare the data horizontally across the websites, and merge and deduplicate the information;
[0032] A. According to the structure of each target website, set a website template for the target website to be analyzed, and set the URL of the target website. Designing the website template includes analyzing the structure of each target website to be compared and, according to that structure, setting the URL of the data home page to be crawled, the URLs of the corresponding data pages under the data home page, and the page tags to be captured. The tags are extracted through regular expression matching and DOM parsing of the HTML tag elements. The required website content can then be obtained through the website template.
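The website template of step A can be illustrated with a minimal sketch. It is not the patent's implementation: the SiteTemplate fields, the helper names find_data_pages and extract_items, and the choice to identify the captured tag by its class attribute are all assumptions made for illustration. It uses a regular expression to select data-page URLs and Python's standard html.parser module for the tag-level parsing.

```python
import re
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class SiteTemplate:
    """Hypothetical template: where to crawl and which tag to capture."""
    home_url: str       # URL of the data home page
    link_pattern: str   # regex for data-page URLs under the home page
    target_tag: str     # HTML tag to capture, e.g. "h2"
    target_class: str   # class attribute that identifies the tag


class TagTextCollector(HTMLParser):
    """Collects the text of every element matching a tag and class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0   # > 0 while inside a matching element
        self.items = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Track nesting of the same tag inside a match.
            if tag == self.tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.depth = 1
            self.items.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.items[-1] += data


def find_data_pages(template, home_html):
    """Regex step: keep only hrefs that match the template's link pattern."""
    hrefs = re.findall(r'href="([^"]+)"', home_html)
    return [h for h in hrefs if re.fullmatch(template.link_pattern, h)]


def extract_items(template, page_html):
    """Parsing step: return the text of each captured tag on a data page."""
    parser = TagTextCollector(template.target_tag, template.target_class)
    parser.feed(page_html)
    parser.close()
    return [text.strip() for text in parser.items]
```

In this sketch each target website gets its own SiteTemplate instance, so supporting a new site means adding a template rather than changing the extraction code. Fetching the pages over the network is deliberately left out.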
[0033] B. Set up an independent thread for the website te...
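The merge-and-deduplicate operation named in step (1) can be sketched as follows. The source text shown here does not specify how duplicates are recognized, so this sketch assumes, purely for illustration, that each record is a dict with a "title" field and that two records are duplicates when their titles match after whitespace and case normalization. The function names normalize and merge_and_dedup are hypothetical.

```python
def normalize(text):
    """Collapse whitespace and ignore case so near-identical titles compare equal."""
    return " ".join(text.split()).casefold()


def merge_and_dedup(records_by_site):
    """Merge records from several sites, keeping one entry per normalized title.

    records_by_site maps a site name to a list of {"title": ...} dicts
    (an assumed record shape). Each merged entry lists every site on
    which the item was found.
    """
    merged = {}
    for site, records in records_by_site.items():
        for record in records:
            key = normalize(record["title"])
            entry = merged.setdefault(
                key,
                {"title": " ".join(record["title"].split()), "sources": []},
            )
            if site not in entry["sources"]:
                entry["sources"].append(site)
    return list(merged.values())
```

Keeping the list of source sites on each merged entry preserves the horizontal comparison across websites: a reader can see at a glance which sites carry the same item.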