Method and device for creating and querying catalog database
A catalog database technology, applied to database retrieval, network data indexing, and network data retrieval. It addresses problems such as a small amount of catalog data, poor support for website security detection, and difficulty scanning website directories for vulnerabilities.
Examples
Embodiment 1
[0055] The embodiment of the present invention provides a method for generating a catalog database. The catalog database generated by this method holds a large amount of data, and a primary key and a secondary key are set for the website source code data stored in it, which facilitates catalog queries.
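A minimal sketch of how a catalog keyed by a primary key and a secondary ("slave") key could be stored and queried. The text above does not say how the two keys are derived, so the rule used here (directory part of a source path as primary key, file name as secondary key) is purely an assumption, as are the table name, the column names, the helper names `split_keys`, `build_catalog` and `lookup`, and the choice of SQLite.

```python
import sqlite3
from pathlib import PurePosixPath


def split_keys(source_path: str) -> tuple:
    """Hypothetical key rule: directory -> primary key, file name -> secondary key."""
    p = PurePosixPath(source_path)
    return str(p.parent), p.name


def build_catalog(paths):
    """Create an in-memory catalog table and load one row per source path."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog ("
        " primary_key TEXT NOT NULL,"
        " secondary_key TEXT NOT NULL,"
        " PRIMARY KEY (primary_key, secondary_key))"
    )
    # INSERT OR IGNORE skips paths that were already recorded.
    conn.executemany(
        "INSERT OR IGNORE INTO catalog VALUES (?, ?)",
        [split_keys(p) for p in paths],
    )
    conn.commit()
    return conn


def lookup(conn, primary_key):
    """Return all secondary keys stored under one primary key, sorted."""
    rows = conn.execute(
        "SELECT secondary_key FROM catalog"
        " WHERE primary_key = ? ORDER BY secondary_key",
        (primary_key,),
    )
    return [row[0] for row in rows]


# Illustrative paths only; they are not taken from the patent text.
conn = build_catalog(["/admin/install.php", "/admin/login.php", "/index.php"])
print(lookup(conn, "/admin"))
```

Because the lookup filters on the leading column of the composite key, SQLite can answer it from the primary key index rather than scanning the whole table.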
[0056] As shown in Figure 1, the directory database generation method provided by the embodiment of the present invention includes steps S110-S140, specifically as follows.
[0057] S110. Obtain the website structure of the target website, and determine a crawling strategy according to that website structure.
[0058] The target website is an open-source website-building platform, for example Github or Webmaster's Home, although it can also be another website. The embodiment of the present invention does not limit the specific type of the a...
Embodiment 2
[0079] The embodiment of the present invention provides a catalog database query method, which is applied to the catalog database generated by the catalog database generating method in Embodiment 1 of the present invention. The method queries the subset of the catalog database that matches the target website, which improves query efficiency.
[0080] As shown in Figure 2, querying the directory database subset that matches the target website with the method provided by the embodiment of the present invention includes steps S210-S220.
[0081] S210. Set a crawler strategy for the target website, and obtain a site map (Sitemap) of the target website according to the crawler strategy. The crawler strategy includes a crawl start URL, encrypted data of the target website, and request header information.
[0082] When using web crawler technology to obtain the Sitemap of the target website, it is necessary to set the starting URL for crawling. Therefore, the website has an anti-crawler m...
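The three strategy settings named in S210 can be sketched as a small configuration object. This is only an illustration under stated assumptions: the class name `CrawlerStrategy`, its field names, the example URL and header value, and the decision to send the encrypted data as the request body are all hypothetical, since the text does not say how the encrypted data is used. The sketch only builds a request object and performs no network access.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.request import Request


@dataclass
class CrawlerStrategy:
    """Hypothetical container for the settings listed in S210."""
    start_url: str                          # crawl start URL
    encrypted_data: Optional[bytes] = None  # opaque site-specific payload (assumed)
    headers: Dict[str, str] = field(default_factory=dict)  # request header info

    def build_request(self) -> Request:
        # Assumption: the encrypted data, if any, travels as the request body.
        # urllib then treats the request as POST; without a body it is GET.
        return Request(self.start_url, data=self.encrypted_data, headers=self.headers)


# Illustrative values only.
strategy = CrawlerStrategy(
    start_url="https://example.com/sitemap.xml",
    headers={"User-Agent": "catalog-bot"},
)
request = strategy.build_request()
print(request.full_url, request.get_method())
```

Keeping the settings in one object lets the same strategy be reused for every request issued while the Sitemap is being collected.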
Embodiment 3
[0099] An embodiment of the present invention provides a device for generating a directory database, which is used to execute the method for generating a directory database provided in Embodiment 1 of the present invention.
[0100] As shown in Figure 3, the directory database generation device provided by the embodiment of the present invention includes a first determination module 310, an acquisition module 320, a second determination module 330 and a generation module 340.
[0101] The first determination module 310 is used to obtain the website structure of the target website and to determine the crawler strategy according to the website structure.
[0102] The acquisition module 320 is configured to acquire the source code data of the target website according to the crawler strategy.
[0103] The second determination module 330 is used to determine the primary key of the source code data in the directory database according to the path for o...