Web page classification method based on training set
A webpage classification technology based on a training set, applied in fields such as special data processing applications, instruments, and electrical digital data processing. It addresses problems such as webpages that cannot be classified and declining classification accuracy.
Embodiment Construction
[0052] The present invention proposes a technical framework for automatically and effectively classifying webpages and specifies the classification algorithm in detail, as shown in Figure 1. As the figure shows, the system is divided into three parts: webpage content processing, webpage vector representation, and webpage vector comparison.
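The three parts named above can be illustrated with a minimal Python sketch. It is not the patented algorithm: the function names (`process_content`, `to_vector`, `cosine`, `classify`), the use of raw term counts as vector values, and cosine similarity as the comparison measure are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def process_content(html):
    """Part 1 (content processing): page source -> lowercase word tokens."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z]+", " ".join(parser.parts).lower())


def to_vector(tokens):
    """Part 2 (vector representation): sparse term-count vector (assumed weighting)."""
    return Counter(tokens)


def cosine(a, b):
    """Part 3 (vector comparison): cosine similarity (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(html, class_vectors):
    """Return the label whose reference vector is most similar to the page."""
    page = to_vector(process_content(html))
    return max(class_vectors, key=lambda label: cosine(page, class_vectors[label]))
```

Here `class_vectors` is a hypothetical mapping from a class label to one reference vector per class, for example the summed vectors of that class's training pages.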
[0053] Two terms need to be defined here. The training set is a large collection of webpage source code with known classifications. The source code is stored as text files, placed in different folders according to genre, and these texts are ultimately processed and converted into corresponding vectors. Feature extraction is the process of determining each element of the webpage vector: each element is a keyword entry that reflects the content of the webpage, and the value of the element is the calculation result of the weight value ...
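The folder-per-genre training set and the idea of feature extraction can be sketched as follows. This is a hedged illustration only: `load_training_set` and `extract_features` are hypothetical names, markup stripping is omitted for brevity, and the weight used here (relative term frequency) is an assumed stand-in, since the weight calculation itself is not given in this excerpt.

```python
import re
from collections import Counter
from pathlib import Path


def load_training_set(root):
    """Map each genre (sub-folder name) to the texts of the files inside it."""
    training = {}
    for genre_dir in sorted(Path(root).iterdir()):
        if genre_dir.is_dir():
            training[genre_dir.name] = [
                f.read_text(encoding="utf-8", errors="ignore")
                for f in sorted(genre_dir.iterdir())
                if f.is_file()
            ]
    return training


def extract_features(texts, top_k=100):
    """Choose the top_k most frequent words as vector elements.

    ASSUMPTION: each weight is the word's relative frequency across the
    given texts; the source's own weight formula is not shown here.
    """
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    total = sum(counts.values())
    return {term: n / total for term, n in counts.most_common(top_k)}
```

Calling `extract_features(load_training_set("training")["sports"])` would then yield one keyword-to-weight mapping for a hypothetical `training/sports/` folder.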