Webpage clustering processing method based on improved K-means algorithm
A web page clustering processing method based on an improved K-means algorithm, applicable to fields such as electrical digital data processing, natural language data processing, and special data processing applications. It addresses problems such as finding an optimal solution when multiple factors are involved, with the effects of shortening calculation time, ensuring reliability and precision, and improving efficiency.
Embodiment Construction
[0033] The technical solutions in the embodiments of the present invention will be described clearly and in detail below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some of the embodiments of the invention.
[0034] The technical solution by which the present invention solves the above technical problems is as follows:
[0035] In this embodiment, a web page clustering processing method based on an improved K-means algorithm is carried out as follows.
[0036] Step 1: Collect the web page text dataset
[0037] Collect the web page text dataset by using a web crawler tool developed in Python to fetch the required text information.
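The collection step can be illustrated with a minimal sketch that uses only the Python standard library. The patent text does not name the crawler tool or its interface, so the function names `extract_text` and `fetch_page_text`, the `User-Agent` string, and the choice to skip `<script>` and `<style>` contents are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_page_text(url, timeout=10.0):
    """Download one page and return its visible text (hypothetical helper)."""
    # The User-Agent value is a placeholder, not something the source specifies.
    request = Request(url, headers={"User-Agent": "example-crawler/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_text(html)
```

Calling `fetch_page_text` once per collected URL would yield one text document per page. A production crawler would also need rate limiting, robots.txt handling, and error handling, none of which are shown here.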
[0038] Step 2: Preprocess the collected dataset
[0039] Preprocess the collected data: use a Chinese text word segmentation tool to perform word segmentation on the acquired text informati...
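To make the word segmentation step concrete, here is a small sketch of dictionary-based forward maximum matching, one classic way to segment Chinese text. It is a stand-in, not the tool the patent refers to: the source does not name its segmentation tool, and in practice a dedicated library (jieba is one common choice) would likely be used instead. The function name `forward_max_match`, the toy vocabulary, and the `max_word_len` parameter are hypothetical.

```python
def forward_max_match(text, vocabulary, max_word_len=4):
    """Segment text by greedily taking the longest dictionary word at each position.

    Characters that start no dictionary word are emitted as single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy vocabulary, for illustration only.
vocabulary = {"网页", "文本", "聚类", "算法"}
tokens = forward_max_match("网页文本聚类算法", vocabulary)
```

With this toy vocabulary the sample string splits into the four dictionary words. A real segmentation tool relies on a much larger dictionary and statistical models to resolve ambiguous boundaries, which this greedy sketch does not attempt.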