Machine learning-based data classification method and device
A machine learning-based data classification technology, applied in the field of data classification. It addresses the time-consuming, complicated, and disjointed operation of existing processes, and achieves improved extraction results, simple operation, and high overall efficiency.
Examples
Embodiment 1
[0044] As shown in Figure 1, a machine learning-based data classification method establishes a data classification model through machine learning, then reads and classifies the data to be classified according to that model. The data classification model is established based on feature words, and the documents are clustered: the file content is segmented into words, the TF-IDF algorithm is used to calculate the weight of each word, the similarity between files is then calculated, and similar files are clustered together.
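The pipeline described in this paragraph (word segmentation, TF-IDF weighting, file similarity, clustering of similar files) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: whitespace splitting stands in for word segmentation, cosine similarity is assumed as the similarity measure, and the greedy single-pass clustering with a `threshold` parameter is a hypothetical choice, since the source does not name a specific similarity measure or clustering algorithm. All function names are illustrative.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace splitting stands in for word segmentation.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_similarity(docs, threshold=0.2):
    """Greedy clustering: a file joins the first cluster whose first member
    is at least `threshold` similar to it, otherwise it starts a new cluster.
    The threshold value is an assumption chosen for this toy data."""
    vectors = tfidf_vectors(docs)
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    "drug trial patient dosage",
    "patient dosage clinical drug",
    "forest timber tree growth",
    "tree growth forest soil",
]
clusters = cluster_by_similarity(docs)
print(clusters)
```

On this toy input the two medical files share three weighted terms and the two forestry files share three others, so each pair ends up in its own cluster. For Chinese text, a real segmenter would replace `tokenize`.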
[0045] The method of Embodiment 1 comprises the following steps:
[0046] S11. Determine the first characteristic word group corresponding to each type of data based on the learning data;
[0047] Learning data refers to the sample data used for machine learning. These samples have already been classified; for example, they may be selected from academic materials in different fields such as medicine, forestry, architecture, and petroleum. Because a single field can itself be divided into different categories or research directions, the present invention clusters these data, grouping documents with high similarity into one category as far as possible, and then derives the corresponding rules for that category, that is, the model, which contains the characteristics of the class. With these characteristics, users can search for the documents they care about from a large number of documents, that is, documents that meet the characteristics of the model, compared wi...
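Step S11 can be illustrated with a small sketch that derives one feature word group per class from already-classified learning data. The weighting scheme here is an assumption, not taken from the source: each class is pooled into one document, and a word is weighted by its in-class frequency times log(number of classes / number of classes containing the word). The function name `feature_word_groups` and the `top_k` parameter are hypothetical.

```python
import math
from collections import Counter


def feature_word_groups(labeled_docs, top_k=3):
    """Map each class label to its top_k highest-weighted words."""
    # Pool the word counts of all documents that share a label.
    class_counts = {}
    for label, text in labeled_docs:
        class_counts.setdefault(label, Counter()).update(text.lower().split())

    n_classes = len(class_counts)
    # Number of classes in which each word appears at least once.
    class_freq = Counter(t for counts in class_counts.values() for t in counts)

    groups = {}
    for label, counts in class_counts.items():
        total = sum(counts.values())
        weights = {
            term: (count / total) * math.log(n_classes / class_freq[term])
            for term, count in counts.items()
        }
        # Highest weight first; ties broken alphabetically for a stable order.
        ranked = sorted(weights, key=lambda t: (-weights[t], t))
        groups[label] = ranked[:top_k]
    return groups


learning_data = [
    ("medicine", "patient drug dosage study"),
    ("medicine", "drug trial patient study"),
    ("forestry", "forest tree growth study"),
    ("forestry", "tree timber forest study"),
]
groups = feature_word_groups(learning_data, top_k=2)
print(groups)
```

A word such as "study" that occurs in every class receives weight zero under this scheme, so it does not enter any feature word group.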
Embodiment 2
[0062] As shown in Figure 2, a machine learning-based data classification method includes the following steps:
[0063] S11. Determine the first characteristic word group corresponding to each type of data based on the learning data;
[0064] Learning data refers to the sample data used for machine learning. These samples have already been classified; for example, they may be selected from academic materials in different fields such as medicine, forestry, architecture, and petroleum. Because a single field can itself be divided into different categories or research directions, the present invention clusters these data, grouping documents with high similarity into one category as far as possible, and then derives the corresponding rules for that category, that is, the model, which contains the characteristics of the class. With these characteristics, users can search for the documents they care about from a large number of documents, th...