An integrated classification method for massive multi-word short texts
A classification method for short texts, applied in the field of text representation and representation learning, which can address problems such as the curse of dimensionality.
Embodiment Construction
[0043] In this embodiment, an integrated classification method for massive multi-word short texts, as shown in Figure 1, includes the following steps:
[0044] Step 1. Obtain the multi-word short text collection, as shown in Table 1, and segment it with the jieba_fast word segmentation method in multi-process precise mode. jieba_fast is an improved version of the jieba word segmenter that significantly increases segmentation speed on large data volumes. The multi-process segmentation improves CPU and memory utilization, and adding a custom thesaurus increases segmentation precision. The final word segmentation result is $X=\{x_1,x_2,\ldots,x_i,\ldots,x_{M+N}\}$, where $x_i$ denotes the i-th short text after word segmentation. Each $x_i$ is made up of words, with k indexing the k-th word in $x_i$. The word segmentation result X is a marked word segmenta...
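The segmentation in Step 1 might be sketched as follows. This is a minimal illustration and not the patent's implementation: it assumes that jieba_fast exposes the same `lcut` and `add_word` calls as jieba, and the names `segment_corpus`, `_init_worker`, and `_cut` are hypothetical. If neither library is installed, a whitespace split stands in for the segmenter so the sketch still runs.

```python
from multiprocessing import Pool

# Assumption: jieba_fast mirrors jieba's API (lcut, add_word).
try:
    import jieba_fast as _seg
except ImportError:
    try:
        import jieba as _seg
    except ImportError:
        _seg = None  # stand-in tokenizer below keeps the sketch runnable


def _init_worker(user_words):
    """Register custom-thesaurus entries inside each worker process."""
    if _seg is not None:
        for word in user_words:
            _seg.add_word(word)


def _cut(text):
    """Segment one short text in precise mode (cut_all=False)."""
    if _seg is None:
        return text.split()  # placeholder only, not real word segmentation
    return _seg.lcut(text, cut_all=False)


def segment_corpus(texts, processes=1, user_words=()):
    """Return X = [x_1, ..., x_{M+N}], one list of words per short text."""
    user_words = tuple(user_words)
    if processes <= 1:
        # Single-process path: same logic, no worker pool.
        _init_worker(user_words)
        return [_cut(t) for t in texts]
    # Multi-process path: each worker loads the custom words once,
    # then texts are distributed across workers in chunks.
    with Pool(processes, initializer=_init_worker,
              initargs=(user_words,)) as pool:
        return pool.map(_cut, texts, chunksize=1024)
```

A call such as `segment_corpus(texts, processes=4, user_words=custom_words)` would typically be made from a script's `if __name__ == "__main__":` guard, which multiprocessing requires on platforms that spawn worker processes. The custom words are registered through a pool initializer because each worker holds its own copy of the segmenter's dictionary.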