Data processing method, device, storage medium and program product
By transforming high-cardinality feature data into descriptive text and integrating it with unstructured data, the method avoids the curse of dimensionality that such features cause under one-hot encoding. This enables more efficient data processing and model training and improves the accuracy and robustness of task processing.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications (China)
- Current Assignee / Owner
- CHINA UNIONPAY
- Filing Date
- 2026-01-23
- Publication Date
- 2026-06-12
AI Technical Summary
In data processing, one-hot encoding of high-cardinality feature data causes the curse of dimensionality in the feature space: it increases computational complexity and storage overhead, dilutes the influence of other features, and reduces accuracy when handling similar data.
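The dimensionality problem described above can be made concrete with a small sketch. The feature name (merchant IDs), the cardinality of 50,000, and the count of 20 other feature dimensions are hypothetical values chosen for illustration; none of them come from the patent.

```python
def one_hot(value, vocabulary):
    """Return a one-hot vector for `value` over an ordered vocabulary."""
    vec = [0] * len(vocabulary)
    vec[vocabulary.index(value)] = 1
    return vec

# Hypothetical high-cardinality feature: 50,000 distinct merchant IDs.
merchant_ids = [f"M{i:05d}" for i in range(50_000)]

# Hypothetical number of dimensions used by all other features combined.
other_feature_dims = 20

# One record's merchant ID becomes a 50,000-dimensional vector with a single 1.
encoded = one_hot("M00042", merchant_ids)

# The one-hot block dominates the feature space, so the remaining
# features account for only a tiny share of the input dimensions.
total_dims = len(encoded) + other_feature_dims
share_of_other = other_feature_dims / total_dims

print(f"one-hot dims: {len(encoded)}, total dims: {total_dims}")
print(f"share held by other features: {share_of_other:.4%}")
```

Each additional distinct value adds one more dimension, so storage and compute grow with cardinality while every individual vector stays almost entirely zero.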
High-cardinality feature data is transformed into descriptive text and integrated with the unstructured data to form enhanced unstructured data. A natural language processing model then performs feature extraction on the enhanced data to generate feature vectors, which avoids high-dimensional one-hot encoding.
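A minimal sketch of this flow follows, under stated assumptions. The function names (`describe`, `enhance`, `embed`), the sentence template, and the sample field names are all hypothetical. The `embed` function is a hashed bag-of-words stand-in used only so the example runs with the standard library; it is not the natural language processing model the patent refers to, which would more plausibly be a pretrained text encoder.

```python
import hashlib
import math

def describe(features):
    """Render high-cardinality feature values as one descriptive sentence."""
    parts = [f"the {name} is {value}" for name, value in features.items()]
    return "In this record, " + ", and ".join(parts) + "."

def enhance(unstructured_text, features):
    """Append the descriptive text to the unstructured data."""
    return unstructured_text.strip() + " " + describe(features)

def embed(text, dim=64):
    """Stand-in for an NLP model (assumption): hashed bag-of-words.

    Each token is hashed into one of `dim` buckets, and the resulting
    count vector is L2-normalized. The output size is fixed at `dim`
    regardless of how many distinct feature values exist.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# Hypothetical record: two high-cardinality IDs plus a free-text note.
record_features = {"merchant ID": "M00042", "terminal ID": "T918273"}
note = "Customer reported a declined card payment at checkout."

enhanced = enhance(note, record_features)
vector = embed(enhanced)

print(enhanced)
print(len(vector))
```

The design point is that the vector length is set by the text encoder rather than by the number of distinct feature values, so adding new merchants or terminals does not widen the feature space.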
This approach significantly reduces the dimensionality of feature vectors, eases the computational and storage burden on models, and speeds up data processing and model training. It also enhances the semantic coherence and expressiveness of features and improves the accuracy and reliability of task processing results.