Method for enhancing high-dimensional category feature expression capability
A feature expression technique for category variables, applied in character and pattern recognition, instruments, computing, etc. It addresses the problems of high hardware resource requirements, long training time, and weak feature expression ability, so that training time does not increase and model performance improves.
Examples
Embodiment 1
[0041] The data source for this example is Porto Seguro's Safe Driver Prediction competition on the Kaggle platform.
[0042] The specific link is as follows: https://www.kaggle.com/c/porto-seguro-safe-driver-prediction. Because the data set is large and each single-category variable has many attribute values, only the names of the single-category variables are given below. This embodiment therefore does not reproduce the specific data (it can be found at the link); it can be provided separately if necessary.
[0043] The single-category variables used are: "ps_ind_02_cat", "ps_ind_04_cat", "ps_ind_05_cat", "ps_car_01_cat", "ps_car_02_cat", "ps_car_03_cat", "ps_car_04_cat", "ps_car_05_cat", "ps_car_06_cat", "ps_car_07_cat", "ps_car_08_cat", "ps_car_09_cat", "ps_car_10_cat" and "ps_car_11_cat". These variable names are public, and their meanings are known in the art.
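As a rough illustration of what "many attribute values per category variable" means in practice, the following sketch counts the distinct levels of each listed column. The helper name `count_levels` and the toy rows are hypothetical; the values are made up for illustration and are not taken from the competition data.

```python
def count_levels(rows, columns):
    """Return {column: number of distinct values} for the given columns."""
    levels = {col: set() for col in columns}
    for row in rows:
        for col in columns:
            levels[col].add(row[col])
    return {col: len(values) for col, values in levels.items()}


# Toy rows with made-up values (assumption: not real competition data).
rows = [
    {"ps_ind_02_cat": 1, "ps_car_11_cat": 104},
    {"ps_ind_02_cat": 2, "ps_car_11_cat": 12},
    {"ps_ind_02_cat": 1, "ps_car_11_cat": 87},
]

level_counts = count_levels(rows, ["ps_ind_02_cat", "ps_car_11_cat"])
print(level_counts)
```

A column with a large level count is where one-hot encoding becomes expensive, which is the situation the target conversion is meant to address.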
[0044] When using the target conversion formula in the present in...
Embodiment 2
[0048] To further illustrate that the attribute target feature variable produced by the present invention can improve model performance, a second example is given as follows:
[0049] The data source is customer loan data from Lending Club (a US peer-to-peer lending company). The purpose is to predict whether an applicant is "good" or "bad". The link is as follows:
[0050] https://raw.githubusercontent.com/h2oai/app-consumer-loan/master/data/loan.csv. Because the data set is large and the category variable has many attribute values, only the name of the category variable is given below. This embodiment therefore does not reproduce the specific data (it can be found at the link); it can be provided separately if necessary.
[0051] The target conversion formula in the present invention is mainly used to process the category variable "addr_state", so as to observe the performance of the gbdt model before and after processing, and the evaluation crite...
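The target conversion formula itself is not reproduced in this excerpt. As a stand-in, the sketch below uses generic smoothed mean target encoding to show the general shape of replacing a category variable such as "addr_state" with a single numeric column derived from the target. The function names, the `smoothing` parameter, the `bad_loan` label name, and the toy data are all assumptions for illustration; this is not the formula of the invention.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Map each category level to a smoothed mean of a binary target.

    Generic smoothed mean encoding, used only as a stand-in; it is NOT
    the patent's target conversion formula, which is not given here.
    """
    if len(categories) != len(targets) or not targets:
        raise ValueError("categories and targets must be non-empty and equal length")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, target in zip(categories, targets):
        sums[level] += target
        counts[level] += 1
    prior = sum(targets) / len(targets)  # global target rate
    encoding = {
        level: (sums[level] + smoothing * prior) / (counts[level] + smoothing)
        for level in counts
    }
    return encoding, prior


def apply_target_encoding(categories, encoding, prior):
    """Replace each level with its encoded value; unseen levels get the prior."""
    return [encoding.get(level, prior) for level in categories]


# Toy data (made up): state codes and a hypothetical binary label.
states = ["CA", "CA", "NY", "TX", "NY", "CA"]
bad_loan = [1, 0, 0, 1, 0, 0]

encoding, prior = fit_target_encoding(states, bad_loan, smoothing=2.0)
encoded = apply_target_encoding(states + ["WA"], encoding, prior)
```

The encoded column has one numeric value per row regardless of how many levels the category variable has, so it can be passed to a GBDT model in place of the raw category. In a real comparison the encoding would be fitted on training rows only, to avoid leaking the target into the evaluation set.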