Sample selection method, device, and electronic equipment
A sample screening method and related technology in the computer field. It addresses the low accuracy of screened samples, avoiding inconsistent descriptions and improving screening accuracy.
Examples
Embodiment 1
[0024] As shown in Figure 1, the sample screening method disclosed in the present application includes steps 100 to 120.
[0025] Step 100, clustering all samples based on sample features.
[0026] The samples used in the embodiments of this application are the historical behavior logs of users in the current system or platform, such as users' click or purchase behavior logs on an O2O platform, or users' click or browsing logs in a search system. The specific method for obtaining user behavior logs, that is, the samples used for training the model, is existing technology and is not repeated here.
[0027] Before model training, the training samples are first screened manually and assigned sample labels. The purpose is to screen out samples that obviously do not meet the requirements of the model and to mark the remaining samples as positive or negative. Samples that carry a positive or negative label serve as candidate samples.
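The screening and labeling step can be illustrated with a minimal sketch. It assumes a hypothetical `Sample` record whose `label` field is 1 for a positive sample, 0 for a negative sample, and `None` for a sample rejected during manual screening. These names and the encoding are illustrative assumptions, not part of the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """Hypothetical record for one behavior-log entry (illustrative only)."""
    features: List[float]
    # Assumed encoding: 1 = positive, 0 = negative, None = screened out.
    label: Optional[int] = None


def candidate_samples(samples: List[Sample]) -> List[Sample]:
    """Keep only samples that carry a positive or negative label."""
    return [s for s in samples if s.label in (0, 1)]


logs = [
    Sample([0.1, 0.9], label=1),
    Sample([0.4, 0.2], label=0),
    Sample([0.7, 0.7]),  # rejected during manual screening, so no label
]
candidates = candidate_samples(logs)
```

Only the two labeled entries remain in `candidates`; the unlabeled entry is dropped before any clustering takes place.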
[0028] W...
Embodiment 2
[0036] As shown in Figure 2, the sample screening method disclosed in this embodiment includes steps 200 to 230.
[0037] Step 200, clustering all samples based on sample features.
[0038] The samples used in the embodiments of this application are the historical behavior logs of users in the current system or platform, such as the user's click or purchase behavior logs on the O2O platform, and the user's click or browsing logs in the search system. The specific method of obtaining user behavior logs as training samples, manually screening the training samples and setting positive and negative sample labels to obtain candidate samples can be found in Embodiment 1, and will not be repeated here.
[0039] When implementing this application, it is assumed that the feature dimensions of the sample include: time, geographic location, user age, user behavior type, and product category. After labeling the samples with positive and negative sample labels, the f...
Embodiment 3
[0055] As shown in Figure 4, the sample screening device disclosed in this embodiment includes:
[0056] A sample clustering module 400, configured to cluster all samples based on sample features;
[0057] A confusion degree metric determination module 410, configured to determine the sample confusion degree metric of the cluster where the candidate sample is located according to the clustering result of the sample clustering module 400;
[0058] A sample ratio determination module 420, configured to determine the sample selection ratio of the corresponding cluster according to the sample confusion degree metric determined by the confusion degree metric determination module 410.
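The data flow between modules 410 and 420 can be sketched as follows. The text shown here does not define the confusion degree metric or how it maps to a selection ratio, so this sketch makes two loud assumptions: the metric is the Shannon entropy of the positive/negative labels inside each cluster, and the ratio falls linearly from 1.0 for a pure cluster to a hypothetical `floor` for a fully mixed one. All function and parameter names are illustrative.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def label_entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of the labels in one cluster; 0.0 means pure."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def cluster_confusion(cluster_ids: Sequence[int],
                      labels: Sequence[int]) -> Dict[int, float]:
    """Assumed stand-in for module 410: one confusion value per cluster."""
    by_cluster: Dict[int, List[int]] = defaultdict(list)
    for cid, lab in zip(cluster_ids, labels):
        by_cluster[cid].append(lab)
    return {cid: label_entropy(labs) for cid, labs in by_cluster.items()}


def selection_ratio(confusion: float, floor: float = 0.1) -> float:
    """Assumed stand-in for module 420: more mixed clusters get a smaller ratio.

    With binary labels the entropy lies in [0, 1], so the result lies in [floor, 1].
    """
    return max(floor, 1.0 - confusion)


# Cluster 0 holds only positive samples; cluster 1 is an even mix.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 1, 1, 1, 1, 0, 1, 0]
confusion = cluster_confusion(cluster_ids, labels)
ratios = {cid: selection_ratio(c) for cid, c in confusion.items()}
```

Under these assumptions the pure cluster keeps all of its samples and the evenly mixed cluster is cut down to the floor. A different metric or mapping would slot into the same two functions without changing the surrounding flow.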
[0059] During specific implementation, samples can be clustered using local centroid clustering methods such as k-means and hierarchical clustering.
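As an illustration of the k-means option named above, here is a minimal sketch of Lloyd's algorithm in plain Python. The function name, the seeded random initialization, and the fixed iteration count are assumptions made for the example; the application does not specify them, and hierarchical clustering is not shown.

```python
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int, iterations: int = 20,
           seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd's algorithm; returns (cluster id per point, centroids)."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct input points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


# Two well-separated groups of toy feature vectors.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
assignment, centroids = kmeans(points, k=2)
```

The returned `assignment` list gives one cluster id per sample, which is the form of clustering result a per-cluster metric would consume. Real behavior logs would first need their features encoded numerically and scaled, which this sketch leaves out.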
[0060] Optionally, the sample ratio determination module 420 is specifically configured to: determine the sample selection rati...