Method and device for determining image sample set for model training and electronic equipment
An image sample set and model training technology, applied in fields such as character and pattern recognition, payment systems, and instruments. It addresses problems such as limited sample data, degraded training results, and low classification accuracy of classification model predictions, with the effect of ensuring accuracy.
Examples
Embodiment 1
[0061] In large-scale model training, data samples are often lacking. To address this, a classification method (data + taxonomy = Dataonomy) is proposed that extracts the intrinsic correlation between large public data sets (such as ImageNet and MIT Places) and the limited task sample data, thereby creating a metadata set containing a large number of related data samples. The process mainly includes: using AHP (Analytic Hierarchy Process) to mine the similarity between data classes; and using BIP (Binary Integer Programming) to extract highly similar data classes from the public data sets. This is a fully computational approach to quantifying the relationships between data sets, from which structure can be extracted. "Structure" refers to a set of relationships specifying which data sets provide useful information to another data set, and how much information they provide.
[0062] Referring to Figure 1, a schematic diagram of the steps of a method for determining an image sample set for model trainin...
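The AHP step mentioned above can be illustrated with a minimal sketch. It assumes the standard AHP formulation, in which a positive pairwise comparison matrix is reduced to a normalized priority vector (its principal eigenvector), here computed by power iteration. The function names, the way the pairwise matrix is built from raw scores, and the example values are hypothetical and are not taken from the specification.

```python
def pairwise_from_scores(scores):
    """Build a pairwise comparison matrix where entry (i, j) = scores[i] / scores[j].

    Assumption: every raw similarity score is strictly positive.
    """
    return [[si / sj for sj in scores] for si in scores]


def ahp_priority_vector(pairwise, iterations=100, tol=1e-12):
    """Approximate the principal eigenvector of a positive matrix by power iteration.

    The result is normalized so that its entries sum to 1.
    """
    n = len(pairwise)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        nxt = [v / total for v in nxt]
        if max(abs(a - b) for a, b in zip(nxt, weights)) < tol:
            return nxt
        weights = nxt
    return weights


# Hypothetical raw similarities of three source classes to one target class.
raw_similarity = [0.8, 0.4, 0.2]
pairwise = pairwise_from_scores(raw_similarity)
priorities = ahp_priority_vector(pairwise)
```

Because this example matrix is perfectly consistent, the priorities are simply the raw scores divided by their sum. With real, noisy pairwise comparisons the eigenvector smooths out inconsistencies between individual comparisons.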
Embodiment 2
[0175] Referring to Figure 10, the device 1000 for determining an image sample set for model training provided by an embodiment of this specification may include:
[0176] a model selection module 1002, which selects a pre-trained model;
[0177] a matrix determination module 1004, which determines a correlation matrix between a source data set and a target data set based on the pre-trained model, wherein the number of images in the source data set is much larger than the number of images in the target data set, and the target data set contains the image samples required for model training;
[0178] a normalization module 1006, which normalizes the correlation matrix using AHP;
[0179] a sample amplification module 1008, which, based on the normalized correlation matrix and using binary integer programming, selects from the source data set the image samples that satisfy the similarity conditions determined by the correlation matrix, as the image sample set.
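The data flow through modules 1004 to 1008 might be sketched as follows. This is an illustration under stated assumptions, not the implementation claimed in the specification. Feature vectors are assumed to have already been produced by a pre-trained model, cosine similarity stands in for the correlation measure, a simple column normalization stands in for the AHP step, and the binary integer program (maximize total similarity subject to a class budget) is solved by brute-force enumeration, which is only feasible for tiny inputs. All function names, the objective, the budget constraint, and the data are hypothetical.

```python
import itertools
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def correlation_matrix(source_feats, target_feats):
    """Rows index source classes, columns index target classes."""
    return [[cosine(s, t) for t in target_feats] for s in source_feats]


def normalize_columns(matrix):
    """Scale each column to sum to 1 (a stand-in for the AHP normalization).

    Assumption: every column sum is strictly positive.
    """
    cols = len(matrix[0])
    col_sums = [sum(row[j] for row in matrix) for j in range(cols)]
    return [[row[j] / col_sums[j] for j in range(cols)] for row in matrix]


def select_source_classes(norm_matrix, budget):
    """Solve a tiny binary integer program by enumeration.

    maximize   sum_i x_i * score_i
    subject to sum_i x_i <= budget,  x_i in {0, 1}
    where score_i is the row sum of the normalized matrix.
    """
    scores = [sum(row) for row in norm_matrix]
    best_value, best_x = float("-inf"), None
    for x in itertools.product((0, 1), repeat=len(scores)):
        if sum(x) > budget:
            continue
        value = sum(xi * s for xi, s in zip(x, scores))
        if value > best_value:
            best_value, best_x = value, x
    return [i for i, xi in enumerate(best_x) if xi]


# Hypothetical mean feature vectors: three source classes, one target class.
source_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
target_features = [[1.0, 0.1]]

corr = correlation_matrix(source_features, target_features)
normalized = normalize_columns(corr)
selected = select_source_classes(normalized, budget=2)
```

Here `selected` holds the indices of the source classes whose images would be added to the training sample set. For realistic problem sizes, a dedicated integer programming solver would replace the enumeration.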
[0180] Optionally...
Embodiment 3
[0191] Figure 11 is a schematic structural diagram of an electronic device according to an embodiment of this specification. Referring to Figure 10, at the hardware level, the electronic device includes a processor, and optionally also an internal bus, a network interface, and a memory. The memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk memory. The electronic device may also include hardware required by other services.
[0192] The processor, the network interface, and the memory can be connected to each other through the internal bus, which can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, or an EISA (Extended Industry Standard Architecture...