Model training method and system, nonvolatile storage medium and computer terminal

A model training technology applied in the field of machine learning. It addresses the problem of high consumption of GPU computing resources and achieves the effect of improving convergence speed and training efficiency.

Pending Publication Date: 2022-05-10
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0006] The embodiments of the present application provide a model training method and system, a non-volatile storage medium, and a computer terminal, to at least solve the technical problem that, when the number of samples in a data set is large, the classification model must support an equally large number of classification results throughout the training process, resulting in heavy consumption of GPU computing resources.



Examples


Embodiment 1

[0032] The purpose of unsupervised or self-supervised learning is to learn models or features with strong representational power from unlabeled data. This usually requires defining a pretext task to guide the training of the model. Pretext tasks include, but are not limited to, transformation prediction, image completion, spatial or temporal sequence prediction, clustering, and data generation. Among these, methods based on contrastive learning adopt a dual-branch network whose inputs are usually two data augmentations of an image. Their purpose is to bring the two augmentations of the same image closer together in feature space, while keeping the features of augmentations of different images farther apart. Contrastive learning methods need many negative samples, and therefore a large batch size or a storage queue holding historical feature vectors; even so, the diversity of negative samples remains relatively scarce.
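To make the dual-branch contrastive setup concrete, the following is a minimal sketch of an InfoNCE-style contrastive loss in PyTorch; the function name and temperature value are illustrative assumptions, not the loss defined in this application.

```python
# Minimal sketch of a dual-branch contrastive (InfoNCE-style) loss.
# Illustrative only; not taken from the application text.
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
    """z1, z2: (N, D) features of two augmentations of the same N images."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    # (N, N) similarity matrix between the two branches.
    logits = z1 @ z2.t() / temperature
    # Row i's positive is column i; all other columns act as negatives,
    # so negative diversity is capped by the batch size N -- the scarcity
    # problem noted in paragraph [0032].
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```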

[0033] However, the method of...

Embodiment 2

[0059] An embodiment of the application also provides a model training method applied to a GPU cluster, where the GPU cluster includes a plurality of GPUs, each of which is provided with a first feature extraction network and a first fully connected layer. The network structures of the first feature extraction networks on the plurality of GPUs are identical, and each first fully connected layer is obtained by partitioning the second fully connected layer of the target neural network model. It should be noted that alternative implementations of the GPU cluster in this embodiment may adopt, but are not limited to, the implementation of the GPU cluster described in Embodiment 1. As shown in Figure 3, the method includes:

[0060] S302: A first GPU of the plurality of GPUs initializes the first feature extraction network of the first GPU, and extracts a first sample feature from a target training data set by using the initialized first feature e...
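To illustrate the partitioned fully connected layer this embodiment describes, here is a minimal PyTorch sketch in which each GPU holds one slice of the full (second) fully connected layer, computes local logits for its slice of classes, and exchanges only small per-row statistics to obtain a globally correct cross-entropy as its prediction error. The class and method names, and the all-reduce-based exchange, are our assumptions for illustration, not the claimed protocol.

```python
# Sketch: each GPU keeps a slice ("first fully connected layer") of the
# full classifier ("second fully connected layer") and computes a partial
# prediction error. Assumes torch.distributed is initialized and that
# num_classes divides evenly across GPUs; names are illustrative.
import torch
import torch.distributed as dist
import torch.nn as nn

class ClassifierShard(nn.Module):
    def __init__(self, feature_dim: int, num_classes: int):
        super().__init__()
        world = dist.get_world_size()
        self.local_classes = num_classes // world      # classes owned by this GPU
        self.offset = dist.get_rank() * self.local_classes
        self.fc = nn.Linear(feature_dim, self.local_classes, bias=False)

    def loss(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        logits = self.fc(features)                     # (N, local_classes)
        # Softmax over the *global* class dimension, assembled from local
        # pieces: global row max (for numerical stability, detached), then
        # global sum of exponentials.
        row_max = logits.max(dim=1, keepdim=True).values.detach()
        dist.all_reduce(row_max, op=dist.ReduceOp.MAX)
        denom = (logits - row_max).exp().sum(dim=1, keepdim=True)
        dist.all_reduce(denom)                         # sum over all shards
        # Target logit: nonzero only on the GPU whose slice owns the label.
        owned = (labels >= self.offset) & (labels < self.offset + self.local_classes)
        idx = (labels - self.offset).clamp(0, self.local_classes - 1)
        target = torch.where(owned.unsqueeze(1),
                             logits.gather(1, idx.unsqueeze(1)),
                             torch.zeros_like(row_max))
        dist.all_reduce(target)                        # each row owned exactly once
        # Cross-entropy, identical on every GPU after the reductions; each
        # GPU backpropagates it to obtain gradients for its own slice.
        return (denom.log() + row_max - target).mean()
```

With this layout, each GPU stores only 1/world of the classifier weights, which is what allows the number of classes to grow with the data-set size without exhausting a single GPU's memory.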

Embodiment 3

[0073] The embodiment of the application provides a method embodiment of a model training method. It should be noted that the steps shown in the flowchart of the figure may be executed in a computer system, such as one running a set of computer-executable instructions, and although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in an order different from the one described here.

[0074] The method embodiment provided by this application may be executed on a mobile terminal, a computer terminal, or a similar computing device. Figure 4 shows a block diagram of the hardware structure of a computer terminal (or mobile device) used to implement the model training method. As shown in Figure 4, the computer terminal 40 (or mobile device 40) may include one or more processors 402 (shown as 402a, 402b, ..., 402n in the figure; the processors 402 may include, but are not limited to, processing devices such as a microprocessor MCU ...



Abstract

The invention discloses a model training method and system, a non-volatile storage medium, and a computer terminal. In the method, each GPU of a plurality of GPUs is provided with a first feature extraction network and a first fully connected layer, and the network structures of the first feature extraction networks on the plurality of GPUs are the same. A first GPU of the plurality of GPUs initializes its first feature extraction network and uses the initialized network to extract a first sample feature from the target training data set; the first sample feature is input into the first fully connected layer of the first GPU for processing; a prediction error of the first GPU is determined based on the processing result; a target prediction error of the target neural network model is determined based on the prediction error of the first GPU and the prediction errors received from the other GPUs; and the parameters of the target neural network model are updated based on the target prediction error.
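Read as pseudocode, one training iteration of the scheme in the abstract might look like the following sketch. The all-reduce aggregation of per-GPU errors and gradients, and the helper names (model.loss, model.feature_net), are our assumptions, not the claimed protocol.

```python
# Sketch of one training step: combine this GPU's prediction error with
# the errors from the other GPUs, then update the model parameters.
import torch
import torch.distributed as dist

def training_step(model, optimizer, images, labels):
    optimizer.zero_grad()
    local_error = model.loss(images, labels)        # prediction error of this GPU
    local_error.backward()
    # The feature extraction network is replicated on every GPU, so its
    # gradients are averaged to keep the replicas identical; each GPU's
    # fully connected slice keeps its own local gradients.
    for p in model.feature_net.parameters():
        dist.all_reduce(p.grad)
        p.grad /= dist.get_world_size()
    optimizer.step()
    # Target prediction error of the whole model, aggregated for reporting.
    with torch.no_grad():
        target_error = local_error.detach().clone()
        dist.all_reduce(target_error)
    return target_error / dist.get_world_size()
```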

Description

Technical field
[0001] The present application relates to the field of machine learning, and in particular to a model training method and system, a non-volatile storage medium, and a computer terminal.
Technical background
[0002] The purpose of unsupervised or self-supervised learning is to learn models or features with strong representational power from unlabeled data. This usually requires defining a pretext task to guide the training of the model. Pretext tasks include, but are not limited to, transformation prediction, image completion, spatial or temporal sequence prediction, clustering, and data generation.
[0003] Among these, in the field of unsupervised learning, the instance classification method treats each data sample in the data set as its own class. It can adopt the same training network as supervised classification and can make full use of all negative examples in the data set, making it a promising scheme.
[0004] However, in order to realize the instance classification m...
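To see why the instance classification of [0003] strains GPU resources, consider a back-of-the-envelope sketch; the sizes below are illustrative assumptions, not figures from the application.

```python
# Illustrative only: in instance classification, every sample is its own
# class, so the classifier's output width equals the data-set size.
import torch.nn as nn

num_samples = 1_000_000          # data-set size = number of classes
feature_dim = 128

# The training label of sample i is simply its index i.
classifier = nn.Linear(feature_dim, num_samples, bias=False)
# Weight matrix: 128 * 1e6 floats ~= 512 MB in fp32, before gradients and
# optimizer state; it grows linearly with the data set, which is why this
# application splits the layer across the GPUs of a cluster.
```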


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06T 1/20; G06V 10/74; G06V 10/774; G06V 10/764; G06V 10/82; G06K 9/62; G06N 3/04; G06N 3/08
CPC: G06T 1/20; G06N 3/08; G06N 3/045; G06F 18/22; G06F 18/24; G06F 18/214
Inventor 刘宇黄梁华潘攀王彬徐盈辉
Owner ALIBABA GRP HLDG LTD