Model migration training method, device and equipment, and storage medium

A training method and model technology, applied in the fields of computational models, character and pattern recognition, instruments, etc., that addresses the tendency of target models to overfit and generalize poorly, thereby avoiding overfitting and improving generalization ability.

Pending Publication Date: 2020-07-24
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, the generalization ability of the target model trained by the above method is poor, and it is prone to overfitting.

Examples


Embodiment 1

[0053] Figure 1 is a flowchart of a model migration training method in the first embodiment of the present application. This embodiment applies to the case where a source model in the source domain is transferred to a target model in the target domain and the target model is then trained. The method is executed by a model migration training device, which is implemented in software and/or hardware and is configured in an electronic device.

[0054] As shown in Figure 1, a model migration training method includes:

[0055] S101. Use network parameters of at least two migration layers in the source model as initial parameters of associated migration layers in the target model.

[0056] Here, the source model can be understood as a stable network model that has been successfully trained on a large number of source training samples in the source domain. The target model can be understood as a model to be trained in the target domain ...
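Step S101 can be illustrated with a minimal sketch. It is not the patent's implementation: parameters are represented as plain Python lists, the "associated" layers are assumed to share the same name in both models, and the function name `init_migration_layers` and the layer names are hypothetical.

```python
import copy


def init_migration_layers(source_params, target_params, migration_layers):
    """Return a copy of target_params whose migration layers start from the source values.

    Assumption: a migration layer in the source model and its associated layer
    in the target model are stored under the same key.
    """
    if len(migration_layers) < 2:
        raise ValueError("the method uses at least two migration layers")
    initialized = copy.deepcopy(target_params)
    for name in migration_layers:
        # Copy so that later training of the target never mutates the source.
        initialized[name] = copy.deepcopy(source_params[name])
    return initialized


# Hypothetical toy models: two feature layers are migrated, the head is not.
source = {"conv1": [0.5, -0.2], "conv2": [0.1, 0.3], "fc": [0.9]}
target = {"conv1": [0.0, 0.0], "conv2": [0.0, 0.0], "fc": [0.0, 0.0, 0.0]}
initial = init_migration_layers(source, target, ["conv1", "conv2"])
```

The returned values double as the initial parameters against which the distance in the objective function is later measured, so they are kept separate from the parameters that training updates.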

Embodiment 2

[0079] Figure 2 is a flowchart of a model migration training method in the second embodiment of the present application. This embodiment is optimized and improved on the basis of the technical solutions of the foregoing embodiments.

[0080] Further, the operation "construct an objective function based on the distance between the training parameters associated with the at least two migration layers and the initial parameters" is refined into "construct an objective function according to the weights of the at least two migration layers and the distance between the training parameters associated with the at least two migration layers and the initial parameters", so as to improve the objective function construction mechanism.

[0081] As shown in Figure 2, a model migration training method includes:

[0082] S201: Use network parameters of at least two migration layers in the source model as initial parameters of associated migration layers in the target model...
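The weighted objective described in this embodiment might be sketched as follows. The text only says "distance", so the squared Euclidean distance used here is an assumption, as is adding the penalty directly to a task loss. The names `squared_distance` and `objective` and the toy values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally sized parameter lists."""
    if len(a) != len(b):
        raise ValueError("parameter lists must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(task_loss, training_params, initial_params, layer_weights):
    """Task loss plus a per-layer weighted distance to the initial parameters.

    Assumption: the distance is squared L2 and the penalty is simply added
    to the task loss; the source does not spell out either choice.
    """
    penalty = sum(
        weight * squared_distance(training_params[name], initial_params[name])
        for name, weight in layer_weights.items()
    )
    return task_loss + penalty


# Hypothetical values: only conv1 has drifted from its initial parameters.
initial_params = {"conv1": [0.5, -0.2], "conv2": [0.1, 0.3]}
training_params = {"conv1": [0.5, 0.0], "conv2": [0.1, 0.3]}
weights = {"conv1": 0.5, "conv2": 0.5}
value = objective(1.0, training_params, initial_params, weights)
```

A larger weight on a layer makes drifting away from the source parameters more expensive for that layer, which is how the objective can trade off inheriting source information against adapting to the target domain.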

Embodiment 3

[0096] Figure 3A is a flowchart of a model migration training method in the third embodiment of the present application. This embodiment provides a preferred implementation on the basis of the technical solutions of the foregoing embodiments.

[0097] As shown in Figure 3A, a model migration training method includes:

[0098] S301. Use the network parameters of each migration layer in the source model as the initial parameters of the corresponding migration layer in the target model. Here, each migration layer is an image feature extraction layer.

[0099] S302. Divide the migration layers into multiple network blocks.

[0100] S303. Based on the weight function, determine the weight of each migration layer according to the sequence number of the network block to which that migration layer belongs.

[0101] Specifically, the weight of the migration layer is determined according to the following formula:

[0102] $W_i = \operatorname{softmax}(N - i)$;

[0103] where $W_i$ is...
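A small sketch of the weight formula W_i = softmax(N - i) follows. The definitions of N and i are truncated in the text above, so this sketch assumes that N is the total number of network blocks, that i is the 1-based sequence number of a block, and that the softmax is taken over all blocks. The function names and the layer-to-block mapping are hypothetical.

```python
import math


def block_weights(num_blocks):
    """Softmax over the scores N - i for block sequence numbers i = 1..N.

    Assumption: N is the number of network blocks and the softmax
    normalizes across all blocks.
    """
    scores = [num_blocks - i for i in range(1, num_blocks + 1)]
    max_score = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def layer_weights(layer_to_block, num_blocks):
    """Give every migration layer the weight of the block it belongs to."""
    weights = block_weights(num_blocks)
    return {layer: weights[block - 1] for layer, block in layer_to_block.items()}


# Hypothetical split of four migration layers into three network blocks.
w = block_weights(3)
per_layer = layer_weights({"conv1": 1, "conv2": 1, "conv3": 2, "conv4": 3}, 3)
```

Under this reading, the score N - i decreases with the block sequence number, so layers in earlier blocks receive larger weights than layers in later blocks, and layers in the same block share one weight.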


Abstract

The invention discloses a model migration training method, device, equipment, and storage medium, and relates to the field of artificial intelligence. The specific implementation scheme is as follows: network parameters of at least two migration layers in a source model are used as initial parameters of the associated migration layers in a target model; an objective function is constructed according to the distance between the training parameters associated with the at least two migration layers and the initial parameters; and the target model containing the initial parameters is trained based on the objective function. Because the constructed objective function introduces the distance between the training parameters and the initial parameters of the migration layers, the migration and training state of each migration layer is taken into account during model training. This allows the target model to inherit information from the source model while adapting to the target domain, avoids overfitting during model migration training, and improves the generalization ability of the target model.

Description

Technical field

[0001] This application relates to computer technology, in particular to the field of artificial intelligence, and specifically to a model migration training method, device, equipment, and storage medium.

Background art

[0002] Transfer learning can use the similarity between data, tasks, or models to apply a source model trained in the source domain (that is, the old domain) to a target model in the target domain (that is, the new domain), thereby reducing the need for massive data resources during target model training and addressing the high cost of training tasks.

[0003] In the prior art, when the target model is trained, the network parameters of the source model are used to initialize the network parameters of the target model instead of random initialization, and the initialized target model is then retrained.

[0004] However, the generalization ability of the target model obtained by the above-mentioned training method is poor,...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N20/00
CPC: G06N20/00, G06F18/214
Inventor: 卢阳
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD