Neural network training method and device and computer equipment

A neural network training technology, applied in the field of neural network training methods, devices, and computer equipment. It addresses problems such as the inability to fully utilize knowledge from multiple teacher networks and reduced distillation learning efficiency of the student network, and achieves the effect of improving training efficiency.

Active Publication Date: 2021-01-29
BEIJING SENSETIME TECH DEV CO LTD

AI Technical Summary

Problems solved by technology

Calculating an average loss value, or averaging the output values of the teacher networks, can lead to conflicts between teachers; the student network's learning is then often dominated…



Examples


Embodiment 1

[0041] Referring to Figure 1, which is a flow chart of the neural network training method provided by an embodiment of the present disclosure, the method includes steps S102 to S108, wherein:

[0042] S102: Obtain a sample image; input the sample image into a first neural network and a plurality of second neural networks for classification, and obtain the output data of a first preset network layer of the first neural network and the output data of a second preset network layer of each second neural network, wherein the second neural networks are used for distillation training of the first neural network.

[0043] In the embodiments of the present disclosure, the first neural network may be understood as the student network to be trained, and the second neural networks may be understood as teacher networks used to train that student network. Compared with a second neural network, the first neural network has a simpler structure and fewer network...
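The teacher-to-student distillation setup described above can be illustrated with a minimal, framework-free sketch. This is plain Python with hypothetical toy logits, not the patent's actual networks or loss definition; it only shows the generic pattern of computing one softened-output distillation loss per teacher network:

```python
import math

def softmax(logits, temperature=1.0):
    # Softened probabilities: a higher temperature flattens the distribution,
    # which is the usual trick in knowledge distillation.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between the softened teacher and student distributions.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy outputs: one student, two teachers (values are illustrative only).
student = [2.0, 0.5, -1.0]
teachers = [[2.5, 0.2, -0.8], [1.8, 0.9, -1.2]]

# One distillation loss per teacher network, as in the multi-teacher setting.
losses = [distillation_loss(student, t) for t in teachers]
```

In the patent's scheme these per-teacher losses are not simply averaged; they are weighted by per-teacher weight values (see Embodiment 3).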

Embodiment 2

[0073] On the basis of the technical solution described in Embodiment 1 above, in an embodiment of the present disclosure, as shown in Figure 2, the method further includes the following steps:

[0074] Step S201: Determine a second target loss value based on the category label information of the sample image and the first neural network's category prediction result for the sample image.

[0075] On this basis, the above step of iteratively adjusting the network parameters of the first neural network according to the first target loss value may further include the following step S202: iteratively adjusting the network parameters of the first neural network according to a joint loss value determined from the first target loss value and the second target loss value. The joint loss value may be the sum or the average of the first target loss value and the second target loss value. It should be noted that, in the embodiment of the present disclosure, the e...

Embodiment 3

[0092] In an embodiment of the present disclosure, on the basis of Embodiments 1 and 2 above, as shown in Figure 3, determining the weight values respectively corresponding to the second neural networks based on the preset objective function includes the following steps:

[0093] Step S301: Minimize the L2 norm of the objective function to obtain the optimal solution of the preset weight variable used to weight the gradients of the first loss values in the objective function, and determine that optimal solution as the weight values of the second neural networks.

[0094] Specifically, in the embodiment of the present disclosure, the objective function may be solved by an L2-norm minimization method, and the resulting optimal solution of the preset weight variable gives the weight values of the second neural networks.

[0095] The formula of the above objective function can be expressed as: where θ(τ) is th...
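The exact objective function is not reproduced in the source (the formula is an image). As an assumed illustration only, L2-norm minimization over a weighted combination of per-teacher loss gradients is commonly solved as a min-norm problem; for two teachers the minimizer has a closed form. The function below is a generic sketch of that construction, not the patent's formula:

```python
def min_norm_weights(g1, g2):
    # Closed-form min-norm solution for two gradient vectors: find w in [0, 1]
    # minimizing || w*g1 + (1-w)*g2 ||_2. This is a standard construction for
    # weighting two gradients; the patent's exact objective may differ.
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = sum(a * a for a in g1)
    n2 = sum(b * b for b in g2)
    denom = n1 - 2.0 * dot + n2  # ||g1 - g2||^2
    if denom == 0:
        return 0.5, 0.5  # gradients coincide; any convex split is optimal
    w = (n2 - dot) / denom
    w = min(1.0, max(0.0, w))  # project onto the simplex constraint
    return w, 1.0 - w

# Orthogonal unit gradients: the min-norm point weights them equally.
w1, w2 = min_norm_weights([1.0, 0.0], [0.0, 1.0])
```

Intuitively, a teacher whose loss gradient is large or conflicting receives a smaller weight, which matches the stated goal of avoiding one teacher dominating the student's distillation training.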



Abstract

The invention provides a neural network training method, device, and computer equipment. The method comprises: inputting an obtained sample image into a first neural network and a plurality of second neural networks for classification, and obtaining output data of a first preset network layer of the first neural network and output data of a second preset network layer of each second neural network; determining a weight value corresponding to each second neural network based on a preset target function; weighting the distillation loss value corresponding to each second neural network by its weight value to obtain a first target loss value; and iteratively adjusting network parameters of the first neural network according to the first target loss value. Because the weight value of each second neural network is determined from the output data of the second neural networks and the first neural network, training the first neural network according to the first target loss value calculated with these weights improves its training efficiency.

Description

Technical field

[0001] The present disclosure relates to the technical field of artificial intelligence, and in particular to a neural network training method, device, and computer equipment.

Background technique

[0002] Currently, knowledge distillation is widely used in model compression and transfer learning: it enables a smaller student network to imitate the behavior of a large teacher network and achieve good results, especially in image classification tasks. Training a large teacher network requires substantial computing resources, and after training, the teacher's processing of image data must also run on a processor that supports high-speed computing. A small student network, by contrast, occupies fewer computing resources, places lower demands on the hardware environment, and incurs shorter hardware latency, so it can be applied to real-time image or video stream processing.

[0003] Among the existing knowledge di...

Claims


Application Information

IPC(8): G06N3/08, G06N3/04
CPC: G06N3/084, G06N3/045
Inventor: 游山, 杜尚宸, 王飞, 钱晨
Owner BEIJING SENSETIME TECH DEV CO LTD