Neural network training method and device and computer equipment

A neural network training method, applied in neural network training devices and computer equipment, that addresses the inability to use the knowledge of all teacher networks and the resulting loss of distillation learning efficiency in the student network, thereby improving training efficiency.

Active Publication Date: 2021-01-29

BEIJING SENSETIME TECH DEV CO LTD

2 Cites, 6 Cited by

Problems solved by technology

Averaging the loss values or the output values of the teacher networks leads to conflicts. The student network's learning is often dominated by a few teacher networks, so it cannot use the knowledge provided by all teacher networks, which reduces the efficiency of the student network's distillation learning.

Examples

Embodiment 1

[0041] Figure 1 is a flow chart of the neural network training method provided by the embodiment of the present disclosure. The method includes steps S102-S108, wherein:

[0042] S102: Obtain a sample image; input the sample image into the first neural network and a plurality of second neural networks for classification, and obtain the output data of the first preset network layer of the first neural network and the output data of the second preset network layer of each of the second neural networks, wherein the second neural networks are used for distillation training of the first neural network.

[0043] In the embodiment of the present disclosure, the first neural network may be understood as a student network to be trained, and the second neural network may be understood as a teacher network for training the student network to be trained. Compared with the second neural network, the first neural network has a simpler structure and fewer network...
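As a rough illustration of the student/teacher setup described in S102, the sketch below computes one distillation loss per teacher network from the outputs of the student and the teachers. The text shown here does not specify the loss, so several things are assumptions: the layer outputs are treated as classification logits, the loss is a temperature-softened KL divergence (a common choice in knowledge distillation), and all function names and sample values are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_losses(student_logits, teacher_logits_list, temperature=4.0):
    """Return one distillation loss per teacher (second neural network).

    Assumption: KL divergence between softened teacher and student outputs.
    """
    q = softmax(student_logits, temperature)
    return [kl_divergence(softmax(t, temperature), q) for t in teacher_logits_list]

# Hypothetical outputs for one sample image and three classes.
student = [2.0, 0.5, -1.0]
teachers = [[2.0, 0.5, -1.0], [0.1, 3.0, -0.5]]
losses = distillation_losses(student, teachers)
print(losses)
```

A teacher whose output matches the student's contributes a loss of zero, while a disagreeing teacher contributes a positive loss; these per-teacher values are what the later weighting step operates on.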

Embodiment 2

[0073] On the basis of the technical solution described in Embodiment 1 above, in the embodiment of the present disclosure, as shown in Figure 2, the method also includes the following steps:

[0074] Step S201: Determine a second target loss value based on the category label information of the sample image and the category prediction result of the sample image by the first neural network.

[0075] Based on this, the above-mentioned step of iteratively adjusting the network parameters of the first neural network according to the first target loss value may further include the following step S202: iteratively adjust the network parameters of the first neural network according to a joint loss value determined from the first target loss value and the second target loss value. The joint loss value may be the sum or the average of the first target loss value and the second target loss value. It should be noted that, in the embodiment of the present disclosure, the e...
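Steps S201 and S202 can be sketched as follows. The text only says that the second target loss is based on the category label and the student's category prediction; using cross-entropy for it is an assumption, and the function names and the `mode` parameter are hypothetical. The sum and average options follow the paragraph above.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of one sample from raw logits and an integer class label.

    Assumption: the second target loss is a standard classification loss.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def joint_loss(first_target_loss, second_target_loss, mode="sum"):
    """Combine the two target losses as their sum or their average."""
    total = first_target_loss + second_target_loss
    if mode == "sum":
        return total
    if mode == "average":
        return total / 2.0
    raise ValueError("mode must be 'sum' or 'average'")

# Hypothetical values: a weighted distillation loss and student logits.
first = 0.8
second = cross_entropy([2.0, 0.5, -1.0], label=0)
print(joint_loss(first, second, mode="sum"))
print(joint_loss(first, second, mode="average"))
```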

Embodiment 3

[0092] In the embodiment of the present disclosure, on the basis of Embodiment 1 and Embodiment 2 above, and as shown in Figure 3, determining the weight values respectively corresponding to the second neural networks based on the preset objective function includes the following steps:

[0093] Step S301: Apply L2-norm minimization to the objective function to obtain the optimal solution of the preset weight variable, which is used to weight the gradient of the first loss value in the objective function, and determine the optimal solution as the weight value of the second neural network.

[0094] Specifically, in the embodiment of the present disclosure, the objective function may be solved and calculated by the L2 norm minimization solution method, and the obtained optimal solution of the preset weight variable is the weight value of the second neural network.

[0095] The formula of the above objective function can be expressed as: Among them, θ (τ) is th...
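The objective function itself is truncated in this text, so the following is only a sketch under stated assumptions, not the patent's formula. It assumes the objective is the squared L2 norm of the weighted sum of the per-teacher loss gradients, with weights that are non-negative and sum to one, and solves it with a Frank-Wolfe iteration using an exact line search. The function names and the example gradients are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def min_norm_weights(grads, iterations=100, tol=1e-10):
    """Approximately minimise ||sum_m alpha_m * g_m||^2 over the simplex.

    Assumption: simplex-constrained min-norm objective, solved by Frank-Wolfe.
    grads is a list of gradient vectors, one per teacher network.
    """
    m = len(grads)
    dim = len(grads[0])
    alpha = [1.0 / m] * m  # start from uniform weights
    for _ in range(iterations):
        # Current weighted combination of the gradients.
        v = [sum(alpha[i] * grads[i][k] for i in range(m)) for k in range(dim)]
        # Vertex of the simplex that most decreases the objective.
        scores = [dot(g, v) for g in grads]
        t = min(range(m), key=scores.__getitem__)
        diff = [vk - gk for vk, gk in zip(v, grads[t])]
        denom = dot(diff, diff)
        if denom < tol:
            break
        # Exact line search between v and grads[t], clipped to [0, 1].
        gamma = max(0.0, min(1.0, dot(v, diff) / denom))
        if gamma < tol:
            break
        alpha = [(1.0 - gamma) * a for a in alpha]
        alpha[t] += gamma
    return alpha

# Hypothetical gradients of two teachers' distillation losses.
weights = min_norm_weights([[2.0, 0.0], [0.0, 1.0]])
print(weights)
```

Under this assumed objective, a teacher whose gradient is large or conflicts with the others receives a smaller weight, which is one way to keep a few teachers from dominating the update.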

Abstract

The invention provides a neural network training method, device and computer equipment. The method comprises: inputting an obtained sample image into a first neural network and a plurality of second neural networks for classification, and obtaining output data of a first preset network layer of the first neural network and output data of a second preset network layer of each second neural network; determining a weight value corresponding to each second neural network based on a preset objective function; weighting the distillation loss value corresponding to each second neural network by its weight value to obtain a first target loss value; and iteratively adjusting network parameters of the first neural network according to the first target loss value. In the embodiment of the invention, the weight value of each second neural network is determined from the output data of the second neural networks and the first neural network, and training the first neural network with the first target loss value computed from these weight values can improve its training efficiency.
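The weighting and parameter-update steps in the abstract can be illustrated with a minimal sketch. The weighted sum follows the abstract directly; the plain gradient-descent update, the learning rate, and all names and values are assumptions made for illustration only.

```python
def first_target_loss(distillation_losses, weights):
    """Weight each teacher's distillation loss and sum the results."""
    if len(distillation_losses) != len(weights):
        raise ValueError("need exactly one weight per teacher network")
    return sum(w * l for w, l in zip(weights, distillation_losses))

def sgd_step(params, teacher_grads, weights, lr=0.1):
    """One parameter update of the student (first neural network).

    Assumption: plain gradient descent on the weighted distillation loss,
    whose gradient is the weighted sum of the per-teacher gradients.
    """
    combined = [
        sum(w * g[k] for w, g in zip(weights, teacher_grads))
        for k in range(len(params))
    ]
    return [p - lr * c for p, c in zip(params, combined)]

# Hypothetical per-teacher losses, weights, and gradients.
loss = first_target_loss([1.0, 3.0], [0.25, 0.75])
new_params = sgd_step([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
print(loss, new_params)
```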

Description

Technical field

[0001] The present disclosure relates to the technical field of artificial intelligence, and in particular to a neural network training method, device and computer equipment.

Background

[0002] Currently, knowledge distillation is widely used in model compression and transfer learning. It enables a smaller student network to imitate the behavior of a large teacher network and achieve good results. Especially in image classification tasks, training a large teacher network requires a large amount of computing resources, and after training is completed, the large teacher network also needs a processor that supports high-speed computing to process image data. Because a small student network occupies fewer computing resources, places lower requirements on the hardware environment, and incurs shorter hardware latency, it can be applied to real-time image or video stream processing.

[0003] Among the existing knowledge di...

Application Information

IPC(8): G06N3/08, G06N3/04

CPC: G06N3/084, G06N3/045

Inventors: 游山, 杜尚宸, 王飞, 钱晨

Owner: BEIJING SENSETIME TECH DEV CO LTD
