Convolutional neural network initialization method based on pre-training model filter extraction

This convolutional neural network initialization technique addresses three problems: a large-scale pre-trained network structure cannot be adapted to the target task, its memory overhead and computational cost are too high for the target task, and the network structure cannot be designed flexibly. It allows the target network to meet the memory overhead and calculation speed requirements of practical applications.

Inactive Publication Date: 2018-06-01
NORTHWESTERN POLYTECHNICAL UNIV
0 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

However, initializing a network from a pre-trained model has the following limitations.
First, the network structure of the pre-training model must be consistent with that of the target task, including the number of filters in each convolutional layer, the filter size, and the stride, so the network structure cannot be designed flexibly for the target task.
Second, because pre-training models are usually large, their memory overhead and computational cost are high for the target task, so such large-scale network models cannot be adapted to the target task in practical applications.
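The first limitation can be illustrated with a minimal sketch. The layer shapes below are hypothetical, chosen only to contrast a large pre-trained layer with a small target layer, and `can_copy_directly` is an illustrative helper rather than part of the patent.

```python
import numpy as np

# Hypothetical layer shapes in (out_channels, in_channels, height, width) order.
pretrained_conv1 = np.zeros((96, 3, 11, 11), dtype=np.float32)  # large pre-trained layer
target_conv1 = np.zeros((32, 3, 5, 5), dtype=np.float32)        # small target-task layer


def can_copy_directly(src: np.ndarray, dst: np.ndarray) -> bool:
    """Plain weight copying only works when both tensors have identical shapes."""
    return src.shape == dst.shape


# Conventional initialization from a pre-trained model needs this check to pass.
print(can_copy_directly(pretrained_conv1, target_conv1))
```

When the check fails, conventional weight copying is not available, which is the gap that filter extraction is meant to close.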

Examples

Embodiment Construction

[0027] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0028] This embodiment takes the CIFAR10, CIFAR100, SVHN and STL10 classification data sets as target tasks and selects GoogleNet, CaffeNet and VGG16 trained on ImageNet as pre-training models. Filter parameters are extracted using the minimum entropy loss and minimum reconstruction error methods and are used to initialize the target task network model. The result is compared with random initialization from a Gaussian distribution in terms of the classification accuracy of the target task network model (TestingError) and the convergence speed of network training (normalizedAUC). The working process of the method is shown in Figure 1.
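The excerpt names normalizedAUC as the convergence measure but does not define it. The sketch below assumes one plausible definition: the area under the test-error curve, divided by the area of its bounding box (iteration span times maximum error), so that a lower value means faster convergence. The function name `normalized_auc` and the sample curves are illustrative only.

```python
import numpy as np


def normalized_auc(errors, iterations=None) -> float:
    """Area under an error curve divided by its bounding box (assumed definition)."""
    errors = np.asarray(errors, dtype=float)
    if iterations is None:
        iterations = np.arange(len(errors), dtype=float)
    else:
        iterations = np.asarray(iterations, dtype=float)
    if errors.size < 2 or errors.size != iterations.size:
        raise ValueError("need at least two points and matching lengths")

    # Trapezoid rule written out explicitly.
    widths = np.diff(iterations)
    area = np.sum(widths * (errors[1:] + errors[:-1]) / 2.0)

    box = (iterations[-1] - iterations[0]) * errors.max()
    if box <= 0:
        raise ValueError("curve has zero extent; the ratio is undefined")
    return float(area / box)


# Made-up curves: both end at the same error, but one drops sooner.
fast_curve = [0.9, 0.3, 0.2, 0.2]
slow_curve = [0.9, 0.7, 0.5, 0.2]
print(normalized_auc(fast_curve), normalized_auc(slow_curve))
```

Under this assumed definition, two runs that reach the same final error can still be ranked by how quickly they got there.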

[0029] As shown in Figure 1, the present invention comprises the following steps:

[0030] Step 1: Design the CNN network structure for the target task;

[0031] The present invention desi...
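Because Step 1 leaves the target structure free, its memory overhead can be estimated before any filters are extracted. The following is a minimal sketch: the `ConvLayer` class, the three-layer layout, and the 4-byte parameter size are assumptions for illustration, not the structure described in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    in_channels: int
    out_channels: int
    kernel_size: int
    stride: int = 1

    def num_params(self) -> int:
        # Weights of every filter plus one bias per filter.
        return self.out_channels * (self.in_channels * self.kernel_size ** 2 + 1)


# Hypothetical small network for 32x32 RGB inputs.
target_net = [
    ConvLayer(3, 32, kernel_size=5),
    ConvLayer(32, 64, kernel_size=3),
    ConvLayer(64, 128, kernel_size=3),
]


def conv_memory_bytes(layers, bytes_per_param: int = 4) -> int:
    """Memory taken by convolutional parameters, assuming 32-bit floats by default."""
    return sum(layer.num_params() for layer in layers) * bytes_per_param


print(conv_memory_bytes(target_net))
```

Changing filter counts or kernel sizes in `target_net` shows directly how the design choice affects the parameter budget.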

Abstract

The invention provides a convolutional neural network (CNN) initialization method based on pre-training model filter extraction and relates to the technical field of video processing. The method extracts filter parameters from a pre-training model using minimum entropy loss and minimum reconstruction error, and uses them to initialize the target task network model, enabling the initialization of small and medium-sized networks suited to practical applications. Because the filter parameters are extracted by the minimum entropy loss and minimum linear reconstruction methods, the target task network structure does not need to be consistent with the pre-training network structure. The network structure of the target task can therefore be designed flexibly according to the application, so as to meet its memory overhead and calculation speed requirements.
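The abstract does not give the reconstruction objective. One plausible reading of "minimum linear reconstruction" is to choose a small subset of pre-trained filters whose linear combinations best reproduce all of the pre-trained filters. The sketch below uses a greedy search with least squares as a stand-in; it is not the patent's algorithm, it assumes source and target filters share the same spatial size, and it leaves out the entropy criterion because the excerpt does not define it. All function names are hypothetical.

```python
import numpy as np


def reconstruction_error(filters: np.ndarray, selected: list) -> float:
    """Squared error of rebuilding every filter from the selected ones by least squares."""
    basis = filters[selected].T                                  # (d, k)
    coeffs, *_ = np.linalg.lstsq(basis, filters.T, rcond=None)   # (k, n)
    residual = filters.T - basis @ coeffs
    return float(np.sum(residual ** 2))


def select_filters(weights: np.ndarray, k: int) -> list:
    """Greedily pick k filter indices that minimize the linear reconstruction error."""
    n = weights.shape[0]
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of filters")
    flat = weights.reshape(n, -1)   # one row per flattened filter
    selected = []
    for _ in range(k):
        candidates = [i for i in range(n) if i not in selected]
        best = min(candidates, key=lambda i: reconstruction_error(flat, selected + [i]))
        selected.append(best)
    return selected


# Random stand-in for a pre-trained layer: 16 filters of shape 3x3x3.
rng = np.random.default_rng(0)
pretrained = rng.normal(size=(16, 3, 3, 3))
chosen = select_filters(pretrained, k=4)
target_init = pretrained[chosen]    # initial weights for a 4-filter target layer
print(chosen, target_init.shape)
```

The greedy search is quadratic in the number of filters and is meant only to make the idea concrete; the selected filters would then serve as the starting point for training on the target task.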

Description

Technical field

[0001] The invention relates to the technical field of video processing, in particular to a method for initializing a convolutional neural network.

Background technique

[0002] A convolutional neural network (CNN) approximates an objective function by learning a multi-layer nonlinear network built from simple structures, and can thereby learn feature representations of the sample data from the original data sample set. Thanks to massive amounts of data, deep convolutional neural networks have become one of the most important breakthroughs in artificial intelligence and machine learning in recent years, and they have achieved great success in image analysis, speech recognition, and natural language processing.

[0003] Because practical application problems often provide only a small amount of data, a CNN model trained on that data tends to overfit, generalizes poorly, and performs poorly on the target task. An effe...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04
CPC: G06N3/045
Inventors: 周巍 (Zhou Wei), 张冠文 (Zhang Guanwen)
Owner: NORTHWESTERN POLYTECHNICAL UNIV