
Large category deep learning GPU parallel acceleration method

A deep learning technology for large-category data, applied to neural learning methods, biological neural network models, processor architecture/configuration, and similar areas. It addresses the problems of excessive communication time and high communication cost, with the effects of reducing GPU occupancy and improving model learning efficiency.

Status: Inactive · Publication Date: 2018-06-01
CHONGQING INST OF GREEN & INTELLIGENT TECH CHINESE ACADEMY OF SCI

AI Technical Summary

Problems solved by technology

However, for data with a very large number of categories, the communication cost of parameter interaction at the last fully-connected layer of the deep neural network structure becomes too high. A new technique is therefore needed that can greatly improve model learning efficiency and reduce the GPU occupancy rate while preserving the original deep learning accuracy.
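To make the scale of that communication problem concrete, here is a small back-of-envelope sketch. The layer sizes below are illustrative assumptions, not figures from the patent: with a purely data-parallel final fully-connected layer, every GPU must synchronise gradients for the entire class weight matrix on every training step.

```python
# Illustrative only: feature dimension and class count are assumptions,
# not values taken from the patent.
feat_dim = 512           # features entering the final fully-connected layer
num_classes = 1_000_000  # "large category" output size
bytes_per_param = 4      # float32

fc_params = feat_dim * num_classes
grad_bytes = fc_params * bytes_per_param  # gradient payload each GPU syncs per step

print(f"{fc_params / 1e6:.0f} M parameters in the last layer")
print(f"{grad_bytes / 2**30:.2f} GiB of gradients synchronised per GPU per step")
# ~512 M parameters and ~1.91 GiB per step: this all-reduce traffic dominates
# training time, which is the bottleneck the hybrid scheme below targets.
```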



Examples


Embodiment Construction

[0019] The embodiments of the present invention are described below through specific examples, and those skilled in the art can easily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways from different perspectives and for different applications without departing from the spirit of the present invention. It should be noted that, where no conflict arises, the following embodiments and the features in the embodiments can be combined with each other.

[0020] It should be noted that the diagrams provided in the following embodiments only schematically illustrate the basic ideas of the present invention; the diagrams show only the components related to the present invention rather than being drawn according to the number, shape and size of the compo...



Abstract

The large-category deep learning GPU parallel acceleration method provided by the present invention comprises the steps of adopting model parallelism to train the model parameters of the softmax layer in a deep neural network structure, using each GPU to train its own model shard, and having the softmax layers of the GPUs exchange the data features of the model parameters, thereby completing the deep learning. The method adopts a hybrid architecture: all layers before the softmax layer still use data parallelism, while the softmax layer uses model parallelism. This breaks through the bottleneck of large-category deep learning parallel computation and overcomes the high communication cost and excessive communication time incurred by exchanging the parameters of the last fully-connected layer of the deep neural network. The method can substantially improve model learning efficiency while keeping the original deep learning accuracy, and it reduces the GPU occupancy rate.
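As a rough illustration of the hybrid architecture described above, the sketch below simulates it with NumPy. It is an assumed reconstruction for illustration, not the patent's reference implementation; the layer sizes, the toy "lower layers", and the helper names are invented for the example. Layers before the softmax are replicated on every device (data parallel), the softmax weight matrix is split by class across devices (model parallel), and the devices exchange only per-sample scalars instead of class-sized logit or parameter tensors.

```python
import numpy as np

rng = np.random.default_rng(0)

num_gpus    = 4        # simulated devices
num_classes = 10_000   # "large category" output (kept small enough to run anywhere)
feat_dim    = 512      # feature size produced by the shared lower layers
batch       = 8        # samples per device

# Data-parallel part: every device holds an identical copy of the lower
# layers; a single shared projection stands in for them here.
lower_W = rng.standard_normal((feat_dim, feat_dim)) * 0.01

# Model-parallel part: shard the softmax weight matrix by class.
classes_per_gpu = num_classes // num_gpus
softmax_shards = [rng.standard_normal((feat_dim, classes_per_gpu)) * 0.01
                  for _ in range(num_gpus)]

def forward_on_gpu(gpu_id, x):
    """Run the replicated lower layers, then score only this device's classes."""
    feats = np.tanh(x @ lower_W)          # data-parallel lower stack
    return feats @ softmax_shards[gpu_id] # local slice of the logits

# One illustrative step: all devices score the same feature batch so the
# class-sharded softmax can be assembled (gathering features across devices
# is omitted to keep the sketch short).
x = rng.standard_normal((batch, feat_dim))
local_logits = [forward_on_gpu(g, x) for g in range(num_gpus)]

# Communication: only per-sample scalars are exchanged, never the full
# num_classes-wide logit vectors or the sharded parameters themselves.
local_max  = np.stack([l.max(axis=1) for l in local_logits])   # (num_gpus, batch)
global_max = local_max.max(axis=0)                              # all-reduce(max)
local_sum  = np.stack([np.exp(l - global_max[:, None]).sum(axis=1)
                       for l in local_logits])                  # (num_gpus, batch)
global_sum = local_sum.sum(axis=0)                              # all-reduce(sum)

# Each device normalises its own shard of the softmax output locally.
probs_shard_0 = np.exp(local_logits[0] - global_max[:, None]) / global_sum[:, None]
print(probs_shard_0.shape)  # (8, 2500): this device's share of the class probabilities
```

Under these assumptions, the only cross-device traffic in the softmax step is two batch-sized vectors per device, independent of the number of classes, which is the saving the abstract attributes to switching that one layer from data parallelism to model parallelism.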

Description

Technical field

[0001] The invention relates to the field of computers and their applications, and in particular to a GPU parallel acceleration method for large-category deep learning.

Background technique

[0002] At present, deep learning has achieved breakthroughs in several major fields: speech recognition, image recognition, and natural language processing. It can be said that, so far, deep learning is the intelligent learning method closest to the human brain. However, deep learning models have many parameters, require a large amount of computation, and are trained on ever larger data sets, all of which consume substantial computing resources. If training can be accelerated, work efficiency will improve significantly, and for large-scale training data and models, otherwise intractable tasks become feasible.

[0003] With the continuing advancement of the GPU's massively parallel architecture, general-purpose computing-oriented GPU (...


Application Information

IPC(8): G06N 3/08; G06T 1/20
CPC: G06N 3/08; G06T 1/20
Inventors: 石宇 (Shi Yu), 徐卉 (Xu Hui), 程诚 (Cheng Cheng), 周祥东 (Zhou Xiangdong)
Owner CHONGQING INST OF GREEN & INTELLIGENT TECH CHINESE ACADEMY OF SCI