Method and device for constructing a deep neural network model, medium, and electronic equipment
A method for constructing a deep neural network model, applied in the computer field, addresses problems such as labeled data being difficult to acquire, poor model training results, and limited generalization of neural network model features, so as to achieve high-quality results.
Examples
Embodiment 1
[0047] Figure 1 is a flow chart of the method for constructing a deep neural network model provided in Embodiment 1 of the present application. This embodiment is applicable to situations such as model training. The method can be executed by the device for constructing a deep neural network model provided in the embodiments of the present application. The device can be implemented in software and/or hardware and can be integrated into electronic devices with computing capability for model training, such as smart terminals and servers.
[0048] As shown in Figure 1, the method for constructing the deep neural network model comprises:
[0049] S110. Input the labeled sample data into the first neural network model to obtain the feature representation of the labeled sample data, wherein the parameters of the network structure of the first neural network model are trained on the available unlabeled sample data.
[0050] Among them, the labele...
Embodiment 2
[0064] To enable those skilled in the art to understand the technical solution disclosed in this application more clearly, this application also provides a preferred implementation.
[0065] The prior art has the following shortcomings:
[0066] 1. Training a neural network model requires a large amount of labeled data, which is relatively scarce and difficult to obtain, so the resulting models perform poorly;
[0067] 2. Different natural language tasks each need their own trained neural network model, which takes a long time;
[0068] 3. The deep learning model has a huge number of parameters, especially in the embedding matrix, which takes up a lot of memory and consumes substantial resources.
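The memory cost named in shortcoming 3 can be made concrete with a rough calculation. The sketch below is illustrative only: the function name `embedding_bytes`, the vocabulary size of 30,000, the embedding width of 768, and 4-byte floating-point storage are all assumptions and do not come from the application.

```python
def embedding_bytes(vocab_size, embedding_dim, bytes_per_value=4):
    """Memory needed to store a dense embedding matrix.

    All arguments are hypothetical example values, not figures from the patent.
    """
    return vocab_size * embedding_dim * bytes_per_value


# Assumed example: 30,000 tokens, 768 dimensions, 32-bit floats.
size = embedding_bytes(30_000, 768)
print(f"{size} bytes = {size / 2**20:.1f} MiB")  # 92160000 bytes = 87.9 MiB
```

Under these assumed values the embedding matrix alone holds about 23 million values, which illustrates why it tends to dominate the memory footprint of a model.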
[0069] The present invention gathers unlabeled data related to all real-world natural language processing tasks to train a pre-trained model, so that all fine-tuning processes are very short, apart from the long time-consuming ...
Embodiment 3
[0076] Figure 3 is a schematic structural diagram of a device for constructing a deep neural network model provided in Embodiment 3 of the present application. As shown in Figure 3, the device for constructing the deep neural network model includes:
[0077] A feature representation acquisition module 310, configured to input the labeled sample data into the first neural network model to obtain the feature representation of the labeled sample data, wherein the parameters of the network structure of the first neural network model are trained on unlabeled sample data;
[0078] A parameter training module 320, configured to input the feature representation and label data of the labeled sample data into the second neural network model, so as to train the parameters of the second neural network model.
[0079] In the technical solution provided by the embodiment of the present application, the labeled sample data is input int...
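The two modules described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the networks of the application: the class names `FirstModel` and `SecondModel`, the toy sentences, and the learning rate are hypothetical. A vocabulary with IDF weights fitted on unlabeled text stands in for the first neural network model, and a logistic-regression classifier stands in for the second neural network model.

```python
import math
from collections import Counter


class FirstModel:
    """Stand-in encoder: its parameters (vocabulary and IDF weights)
    are fitted on unlabeled text only. Hypothetical, not the patent's network."""

    def fit_unlabeled(self, texts):
        doc_freq = Counter()
        for text in texts:
            doc_freq.update(set(text.lower().split()))
        self.vocab = {word: i for i, word in enumerate(sorted(doc_freq))}
        n_docs = len(texts)
        self.idf = [0.0] * len(self.vocab)
        for word, i in self.vocab.items():
            self.idf[i] = math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0
        return self

    def encode(self, text):
        """Return the feature representation of one sample."""
        counts = Counter(w for w in text.lower().split() if w in self.vocab)
        vec = [0.0] * len(self.vocab)
        for word, count in counts.items():
            i = self.vocab[word]
            vec[i] = count * self.idf[i]
        return vec


class SecondModel:
    """Stand-in classifier trained on feature representations plus labels."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, features, labels, epochs=200):
        # Plain stochastic gradient descent on the logistic loss.
        for _ in range(epochs):
            for x, y in zip(features, labels):
                err = self.predict_proba(x) - y
                self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= self.lr * err
        return self


# Toy data, invented for illustration.
unlabeled = ["the movie was great", "the food was terrible", "a great day",
             "terrible weather today", "the movie was long"]
labeled = [("great movie", 1), ("terrible movie", 0),
           ("great food", 1), ("terrible day", 0)]

# Module 310: the first model is fitted on unlabeled data, then used to
# obtain the feature representation of the labeled samples.
first = FirstModel().fit_unlabeled(unlabeled)
features = [first.encode(text) for text, _ in labeled]
labels = [label for _, label in labeled]

# Module 320: the second model is trained on the features and the labels.
second = SecondModel(n_features=len(first.vocab)).fit(features, labels)

print(second.predict_proba(first.encode("great weather")) > 0.5)
```

Only `SecondModel` ever sees labels in this sketch; `FirstModel` is left unchanged after it has been fitted on the unlabeled texts, mirroring the division of work between modules 310 and 320.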