Meta-knowledge fine-tuning method and platform for multi-task language model

A multi-task language model technology, applied in inference methods, semantic analysis, and special data processing applications. It addresses the problem that the effect of compression models is limited by task-specific data sets, and improves parameter initialization ability, generalization ability, fine-tuning effect, and compression efficiency.

Active Publication Date: 2020-12-18
ZHEJIANG LAB


Problems solved by technology

[0002] Automatic compression of large-scale pre-trained language models has achieved significant results in natural language understanding and generation tasks. However, for downstream tasks in the smart-city domain, re-fine-tuning the large model on a task-specific data set remains the key step in improving model compression. Existing downstream-task-oriented fine-tuning methods fine-tune on the specific data set of the downstream task, so the effect of the resulting compression model is limited by that task's specific data set.

Method used



Detailed Description of Embodiments

[0029] As shown in Figure 1, the present invention is a meta-knowledge fine-tuning method and platform oriented to multi-task language models. On the multi-domain data sets of the pre-trained language model's downstream task, meta-knowledge obtained through cross-domain typicality score learning is used to fine-tune the downstream task scenarios, making it easier for the meta-learner to fine-tune to any domain. The learned knowledge is highly generalizable and transferable rather than limited to a specific domain, so the resulting compression model is suitable for data scenarios in different domains of the same task.
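The cross-domain typicality-score idea described above can be sketched in a few lines. This is an illustrative sketch only, not the patent's actual implementation: the typicality score is assumed here to be the cosine similarity between an example's embedding and its class prototype (pooled across domains), and it is used to weight a per-example cross-entropy loss during fine-tuning. All function names and the score definition are assumptions.

```python
import numpy as np

def typicality_scores(embeddings, labels, prototypes):
    """Assumed typicality score: cosine similarity between each
    example's embedding and the prototype of its class. Examples
    closer to the cross-domain class prototype score nearer to 1."""
    scores = np.empty(len(labels))
    for i, (emb, cls) in enumerate(zip(embeddings, labels)):
        proto = prototypes[cls]
        scores[i] = emb @ proto / (
            np.linalg.norm(emb) * np.linalg.norm(proto) + 1e-12)
    return scores

def typicality_weighted_loss(logits, labels, weights):
    """Per-example cross-entropy scaled by typicality weights, so
    typical (highly transferable) examples dominate fine-tuning."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]
    return float((weights * nll).mean())
```

In a full training loop, these weights would scale each example's gradient contribution, steering the meta-learner toward knowledge shared across domains rather than domain-specific patterns.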

[0030] The meta-knowledge fine-tuning method for multi-task language models of the present invention specifically comprises the following steps:

[0031] Step 1: Calculate the class prototypes of cross-domain data sets for similar tasks: Considering that multi-domain class prototypes can summarize the key semantic features of the ...
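Step 1 can be sketched as follows, assuming a class prototype is the mean of the sentence embeddings belonging to one (domain, class) pair; the function name and data layout are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def class_prototypes(embeddings, labels, domains):
    """Mean embedding per (domain, class) pair.

    embeddings : (N, d) array of sentence embeddings
    labels     : length-N array of class ids
    domains    : length-N array of domain ids
    Returns a dict mapping (domain, class) -> (d,) prototype vector.
    """
    protos = {}
    for dom in np.unique(domains):
        for cls in np.unique(labels):
            mask = (domains == dom) & (labels == cls)
            if mask.any():
                protos[(dom, cls)] = embeddings[mask].mean(axis=0)
    return protos
```

Averaging such per-domain prototypes over domains would yield the cross-domain class prototype against which an example's typicality can be measured.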



Abstract

The invention discloses a meta-knowledge fine-tuning method and platform for a multi-task language model. The method obtains highly transferable common knowledge, i.e., meta-knowledge, on different data sets of the same kind of task through cross-domain typicality score learning. The learning processes of similar tasks on different domains, corresponding to different data sets, are mutually associated and mutually reinforced. This improves the fine-tuning effect of similar downstream tasks on different domain data sets in language model applications, and improves the parameter initialization ability and generalization ability of a universal language model for similar tasks. Because fine-tuning is carried out on a cross-domain data set of the downstream task, the effect of the compression model obtained through fine-tuning is not limited by a specific data set of that task: on the basis of a pre-trained language model, the downstream task is fine-tuned through a meta-knowledge fine-tuning network, yielding a language model for similar downstream tasks that is independent of any particular data set.

Description

Technical Field

[0001] The invention belongs to the field of language model compression, and in particular relates to a meta-knowledge fine-tuning method and platform for multi-task language models.

Background Technique

[0002] Automatic compression of large-scale pre-trained language models has achieved significant results in natural language understanding and generation tasks. However, for downstream tasks in the smart-city domain, re-fine-tuning the large model on a task-specific data set remains the key step in improving model compression. Existing downstream-task-oriented fine-tuning methods fine-tune on the specific data set of the downstream task, so the effect of the resulting compression model is limited by that task's specific data set.

Contents of the Invention

[0003] The purpose of the present invention is to provide a meta-knowledge fine-tuning method...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F16/35; G06F40/30; G06N5/04
CPC: G06F16/355; G06F40/30; G06N5/04
Inventors: 王宏升, 王恩平, 单海军
Owner: ZHEJIANG LAB