Language task model training method and device, electronic equipment and storage medium

A training technology for task models and language models, applied in the field of artificial intelligence, that addresses problems such as insufficient task-specific interfaces and insufficient learning
CN111159416A · Active · Publication Date: 2020-05-15 · TENCENT TECH (SHENZHEN) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
TENCENT TECH (SHENZHEN) CO LTD
Publication Date
2020-05-15

Abstract

The invention provides a language task model training method and device, electronic equipment and a storage medium. The method comprises the steps of: performing hierarchical pre-training in a language model based on corpus samples of the corresponding language tasks in a pre-training sample set; performing forward propagation in the language task model on corpus samples of the corresponding language tasks in a training sample set; fixing the parameters of the language model and performing back propagation in the language task model to update the parameters of the task model; and performing forward propagation and back propagation in the language task model on corpus samples of the corresponding language tasks in the training sample set, so as to update the parameters of both the language model and the task model. The method and device can prevent catastrophic forgetting in the language model while ensuring that the language model and the task model each achieve a training effect that matches its corresponding learning rate.
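
The two fine-tuning stages described above (task-model updates with the language model fixed, followed by joint updates) can be illustrated with a minimal toy sketch. It is not the patented implementation: the scalar "models", the data, the helper names (`forward`, `gradients`, `mean_loss`, `train`) and the specific learning rates are all hypothetical, and the use of a smaller learning rate for the language model in the joint stage is an assumption made for illustration only.

```python
# Toy illustration only: a scalar "language model" feeding a scalar "task model".
# All names, data and learning rates are hypothetical, not taken from the patent.

def forward(w_lm, w_task, x):
    """Forward propagation: language model first, task model on top."""
    h = w_lm * x           # language-model representation
    return w_task * h      # task-model prediction

def gradients(w_lm, w_task, x, y):
    """Gradients of 0.5 * (pred - y)**2 with respect to (w_lm, w_task)."""
    err = forward(w_lm, w_task, x) - y
    return err * w_task * x, err * w_lm * x

def mean_loss(w_lm, w_task, data):
    """Mean squared-error loss over the whole sample set."""
    return sum(0.5 * (forward(w_lm, w_task, x) - y) ** 2 for x, y in data) / len(data)

def train(w_lm, w_task, data, epochs, lr_lm, lr_task, freeze_lm):
    """Per-sample gradient descent; the language model is skipped when frozen."""
    for _ in range(epochs):
        for x, y in data:
            g_lm, g_task = gradients(w_lm, w_task, x, y)
            if not freeze_lm:
                w_lm -= lr_lm * g_lm
            w_task -= lr_task * g_task
    return w_lm, w_task

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # toy training samples, y = 2 * x
w_lm0, w_task0 = 1.0, 0.1                     # w_lm0 stands in for pre-trained weights
loss0 = mean_loss(w_lm0, w_task0, data)

# Stage 1: language-model parameters fixed; back propagation updates only the task model.
w_lm1, w_task1 = train(w_lm0, w_task0, data, epochs=3,
                       lr_lm=0.0, lr_task=0.05, freeze_lm=True)
loss1 = mean_loss(w_lm1, w_task1, data)

# Stage 2: both models updated; the smaller language-model rate is an assumption.
w_lm2, w_task2 = train(w_lm1, w_task1, data, epochs=20,
                       lr_lm=0.001, lr_task=0.05, freeze_lm=False)
loss2 = mean_loss(w_lm2, w_task2, data)

print(loss0, loss1, loss2)
```

In this sketch the first stage leaves the pre-trained weight untouched while the freshly initialised task weight settles, which is the intuition behind avoiding catastrophic forgetting; only afterwards does the joint stage adjust both parameters.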

Description

Technical field

[0001] The present invention relates to artificial intelligence technology, in particular to an artificial intelligence-based language task model training method, device, electronic equipment and storage medium.

Background technique

[0002] Artificial Intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.

[0003] Although the various large-scale pre-trained language models in the related art have strong context representation capabilities, they do not provide rich interfaces for many specific tasks. For example, applying a language model to a reading comprehension task simply concatenates the question with the article for training. The disadvantage of this training method is that the language model does not learn the a...

Claims
