Language model training method and device based on knowledge distillation and text classification method and device
A language model training technology based on knowledge distillation, applied in semantic analysis, natural language data processing, character and pattern recognition, etc. It can solve problems such as decreased accuracy on new-domain data, inability to accurately transfer the teacher model's sentence grammar and semantic representations, and the transfer ability of the student model not being guaranteed, so as to achieve the effect of meeting application requirements and improving transfer ability.
Pending Publication Date: 2021-04-30
IFLYTEK CO LTD
AI-Extracted Technical Summary
Problems solved by technology
[0003] The existing pre-trained language model distillation methods usually use a distillation approach that aligns the output scores and the intermediate layers. This approach can make the output scores of the student model close to the output scores of the teacher model on the data of a specific task. However, if data from a new field is tested, ...
Method used
Based on the above situation, the present application provides a language model training method, a text classification method and devices based on knowledge distillation. Positive and negative examples of contrastive learning are constructed in the distillation process, and the second model is trained with these positive and negative examples, so as to transfer the rich sentence grammar and semantic representations of the first model to the second model, so that the distilled second model has better transfer ability and meets cross-domain application requirements.
Different from the prior art, the present embodiment constructs a first memory bank and a second memory bank and uses them to store the first hidden layer sentence representations and the second hidden layer sentence representations respectively, so that the corresponding hidden layer sentence representations can be selected directly from the first memory bank and the second memory bank when constructing positive and negative examples of contrastive learning. This avoids repeatedly constructing hidden layer sentence representations and thereby improves the efficiency of contrastive training.
Different from the prior art, the present embodiment constructs positive and negative examples of contrastive learning and uses them to train the student model, so that the representations of the student model and the teacher model for the same input text are closer while their representations of different input texts are farther apart, thereby transferring the grammatical and semantic representation abilities of the teacher model to the student model, giving the student model better transfer ability and meeting cross-domain application requirements.
Different from the prior art, the present embodiment constructs positive and negative examples of contrastive learning in the distillation process and uses them to train the second model, so as to transfer the rich sentence grammar and semantic representations of the first model to the second model. The distilled second model therefore has better transfer ability, and the trained second model can be applied as a language model to classification tasks in different fields, which not only achieves inference acceleration but also reaches an accuracy comparable to that of the teacher model, thereby meeting cross-domain application requirements.
In the present embodiment, the inter-word relationship matrix is used to construct the hidden layer sentence representation because the magnitudes of the relation values between words can reflect the grammar and semantics of a sentence. For example, in the sentence "he stole a car", the relation values between "he", "stole" and "car" are relatively large, reflecting a grammatical relationship between subject and ver...
Abstract
The invention discloses a language model training method and device based on knowledge distillation, and a text classification method and device. The language model training method comprises: inputting training corpora into a first model and a second model for processing to obtain corresponding intermediate layer data and output results; calculating a first hidden layer sentence representation and a second hidden layer sentence representation from the corresponding intermediate layer data; constructing positive and negative examples of contrastive learning based on the first hidden layer sentence representation and the second hidden layer sentence representation; training the second model using the positive and negative examples of contrastive learning, the corresponding intermediate layer data and the output results; and determining the trained second model as the language model. Through the positive and negative examples of contrastive learning, the rich sentence grammar and semantic representations of the first model can be transferred to the second model, so that the second model obtained through distillation has better transfer ability and cross-domain application requirements are met.
Examples
- Experimental program(1)
Example Embodiment
[0030] The technical solutions in the embodiments of the present application will be described below with reference to the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present application.
[0031] The terms used in the embodiments of the present application are for the purpose of describing particular embodiments only and are not intended to limit the present application. The singular forms "a", "an", "the" and "said" used in the embodiments of the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. "A plurality of" generally includes at least two, but does not exclude the case of including at least one.
[0032] It should be understood that the term "and/or" used herein merely describes an association relationship between associated objects, indicating that three relationships may exist. For example, A and/or B may represent three cases: A alone, both A and B, and B alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
[0033] It should also be understood that the terms "comprising", "including" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the statement "comprising a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
[0034] Fine-tuning a pre-trained model on downstream tasks has become a new paradigm in natural language processing (NLP). This paradigm brings improvements on many natural language processing tasks, such as reading comprehension tasks and general language understanding evaluation (GLUE) tasks. Common pre-trained models, such as BERT, RoBERTa, ALBERT and ELECTRA, all use a multi-layer Transformer as the core skeleton. While the multi-layer Transformer skeleton brings strong nonlinear fitting and representation capability, it also brings huge parameter storage pressure and slow inference speed. Especially for scenarios with high concurrency and strict requirements on average response time, such as accessing judicial intelligent customer service on a mobile phone or performing document review work on a domestic central processor, the pre-trained model suffers from low throughput and high average response time. Therefore, it is necessary to perform inference acceleration on the pre-trained model.
[0035] Knowledge distillation is a teacher-student model compression method proposed by Hinton et al., in which a large-scale teacher model is introduced to guide the training of a small-scale student model, thereby achieving knowledge transfer. The practice is to first train a teacher model, and then use the output of the teacher model together with the data labels to train the student model, so that the student model can not only learn how to distinguish correct samples from the labeled data, but also learn the relationships between classes captured by the teacher model.
[0036] The existing pre-trained language model distillation methods typically use a distillation approach that aligns the output scores and the intermediate layers; aligning the intermediate layer data can effectively improve the alignment of the final output scores. However, this approach can only make the output scores of the student model close to those of the teacher model on the data of a specific task; when tested on data from a new field, the effect of the student model drops a lot compared with the teacher model. For example, a student model distilled on theft-crime data performs comparably to the teacher model when tested on theft-crime data, but when tested on dangerous-driving-crime data, the effect of the student model is 10-20% worse than that of the teacher model. That is, the transfer ability of the distilled student model is not guaranteed, the purpose of transferring the teacher model's syntactic and semantic representations is not achieved, and cross-domain application requirements cannot be met.
[0037] Based on the above, the present application provides a language model training method, a text classification method and devices based on knowledge distillation. Positive and negative examples of contrastive learning are constructed in the distillation process, and the second model is trained with these positive and negative examples, so as to transfer the rich sentence grammar and semantic representations of the first model to the second model, so that the distilled second model has better transfer ability, thereby meeting cross-domain application requirements.
[0038] Specifically, referring to Figure 1, Figure 1 is a flowchart of an embodiment of the language model training method based on knowledge distillation of the present application. As shown in Figure 1, in the present embodiment, the method includes:
[0039] S11: Obtain a sample data set, where the sample data set includes multiple training corpora and labels of the training corpora.
[0040] In the present embodiment, the training corpus includes data of a classification task and/or data of a sequence labeling task.
[0041] The data of the classification task includes sentiment classification, topic classification and text matching data; the data of the sequence labeling task includes named entity recognition, part-of-speech tagging and semantic role labeling data.
[0042] In this embodiment, the data of the classification task can be extracted from the related-case matching data set and the intelligent customer service data set in the judicial field.
[0043] In other embodiments, the data of the classification task can also be extracted from related data sets in other fields, which is not limited in this application.
[0044] In this embodiment, the data of the sequence labeling task can be extracted from the crime element extraction data set in the judicial field.
[0045] In other embodiments, the data of the sequence labeling task can also be extracted from related data sets in other fields, which is not limited in this application.
[0046] S12: Input the multiple training corpora into the first model, and process the training corpora through the first model to obtain the intermediate layer data and output results of the first model for the training corpora; and input the multiple training corpora into the second model, and process the training corpora through the second model to obtain the intermediate layer data and output results of the second model for the training corpora; wherein the number of intermediate layers of the first model is greater than the number of intermediate layers of the second model.
[0047] In the present embodiment, the first model is a multi-layer model, such as a teacher model, and the second model is a model whose number of layers is smaller than that of the first model, such as a student model. For ease of understanding, the present embodiment takes the first model being a teacher model and the second model being a student model as a specific example.
[0048] For example, the intermediate layers of the teacher model consist of a 12-layer Transformer, and the intermediate layers of the student model consist of a 3-layer Transformer.
[0049] In the present embodiment, the pre-trained 12-layer model is trained on the task and the model parameters are updated by back-propagation until the teacher model parameters are well trained, so as to obtain the teacher model used in distillation training. The pre-trained 3-layer model, or the first 3 layers of the pre-trained 12-layer model, is used as the initialization parameters of the student model in distillation training.
[0050] Specifically, the more layers a pre-trained language model (e.g., BERT) has, the better its effect indicators; therefore, the present embodiment selects the 12-layer Transformer as the teacher model according to the effect indicators.
[0051] In other embodiments, a 24-layer Transformer can also be selected as the teacher model, which is not limited in this application.
[0052] Specifically, the fewer the layers and hidden units of a pre-trained model, the shorter the average response time. For example, the inference time required by the 3-layer student model is only 1/4 of that of the 12-layer teacher model; therefore, the present embodiment selects the 3-layer Transformer as the student model.
[0053] In other embodiments, since a 4-layer student model with 384 hidden units has an inference time on a T4 card of only 1/9 of that of the teacher model, a 4-layer Transformer can also be selected as the student model, which is not limited in this application.
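As an illustrative, non-limiting sketch of the initialization described above (the checkpoint name and the Hugging Face transformers API are assumptions of this example, not part of the present application), a 3-layer student can be initialized from the first 3 Transformer layers of a pre-trained 12-layer model as follows:

```python
# Hedged sketch: initialize a 3-layer student from a pre-trained 12-layer teacher.
# "bert-base-chinese" is a placeholder checkpoint name, not specified by the patent.
from transformers import BertConfig, BertModel

teacher = BertModel.from_pretrained("bert-base-chinese")               # 12-layer teacher
student_config = BertConfig.from_pretrained("bert-base-chinese", num_hidden_layers=3)
student = BertModel(student_config)

# Reuse the teacher's embeddings and its first 3 encoder layers as the student's initialization.
student.embeddings.load_state_dict(teacher.embeddings.state_dict())
for i in range(3):
    student.encoder.layer[i].load_state_dict(teacher.encoder.layer[i].state_dict())
```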
[0054] S13: Calculate the first hidden layer sentence representation of each training corpus from the intermediate layer data of the first model, and calculate the second hidden layer sentence representation of each training corpus from the intermediate layer data of the second model.
[0055] Referring to Figure 2, Figure 2 is a flowchart of a specific embodiment of step S13 in Figure 1. As shown in Figure 2, in the present embodiment, the step of calculating the first hidden layer sentence representation of each training corpus from the intermediate layer data of the first model and the second hidden layer sentence representation of each training corpus from the intermediate layer data of the second model includes:
[0056] S21: Perform inner-product calculations on the intermediate layer data of the first model and on the intermediate layer data of the second model respectively, to obtain the inter-word relationship matrices of the intermediate layer data of the first model and the inter-word relationship matrices of the intermediate layer data of the second model.
[0057] During distillation, since the number of layers of the first model (e.g., the teacher model) is larger than that of the second model (e.g., the student model), in order to align the intermediate layer data of the teacher model and the student model, a structural mapping relationship between the teacher model and the student model needs to be established, so as to obtain intermediate layers with a corresponding relationship.
[0058] In the present embodiment, the intermediate layer data of the first model whose intermediate layers have the same function as those of the second model are selected, and a mapping is established between the selected intermediate layer data of each layer, so as to obtain the mapping relationship between the intermediate layers of the first model and the second model, e.g., between the intermediate layers of the teacher model and the student model.
[0059] An "interval" (equal-spacing) mapping is used to obtain L mapping pairs of intermediate layers {(S_i, T_i)}, i = 1, ..., L, where L is the number of intermediate layers of the second model, i.e., of the student model.
[0060] For example, continuing the above example, the teacher model in the present embodiment has 12 intermediate layers {T_1, T_2, ..., T_12} and the student model has 3 intermediate layers {S_1, S_2, S_3}, so 3 mapping pairs of intermediate layers are obtained, and the mapping result is {(S_i, T_i)}, i = 1, 2, 3, where T_i = T_{4i}, S_i = S_i, L = 3. That is, the 4th, 8th and 12th layers of the teacher model are selected to correspond respectively to the 1st, 2nd and 3rd layers of the student model as intermediate layers with the same function.
[0061] Further, the intermediate layer data H^T_4, H^T_8 and H^T_12 corresponding to the 4th, 8th and 12th layers of the first model, and the intermediate layer data H^S_1, H^S_2 and H^S_3 of the second model, are selected.
[0062] Here, H^T_4, H^T_8 and H^T_12 are the output vectors of the 4th, 8th and 12th Transformer layers of the teacher model, respectively, and H^S_1, H^S_2 and H^S_3 are the output vectors of the 1st, 2nd and 3rd Transformer layers of the student model, respectively.
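The following is a minimal illustrative sketch of the "interval" mapping described above; the function name and 1-based indexing are assumptions of this example:

```python
def interval_layer_mapping(teacher_layers: int = 12, student_layers: int = 3):
    """Return (student_layer, teacher_layer) pairs using an equal-interval mapping.

    For 12 teacher layers and 3 student layers this yields [(1, 4), (2, 8), (3, 12)],
    i.e. T_i = T_{4i} and S_i = S_i as in the example above (1-based indices)."""
    step = teacher_layers // student_layers
    return [(i, i * step) for i in range(1, student_layers + 1)]
```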
[0063] Further, inner-product calculations are performed on the intermediate layer data H^T_i selected from the teacher model to obtain the inter-word relationship matrices of the intermediate layer data of the first model, and on the intermediate layer data H^S_i of the student model to obtain the inter-word relationship matrices of the intermediate layer data of the second model.
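As a hedged illustration of the inner-product calculation, assuming each selected layer output is a PyTorch tensor of shape (batch, seq_len, hidden):

```python
import torch

def inter_word_relation_matrix(h: torch.Tensor) -> torch.Tensor:
    """Inner products between all pairs of word vectors of one layer output.

    h: (batch, seq_len, hidden) layer output, e.g. H^T_i or H^S_i.
    Returns the inter-word relationship matrix of shape (batch, seq_len, seq_len)."""
    return torch.matmul(h, h.transpose(-1, -2))
```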
[0064] S22: Construct, from the inter-word relationship matrices of the intermediate layer data of the first model, the first hidden layer sentence representation of each training corpus corresponding to the intermediate layer data of the first model; and construct, from the inter-word relationship matrices of the intermediate layer data of the second model, the second hidden layer sentence representation of each training corpus corresponding to the intermediate layer data of the second model.
[0065] In the present embodiment, according to the constructed mapping relationship, hidden layer sentence representations are constructed from the inter-word relationship matrices of the selected intermediate layer data of the first model, so as to obtain the first hidden layer sentence representation of each training corpus corresponding to the intermediate layer data of the first model.
[0066] Specifically, denote the inter-word relationship matrix of the intermediate layer data of the first model by R^T_i; the hidden layer sentence representation constructed from this inter-word relationship matrix is denoted t_i.
[0067] Here, t_i is the first hidden layer sentence representation of each training corpus constructed from the intermediate layer data of the first model.
[0068] Similarly, denote the inter-word relationship matrix of the intermediate layer data of the second model by R^S_i; the hidden layer sentence representation constructed from this inter-word relationship matrix is denoted s_i.
[0069] Here, s_i is the second hidden layer sentence representation of each training corpus constructed from the intermediate layer data of the second model.
[0070] Further, since the size of the inter-word relationship matrix is related to the maximum sentence length of the model, and since there are many training corpora, in order to reduce the amount of computation and improve the convergence speed, the present embodiment performs dimension-reduction processing on the inter-word relationship matrix to obtain the above hidden layer sentence representations, thereby improving training efficiency.
[0071] Specifically, the inter-word relationship matrix is concatenated row by row, and a linear transformation is then used to reduce the dimension of the concatenated inter-word relationship matrix, so as to obtain the above hidden layer sentence representations.
[0072] For example, in the case matching task, the maximum sentence length of the model is 512, so the size of the inter-word relationship matrix is 512*512; the inter-word relationship matrix is concatenated row by row and then reduced in dimension by a linear transformation, so that the dimension of the hidden layer sentence representation is reduced from 512 to 256.
[0073] In the present embodiment, the inter-word relationship matrix is used to construct the hidden layer sentence representation because the magnitudes of the relation values between words can reflect the grammar and semantics of a sentence. For example, in the sentence "he stole a car", the relation values between "he", "stole" and "car" are relatively large, reflecting a subject-verb-object grammatical relationship. Training the student model with hidden layer sentence representations constructed from the inter-word relationship matrix enables the student model to obtain more accurate grammatical and semantic representation ability.
[0074] By constructing hidden layer sentence representations from the inter-word relationship matrices of the intermediate layer data of the first model and of the second model, first and second hidden layer sentence representations carrying more grammar and semantics can be constructed, thereby providing as many input-text representations as possible for constructing positive and negative examples.
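One possible, illustrative reading of the row-wise concatenation and linear dimension reduction is sketched below; the mean pooling over rows and the module name are assumptions of this example:

```python
import torch
import torch.nn as nn

class HiddenSentenceRepresentation(nn.Module):
    """Reduce a (seq_len x seq_len) inter-word relationship matrix to a 256-dim
    hidden layer sentence representation; the mean over rows is this sketch's assumption."""

    def __init__(self, max_seq_len: int = 512, rep_dim: int = 256):
        super().__init__()
        self.proj = nn.Linear(max_seq_len, rep_dim)   # 512 -> 256 linear transformation

    def forward(self, relation_matrix: torch.Tensor) -> torch.Tensor:
        # relation_matrix: (batch, seq_len, seq_len)
        rows = self.proj(relation_matrix)             # (batch, seq_len, rep_dim)
        return rows.mean(dim=1)                       # (batch, rep_dim)
```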
[0075] S14: Construct positive and negative examples of contrastive learning from the first hidden layer sentence representations and the second hidden layer sentence representations; wherein a positive example includes the first hidden layer sentence representation and the second hidden layer sentence representation of the same training corpus, and a negative example includes the first hidden layer sentence representation in the positive example and the second hidden layer sentence representation of another, different training corpus.
[0076] In the present embodiment, a positive example of contrastive learning and at least one negative example of contrastive learning are constructed from the first hidden layer sentence representations and the second hidden layer sentence representations.
[0077] Specifically, the positive example of contrastive learning is constructed in the following manner:
[0078] Suppose the training corpus contains a training sample (x_0, y_0), where x_0 is the text input of this training sample and y_0 is the classification label of this training sample;
[0079] For the training sample (x_0, y_0), the hidden layer sentence representation t corresponding to the training sample (x_0, y_0) is selected from the first hidden layer sentence representations, and the hidden layer sentence representation s corresponding to the training sample (x_0, y_0) is then selected from the second hidden layer sentence representations.
[0080] Based on the hidden layer sentence representation t and the hidden layer sentence representation s, the positive example (s, t) of contrastive learning is constructed.
[0081] Further, at least one negative example of contrastive learning is constructed in the following manner:
[0082] K hidden layer sentence representations s^k (k = 1, ..., K) corresponding to other training samples are selected from the second hidden layer sentence representations.
[0083] Based on the hidden layer sentence representation t and the hidden layer sentence representations s^k, K negative examples (s^k, t) of contrastive learning are constructed.
[0084] Referring to Figure 3, Figure 3 is a flowchart of an embodiment of constructing positive and negative examples of contrastive learning in the present application. As shown in Figure 3, in the present embodiment, the method includes:
[0085] S31: Select the intermediate layer data of the first model whose intermediate layers have the same function as the intermediate layers of the second model.
[0086] For example, the intermediate layer data H^T_4, H^T_8 and H^T_12 corresponding to the 4th, 8th and 12th layers of the first model, and the intermediate layer data H^S_1, H^S_2 and H^S_3 of the second model, are selected.
[0087] S32: Establish a mapping between the selected intermediate layer data of each layer, so as to obtain the mapping relationship between the intermediate layers of the first model and the second model.
[0088] S33: Using the mapping relationship, select the first hidden layer sentence representation and the second hidden layer sentence representation of the same training corpus corresponding to the intermediate layer data of the same mapped intermediate layers of the first model and the second model as a positive example; and, using the mapping relationship, select the second hidden layer sentence representations of other, different training corpora corresponding to the intermediate layer data of the same mapped intermediate layers, and use these second hidden layer sentence representations together with the first hidden layer sentence representation in the positive example as negative examples.
[0089] In the present embodiment, a positive example of contrastive learning and at least one negative example are constructed from the first hidden layer sentence representations and the second hidden layer sentence representations corresponding to the intermediate layer data having the mapping relationship.
[0090] Specifically, the positive example of contrastive learning is constructed in the following manner:
[0091] Suppose the training corpus contains a training sample (x_0, y_0), where x_0 is the text input of this training sample and y_0 is the classification label of this training sample;
[0092] Using the mapping relationship, the first hidden layer sentence representation t_i and the second hidden layer sentence representation s_i of the training sample (x_0, y_0) corresponding to the intermediate layer data of the i-th pair of mapped intermediate layers of the first model and the second model are selected, and the positive example (s_i, t_i) of contrastive learning is constructed.
[0093] Further, at least one negative example of contrastive learning is constructed in the following manner:
[0094] Using the mapping relationship, the second hidden layer sentence representations s_i^k (k = 1, ..., K) of the other remaining training samples corresponding to the intermediate layer data of the same mapped intermediate layer are selected; together with the first hidden layer sentence representation t_i in the positive example, K negative examples (s_i^k, t_i) of contrastive learning are constructed.
[0095] In the prior art, the construction and use of positive and negative examples has nothing to do with the distillation process; the approach adopted during distillation is to make the positive-example scores of the student model close to the positive-example scores of the teacher model, and the negative-example scores of the student model close to the negative-example scores of the teacher model.
[0096] In this embodiment, the construction and use of positive and negative examples is applied to the distillation process itself. Each input sample is regarded as a separate class: representations from the same input sample form a positive example, and representations from different input samples form negative examples. This makes the representations of the second model and the first model for the same input sample closer and their representations of different input samples farther apart, thereby maximizing a lower bound of the mutual information between the two probability distributions, strengthening the learning of grammar and semantics during the training of the second model, and thus transferring the grammatical and semantic representations of the first model to the second model to improve the transfer ability and generalization of the second model.
[0097] Referring next to Figure 4, Figure 4 is a flowchart of another embodiment of constructing positive and negative examples of contrastive learning in the present application. As shown in Figure 4, in the present embodiment, the method includes:
[0098] S41: Construct a first memory bank and a second memory bank.
[0099] In the present embodiment, the sizes of the first memory bank and the second memory bank are both expressed as {N * L * D}, where N is the number of training corpora in the sample data set, L is the number of mapped intermediate layers of the model, and D is the dimension of the hidden layer sentence representation.
[0100] Specifically, since the 4th, 8th and 12th layers of the teacher model are selected to correspond to the 1st, 2nd and 3rd layers of the student model as the mapped intermediate layers, L is 3; and since the inter-word relationship matrix is reduced in dimension, the first hidden layer sentence representation and the second hidden layer sentence representation are both 256-dimensional.
[0101] S42: Store the first hidden layer sentence representations in the first memory bank, and store the second hidden layer sentence representations in the second memory bank.
[0102] In the present embodiment, two memory banks are constructed to separately store the first hidden layer sentence representations and the second hidden layer sentence representations constructed from the training corpora on the first model and the second model.
[0103] Specifically, each batch of input data can construct a plurality of hidden layer sentence representations; storing this large number of hidden layer sentence representations in the memory banks avoids re-construction and facilitates the subsequent calculation of the contrastive loss function of the positive and negative examples.
[0104] S43: Select the first hidden layer sentence representation of the positive example from the first memory bank, and query, from the second memory bank, the second hidden layer sentence representation of the same training corpus corresponding to that first hidden layer sentence representation; and select the first hidden layer sentence representation of the negative examples from the first memory bank, and query, from the second memory bank, the second hidden layer sentence representations of different training corpora corresponding to that first hidden layer sentence representation.
[0105] Further, since the first model is fixed during distillation, the first memory bank remains unchanged after being initialized once, while the second memory bank is updated during distillation.
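An illustrative sketch of the two memory banks follows; the class name, update interface and negative-sampling helper are assumptions of this example, with shapes following the {N * L * D} description above:

```python
import torch

class MemoryBank:
    """Stores one hidden layer sentence representation per (sample, mapped layer).

    Shape follows {N * L * D}: N training corpora, L mapped intermediate layers,
    D-dimensional sentence representations (e.g. L = 3, D = 256)."""

    def __init__(self, num_samples: int, num_layers: int = 3, rep_dim: int = 256):
        self.bank = torch.zeros(num_samples, num_layers, rep_dim)

    def update(self, sample_ids: torch.Tensor, layer: int, reps: torch.Tensor) -> None:
        # For the teacher bank this is called once at initialization and then kept fixed;
        # for the student bank it is refreshed after each parameter update.
        self.bank[sample_ids, layer] = reps.detach()

    def sample_negatives(self, exclude_id: int, layer: int, k: int = 4096) -> torch.Tensor:
        # Draw K student representations of *other* samples as negatives.
        candidates = torch.tensor([i for i in range(self.bank.size(0)) if i != exclude_id])
        idx = candidates[torch.randint(len(candidates), (k,))]
        return self.bank[idx, layer]
```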
[0106] Different from the prior art, the present embodiment constructs the first memory bank and the second memory bank and uses them to store the first hidden layer sentence representations and the second hidden layer sentence representations respectively, so that the corresponding hidden layer sentence representations can be selected directly from the first memory bank and the second memory bank when constructing positive and negative examples. This avoids repeatedly constructing hidden layer sentence representations and thereby improves the efficiency of contrastive training.
[0107] S15: Train the second model using the sample data set, the intermediate layer data and output results of the first model, the intermediate layer data and output results of the second model, and the positive and negative examples, and determine the trained second model as the language model.
[0108] Referring to Figure 5, Figure 5 is a flowchart of a specific embodiment of step S15 in Figure 1. As shown in Figure 5, in the present embodiment, the step of training the second model using the sample data set, the intermediate layer data and output results of the first model, the intermediate layer data and output results of the second model, and the positive and negative examples, and determining the trained second model as the language model, includes:
[0109] S51: Calculate the cross-entropy loss function of the output result of the second model for the training corpus relative to the label; calculate the mean-square-error loss function of the intermediate layer data of the first model and the intermediate layer data of the second model; calculate the contrastive loss function of the positive and negative examples; and calculate the relative-entropy loss function of the output result of the first model and the output result of the second model.
[0110] In the present embodiment, the cross-entropy (Cross Entropy, CE) loss function of the output result of the second model for the training corpus relative to the label is calculated based on the output result of the second model for the training corpus, the probability value of the label corresponding to the training corpus, and the compression parameters of the second model relative to the first model.
[0111] Specifically, the calculation formula of the cross-entropy loss function of the output result of the second model for the training corpus relative to the label is:
[0112] L_hard(z^S, y; θ) = CE(z^S, y; θ)    (1)
[0113] where z^S is the output result of the second model for the training corpus, y is the probability value of the label corresponding to the training corpus, and θ denotes the compression parameters of the second model relative to the first model.
[0114] In the present embodiment, the mean-square-error loss function of the intermediate layer data of the first model and the intermediate layer data of the second model is calculated based on the intermediate layer data of the first model whose intermediate layers have the same function as those of the second model, the intermediate layer data of the second model, the compression parameters of the second model relative to the first model, and the linear mapping layers.
[0115] Specifically, the calculation formula of the mean-square-error loss function of the intermediate layer data of the first model and the intermediate layer data of the second model is:
[0116] L_MSE_i(H^T_i, H^S_i; θ) = MSE(H^T_i, W_i H^S_i; θ)    (2)
[0117] where H^T_i is the intermediate layer data of the i-th mapped layer of the first model, H^S_i is the intermediate layer data of the i-th mapped layer of the second model, MSE is the mean-square-error function, W_i is the linear mapping layer of the i-th layer, and θ denotes the compression parameters of the second model relative to the first model.
[0118] Further, the calculation formula of the mean-square-error loss function of the entire distillation process is:
[0119] L_MSE(H^T, H^S; θ) = Σ_{i=1}^{L} MSE(H^T_i, W_i H^S_i; θ)    (3)
[0120] where H^T denotes the intermediate layer data of the first model, H^S denotes the intermediate layer data of the second model, MSE is the mean-square-error function, W_i is the linear mapping layer of the i-th layer, θ denotes the compression parameters of the second model relative to the first model, and L is the number of intermediate layers of the second model.
[0121] In other embodiments, the intermediate layer data of the i-th layer of the second model can also be linearly transformed so that the number of intermediate layer units of the second model is the same as the number of intermediate layer units of the first model.
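For illustration, a sketch of the mean-square-error term over the mapped intermediate layers, with a linear mapping W_i from the student's hidden size to the teacher's; the module name is an assumption of this example:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntermediateLayerMSE(nn.Module):
    """MSE between mapped intermediate layers; W_i maps the student's hidden states
    to the teacher's hidden size before the comparison, as in formula (3) above."""

    def __init__(self, student_dim: int, teacher_dim: int, num_layers: int = 3):
        super().__init__()
        self.mappings = nn.ModuleList(
            nn.Linear(student_dim, teacher_dim) for _ in range(num_layers)
        )

    def forward(self, teacher_states, student_states):
        # teacher_states / student_states: lists of L tensors of shape (batch, seq_len, dim)
        return sum(
            F.mse_loss(w(h_s), h_t)
            for w, h_s, h_t in zip(self.mappings, student_states, teacher_states)
        )
```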
[0122] In the present embodiment, the dot product of the two vectors of the positive example, and the dot product of the two vectors of each negative example, are calculated to obtain the similarity of the positive example and of each negative example; the noise contrastive estimation (Noise Contrastive Estimation, NCE) loss function is then calculated using these similarities of the positive and negative examples.
[0123] Specifically, based on the training sample (x_0, y_0), a positive example (s_i, t_i) and K negative examples (s_i^k, t_i) at the i-th mapped intermediate layer are obtained, and the calculation formula of the contrastive loss function of the positive and negative examples is:
[0124] L_NCE_i(θ_i) = -log[ exp(s_i · t_i / τ) / ( exp(s_i · t_i / τ) + Σ_{k=1}^{K} exp(s_i^k · t_i / τ) ) ]    (4)
[0125] where θ_i denotes the compression parameters of the i-th layer of the second model relative to the first model, s_i and t_i are the hidden layer sentence representations of the second model and the first model at the i-th mapped intermediate layer respectively, the "·" operation denotes the dot product of two vectors, log denotes the logarithmic function, K is a constant, and τ is a hyperparameter.
[0126] Here, K is usually taken as 4096.
[0127] Further, the calculation formula of the contrastive loss function of the entire distillation process is:
[0128] L_NCE(θ) = Σ_{i=1}^{L} L_NCE_i(θ_i)    (5)
[0129] where θ denotes the compression parameters of the second model relative to the first model, θ_i denotes the compression parameters of the i-th layer of the second model relative to the first model, and L is the number of intermediate layers of the second model.
[0130] In this embodiment, the contrastive loss function is used to measure the similarity of the positive and negative examples.
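An illustrative PyTorch sketch of the per-layer contrastive loss in formula (4) follows; the temperature value τ = 0.07 is an assumption of this example, since the present application only states that τ is a hyperparameter:

```python
import torch

def nce_contrastive_loss(s_i: torch.Tensor,        # student representation, shape (D,)
                         t_i: torch.Tensor,        # teacher representation, shape (D,)
                         negatives: torch.Tensor,  # K student reps of other samples, (K, D)
                         tau: float = 0.07) -> torch.Tensor:
    """-log( exp(s_i·t_i/τ) / (exp(s_i·t_i/τ) + Σ_k exp(s_i^k·t_i/τ)) )."""
    pos = torch.exp(torch.dot(s_i, t_i) / tau)
    neg = torch.exp(negatives @ t_i / tau).sum()
    return -torch.log(pos / (pos + neg))
```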
[0131] In the present embodiment, the relative-entropy (Relative Entropy, RE) loss function of the output result of the first model and the output result of the second model is calculated based on the output result of the first model, the output result of the second model, and the compression parameters of the second model relative to the first model.
[0132] Specifically, the calculation formula of the relative-entropy loss function of the output result of the first model and the output result of the second model is:
[0133] L_KD(z^T, z^S; θ) = CE(z^S, z^T; θ)    (6)
[0134] where z^T is the output result of the first model, z^S is the output result of the second model, and θ denotes the compression parameters of the second model relative to the first model.
[0135] The relative-entropy loss function of the output result of the first model and the output result of the second model can be used to measure the KL divergence between z^T and z^S.
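As a hedged illustration of formula (6), the relative entropy between the two output distributions can be computed as a KL divergence of the softmax outputs; any temperature scaling is omitted here because it is not mentioned above:

```python
import torch
import torch.nn.functional as F

def kd_relative_entropy_loss(z_teacher: torch.Tensor, z_student: torch.Tensor) -> torch.Tensor:
    """KL(softmax(z^T) || softmax(z^S)), i.e. the soft cross entropy of the student
    outputs against the teacher outputs, up to a constant."""
    log_p_student = F.log_softmax(z_student, dim=-1)
    p_teacher = F.softmax(z_teacher, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")
```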
[0136] S52: Train the second model using the cross-entropy loss function, the mean-square-error loss function, the contrastive loss function and the relative-entropy loss function, and determine the trained second model as the language model.
[0137] In the present embodiment, the loss values of the cross-entropy loss function, the mean-square-error loss function, the contrastive loss function and the relative-entropy loss function are calculated respectively by the above calculation formulas.
[0138] Further, the obtained loss values are weighted and summed to obtain the total distillation loss value of the second model.
[0139] Specifically, the calculation formula of the total distillation loss value of the second model is as follows:
[0140] L_all = α_1 L_NCE(θ) + α_2 L_hard(z^S, y; θ) + α_3 L_KD(z^T, z^S; θ) + α_4 L_MSE(H^T, H^S; θ)    (7)
[0141] where L_NCE(θ) is the contrastive loss function of the entire distillation process, L_hard(z^S, y; θ) is the cross-entropy loss function of the output result of the second model for the training corpus relative to the label, L_KD(z^T, z^S; θ) is the relative-entropy loss function of the output result of the first model and the output result of the second model, L_MSE(H^T, H^S; θ) is the mean-square-error loss function of the entire distillation process, and α_1, α_2, α_3 and α_4 are the weight values corresponding to the above four loss functions respectively.
[0142] In the present embodiment, the model parameters of the second model are trained by back-propagation using the total distillation loss value, so as to obtain the language model.
[0143] Specifically, training the model parameters of the second model by back-propagation using the total distillation loss value means using the Adam optimizer to calculate the gradient values of all model parameters and updating the parameter values of the second model by back-propagation, so as to optimize the model.
[0144] Updating the parameter values of the second model also includes updating the positive and negative examples of contrastive learning in the second memory bank, i.e., calculating the second hidden layer sentence representations corresponding to the updated second model and storing these second hidden layer sentence representations in the second memory bank.
[0145] In the present embodiment, the magnitude of each back-propagation update is relatively small, so as to ensure the smoothness of the effect before and after the parameters of the second model are updated.
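Putting the four losses together, the following illustrative sketch shows the weighted sum of formula (7) and a typical Adam update of the student; the weight values and learning rate are assumptions of this example, not values given by the present application:

```python
import torch

def total_distillation_loss(l_nce: torch.Tensor, l_hard: torch.Tensor,
                            l_kd: torch.Tensor, l_mse: torch.Tensor,
                            alphas=(1.0, 1.0, 1.0, 1.0)) -> torch.Tensor:
    """Weighted sum of the four losses as in formula (7); equal weights are an
    assumption of this sketch."""
    a1, a2, a3, a4 = alphas
    return a1 * l_nce + a2 * l_hard + a3 * l_kd + a4 * l_mse

# Typical usage for one update of the student (teacher parameters stay fixed):
#   optimizer = torch.optim.Adam(student.parameters(), lr=2e-5)   # lr is an assumed value
#   loss = total_distillation_loss(l_nce, l_hard, l_kd, l_mse)
#   optimizer.zero_grad(); loss.backward(); optimizer.step()
# After the step, the second memory bank is refreshed with the updated student's
# hidden layer sentence representations (see the MemoryBank sketch above).
```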
[0146] Further, new training corpora are iteratively input into the first model and the second model, the parameters of the first model are always kept fixed, and the distillation process of steps S12 to S15 is repeated continuously until the effect of distillation converges, so as to obtain the optimal second model, and the optimal second model is determined as the language model.
[0147] The language model obtained in the present embodiment is a compressed 3-layer student model. The number of parameters of the student model is approximately 1/3 of that of the teacher model, the inference speed of the student model is 3 times that of the teacher model, and the effect of the student model on the test set is comparable to that of the teacher model.
[0148] In the present embodiment, positive and negative examples of contrastive learning are constructed and used to train the student model, so that the representations of the student model and the teacher model for the same input text are closer while their representations of different input texts are farther apart, thereby transferring the grammatical and semantic representation ability of the teacher model to the student model, so that the student model has better transfer ability and meets cross-domain application requirements.
[0149] Different from the prior art, the present embodiment constructs positive and negative examples of contrastive learning and uses them to train the student model, so that the representations of the student model and the teacher model for the same input text are closer while their representations of different input texts are farther apart, thereby transferring the grammatical and semantic representations of the teacher model to the student model, giving the student model better transfer ability and meeting cross-domain application requirements.
[0150] To further illustrate the process of the above training method, please refer to Figure 6. Figure 6 is a framework schematic diagram of the language model training method based on knowledge distillation of the present application. As shown in Figure 6, the 4th, 8th and 12th layers of the teacher model are selected to correspond respectively to the 1st, 2nd and 3rd layers of the student model.
[0151] In this embodiment, based on the training sample (x_0, y_0), the intermediate layer data of the teacher model are the output vectors of the 4th, 8th and 12th Transformer layers, and the intermediate layer data of the student model are the output vectors of the 1st, 2nd and 3rd Transformer layers.
[0152] The output vectors are passed through the corresponding linear mapping layers, and the mean-square-error loss function (MSE Loss) of the intermediate layer data of the teacher model and the intermediate layer data of the student model is calculated.
[0153] Inner-product calculations are performed on the intermediate layer data to obtain the inter-word relationship matrices of the intermediate layer data of the first model and of the second model; by reducing the dimension of these inter-word relationship matrices, the first hidden layer sentence representation and the second hidden layer sentence representation corresponding to the training sample (x_0, y_0) are obtained.
[0154] Further, positive and negative examples of contrastive learning are constructed based on the first hidden layer sentence representation and the second hidden layer sentence representation corresponding to the training sample (x_0, y_0), including one positive example and K negative examples.
[0155] The dot product of the two vectors of the positive example and the dot product of the two vectors of each negative example are calculated to obtain the similarity of the positive example and of each negative example, and the contrastive loss function (NCE Loss) is calculated using these similarities.
[0156] In this embodiment, based on the training sample (x_0, y_0), the output results of the teacher model and the student model are the results z^T and z^S obtained after the fully connected layers (FC) of the teacher model and the student model, respectively.
[0157] Based on z^T, z^S and the compression parameters of the student model relative to the teacher model, the relative-entropy loss function (RE Loss) of the output result of the teacher model and the output result of the student model is calculated.
[0158] In this embodiment, based on z^S, the probability value of the label corresponding to the training corpus, and the compression parameters of the student model relative to the teacher model, the cross-entropy loss function (CE Loss) of the output result of the student model for the training corpus relative to the label is calculated.
[0159] Different from the prior art, the present embodiment constructs positive and negative examples of contrastive learning and uses them to train the student model, so that the representations of the student model and the teacher model for the same input text are closer while their representations of different input texts are farther apart, thereby transferring the grammatical and semantic representations of the teacher model to the student model, giving the student model better transfer ability and meeting cross-domain application requirements.
[0160]Correspondingly, the present application provides a text classification method based on a language model.
[0161] Referring to Figure 7, Figure 7 is a flowchart of an embodiment of the text classification method based on a language model of the present application. As shown in Figure 7, in the present embodiment, the language model is the second model trained by the training method of the above embodiments, and the text classification method includes:
[0162] S71: Receive the text to be classified.
[0163] S72: Input the text to be classified into the language model, and process the text to be classified through the language model to obtain the classification result of the text to be classified.
[0164] In a specific implementation scenario, for example related-case matching of judicial documents, a well-trained language model is first obtained; the relevant judicial documents are then received and organized into text data that conforms to the input protocol, and the text data is input into the above language model to obtain the case matching result.
[0165] Since the language model used in this implementation scenario is the student model obtained by distillation training, the average response time of the related-case matching work is about 1/4 of that of the original teacher model, so that the average response time reaches an acceptable level and inference acceleration is achieved.
[0166] Further, since positive and negative examples of contrastive learning are introduced in the process of distillation training of the language model, the rich representations of the original model can be transferred to the trained language model, so that the obtained language model has good transfer ability and can be applied to more implementation scenarios.
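For illustration, a minimal inference sketch for applying the distilled student as the classifier is given below; the tokenizer class, checkpoint path and input text are placeholder assumptions of this example:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Placeholder path; the present application does not specify a concrete checkpoint.
tokenizer = BertTokenizer.from_pretrained("path/to/distilled-student")
model = BertForSequenceClassification.from_pretrained("path/to/distilled-student")
model.eval()

text = "text of the judicial document to be classified"   # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class = logits.argmax(dim=-1).item()
print(predicted_class)
```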
[0167] Different from the prior art, the present embodiment constructs positive and negative examples of contrastive learning in the distillation process and uses them to train the second model, so as to transfer the rich sentence grammar and semantic representations of the first model to the second model. The distilled second model therefore has better transfer ability, and the trained second model can be applied as a language model to classification tasks in different fields, which not only achieves inference acceleration but also reaches an accuracy comparable to that of the teacher model, thereby meeting cross-domain application requirements.
[0168]Correspondingly, the present application provides a language model training apparatus based on knowledge distillation.
[0169] Referring to Figure 8, Figure 8 is a schematic structural diagram of an embodiment of the language model training apparatus based on knowledge distillation of the present application. As shown in Figure 8, the language model training apparatus 80 includes a processor 81 and a memory 82 coupled to each other.
[0170] In the present embodiment, the memory 82 is used to store program data which, when executed, implements the steps of the language model training method in any of the above embodiments; the processor 81 is used to execute the program instructions stored in the memory 82 to implement the steps of the language model training method in any of the above method embodiments, or the steps correspondingly performed by the language model training apparatus in any of the above method embodiments.
[0171] Specifically, the processor 81 is configured to control itself and the memory 82 to implement the steps of the language model training method in any of the above embodiments. The processor 81 may also be referred to as a CPU (Central Processing Unit). The processor 81 may be an integrated circuit chip with signal processing capability. The processor 81 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 81 may be implemented jointly by a plurality of integrated circuit chips.
[0172] Different from the prior art, the present embodiment constructs positive and negative examples of contrastive learning and uses them to train the student model, so that the representations of the student model and the teacher model for the same input text are closer while their representations of different input texts are farther apart, thereby transferring the grammatical and semantic representations of the teacher model to the student model, giving the student model better transfer ability and meeting cross-domain application requirements.
[0173]Correspondingly, the present application provides a text classification device based on a language model.
[0174] Referring to Figure 9, Figure 9 is a schematic structural diagram of an embodiment of the text classification device based on a language model of the present application. As shown in Figure 9, the text classification device 90 includes a processor 91 and a memory 92 coupled to each other.
[0175] In the present embodiment, the memory 92 is used to store program data which, when executed, implements the steps of the above text classification method; the processor 91 is used to execute the program instructions stored in the memory 92 to implement the steps of the text classification method in the above method embodiment, or the steps correspondingly performed by the text classification device in the above method embodiment.
[0176] Specifically, the processor 91 is configured to control itself and the memory 92 to implement the steps of the text classification method in any of the above embodiments. The processor 91 may also be referred to as a CPU (Central Processing Unit). The processor 91 may be an integrated circuit chip with signal processing capability. The processor 91 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 91 may be implemented jointly by a plurality of integrated circuit chips.
[0177] Different from the prior art, the present embodiment constructs positive and negative examples of contrastive learning in the distillation process and uses them to train the second model, so as to transfer the rich sentence grammar and semantic representations of the first model to the second model. The distilled second model therefore has better transfer ability, and the trained second model can be applied as a language model to classification tasks in different fields, which not only achieves inference acceleration but also reaches an accuracy comparable to that of the teacher model, thereby meeting cross-domain application requirements.
[0178]Correspondingly, the present application provides a computer readable storage medium.
[0179] Referring to Figure 10, Figure 10 is a schematic structural diagram of an embodiment of the computer-readable storage medium of the present application.
[0180] The computer-readable storage medium 100 stores a computer program 1001. When executed by a processor, the computer program 1001 performs the steps of the language model training method in any of the above method embodiments or the steps of the above text classification method, as well as the steps correspondingly performed by the language model training apparatus or the text classification device in the above method embodiments.
[0181] Specifically, if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in the computer-readable storage medium 100. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in the computer-readable storage medium 100 and includes a number of instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform all or some of the steps of the methods of the various embodiments of the present application. The aforementioned computer-readable storage medium 100 includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
[0182] In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus can be implemented in other ways. For example, the apparatus embodiments described above are merely schematic; the division of modules or units is only a logical function division, and there may be other division methods in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling or direct coupling or communication connection shown or discussed may be electrical, mechanical or in other forms.
[0183] The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
[0184] In addition, each functional unit in the various embodiments of the present application may be integrated into one processing unit, each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
[0185] If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
[0186] The above are only embodiments of the present application and are not intended to limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present application.