Language model training method and device based on knowledge distillation and text classification method and device

The technology covers a knowledge-distillation-based language model training method and device, a text classification method and device, electronic equipment, and a non-transitory computer-readable storage medium. It addresses the inaccuracy, strong subjectivity, and error-proneness of manual secrecy classification, achieving the effect of retaining model accuracy while improving applicability and reliability.

Pending Publication Date: 2020-08-07
Applicant: 北京万里红科技有限公司

AI Technical Summary

Problems solved by technology

Traditional classification work relies on manual classification and lacks informatized, intelligent decision-support tools; the resulting classifications are highly subjective and prone to mistakes and inaccuracies.
Moreover, because large amounts of labeled data are difficult to obtain in the confidentiality field, machine learning classification algorithms that require extensive labeled data cannot achieve good results there, and traditional text classification methods are likewise difficult to apply effectively in this field.




Embodiment Construction

[0031] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with aspects of the invention as recited in the appended claims.

[0032] It should be noted that although expressions such as "first" and "second" are used herein to describe different modules, steps, data, etc. of the embodiments of the present invention, such expressions are only used to distinguish between different modules, steps, data, etc., and do not imply a particular order or degree of importance. In fact, expressions su...



Abstract

The invention relates to a language model training method based on knowledge distillation, a text classification method, a language model training device based on knowledge distillation, a text classification device, electronic equipment, and a non-transitory computer-readable storage medium. The language model training method based on knowledge distillation comprises a first word vector layer parameter determination step and a language model training step. The text classification method comprises the steps of: obtaining a text to be classified; extracting a keyword code list from the text to be classified; obtaining, through a language model and according to the keyword code list, a word vector for each keyword of the text to be classified; and obtaining, through a text classification layer, a classification result for the text to be classified. By adopting a knowledge distillation method, the approach reduces dependence on labeled samples while retaining model accuracy, and increases inference speed by simplifying the model structure, thereby improving the applicability and reliability of the text classification method in an intelligent auxiliary secrecy-classification system.
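The abstract's pipeline (text → keyword code list → word vectors → classification layer) can be sketched end to end with toy data. Everything below is illustrative: the vocabulary, embeddings, class weights, and all function names (`extract_keywords`, `mean_vector`, `classify`) are invented for this sketch, not taken from the patent, and a simple mean-pooled linear layer stands in for the trained language model and classification layer.

```python
# Hypothetical sketch of the claimed classification pipeline; all names and
# numbers are illustrative, not from the patent.
import re

# Toy vocabulary: keyword -> integer code (the "keyword code list" step).
VOCAB = {"budget": 0, "missile": 1, "meeting": 2, "schedule": 3}

# Toy stand-in for the language model: one 3-d word vector per keyword code.
EMBEDDINGS = [
    [0.9, 0.1, 0.0],   # budget
    [0.1, 0.9, 0.2],   # missile
    [0.2, 0.1, 0.8],   # meeting
    [0.3, 0.2, 0.7],   # schedule
]

# Toy text-classification layer: one weight row per class label.
CLASS_WEIGHTS = {"secret": [0.2, 1.0, 0.1], "public": [0.8, 0.1, 0.9]}

def extract_keywords(text):
    """Return the keyword code list for in-vocabulary words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [VOCAB[t] for t in tokens if t in VOCAB]

def mean_vector(codes):
    """Average the word vectors of all extracted keywords."""
    vecs = [EMBEDDINGS[c] for c in codes]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def classify(text):
    """Run the full pipeline: text -> codes -> vectors -> class label."""
    doc = mean_vector(extract_keywords(text))
    scores = {
        label: sum(w * x for w, x in zip(weights, doc))
        for label, weights in CLASS_WEIGHTS.items()
    }
    return max(scores, key=scores.get)
```

In the patented system the embedding table would come from the distilled language model and the classification layer would be trained jointly; the control flow, however, follows the four steps the abstract lists.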

Description

Technical Field
[0001] The disclosure relates to the technical field of data information processing and analysis methods in the security domain, and in particular to a language model training method based on knowledge distillation, a text classification method, a language model training device based on knowledge distillation, a text classification device, electronic equipment, and a non-transitory computer-readable storage medium.
Background Technique
[0002] Confidentiality work refers to protecting secret information from being leaked through certain means and preventive measures. It is an important task for maintaining information security and an important means of protecting the core interests of all aspects of social security from infringement. Classification work is the source and foundation of secrecy work, and doing classification well is the premise and basis for doing secrecy work well. With the rapid advancement of informatization con...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F16/33, G06F16/35, G06F40/30, G06N3/04, G06N3/08
CPC: G06F16/35, G06F16/3344, G06F40/30, G06N3/084, G06N3/044
Inventors: 张小亮, 王秀贞, 戚纪纲, 杨占金 (other inventors requested that their names not be disclosed)
Owner 北京万里红科技有限公司