Language model training method, electronic device, storage medium and product
By applying pre-defined paradigm training and pseudo-label certainty estimation to a pre-trained language model, the method addresses the low accuracy of language models when few labeled samples are available, so that a high-precision language model can be trained from a small number of labeled samples.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: ALIBABA (CHINA) CO LTD
- Filing Date: 2023-03-31
- Publication Date: 2026-06-16
AI Technical Summary
With existing techniques, language models reach low accuracy when only a few labeled training samples are available, which makes it difficult to train a high-precision model.
Based on the target training task, the pre-trained language model is trained under a pre-defined paradigm on multiple first training sample corpora to obtain a teacher language model. The teacher language model then predicts pseudo-labels for multiple unlabeled second training sample corpora, and a certainty value is calculated for each pseudo-label. The easily separable sample corpora whose certainty values meet the threshold condition are selected for further training, and this continues until a student language model with high accuracy is obtained.
The accuracy of the language model is improved even with a small number of labeled samples. Multiple rounds of training, combined with certainty estimation of the pseudo-labels, yield a high-precision language model.
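The selection step in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the summary does not say how the teacher's sureness about a pseudo-label is scored, so the sketch assumes the score is the teacher's highest class probability. The names `select_confident_pseudo_labels` and `toy_teacher` and the threshold `0.9` are hypothetical, chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def select_confident_pseudo_labels(
    texts: Sequence[str],
    teacher_predict: Callable[[str], Sequence[float]],
    threshold: float = 0.9,  # hypothetical threshold, not taken from the patent
) -> List[Tuple[str, int, float]]:
    """Keep unlabeled texts whose pseudo-label score meets the threshold.

    Assumption: the score is the teacher's highest class probability;
    the patent summary does not specify the actual measure.
    """
    selected = []
    for text in texts:
        probs = list(teacher_predict(text))
        score = max(probs)              # how sure the teacher is
        pseudo_label = probs.index(score)  # class with the highest probability
        if score >= threshold:
            selected.append((text, pseudo_label, score))
    return selected


def toy_teacher(text: str) -> List[float]:
    """Hypothetical stand-in for a trained teacher model (two classes)."""
    if "good" in text:
        return [0.05, 0.95]
    if "bad" in text:
        return [0.92, 0.08]
    return [0.55, 0.45]  # ambiguous input: low score, will be filtered out


if __name__ == "__main__":
    unlabeled = ["good movie", "bad plot", "it exists"]
    for item in select_confident_pseudo_labels(unlabeled, toy_teacher):
        print(item)
```

In a full training loop, the selected pairs would be added to the student's training data, and the selection would be repeated over several rounds as the summary describes.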
Figure CN116401364B_ABST