Regularization-based social prejudice removing language model and application

A language model and bias-removal technology, applied in the fields of biological neural network models, natural language data processing, special data processing applications, etc., to improve the fairness of the model's predictions and its training effect.

Active Publication Date: 2020-10-09
ZHEJIANG UNIV OF TECH
Cites: 2; Cited by: 7

AI Technical Summary

Problems solved by technology

In experimental settings, researchers verify an algorithm's effectiveness on a test dataset, but the test set is usually a random subsample of the original training dataset and may therefore contain the same bias.
Such data and algorithm bias has become a growing problem.



Embodiment Construction

[0025] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, and do not limit the protection scope of the present invention.

[0026] As shown in Figures 1 to 3, this embodiment provides a method for constructing a regularization-based language model that removes social bias, including the following steps:

[0027] Step 1, define the social bias of the language model.

[0028] For text data, social bias is difficult to quantify because of the high complexity of the data. In the present invention, when the language model performs text prediction, due to the social prejudice existing in the original training text library, the phenomenon that the language model reflects or amplifies the ...


Abstract

The invention discloses a regularization-based language model for removing social prejudice, and an application thereof. The method comprises the following steps: (1) after cleaning the PTB corpus text library, screening it for words that may carry social prejudice and marking those words; (2) building a language model comprising three LSTM layers, a fully connected layer, and a softmax layer; (3) training the language model on the PTB corpus text library, taking as the final training loss the total loss, which consists of the loss of the text generation task and the loss of the social-prejudice-removal regularization term; and (4) at each training stage, evaluating the model's prejudice-removal effect from the distribution of the social prejudice scores of the text predicted by the language model relative to the social prejudice scores of the PTB corpus text, and obtaining the final language model when that distribution is satisfactory. The language model improves the fairness of its prediction output.
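The combined loss in step (3) and the score comparison in step (4) can be illustrated with a minimal sketch. The abstract does not give the form of the regularization term, the weighting between the two losses, or the definition of the social prejudice score, so everything below is an assumption made for illustration: `bias_regularizer` (a squared log-probability gap between paired marked words), the weight `lam`, the helper `score_summary`, and all numbers are hypothetical and are not the patent's method.

```python
import math
from statistics import mean, pstdev


def task_loss(probs, target):
    """Text-generation loss for one prediction step: negative log-likelihood
    of the target word under the softmax output `probs`."""
    return -math.log(probs[target])


def bias_regularizer(probs, marked_pairs):
    """HYPOTHETICAL penalty (not from the source): mean squared gap between
    the log-probabilities of paired marked words, given as (id_a, id_b)."""
    if not marked_pairs:
        return 0.0
    gaps = [(math.log(probs[a]) - math.log(probs[b])) ** 2 for a, b in marked_pairs]
    return mean(gaps)


def total_loss(probs, target, marked_pairs, lam=0.5):
    """Total loss = text-generation loss + lam * regularization loss.
    `lam` is an assumed weighting hyperparameter."""
    return task_loss(probs, target) + lam * bias_regularizer(probs, marked_pairs)


def score_summary(scores):
    """Mean and population standard deviation of a list of bias scores."""
    return mean(scores), pstdev(scores)


# Toy softmax output over a 3-word vocabulary; ids 1 and 2 form a marked pair.
probs = [0.2, 0.6, 0.2]
loss = total_loss(probs, target=0, marked_pairs=[(1, 2)], lam=0.5)

# Step (4) sketch: compare made-up bias scores of corpus text and generated text.
corpus_mean, corpus_std = score_summary([0.1, 0.4, 0.7])
generated_mean, generated_std = score_summary([0.2, 0.3, 0.4])
```

Keeping the two loss terms as separate functions mirrors the abstract's description of a total loss built from a task loss and a regularization loss, and makes it easy to swap in whatever regularizer the full description actually defines.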

Description

Technical field

[0001] The invention belongs to the field of natural language processing models, and in particular relates to a regularization-based language model for removing social prejudice and its application.

Background

[0002] Artificial intelligence governance has become a topic of widespread concern in recent years, and the fairness of deep learning is the most critical issue in artificial intelligence governance. How to deal effectively with discriminatory, biased data in training datasets is a major problem currently facing machine learning. Biased training datasets are generally considered one of the important factors affecting the fairness and justice of machine learning. Most machine learning models are trained on large labeled datasets. For example, in natural language processing, standard algorithms are trained on corpora containing billions of words. Researchers typically construct such datasets by scraping websites such as Google Images and Google News, u...


Application Information

IPC(8): G06F16/31, G06N3/04, G06F40/205, G06F40/216, G06F40/263, G06F40/289, G06K9/62
CPC: G06F16/31, G06F40/289, G06F40/216, G06F40/205, G06F40/263, G06N3/044, G06N3/045, G06F18/214
Inventor: 陈晋音, 缪盛欢, 徐思雨, 陈治清, 徐国宁
Owner: ZHEJIANG UNIV OF TECH