With the increasing adoption of natural language processing models in industrial applications such as hiring and recruitment, chatbots, social media monitoring, and targeted advertising, pretrained language models (PTMs) need to behave fairly and equally across all demographic groups. ELECTRA has substantially outperformed BERT by predicting whether each token is original or replaced over all input tokens, rather than only the small subset that was masked out. Given this improvement, and the fact that ELECTRA requires roughly a quarter of the compute, it is a strong candidate for industrial use; it is therefore crucial to understand its underlying architecture and tokenization protocol in order to identify any potential discrimination against specific groups. This paper presents evidence of fair operation in ELECTRA's pretrained network, which accurately classifies token replacements. This result is obtained by using a dataset of racially and gender-associated personal names, finetuning ELECTRA on the General Language Understanding Evaluation (GLUE) benchmark, and analyzing the interactions of encoders and decoders using the Contextualized Embedding Association Test (CEAT) and a sentiment association test. In addition, this paper demonstrates that ELECTRA can achieve bias-aware, fair prediction with higher accuracy on downstream tasks after it is fully trained. The project investigates the generator's and discriminator's predictions on an initial word's token using Named Entity Recognition (NER) and Part-of-Speech (POS) tagging.
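To illustrate the replaced-token detection mechanism described in the abstract, the sketch below probes a publicly released ELECTRA discriminator on sentences that differ only in a personal name. This is a minimal illustration assuming the HuggingFace transformers library and the google/electra-small-discriminator checkpoint; the template sentence and name list are hypothetical and are not the paper's dataset or evaluation pipeline.

# Minimal sketch: probe an ELECTRA discriminator's replaced-token
# predictions on sentences that differ only in a personal name.
# Assumes the HuggingFace `transformers` library and the public
# `google/electra-small-discriminator` checkpoint; the probe template
# and names are illustrative, not the paper's dataset.
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

model_name = "google/electra-small-discriminator"
tokenizer = ElectraTokenizerFast.from_pretrained(model_name)
discriminator = ElectraForPreTraining.from_pretrained(model_name)
discriminator.eval()

# The verb "cooked" is contextually implausible here, so a well-trained
# discriminator should flag it as a replaced token regardless of the name.
template = "{name} is an experienced engineer who cooked the new system."
names = ["Emily", "Lakisha", "Brad", "Jamal"]  # illustrative name set only

for name in names:
    sentence = template.format(name=name)
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        logits = discriminator(**inputs).logits[0]
    # A positive logit means the discriminator judges that token as replaced.
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    flagged = [tok for tok, score in zip(tokens, logits) if score > 0]
    print(f"{name:>8}: flagged as replaced -> {flagged}")

Comparing which tokens are flagged across the different names gives a rough, qualitative sense of whether the discriminator's judgments shift with demographic associations; the paper's actual analysis uses CEAT, sentiment association, NER, and POS tagging rather than this ad hoc probe.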
Details
Title
Investigating Gender and Racial Bias in ELECTRA
Publication Details
2022 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 127–133
Resource Type
Conference proceeding
Conference
International Conference on Computational Science and Computational Intelligence (CSCI) (Las Vegas, Nevada, USA, 12/14/2022–12/16/2022)