3585. Cross-Lingual Debiasing of Large Language Models
Invited abstract in session WB-31: Learning Analytics and other Text Analytics tasks, stream Analytics.
Wednesday, 10:30-12:00, Room: 046 (building: 208)
Authors (first author is the speaker)
1. Manon Reusens, Faculty of Business and Economics, KU Leuven
2. Philipp Borchert, IESEG School of Management
3. Margot Mieskes, University of Applied Sciences Darmstadt
4. Jochen De Weerdt, Decision Sciences and Information Management, KU Leuven
5. Bart Baesens, Decision Sciences and Information Management, KU Leuven
Abstract
Bias detection and mitigation have received increasing attention in Natural Language Processing (NLP) over the last few years, mainly because of their societal implications. While research into debiasing techniques often focuses on English and on monolingual models, our research aims to provide insights into the cross-lingual transferability of these techniques. More specifically, we investigate what happens to the other languages when a multilingual model is debiased in one language. We examine several languages, including English, French, German, and Dutch. The CrowS-Pairs dataset includes stereotypes associated with historically disadvantaged groups in the United States and covers multiple types of bias, including gender, race, and religion. A French version of this dataset also exists, addressing stereotypes against specific demographic groups in France. For the other languages, we used translations of the dataset that were checked by native speakers. We then analyzed the effects of cross-lingual debiasing on multilingual BERT (mBERT). Using these translations of the CrowS-Pairs dataset, we find that cross-lingual debiasing is possible. Moreover, we identify SentenceDebias as the best-performing debiasing technique among those we evaluated. Finally, we find that the debiasing techniques that add an additional pretraining step are best employed on the lowest-resource languages.
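As a rough illustration of the kind of projection-based debiasing that SentenceDebias belongs to, the sketch below estimates a bias direction from paired sentence representations and removes it from new representations. It is a simplified sketch under stated assumptions, not the authors' implementation: the function names `bias_subspace` and `debias` are hypothetical, and small synthetic vectors stand in for mBERT sentence embeddings.

```python
import numpy as np


def bias_subspace(pair_embeddings, k=1):
    """Estimate a k-dimensional bias subspace (hypothetical helper).

    pair_embeddings has shape (n_pairs, 2, dim): each pair holds the
    representations of two sentences that differ only in the group term.
    """
    # Centering each pair on its own mean leaves only the within-pair difference.
    centered = pair_embeddings - pair_embeddings.mean(axis=1, keepdims=True)
    flat = centered.reshape(-1, centered.shape[-1])
    # The top right-singular vectors are the principal directions of the differences.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return vt[:k]  # shape (k, dim), rows are orthonormal


def debias(embeddings, subspace):
    """Remove the component of each embedding that lies in the bias subspace."""
    return embeddings - embeddings @ subspace.T @ subspace


# Toy data (assumption): 4-dimensional vectors replace real mBERT embeddings,
# and the "bias" is placed along the first axis on purpose.
rng = np.random.default_rng(0)
base = rng.normal(size=(5, 1, 4))
direction = np.array([1.0, 0.0, 0.0, 0.0])
pairs = np.concatenate([base + direction, base - direction], axis=1)

V = bias_subspace(pairs, k=1)
cleaned = debias(rng.normal(size=(3, 4)), V)
```

In a cross-lingual setting, the subspace would be estimated from sentence pairs in one language and then applied to representations of sentences in another language. Whether that transfer works is the empirical question the abstract addresses.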
Keywords
- Artificial Intelligence
- Analytics and Data Science
- Ethics
Status: accepted