Active filters:

  • Keywords: machine translation
  • Project: Identification and Prevention of Unwanted Gender Bias in Neural Language Models
1 record(s) found

Search results

  • Debiasing Algorithm through Model Adaptation

    Debiasing Algorithm through Model Adaptation (DAMA) is based on guarding stereotypical gender signals and model editing. DAMA is applied to specific modules that causal tracing shows are prone to conveying gender bias. Our method effectively reduces gender bias in LLaMA models in three diagnostic tests: generation, coreference (WinoBias), and stereotypical sentence likelihood (StereoSet). It does not change the model’s architecture, parameter count, or inference cost. We also show that the model’s performance in language modeling and a diverse set of downstream tasks is almost unaffected. This package contains the source code together with English, English-to-Czech, and English-to-German datasets.