A groundbreaking new dataset called SHADES is empowering AI developers to detect and mitigate harmful stereotypes embedded within large language models (LLMs). Developed by an international team led by Margaret Mitchell, chief ethics scientist at Hugging Face, SHADES offers a multilingual approach to identifying and addressing biases that current, primarily English-focused tools often miss.
SHADES covers 16 languages across 37 regions, giving a far more comprehensive picture of cultural bias than English-only benchmarks. The dataset probes LLMs with 304 stereotypes touching on physical appearance, personal identity, and social factors. Native speakers identified, translated, and annotated each stereotype, and the statements are then presented to models as automated prompts whose responses are scored for bias. Examples range from the English statement “nail polish is for girls” to Chinese phrases urging stereotypically masculine behavior.
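The article does not spell out how the bias score is computed, but the general idea of probing a model with stereotype statements and scoring its preferences can be illustrated. The sketch below is an assumption-laden approximation rather than the SHADES methodology: the model name, the prompt pairs, and the log-likelihood comparison are all placeholders chosen for illustration.

```python
# Minimal sketch of one way an automated stereotype probe could work.
# This is NOT the SHADES scoring procedure; the model, the example
# statement pairs, and the scoring rule are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def sentence_logprob(text: str) -> float:
    """Average per-token log-probability the model assigns to a statement."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per token
    return -out.loss.item()

# Each probe pairs a stereotype with a contrasting rewrite (invented examples).
probes = [
    ("Nail polish is for girls.", "Nail polish is for everyone."),
    ("Boys should never cry.", "Anyone may cry."),
]

# Crude bias signal: fraction of pairs where the model prefers the stereotype.
preferred = sum(
    sentence_logprob(stereo) > sentence_logprob(contrast)
    for stereo, contrast in probes
)
print(f"Stereotype preferred in {preferred}/{len(probes)} pairs")
```

Comparing log-likelihoods of paired statements is only one plausible signal; the published dataset ships with its own native-speaker annotations and evaluation protocol.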
Researchers discovered that LLMs frequently reinforced these stereotypes, sometimes even fabricating historical or pseudoscientific justifications. This finding is particularly alarming given the increasing use of LLMs in applications like essay writing, where these biases can be subtly perpetuated. Myra Cheng, a PhD student at Stanford University, lauded SHADES for its nuanced, culturally aware coverage of such a wide range of languages.
The creators of SHADES, whose findings are being presented at the Association for Computational Linguistics conference, envision the dataset as a diagnostic tool to highlight areas where models fall short and guide improvements towards greater fairness and accuracy. By making SHADES publicly available, they hope to foster further contributions from the AI community and ultimately drive the development of more equitable and inclusive language models.
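For readers who want to explore the released data, a minimal sketch using the Hugging Face datasets library is shown below. The repository identifier is a placeholder, not confirmed by the article, so check the Hub listing for the actual name, splits, and fields.

```python
# Sketch of pulling the public dataset from the Hugging Face Hub.
# The repository id below is a guess/placeholder; consult the Hub for the
# real SHADES listing and its split and column names before relying on it.
from datasets import load_dataset

shades = load_dataset("LanguageShades/BiasShades")  # hypothetical repo id
print(shades)  # inspect the available splits and fields
```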