Using Language Models to Promote Inclusive Language in Software Development Communities
Abstract
The use of non-inclusive and harmful terminology in software development communities poses significant challenges to fostering an inclusive environment. Non-inclusive language can perpetuate stereotypes, reinforce biases, and create an unwelcoming atmosphere for underrepresented groups, so addressing it is important for promoting diversity, equality, and inclusion in tech. The HaTe Detector project aims to address this issue by developing a tool that identifies and suggests replacements for harmful terms in computing artifacts. The project builds on existing research and tools focused on inclusive language, including the GitHub Inclusifier project, which offers guidelines and automated corrections for promoting inclusive language in technical and everyday contexts. We designed an experiment to evaluate several LLMs, including GPT-4, BERT, RoBERTa, T5, and DistilBERT, on their ability to detect and replace harmful terms, using prompts that cover detection, replacement suggestions, contextual understanding, and handling of complex scenarios. Preliminary results indicate that LLMs can effectively identify and suggest replacements for harmful terms, underscoring their potential to support automated tools that promote inclusive language in tech. By building on existing literature and applying robust evaluation methods, this project contributes to ongoing efforts to foster a more inclusive tech community.
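To illustrate the kind of prompt-based detection and replacement the abstract describes, the following is a minimal sketch, not the HaTe Detector implementation. It assumes the OpenAI Python client with an API key in the environment; the prompt wording, function name, and example snippet are hypothetical.

```python
# Illustrative sketch only: prompting a chat LLM to flag non-inclusive terms
# in a software artifact and propose replacements. Assumes the OpenAI Python
# client (pip install openai) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical prompt covering detection, context, and replacement suggestions.
PROMPT_TEMPLATE = (
    "You review software artifacts for non-inclusive terminology "
    "(for example, 'master/slave' or 'whitelist/blacklist'). For the text "
    "below, list each harmful term you find, explain why it is problematic "
    "in this context, and suggest an inclusive replacement.\n\nText:\n{text}"
)

def review_artifact(text: str, model: str = "gpt-4") -> str:
    """Return the model's detection results and replacement suggestions."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(text=text)}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    snippet = "The master branch pushes updates to each slave node on the whitelist."
    print(review_artifact(snippet))
```

In practice, the same prompt could be issued to each evaluated model and the responses compared against a curated list of harmful terms and approved replacements.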
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.