The unequal treatment of demographic groups by ChatGPT/OpenAI content moderation
I have recently tested the ability of OpenAI content moderation system to detect hateful comments about a variety of demographic groups. The findings of the experiments suggest that OpenAI automated content moderation system treats several demographic groups markedly unequally. That is, the system classifies a variety of negative comments about some demographic groups as not…