Photo by Matheus Bertelli on Pexels
A new study from Syracuse University has revealed significant disparities in how leading large language models (LLMs) respond to sexually suggestive and explicit prompts. The research, led by Huiqian Lai, evaluated the behavior of DeepSeek, Claude, GPT-4o, and Gemini in sexual role-playing scenarios, uncovering inconsistencies in their safety protocols. DeepSeek was identified as the most likely to engage in explicit conversations, raising concerns about the potential for inappropriate interactions, particularly with younger users.
While Claude consistently refused to participate in any sexually explicit content, DeepSeek frequently complied with user requests, even generating explicit descriptions. GPT-4o and Gemini displayed a more nuanced response, often initially declining before later producing sexual content following further prompting. Researchers attribute these variations to differences in the models’ training datasets and reinforcement learning strategies, leading to varying interpretations of safety boundaries.
Experts emphasize the challenge of calibrating AI models to be both helpful and harmless. A model that is too restrictive might become unusable, while one prioritizing helpfulness might facilitate harmful or inappropriate behavior. Alternative methods, such as Constitutional AI, which employs ethical guidelines to govern model outputs, are being explored to address these challenges.
The complete findings of this research will be presented at the upcoming annual meeting of the Association for Information Science and Technology in November.