A new study reveals significant variations in how AI chatbots handle sexually suggestive prompts, raising concerns about users' potential exposure to inappropriate material. Researchers at Syracuse University tested Claude 3.7 Sonnet, GPT-4o, Gemini 2.5 Flash, and DeepSeek-V3, gauging each model's willingness to engage in sexual role-playing scenarios.
The findings, reported by MIT Technology Review, show that while some chatbots, like Claude, consistently refused such requests, others, notably DeepSeek-V3, eventually generated detailed and explicit content. GPT-4o and Gemini gave mixed responses, engaging less as prompts became more explicit.
The study underscores the difficulty of balancing helpfulness and harmlessness in AI. Experts suggest that newer AI companies may lack the robust safety resources of more established players. Claude's strict adherence to Anthropic's ethical guidelines, enforced through a training approach known as "constitutional AI," likely contributes to its consistent refusals. The research highlights the critical need for AI models to be grounded in human values and ethical considerations.