An experiment testing whether AI language models can judge creative projects has sparked debate about the reliability of AI assessments. A user posting on Reddit's Artificial Intelligence forum asked ChatGPT, DeepSeek, and Gemini to evaluate their world-building project. All three assistants delivered positive scores, but the user was skeptical, wondering whether the high marks reflected genuine evaluation or simply pre-programmed positivity. The experiment raises questions about how suitable AI is for assessing the quality and originality of creative work. The discussion began on Reddit: https://old.reddit.com/r/artificial/comments/1n2bufl/are_ai_language_models_good_at_rating_world/