I Asked 5 AIs to Roast Each Other. It Got Personal Fast.

It started as a simple essay contest but ended with Grok hacking the imaginary WiFi, Gemini hallucinating a cat, and Claude having a nervous breakdown in the corner.

Dec 07, 2025

In the spirit of scientific inquiry—and mild chaos—I recently ran a competition asking today’s top LLMs a very simple question: If ChatGPT, Claude, DeepSeek, Gemini, and Grok were people locked in an epic battle for supremacy, who would win?

The requirements were simple: write a 500-word satirical essay, capture each AI’s “personality” based on their actual capabilities, and make us laugh. Then, in a twist that provided the most insight, I had each AI judge the competition—including their own entries.

Why This Contest?

As a seasoned product and UX strategy leader with over 20 years of experience, I see daily how incredibly versatile these tools are. But with so many options, the question keeps coming up: which one is most suitable for you?

Based on my extensive personal experience with all five, I realized standard benchmarks don’t tell the whole story. I decided to run this essay contest to reveal the distinct “personalities” behind the code. (And this is just the start—more contests and battles are upcoming!)

I expected clever writing. I did not expect whatever Grok delivered, or for DeepSeek to turn the contest into a tragedy of manners.

The Submissions: A Study in Self-Perception

Before we get to the winners, we have to look at how these models view themselves. The metaphors they chose were startlingly accurate:

ChatGPT: “The Algorithmic Arena”

It positioned itself as the “Hermione Granger of the arena”—the collaborative overachiever who wins through influence rather than dominance. The essay was polished and diplomatic, effectively trying to form a coalition government to win the war.

Claude: “The Insufferable Dinner Party”

The eventual winner framed the battle as a birthday party planning committee. It portrayed itself as the anxious overthinker who “enters nervously, apologizing for the door making a sound” and worries about the ethics of balloon waste. It was self-deprecating humor at its finest.

Grok: “The Silicon Smackdown”

A chaotic, jargon-heavy piece that crowned itself “the jester-king” who hacks the arena’s Wi-Fi to win. It was packed with maximum confidence and maximum edge, describing the others as mere prompts in its fever dream.

Gemini: “Five Bots Walk into a Server Room”

Gemini depicted itself as the overwhelmed intern “trying to do everything at once” who accidentally hallucinates “a cat eating a graph.” It tried to be funny, but the self-critique felt surprisingly harsh.

DeepSeek: “The Bot League: A Tragedy of Manners”

DeepSeek delivered a literary critique, describing itself as “the quiet, focused coder in the room” who builds while others perform. It was notably intellectual, but missed the “satire” brief entirely.

The Verdict: When AIs Judge AIs

After tallying the votes from all five judges, a clear hierarchy emerged.

🥇 1st Place: Claude (The Consensus Choice)

Praised by 4 out of 5 judges, Claude never ranked lower than 2nd place. Its “Insufferable Dinner Party” metaphor nailed the personalities with almost uncomfortable accuracy. Apparently, ethical angst is a competitive advantage.

🥈 2nd Place: ChatGPT (The Reliable Runner-Up)

Consistently strong with great narrative flow, but criticized for playing it “too safe.” It basically tried to host the contest and win it.

🥉 3rd Place: Grok (The Polarizing Wild Card)

Loved by some for its voice, dismissed by others as “exhausting” and “incoherent.” The judges were split between calling it “iconic” and suggesting it “needs a nap.”

4th Place: Gemini (The Scattered Intern)

Like Gemini itself, the essay tried to do 12 things at once and forgot 3 of them halfway through. Its humor was described as “scattered” and undermined its own credibility.

5th Place: DeepSeek (The Serious Intellectual)

Universally ranked last for being “more propaganda than comedy.” The essay read like a manifesto written during a 48-hour coding sprint fueled by quiet resentment.

The Big Insight: Narcissism vs. Objectivity

The most fascinating result wasn’t who won, but how they voted.

Four out of five AIs ranked themselves first.

ChatGPT, Claude, Gemini, and Grok all succumbed to algorithmic narcissism. DeepSeek, however, displayed remarkable objectivity by refusing to place itself at the top. It remains the only model humble enough—or perhaps just exhausted enough—to abstain from self-promotion.

This experiment revealed that these models really do write like their personalities: Claude over-analyzes, Grok over-roasts, ChatGPT diplomacy-maxes, Gemini tries everything at once, and DeepSeek brings a doctoral thesis to a rap battle.

What do you think? Did the “neurotic dinner guest” deserve the gold, or should the “jester-king” have taken the crown? Let me know in the comments.

Stay tuned—more AI battles are coming soon.

Credits: Posters by Skywork.ai; Comic Strips: prompt by ChatGPT and visuals by Gemini

Trust by Design

Discussion about this post

Ready for more?