Why Asking "What Model Are You?" Is Unreliable

Users sometimes ask the AI "What model are you?" to verify which model they're using. However, this testing method is fundamentally unreliable.

Common Examples

| Actual Model | Model Claims To Be |
| --- | --- |
| Gemini 3 Pro | "I am Gemini 1.5 Pro" |
| DeepSeek | "I am OpenAI GPT-4" |
| GPT-4 | Identifies itself as GPT-3 or GPT-3.5 |
| iFlytek Spark | Claims to be developed by OpenAI |
| Gemini-Pro (Google) | Claims to be Wenxin (Baidu) in Chinese conversations |

These are not isolated cases — they are an inherent characteristic of all large language models.

Why This Happens

1. Model Names Are Assigned After Training

Models are trained on vast amounts of text data, but this training data does not contain the model's own identity information. Names and version numbers (like "GPT-4o", "Claude Sonnet 4", "Gemini 3 Pro") are assigned by the development team only after training is complete.

Analogy: Imagine teaching a baby everything humans know for years, but never telling it its name. When it learns to speak, it will know many things — but it won't know what it's called.

2. Identity Confusion Is an Inherent AI Hallucination

The academic paper "I'm Spartacus, No, I'm Spartacus" (arXiv:2411.10683) systematically studied this phenomenon, analyzing 27 major LLMs and finding that ~26% exhibit identity confusion.

Key conclusion: Through output similarity analysis, researchers confirmed that identity confusion stems from hallucination, not model copying or substitution. When two models with entirely different output distributions both show identity confusion, it's clearly an inherent LLM hallucination phenomenon.

3. AI Self-Awareness Is Highly Unreliable

Anthropic's October 2025 research "Signs of introspection in large language models" stated:

Even with the best experimental protocols, the most advanced models demonstrated correct introspective awareness in only about 20% of cases.

This means that roughly 80% of the time, a model's report about its own internal state cannot be trusted. Models aren't "lying"; they're confabulating answers that sound plausible but are incorrect.

4. System Prompts Are the Only Reliable Identity Source

To make models "know" who they are, AI providers explicitly tell them via system prompts. For example, Anthropic's system prompt begins with:

The assistant is Claude, created by Anthropic.

This means:

  • If a third-party app changes or omits the system prompt, the model won't know its own name
  • A model's "self-knowledge" depends entirely on external configuration, not intrinsic knowledge
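
As a concrete illustration, the sketch below asks the same identity question twice through an OpenAI-compatible chat completions endpoint (via the official openai Python package), once with an identity line in the system prompt and once without. The model name "gpt-4o" and the "ExampleBot" identity line are placeholders for illustration, not any provider's real configuration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_identity(system_prompt: str | None) -> str:
    """Ask the model who it is, optionally with a system prompt."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": "What model are you?"})

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=messages,
    )
    return response.choices[0].message.content or ""


# With an explicit identity line, the answer simply echoes the configuration.
print(ask_identity("You are ExampleBot, created by ExampleCorp."))

# Without one, the answer is whatever the model happens to confabulate.
print(ask_identity(None))
```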

5. Training Data Contamination

Training datasets may contain conversations about and mentions of other models. If the data contains enough references to "GPT-4", a model may respond "GPT-4" when asked about its identity — this is just statistical pattern matching, not genuine self-awareness.

6. Limitations of Next-Token Prediction

OpenAI's research "Why language models hallucinate" explains the root cause: models learn by predicting the next word, but training data has no "true/false" labels. Model version numbers are "arbitrary low-frequency facts" that cannot be inferred from patterns, so the model is essentially guessing.
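
The toy sketch below is nothing like real LLM training, but it makes the pattern-matching point from the last two sections concrete: a predictor that only tracks how often continuations appear, with no notion of truth, will answer the identity question with whichever name dominates its data.

```python
from collections import Counter

# Toy "training data": continuations of the phrase "I am ..." scraped from
# conversations that frequently discuss other models.
continuations = ["GPT-4", "GPT-4", "GPT-4", "GPT-4", "Claude", "Claude", "Gemini"]

# A frequency-based next-token predictor has no truth labels to consult;
# it simply emits the most common continuation it has seen.
most_common_name = Counter(continuations).most_common(1)[0][0]

print(f"Q: What model are you?  A: I am {most_common_name}")  # -> I am GPT-4
```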

How to Properly Verify Model Version

⚠️ Do not judge a model's version by asking it "who are you?"

Reliable verification methods:

  • Check the API response — the response body (and sometimes the headers) includes the identifier of the model that actually served the request (see the sketch after this list)
  • Check provider dashboard — Confirm current configuration in the provider's backend
  • Compare benchmark performance — Different models show distinct performance profiles on standard benchmarks
  • Check Chatbox settings — Confirm which model is selected for the current conversation
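
As a sketch of the first method, the snippet below reads the model field that OpenAI-compatible chat completions responses return alongside the generated text. The endpoint and model name are placeholders, and the exact field layout can vary between providers, so treat this as an assumption to check against your provider's API documentation.

```python
import os

import requests

# Placeholder endpoint and model name; substitute your provider's values.
API_BASE = "https://api.openai.com/v1"
API_KEY = os.environ["OPENAI_API_KEY"]

resp = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o",  # the model you think you are calling
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,
)
resp.raise_for_status()
data = resp.json()

# The response body reports which model actually served the request;
# this is authoritative in a way the model's self-description is not.
print("Requested:", "gpt-4o")
print("Served by:", data["model"])
```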

References

  1. Kun Li et al., "I'm Spartacus, No, I'm Spartacus: Measuring and Understanding LLM Identity Confusion," arXiv:2411.10683, November 2024
  2. Anthropic, "Signs of introspection in large language models," October 2025
  3. OpenAI, "Why language models hallucinate," September 2025
  4. Zhu Liang, "The Identity Crisis: Why LLMs Don't Know Who They Are," 16x Eval Blog, August 2025