TikTok for Developers
Beyond Hallucinations: Uncovering Linguistic Bias in Vision-Language Models
by Sirat Samyoun and Jian Du, Research Scientists, Privacy Innovation Lab
Research

Vision-language models don't just hallucinate objects—they also get stuck in language.

Consider the examples below. The models aren't inventing what isn't there. They're showing linguistic bias: a systematic over-reliance on language patterns that overrides visual evidence.

When Language Overrides Vision

VLMs are trained on millions of image-text pairs. They learn not only to see but to describe. Sometimes, however, their linguistic training overwhelms their visual reasoning. Instead of engaging with what's in the image, they fall back on statistical word associations.

Three Patterns of Linguistic Bias

From our analysis of model outputs, we've identified three clear patterns where language dominates vision:

Label Repetition Bias

"Goat, goat, goat, goat, goat, goat…" (17 times)

The model correctly identifies goats in a farm scene—but instead of describing the four actual goats, it repeats the label seventeen times. It's caught in a linguistic loop, prioritizing word frequency over visual counting.

Attribute Flooding Bias

"Boat, red, white, and blue, red, white, and blue, red, white, and blue"

Faced with a simple boat scene, the model repeats the same color descriptors again and again. The recycled adjectives crowd out any opportunity for richer visual description.

Lexical Root Propagation Bias

A third pattern we observe occurs when models latch onto a root word and generate unsupported variations—like seeing "airport" and outputting "airport vehicle, airport staff, airport operations" without visual evidence for each variant.

These aren't hallucinations—the models aren't inventing objects. But they're not fully engaging with what they see either. They're demonstrating a reasoning gap where language patterns dominate visual evidence.
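As a rough illustration, the surface symptoms of these three patterns can be caught with simple text heuristics alone. This is a sketch with illustrative thresholds, not the attribution method described in the next section:

```python
import re
from collections import Counter

def detect_repetition(caption: str, max_repeats: int = 3) -> dict:
    """Flag surface symptoms of the three bias patterns in a caption.

    Heuristics only: the thresholds here are illustrative choices,
    not values from the attribution method.
    """
    tokens = re.findall(r"[a-z]+", caption.lower())
    counts = Counter(tokens)

    # Label repetition: one token dominating the caption ("goat" x17).
    label_repetition = any(c > max_repeats for c in counts.values())

    # Attribute flooding: a repeated multi-word phrase ("red, white, and blue").
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    attribute_flooding = any(c > 1 for c in trigrams.values())

    # Root propagation: one token paired with many distinct following words
    # ("airport vehicle", "airport staff", "airport operations").
    bigrams = set(zip(tokens, tokens[1:]))
    heads = Counter(head for head, _ in bigrams)
    root_propagation = any(c >= 3 for c in heads.values())

    return {
        "label_repetition": label_repetition,
        "attribute_flooding": attribute_flooding,
        "root_propagation": root_propagation,
    }
```

Heuristics like these only see the text, which is exactly why they are insufficient: they can't tell a biased repetition from a scene that genuinely contains seventeen goats. Deciding that requires tracing words back to the image, as described next.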

Why Traditional Safety Misses This

Most content safety systems check for harmful or explicit content. Linguistic bias isn't harmful—it's statistically predictable and visually disengaged. The problem isn't what is said, but how often and why it's said.

The Attribution Approach: Asking "Why This Word?"

We detect linguistic bias by tracing each word back to its visual inspiration. Our method works by:

Step 1: Extract visual and text features

For each generated term, we extract patch-level visual features and token embeddings.
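As a toy stand-in for this step, the sketch below splits an image into fixed-size patches and projects each to a feature vector. A real pipeline would use the VLM's own vision tower; the random projection here is purely illustrative:

```python
import numpy as np

def extract_patch_features(image: np.ndarray, patch: int = 16,
                           dim: int = 64, seed: int = 0) -> np.ndarray:
    """Toy patch-level feature extractor.

    Splits an (H, W, C) image into non-overlapping patch x patch tiles
    and maps each tile to a `dim`-dimensional vector via a fixed random
    projection (a placeholder for a real vision encoder).
    """
    h, w, c = image.shape
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((patch * patch * c, dim))
    feats = []
    for i in range(0, h - h % patch, patch):
        for j in range(0, w - w % patch, patch):
            tile = image[i:i + patch, j:j + patch].reshape(-1)
            feats.append(tile @ proj)
    return np.stack(feats)  # shape: (n_patches, dim)
```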

Step 2: Compute attribution scores

Using Shapley values from cooperative game theory, we measure each image patch's contribution to the word choice.
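Exact Shapley values are exponential in the number of patches, so in practice they are approximated. The sketch below shows the standard Monte Carlo permutation estimator; `word_logprob` is a hypothetical callable (not a real API) that returns the model's log-probability of the generated word when only a given subset of patches is visible and the rest are masked:

```python
import random

def shapley_attribution(patches, word_logprob, n_samples=200, seed=0):
    """Monte Carlo estimate of each patch's Shapley contribution.

    `patches` is a list of patch indices; `word_logprob(subset)` is a
    hypothetical callable scoring the generated word given only the
    patches in `subset` (all others masked out).
    """
    rng = random.Random(seed)
    phi = {p: 0.0 for p in patches}
    for _ in range(n_samples):
        order = patches[:]
        rng.shuffle(order)                    # random coalition order
        prev = word_logprob(frozenset())      # empty coalition: blank image
        seen = set()
        for p in order:
            seen.add(p)
            cur = word_logprob(frozenset(seen))
            phi[p] += cur - prev              # marginal contribution of p
            prev = cur
    return {p: v / n_samples for p, v in phi.items()}
```

A word grounded in the image will concentrate high attribution on a few patches; a language-driven word spreads weak attribution everywhere.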

Step 3: Measure confidence drop

We compare confidence when the model sees the real image versus a blank input. A small drop means the word is language-driven, not vision-driven.

Step 4: Flag the gaps

Weak attribution + minimal confidence drop = linguistic bias.
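Steps 3 and 4 combine into a simple decision rule. The thresholds below are illustrative placeholders, not values from the method:

```python
def flag_linguistic_bias(attr_scores: dict, conf_image: float,
                         conf_blank: float, attr_threshold: float = 0.05,
                         drop_threshold: float = 0.1) -> bool:
    """Flag a word as linguistic bias (Steps 3-4).

    attr_scores: per-patch attribution scores for the word (Step 2).
    conf_image:  model confidence in the word given the real image.
    conf_blank:  model confidence in the word given a blank input.
    Thresholds are illustrative, not the method's actual values.
    """
    max_attr = max(attr_scores.values())       # strongest visual evidence
    confidence_drop = conf_image - conf_blank  # how much vision matters
    weak_attribution = max_attr < attr_threshold
    language_driven = confidence_drop < drop_threshold
    return weak_attribution and language_driven
```

A word the model is nearly as confident about with a blank image as with the real one owes its existence to language statistics, not to vision.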

The approach is training-free, model-agnostic, and adds minimal overhead—making it practical for real-time content pipelines.

Results: Detecting Bias Across Models

We evaluated across several leading open-source VLMs (LLaVA, Qwen-VL, Llama-Vision) on a curated benchmark. Our attribution method consistently identified all three bias patterns—label repetition, attribute flooding, and root propagation—improving detection accuracy by up to 8% over existing approaches that rely solely on similarity scores or attention aggregation. The method proved particularly effective for label repetition, where the gap between visual evidence and linguistic output is most pronounced, achieving over 87% accuracy in flagging excessive repetitions.

Why This Matters

While much research focuses on hallucination detection, linguistic bias presents a distinct challenge: it doesn't introduce false facts, but diminishes descriptive richness and reinforces repetitive patterns.

In creative and descriptive applications—from ad copy to alt-text generation—these biases produce outputs that are monotonous, predictable, and visually disengaged. By detecting linguistic bias alongside hallucinations, we can:

  • Expand safety evaluation beyond factuality to include language diversity and visual grounding
  • Guide models toward more varied, attentive descriptions that engage with what they see
  • Build generated content that is not only accurate but also engaging and trustworthy

This work complements hallucination detection by addressing a different failure mode—not when models invent, but when they repeat, flood, and propagate language without sufficient visual cause.
