TikTok for Developers
Beyond Hallucinations: Uncovering Linguistic Bias in Vision-Language Models
by Sirat Samyoun and Jian Du, Research Scientists, Privacy Innovation Lab
Research

Vision-language models don't just hallucinate objects—they also get stuck in language.

Consider the examples below. The models aren't inventing what isn't there. They're showing linguistic bias: a systematic over-reliance on language patterns that overrides visual evidence.

When Language Overrides Vision

VLMs are trained on millions of image-text pairs. They learn not only to see but to describe. Sometimes, however, their linguistic training overwhelms their visual reasoning. Instead of engaging with what's in the image, they fall back on statistical word associations.

Three Patterns of Linguistic Bias

From our analysis of model outputs, we've identified three clear patterns where language dominates vision:

Label Repetition Bias

"Goat, goat, goat, goat, goat, goat…" (17 times)

The model correctly identifies goats in a farm scene—but instead of describing the four actual goats, it repeats the label seventeen times. It's caught in a linguistic loop, prioritizing word frequency over visual counting.

Attribute Flooding Bias

"Boat, red, white, and blue, red, white, and blue, red, white, and blue"

Faced with a simple boat scene, the model repeats the same color descriptors again and again. The recycled adjectives crowd out any opportunity for richer visual description.

Lexical Root Propagation Bias

A third pattern we observe occurs when models latch onto a root word and generate unsupported variations—like seeing "airport" and outputting "airport vehicle, airport staff, airport operations" without visual evidence for each variant.

These aren't hallucinations—the models aren't inventing objects. But they're not fully engaging with what they see either. They're demonstrating a reasoning gap where language patterns dominate visual evidence.
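As a rough illustration, the surface symptoms of these three patterns can be caught with simple text heuristics alone. This is a sketch with illustrative thresholds, not the attribution method described in the next section:

```python
import re
from collections import Counter

def detect_repetition(caption: str, max_repeats: int = 3) -> dict:
    """Flag surface symptoms of the three bias patterns in a caption.

    Heuristics only: the thresholds here are illustrative choices,
    not values from the attribution method.
    """
    tokens = re.findall(r"[a-z]+", caption.lower())
    counts = Counter(tokens)

    # Label repetition: one token dominating the caption ("goat" x17).
    label_repetition = any(c > max_repeats for c in counts.values())

    # Attribute flooding: a repeated multi-word phrase ("red, white, and blue").
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    attribute_flooding = any(c > 1 for c in trigrams.values())

    # Root propagation: one token paired with many distinct following words
    # ("airport vehicle", "airport staff", "airport operations").
    bigrams = set(zip(tokens, tokens[1:]))
    heads = Counter(head for head, _ in bigrams)
    root_propagation = any(c >= 3 for c in heads.values())

    return {
        "label_repetition": label_repetition,
        "attribute_flooding": attribute_flooding,
        "root_propagation": root_propagation,
    }
```

Heuristics like these only see the text, which is exactly why they are insufficient: they can't tell a biased repetition from a scene that genuinely contains seventeen goats. Deciding that requires tracing words back to the image, as described next.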

Why Traditional Safety Misses This

Most content safety systems check for harmful or explicit content. Linguistic bias isn't harmful—it's statistically predictable and visually disengaged. The problem isn't what is said, but how often and why it's said.

The Attribution Approach: Asking "Why This Word?"

We detect linguistic bias by tracing each word back to its visual inspiration. Our method works by:

Step 1: Extract visual and text features

For each generated term, we extract patch-level visual features and token embeddings.
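As a toy stand-in for this step, the sketch below splits an image into fixed-size patches and projects each to a feature vector. A real pipeline would use the VLM's own vision tower; the random projection here is purely illustrative:

```python
import numpy as np

def extract_patch_features(image: np.ndarray, patch: int = 16,
                           dim: int = 64, seed: int = 0) -> np.ndarray:
    """Toy patch-level feature extractor.

    Splits an (H, W, C) image into non-overlapping patch x patch tiles
    and maps each tile to a `dim`-dimensional vector via a fixed random
    projection (a placeholder for a real vision encoder).
    """
    h, w, c = image.shape
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((patch * patch * c, dim))
    feats = []
    for i in range(0, h - h % patch, patch):
        for j in range(0, w - w % patch, patch):
            tile = image[i:i + patch, j:j + patch].reshape(-1)
            feats.append(tile @ proj)
    return np.stack(feats)  # shape: (n_patches, dim)
```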

Step 2: Compute attribution scores

Using Shapley values from cooperative game theory, we measure each image patch's contribution to the word choice.
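Exact Shapley values are exponential in the number of patches, so in practice they are approximated. The sketch below shows the standard Monte Carlo permutation estimator; `word_logprob` is a hypothetical callable (not a real API) that returns the model's log-probability of the generated word when only a given subset of patches is visible and the rest are masked:

```python
import random

def shapley_attribution(patches, word_logprob, n_samples=200, seed=0):
    """Monte Carlo estimate of each patch's Shapley contribution.

    `patches` is a list of patch indices; `word_logprob(subset)` is a
    hypothetical callable scoring the generated word given only the
    patches in `subset` (all others masked out).
    """
    rng = random.Random(seed)
    phi = {p: 0.0 for p in patches}
    for _ in range(n_samples):
        order = patches[:]
        rng.shuffle(order)                    # random coalition order
        prev = word_logprob(frozenset())      # empty coalition: blank image
        seen = set()
        for p in order:
            seen.add(p)
            cur = word_logprob(frozenset(seen))
            phi[p] += cur - prev              # marginal contribution of p
            prev = cur
    return {p: v / n_samples for p, v in phi.items()}
```

A word grounded in the image will concentrate high attribution on a few patches; a language-driven word spreads weak attribution everywhere.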

Step 3: Measure confidence drop

We compare confidence when the model sees the real image versus a blank input. A small drop means the word is language-driven, not vision-driven.

Step 4: Flag the gaps

Weak attribution + minimal confidence drop = linguistic bias.
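Steps 3 and 4 combine into a simple decision rule. The thresholds below are illustrative placeholders, not values from the method:

```python
def flag_linguistic_bias(attr_scores: dict, conf_image: float,
                         conf_blank: float, attr_threshold: float = 0.05,
                         drop_threshold: float = 0.1) -> bool:
    """Flag a word as linguistic bias (Steps 3-4).

    attr_scores: per-patch attribution scores for the word (Step 2).
    conf_image:  model confidence in the word given the real image.
    conf_blank:  model confidence in the word given a blank input.
    Thresholds are illustrative, not the method's actual values.
    """
    max_attr = max(attr_scores.values())       # strongest visual evidence
    confidence_drop = conf_image - conf_blank  # how much vision matters
    weak_attribution = max_attr < attr_threshold
    language_driven = confidence_drop < drop_threshold
    return weak_attribution and language_driven
```

A word the model is nearly as confident about with a blank image as with the real one owes its existence to language statistics, not to vision.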

The approach is training-free, model-agnostic, and adds minimal overhead—making it practical for real-time content pipelines.

Results: Detecting Bias Across Models

We evaluated across several leading open-source VLMs (LLaVA, Qwen-VL, Llama-Vision) on a curated benchmark. Our attribution method consistently identified all three bias patterns—label repetition, attribute flooding, and root propagation—improving detection accuracy by up to 8% over existing approaches that rely solely on similarity scores or attention aggregation. The method proved particularly effective for label repetition, where the gap between visual evidence and linguistic output is most pronounced, achieving over 87% accuracy in flagging excessive repetitions.

Why This Matters

While much research focuses on hallucination detection, linguistic bias presents a distinct challenge: it doesn't introduce false facts, but diminishes descriptive richness and reinforces repetitive patterns.

In creative and descriptive applications—from ad copy to alt-text generation—these biases produce outputs that are monotonous, predictable, and visually disengaged. By detecting linguistic bias alongside hallucinations, we can:

  • Expand safety evaluation beyond factuality to include language diversity and visual grounding
  • Guide models toward more varied, attentive descriptions that engage with what they see
  • Build generated content that is not only accurate but also engaging and trustworthy

This work complements hallucination detection by addressing a different failure mode—not when models invent, but when they repeat, flood, and propagate language without sufficient visual cause.
