u/Independent_Box_8206

I've been testing how ChatGPT/Claude/Gemini fabricate arXiv citations, and the most common failure mode surprised me. Sharing in case it's useful to others here.

The intuition is that fake citations have fake IDs — you paste the ID into arXiv, get nothing, done. That's the easy case.

The harder case: the model invents a plausible title, then attaches a REAL arXiv ID that belongs to a completely unrelated paper.

Concrete example from my testing:

Claimed: "Hierarchical Sparse Attention for Million-Token Context Windows" (arXiv:2403.18291)

Reality: 2403.18291 is "Towards Non-Exemplar Semi-Supervised Class-Incremental Learning"

The ID resolves. The arXiv link works. It passes every eyeball check and most reference-manager validation, because those typically only check whether the ID exists — not whether the ID's actual paper matches the claimed title.

So "does this ID exist" is the wrong question. The right one is "does the paper at this ID match what was cited."
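For anyone who wants to try the check themselves, here is a minimal sketch of the two questions side by side. It is an illustration under stated assumptions, not any particular tool's implementation. It assumes the public arXiv API endpoint (`export.arxiv.org/api/query` with `id_list`) returns an Atom feed with one `entry` per resolved ID and no `entry` when nothing resolves. The function names, the `difflib` similarity measure, and the 0.8 default threshold are all placeholder choices made up for this example.

```python
import re
import urllib.request
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher
from typing import Optional, Union

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the arXiv API


def fetch_atom(arxiv_id: str) -> bytes:
    """Fetch the raw Atom record for one ID (needs network access)."""
    url = f"https://export.arxiv.org/api/query?id_list={arxiv_id}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()


def parse_title(atom_xml: Union[str, bytes]) -> Optional[str]:
    """Return the first entry's title, or None if the feed has no entry."""
    entry = ET.fromstring(atom_xml).find(f"{ATOM}entry")
    if entry is None:
        return None
    title = entry.findtext(f"{ATOM}title")
    # Titles in the feed can be wrapped across lines; collapse whitespace.
    return " ".join(title.split()) if title else None


def normalize(title: str) -> str:
    """Lowercase and replace punctuation with spaces so formatting is ignored."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def check_citation(claimed_title: str, actual_title: Optional[str],
                   threshold: float = 0.8) -> str:
    """Classify a citation; the 0.8 threshold is an arbitrary placeholder."""
    if actual_title is None:
        return "id-not-found"        # the easy case: the ID does not resolve
    ratio = SequenceMatcher(None, normalize(claimed_title),
                            normalize(actual_title)).ratio()
    return "match" if ratio >= threshold else "title-mismatch"


# Usage (requires network):
# actual = parse_title(fetch_atom("2403.18291"))
# print(check_citation(
#     "Hierarchical Sparse Attention for Million-Token Context Windows", actual))
```

Keeping the fetch separate from the comparison means the matching logic can be tested offline, and the same `check_citation` could be pointed at titles from another metadata source.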

I built this title-vs-ID cross-check into a small free tool (link + 60s demo in comments, to respect self-promo rules). But I'm more interested in the research angle:

1. Has anyone characterized the distribution of these fabrication modes? (fully-fake / real-ID-wrong-title / real-paper-wrong-metadata / author-year-no-anchor)
2. Since most fabrications likely cite non-arXiv venues, would Crossref / Semantic Scholar cross-checking catch substantially more? A recent Lancet analysis found fabricated citations in biomedical literature rising ~12x since 2023, and that's all PubMed/journal territory.
3. What's a principled way to set the title-match threshold? Too strict and you flag real papers cited by shorthand ("BERT", "FlashAttention"); too loose and you miss the fabrications.
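On the threshold question, one possible starting point (a sketch, not a validated method) is to score each pair so that shorthand is not penalized, then pick the cutoff empirically from hand-labeled pairs. The code below assumes a score defined as the larger of a character-level `difflib` ratio and the fraction of claimed-title tokens found in the actual title. `match_score`, `pick_threshold`, and the three toy pairs are hypothetical names and data for illustration only; a real calibration set would need to be far larger.

```python
import re
from difflib import SequenceMatcher


def tokens(title):
    """Lowercased alphanumeric tokens of a title."""
    return re.findall(r"[a-z0-9]+", title.lower())


def match_score(claimed, actual):
    """Max of string similarity and token containment (an assumed heuristic)."""
    c, a = tokens(claimed), tokens(actual)
    if not c or not a:
        return 0.0
    ratio = SequenceMatcher(None, " ".join(c), " ".join(a)).ratio()
    # Containment: share of claimed tokens that appear in the actual title,
    # so a shorthand like "BERT" scores 1.0 against the full BERT title.
    containment = len(set(c) & set(a)) / len(set(c))
    return max(ratio, containment)


def pick_threshold(labeled):
    """Sweep cutoffs 0.01..1.00 and return the lowest one with the best accuracy.

    labeled: list of (claimed_title, actual_title, is_genuine) tuples.
    """
    scored = [(match_score(c, a), ok) for c, a, ok in labeled]

    def correct(t):
        return sum((s >= t) == ok for s, ok in scored)

    return max((i / 100 for i in range(1, 101)), key=correct)


# Toy labeled pairs, far too few to calibrate anything real.
toy = [
    ("BERT",
     "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding",
     True),
    ("FlashAttention",
     "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness",
     True),
    ("Hierarchical Sparse Attention for Million-Token Context Windows",
     "Towards Non-Exemplar Semi-Supervised Class-Incremental Learning",
     False),
]
threshold = pick_threshold(toy)
```

The obvious weakness is that containment alone lets a one-word claimed title such as "Attention" match many unrelated papers, so very short claimed titles probably need a separate rule (for example, requiring an author or year to agree as well).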

Curious if anyone's worked on this or seen good prior art.

[P] AI doesn't just fake citations — it attaches REAL arXiv IDs to fake titles
