u/Independent-Flow3408

My vibe coded project hit 50K lines and Claude started hallucinating functions. Fixed it. Here's how.

Hit a wall that I think a lot of people here will recognise.

Started a project, Claude was flying. Added features for 3 weeks. Somewhere around 50,000 lines the whole thing started breaking down. Claude would suggest functions that didn't exist. Reference files it hadn't seen. Confidently import things from the wrong module.

I thought I was doing something wrong. Tried better prompts. Tried breaking the project into smaller files. Tried clearing context and starting fresh every session.

Nothing worked consistently.

Eventually figured out the real problem:

I was sending Claude 80,000 tokens of raw source code every session. The whole repo. Most of it completely irrelevant to what I was actually asking. Claude was getting lost in the noise.

The fix was obvious once I saw it.

Don't send the whole codebase. Send the map.

Function signatures. Import relationships. Type definitions. The skeleton of what exists and where — not the full implementation.

2,000 tokens instead of 80,000.
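To make "send the map" concrete, here is a minimal sketch of signature extraction for Python files using the standard-library `ast` module. It is not SigMap's actual implementation (which targets many languages); the function name `extract_signatures` is made up for illustration.

```python
import ast

def extract_signatures(source: str) -> list[str]:
    """Return one line per class or function in `source`, with bodies omitted."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            text = f"{prefix} {node.name}({ast.unparse(node.args)}){returns}"
            found.append((node.lineno, text))
        elif isinstance(node, ast.ClassDef):
            found.append((node.lineno, f"class {node.name}"))
    # ast.walk is breadth-first, so sort back into source order
    return [text for _, text in sorted(found)]

example = (
    "class Repo:\n"
    "    def scan(self, root: str) -> list:\n"
    "        return []\n"
    "\n"
    "def add(a: int, b: int) -> int:\n"
    "    return a + b\n"
)
for line in extract_signatures(example):
    print(line)
```

The output keeps names, parameters, and return types and drops everything else, which is where the token savings come from.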

Results across my projects:

→ Hallucinated function references: basically stopped happening
→ Claude finds the right file: 13% → 78% of the time (measured this properly)
→ Number of prompts to complete a task: cut by 40%

The workflow now:

  1. Run a scan of the codebase → generates a compact signature map
  2. That map auto-injects before every Claude / Cursor session via MCP
  3. Claude orients on the map first, then reads full files only when needed

Claude Code specifically became way more reliable after this. It stops guessing where things are.

If your vibe-coded project is growing and Claude is starting to lose the plot — this is probably why. The context window isn't the problem. What you're putting in it is.

Built a small CLI that automates the scan/inject part if anyone wants it: github.com/manojmallick/sigmap

Genuinely curious if others hit this at the same project size or earlier. What was the breaking point for you?

u/Independent-Flow3408 — 8 days ago

Local coding models need better repo context, not just bigger context windows

Local coding models have a repo-context problem.

When using Llama/Qwen/Mistral/Gemma for coding, the hard part is often not the model itself. It is getting the right files/functions into context without dumping too much raw source.

Long context helps, but it does not solve retrieval.

If the model never sees the right file, it still guesses.

I’ve been building SigMap, a zero-dependency CLI that creates a compact repo map for coding workflows.

Instead of sending raw source first, it extracts:

  • function signatures
  • classes/interfaces
  • exports
  • import relationships
  • ranked file matches per query
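As a rough illustration of the "import relationships" item, the sketch below builds a file-to-imports table for Python sources with the standard-library `ast` module. The names `imported_modules` and `import_graph` are hypothetical, and this is only an assumption about what such a table could look like, not how SigMap stores it.

```python
import ast

def imported_modules(source: str) -> list[str]:
    """Return the sorted module names that a Python source string imports."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # level counts the leading dots; module is None for "from . import x"
            modules.add("." * node.level + (node.module or ""))
    return sorted(modules)

def import_graph(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each file path to the modules it imports."""
    return {path: imported_modules(src) for path, src in files.items()}

files = {
    "app.py": "import os\nfrom pathlib import Path\nfrom .models import User\n",
    "models.py": "from dataclasses import dataclass\n",
}
print(import_graph(files))
```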

The workflow is simple:

repo map first → find likely files → read full source only where needed
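A minimal sketch of the "find likely files" step, assuming a plain word-overlap score between the query and each file's signatures. The scoring rule, the `repo_map` shape, and the function names are all assumptions made for illustration; the real ranking may work differently.

```python
import re

def tokenize(text: str) -> set[str]:
    """Split snake_case and camelCase identifiers into lowercase words."""
    return {part.lower() for part in re.findall(r"[A-Za-z][a-z0-9]*", text)}

def rank_files(query: str, repo_map: dict[str, list[str]], k: int = 5) -> list[str]:
    """Return up to k file paths whose signatures share the most words with the query."""
    query_words = tokenize(query)
    scored = []
    for path, signatures in repo_map.items():
        words = tokenize(path + " " + " ".join(signatures))
        scored.append((len(query_words & words), path))
    # highest score first, ties broken alphabetically by path
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for score, path in scored[:k] if score > 0]

repo_map = {
    "auth/session.py": [
        "def create_session(user_id: int) -> Session",
        "def revoke_session(token: str) -> None",
    ],
    "billing/invoice.py": ["def render_invoice(order: Order) -> str"],
    "utils/strings.py": ["def slugify(text: str) -> str"],
}
print(rank_files("where do we revoke a user session token", repo_map))
```

Only the files this returns would then be read in full, which is the "read full source only where needed" part.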

Benchmarked across 18 repos / 90 tasks:

  • 81.1% hit@5 vs 13.6% random baseline
  • ~6× better file retrieval
  • 96.9% token reduction in the benchmark setup
  • 41.4% fewer prompts per task
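For anyone unfamiliar with hit@5: a task counts as a hit when at least one of its relevant files shows up in the top 5 ranked results. Here is a minimal sketch of that metric under that standard definition; the benchmark suite's own scoring code may differ in details, and the sample data is invented.

```python
def hit_at_k(ranked: list[list[str]], relevant: list[set[str]], k: int = 5) -> float:
    """Fraction of tasks where a relevant file appears in the top-k ranked files."""
    hits = sum(1 for files, wanted in zip(ranked, relevant) if wanted & set(files[:k]))
    return hits / len(ranked)

# Three invented tasks: ranked results per task, and the files that were actually needed
ranked = [["a.py", "b.py"], ["c.py"], ["d.py", "e.py"]]
relevant = [{"b.py"}, {"x.py"}, {"d.py"}]
print(hit_at_k(ranked, relevant, k=5))   # 2 of 3 tasks hit
print(round(0.811 / 0.136, 2))           # lift over the random baseline, about 6x
```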

No embeddings. No vector DB. No npm dependencies.

This is not meant to replace LSPs, grep, agent search, MCP tools, or full-file reads.

It is meant to give local coding models / agents a cheap first-pass structure map before deeper inspection.

Repo: https://github.com/manojmallick/sigmap

Benchmark suite: https://github.com/manojmallick/sigmap-benchmark-suite

Curious how people here handle repo context with local coding models.

Are you mostly using grep/search, RAG, repo maps, MCP tools, or just relying on longer-context models?

Edit: Good point from the comments — SigMap core is model-agnostic. The docs currently look too focused on proprietary assistants, so I’ll add clearer examples for VSCodium/Open VSX, Continue, Cline/Roo Code, Aider, OpenHands, and local Ollama/llama.cpp workflows.

u/Independent-Flow3408 — 14 days ago
▲ 228 r/AI_Tools_Land+11 crossposts

Built a CLI that cuts AI coding token usage by 97% — 10k downloads, looking for feedback

[Updated]

Been building SigMap for 2 months. It fixes one specific problem: AI coding agents burn a lot of context on repo orientation. They search broadly, open full files, and rediscover structure in every session. SigMap generates a compact signature map first, so the agent can find likely relevant files before reading full source.

Results from benchmarking 18 real repos:

  • 81.1% hit@5 vs 13.6% random baseline
  • 96.9% token reduction
  • 41.4% fewer prompts per task
  • Task success: 10% → 59%
  • Tokens: 80,000 → 2,000
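For reference, the percentage follows from the before/after token counts as below. The round figures 80,000 and 2,000 work out to 97.5%, slightly above the 96.9% benchmark figure, presumably because the round figures are approximations. The helper name is made up for illustration.

```python
def token_reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens saved when going from `before` to `after`."""
    return 100 * (before - after) / before

print(token_reduction_pct(80_000, 2_000))  # 97.5
```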

npx sigmap — zero deps, 10 seconds, no config

https://github.com/manojmallick/sigmap

What would you add to make this more useful?

u/Independent-Flow3408 — 16 days ago

EDIT: Updated framing based on community feedback. The accurate stat is retrieval accuracy: 13.6% → 81.1% hit@5 across 18 repos, 90 tasks. The hallucination framing was imprecise. Full benchmark methodology: https://github.com/manojmallick/sigmap-benchmark-suite

Been building SigMap for 2 months. It fixes one specific problem: AI coding tools read your entire codebase every session (~80k tokens). SigMap extracts only function signatures and sends ~2k tokens instead.

Measured results across 18 repos / 90 tasks:

  • 81.1% retrieval hit@5 vs 13.6% random (6× lift)
  • 96.9% token reduction
  • 13/18 repos overflow context without SigMap → 0/18
  • 41.4% fewer prompts per task
  • Task success: 10% → 59%

npx sigmap — zero deps, 10 seconds, no config

→ Git: https://github.com/manojmallick/sigmap

→ Benchmarks: https://github.com/manojmallick/sigmap-benchmark-suite

→ Docs: https://manojmallick.github.io/sigmap

What would you add to make this more useful?

u/Independent-Flow3408 — 16 days ago

We completed an empirical study evaluating context extraction strategies across 405 diverse open-source repositories spanning 30+ programming languages.

Study Overview:

  • 405 repositories analyzed (30+ languages)
  • 2,025+ benchmark operations
  • 1.6M+ source files, 108M+ lines of code
  • 99.6% execution success, 100% data completeness

Key Findings:

  1. Language organization matters more than project size

    • Token reduction ranges: 76.5% to 99.9%
    • Size variation: 5 files → 38,667 files
    • What matters: code idioms, framework patterns, monorepo structure
  2. Monorepo patterns identified

    • 45 monorepos (11.1% of the 405-repo dataset)
    • Specialized handling yields 2–3% improvement
    • Indicates optimization potential for large-scale systems
  3. Language-specific breakdown

    • Python: 96.2% ± 1.8% (most consistent)
    • Go: 95.2% ± 2.1%
    • Rust: 94.8% ± 2.4%
    • Java: 94.5% ± 2.6%
    • JavaScript: 92.1% ± 4.2% (higher variability due to framework diversity)
  4. Methodology validation

    • Extended dataset (405 repos) shows the same 96.2% average as the earlier version (240 repos)
    • Confirms findings generalize across samples
    • Consistent behavior across languages
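The per-language "mean ± spread" figures in finding 3 can be reproduced from per-repo results with a short aggregation like the sketch below. It assumes the ± value is a sample standard deviation, which the post does not state, and the sample rows are invented, not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize_by_language(rows: list[tuple[str, float]]) -> dict[str, tuple[float, float]]:
    """Group (language, token_reduction_pct) rows and return (mean, std dev) per language."""
    groups = defaultdict(list)
    for language, reduction in rows:
        groups[language].append(reduction)
    summary = {}
    for language, values in groups.items():
        # stdev needs at least two values; report 0.0 for single-repo languages
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[language] = (round(mean(values), 1), round(spread, 1))
    return summary

# Invented rows for illustration only
rows = [("Python", 96.0), ("Python", 98.0), ("Go", 95.0), ("Go", 95.0), ("Rust", 94.0)]
for language, (avg, spread) in summarize_by_language(rows).items():
    print(f"{language}: {avg}% ± {spread}%")
```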

What's Included (Open Science):

This research includes:

  • Complete datasets (CSV, JSON, JSONL, SQL formats)
  • Research papers with methodology
  • Reproducibility scripts (clone, benchmark, finalize)
  • Hardware specs documented (c2-standard-8)
  • Expected variance < 2%
  • Step-by-step reproduction guide
  • CC-BY-4.0 license

Resources:

Related open-source materials:

  • Context extraction implementation
  • Documentation and setup guides
  • Benchmark dataset (405 repositories + reproducibility package)

(Links available on request to avoid self-promotion issues)

For researchers interested in:

  • Context extraction effectiveness
  • Language-specific code compression patterns
  • LLM integration in software engineering
  • Empirical software engineering methodology
  • Reproducible research practices

Questions and feedback welcome. Happy to share details or discuss methodology.

u/Independent-Flow3408 — 24 days ago

We completed an empirical study evaluating context extraction strategies across 405 diverse open-source repositories spanning 30+ programming languages.

Study Overview:

  • 405 repositories analyzed (30+ languages)
  • 2,025+ benchmark operations
  • 1.6M+ source files, 108M+ lines of code
  • 99.6% execution success, 100% data completeness

Key Findings:

  1. Language organization matters more than project size

    • Token reduction ranges: 76.5% to 99.9%
    • Size variation: 5 files → 38,667 files
    • What matters: code idioms, framework patterns, monorepo structure
  2. Monorepo patterns identified

    • 45 monorepos (11.1% of the 405-repo dataset)
    • Specialized handling yields 2-3% improvement
    • Significant optimization opportunity
  3. Language-specific breakdown

    • Python: 96.2% ± 1.8% (most consistent)
    • Go: 95.2% ± 2.1%
    • Rust: 94.8% ± 2.4%
    • Java: 94.5% ± 2.6%
    • JavaScript: 92.1% ± 4.2% (highest variability due to framework diversity)
  4. Methodology validation

    • Extended dataset (405 repos) shows the same 96.2% average as the published version (240 repos)
    • Confirms findings generalize across different samples
    • Robust methodology across languages

What's Included (Open Science):

This research includes:

  • Complete datasets (CSV, JSON, JSONL, SQL formats)
  • Research papers with methodology
  • Reproducibility scripts (clone, benchmark, finalize)
  • Hardware specs documented (c2-standard-8)
  • Expected variance < 2%
  • Step-by-step reproduction guide
  • CC-BY-4.0 license

Resources:

This work is part of the larger SigMap project:

  1. SigMap Tool (github.com/manojmallick/sigmap)

    • Context extraction implementation
    • Multi-language support (30+)
    • Production-ready
  2. SigMap Documentation (manojmallick.github.io/sigmap/)

    • Setup guides
    • API reference
    • Integration examples
  3. SigMap Benchmark Suite (github.com/manojmallick/sigmap-benchmark-suite)

    • 405-repo evaluation dataset
    • Research papers
    • Complete reproducibility package

For researchers interested in:

  • Context extraction effectiveness
  • Language-specific code compression patterns
  • LLM integration in software engineering
  • Empirical software engineering methodology
  • Reproducible research practices

Questions and feedback welcome. All code and data are open-source for academic use and beyond.

u/Independent-Flow3408 — 24 days ago

This benchmark is part of the larger SigMap project:

  1. SigMap Tool (github.com/manojmallick/sigmap)

    • Production-ready context extraction
    • Multi-language support
    • Full API for integration
  2. SigMap Documentation (manojmallick.github.io/sigmap/)

    • Setup guides
    • API reference
    • Integration examples
  3. SigMap Benchmark Suite (github.com/manojmallick/sigmap-benchmark-suite)

    • 405 repository evaluation
    • Research papers
    • Complete datasets

For implementation details → check the Tool repo
For usage guides → check the Docs
For evaluation & research → check the Benchmark Suite

u/Independent-Flow3408 — 24 days ago