u/HonestBackground9830

▲ 17 r/Rag

I built a Go CLI that compiles documents into GraphRAG knowledge bases packaged as zero-infra Docker containers.

Hey everyone,

I was tired of setting up Python, Redis, Pinecone, and FastAPI just to get a decent RAG agent running. I wanted something that felt more like a static site generator—where I compile my knowledge once, and then serve it anywhere with zero infrastructure.

So I built Kash.

It’s a Go CLI that takes your raw documents (PDFs, Markdown, txt) and compiles them into an embedded GraphRAG brain (using chromem-go for vectors and cayley for knowledge graphs). The final output is a lightweight Docker container (base size ~50MB) that you can ship and run anywhere.

Key Features:

  • Zero Infrastructure: No external databases required. Everything is embedded directly into the binary/container.
  • Provider Agnostic (BYOM): Works with any OpenAI-compatible API (Ollama, LiteLLM, Anthropic via proxy, OpenAI, etc.).
  • Hybrid RAG: Combines vector similarity with knowledge graph traversal for better context retrieval than vector search alone.
  • Three Interfaces out of the box:
    • REST API: Drop-in OpenAI replacement (plugs into Open WebUI, LibreChat, AnythingLLM).
    • MCP Server: Exposes your knowledge base as a tool directly inside IDEs like Cursor and Windsurf!
    • A2A Protocol: JSON-RPC for multi-agent frameworks like CrewAI (WIP).
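
To make the Hybrid RAG bullet concrete, here is a minimal stdlib-only sketch of the general idea: rank chunks by cosine similarity, then widen the result with one-hop graph neighbours. This is not Kash's actual code and it does not use chromem-go or cayley. The `Chunk` type, the `hybridRetrieve` helper, the toy embeddings, and the edge map are all hypothetical names and data made up for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Chunk is a hypothetical indexed text chunk with a precomputed embedding.
type Chunk struct {
	ID        string
	Embedding []float64
}

// cosine returns the cosine similarity of two vectors.
// It returns 0 when the lengths differ or either vector is all zeros.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns the IDs of the k chunks most similar to the query embedding.
func topK(chunks []Chunk, query []float64, k int) []string {
	type scored struct {
		id    string
		score float64
	}
	all := make([]scored, 0, len(chunks))
	for _, c := range chunks {
		all = append(all, scored{c.ID, cosine(c.Embedding, query)})
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].score > all[j].score })
	if k > len(all) {
		k = len(all)
	}
	ids := make([]string, 0, k)
	for _, s := range all[:k] {
		ids = append(ids, s.id)
	}
	return ids
}

// expand appends the one-hop graph neighbours of the seed IDs,
// keeping the seeds first and skipping duplicates.
func expand(seeds []string, edges map[string][]string) []string {
	seen := map[string]bool{}
	var out []string
	add := func(id string) {
		if !seen[id] {
			seen[id] = true
			out = append(out, id)
		}
	}
	for _, s := range seeds {
		add(s)
	}
	for _, s := range seeds {
		for _, n := range edges[s] {
			add(n)
		}
	}
	return out
}

// hybridRetrieve runs the vector search, then the graph expansion.
func hybridRetrieve(chunks []Chunk, query []float64, k int, edges map[string][]string) []string {
	return expand(topK(chunks, query, k), edges)
}

func main() {
	chunks := []Chunk{
		{ID: "a", Embedding: []float64{1, 0}},
		{ID: "b", Embedding: []float64{0, 1}},
		{ID: "c", Embedding: []float64{0.9, 0.1}},
	}
	// "a" is linked to "b" in the toy graph, so "b" is returned
	// even though its embedding is unrelated to the query.
	edges := map[string][]string{"a": {"b"}}
	fmt.Println(hybridRetrieve(chunks, []float64{1, 0}, 1, edges)) // prints [a b]
}
```

The point of the toy data is that chunk "b" scores zero on similarity but still reaches the context because the graph links it to the best vector hit. A real pipeline would likely also rank or cap the expanded set.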

🚀 Example: Running the Stargate Expert Agent

To show how this distribution model works, I compiled an expert agent pre-loaded with declassified CIA Stargate project documents.

You can run it on your machine right now with one command. You just bring your own API keys for the runtime queries—the vector and graph data is already baked into the image!

docker run -p 8000:8000 \
  -e LLM_BASE_URL="https://api.openai.com/v1" \
  -e LLM_API_KEY="sk-your-key-here" \
  -e LLM_MODEL="gpt-4o" \
  -e EMBED_BASE_URL="https://api.voyageai.com/v1" \
  -e EMBED_API_KEY="pa-your-key-here" \
  -e EMBED_MODEL="voyage-4" \
  redlord/stargate-expert:latest

Once it's running, it exposes an OpenAI-compatible endpoint at http://localhost:8000/v1. You can chat with it via curl:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What was the primary purpose of the Stargate project?"}]
  }'
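
For anyone who would rather call the endpoint from Go than from curl, here is a minimal sketch of the same request using only the standard library. It assumes the response follows the usual OpenAI chat-completions shape (`choices[0].message.content`), which the "drop-in OpenAI replacement" description suggests but which is not confirmed here. The helper names `buildBody`, `firstAnswer`, and `ask` are hypothetical. Like the curl example, it sends no auth header.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// chatResponse covers only the fields this sketch reads.
// Assumption: the server returns the standard OpenAI response shape.
type chatResponse struct {
	Choices []struct {
		Message message `json:"message"`
	} `json:"choices"`
}

// buildBody encodes a single user question in the same JSON shape as the curl example.
func buildBody(model, question string) ([]byte, error) {
	return json.Marshal(chatRequest{
		Model:    model,
		Messages: []message{{Role: "user", Content: question}},
	})
}

// firstAnswer extracts the first choice's text from an OpenAI-style response body.
func firstAnswer(body []byte) (string, error) {
	var r chatResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", err
	}
	if len(r.Choices) == 0 {
		return "", errors.New("response contained no choices")
	}
	return r.Choices[0].Message.Content, nil
}

// ask posts the question to baseURL + "/chat/completions" and returns the answer text.
func ask(baseURL, model, question string) (string, error) {
	body, err := buildBody(model, question)
	if err != nil {
		return "", err
	}
	client := &http.Client{Timeout: 60 * time.Second}
	resp, err := client.Post(baseURL+"/chat/completions", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s: %s", resp.Status, raw)
	}
	return firstAnswer(raw)
}

func main() {
	// Same base URL, model, and question as the curl example.
	answer, err := ask("http://localhost:8000/v1", "gpt-4o",
		"What was the primary purpose of the Stargate project?")
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	fmt.Println(answer)
}
```

If the container is not running, the program prints the connection error instead of an answer.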

Or better yet, connect it to Cursor via MCP by adding http://localhost:8000/mcp to your Cursor settings!

Try it yourself

If you're interested in building your own expert agents from your company docs, wikis, or study notes and distributing them as Docker containers, the code is fully open-source (MIT).

GitHub Repo: https://github.com/akashicode/kash

Would love to hear your thoughts, feedback, or any issues you run into!

u/HonestBackground9830 — 1 month ago