I built a tool that converts technical PDFs into RAG-ready knowledge bases (Obsidian, AnythingLLM, LangChain)
I got tired of cleaning PDFs manually before feeding them into RAG pipelines, so I built a tool to automate it.

Upload a PDF → get clean Markdown, heading-aware chunks, and an Obsidian vault with backlinks.

Each chunk knows where it sits in the document:

```json
{
  "heading_path": "Chapter 3 > Functions",
  "tokens": 487,
  "has_code": true
}
```

It also has a CLI for batch processing and direct export to AnythingLLM.
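For anyone wondering what "heading-aware" means in practice, here is a rough sketch of the idea. This is not the tool's actual code, just an illustration under some assumptions: the input is already Markdown with `#`-style headings, the token count is a plain whitespace split standing in for a real tokenizer, and the name `chunk_markdown` is hypothetical.

```python
import json
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # Markdown code-fence marker


def chunk_markdown(markdown: str) -> list[dict]:
    """Split Markdown into one chunk per heading section, with metadata."""
    chunks = []
    stack = []  # (level, title) pairs for the headings currently in scope
    body = []   # lines collected for the current section
    in_fence = False

    def flush():
        # Emit the collected section using the heading stack as it stands now
        text = "\n".join(body).strip()
        if text:
            chunks.append({
                "heading_path": " > ".join(title for _, title in stack),
                "tokens": len(text.split()),  # assumption: whitespace count, not a real tokenizer
                "has_code": FENCE in text,
                "text": text,
            })
        body.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines inside a code fence are never treated as headings
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or deeper level before pushing the new one
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks


sample = "\n".join([
    "# Chapter 3",
    "Intro text.",
    "## Functions",
    "Define a function:",
    FENCE + "python",
    "# not a heading",
    "def f(): pass",
    FENCE,
])

for chunk in chunk_markdown(sample):
    print(json.dumps({k: chunk[k] for k in ("heading_path", "tokens", "has_code")}))
```

The heading stack is what gives each chunk its path: a new heading pops anything at the same or a deeper level, so a section under `## Functions` inside `# Chapter 3` ends up with `Chapter 3 > Functions`. A real pipeline would likely also split oversized sections and use a proper tokenizer.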