u/CreativeKeane

Hey all, so I am currently exploring and playing around with Karpathy's LLM Wiki using Claude Code with Ollama and other routed models.

I want to create some agents and provide them with tools/plugins, libraries, MCPs, or harnesses to assist in mainly document/file curation and ingestion.

What are some tools that you guys are using for those things? Also, if there are any other useful tools, please let me know.

I don't mind creating some custom scripts for them if required. I prefer either free or affordable alternatives, but I'm open to paying if the paid tools are invaluable.

Honestly, this is fairly similar to the preliminary steps for RAG, so I'm sure folks have run into the same questions before.

Here are the tools I would be interested in and some options I am looking at for each category:

  1. Web Search - Ability for an agent or LLM to search for information online on its own, with references, and extract the results into markdown or text.
    • Current contenders: Kindly MCP, Perplexica + SearxNG, or CoexistAI
  2. Web Scraping - Extraction of content from an entire webpage, or from the whole website (by following the links it finds), when given an explicit URL.
    • Current contenders: Crawl4AI (Unclecode)
  3. Transcript Extraction from YouTube Videos - Feed LLM a YouTube link, and it extracts or pulls the transcript from the YouTube video.
    • Current contenders: Tubelab MCP, youtube-rag-scraper(rav4nn), youtubetranscribes
  4. Document Extraction/Ingestion - Take documents in various formats like Word Doc, Excel, PDF, and convert them into Markdown (that can further be processed or chunked)
    • Current contenders: MarkItDown (Microsoft)
  5. Documents with complex tables - May require manual page extraction, but the idea is similar to #4: how do you extract information from complex tables, or from tables in scanned documents?
    • Current contenders: OCR (Arrase), MistralOCR, LlamaParse
u/CreativeKeane — 25 days ago