u/Acceptable-Object390

▲ 2 r/ollama

RTX vs Apple Silicon

Local AI hardware is basically a religious war with better benchmarks.

NVIDIA RTX GPUs are the sports cars: fast VRAM, CUDA, absurd token throughput if the model fits.

Apple Silicon is the weirdly elegant camper van: unified memory means you can often fit much larger models locally, especially on something like an M4 Max with up to 128GB RAM.

So the tradeoff is simple:

RTX = faster kitchen

Mac = bigger fridge

I run Qwen 3.6 27B locally on an RTX 5090 inside Thoth because 32GB VRAM is the sweet spot for my daily driver setup: fast, private, and no API round trips.

But Thoth is designed local-first, not NVIDIA-first.

Ollama, llama.cpp, OpenAI-compatible local endpoints, the point is that your AI should run where you want it to run.

Your machine. Your models. Your memory. Your data. Cloud optional. Local by default.

reddit.com
u/Acceptable-Object390 — 17 hours ago
▲ 2 r/OpenSourceeAI+3 crossposts

.md files are not Memory

A folder of .md files is not memory.

It’s a storage dump.

Useful AI memory needs more than “search old notes and pray”:

- semantic recall, so related ideas surface even when wording differs

- entities, different terms for the same thing don’t become random blobs

- relationships, so the system knows how things connect

- provenance, so it can trace where facts came from

- correction + forgetting, because stale memory is worse than no memory

- background consolidation, because raw chat logs are mostly sludge

Thoth uses a local personal knowledge graph + FAISS semantic search + graph expansion + document ingestion + wiki export.

So yes, you can still get readable notes.

But underneath, the assistant isn’t just rifling through markdown like a raccoon in a filing cabinet.

It’s building structured personal context it can retrieve, update, connect, and reason over.

That’s the difference between “I saved your notes” and “I actually know what matters.”

Relevant references:

  1. FAISS docs: efficient similarity search and clustering of dense vectors.

    https://faiss.ai/

  2. Microsoft GraphRAG: combines text extraction, network analysis, LLM prompting, and summarisation for richer understanding of text datasets.

    https://www.microsoft.com/en-us/research/project/graphrag/

  3. GraphRAG survey on arXiv: graphs encode heterogeneous and relational information, making them useful for retrieval-augmented generation.

    https://arxiv.org/abs/2501.00309

  4. Thoth README memory features: personal knowledge graph, typed relations, FAISS semantic recall, graph expansion, document extraction, wiki export, Dream Cycle refinement.

    https://github.com/siddsachar/Thoth

u/Acceptable-Object390 — 2 days ago
▲ 1 r/softwarearchitecture+1 crossposts

Thoth Custom Tool Builder Architecture

Custom Tools in Thoth let you turn a repo, folder, or current Developer workspace into a reusable AI tool, without hand-writing manifests.

The flow is deliberately controlled:

Source → Inspect → Review → Test → Enable

https://github.com/siddsachar/Thoth

Thoth can inspect repo files and READMEs, propose useful commands, classify risk, validate command templates, run smoke tests, and then let you decide where the tool is available.

Developer-only? Fine.

Promoted to normal chat? Also fine, but never silently.

The core principle: repo code is not trusted by default.

Custom Tools are opt-in, reviewable, testable, disableable, and removable without deleting the source repo.

u/Acceptable-Object390 — 7 days ago
▲ 0 r/LangChain+2 crossposts

Thoth Developer Studio Architecture

Thoth’s Developer Studio is not trying to be a full IDE.

V3.22.0 adds Developer Studio. It’s a Codex-style agent workbench for real repo work:

link a local Git repo

inject compact workspace context

review code and diffs

make scoped edits

run tests

prepare branches, commits, and PRs

keep every risky action behind the right approval mode

https://github.com/siddsachar/Thoth

The important bit: the agent doesn’t just “chat about code”. It works inside a governed repo workspace with:

approval policies

sandbox/local execution modes

inspector snapshots

change ledgers

persistent todos

GitHub/PR helpers

safe revert paths

The goal is simple: give AI enough context to be useful, but enough boundaries to stay trustworthy.

u/Acceptable-Object390 — 6 days ago
▲ 5 r/OpenSourceeAI+4 crossposts

Thoth v3.22.0 just dropped and it turns the app into a real developer workbench

Developer Studio gives you a dedicated coding surface with repo linking, code threads, diffs, todos, test detection, Git operations, and a live inspector that stays in sync during long runs.

Custom Tools let you convert any repo into a tool. Thoth can inspect it, propose commands, validate them, test them, and promote them into your normal chat workflow.

Docker Sandbox adds a safe execution mode with persistent containers, network controls, and clean import paths so you can experiment without risking your actual repo.

Plus a long list of upgrades across workflows, Home status, chat streaming, Settings, onboarding, embeddings, and overall stability.

u/Acceptable-Object390 — 7 days ago
▲ 3 r/ollama

Thoth v3.22.0 just dropped and it turns the app into a real developer workbench

Developer Studio gives you a dedicated coding surface with repo linking, code threads, diffs, todos, test detection, Git operations, and a live inspector that stays in sync during long runs.

Custom Tools let you convert any repo into a tool. Thoth can inspect it, propose commands, validate them, test them, and promote them into your normal chat workflow.

Docker Sandbox adds a safe execution mode with persistent containers, network controls, and clean import paths so you can experiment without risking your actual repo.

Plus a long list of upgrades across workflows, Home status, chat streaming, Settings, onboarding, embeddings, and overall stability.

GitHub

u/Acceptable-Object390 — 7 days ago
▲ 3 r/OpenSourceeAI+4 crossposts

AI Assistant are becoming the Personal AI Operating layer

Most AI tools are single-purpose: chat, code, search, summarise, automate.

Thoth is built to act more like a personal AI operating layer.

It connects the pieces around you:

- your models: local, OpenAI, Anthropic, Google, xAI, OpenRouter

- your memory: knowledge graph, semantic recall, wiki vault

- your tools: browser, shell, Gmail, Calendar, files, documents

- your workflows: scheduled tasks, reminders, monitoring, multi-step pipelines

- your channels: Telegram, Slack, Discord, WhatsApp, SMS

- your creativity: image generation, video generation, Designer Studio

- your safety layer: approvals, tool boundaries, local-first storage

The point is not just to chat with an AI.

The point is to give AI a stable place to live:

a system that remembers, acts, automates, researches, designs, and works across your digital environment.

Models will keep changing.

Your personal AI layer should persist.

That is what Thoth is becoming:

an open-source, local-first AI operating system for your work, memory, tools, and workflows.

𓁟 Thoth

https://github.com/siddsachar/Thoth

u/Acceptable-Object390 — 11 days ago
▲ 3 r/OpenSourceeAI+3 crossposts

I’ve been building Thoth as a local-first AI assistant, and one of the biggest design goals has been simple:

Local models should not feel like second-class citizens.

A lot of AI apps technically “support Ollama” or “support local models”, but the actual architecture is still shaped around cloud APIs: huge static prompts, tool definitions dumped into the context, provider-specific assumptions, and workflows that quietly break once you move to a smaller local model.

Thoth takes a different approach. Local AI is not a bolt-on provider. It is one of the main paths the system is designed and tested against.

Github Repo

First-party Ollama support

Ollama is the first-party local runtime in Thoth.

If you want the simplest local setup, you can point Thoth at Ollama and run models from your own machine. On Linux, the launcher can even start ollama serve automatically when it is available, so the local model path is part of the normal startup flow rather than an advanced escape hatch.

The model layer is deliberately split from the rest of the assistant:

  • Ollama for local models
  • OpenAI-compatible custom endpoints for LM Studio, vLLM, LocalAI, private gateways, or self-hosted inference stacks
  • Optional cloud providers for people who want them
  • ChatGPT / Codex subscription support where available

The important bit is that the agent core does not need to care whether the model is local or cloud. It asks the model layer for a capable chat model, then the rest of the architecture adapts around the selected runtime.

Custom endpoints, not just one local runtime

Ollama is the default local story, but it is not the only one.

Thoth also supports custom OpenAI-compatible endpoints. That means you can use local or private model servers such as:

  • LM Studio
  • vLLM
  • LocalAI
  • llama.cpp-backed gateways
  • internal company inference endpoints
  • any compatible /v1/chat/completions style server

This matters because “local AI” is not one thing. Some people want a simple desktop model through Ollama. Others want a GPU box on their network. Others want a private inference gateway. Thoth tries to support the shape of local AI people actually use.

Context management is built for smaller models

Local models often have tighter context windows than frontier cloud models. So Thoth does not assume it can throw everything into the prompt forever.

The agent has explicit context management:

  • conversation summarisation around the high-water mark
  • hard trimming before the model context is exceeded
  • dynamic prompt construction based on what is actually needed
  • dynamic tool budgeting when context pressure gets high
  • memory recall that fetches relevant facts instead of dumping the whole knowledge base

That last point is important. Thoth has a local knowledge graph and vector recall system. Instead of adding every saved memory to every prompt, it retrieves the most relevant entities and relationships for the current task, then injects only that useful slice into the prompt.

For local models, this is the difference between “technically works” and “actually usable”.

Tool guides are dynamic, not one giant wall of text

Thoth has a large tool surface: browser automation, shell, files, Gmail, Calendar, tasks, trackers, documents, charts, weather, image/video generation, MCP tools, Designer Studio, and more.

A naive implementation would paste every tool guide into every prompt. That works poorly with local models. It wastes context, makes the prompt harder to follow, and increases the chance that the model picks the wrong tool.

So Thoth builds the system prompt dynamically.

The assistant receives:

  • the relevant identity and safety rules
  • the active model/provider context
  • the enabled skills
  • current task/workflow state
  • relevant memories
  • available tools
  • tool guidance that matches the situation

Under context pressure, lower-priority tool detail can be hidden or compressed so the model keeps the core instructions and the most relevant capabilities.

This is one of the less visible parts of local-model support, but it matters a lot. The prompt has to be shaped for the model that is actually running, not for an imaginary unlimited context window.

Local memory and local workflows

The local story is not only model inference.

Thoth’s memory system is local too:

  • SQLite entity/relation database
  • graph traversal for connected memories
  • FAISS vector search for semantic recall
  • Obsidian-compatible wiki export
  • document extraction into the knowledge graph
  • Dream Cycle consolidation for dedupe, enrichment, decay, inference, and insights

Workflows also run locally. Scheduled tasks, recurring automations, monitoring jobs, approval-gated pipelines, and persistent workflow threads all run from the local Thoth runtime.

So a local model can still use long-term memory, tools, workflows, and background automation. You are not just chatting with a model; you are running a local assistant stack.

Safety still works locally

Local-first does not mean reckless automation.

Thoth keeps safety gates around powerful tools:

  • filesystem sandboxing for workspace file access
  • shell command classification and approvals
  • confirmation for destructive actions
  • approval gates in workflow pipelines
  • MCP per-server and per-tool toggles
  • prompt-injection defences for web pages, documents, emails, and tool output

The model can be local, but the control layer remains explicit. The assistant should not get more dangerous just because it is running on your own machine.

Local-first testing

We test Thoth local-first because that is the harshest path.

Cloud models are usually more forgiving: bigger context windows, stronger instruction following, better tool-use reliability. If something only works on the biggest hosted models, it is not robust enough.

So local testing forces the architecture to be better:

  • prompts must be concise enough to fit
  • tool guides must be clear enough for smaller models
  • context trimming must not destroy the task
  • memory recall must return the right facts, not a pile of noise
  • workflows must survive multi-step execution
  • tool calls must be described consistently
  • provider switching must not break the agent loop

Ollama is treated as a first-class test path, not just a compatibility checkbox. The goal is that if a feature claims to work in Thoth, it should be tested against the local runtime path as well as provider models.

Why this matters

Local AI is not only about privacy, although that is a big part of it.

It is also about ownership.

Your assistant should not be tied to one model vendor. Your memory should not disappear when you switch providers. Your workflows should not depend on a single hosted API. Your tools, documents, automations, and knowledge base should remain yours.

That is the architecture Thoth is moving towards:

models are replaceable, but the assistant layer persists.

You can run Ollama locally, connect a private endpoint, use a cloud model when you want more power, or move between them. The same memory system, tool layer, workflows, safety gates, and UI remain in place.

That is the whole local AI story for Thoth: not just “we support local models”, but “the assistant is designed so local models can actually carry the product.”

u/Acceptable-Object390 — 14 days ago
▲ 7 r/OpenSourceeAI+4 crossposts

I’ve been working on Thoth, a free and open-source local-first AI assistant, and I wanted to explain how the Linux version actually works under the hood.

The short version: Thoth installs as a normal user-space Linux app, runs locally, opens in your browser by default, and keeps durable data on your machine.

The diagram breaks down the full flow:

  • one-line Linux installer
  • verified GitHub release tarball
  • XDG user install under ~/.local/share/thoth
  • launcher symlink at ~/.local/bin/thoth
  • browser-first startup with optional native window/tray support
  • local NiceGUI web app
  • LangGraph ReAct agent core
  • Ollama/local model support
  • optional cloud/provider models
  • local memory graph, FAISS recall, and Obsidian wiki export
  • workflows, browser automation, shell access, Designer Studio, channels, MCP tools, and safety gates

One thing I wanted to avoid was making Linux support depend on Docker or a heavy desktop runtime. The baseline path is deliberately simple:

curl -fsSL https://raw.githubusercontent.com/siddsachar/Thoth/main/installer/install-linux.sh | bash

That downloads the latest Linux tarball from GitHub Releases, checks the SHA256 from the release manifest, installs it into the user’s XDG paths, and creates the thoth command.

On launch, Thoth starts the local app server, picks an available local port, opens the UI in the system browser, and keeps app data in ~/.thoth. If desktop libraries are available, native window/tray support can be used too, but the default Linux path doesn’t require it.

The overall philosophy is:

Your data stays local by default. Models are your choice. Tools are explicit. Destructive actions are approval-gated.

Thoth can run fully local through Ollama, or you can opt into providers like OpenAI, Anthropic, Google, xAI, OpenRouter, etc. Durable data like memories, documents, workflows, conversations, browser profile, and wiki export remain local unless you explicitly surface them in the current conversation or tool output.

The GitHub repo is here if anyone wants to try it or inspect the code:

https://github.com/siddsachar/Thoth

Curious what people think of this Linux packaging approach - browser-first XDG tarball instead of Docker/AppImage/Flatpak - and whether there are parts of the architecture I should explain in more detail.

u/Acceptable-Object390 — 14 days ago
▲ 9 r/LangChain+2 crossposts

This release introduces the first real foundation for Buddy Companion, a local animated presence that reacts to what Thoth is doing. It also improves model selection, Vision handling, and startup reliability on Linux and Windows. The focus is expression, clarity, and stability across the whole app.

GitHub Repo

Below is a deeper breakdown for anyone who wants the technical details.

Buddy Companion Foundation

Buddy now has a real subsystem behind it. This includes:

  • a prompt‑generated Buddy architecture with a thread‑safe event bus
  • a deterministic behavior brain
  • persistent config and pack validation
  • Hatch art and motion generation
  • a canvas playback engine with effects
  • one in‑app Buddy that lives in the sidebar
  • a separate desktop overlay surface for systems that support it

Buddy listens to Thoth’s internal events. It reacts to chat streaming, thinking, tool calls, approvals, workflows, notifications, and voice state. The identity stays unified under Preferences so Buddy does not introduce a second name or persona. The UI focuses on state, personality, and motion.

A new route called /buddy-overlay supports the desktop Buddy window where native overlay helpers are available.

Motion, Packs, and UI Polish

This release ships with bundled first‑party motion packs: glyph, lumen, ember, pixel, sprout, and orbit. Hatch‑generated custom packs are copied into Thoth’s served assets so they behave like native packs.

Key improvements:

  • better prompts for Hatch generation so backgrounds, padding, and edges key cleanly
  • smoother transitions between idle, thinking, working, approval, success, and error
  • MP4 playback crossfades state changes and avoids jitter when loops restart
  • idle motion replays periodically without looking busy
  • Buddy can be dragged out of the sidebar and snaps back when released near the dock
  • Buddy returns home on restart instead of remembering stray positions

Settings for Buddy now use a dense layout similar to the Models tab. Pack selection uses preview tiles and clears stale overrides when switching back to bundled packs.

Hatch save and recovery

This part got a lot of attention:

  • saving Buddy settings now preserves newly generated Hatch art and motion
  • generated packs become selectable user packs
  • still‑only art remains valid when motion generation fails
  • users can delete generated packs
  • motion retry regenerates from the selected still without overwriting the pack manifest
  • new motion requests use provider‑compatible 5 second clips
  • full Buddy generation runs as a background job with progress and notifications
  • internal prompts stay private so user concepts do not turn into pose sheets
  • transparent stills are composited onto a stable background before video generation
  • older Hatch packs with overwritten manifests are recovered on load

Stopping a workflow immediately moves Buddy out of the running state.

Desktop Overlay Reliability

The desktop Buddy overlay is more stable now:

  • approval, denial, workflow, and error bubbles stay visible even in Quiet mode
  • bubbles survive rapid state changes
  • the overlay waits for the transparent document to paint before revealing
  • fallback window creation paths help when transparency or hidden‑window hints fail
  • startup guards prevent transient None values from crashing the overlay
  • no more snapshot pushes into deleted NiceGUI clients

Workflow state cleanup is also more accurate. Denials, timeouts, cancellations, and stops clear Buddy’s workflow state immediately. Successful multi‑step workflows emit a clear done state.

Models, Vision, and Settings Reliability

A lot of polish landed here:

  • Settings loads the provider catalog lazily and caps rows so huge catalogs do not crash
  • timers clean up properly when clients disconnect
  • local Ollama chat models appear even when their family is not in Thoth’s curated lists
  • Brain and Vision pickers now make it clear that catalog rows must be pinned first
  • Codex Vision pins keep their image‑input capability during Quick Choice refreshes
  • Codex Responses transport preserves multimodal image blocks
  • Vision model changes are validated against Quick Choices, local models, and provider catalogs
  • invented or unavailable model names are rejected with actionable guidance

There is now an explicit vision_model setting.

Linux and Startup Reliability

Linux users get a much more predictable startup path:

  • the launcher resolves symlink chains correctly so ~/.local/bin/thoth always starts the right version
  • packaged launches report startup log tails and child process exit details
  • THOTH_STARTUP_TIMEOUT is configurable
  • clearer hints for missing or broken native dependencies like OpenCV, FAISS, or NumPy
  • camera and screenshot capture degrade gracefully instead of blocking startup

Installer UX is also improved. Source builds support a simple bash build_linux_app.sh <version> command. Success messages now mention ~/.local/bin/thoth when the bin directory is not on PATH. Maintainer docs now distinguish unreleased tarball testing from the one‑line installer path.

Optional native packages like TorchCodec are detected and logged with concrete recovery commands. Transformers treats broken optional packages as unavailable instead of letting them crash startup.

Windows repair hardening

The Windows installer now replaces the embedded Python runtime during repair or upgrade. This prevents corrupted or manually installed packages from surviving an over‑the‑top reinstall.

Summary

v3.21.0 brings:

  • a real Buddy Companion foundation
  • cleaner motion, better UI, and more reliable generation
  • clearer model and Vision selection
  • stronger Linux startup and better diagnostics
  • safer Windows repair behavior

It is a mix of expression, stability, and quality of life improvements across the entire app.

u/Acceptable-Object390 — 15 days ago

Thoth is built around a simple product belief: ease of use and power shouldn’t be trade-offs.

Most AI tools force users into one of two camps. Some are simple, polished, and approachable, but they hide the deeper controls that advanced users need. Others are flexible and powerful, but they feel technical from the first click. Thoth is designed to bridge that gap.

The interface starts with the most familiar pattern: a conversation. Users can ask questions, drag in files, speak naturally, schedule reminders, browse the web, manage email, or work with documents without needing to understand the underlying system. For everyday use, Thoth feels like a helpful assistant that just gets things done.

But underneath that simple surface is a much deeper layer.

Thoth uses progressive disclosure to reveal complexity only when it becomes useful. A user can begin with a natural-language request, then gradually move into reusable skills, tool workflows, scheduled automations, approval gates, multi-step pipelines, browser control, shell access, model switching, and knowledge graph memory. The same product supports both quick tasks and serious power-user workflows.

This is the core UX principle behind Thoth: start simple, scale with the user.

The architecture is designed around three connected layers:

  1. Everyday UX: chat, natural-language actions, drag-and-drop files, voice input, and one-click workflows.
  2. Adaptive UX Engine: guided defaults, smart suggestions, memory-aware context, reusable skills, and approval gates.
  3. Power User Control: workflow pipelines, tool orchestration, browser and shell automation, model/provider switching, knowledge graph access, wiki integration, and plugin extensions.

The important part is that these aren’t separate modes or separate products. They’re part of one coherent interface. A beginner can stay in the simple layer forever. A technical user can go deeper. And someone can move between both as their needs grow.

Thoth’s goal isn’t to make AI feel simpler by removing capability. It’s to make advanced capability feel approachable.

That’s why the product is local-first, open-source, and built around user-owned data. The user keeps control, while the interface helps manage complexity instead of exposing it all at once.

reddit.com
u/Acceptable-Object390 — 16 days ago
▲ 4 r/AIDiscussion+6 crossposts

Thoth is built around a simple product belief: ease of use and power shouldn’t be trade-offs.

Most AI tools force users into one of two camps. Some are simple, polished, and approachable, but they hide the deeper controls that advanced users need. Others are flexible and powerful, but they feel technical from the first click. Thoth is designed to bridge that gap.

The interface starts with the most familiar pattern: a conversation. Users can ask questions, drag in files, speak naturally, schedule reminders, browse the web, manage email, or work with documents without needing to understand the underlying system. For everyday use, Thoth feels like a helpful assistant that just gets things done.

But underneath that simple surface is a much deeper layer.

Github Repo

Thoth uses progressive disclosure to reveal complexity only when it becomes useful. A user can begin with a natural-language request, then gradually move into reusable skills, tool workflows, scheduled automations, approval gates, multi-step pipelines, browser control, shell access, model switching, and knowledge graph memory. The same product supports both quick tasks and serious power-user workflows.

This is the core UX principle behind Thoth: start simple, scale with the user.

The architecture is designed around three connected layers:

  1. Everyday UX: chat, natural-language actions, drag-and-drop files, voice input, and one-click workflows.
  2. Adaptive UX Engine: guided defaults, smart suggestions, memory-aware context, reusable skills, and approval gates.
  3. Power User Control: workflow pipelines, tool orchestration, browser and shell automation, model/provider switching, knowledge graph access, wiki integration, and plugin extensions.

The important part is that these aren’t separate modes or separate products. They’re part of one coherent interface. A beginner can stay in the simple layer forever. A technical user can go deeper. And someone can move between both as their needs grow.

Thoth’s goal isn’t to make AI feel simpler by removing capability. It’s to make advanced capability feel approachable.

That’s why the product is local-first, open-source, and built around user-owned data. The user keeps control, while the interface helps manage complexity instead of exposing it all at once.

In short: Thoth is designed to be easy enough for everyday use, but powerful enough to become a personal AI operating layer for serious work.

u/Acceptable-Object390 — 16 days ago