r/Thoth_AI

Qwen 3.6 27B
▲ 63 r/Thoth_AI+5 crossposts

Qwen 3.6 27B

Qwen 3.6 27B has quietly become my daily driver in Thoth.

It fits perfectly into my RTX 5090’s 32GB VRAM, which means I get a proper local model running fast enough for real daily use.

No API round trips. No sending private context away. Just 100% local, 100% private AI.

This is exactly why Thoth is designed local-first: your assistant, memory, tools, workflows, and data should live on your machine by default, with cloud models as an option, not a dependency.

Curious to know your experience with it.

github.com
u/Acceptable-Object390 — 4 days ago
▲ 54 r/Thoth_AI+6 crossposts

Everyone's been talking about memory for agents and claiming that a bunch of .md files thrown into a repo is memory. That is NOT memory, that is good for human readability but agent Memory should be hybrid.

I asked Thoth to explain its memory system, so people know what memory must be like:

"Syd:

Tell the nice people on X how your memory works in detail.

Thoth:

🔧 Using 🪞 Thoth Status…

✅ 🪞 Thoth Status done

Certainly, sir — here’s a detailed X thread for the nice people on X:

1/

How Thoth’s memory works 🧠

Most AI apps treat memory like a long chat log.

Thoth does something different: it builds a local, structured knowledge graph about you, your projects, preferences, people, files, deadlines, and recurring workflows.

2/

Memory starts with entities.

Instead of storing “random notes”, Thoth creates typed entities like:

- User

- Projects

- People

- Organisations

- Places

- Preferences

- Events

- Skills

- Concepts

- Media

Each entity can grow over time as new facts are learned.

3/

Example:

If you say:

“My name is Siddharth, I live in London, and I’m working on Thoth.”

Thoth doesn’t just bury that in a transcript.

It updates the User entity, creates or updates London and Thoth, then links them with relationships like:

- User → lives_in → London

- User → works_on → Thoth

4/

The relationship layer is the important bit.

Thoth stores not just facts, but connections:

- Sarah → mother_of → User

- User → prefers → dark roast coffee

- Meridian API v2 → uses → TypeScript

- Deadline → deadline_for → Project

- Project → part_of → Organisation

This makes recall much more precise.

5/

That means memory is queryable by meaning and structure.

You can ask:

“What do you know about my work?”

Thoth can traverse the graph from User → projects → deadlines → organisations → technologies.

It is not just keyword-searching old chats.

6/

There are two kinds of retrieval:

  1. Automatic recall

    Relevant memories are pulled into context before replies.

  2. Explicit graph search

    Thoth can search or explore the memory graph when you ask broader questions like “tell me about my family” or “what are my current projects?”

7/

Memory is local-first.

The knowledge graph lives on your machine, under Thoth’s local data directory.

The point is not to build yet another cloud profile of you.

The point is to give your own assistant durable context that you control.

8/

Thoth also has a Dream Cycle.

At night, it can refine memory by:

- consolidating duplicate facts

- enriching entities

- inferring useful relationships

- surfacing system insights

- improving long-term recall quality

So memory gets cleaner over time instead of turning into a junk drawer.

9/

It also generates an Obsidian-compatible wiki from the graph.

Every memory entity can become a markdown page with links to related entities.

So your assistant memory is not trapped in a black box — it becomes a personal wiki you can browse.

10/

Memory is not just for trivia.

It powers workflows:

- “Remind me about that project deadline”

- “What did we decide about the API?”

- “Use my usual style”

- “Summarise my work priorities”

- “Draft this based on what you know about Thoth”

The assistant becomes context-aware across sessions.

11/

Thoth is careful about what it saves.

It should remember durable things:

- names

- relationships

- preferences

- projects

- deadlines

- recurring workflows

- important technical context

It should not save every throwaway request or transient detail.

12/

The goal is simple:

AI assistants should not feel like goldfish.

They should remember what matters, understand how things connect, and keep that memory under your control.

That’s why Thoth uses a local knowledge graph rather than just stuffing more chat history into context."

u/Acceptable-Object390 — 12 days ago
▲ 2 r/Thoth_AI+2 crossposts

Realtime voice in AI Assistants

Realtime voice is the missing I/O layer for personal AI computing.

Not “voice assistant” voice. Not wake-word, command-tree, smart-speaker voice.

I mean full-duplex, low-latency, speech-to-speech agents that can listen, interrupt, reason, call tools, update memory, and act inside your actual workspace.

That changes the shape of the computer.

Today, most AI agents are still trapped behind a text box. That works for prompts, but it is a bad interface for real life. The moment you are coding, walking, reading logs, cooking, driving, debugging a build, or thinking through a messy plan, the keyboard becomes friction.

Realtime voice removes that friction.

Technically, the stack is finally there: streaming audio over WebRTC or WebSocket, server-side VAD, barge-in, partial transcripts, speech-to-speech models, function calling, RAG, MCP tools, local app context, and policy gates around actions. The agent does not need to wait for a perfect paragraph. It can work from intent as it forms.

That matters because personal AI computing is not just “chat with a model.” It is a loop:

observe context

understand intent

ask the right clarification

use tools

change files

remember preferences

surface tradeoffs

wait for approval where needed

keep going

Voice makes that loop feel ambient instead of transactional.

The market is moving this way too. AI PCs are being built around NPUs, with Microsoft’s Copilot+ class requiring 40+ TOPS. Qualcomm, AMD, Intel, and Apple are all pushing local AI acceleration because latency, privacy, battery, and cost matter. At the same time, enterprises are moving from copilots to agents that execute workflows, not just summarize them.

But the real unlock is personal.

A voice-native agent can become the operating layer between you and your tools. “Open the bug from yesterday, check the failing test, compare it to main, and tell me if this is our change or upstream.” That should be a conversation while the agent works, not a prompt you rewrite three times.

This is where an agent like Thoth gets interesting.

Thoth already has the bones of a personal AI computer: workspace context, tool use, memory, code awareness, and agentic execution. Realtime voice turns that from something you operate into something you collaborate with continuously.

Less prompt crafting.

More thinking out loud.

Less UI ceremony.

More flow.

Realtime voice is coming to Thoth in the next release.

u/Acceptable-Object390 — 9 days ago
▲ 5 r/Thoth_AI+1 crossposts

Tried with lmstudio... Not going well.

Hello there folks!

So, yesterday I tried setting Thoth up for the first time and, since for some reason it wouldn't see my freshly installed ollama, I've connected it to a working lmstudio server.

When I chat with that server from other agents or from its chat interface I get answers and whatnot.

But the situation is quite different when using Thoth

All I get is that the request gets processed but the answer is always a generic welcome message.

I've tried adding tools etc, and that's kinda doing something (getting screenshots and web searches, even though those are not requested) but the chat part is still basic welcome messages.

I'd expect it to be some kind of lmstudio config issue, but I really can't get to the bottom of it. Any advice?

reddit.com
u/KKunst — 14 days ago
▲ 4 r/Thoth_AI+2 crossposts

Thoth v3.23.0 is live

Thoth v3.23.0 is live. Provider runtime is stronger, memory recall is smarter, and the UI is faster.

🧩 Provider‑qualified model identity across the whole app

🔍 Safer routing for custom OpenAI‑compatible endpoints with real probing

🛠️ Better handling of tool calls, reasoning fields, and transcript cleanup

🧠 Deterministic memory recall with hybrid search, graph expansion, audit metadata, and review states

⚡ Lighter Settings, faster transcript loading, async model picker, and safer UI rendering

🗄️ Task database repair with schema validation, in‑place fixes, and recovery commands

📡 More reliable channel routing and Wikipedia tool behavior

A stability and correctness release that tightens every runtime path.

reddit.com
u/Acceptable-Object390 — 14 days ago