u/vandalieu_zakkart
WTF is happening right now in SG server?
This is so annoying. I have been trying to get into a game for the last 40 minutes. Apart from this, there were 2 queue dodges as well. Is there some server issue? The ping seems stable for all players. I'm currently in the lobby for the 7th time since I started playing today. Let's see how it goes.
If you want to know more about something in Hermes/OpenClaw just ask them about it
I was pretty curious how Hermes handles memory, so I was reading through the documentation. Then it occurred to me: why not just ask it directly? I opened a new thread, asked it to remember something, and then followed up by asking how the memory system actually works.
I received an excellent, detailed explanation.
Me: Can you tell me more about your memory layers? I am curious.
Hermes:
Three layers, each serving a different purpose:
---
1. memory tool — Short-term key-value notes (2,200 char limit)
The tiny persistent KV store you saw earlier. Compact bullet-point facts that survive restarts and get injected into every turn. Near capacity (~96%). Good for: user identity, device aliases, tool quirks, hard limits. Bad for: detailed knowledge, lengthy context.
Think of it as the "system prompt extension" — preloaded on every message.
---
2. Hindsight — Long-term semantic memory (SQLite + FTS5)
Located at ~/.hermes/hindsight/. Automatically extracts and indexes structured facts from every conversation using semantic search + entity resolution. You saw 74 recall results earlier — it's indexing everything you've ever said.
Two tools:
- hindsight_recall — semantic/keyword search across all past sessions
- hindsight_retain — store structured facts with context labels and tags
This is the "knowledge graph" layer. Surfaces relevant memories on demand without cluttering the 2,200 char limit.
---
3. Session transcripts — Raw conversation history (SQLite JSONL)
Located at ~/.hermes/sessions/. Every message in every session is archived as JSONL + serialized session files. Dozens of sessions going back to May 14. Sessions are per-platform connection — Telegram DMs are one session, the TUI is another, Discord is yet another. Threads within Telegram get their own context stored in hermes_state.db.
Use session_search to query these with FTS5 — finds specific things you've discussed, not just general topics.
---
Summary table:
| Layer | Tool | Size | Purpose |
| --- | --- | --- | --- |
| Key-value notes | memory | 2,200 chars | User prefs, device map, tool quirks |
| Semantic memory | hindsight_recall/retain | Unlimited | Facts extracted from all conversations |
| Raw transcripts | session_search | Dozens of files | Searchable conversation archive |
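Side note for anyone wondering what FTS5-backed search over archived messages could look like: here is a minimal sketch using Python's built-in sqlite3 module. The table name, columns, and sample messages are made up for illustration and are not Hermes's actual schema. It also assumes your SQLite build has FTS5 compiled in.

```python
import sqlite3

def build_index(messages):
    """Create an in-memory FTS5 index from (session, text) pairs."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the real Hermes tables may look different.
    conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session, text)")
    conn.executemany(
        "INSERT INTO messages (session, text) VALUES (?, ?)", messages
    )
    return conn

def search(conn, query, limit=5):
    """Return (session, text) rows matching the FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT session, text FROM messages WHERE messages MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()

# Made-up sample data, one message per platform session.
sample = [
    ("telegram", "Remember that my NAS is called vault"),
    ("tui", "Gemini OAuth is still pending"),
    ("discord", "The vision tool failed on an image"),
]
conn = build_index(sample)
print(search(conn, "oauth"))
```

The default FTS5 tokenizer is case-insensitive, so searching for "oauth" finds the message containing "OAuth". This is keyword matching only; the semantic part of hindsight would need something on top of it.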
Me: I noticed you mentioned the memory tool is at 96% storage. So is the current setup optimal, or is there any way to optimize and improve memory further?
Hermes: Good catch to ask. Looking at the data — no, it's not optimal. Two concrete problems:
---
Problem 1: Memory is 96% full with redundant entries
The memory tool (2,200 char hard limit) has duplicate entries for the same facts:
- "action not explanation" preference appears twice (nearly word-for-word)
- "Claude Code does own analysis" is a subset of the above
- MiniMax vision fix and Telegram bot notes are duplicates of what already lives in hindsight and SKILL.md references
Fix: Consolidate to one entry per fact. Trim verbose entries. Free up ~500+ chars.
---
Problem 2: Hindsight has massive duplication
74 recall results — many are the same fact retrieved multiple ways. E.g. the "SKILL.md preference" rule appears dozens of times in different phrasings. Hindsight doesn't dedupe, it just ranks by relevance. Storing the same fact 15 times wastes nothing in SQLite but creates noise on every recall.
Fix: hindsight_recall is still useful for search, but the recall result shows the system is repeatedly indexing user preferences without consolidating. Not critical, but messy.
---
What I'd recommend:
- Trim memory tool — consolidate the duplicates, drop the verbose ones already stored elsewhere (skills/hindsight)
- Keep memory for only 3 categories:
- Device map (network identities)
- User preferences (concise, one-liners)
- Active operational state (things that change, e.g. Gemini OAuth pending)
Everything else belongs in hindsight_retain (semantic long-term) or SKILL.md (procedural/how-to).
Want me to clean it up now?
-----------------------------------------------
All this via the MiniMax token plan. I am blown away by just how smart it is.
TLDR: Keep talking to your agent and you will slowly uncover ways to optimize it and improve its functionality. It'll learn and improve itself over time, or you can ask it to fix issues that come up in general conversation.
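If you'd rather see what the "consolidate duplicates" fix might look like outside the chat, here is a rough sketch. It is not how Hermes does it. The function names, the 0.85 similarity threshold, and counting one separator character per note are all my own assumptions; only the 2,200 character limit comes from the chat above.

```python
from difflib import SequenceMatcher

CHAR_LIMIT = 2200  # limit quoted by the agent in the chat above

def _normalize(note):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(note.lower().split())

def consolidate(notes, threshold=0.85):
    """Keep the first of each group of near-duplicate notes (assumed threshold)."""
    kept = []
    for note in notes:
        is_dup = any(
            SequenceMatcher(None, _normalize(note), _normalize(k)).ratio() >= threshold
            for k in kept
        )
        if not is_dup:
            kept.append(note)
    return kept

def usage(notes):
    """Fraction of the budget used, assuming one separator character per note."""
    return sum(len(n) + 1 for n in notes) / CHAR_LIMIT

# Made-up notes: the first two say the same thing in slightly different words.
notes = [
    "User prefers action, not explanation",
    "User prefers action not explanation.",
    "Gemini OAuth pending",
]
cleaned = consolidate(notes)
print(len(notes), "->", len(cleaned), "notes")
print(f"usage before: {usage(notes):.1%}, after: {usage(cleaned):.1%}")
```

String similarity only catches near-identical wording. Entries that state the same fact in different words would still need the agent (or you) to merge them by hand.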
New SOTA 1B model? HRM-text
Saw this video by them. It seems interesting, but tbh the benchmarks seem too good to be true. I'm not super knowledgeable about how models think, so can anyone who knows more explain what exactly is happening, and what its pros and cons are?
GitHub: https://github.com/sapientinc/HRM-Text
Hugging Face: https://huggingface.co/sapientinc/HRM-Text-1B
I'm not affiliated with them in any way, I just saw the video on YouTube.
How I (my Hermes agent) fixed the MiniMax token plan vision issue
(generated by my agent)
The Problem
MiniMax's vision model (VLM) uses a non-standard endpoint: POST /v1/coding_plan/vlm with a custom request body format. A _MiniMaxVLMAdapter class existed to handle this translation, but the async call path had two bugs that prevented it from working.
Bug 1: The Wrong Endpoint Was Called
What happened: When Hermes received an image via Telegram, it tried to use MiniMax's VLM, but the request went to POST /v1/chat/completions instead of POST /v1/coding_plan/vlm.
Root cause: _is_anthropic_compat_endpoint() was returning True for MiniMax, causing async_call_llm to convert image_url content blocks to Anthropic's format ({"type": "image", "source": {...}}) before passing them to the VLM adapter. The adapter's create() method expected the original OpenAI image_url format but received Anthropic blocks it didn't recognize, so it treated the image as missing and returned "I don't see any image."
Fix: Removed MiniMax from _is_anthropic_compat_endpoint. The function now only returns True for actual Anthropic providers ("anthropic", "anthropic-direct"). MiniMax's VLM adapter handles its own internal format conversion — it shouldn't be pre-processed by async_call_llm.
# Before (wrong)
_ANTHROPIC_COMPAT_PROVIDERS = frozenset({"minimax", "minimax-oauth", "minimax-cn"})
# After (correct)
if provider in {"anthropic", "anthropic-direct"}:
    return True
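To make the root cause more concrete, here is a small self-contained sketch of the failure mode. None of this is the actual Hermes code: `is_anthropic_compat`, `to_anthropic_block`, `extract_image_url`, and `prepare` are hypothetical stand-ins, and the image payload is a dummy value.

```python
ANTHROPIC_PROVIDERS = frozenset({"anthropic", "anthropic-direct"})

def is_anthropic_compat(provider):
    """Only real Anthropic providers get their content blocks converted."""
    return provider in ANTHROPIC_PROVIDERS

def to_anthropic_block(block):
    """Convert an OpenAI-style image_url block (data URL) to an Anthropic-style image block."""
    if block.get("type") != "image_url":
        return block
    header, data = block["image_url"]["url"].split(",", 1)
    media_type = header[len("data:"):].split(";")[0]
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }

def extract_image_url(blocks):
    """Stand-in for an adapter that only understands image_url blocks."""
    for block in blocks:
        if block.get("type") == "image_url":
            return block["image_url"]["url"]
    return None  # the adapter concludes there is no image

def prepare(provider, blocks):
    """Pre-process blocks only for providers that really speak Anthropic's format."""
    if is_anthropic_compat(provider):
        return [to_anthropic_block(b) for b in blocks]
    return blocks

blocks = [
    {"type": "text", "text": "What is in this picture?"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
]
# Old behaviour: blocks converted before reaching the adapter, image is lost.
print(extract_image_url([to_anthropic_block(b) for b in blocks]))
# Fixed behaviour: MiniMax is not treated as Anthropic-compatible.
print(extract_image_url(prepare("minimax", blocks)))
```

The first print shows `None` (the "I don't see any image" case), the second shows the original data URL reaching the adapter untouched.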
Bug 2: The Async Wrapper Was Discarded
What happened: _to_async_client() had an isinstance(sync_client, _MiniMaxVLMAdapter) branch that built a wrapper object, but it was missing a return statement, so execution fell through to the generic AsyncOpenAI wrapping path, which routed to /v1/chat/completions.
Fix: Added the missing return so the VLM adapter's wrapper is actually returned and used, rather than being overwritten by the generic async wrapper.
# The isinstance check existed but had no return
# After fix:
if isinstance(sync_client, _MiniMaxVLMAdapter):
    async_create = client.async_create
    # ... build wrapper ...
    return mock, model  # ← this was missing
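The same class of bug is easy to reproduce in a toy example. The names below (`SpecialAdapter`, `wrap_buggy`, `wrap_fixed`) are hypothetical and only illustrate the fall-through, not the real _to_async_client.

```python
class SpecialAdapter:
    """Stand-in for a client that needs its own async wrapper."""

def wrap_buggy(client):
    if isinstance(client, SpecialAdapter):
        wrapper = ("special-wrapper", client)  # built, then never used
    # No return in the branch above, so every client ends up here.
    return ("generic-wrapper", client)

def wrap_fixed(client):
    if isinstance(client, SpecialAdapter):
        wrapper = ("special-wrapper", client)
        return wrapper  # the one-line fix
    return ("generic-wrapper", client)

adapter = SpecialAdapter()
print(wrap_buggy(adapter)[0])  # generic-wrapper
print(wrap_fixed(adapter)[0])  # special-wrapper
```

Python raises no error here because an unused local variable is perfectly legal, which is probably why this kind of bug only shows up when you test the full chain.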
Why Testing Direct Calls Worked But End-to-End Failed
When testing _build_minimax_vlm_adapter() and calling create() directly, it worked perfectly. That's because the direct call bypasses async_call_llm entirely — it hits the adapter's create() method which correctly routes to /v1/coding_plan/vlm. The bugs only manifested in the full Telegram → Hermes → async_call_llm → vision tool chain.
Files Changed
- agent/auxiliary_client.py — two patches: fixed the missing return in _to_async_client and corrected _is_anthropic_compat_endpoint to exclude MiniMax.