r/Kolsetu

OSS to win - VoiceBox is here
▲ 6 r/Kolsetu+1 crossposts

OSS app replaces ElevenLabs & WisprFlow, runs 100% locally.

→ Clone voice from 3s audio
→ 7 TTS engines in one
→ 23 langs: Ar, Hi, Ja etc.
→ Built-in MCP srv so Claude Code/Cursor/Cline speak cloned voice
→ Local LLM rewrites in-char before TTS

u/bhalothia — 4 days ago
▲ 699 r/Kolsetu+1 crossposts

All Those A.I. Note Takers? They’re Making Lawyers Very Nervous. A trendy productivity hack, A.I. note takers are capturing every joke and offhand comment in many meetings. They could also potentially waive attorney-client privilege.

nytimes.com
u/badcryptobitch — 10 days ago
▲ 12 r/Kolsetu+7 crossposts

Three bots in a trenchcoat is not omnichannel

Self-serve is exciting. Genuinely. But if I am honest, it is not the most interesting thing about 13 May.

The most interesting thing is that we have been quietly running architecture that the rest of the industry is only just figuring out exists.

A competitor recently launched real-time SMS ingestion. The coverage was breathless. Everyone lost it. So innovative. Revolutionary. Game-changing.

Me? I looked at our codebase and thought: "SMS ingestion. Wow. That is so 2025."

Here is what we actually built, and have been running in production for the better part of a year.

Mid-voice-call, Elba texts a short URL to the caller. The caller fills out a form on their phone. The structured data comes back into the live call via RPC. The workflow receives clean JSON. The voice call never paused. The agent never lost session state. The caller submitted a form while still talking and the agent acted on it in the same conversational turn.

That is not SMS ingestion. That is a bidirectional channel bridge inside a single active session. Sending an SMS during a call is not new. Getting structured data back into the active session in real time without dropping state on either side - that is the part nobody else has shipped.

And it sits on top of something even more fundamental.

Most "omnichannel AI" is three bots in a trenchcoat. A voice agent, a WhatsApp bot, a webchat widget, all pointing at the same CRM row and calling it unified. Each with its own prompt, its own config, its own version history, its own failure modes.

Elba is one agent. One workflow. One memory layer. Voice, WhatsApp, SMS, email and webchat all running through the same execution engine. Not copies. Not synced versions. The same agent, same logic, same memory, regardless of which channel the conversation arrived on. Deployments are atomic - every channel switches to the new workflow version in the same transaction. No drift. No "did the WhatsApp bot get the update" incident. One audit trail.
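To make the "atomic deployment" idea concrete, here is a minimal sketch in Python. It is not Elba's code: `WorkflowRegistry`, `WorkflowVersion` and the in-process lock are hypothetical stand-ins for whatever transactional store the platform actually uses. The point it illustrates is that every channel reads one shared version pointer, so there is no per-channel copy that could drift.

```python
import threading
from dataclasses import dataclass

# Channel names taken from the post; everything else is illustrative.
CHANNELS = ("voice", "whatsapp", "sms", "email", "webchat")


@dataclass(frozen=True)
class WorkflowVersion:
    version: int
    prompt: str


class WorkflowRegistry:
    """Holds the single active workflow version shared by every channel."""

    def __init__(self, initial: WorkflowVersion) -> None:
        self._lock = threading.Lock()
        self._active = initial

    def deploy(self, new: WorkflowVersion) -> None:
        # One reference swap under a lock: no per-channel copies to update.
        with self._lock:
            self._active = new

    def active_for(self, channel: str) -> WorkflowVersion:
        if channel not in CHANNELS:
            raise ValueError(f"unknown channel: {channel}")
        with self._lock:
            return self._active


registry = WorkflowRegistry(WorkflowVersion(1, "v1 prompt"))
registry.deploy(WorkflowVersion(2, "v2 prompt"))

# After a deploy, every channel resolves to the same version.
versions = {registry.active_for(channel).version for channel in CHANNELS}
```

With three separate bots, the equivalent of `deploy` would be three writes that can succeed or fail independently, which is where the "did the WhatsApp bot get the update" incident comes from.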

When a regulated enterprise customer asks what exactly their AI told a customer across every channel and every session for the past six months, we have a single clean answer.

The competition is announcing SMS ingestion and calling it a breakthrough.

We are launching self-serve on 13 May and already cooking the next thing. We may have put it on hold until after the launch. Our tech never sleeps though.

If you want an agent that actually knows who it is talking to across every channel and every session: self-serve opens 13 May at www.kolsetu.com.

Full technical writeup: https://www.kolsetu.com/blog/the-architecture-nobody-else-built

u/EdikTheFurry — 9 days ago
▲ 5 r/Kolsetu+3 crossposts

Compliance is not a badge collection!

At this point I am fairly certain that if we add one more compliance badge to our homepage, the website will collapse under its own moral superiority.

Not explode. Not crash. Just quietly give up. Like: "Mate, I cannot carry ISO 27001, 27018, 42001, NIST, CIS, CSA STAR, GDPR, EU AI Act, EU Data Act and your ego. Pick a struggle."

None of this was the plan.

Nobody wakes up one day and thinks "you know what I'd like to do professionally? Collect regulatory frameworks like rare artefacts, except the artefacts are PDFs and the reward is more PDFs."

This is what happens when you sell into enterprise environments.

One customer wants GDPR (totally agree). Another prefers CSA STAR registry (makes sense). Someone else insists on NIST CSF (fair enough). Then CIS Controls joins (alright…), followed by regional frameworks, some personal data protection variants, and, if you are not careful, the temptation to add frameworks from jurisdictions you can only reach with two stopovers and a mild panic attack at immigration becomes real - not because anyone actually needs them, but because at some point the list itself starts to feel like the product.

And because we enjoy radical luxuries like “revenue” and “remaining in business,” we say yes to what is required - and try very hard not to drift into what merely looks impressive.

The awkward truth nobody wants to say out loud: most modern privacy frameworks are not wildly different creatures. They are variations. Some stricter, some more relaxed, some reorganising concepts, others renaming them so they sound more official or slightly more intimidating when read aloud in a boardroom. Many will confidently explain that they are entirely unique, independent frameworks. Which is impressive, because a surprising number of them look like GDPR wearing a different outfit and insisting they are a completely unrelated alter ego. A lot of these frameworks are GDPR with a new haircut, a regional accent, and a very strong opinion about being original.

Claiming coverage is not the same as demonstrating capability. In the same way that saying "No hablo español" does not make you bilingual, listing frameworks does not mean you have operationalised them. It just means you have learned how to sound convincing while exiting the conversation. Give it enough time and you could probably justify adding a framework from somewhere that sounds vaguely fictional, supported by a regulator nobody has ever spoken to, governing a scenario your product will never encounter. At that point you are no longer communicating your security posture. You are assembling a compliance-themed trading card collection and hoping nobody asks you to actually play the game.

And now, our favourite punching bags. Yes, the usual suspects. Yes, everyone knows them.

Equifax - deeply regulated, thoroughly audited, fully certified. A known vulnerability did not get patched. Not obscure. Not advanced. Known. 147 million people. Not a framework failure. A system forgetting to do something so basic it borders on insulting.

British Airways - strict compliance regimes, PCI standards, the full enterprise security starter pack. Attackers skimmed payment data from their website for months. Not hours. Not days. Months. At that point it is less of a breach and more of a long-term arrangement.

Both had impressive lists. The lists did not help.

Frameworks describe what a secure system should look like. They do not guarantee the system will behave that way when it matters. If your foundation is solid, aligning with additional frameworks is largely mapping and documentation. If your foundation is not solid, adding frameworks is decoration. Very expensive decoration, but decoration nonetheless.

Honestly? We will keep expanding our list because customers expect it, procurement requires it, and principles have a remarkable tendency to become flexible when invoices arrive. But the expansion does not make the system more secure. It only makes us more fluent in describing the same system in multiple regulatory languages.

At some point the more relevant question is not how many frameworks are listed, but whether the system itself is understandable, controllable, and capable of behaving correctly under pressure.

Because if explaining your compliance posture becomes more complex than your system itself, you have not increased trust.

You have simply made it harder to see what is actually going on.

Fancy reading more articles and blog posts? Here you go: https://kolsetu.com/blog

reddit.com
u/EdikTheFurry — 10 days ago
▲ 11 r/Kolsetu+4 crossposts

The return path nobody built

A few days ago I posted about why most "omnichannel AI" is three bots in a trenchcoat. One agent, one memory layer, one execution engine across voice, WhatsApp, SMS, email and webchat. If you missed it, short version: what the industry calls unified is usually three separate configurations pointing at the same CRM row and hoping nobody looks too closely.

Today I want to go one layer deeper. Because the single-agent architecture is not just cleaner operationally. It enables something that no other platform has shipped.

Here is the problem every voice AI system has and nobody talks about honestly.

Structured data collection over voice is unreliable. Alphanumeric strings - vehicle registrations, policy reference numbers, membership IDs - get transcribed wrong at a rate that matters in production. One wrong character in a registration fails a lookup. A mishearing in a policy number causes a downstream processing failure that someone has to fix manually. Production systems either flag everything for human review or quietly accept the errors and clean up after themselves. Neither is a solution.

The alternative is deferring collection to a post-call follow-up. The call ends without the data. A second interaction is required. In emergency services, insurance intake, or patient triage, that is not a workflow step. That is an operational failure.

We did not accept either of these.

When the agent reaches a data collection node in the workflow, it sends a single SMS containing a short URL to the caller. The caller, who is still on the call, opens the URL on their phone. A dynamic form renders with exactly the fields the agent needs. The caller fills it in and submits. The structured JSON payload is returned to the active call session via LiveKit RPC. The workflow receives the payload and continues. The call never paused. The agent never lost session state.

Now here is the part that does not exist anywhere else.

Every other platform that sends an SMS during a call sends it outbound. A confirmation, a receipt, a link. The SMS departs the session. The call and the message are separate interactions from that point. There is no return path. Data flows one direction.

What we built is a bidirectional channel bridge inside a single active session. The SMS is an ingestion pipe. The form submission is an RPC call into the live session that the agent is actively listening for. The agent holds the workflow at the data collection node, waits for the return, receives the payload, and continues. All of this while the call is live.
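The "hold at the node, wait for the return" pattern can be sketched with plain `asyncio`. This is an illustration under stated assumptions, not the production implementation: `SessionBridge`, `wait_for_form` and `deliver` are hypothetical names, and an in-memory future stands in for the LiveKit RPC return path, whose actual API is not shown here.

```python
import asyncio


class SessionBridge:
    """Parks a workflow at a data collection node until the form payload arrives."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    async def wait_for_form(self, session_id: str, timeout: float) -> dict:
        # Called by the agent side: register a future and wait while the call stays live.
        future = asyncio.get_running_loop().create_future()
        self._pending[session_id] = future
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(session_id, None)

    def deliver(self, session_id: str, payload: dict) -> bool:
        # Called by the return path: hand the payload to the waiting session, if any.
        future = self._pending.get(session_id)
        if future is None or future.done():
            return False  # no live session is waiting for this payload
        future.set_result(payload)
        return True


async def demo() -> dict:
    bridge = SessionBridge()

    async def caller_submits_form() -> None:
        await asyncio.sleep(0.01)  # the caller fills in the form on their phone
        bridge.deliver("session-123", {"registration": "AB12CDE"})

    submit = asyncio.create_task(caller_submits_form())
    payload = await bridge.wait_for_form("session-123", timeout=1.0)
    await submit
    return payload


result = asyncio.run(demo())
```

Note what `deliver` returns when nothing is waiting: `False`. That is the three-bots case in miniature, a payload with no session to land in.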

The technical implementation: the short URL resolves via GraphQL and AppSync with connection state bound to the active session ID, so the form submission knows exactly which running instance to deliver the payload to. LiveKit RPC handles the return path with the session remaining open throughout. Connection state handling covers disconnection and retry so a brief signal drop does not orphan the session.
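As a rough picture of what "bound to the active session ID" and "covers disconnection and retry" could mean, here is a small Python sketch. It makes several assumptions: `FormLinkRegistry`, `issue` and `submit` are hypothetical, an in-memory dict replaces the GraphQL/AppSync layer, and the first-submission-wins rule for retries is one possible policy, not a description of what the platform does.

```python
import secrets


class FormLinkRegistry:
    """Maps short form codes to the live session that issued them."""

    def __init__(self) -> None:
        self._session_by_code: dict[str, str] = {}
        self._delivered: dict[str, dict] = {}

    def issue(self, session_id: str) -> str:
        # The short code is what would sit at the end of the texted URL.
        code = secrets.token_urlsafe(6)
        self._session_by_code[code] = session_id
        return code

    def submit(self, code: str, payload: dict) -> tuple[str, dict]:
        session_id = self._session_by_code.get(code)
        if session_id is None:
            raise KeyError("unknown or expired form link")
        # Assumed policy: the first submission wins, so a retry after a
        # signal drop resolves to the same session and the same payload.
        stored = self._delivered.setdefault(code, payload)
        return session_id, stored


links = FormLinkRegistry()
code = links.issue("session-123")
first = links.submit(code, {"membership": "M-1001"})
retry = links.submit(code, {"membership": "M-1001"})
```

Because the code resolves to a session ID rather than to a phone number or a CRM row, the submission knows which running instance it belongs to even though it arrives from a second device.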

This only works because there is one session underneath all of it. A voice call, an SMS form submission, a WhatsApp message, a webchat interaction - they all feed the same stateful session. If you have three separate bots, there is no session to return the data to. You are firing a webhook into a void and hoping something picks it up after the call ends.

The previous architecture, which is still what most platforms use today, required one SMS per field. Five fields, ten asynchronous exchanges, call long over before collection completes. We replaced this in February 2026 with the single-form RPC architecture.

In production this was stress-tested in roadside assistance. A stranded caller. The agent needs a vehicle registration, a membership number, and a location reference. Over voice, the registration can take three to five exchanges and still produces errors. Post-call collection means the dispatcher works without confirmed vehicle details while the caller waits. With in-session RPC: one SMS, one form, all data collected in under thirty seconds, structured payload delivered before the call ends and without errors. The dispatcher has confirmed data. No callback needed. Single session, start to finish.
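For the roadside example, the "structured payload" the workflow receives might look like the sketch below. The three fields come from the scenario above; the JSON key names, the `RoadsideIntake` type and the `parse_intake` helper are assumptions made for illustration, and the sample values are invented.

```python
import json
from dataclasses import dataclass

# Key names are assumed; the post only names the three pieces of data.
REQUIRED_FIELDS = ("vehicle_registration", "membership_number", "location_reference")


@dataclass(frozen=True)
class RoadsideIntake:
    vehicle_registration: str
    membership_number: str
    location_reference: str


def parse_intake(raw: str) -> RoadsideIntake:
    """Turn the form's JSON payload into a typed record, rejecting gaps early."""
    data = json.loads(raw)
    fields = {}
    for name in REQUIRED_FIELDS:
        value = data.get(name)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty field: {name}")
        fields[name] = value.strip()
    return RoadsideIntake(**fields)


raw_payload = json.dumps(
    {
        "vehicle_registration": "AB12 CDE",
        "membership_number": "M-1001",
        "location_reference": "J14 northbound",
    }
)
intake = parse_intake(raw_payload)
```

The caller typed these values, so there is no transcription step between what they entered and what the dispatcher sees.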

Sending an SMS during a call is not the hard part. The hard part is binding a form submission on a second device to an active session on a different channel, delivering the payload in real time, and having the agent act on it within the same conversational turn.

That is the part we built. Nearly a year ago. While the industry was still announcing SMS ingestion as a breakthrough.

Full writeup: https://www.kolsetu.com/blog/the-return-path-nobody-built

reddit.com
u/EdikTheFurry — 10 days ago