u/Deep_Ad1959

the snapshot problem in restaking governance

i keep coming back to a gap in the OZ Governor pattern when the underlying is a restaking position. Governor reads voting weight from ERC20Votes checkpoints at the proposal's snapshot block (creation plus the voting delay), which works fine for plain governance tokens. But between snapshot and execution the stake behind the token can get slashed by an AVS, re-staked into a different operator set, or re-delegated. The recorded balance no longer matches the real economic stake by the time the call lands.

The standard answer is snapshot-and-shrug. Let slashed stake keep its vote, treat the drift as a known anti-feature. the alternative is to re-evaluate at execution against current stake, but then results can flip after voters have signed off, which kills predictability.
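
to make the two options concrete, here's a toy tally in python. nothing in it comes from a real governor: the `Voter` fields, the numbers, and `tally` are all made up for illustration. it just counts the same ballots once with snapshot weights and once with whatever stake is left at execution.

```python
from dataclasses import dataclass


@dataclass
class Voter:
    snapshot_weight: int  # weight recorded at the proposal snapshot (hypothetical)
    current_stake: int    # stake left at execution time, after any slashing
    support: bool         # True = for, False = against


def tally(voters, use_current_stake):
    """Return (for, against) using snapshot weights or execution-time stake."""
    votes_for = votes_against = 0
    for v in voters:
        weight = v.current_stake if use_current_stake else v.snapshot_weight
        if v.support:
            votes_for += weight
        else:
            votes_against += weight
    return votes_for, votes_against


# Made-up numbers: the first voter is slashed by half after the snapshot.
voters = [Voter(600, 300, True), Voter(500, 500, False)]
snapshot_result = tally(voters, use_current_stake=False)
execution_result = tally(voters, use_current_stake=True)
```

with these made-up numbers the snapshot count passes 600 to 500 and the execution-time count fails 300 to 500, which is the flip that makes option two unpredictable.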

every restaking-era governor i've looked at picks option one. so the honest position is that restaking and token-vote governance aren't compatible at the precision people pretend, and the gap shows up at execution time.

reddit.com
u/Deep_Ad1959 — 11 hours ago
▲ 0 r/MacOS

accessibility permission on macos isn't as scoped as people think

my mental model for accessibility permission was wrong for years. TCC uses a responsible-process chain (Apple's term), so any binary spawned by an app inherits the parent's accessibility scope automatically. the child's tccd request resolves to the parent's bundle identity, not to its own path. that's why granting Terminal or an AI client accessibility instantly hands the same control to anything they launch as a subprocess. working as designed, but a lot of users clicking allow on an 'AI assistant' don't realize they're also greenlighting every helper binary it ships.

u/Deep_Ad1959 — 1 day ago
▲ 2 r/ethdev

multichain governance via layerzero is no longer a hack, and i didn't see it coming

the standard pattern for governance on an L2 used to be 'vote on mainnet because the token lives there', which leaves L2 users paying mainnet gas to participate. optimism moved its governance off mainnet onto the OP rollup and replicates state via layerzero. the contract you call for a vote now lives on the rollup, vote messages cross to other deployments, and the user pays a few cents instead of mainnet fees.

i didn't expect this to be the cleanest pattern, but it kind of is, and agora's governor stack (where roughly 800k votes have settled across production deployments) supports it natively.

what nobody seems to have publicly drilled yet is what happens when a layerzero DVN is censored or paused mid-proposal. there's a clean technical answer with alternative DVNs and fallback hashes, but i haven't seen a DAO actually run that fire drill in public.

u/Deep_Ad1959 — 1 day ago

hubspot dedup is where most ai crm writers quietly break

I keep seeing 'AI updates your CRM' demos that handwave over the actual technical gap. it's not 'an llm drafted a note', it's how the agent does the write. HubSpot's contacts object uses email as its default unique property, so if you extract contact info from an inbound thread and POST a plain create instead of going through the batch/upsert endpoint with idProperty=email, you fan out duplicates fast.

Second thing nobody flags: associations are a separate api call after object create. 'log a meeting on this deal and contact' is actually three calls (meeting create, associate-to-deal, associate-to-contact). an agent that fails between calls leaves orphan meetings that show up nowhere in the timeline, and you find them three weeks later when someone asks why the deal stage never moved.

per-action approval before each write is the unglamorous part of the desktop agent space. if the tool can't show you the exact JSON body and the dedup key it's about to send, you can't audit the logic, and the cleanup you'll do six months in costs more than the time the agent saved.

dedup logic lives at the write boundary, not in the model.
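
here's roughly what that looks like at the write boundary, as a sketch rather than a real client. it only builds the request body and a human-readable preview; nothing is sent. the body shape (`inputs` entries with `idProperty`, `id`, `properties`) is my understanding of the batch upsert request and should be checked against HubSpot's docs, and `build_contact_upsert` and `approval_preview` are hypothetical names.

```python
import json


def build_contact_upsert(contacts):
    """Build a batch upsert body keyed on email (assumed request shape)."""
    inputs = []
    seen = set()
    for contact in contacts:
        # Normalize the dedup key so " A@x.com " and "a@x.com" collapse to one row.
        email = contact["email"].strip().lower()
        if email in seen:
            continue  # first occurrence wins within a single batch
        seen.add(email)
        properties = {k: v for k, v in contact.items() if k != "email"}
        inputs.append({"idProperty": "email", "id": email, "properties": properties})
    return {"inputs": inputs}


def approval_preview(body):
    """Text shown to a human before the write: dedup keys first, then the exact JSON."""
    keys = ", ".join(item["id"] for item in body["inputs"])
    return "dedup keys: " + keys + "\n" + json.dumps(body, indent=2, sort_keys=True)


# Example with made-up contacts extracted from a thread.
body = build_contact_upsert([
    {"email": " A@x.com ", "firstname": "Ana"},
    {"email": "a@x.com", "firstname": "Ana M."},
])
preview = approval_preview(body)
```

the point of the design is that the approval step sees the same `body` object the writer would send, so what gets audited and what gets written can't drift apart.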

u/Deep_Ad1959 — 2 days ago

the accessibility tree gotchas that kept breaking my desktop agent

my desktop agent stopped failing the moment i stopped trusting the accessibility tree as a single source of truth.

The dumbest one was cross-app handoff. agent clicks a link in mail, safari becomes frontmost, the agent keeps asking for the original pid's tree and operating on a frozen snapshot. fix is detecting when the frontmost app changes between actions and traversing the new one before the next step. Easy to miss because the previous pid is still alive, just no longer relevant.
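
a minimal sketch of that check, assuming you have some way to ask for the frontmost pid and to traverse a pid's tree. `get_frontmost_pid` and `traverse` are hypothetical callables you'd back with your own AX wrapper, not real API names.

```python
class TreeCache:
    """Remembers which app's tree was traversed last."""

    def __init__(self, get_frontmost_pid, traverse):
        self._get_frontmost_pid = get_frontmost_pid  # hypothetical: pid of the frontmost app
        self._traverse = traverse                    # hypothetical: AX tree for a given pid
        self.pid = None
        self.tree = None

    def tree_for_next_action(self):
        """Call before every action; re-traverses when the frontmost app changed."""
        pid = self._get_frontmost_pid()
        if pid != self.pid:
            self.pid = pid
            self.tree = self._traverse(pid)
        return self.tree
```

this only covers the handoff case. a tree can also go stale inside the same app, so you'd likely still re-traverse after actions that change the UI.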

second one was sheets and dialogs overriding window viewport scope. an element shows up in the tree because it technically exists in the hierarchy, but it sits underneath an active modal sheet, so clicks pass to whatever is actually on top. Needed an explicit "is this element inside the current modal" check before every click.
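
roughly what that check looks like, assuming you already know which node is the active modal and can walk parent links. the `parent_of` dict and the node names are hypothetical stand-ins for whatever your tree wrapper exposes.

```python
def is_inside(element, container, parent_of):
    """Walk parent links upward from element; True if container is reached."""
    node = element
    while node is not None:
        if node == container:
            return True
        node = parent_of.get(node)
    return False


def safe_to_click(element, active_modal, parent_of):
    """With no modal up, anything goes; otherwise the element must live inside it."""
    return active_modal is None or is_inside(element, active_modal, parent_of)


# Made-up hierarchy: a sheet with an OK button sits on top of a window with a Save button.
parent_of = {"ok_button": "sheet", "sheet": "window", "save_button": "window"}
```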

Multi-monitor coordinates were the third. on a 3 screen setup the left external sits at x around -3840 and the right around 3456. a naive "click at x:200" lands on whichever screen contains (200, y), which is almost never the one you mean.
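
a sketch of resolving screen-local coordinates into the global space before clicking. the screen sizes below are assumptions picked to match the origins mentioned above, and `Screen`, `to_global`, and `screen_at` are hypothetical helpers, not any framework's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Screen:
    name: str
    x: int       # global x of the screen's left edge
    y: int       # global y of the screen's top edge
    width: int
    height: int


def to_global(screen, local_x, local_y):
    """Convert a point relative to one screen into global coordinates."""
    if not (0 <= local_x < screen.width and 0 <= local_y < screen.height):
        raise ValueError("point is outside the target screen")
    return screen.x + local_x, screen.y + local_y


def screen_at(screens, global_x, global_y):
    """Return the screen containing a global point, or None."""
    for s in screens:
        if s.x <= global_x < s.x + s.width and s.y <= global_y < s.y + s.height:
            return s
    return None


# Assumed layout: left external, built-in main, right external.
left = Screen("left", -3840, 0, 3840, 2160)
main = Screen("main", 0, 0, 3456, 2234)
right = Screen("right", 3456, 0, 3840, 2160)
screens = [left, main, right]
```

so "click at x:200 on the left screen" becomes `to_global(left, 200, 100)`, and a bare global (200, 100) resolves to the main screen, which is the mismatch described above.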

llm clicking the wrong button is rarely the model. it is the tree state being stale or scoped wrong, and the failure mode is silent until you diff before and after screenshots. written with s4lai

u/Deep_Ad1959 — 2 days ago

everyone read tally leaving governance as "daos are dead." i think that's backwards

Tally announced back in March it was stepping out of governance, and the reaction I kept seeing was basically "if even the tooling companies are quitting, onchain voting is finished." I think that read points at the wrong layer.

The vote itself is a contract. Optimism, Uniswap, and ENS governance keep executing whether or not any one company maintains a frontend for them. A UI shutting down is a UX problem, not a governance failure. Nobody said Ethereum died when a block explorer went offline.

What's actually load-bearing is the unglamorous infra around the contract: the relayers that make voting gasless, the indexers, the calldata decoders that show you what a proposal does before you sign. Gasless voting in particular only works as long as someone keeps funding the relayer. That's the dependency that breaks when a vendor leaves, not the governor itself.

So the question after Tally isn't "is governance dead," it's narrower and way more boring: who keeps the relayers paid and the decoders patched once the company stops doing it, and is your stack actually forkable or are you locked in. i haven't seen many DAOs answer that part cleanly. written with s4lai

u/Deep_Ad1959 — 3 days ago

gasless voting on uniswap governance only works because the foundation pays for it

I went down a rabbit hole on how voting actually works on vote.uniswapfoundation.org and the part nobody really talks about is the relayer.

In a plain OpenZeppelin Governor setup, casting an onchain vote is a transaction, so you pay gas. fine for a whale delegate. but if you're sitting on a few hundred UNI of delegated weight, paying real money to vote on a proposal you might be on the losing side of anyway is a genuine disincentive. participation quietly skews toward people who don't notice the gas.

The gasless flow sidesteps that with a relayer. you sign your ballot as an EIP-712 typed message offchain, a relayer submits it onchain and eats the gas. the vote still settles onchain, you just don't pay for the settlement. catch is somebody has to, and right now that somebody is the Uniswap Foundation covering the relayer cost out of its budget. ENS DAO does the exact same thing for its own governance.
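
for reference, the thing you sign looks roughly like this. it's a sketch that only builds the EIP-712 typed-data payload (the kind a wallet takes via eth_signTypedData_v4) and does no signing or relaying. the two-field `Ballot` struct is what OpenZeppelin Governor 4.x uses as far as i know (newer versions add voter and nonce fields, and other governors differ), and the governor name, chain id, address, and proposal id are placeholders.

```python
import json


def build_ballot_typed_data(governor_name, chain_id, governor_address, proposal_id, support):
    """Build an EIP-712 payload for a vote-by-signature ballot (assumed struct layout)."""
    if support not in (0, 1, 2):  # 0 = against, 1 = for, 2 = abstain
        raise ValueError("support must be 0, 1, or 2")
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            "Ballot": [
                {"name": "proposalId", "type": "uint256"},
                {"name": "support", "type": "uint8"},
            ],
        },
        "primaryType": "Ballot",
        "domain": {
            "name": governor_name,
            "version": "1",
            "chainId": chain_id,
            "verifyingContract": governor_address,
        },
        # uint256 values are passed as decimal strings so large ids survive JSON.
        "message": {"proposalId": str(proposal_id), "support": support},
    }


# Placeholder values only; none of these identify a real deployment.
typed = build_ballot_typed_data(
    "ExampleGovernor", 1, "0x0000000000000000000000000000000000000000", 42, 1
)
payload = json.dumps(typed)
```

the relayer's job starts after this: it takes the signature over that payload and submits it in its own transaction, which is where the gas bill lands.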

which makes gasless voting a subsidy, not a protocol feature. it holds as long as the foundation keeps funding it. if that line item ever gets trimmed, gasless voting doesn't throw an error, it just silently reverts to gas-gated participation and small-delegate turnout craters. feels like relayer funding should be endowed or treated as a protocol-level cost rather than a discretionary foundation expense, and I haven't seen a proposal that actually does that. given how much of current turnout leans on it, that's a strange gap to leave open.

u/Deep_Ad1959 — 5 days ago

a single send_message tool is how an agent texts the wrong person

I keep seeing agent messaging tools shaped as a single call: send_message(contact, text). it demos great. then the agent fires into a group chat that shadows the contact's name in search, or picks the wrong John, and you find out after the message is already gone. there's no undo on a sent message.

the real problem isn't the messaging layer. it's that contact resolution is fuzzy, and a monolithic tool collapses search, disambiguation, and the send into one call the model never gets to inspect. it picked a result internally and you never saw the other three candidates.

the shape that holds up is decomposing it. search returns indexed results, open the Nth one, read back the name of the chat that actually opened, then send. Four small tools instead of one. each call returns state the model can check before committing. and the send step itself reads back the last message in the thread and reports verified true or false, so a silent failure surfaces as a failure instead of looking like success.

the pattern I keep landing on: any tool that does something irreversible (send, pay, delete) should be the smallest, dumbest step in the chain, with a verification read right after it. mega-tools demo beautifully and fall over quietly once real data is involved.
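
a sketch of the four-tool shape against an in-memory fake, so it runs without any real messaging app. `FakeMessenger` and `send_checked` are hypothetical, and the verification here is trivially true because the fake can't fail; in a real tool the read-back would come from the UI or the API.

```python
class FakeMessenger:
    """In-memory stand-in for a messaging app; each method is one small tool."""

    def __init__(self, chat_names):
        self.threads = {name: [] for name in chat_names}
        self._results = []
        self._open = None

    def search(self, query):
        """Tool 1: return indexed candidates instead of silently picking one."""
        q = query.lower()
        self._results = [name for name in self.threads if q in name.lower()]
        return list(enumerate(self._results))

    def open_result(self, index):
        """Tool 2: open the Nth search result."""
        self._open = self._results[index]

    def current_chat(self):
        """Tool 3: read back the name of the chat that actually opened."""
        return self._open

    def send(self, text):
        """Tool 4: send, then read back the last message to verify."""
        if self._open is None:
            return {"verified": False}
        self.threads[self._open].append(text)
        last = self.threads[self._open][-1]
        return {"verified": last == text}


def send_checked(app, query, expected_name, text):
    """The caller's loop: inspect state between steps, refuse on ambiguity."""
    candidates = app.search(query)
    picks = [i for i, name in candidates if name == expected_name]
    if len(picks) != 1:
        return {"sent": False, "candidates": [name for _, name in candidates]}
    app.open_result(picks[0])
    if app.current_chat() != expected_name:
        return {"sent": False, "candidates": [name for _, name in candidates]}
    return {"sent": True, **app.send(text)}
```

searching "john" against three chats that all match returns all three candidates, and the send only happens once exactly one of them matches the name the caller expected.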

u/Deep_Ad1959 — 6 days ago
▲ 0 r/OpenAI

the gap between chatgpt drafting an email and chatgpt actually sending it is wider than i expected

I spent the last few weeks trying to push chatgpt past "give me text" into actually finishing a workflow end to end: read the gmail thread, pull the matching hubspot record, draft the follow-up, file the next step in linear. it can describe each step beautifully. doing them in one pass without me copy-pasting between five tabs is a different problem.

codex extending to mobile feels like openai noticing the same gap from the dev side. but for non-coders the gap is just as wide. an agent that actually does the work needs gmail, calendar, drive, a crm, and some cross-session memory of what happened last tuesday, and the moment you wire all of that up you are basically building a desktop app on top of the api.

my guess is openai eats this from the inside (operator plus actions plus connectors plus memory) and third parties stitching things together get squeezed once the connectors mature. open question for me is whether that happens this year or three years out.

u/Deep_Ad1959 — 7 days ago

Spent a few weekends turning the Web Data, Login Data, History, and Bookmarks files in ~/Library/Application Support into a single sqlite knowledge base. Lives entirely on disk, no network calls anywhere in the stack, MIT licensed.

The interesting bit isn't the extractors, it's the ranking. Each row tracks two counters: appeared_count (how many times the value was seen across re-extractions) and accessed_count (how many times it got returned by a query). hit_rate = accessed / appeared becomes the relevance score. No manual curation, no LLM tagging step required.
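
A minimal sketch of that score, not the repo's actual code. The row dicts, the sample counts, and the `rank` helper are made up for illustration; only the two counter names and the accessed / appeared ratio come from the description above.

```python
def hit_rate(appeared_count, accessed_count):
    """accessed / appeared, with 0.0 for rows that have never appeared."""
    if appeared_count <= 0:
        return 0.0
    return accessed_count / appeared_count


def rank(rows):
    """Sort rows by hit_rate, highest first."""
    return sorted(
        rows,
        key=lambda r: hit_rate(r["appeared_count"], r["accessed_count"]),
        reverse=True,
    )


# Made-up counters for three keys.
rows = [
    {"key": "email", "appeared_count": 10, "accessed_count": 8},
    {"key": "tool:vscode", "appeared_count": 40, "accessed_count": 2},
    {"key": "full_name", "appeared_count": 5, "accessed_count": 5},
]
ranked_keys = [r["key"] for r in rank(rows)]
```

A value seen 40 times but returned twice sinks to the bottom, which is the noise-filtering effect: frequency of appearance alone doesn't buy relevance.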

For semantic dedup it uses nomic-embed-text-v1.5 via onnxruntime, all local. Anything with cosine similarity >= 0.92 against an existing key prefix gets superseded instead of duplicated. Single-value keys (full_name, first_name) auto-supersede, multi-value keys (email, account:github.com, tool:vscode) coexist.
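
A sketch of the threshold check only, with hand-written two-dimensional vectors standing in for real embeddings. It doesn't call onnxruntime or the model, and `cosine` and `find_superseded` are hypothetical helpers rather than the repo's functions; the 0.92 cutoff is the one stated above.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_superseded(new_vec, existing, threshold=0.92):
    """Return keys of existing entries similar enough to be superseded."""
    return [key for key, vec in existing.items() if cosine(new_vec, vec) >= threshold]


# Toy vectors: "a" points almost the same way as the new value, "b" is orthogonal.
existing = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
superseded = find_superseded([1.0, 0.1], existing)
```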

The hit_rate approach handles the 50% noise problem without a human in the loop. Still an open question whether it stays useful over months as old data goes stale, but for a fresh extraction it has been a noticeable improvement over flat keyword search.

repo: github.com/m13v/ai-browser-profile written with ai

u/Deep_Ad1959 — 18 days ago

have been writing AHK scripts against a mix of internal tools at work and the apps roughly fall into three buckets that each want a different AHK approach.

native Win32 and modern WPF: ahk_class plus ControlClick/ControlSend hold up for months, UIA-v2 if I need a button reachable by name. Electron-wrapped "desktop" apps: the inner DOM lives behind a UIA tree that gets rebuilt on focus changes, so any cached element ref goes stale within seconds. Citrix and RDP sessions: the whole window is just a remote bitmap and accessibility apis return nothing meaningful, so it ends up being ImageSearch or nothing.

what makes me twitch is when scripts try to apply one strategy across all three. ImageSearch retries on a WPF app that has perfectly good AutomationIds is just slow. UIA-v2 on a Citrix window is impossible. ControlSend across both is a coin flip on whether the legacy MFC piece even respects it.

the citrix bucket still has me beat. ImageSearch with a scatter of fallback patterns is the closest I've come to reliable, but every monitor switch or DPI change breaks the templates. honestly it might be the one bucket where image matching with extra retries is just the right tool, however much it pains me to type that.

u/Deep_Ad1959 — 20 days ago

I keep noticing this reading through AI-generated Playwright suites: the agent reaches for getByRole, getByLabel, getByText almost every time. Hand-written code, even in fresh codebases, leans on data-testid or CSS selectors. probably because Playwright docs lead with semantic locators and that is what the LLM saw most in training.

side effect nobody talks about: if your form inputs do not have proper labels, or your buttons are styled divs with onClick, the agent cannot generate stable tests for them. the test fails loudly until you fix the markup. same with missing aria-labels or broken heading hierarchy.

So teams adopting AI test gen are getting pushed toward better a11y markup as a forcing function, even when that was never the stated goal. Tests do not get more reliable because of self-healing selectors, they get more reliable because the markup got semantic enough for the locator strategy to actually work.

Probably the most accidentally-positive outcome of the whole ai test gen wave, and nobody is shipping it as the headline.

u/Deep_Ad1959 — 20 days ago

once one of these tools can drive your default chrome profile or read the AX tree of a logged-in app, it has every session token you have. gmail, your bank, github with PAT scopes, slack. no oauth scope, no consent screen, the agent just has the same cookies as you do.

most projects ship as either a hosted sandbox or a fresh chromium. fine, different threat model. but the agents people actually want, the ones that do real work in real apps, run as you. a closed-source binary doing that, phoning home with screenshots or AX dumps, is a much bigger ask than a closed-source chatbot.

I keep landing on two requirements before I trust one of these long-term. Source has to be auditable so I can grep for what leaves the machine. The inference path matters too, because if every screen capture goes to an api, the cookies effectively go too, just one indirection removed.

no one's really solved this at the consumer level, every demo handwaves it. open source at least gives you a fighting chance to see what's going wrong before something starts exfiltrating your sessions. written with ai

u/Deep_Ad1959 — 21 days ago
▲ 7 r/MacOS

i tried to automate my whole mac workflow over two weeks. ended up with maybe 12 small things that actually survive day-to-day. the rest broke.

started with Shortcuts. fires actions fine, but talking to apps that aren't first-party is painful. basically AppleScript with a nicer UI for half the apps i actually use. Slack? no real integration. Notion? web view only. Linear? pretend it's a webpage.

dropped to AppleScript and the accessibility API directly. better, but every app has its own quirks. native cocoa apps are clean. Electron apps render a flat AX tree with no semantic role. sandboxed App Store apps hide things. one Pages update changed a button index and broke a flow i'd built two days earlier.

the 12 that survive are boring office plumbing. things like 'pull the latest invoice number from email and stick it in this spreadsheet column' or 'rename screenshots to the active app's name'. nothing impressive. Shortcuts feels abandoned and the AX tree is the only thing left that survives a system update.

edit: ended up shipping a voice-first mac agent that runs on the AX tree, fwiw it survives the system updates that broke half my old flows: https://fazm.ai/t/macos-accessibility-automation

u/Deep_Ad1959 — 22 days ago

Spent the last few weeks trying every ai test generation tool I could find against a real app, the kind with email OTP login and a multi-step onboarding. every single one nailed the demo todo and then immediately got stuck at 'paste the code we just sent you'. ended up wiring disposable inbox polling into my own runner just to make signup deterministic. the tools that emitted raw playwright code at least let me patch the OTP step and move on, the ones that hid the script behind their own DSL were a dead end.

The other thing nobody benchmarks is state between cases. fresh storageState per scenario is fine when you have three tests, when you have forty you're paying a 30s login cost forty times because the wrapper can't reuse a context. that's not a model problem, that's a runner problem and most of these tools don't expose enough of playwright to fix it.

tutorials and demos are everywhere for these things, real ci usage less so. the gap between 'works in the gif' and 'survives auth, retries, and a flaky third party' is where every one of them gets exposed.

fwiw I built a thing for exactly this, handles the OTP step + reuses storageState across cases so the 30s login tax goes away: https://assrt.ai/t/playwright-ai-test-generator-otp

u/Deep_Ad1959 — 22 days ago
▲ 0 r/swift

I've been optimizing transcription accuracy (local vs cloud, model size vs latency). Turns out that's not the constraint. The real bottleneck is that voice without friction spirals. Keyboard forces you to pause and think. Voice at your Mac doesn't.

Shipped a hold-to-talk interface with breath detection and timeout release. Took more engineering time than the entire voice pipeline. Users immediately preferred it. The forced pause before every message prevents rambling and mistakes.

Local transcription sits at 90-94% accuracy out of the box. Cloud hits 98%. That 6-10% local miss rate was expected. What killed engagement was the absence of a natural stopping point. Every agent mistake now feels intentional instead of accidental.

Been testing this in production for six months. Pattern holds across hundreds of queries.

u/Deep_Ad1959 — 22 days ago

the macOS accessibility API returns element bounds as (x, y, width, height) with (x, y) at the top-left corner. agents that click at (x, y) directly, without centering to (x + width/2, y + height/2), hit the corner instead of the middle of the button. drag starts. wrong button triggers. this is trivial to fix in an MCP server, but most agent frameworks don't do it.
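
the fix is small enough to show in full. a sketch, with `click_point` as a hypothetical helper name and the bounds tuple in the (x, y, width, height) order described above:

```python
def click_point(bounds):
    """Return the center of an (x, y, width, height) rect with a top-left origin."""
    x, y, width, height = bounds
    return (x + width / 2, y + height / 2)


# A button reported at (100, 100) with size 80 x 40.
center = click_point((100, 100, 80, 40))
```

for that button the click lands at (140, 120) instead of on the (100, 100) corner.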

more critical: some apps lie about their hit area. the accessibility tree reports a button at (100, 100, 80, 40) but the actual clickable region is at (100, 60, 80, 40) due to custom rendering or modal layering. there's no programmatic way to detect or correct this without a screenshot.

so if you're building agents for desktop automation, you either accept occasional misclicks or add post-action screenshots to verify the UI state changed. neither is great at scale. AppleScript and Shortcuts hit the same ceiling. it's not a framework problem; it's that the accessibility tree only exposes what the app volunteers to expose.

u/Deep_Ad1959 — 22 days ago

I've noticed the real split between open source and closed platforms isn't about features, it's about exit ramps. closed platforms keep you committed to their ecosystem. open source lets you build something, download it, and move on.

the hobby builders I talk to care about this way more than SaaS builders do. they don't want a lifestyle change, they just want something useful. building something, deploying it, and never thinking about it again matters more than beautiful UI or hand-holding.

freedom to actually walk away is the feature that matters most.

u/Deep_Ad1959 — 22 days ago

My swiftui menu bar app kept getting vague bug reports i couldn't repro. Posthog has session replay for web, but for native macos i couldn't find anything maintained.

So i wired up something simple. ScreenCaptureKit at 5fps, HEVC hardware encoding via ffmpeg with hevc_videotoolbox, 60-second mp4 chunks that upload to GCS and delete locally after. Disk usage stays around 2-5 MB/min which felt reasonable. Drops in as a swift package, kicked off with try await recorder.start() inside a .task on my root view.

First session i watched, the user hovered my menubar icon for 8 seconds and quit. The 'first run' tooltip i was proud of? Never visible to them in their workflow. would've spent another month guessing from crash logs.

Repo is open source if anyone wants to poke at it: https://github.com/m13v/macos-session-replay

u/Deep_Ad1959 — 22 days ago

ai artifact tools spit out a working single-screen app from one sentence. Bubble and Glide want you to set up a database, define workflows, configure auth, all before you have anything to look at.

most people seem to assume there's a smooth ramp between the two. There isn't. The second you need persistent data or login or anything backend-shaped, you fall off the artifact side hard, and now you owe the full bubble setup tax that the artifact bypassed.

what's helped me is treating them as different categories, not steps on a ladder. Artifacts for utilities I'll only use myself. Bubble for stuff with actual users behind a login. the trap is trying to evolve an artifact into the second thing, that path is just rebuilding from scratch.

u/Deep_Ad1959 — 23 days ago