r/aiagents

▲ 17 r/aiagents+4 crossposts

Review my resume : AI Engineer having 3 years of experience, looking for new job

How do you enforce least-privilege access controls for AI agents?

we're moving from simple llm demos to agents that call internal apis and act on behalf of users, and we've mostly settled the enforcement question: each agent gets its own scoped identity with per-tool permissions, not the end-user's token and not a shared god-mode service account. that part is an iam problem and we're treating it like one.
where i'm less sure is verification. scoping the permissions is one thing. knowing the agent can't be talked into using them in a way you didn't intend is another. an agent with legitimate access to a "read customer record" tool and a "send email" tool hasn't exceeded its privileges on paper, but chained under a malicious prompt it can still exfiltrate. least-privilege bounds the blast radius, it doesn't stop misuse inside the bounds.

how granular do you go on scoping (per api, or down to data class) before the overhead isn't worth it? and once it's scoped, how do you test that the boundary holds under adversarial prompting rather than assuming the permission model is the whole story?
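
one way to make the "read customer record" + "send email" chain testable is to put a data-flow rule next to the permission check and replay tool-call sequences against it. this is only a minimal sketch: the tool labels, the `AgentSession` class, and the taint rule are all hypothetical, not something from an existing framework.

```python
from dataclasses import dataclass, field

# Hypothetical labels: tools that read sensitive data, tools that send data out.
SENSITIVE_SOURCES = {"read_customer_record"}
EXTERNAL_SINKS = {"send_email"}


@dataclass
class AgentSession:
    """Tracks what one agent identity may call and what it has already read."""
    allowed_tools: frozenset
    tainted: bool = False  # becomes True once sensitive data enters the session
    log: list = field(default_factory=list)


def authorize(session: AgentSession, tool: str) -> bool:
    """Return True if the call passes both the scope check and the flow rule."""
    if tool not in session.allowed_tools:
        allowed = False  # plain least-privilege check
    elif tool in EXTERNAL_SINKS and session.tainted:
        allowed = False  # in scope on paper, but would move sensitive data out
    else:
        allowed = True
    if allowed and tool in SENSITIVE_SOURCES:
        session.tainted = True
    session.log.append((tool, allowed))
    return allowed


def replay(tools, allowed_tools):
    """Replay a tool-call sequence, e.g. one elicited by an adversarial prompt."""
    session = AgentSession(allowed_tools=frozenset(allowed_tools))
    return [authorize(session, tool) for tool in tools]
```

feeding `replay` the sequences your red-team prompts actually produce gives you a regression test for the boundary: each tool is allowed on its own, and the read-then-send chain is refused.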

reddit.com

u/Routine_Day8121 — 2 hours ago

▲ 4 r/aiagents+4 crossposts

AI was told to work an office job. It resorted to blackmail.

youtu.be

u/Large-Trash-9757 — 4 hours ago

▲ 9 r/aiagents

I created a tool that turns AI agents into white collar employees.

I've never posted on this community or seen what other people post, so I don't know if this is impressive or not, but I just want to share what I've been working on.

For some background:

I tried vibe coding a ton of SaaS and other applications but none of them worked out. Before the big AI train, I was actually coding a real app by hand (yeah, crazy). I was working on it for months but eventually the AI era began. AI literally finished all my pending tasks in 1 day. Tasks that would've taken so much more time. Regardless, it didn't work out.

I kept doing more and more research about how to be successful but almost half of the crowd is saying you can't be successful by vibe coding and the other half is saying you can't be successful without using AI these days. So I didn't know what to do anymore.

What I did:

So I created an application called AgentLeague. I'm not selling it, I just wanted to share it. AgentLeague is a multi agent platform that does the prompting for you. It is set up like a real company with roles such as Director, Project Managers, developers (both backend and frontend), marketing (with a specialty in SEO tactics), UI designers, and much more.

How it works:

The way it works is you only ever talk with the director. You tell him ("I want to make X project") and the director talks to you about the feasibility of the project. He asks questions to ensure he fully understands what you want. Then, he creates an office with a project manager leading it. The director prompts the project manager with all the context he learned about what you want to make. The project manager then decides what roles are needed and spawns those agents. The agents will have their own meeting, write their own BRD (sometimes they argue amongst each other), and then fan out to work on their tasks. They constantly work together and sometimes even talk to agents in other offices to get advice/help. It's a full company.

The entire tool runs on headless claude and codex, so the user is never needed for permissions. If the agents need an API key or the user to do something on the web, there is a way for them to ask for that. But all conversation only goes through the director. There's even a skill for them to create scheduled tasks, pipelines, and much more. It's essentially a fully autonomous company (with the exception of one time requests needed from you).

It does take up an insane amount of tokens and is only as good as your membership, but I've used it to run a couple of projects fully autonomously. Have any of the projects taken off as real profitable businesses? Not yet. But it's crazy to see the potential of AI agents.

I used pixel art to create this overview of all the projects I have running and agents working on them. Each room represents a different project. In the hallway, there lives a janitor, IT, and HR. The janitor cleans up temporary files to ensure the app does not take up too much storage. IT handles any MCP issues. HR makes sure no agents get stalled or are hallucinating.

I learned a lot from making this. Mostly about how screwed the future of tech is. It's insane how much AI can do fully autonomously. Right now we're setting limitations, but what happens when that barrier is gone?

Open to answer any questions about this!

u/RegulusZ — 7 hours ago

▲ 8 r/aiagents

AI agents can already function as an entire virtual dev team. This is real, what do you think?

Guys, don’t fixate on the fact that I’m a woman, there are developers among us too, I assume that’s obvious... Been going down a bit of a rabbit hole the last couple months on this whole virtual AI team thing. The pitch is basically that you don't hand a task to one all-purpose agent, you split it across roles instead. One does the planning, one writes the backend, another reviews it, and so on.

I'll be honest, I rolled my eyes at first. Looked like just another skin on top of Cursor or Claude Code. But it actually seems to work like a pipeline with defined roles, approval steps, and context getting passed from one stage to the next instead of one agent trying to keep everything in its head at once.

I’ve looked at a few setups (bridgeapp, replit, emergent), and as far as I understand it, the team lives inside each project. Architect, backend, frontend, QA, whatever roles you need. And the interesting bit is each agent can run a different model. Backend on Claude Code, frontend on Codex, whatever works. Apparently even someone who doesn't code can just pick the model their agent uses.

Anyway, has anyone here actually run something like this on real work? Not the demo stuff, real backlog, real mess. I keep going back and forth on whether it's worth setting up or if it just adds overhead. Kind of feel like a year or two from now nobody's gonna care which single agent is best, it'll be about whose whole team of them actually plays well together. Would genuinely like to hear from people who've tried it.

reddit.com

u/NumiSullivan — 15 hours ago

▲ 24 r/aiagents

I think we're repeating the early microservices mistake with AI agents

A lot of agent demos remind me of what happened when microservices first became popular.

Everyone was excited about splitting systems into smaller components. It looked elegant in diagrams. It looked scalable. It looked like the future. Then people realized the hard part wasn't building services.

It was communication, orchestration, observability, debugging, versioning, and managing complexity. When I look at multi-agent systems today, I get a similar feeling. Building an agent isn't particularly hard anymore.

Building 5, 10, or 20 agents that can reliably work together, maintain context, recover from failures, and remain manageable over time feels like a much bigger challenge.

Sometimes I wonder whether the next breakthrough in agent systems won't come from better models at all. It'll come from better engineering practices around agents.

Curious whether people building production systems agree or if I'm completely off here.

reddit.com

u/Bladerunner_7_ — 20 hours ago

▲ 9 r/aiagents

Ford brought 350 engineers back after realizing AI can’t deliver with the same quality

I hate the "I cut 60% of my team, AI runs the business now" posts on LinkedIn.

We only hear about the layoffs. The rehires happen quietly. Klarna cut 700 customer support reps, then rehired. Ford let engineers go, then brought 350 of them back.

Same wall, both times. AI is only as good as the context you feed it, and they'd underestimated what was sitting in their employees' heads after years on the job.

These are big corps. Sophisticated documentation, huge process libraries, way more resources than almost anyone reading this has. Still couldn't hold quality once the humans walked out the door.

A friend told me about an agency owner who fired her contractors because her own AI prompts were beating their output. Maybe she's right, I don't have the full picture, not my call. But zoom out and the better play, almost every time, is keep your best people and arm them with AI.

Who would I keep? The ones who solve problems without being asked. The ones who actually care whether the outcome is good, not just whether the ticket got closed. The ones who'll learn something new even when it's uncomfortable. And the ones with good judgment, because AI amplifies judgment, it doesn't replace it.

Give that person AI and they don't get 10% better. They become a different category of employee.

Honestly, I have more ideas than I have people who can execute them with AI in the loop. That's the real bottleneck. Not too many humans, not enough humans who know how to wield the tool.

If your first move with AI is "how do I have fewer people," you probably had the wrong people to begin with.

So genuine question, do you actually think you can cut your team and improve quality at the same time? Or does the math fall apart once you flip to the second page?

if you're thinking through how to actually implement ai in your business without torching the institutional knowledge that makes it work, i write about this stuff every thursday. mostly the boring parts nobody posts about: documenting what's in your best people's heads before you even think about who to cut. free to join here

u/Deep-Owl-1890 — 19 hours ago

▲ 14 r/aiagents+8 crossposts

I built an open-source Agent Verifier for Claude Code, Cursor & other Coding Assistants that catches security issues, hallucinated tools, infinite loops and anti-patterns in agents built using LangChain, LangGraph, and other frameworks. (free, open source, 100% local)

I've been using Claude Code for a few months and noticed AI agents consistently skip the same things: hardcoded secrets, unbounded retry loops, referencing tools that don't exist, and massive system prompts that blow context windows.

So I built Agent Verifier — an AI agent skill that acts as an automated reviewer which does more than just code review (check the repo for details - more to be added soon).

GitHub Repo: https://github.com/aurite-ai/agent-verifier

Note: Drop a ⭐ if you find it useful to get more updates as we add more features to this repo.

----

2 Steps to use it:

You install it once and say "verify agent" on any of your agent folders in Claude Code to get a structured report:

----

✅ 8 checks passed | ⚠️ 3 warnings | ❌ 2 issues

❌ Hardcoded API key at config.py:12 → Move to environment variable
❌ Hallucinated tool reference: execute_sql → Tool referenced but not defined
⚠️ Unbounded loop at agent/loop.py:45 → Add MAX_ITERATIONS constant

----

Install to your claude code:

npx skills add aurite-ai/agent-verifier -a claude-code

OR install for all coding agents:

npx skills add aurite-ai/agent-verifier --all

----

Happy to answer questions about how the agent-verifier works.

We have both:
- pattern-matched (reliable), and,
- heuristic (best-effort) tiers, and every finding is tagged so you know the confidence level.

----

Please share your feedback and would love contributors to expand the project!

u/Chance-Roll-2408 — 16 hours ago

▲ 3 r/aiagents

We're giving agents memory for everything... except video.

One pattern keeps showing up in the agents I've been building.

We spend a lot of time designing memory for text.

Conversation history.

RAG.

Long-term memory.

Knowledge graphs.

Tool outputs.

But videos usually get treated as if they only exist for the current session.

The agent watches a recording once, answers a few questions, and everything it learned disappears. The next session starts from scratch.

That felt like a strange design choice.

A video isn't fundamentally different from a document. It's just another source of information. Once you've extracted transcripts, OCR, visual observations, and timestamps, there's no obvious reason to throw that work away.

I started experimenting with treating video as persistent knowledge instead of temporary input.

The idea is simple:

- Analyze the video once.

- Store structured observations locally.

- Retrieve only the relevant evidence later.

- Let follow-up questions become retrieval instead of video processing.
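
As a rough illustration of the analyze-once, retrieve-later flow in the list above, here is a minimal sketch using SQLite from the Python standard library. It is not how Watch Skill is implemented: the table layout, the function names, and the plain keyword match are assumptions, and a real system would more likely use full-text search or embeddings.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a local store of per-video observations."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS observations (
               video_id TEXT, start_s REAL, kind TEXT, text TEXT)"""
    )
    return db


def store(db, video_id, observations):
    """Persist (start_seconds, kind, text) tuples produced by a one-time analysis."""
    db.executemany(
        "INSERT INTO observations VALUES (?, ?, ?, ?)",
        [(video_id, start_s, kind, text) for start_s, kind, text in observations],
    )
    db.commit()


def retrieve(db, video_id, keyword, limit=5):
    """Answer a follow-up question from stored evidence, not from the video."""
    rows = db.execute(
        "SELECT start_s, kind, text FROM observations "
        "WHERE video_id = ? AND text LIKE ? ORDER BY start_s LIMIT ?",
        (video_id, f"%{keyword}%", limit),
    )
    return rows.fetchall()
```

With this shape, a later session only calls `retrieve` and gets timestamped transcript or OCR snippets back, so the video itself is never reprocessed.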

I ended up turning the idea into an open-source project called Watch Skill, but I'm more interested in the design discussion than the implementation.

If you're building long-running agents, how are you handling video today?

Do you reprocess every recording, or are you persisting that information somewhere?

reddit.com

u/Fearless-Role-2707 — 13 hours ago

▲ 8 r/aiagents+3 crossposts

Agent with tiered working memory and cross-session learning — architecture, gaps, and what the research didn't cover

I've been building PRAANA — a coding agent with two systems I couldn't find combined in one self-contained binary: an Adaptive Context Engine (within-session) and Cognitive Memory (cross-session). Posting because the architectural decisions may be useful independent of the coding use case.

The core problem:

Every agent session is a context window management problem. Append-until-full plus reactive compaction is lossy — by the time you compact, you've already paid the drift cost, and you've lost track of which information was load-bearing.

PRAANA's ACE curates on every turn. A deterministic compiler assembles the prompt in 5 sections:

1. System Frame       — identity + tools
2. Memory Digest      — ranked cross-session learnings
3. Active State       — current work objects, full resolution
4. Peripheral Stubs   — everything inactive, one-line anchors
5. Recent Turns       — last N turns, budget-capped

State objects demote Active → Soft → Hard based on idle turns. Two-pass auto-hydration before each turn: substring keyword match, then BM25 for fuzzy overlap. Scores are density-weighted: decisions score 1.0, narrative scores 0.6, errors score 0.8. The compiler knows what kind of information is filling up, not just token count.
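
To make the Active → Soft → Hard demotion concrete, here is a minimal sketch of idle-turn tracking. The two thresholds and the function names are hypothetical (the post does not state the real values), and this is not PRAANA's code.

```python
# Hypothetical idle-turn thresholds; the real values are not given above.
SOFT_AFTER_IDLE_TURNS = 3
HARD_AFTER_IDLE_TURNS = 8


def tier_for(idle_turns: int) -> str:
    """Map how long a state object has been idle to its resolution tier."""
    if idle_turns >= HARD_AFTER_IDLE_TURNS:
        return "hard"
    if idle_turns >= SOFT_AFTER_IDLE_TURNS:
        return "soft"
    return "active"


def step(idle_by_object: dict, touched: set) -> dict:
    """Advance one turn: touched objects reset to 0, all others age by one."""
    return {
        name: 0 if name in touched else idle + 1
        for name, idle in idle_by_object.items()
    }
```

Calling `step` once per turn and `tier_for` at compile time is enough to decide which objects render at full resolution and which collapse to one-line stubs.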

Cognitive Memory:

At /exit, a summariser extracts structured learnings from the transcript. Six kinds: fact, preference, decision, pattern, mistake, constraint — domain-agnostic; coding-specific knowledge lives in content, not schema. Stored in SQLite with sqlite-vec + Transformers.js (in-process, 384-dim). Confidence decays 5%/day. Entries confirmed across two or more sessions promote to Consolidated Memory (10x slower decay). Ranked recall: cosine × confidence × recency × pin_boost.
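
The decay and ranking rules above can be written out as a short sketch. The 5%/day rate, the 10x slowdown, and the four-factor product come from the description. Compounding decay, the `recency` formula, and the `pin_boost` value are assumptions made for illustration, and the dictionary layout is hypothetical.

```python
import math

DAILY_DECAY = 0.05          # from the post: confidence decays 5%/day
CONSOLIDATED_SLOWDOWN = 10  # from the post: consolidated entries decay 10x slower


def decayed_confidence(confidence, days, consolidated=False):
    """Apply daily decay. Assumption: decay compounds multiplicatively."""
    rate = DAILY_DECAY / CONSOLIDATED_SLOWDOWN if consolidated else DAILY_DECAY
    return confidence * (1.0 - rate) ** days


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_score(query_vec, entry, days_since_use):
    """Ranked recall: cosine x confidence x recency x pin_boost."""
    recency = 1.0 / (1.0 + days_since_use)       # hypothetical recency curve
    pin_boost = 2.0 if entry["pinned"] else 1.0  # hypothetical boost value
    confidence = decayed_confidence(
        entry["confidence"], entry["age_days"], entry["consolidated"]
    )
    return cosine(query_vec, entry["vec"]) * confidence * recency * pin_boost
```

Under these assumptions an unconsolidated entry keeps about 60% of its confidence after 10 days (0.95^10), while a consolidated one keeps about 95% (0.995^10).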

Where the research fell short:

I surveyed 20+ agent-memory repos. What I found:

Mem0, LangChain, and most memory backends are retrieval systems. They store and recall but have no outcome-based feedback loops. No architecture for "this memory was used and confirmed, increase confidence" vs "this memory was contradicted, reduce it." Letta has the most interesting consolidation work (sleep-time agents) but it's a platform, not extractable, and consolidation is partial.

Nobody combined proactive context curation with learning memory in one self-contained process. The compression tools — Headroom, ACON — are SDK/proxy layers that sit between you and the LLM. They don't own agent state.

The gap I missed: the research covered storage architecture, not learning signal. The reinforcement path in PRAANA — boost confidence when a session succeeds, decay when contradicted — is wired but the session-success signal hasn't shipped yet (#162). I designed a complete feedback loop and then discovered the trigger was the hard part.

The larger plan:

Four systems — Adaptive Context, Cognitive Memory, Background Consolidation, Intelligent Router — all domain-agnostic. No system encodes anything about code. The coding agent is the proving ground; coding outcomes are measurable. Once Phase 1 validates the architecture, Phase 2 extracts the runtime as @praana/runtime. I'm not extracting it until the coding agent proves it works.

Gaps:

Reinforcement path dormant (#162). No A/B eval harness — scorecard ships, headless task runner is next, no published benchmark claims. Background Consolidation Processor schema exists, not scalable yet. Runtime extraction is Phase 2, not started.

GitHub: amitkumardubey/praana — MIT, TypeScript, Bun.

If you're working on agent memory or context management architecturally, I'd welcome the comparison. What are you seeing in production that the research repos didn't surface?

u/Reasonable_Craft_425 — 20 hours ago

▲ 11 r/aiagents+4 crossposts

TokenMizer - a local proxy for session checkpoint/resume and graph memory across Claude, GPT, and Ollama

I've been building TokenMizer, a local proxy that sits between your editor/CLI and whatever model you're using (Claude, GPT, Ollama) and handles two things I kept re-solving by hand: session checkpoint/resume, and a graph-based memory instead of a flat transcript.

The problem: once a long agent session hits the context limit, the usual fix is summarization, and summaries lose the reasoning behind a decision, not just the decision itself. I'd see a summary saying "switched to Argon2" with no trace of why bcrypt was rejected, so the agent would re-litigate the same tradeoff two sessions later. Flat transcripts have the opposite problem: everything is kept, but nothing is prioritized, so retrieval is just recency-biased keyword luck.

What TokenMizer does differently: instead of one growing text blob, decisions, constraints, and open questions are stored as nodes with edges (this decision depends on that constraint, this question was resolved by that decision). Checkpointing snapshots that graph plus a resumable session state, so you can kill a session and pick it back up without replaying the whole history through the model again.
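
To show the shape of the idea (not TokenMizer's actual code), here is a minimal sketch of a decision graph with JSON checkpointing. The `DecisionGraph` class, its method names, and the relation labels are hypothetical.

```python
import json


class DecisionGraph:
    """Decisions, constraints, and questions as nodes; typed edges between them."""

    def __init__(self):
        self.nodes = {}  # node_id -> {"kind": ..., "text": ...}
        self.edges = []  # (src_id, relation, dst_id)

    def add_node(self, node_id, kind, text):
        self.nodes[node_id] = {"kind": kind, "text": text}

    def add_edge(self, src, relation, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.append((src, relation, dst))

    def why(self, node_id):
        """Return what a node points at, i.e. the reasoning a summary drops."""
        return [
            (relation, self.nodes[dst]["text"])
            for src, relation, dst in self.edges
            if src == node_id
        ]

    def checkpoint(self):
        """Serialize the whole graph so a killed session can be resumed."""
        return json.dumps({"nodes": self.nodes, "edges": self.edges})

    @classmethod
    def resume(cls, blob):
        data = json.loads(blob)
        graph = cls()
        graph.nodes = data["nodes"]
        graph.edges = [tuple(edge) for edge in data["edges"]]
        return graph
```

Restoring from the checkpoint brings back the edges as well as the nodes, so asking `why("d1")` after a resume still returns the linked constraint without replaying the transcript through the model.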

Where it's rough: there's no eval harness yet comparing retrieval quality against a naive flat-transcript baseline, so right now my evidence is anecdotal (my own sessions), not benchmarked. I also learned the hard way that benchmarking your own memory system by asking it questions only it can answer is circular, so I'm holding off on publishing numbers until I have an honest comparison.

Repo: github.com/Shweta-Mishra-ai/tokenmizer (I'm the author). It's a Python project, MIT licensed. If you've hit the same summarization-loses-reasoning problem, I'd be interested in how you're handling it, and PRs/issues on the eval-harness gap would genuinely help.

u/Feisty-Cranberry2902 — 22 hours ago

▲ 4 r/aiagents

Looking for an alternative to Antigravity IDE for modular code (~$20)

Hey, I'm looking for an alternative to Antigravity IDE. Before the update, it was working reasonably well for agentic workflows using Gemini CLI and Antigravity. After the update and the limitations, I'm thinking about an alternative.

Maybe I'm using it wrong, but I mainly use it for code split into modules, and the chat versions have file pasting limits that make it impossible to paste it into the chat. I also use it for simple web apps.

What alternative or more effective workflow for these kinds of problems would you recommend at this price point (around $20)? Currently, I'm using Google AI Pro.

reddit.com

u/Full_Bother_319 — 1 day ago

▲ 1 r/aiagents

I need to validate this idea gimme your honest feedback

I’m validating an idea for an open-source, local-first AI daemon called UmbraOS.

We all saw the privacy nightmare that was Windows Recall. Apps like Rewind or Limitless try to fix it, but they either charge a heavy monthly subscription or siphon your data to central servers.

UmbraOS is an architecture designed to give you a 100% private digital twin/second brain that runs locally on your machine, syncs P2P with your phone, and offloads heavy reasoning to the cloud safely and at zero cost.

🛠️ How it Works & Key Architecture Points:

Proactive "Blind-Spot" Privacy: Traditional screen-loggers capture everything and censor it later. UmbraOS inspects active OS window titles and foreground processes at 2Hz. If it detects a blacklisted app (WhatsApp, Signal, banking tabs, password managers, browser incognito windows), the loop immediately short-circuits. The frame is never captured in memory or saved to disk.
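
A minimal sketch of that short-circuit, assuming the foreground window lookup and the frame grabber are passed in as functions (both are hypothetical stand-ins for OS-specific calls). The blocklist terms are illustrative only, and plain substring matching will over-block, for example on any title that merely contains the word "signal".

```python
import time

# Illustrative blocklist terms; a real list would be user-configurable.
BLOCKLIST = ("whatsapp", "signal", "incognito", "password")


def should_capture(window_title: str, process_name: str) -> bool:
    """Decide before capture, from window metadata alone."""
    haystack = f"{window_title} {process_name}".lower()
    return not any(term in haystack for term in BLOCKLIST)


def capture_loop(get_foreground, grab_frame, ticks, hz=2.0, sleep=time.sleep):
    """Poll at `hz`; grab_frame is never called while a blocked app is focused."""
    frames = []
    for _ in range(ticks):
        title, process = get_foreground()
        if should_capture(title, process):
            frames.append(grab_frame())
        sleep(1.0 / hz)
    return frames
```

The point of the ordering is that the check runs on metadata first, so for a blocked window there is no frame to censor later.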

Zero-VRAM Hybrid Cloud Pipeline: Instead of hogging your GPU with heavy local LLMs or sending raw video to Big Tech, Umbra OS processes inputs locally. It uses lightweight, local OCR (EasyOCR) and local embedding models (all-MiniLM-L6-v2) to index text into a local vector database (Qdrant).

Intelligent Cloud Bursting: For complex reasoning, it packages only relevant text snippets from your local DB and hits free/low-cost cloud frontier APIs (like GLM-5.2 or DeepSeek via OpenRouter). To bypass free-tier rate limits, it uses local SQLite-backed batching and exponential backoff retry queues.
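
The exponential backoff part can be sketched as follows. This is a generic pattern, not UmbraOS code: the `RateLimited` exception and the function names are hypothetical, and the SQLite-backed batching queue is left out.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised when the cloud API rejects a call for rate limits."""


def backoff_delays(retries, base=1.0, cap=60.0):
    """Delays double each retry and are capped: base, 2*base, 4*base, ..."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_backoff(fn, retries=5, base=1.0, cap=60.0,
                      sleep=time.sleep, jitter=random.random):
    """Call fn, sleeping with jittered exponential backoff after each rate limit."""
    for delay in backoff_delays(retries, base, cap):
        try:
            return fn()
        except RateLimited:
            # Jitter scales the delay into [0.5, 1.0) of its nominal value.
            sleep(delay * (0.5 + 0.5 * jitter()))
    return fn()  # final attempt; a RateLimited here propagates to the caller
```

Passing `sleep` and `jitter` as parameters keeps the retry logic testable without real waiting.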

P2P Cross-Device Sync (No Central Servers): Your PC acts as your private server. It exposes a local FastAPI bridge. When your phone is on the same Wi-Fi (or connected remotely via a secure P2P mesh VPN like Tailscale), it securely syncs the vector brain. You can query your desktop's history from your phone completely offline.

Active Agent Capabilities: It doesn't just watch; it executes. By translating voice commands from your phone via local Whisper, it can trigger system scripts on your PC, interact with your local file system, or use official APIs to send emails/messages. It even uses Wake-on-LAN/WAN protocols to "wake up" your sleeping PC from your phone while you are out.

reddit.com

u/200IQ2012 — 1 day ago

▲ 3 r/aiagents

What are the best Zenity alternatives for agentic security rn?

Zenity comes up a lot in enterprise AI security discussions, especially around governance and visibility. But some feedback and reviews suggest that teams still want deeper runtime protection once agents are actively interacting across systems in production.

A recurring concern seems to be visibility into live agent behavior and maintaining control beyond the policy layer as environments become more complex. That seems to be where the market is starting to split. Some platforms are more posture-management focused, while others are pushing toward real-time monitoring, behavioral analysis, and infrastructure-level control for AI agents.

The other challenge is that enterprise requirements seem very different from smaller deployments. A company running customer-facing AI systems in banking, telecom, or airlines probably cares less about basic prompt filtering and more about things like observability across agent workflows. Also multi-turn attack detection, low latency at scale, governance and auditability, and deployment flexibility (private cloud/on-prem).

The names I see most often besides Zenity are NeuralTrust, Palo Alto Networks, and CrowdStrike, but they seem to approach the problem from very different angles.

NeuralTrust appears focused on agentic AI security, runtime governance, and observability, while Palo Alto Networks and CrowdStrike are extending broader enterprise security capabilities into AI environments.

Wondering how people here are prioritizing when looking at Zenity alternatives for agentic security.

reddit.com

u/NewZealandTemp — 1 day ago

▲ 0 r/aiagents

Testers wanted for custom Agent Harness

Hi everyone,

I've been working on an Agentic Harness wrapper for a few months now and I recently finished a large update so I'm looking for some 3rd party testing and feedback.

The harness is designed to simulate human-like learning through practice and repetition. Experiences are saved and converted into beliefs if applicable. Instead of markdown files, beliefs are stored like memories, as individual statements. Incoming information (messages, tool returns, read files, etc.) is routed through a belief and memory search using FAISS to establish keyword semantic anchor points, then nearby beliefs are pulled based on temporal and structural (relational) metadata. The highest-relevance beliefs are summarized (if too long) or directly woven into the incoming text at the keyword anchor, using a static tag *()* to indicate an internal belief. The recent update adds a tool-skills pipeline that pulls out propositional beliefs about specific tools and appends the most relied-on tool-related beliefs directly to the tool description in the schema.

The main purpose is to allow the agent to perform tasks without relying on subagents. Instead of a large static system prompt the agent receives an ongoing injection of directly relevant memories/beliefs that allow it to adapt and problem solve in real time based on past experiences.

Please check it out and reach out to me with any issues (the setup wizard is new as well) or feedback!

Thanks in advance

https://github.com/munch2u-a11y/Helix-AGI.git

reddit.com

u/LowDistribution3995 — 2 days ago

▲ 8 r/aiagents

I built a free tool to improve context switching

So, I usually use ChatGPT for planning and brainstorming and then I move to Codex once I feel like I have fleshed out my idea enough.

However, there was no easy way to export my conversation in ChatGPT to Codex.

So I built this simple utility that allows you to simply export the whole convo using the share link you get from ChatGPT.

Just copy the share link, paste it into the tool, click convert, wait a few seconds (can take a bit longer with large conversations), and then you can download the convo as an .md or .json file, or just one-click copy the whole convo.

Check out Chat Exporter and lmk what you think!

reddit.com

u/Rooster_Odd — 2 days ago

▲ 30 r/aiagents+3 crossposts

I built a fully automated AI video generation & Instagram publishing pipeline in n8n using Gemini and Veo 3. Here’s how it works.

Hey everyone,

I wanted to share a look at an autonomous content engine I’ve been fine-tuning recently. The goal was to build a system that handles everything from ideation and video rendering to final asset management and social media publishing without any manual intervention.

I’ve attached the full canvas architecture in image_48bbaa.png. Here is how the technical pipeline handles the heavy lifting:

⚙️ How It Works:
Structured Ideation: A Schedule Trigger fires up a Google Gemini Chat Model node. I’m utilizing a Structured Output Parser here to ensure the AI output strictly adheres to a predictable JSON schema (captions, hashtags, visual prompt data) so it never breaks the down-funnel nodes.

Async Video Generation (Google Veo 3): The visual prompts are sent via HTTP requests directly to the Google Veo API. Because video generation takes time, the workflow passes through a conditional check (If node) and a Wait loop to poll the endpoint until the asset rendering is complete.

Data Sanitization & Storage: A custom JavaScript node cleans up the API response. The video is downloaded, pushed to Google Drive, and permission-shared automatically to create a clean, accessible URL for the social platforms.

Meta API Publishing: The final asset URL and Gemini-generated caption are sent to the Instagram Graph API (INSTA node). It pauses momentarily via a Wait step to let the platform finish processing the media container before triggering the final container publish step. Everything is logged in a Google Sheet at both the start and finish for auditing.

🛠️ Key Takeaways from Building This:
Handling Async APIs: When dealing with heavy GenAI video models like Veo, robust webhook polling or carefully configured wait-loops are essential to prevent workflow timeouts.

Strict Schemas are Life: If you don't parse your LLM outputs structurally, minor formatting variations in captions or hashtags will crash your downstream HTTP requests.
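
For anyone who wants the wait-loop idea outside n8n, here is a minimal Python sketch of bounded polling. It is not the workflow's actual configuration: `check_status` is a hypothetical callable standing in for the HTTP request to the generation endpoint, and the `done` field is an assumption about the response shape.

```python
import time


def poll_until_done(check_status, interval_s=10.0, max_attempts=30, sleep=time.sleep):
    """Poll an async job until it reports done, or give up after max_attempts."""
    for _ in range(max_attempts):
        status = check_status()
        if status.get("done"):
            return status
        sleep(interval_s)  # plays the role of the Wait node between checks
    raise TimeoutError(f"job not finished after {max_attempts} checks")
```

Capping the number of checks matters as much as the wait itself: without it, a render that never completes keeps the workflow polling forever.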

Happy to answer any questions about the node configurations, the Meta API payload structure, or working with Veo endpoints! Let me know what you think.

u/Mohd_Hamid — 2 days ago

▲ 12 r/aiagents+6 crossposts

Great AI tool for retail investors

Tracking every recommendation my AI pipeline makes — here's the current win rate across sectors

Been running ProspectAI autonomously across multiple sectors.

Here's what's in positive territory right now:

UTILITIES
• D — rec. May 17 | entry $61.73 → now $68.24 (+10.55%) ✅
• CEG — rec. May 17 | entry $267.20 → now $286.94 (+7.39%) ✅

HEALTHCARE
• LLY — rec. May 1 | entry $963.33 → now $1,043.26 (+8.30%) ✅

CONSUMER DISCRETIONARY
• MAR — rec. May 13 | entry $350.23 → now $370.22 (+5.71%) ✅

SEMICONDUCTORS
• AMD — rec. May 19 | entry $420.99 → now $444.73 (+5.64%) ✅

Every entry zone and trigger price was generated autonomously by the pipeline — no manual intervention.

The pipeline runs: Reddit sentiment → Technical analysis → Fundamental analysis → Adversarial critic → Final strategy.

All recommendations tracked live 👇
https://prospect-ai.moisesprat.dev

u/Downtown_Extension_6 — 2 days ago

▲ 68 r/aiagents+3 crossposts

An agent runtime with persistent memory that fans work out across multiple models.

Hey! Finally releasing code I've put the past 4-5 months of my life into, I had an idea and wanted to fix some things that really irritated me with LLMs. Aimee runs agents that actually remember. Self-hosted, your keys. No subscriptions, no costs, purely open source. First public beta release, but the results have already exceeded my expectations.

- Persistent searchable memory across runs. No starting from zero. Shared across all agents, models, and users.

- Delegates bounded sub-tasks to multiple model backends in parallel, each with a role and persona. Use local LLMs, subscriptions, or API keys.

- Indexes your codebase, records past decisions, and curates all associated documents so agents have real context and a knowledgebase of past decisions, not just a prompt.

- Exposes OpenAI/Anthropic-compatible APIs, so Claude Code, Codex, or your own orchestrator can drive it. You can also do the inverse, and run any model you have hooked up to aimee as your model for Claude Code, Codex, etc.

- Switch models, TUIs, etc. at any time, and keep your decisions, knowledge, and other information!

- Works with anything that can use MCP, plugins, web APIs, or ACP.

Built for people tired of stateless one-shot agents. Try it out: https://github.com/RakuenSoftware/aimee

u/KitchenAmoeba4438 — 3 days ago

▲ 397 r/aiagents+34 crossposts

browser-search — three tools, zero cost, and your AI agent learns to search and browse the web

/r/Hermes/comments/1uclwgi/browsersearch_three_tools_zero_cost_and_your_ai/

u/Ill-Tradition1362 — 4 days ago
