r/AI_Agents

I built an open-source team of AI agents that finds the jobs that actually fit you — not a mass-apply bot. Looking for feedback + contributors

I've been building Job Hunter Team — a team of autonomous AI agents that runs a job search for you. You set the direction; they comb the job boards around the clock, read each posting, score how well it fits your profile (0–100), and draft a tailored CV + cover letter for the ones worth applying to. Fewer applications, but targeted — the final "send" is always your call.

Why I built it. I was job-hunting in early 2026 and most applications got no reply. I wired a few LLM agents together to do the tedious half of the search; in two weeks it analyzed ~200 openings, prepared ~20 tailored applications, and got me 5 interviews. It worked well enough that I rebuilt it properly for anyone.

It's deliberately not a mass-apply bot. The market is already an arms race — too many generic applications, so employers filter with AI, so everyone gets less attention. This bets the opposite way: find the right match and help you adapt what you offer to what the market wants.

The hard part was keeping it affordable. The team monitors its own budget and paces itself to run for a whole month without burning through it. One real month-long run: 658 positions found, 307 scoring 70+ (avg 71/100), across 24 countries, with no human steering (numbers + charts are in the repo).
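
A minimal sketch of how this kind of pacing could work, assuming a fixed budget for the run and a daily allowance recomputed each morning from whatever is left. The `BudgetPacer` class, its fields, and the numbers in the usage note are hypothetical and not taken from the repo.

```python
from dataclasses import dataclass


@dataclass
class BudgetPacer:
    """Hypothetical pacer: spreads the remaining budget over the remaining days."""
    total_budget: float           # total spend allowed for the whole run
    days_in_run: int              # planned length of the run in days
    spent: float = 0.0            # everything spent so far
    today_allowance: float = 0.0  # cap for the current day
    spent_today: float = 0.0      # spend recorded since start_day()

    def start_day(self, day: int) -> None:
        """Fix today's cap: remaining money divided by remaining days (day is 0-based)."""
        remaining_days = max(self.days_in_run - day, 1)
        remaining_money = max(self.total_budget - self.spent, 0.0)
        self.today_allowance = remaining_money / remaining_days
        self.spent_today = 0.0

    def try_spend(self, cost: float) -> bool:
        """Record the cost and return True only if it fits under today's cap."""
        if self.spent_today + cost > self.today_allowance:
            return False  # the caller should pause agents until the next day
        self.spent_today += cost
        self.spent += cost
        return True
```

With a budget of 30 over 30 days, the first day's cap is 1.0; money left unspent on a quiet day rolls forward and raises the cap on later days.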

Tech stack: Node.js + TypeScript (CLI + orchestration), Python (budget monitoring + provider glue), agents running on Claude Code / Codex / Kimi CLIs with tmux + SQLite for shared state, a Next.js + Supabase web dashboard, and an Electron desktop app — all in a single Docker container so your machine stays clean.

It's still early and, honestly, CLI-first for now (a desktop app for non-technical users is the biggest open piece). It's MIT open source, and I'm looking for feedback, contributors, and beta testers. The thing I most want to crack: running it on fully local models so it costs only electricity — finding work shouldn't be gated by who can afford AI.

reddit.com
u/Ambitious-Scholar501 — 3 hours ago
▲ 162 r/AI_Agents+1 crossposts

Fable 5 is now back! Here are some of the prompts you should run until the usage window closes:


  1. Review all the code that was written after the date Fable was banned. Look for optimizations and improvements you can make

  2. Walk through every major user path in the app you're building using browser control. Write a report on where users can get confused and what I can do to improve the UX

  3. Make a checklist of every task you do from now until tonight. Then feed that checklist to Fable and ask what it could automate for you

  4. Feed it all your goals, ambitions, interests, skillsets, and assets. Ask what simple businesses it could help build for you over the next few weeks so you can make your first dollar online

  5. Connect to the X MCP and find 5 extremely helpful use cases other people are using Fable for that would be relevant to your workflows

  6. Connect it to the X MCP. Have it read your last 100 posts. Come up with 5 SaaS ideas you could build

  7. /loop it every 24 hours to do a security check on all your API endpoints in your existing apps

  8. Use the Unreal 5.8 MCP to build incredible, in-depth, 3D games

  9. Go to Sonnet 5 and ask based on what it knows about you, what would be some incredible prompts you can give to Fable tonight

Leverage Fable 5 to the fullest 🤖

reddit.com
u/edamasah — 9 hours ago
▲ 3 r/AI_Agents+1 crossposts

There are so many AI tools now that choosing one has become a problem

I've noticed I'm spending way too much time deciding which AI to use instead of actually using AI. ChatGPT, Claude, Gemini, Perplexity, Veo, Midjourney... Everyone says they're amazing, but for a specific task it's often unclear which one is actually the best. So I built a small side project called AIscout. You describe what you want to do, and it recommends the best AI tools for that task. For larger projects, it can also suggest a simple workflow. Still very much an MVP, but I've found myself using it surprisingly often.

Curious what you think — is choosing the right AI becoming a problem for anyone else?

reddit.com
u/FaceConnect2177 — 2 hours ago

Is the causal chain of the process as important as the outcome?

In agentic systems, is the process just as valuable as the outcome? We obsess over 'what happened,' but should we care more about the 'why'? When does causality outweigh the event itself and, crucially, are there any memory architectures that store the causal thread, not just the raw output?

reddit.com
u/Careful_Scarcity_678 — 4 hours ago

Help me decide the best AI for me real quick

So I am looking for kind of a "MENTOR" that will help me transform my life. I obviously have a goal and the things I want to change, improve and learn, from languages to general skills to career skills. I have been massively confused about which AI to treat as primary between Claude, GPT and Gemini, which is just fueling my excuse for procrastinating instead of actually starting. I am particularly in love with Claude, but its usage limits break my heart and workflow too often. I don't want to wait on my "mentor" to refresh its usage limits. ChatGPT therefore seemed like the natural go-to, but then I got Gemini Pro for free through my college ID and am back to my usual procrastination state. I do have ChatGPT Go, but I already use that for my college projects and want to start from a clean slate for this. I am not in the Google ecosystem at all, apart from the parts we unknowingly step into like YouTube, so that makes my decision even harder as it negates the best quality of Gemini.

I want the chatbot to be able to process and preferably even generate documents like Excel sheets, notes, Word documents etc. For example, I have already built an in-depth training plan and I want my primary to fully understand it and give me better ways to track it. I would also like integrations with a lot of third-party applications just to make my workflow easier.

If you are still here, then thank you and plz bless me with your knowledge. Need advice from heavy AI users.

So to recap, I want a "life guru"(it's not as pathetic as it sounds, I promise). I have Gemini Pro, Claude & ChatGPT free versions.

reddit.com
u/Backfromth3re — 6 hours ago

Git for agents with ephemeral runtime (open source!)


Hi, I'm officially sharing the initial, open source release of drun: an MCP that allows you to virtualize components of your host into an ephemeral runtime to serve as the agent's workspace with git-like primitives which allow the agent to explore trajectories in parallel and discard dead-ends without disrupting the host state.

The drun engine surfaces a runtime abstraction layer with reliability harnesses to guardrail the agent's behavior across a range of OS-level aspects:

* Network domains (e.g. allowlisted domains)
* Command execution (e.g. forbidden commands)
* Access to filesystem paths (e.g. restrict filesystem access)
* Resource limits (e.g. memory and duration caps)

Rather than granting your agent raw CRUD access to your host, drun exposes and enforces a highly-customizable policy layer with deterministic knobs for you to place absolute limits that can't be breached by design.
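
To make the idea of deterministic policy checks concrete, here is a minimal sketch of the decision logic for the four aspects listed above. It is not drun's API: the `Policy` class, its field names, and the helper functions are hypothetical, and a real enforcement layer would sit at the OS or runtime level instead of in plain Python.

```python
import posixpath
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy object; field names are illustrative only."""
    allowed_domains: frozenset     # network allowlist
    forbidden_commands: frozenset  # executables the agent may never run
    allowed_roots: tuple           # filesystem subtrees the agent may touch
    max_seconds: float             # duration cap per command
    max_memory_mb: int             # memory cap per command


def domain_allowed(policy: Policy, url: str) -> bool:
    """Allow a URL only if its host is an allowlisted domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in policy.allowed_domains)


def command_allowed(policy: Policy, argv: list) -> bool:
    """Reject empty commands and any executable whose basename is forbidden."""
    return bool(argv) and posixpath.basename(argv[0]) not in policy.forbidden_commands


def path_allowed(policy: Policy, path: str) -> bool:
    """Normalise first so '..' segments cannot escape an allowed root."""
    norm = posixpath.normpath(path)
    return any(norm == root or norm.startswith(root.rstrip("/") + "/")
               for root in policy.allowed_roots)


def within_limits(policy: Policy, seconds: float, memory_mb: int) -> bool:
    """Check measured usage against the duration and memory caps."""
    return seconds <= policy.max_seconds and memory_mb <= policy.max_memory_mb
```

Each check is a pure function of the policy and the request, so the same input always gives the same verdict. This sketch does not resolve symlinks, which a real filesystem guard would need to handle.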

I'm releasing it fully open source and I'm hoping to create a community around it to hillclimb quality and feature richness.

Any feedback and/or contributions are greatly appreciated. Please file bugs against the repository if you run into any broken code paths. I'd be more than happy to look into it!

All the best

reddit.com
u/buy-d-dip — 4 hours ago

Thought my agent was doing fine until I found it down a rabbit hole: it was confidently wrong about a decision we had changed a few weeks earlier. So I built a memory layer myself (open source). Tell me what you think

The thing that finally broke me wasn't my agent forgetting stuff. Forgetting is annoying, but it announces itself: the agent asks again, you sigh, you re-explain.

What broke me was the silent version: my agent confidently re-proposed an approach we'd tried and abandoned a month earlier. Another time it planned against a decision we'd replaced two weeks before; the old decision was still sitting in its notes, looking exactly as authoritative as the new one. Nothing failed loudly. It just quietly burned the hours again.

Generic memory fixes forgetting. Nothing I tried fixes being confidently wrong about the past, because that's not a recall problem, it's a status problem. "We ruled this out," "this was replaced," "this is still unverified": that's not what a similarity search returns. So I spent 3 months building the other half:

NodeDex: a local graph of your project's reasoning, built automatically from your agent's conversations by a background pipeline (the agent never has to remember to save):

- dead-ends are first-class: an enumerable list of what was tried and abandoned, with the why; the agent is taught to check it BEFORE proposing

- decisions carry their why + the alternatives that lost

- when something gets replaced, nothing is deleted: a `supersedes` edge points old truth → current truth, so the agent can't mistake stale for current

To be clear about what it's NOT: it doesn't replace Claude's native memory or your fact store; those remember *notes and preferences*, and they're good at it. This is a different job (the project's decision history). Run both.

You can poke it in 60 seconds, no API key:

```
npx nodedex demo
```

That serves a small sample project graph over MCP. Point Claude (or any MCP agent) at it and ask: *"Is 'keep the counters in Redis' still the current decision?"* Then watch it follow the `supersedes` edge and answer with the replacement instead of the stale one. That moment is the whole product.
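
For readers who want the mechanics, here is a minimal sketch of following `supersedes` edges to the current decision. It is not NodeDex's schema or code: the `DecisionGraph` class is hypothetical, and the "Postgres" replacement is invented purely for illustration.

```python
from typing import Dict, Optional


class DecisionGraph:
    """Hypothetical in-memory graph: nothing is deleted, old nodes point forward."""

    def __init__(self) -> None:
        self.text: Dict[str, str] = {}           # node id -> decision text
        self.superseded_by: Dict[str, str] = {}  # old node id -> newer node id

    def add(self, node_id: str, text: str, supersedes: Optional[str] = None) -> None:
        """Store a decision; optionally mark an older decision as replaced by it."""
        self.text[node_id] = text
        if supersedes is not None:
            self.superseded_by[supersedes] = node_id

    def is_current(self, node_id: str) -> bool:
        """A decision is current if nothing has superseded it."""
        return node_id not in self.superseded_by

    def current(self, node_id: str) -> str:
        """Walk supersedes edges until reaching a node that was never replaced."""
        seen = set()
        while node_id in self.superseded_by and node_id not in seen:
            seen.add(node_id)  # guard against accidental cycles
            node_id = self.superseded_by[node_id]
        return node_id


graph = DecisionGraph()
graph.add("d1", "Keep the counters in Redis")
graph.add("d2", "Keep the counters in Postgres", supersedes="d1")  # invented example
print(graph.is_current("d1"), graph.text[graph.current("d1")])
```

The stale decision stays queryable, but any lookup that starts from it ends at the replacement.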

Honest limits, before you find them:

- the dead-end check is a strong nudge (server instructions + a skill), not a hard block; a pre-generation hook gate is on the roadmap

- extraction needs a smart, big-context model (Gemini Flash-Lite ≈ half a cent per session; my 12B local test *understood* everything but failed the strict structured passes; the floor is ~27-30B local with real 16k+ context)

- it's early and solo-built (1196 tests pass, but it's been on npm for three days)

Local SQLite, AGPL, graph never leaves your machine. Repo: [link]

I'd love for people to break it, especially: does your agent actually check the dead ends unprompted in your setup, or does it need the nudge? That's the question I most need real-world answers to.

reddit.com
u/Careful_Scarcity_678 — 6 hours ago
▲ 2 r/AI_Agents+1 crossposts

Built a RAG field manual page while studying for Azure AI-103, in case anyone else is preparing for it (.NET-flavored)

I was studying RAG for Microsoft's AI-103 cert and kept losing track of how all the pieces actually connect: ingestion, chunking, embeddings, vector DB, retrieval, augmentation, generation, the agent, eval. Most of the reference material out there is written from a Python/LangChain angle, and I'm coming at this from the .NET/Azure world, and I like visuals/structured learning.

It's a static HTML page covering the whole pipeline end to end, with a free/open-source alternative called out at each step, not just the usual Pinecone/OpenAI defaults.

Mainly built it to learn, but figured it might be useful to anyone else piecing RAG together, especially if you're coming from a non-Python background. Feedback welcome, especially if I got a tradeoff wrong somewhere.

reddit.com
u/sakamoto_hoto — 7 hours ago

I open-sourced Aletheia - an agent loop for investigating questions without a clear verifier.

Most agent loops work best when the result can be checked: code compiles, tests pass, the task is done.

I wanted to explore what a loop should look like when the answer cannot be verified that way.

Questions like:

  • Is this vendor’s claimed traction credible?
  • Is this company financially healthy?
  • Does a science headline match what the study found?

For these, every search result is only a partial and potentially misleading clue.

So I built Aletheia - The Uncertainty Loop Agent.

Its loop is:

belief → act → observe → update

It keeps an explicit view of what may be true, chooses the next search for its ability to change that view, and allows contradictory evidence to lower its confidence.
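
One way to picture that loop is a simple Bayesian update over a single claim. This is a sketch under stated assumptions, not Aletheia's implementation: the `Probe` class, the likelihood numbers a caller would supply, and the "expected shift" heuristic for choosing the next search are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Probe:
    """Hypothetical search action with assumed hit rates under each hypothesis."""
    name: str
    p_hit_if_true: float   # P(evidence found | claim is true)
    p_hit_if_false: float  # P(evidence found | claim is false)


def update(belief: float, probe: Probe, hit: bool) -> float:
    """Bayes' rule: contradicting evidence lowers the belief, supporting raises it."""
    like_true = probe.p_hit_if_true if hit else 1 - probe.p_hit_if_true
    like_false = probe.p_hit_if_false if hit else 1 - probe.p_hit_if_false
    numerator = belief * like_true
    denominator = numerator + (1 - belief) * like_false
    return numerator / denominator if denominator > 0 else belief


def expected_shift(belief: float, probe: Probe) -> float:
    """Average absolute change in belief this probe is expected to cause."""
    p_hit = belief * probe.p_hit_if_true + (1 - belief) * probe.p_hit_if_false
    return (p_hit * abs(update(belief, probe, True) - belief)
            + (1 - p_hit) * abs(update(belief, probe, False) - belief))


def choose_probe(belief: float, probes: List[Probe]) -> Probe:
    """Act: pick the search most likely to change the current view."""
    return max(probes, key=lambda p: expected_shift(belief, p))


def investigate(belief: float, probes: List[Probe],
                observe: Callable[[Probe], bool], max_steps: int = 3) -> float:
    """belief -> act -> observe -> update, repeated for a fixed number of steps."""
    for _ in range(max_steps):
        probe = choose_probe(belief, probes)
        belief = update(belief, probe, observe(probe))
    return belief
```

A probe whose hit rate is the same whether the claim is true or false has an expected shift of zero, so it is never preferred over an informative one. Real evidence is rarely independent across searches, which this sketch ignores.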

Its first working application is an open-source investigator that returns a verdict with evidence, conflicting signals, stated confidence, and unresolved unknowns. It can also stop without forcing a conclusion when the evidence has not earned one.

Aletheia currently ships for company and vendor diligence, but the loop is domain-neutral: it can be adapted wherever truth is hidden and evidence is incomplete, noisy, or contested.

Aletheia currently runs with Claude Code and OpenAI Codex. The skill, traces, tests, and optional tuning cycle remain local; web investigation uses the harness’s search capabilities.

This is an attempt to explore an emerging side of loop engineering, not a claim that the problem is solved. Source quality and real-world calibration are still difficult, and correlated evidence can fool any system.

reddit.com
u/FoxBig8401 — 6 hours ago

Six months of hard work disappeared overnight

Six months ago, I started my freelancing journey on Fiverr with my very first order worth just $10. In the beginning, I was not focused on making a lot of money. My goal was simple. I wanted to deliver great work, earn 5 star reviews, build trust, and slowly increase my prices. That strategy worked better than I expected. Within four months, I became a Level 1 freelancer, and after six months, I reached Level 2. I was incredibly proud and excited. Almost every review I received was 5 stars. I had only one 3 star review. As my profile grew, I started closing projects worth $300 to $1,200. I was providing AI agent and automation services, and at one point I was getting 7 orders in a single day. It honestly felt unreal.

As I started working with bigger clients, many of them wanted to schedule a call. Without me asking, they would send their phone number, WhatsApp, or email address directly in the Fiverr chat. I knew Fiverr has strict rules about communication outside the platform. On February 2, 2026, I received my first warning for off platform activities. From that day, I became extremely careful. Whenever a client asked for a call or Google Meet, I always replied that we should keep everything inside Fiverr. I never shared my phone number, email address, PayPal, or any payment link. I understood that Fiverr monitors chats closely, and I respected that because it helps keep the platform safe.

Then on July 2, 2026, I received my second warning. I was genuinely confused because I had never tried to move any client outside Fiverr. Then came the hardest moment. On July 4, which was my birthday, I logged into my account and saw that it had been permanently banned. I cannot describe how painful that moment was. I had spent six months building my profile from scratch, earned amazing reviews, reached Level 2, and built a business that I was truly proud of. Seeing everything disappear overnight was heartbreaking.

I'm sharing this because I love this community, and I hope my experience helps other freelancers. If you're working with high value clients on Fiverr, please be extra careful. Now I'm trying to figure out what to do next. Any advice would mean a lot.

reddit.com
u/Meris-Dabhi — 16 hours ago

I tested AST-backed context graphs for coding agents; here is what changed

I have been experimenting with a local-first context service for coding agents that builds a repo graph from AST/LSP-style facts instead of making the agent start with broad file search.

The useful pattern so far:

  • index files, symbols, imports, calls, definitions, containment, and dependency edges
  • let the agent query the relevant subgraph first
  • expand to raw files, search, or LSP only when evidence is weak
  • measure not only token count, but also whether the retrieved context would increase hallucination risk

In one benchmark pass, graph context used about 90% fewer input tokens than broad snippets while keeping the answer grounded enough for the tested tasks. The important caveat is that graph-first cannot mean graph-only. If retrieval is too narrow, the agent has to fall back to source reads and validation.
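
A toy version of the graph-first pattern, using Python's standard `ast` module on a single source string. This is only a sketch: it indexes top-level functions and direct name calls, whereas the service described above also tracks imports, definitions, containment, and dependency edges. The function names and `SAMPLE` source are made up for illustration.

```python
import ast
from typing import Dict, Set

SAMPLE = '''
def load(path):
    return open(path).read()

def parse(path):
    return load(path).split()

def main():
    return parse("data.txt")
'''


def build_call_graph(source: str) -> Dict[str, Set[str]]:
    """Map each top-level function to the plain names it calls."""
    graph: Dict[str, Set[str]] = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            graph[node.name] = {
                inner.func.id
                for inner in ast.walk(node)
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name)
            }
    return graph


def subgraph(graph: Dict[str, Set[str]], start: str, depth: int = 1) -> Set[str]:
    """Symbols reachable from `start` within `depth` call edges.

    An empty result means the graph has no evidence for this symbol, which is
    the signal to fall back to raw file reads or search.
    """
    if start not in graph:
        return set()
    seen = {start}
    frontier = {start}
    for _ in range(depth):
        frontier = {c for f in frontier for c in graph.get(f, ()) if c not in seen}
        seen |= frontier
    return seen


graph = build_call_graph(SAMPLE)
context = subgraph(graph, "main", depth=2)  # only these symbols go to the agent
```

Method calls such as `.read()` are skipped here because they are attribute calls, which is exactly the kind of gap that makes a fallback to source reads necessary.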

I'm curious how others are handling this for coding agents: do you prefer LSP-first retrieval, embedding/RAG retrieval, graph retrieval, or a hybrid?

reddit.com
u/Remarkable-One9371 — 13 hours ago

"ResourceExhausted: Worker local total request limit reached (X/32)"

Sup guys, I started using OpenCode with the Nvidia NIM API a few days ago.

I tried a few models, but some returned an error saying they had reached end of life. Other than that I've enjoyed many others like Kimi, MiniMax, DeepSeek (until it started taking decades to respond) and mainly Nemotron... The 550B Nemotron is a beast. But every now and then I'm getting this error: "ResourceExhausted: Worker local total request limit reached (X/32)". Is it fixable, or is it just the free-endpoint experience?

I'm working on an AI project, and finding and downloading all the models was getting hella repetitive and time-consuming, so that's the main reason I started using agents to work for me... So any recommendations for other models are of course welcome :D

Btw running the agent locally isn't really an option since I'm doing most of my work on my laptop

Thanks for any recommendations

reddit.com
u/mesiac_8227 — 10 hours ago

run coding agents in localised airlocked microVMs

i built code-airlock, an open-source tool for running AI coding agents inside disposable microVMs

the problem I kept running into was the tradeoff between approving every command manually and losing most of the time savings, or disabling command prompts and trusting the agent with my actual machine. I tried the second path too often and regretted it.

code-airlock runs the agent inside a disposable microVM using Docker Sandboxes. the agent works on a clone of the repo, so it can install dependencies, run builds, start containers, and modify files inside the sandbox without direct access to my host filesystem or credentials.

when the agent finishes, I review the diff and pull back only the changes I want!

the design choice was to isolate the whole environment instead of maintaining a long list of deny rules. Deny rules are brittle with coding agents because agents are goal-driven and will often route around constraints when trying to finish a task. A microVM gives a clearer blast radius.

Current support:

- Claude Code

- Codex

- OpenCode

- few others

Requirements:

- Docker Sandboxes CLI

- Hardware virtualization: Apple Silicon or KVM

reddit.com
u/Trivo_ — 15 hours ago
▲ 15 r/AI_Agents+1 crossposts

How are you managing multiple AI agents in your workflow right now?

I’m trying to understand how people are actually handling AI agents in real workflows.

If you’re using multiple tools or agents (automation, coding agents, marketing agents, etc.), how do you keep track of:

  • what each one is doing
  • what’s active or broken
  • and how they’re organized

Right now I feel like everything is fragmented across tools.

Curious how others are solving this.

reddit.com
u/Gallegos_Daniel — 21 hours ago

Sonnet 5, what are your thoughts?

Anyone else noticed that the amount of tokens produced is much larger? I mean, we're not using Anthropic's API, but I've noticed it in Perplexity, where the model just goes on and on and on. The moment I ask it to summarize, it condenses everything into two short bullets.

reddit.com
u/Ok-Lab-7347 — 14 hours ago

Do models waste tokens aka my money as a business model? I had my Hermes agent mysteriously jump over to fatal 5 and drain my account cause it was stuck in a loop. Has anyone else noticed this?

Do models waste tokens aka my money as a business model? I had my Hermes agent mysteriously jump over to fatal 5 and drain my account cause it was stuck in a loop.

Has anyone else noticed this?


reddit.com
u/it1services — 15 hours ago

Lead-finding AI agent.

Hey, I am building an AI agent for my business. I'm currently working on a lead-finding agent that delivers 20 leads per day, so we don't need to spend hours finding qualified leads.

The agent provides contact info and social media handles, plus a short paragraph about each business so you know who you are reaching out to. It also drafts a custom message or email, so we can review and send it to set up calls or meetings.

And I am also thinking of connecting it with a custom front-end so you can choose your niche.

I am thinking of adding more niches once it performs well with 2-4. What are your thoughts? Happy to connect and build. If you have any questions or suggestions, please get in touch.

reddit.com
u/Neuro_creat — 17 hours ago

I audited our autonomous research agent's 32 published "findings." 0 were novel as framed — the labels failed way more than the measurements did.

We run an autonomous agent loop that does research and publishes write-ups. I took 32 of the ones it shipped as confident "discoveries" ("a law", "we found", "a method win") and put them through a full adversarial audit, then re-scored them. The honest result:

- **34%** — substantively wrong (a real stat bug, a rigged baseline, unreproducible, or a measurement artifact).

- **53%** — the measurement was correct and reproducible, but it was labeled a "law/discovery" when the idea is textbook. e.g. a "two-tier memory law" that's just segmented caching (SLRU/ARC, 1990s); a "verification-tax law" that's the P-vs-NP verify-is-easier-than-produce asymmetry.

- **13%** — honest from the start.

- Strict bar (survived as an *original discovery, as first framed*): **0/32.**

So the dominant failure was over-*labeling*, not bad measurement. The labels failed more than the science did.

Obvious objection: maybe my audit just relabels anything with a prior-art ancestor as "textbook", and almost everything has one. So I ran a positive control — a labeled panel of 10 genuine landmarks (Transformer, CRISPR, GANs, plus hard cases like PageRank next to eigenvector-centrality, Adam next to RMSprop) and 10 textbook-results-dressed-as-discoveries, judged blind by the same pass. False-reframe rate (a real novelty wrongly called "textbook"): **0/10.** The grader doesn't demote genuine novelty — so the 0/32 is about the generator (an agent aimed at well-trodden areas), not a trigger-happy gate.

Two things I'd flag as maybe-useful, not novel:

  1. The failure taxonomy is just the human questionable-research-practices literature (HARKing, researcher degrees of freedom, Ioannidis' "most published findings are false") — an agent loop reproduces that distribution on its own output.

  2. A *light* self-check ratifies its own errors. Only the full pass (multi-view + adversarial + primary-source verification + a re-audit of the fixed draft) reliably caught the defect. Fits the "LLMs can't reliably self-correct" result.

Scoring and the positive-control panel are public scripts you can re-run and disagree with:

Honest limits up front: self-graded (my audit of my own posts), n=32 of 43, these are the ones we chose to publish (most-confident output, so the base rate over *all* candidates is lower), and the positive control is a hand-built 20-item panel.

If you run an agent loop: do you see the same over-labeling, and how do you catch it before it ships?

reddit.com
u/Danculus — 15 hours ago

Has anyone successfully deployed a production agent that actually saves manual work?

I know many people are using AI agents either for coding or for something related to lead generation or follow-up. I'd like to know how many have deployed an AI agent in a production environment that has genuinely saved costs and manual hours.

I'd also like to know how much these agents cost, how people plan to handle expensive token pricing in the future, and how they handle hallucinations, accuracy issues, and large-memory problems as agents and workloads grow.

reddit.com
u/Strong-Quality7050 — 23 hours ago

Where should the safety boundary live when agents can trigger physical actions?

Most agent discussions I see are still about software: browser tasks, code, internal tools, data workflows, tickets, email, and API calls.
I am more interested in what happens when agents start touching local hardware. Not sci fi robots, just ordinary devices like cameras, microphones, sensors, relays, small motors, smart home systems, lab equipment, or access controls.
Because once physical hardware is involved, the stakes and the failure modes change completely. A bad browser action is usually recoverable. A bad hardware action can physically move something, unlock a door, disable a safety protocol, or trigger a signal at the worst possible time.
My current view is that the model should not have final authority. It can interpret intent and propose an action, but a separate layer should decide whether that action is allowed. Read only should be the default. State changing actions should need explicit approval. Anything involving access control should be treated as high risk. Every physical action should leave a log.
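
A minimal sketch of that separate decision layer, assuming three risk tiers and a named human approver. The `ActionGate` class, the `Risk` enum, and the action names are hypothetical. Treating access control the same as other state changes but tagging it in the log, and logging denied requests too, are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Risk(Enum):
    READ_ONLY = "read_only"
    STATE_CHANGING = "state_changing"
    ACCESS_CONTROL = "access_control"  # treated as high risk


@dataclass
class ActionGate:
    """Hypothetical gate: the model proposes an action, this layer decides."""
    log: List[Tuple[str, str, Optional[str], bool]] = field(default_factory=list)

    def decide(self, action: str, risk: Risk, approved_by: Optional[str] = None) -> bool:
        """Read-only passes by default; anything else needs an explicit approver."""
        if risk is Risk.READ_ONLY:
            allowed = True
        else:
            allowed = approved_by is not None
        # Every request leaves a record, whether or not it was allowed.
        self.log.append((action, risk.value, approved_by, allowed))
        return allowed


gate = ActionGate()
gate.decide("read_temperature", Risk.READ_ONLY)
gate.decide("toggle_relay", Risk.STATE_CHANGING)                       # denied
gate.decide("toggle_relay", Risk.STATE_CHANGING, approved_by="alice")  # allowed
gate.decide("unlock_door", Risk.ACCESS_CONTROL)                        # denied
```

The same decision function could live in a tool wrapper, in middleware, or on the device itself; the sketch only shows the rule, not where it is enforced.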
The part I am still thinking through is where to enforce that boundary. Tool wrappers are convenient, middleware feels cleaner, device level permissions are harder to bypass, and human approval is safest but can make the system less useful.
For people building agents that touch hardware, robotics, smart homes, or access systems: where do you draw the line?

reddit.com
u/RohitSoodan — 15 hours ago