r/AI_developers

▲ 108 r/AI_developers+2 crossposts

EVIL GOOGLE?

Google has officially announced that #Gemini CLI is being retired in favor of #Antigravity CLI, their new terminal experience.

On June 18, 2026, Gemini CLI and Gemini Code Assist IDE extensions will stop serving requests for Google AI Pro, Google AI Ultra, and free individual users.

The funny part?

Less than 11 months ago, Google published this headline:

“#Gemini CLI: your #open-source AI agent”

Free. Open source. Built for developers.

Fast forward to today - sunset in less than 30 days, no more new model access, no more real future, and people are already seeing errors like:

| Model "gemini-3.5-flash" was not found or is invalid.

| /model to switch models.

Classic Google.

They sell you the open-source dream, get their fanboi hype, then move everyone to the new shiny closed experience.

The new Antigravity CLI is now the path forward if you want to access Google AI quota through Google AI plans. Otherwise, you can still go the pay-per-use API key route.

Never trust Google’s “Don’t be evil” era too much.

Meanwhile, OpenAI Codex was open source from the beginning - just like the name suggested.

OpenAI gave us CLIP, Whisper, GPT-OSS, and Codex-rs.

Say what you want about OpenAI, but these are the tools that actually pushed the AI developer ecosystem forward.

Also, Claude Code was open source for a while too XD

Hope you learned something new today.

u/usamanoman — 2 days ago
▲ 19 r/AI_developers+13 crossposts

How do you actually test a voice AI agent without calling it yourself every time?

So we've been working on a voice bot that handles customer calls and honestly the testing part has been brutal. We were literally calling the thing ourselves to check if it broke after every change.

Eventually we just wrote a framework that synthesizes fake caller audio, pipes it into the agent, and checks if the response is sane — latency, hallucinations, whether it handles interruptions, etc. Runs locally against a SQLite db, no cloud stuff.

It connects over WebSockets, can mock Twilio streams, and works with ElevenLabs and Vapi agents too. You can also plug in Ollama as the judge so the whole thing runs offline.
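
A minimal sketch of the kind of check described above, not decibench's actual API. A hypothetical `fake_agent` stands in for the voice agent (text in, text out, no audio), and `run_case` measures latency and tests the reply against required and forbidden phrases. All names and thresholds here are assumptions for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class CaseResult:
    latency_s: float
    passed: bool
    reasons: list


def fake_agent(utterance: str) -> str:
    """Hypothetical stand-in for a real voice agent (text in, text out)."""
    if "refund" in utterance.lower():
        return "I can help with your refund. Could you share your order number?"
    return "Sorry, could you repeat that?"


def run_case(agent, utterance, must_include, must_not_include, max_latency_s=1.0):
    """Call the agent once and check latency plus simple content rules."""
    start = time.perf_counter()
    reply = agent(utterance)
    latency = time.perf_counter() - start

    reasons = []
    if latency > max_latency_s:
        reasons.append(f"too slow: {latency:.3f}s")
    lowered = reply.lower()
    for phrase in must_include:
        if phrase.lower() not in lowered:
            reasons.append(f"missing phrase: {phrase!r}")
    for phrase in must_not_include:
        if phrase.lower() in lowered:
            reasons.append(f"forbidden phrase: {phrase!r}")
    return CaseResult(latency, not reasons, reasons)


result = run_case(
    fake_agent,
    "Hi, I want a refund for my last order",
    must_include=["refund", "order number"],
    must_not_include=["credit card number"],
)
```

Running a list of such cases after every change replaces the manual test call; a real harness would swap `fake_agent` for a function that sends synthesized audio and transcribes the reply.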

We open sourced it: https://github.com/unforkopensource-org/decibench

Curious how others here handle this. Are you just vibing and hoping production doesn't break or is there a better workflow I'm missing?

u/Tricky_School_4613 — 2 days ago
▲ 54 r/AI_developers+9 crossposts

My boyfriend and I are building an open-source AI coding workspace for microcontrollers!

Hey everyone :)

My boyfriend and I have been working on an open-source project called Exort.

It’s a desktop app for developing microcontroller projects with the help of an AI agent. We use OpenCode as the AI agent, and Exort now supports all Arduino boards.

The best part is that it’s totally free to use.

Check it out here:
Repo: https://github.com/Razz19/Exort

Your support would really help Exort and us a lot ❤️

u/moonlikee — 3 days ago
▲ 10 r/AI_developers+3 crossposts

Towards uniformity

More and more developers use AI coding assistants and just prompt, review, re-prompt, re-review, ... and finally open a PR with whatever the AI produced, and those PRs get approved and merged.

We also see more and more POs who say they use AI to describe their ideas, get new ideas, integrate AI suggestions, and let AI write the stories, which they review and send to dev.

But does this mean that all future apps in a functional domain will progressively become internally similar, at the same level, with the same quality and the same uniformity?

What will differentiate the PI values of one app from another?

Are we weakening the security of these "now similar" apps (the same attacks work on all of them, and the AI knows the code and its weaknesses)?

u/Spare_Dependent6893 — 3 days ago
▲ 1 r/AI_developers+1 crossposts

Most teams ship prompts like it's 2008. I built something better.

Most teams ship prompts the same way they used to ship CSS in 2008. Tweak, eyeball a few outputs, push to prod, wait for users to complain, repeat. Prompts are production code. They deserve the same testing infrastructure your Python does.
 
That's why I built PromptLabs.
 
How the loop works, in five steps:  

  1. You provide the input. Either an intent ("classify customer support emails as billing, technical, account, or other") or an existing production prompt plus the failure modes you've been seeing.
  2. EvalGen writes your test suite. It picks 5 to 8 categories of inputs that will exercise the prompt (happy path, edge cases, adversarial), fires one parallel LLM call per category, and dedupes the result. So you get real coverage, not 50 reworded copies of the same easy case. The same call also writes the scoring rubric. Then it splits the test set into train and holdout. The holdout never leaks into optimization.
  3. Runner executes the prompt across every target model in parallel. Choosing between Sonnet 4.6, GPT-5, and Gemini 3? All three run at once on the same eval set. Results in minutes, cost per eval plotted on the same chart.
  4. Judge scores every output, criterion by criterion. LLM-as-judge with reasoning attached, so you can see exactly why a score is what it is.
  5. Optimizer proposes a diff, not a regeneration. It looks at where the prompt failed, then returns specific line edits (insert this clause after line 3, delete this sentence, reword this paragraph). You read it like a pull request. The new version is scored on the holdout set. The loop checks for convergence or overfitting, and either accepts the result or loops back to step 3 with the new prompt.
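
Steps 2 and 5 depend on deduplicating generated cases and keeping a holdout that optimization never sees. A minimal sketch of that idea follows, assuming exact-match dedup after case and whitespace normalization and a per-category split. This is not PromptLabs' actual code; `dedupe`, `stratified_split`, and the 25% holdout fraction are hypothetical.

```python
import random
from collections import defaultdict


def dedupe(cases):
    """Drop cases whose normalized input text was already seen."""
    seen, unique = set(), []
    for case in cases:
        key = " ".join(case["input"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(case)
    return unique


def stratified_split(cases, holdout_fraction=0.25, seed=0):
    """Split each category separately so the holdout mirrors the train mix."""
    by_category = defaultdict(list)
    for case in cases:
        by_category[case["category"]].append(case)

    rng = random.Random(seed)
    train, holdout = [], []
    for category in sorted(by_category):
        items = by_category[category]
        rng.shuffle(items)
        # One or more holdout cases when a category has 2+ cases, never all of them.
        n_holdout = max(1, round(len(items) * holdout_fraction))
        n_holdout = min(n_holdout, len(items) - 1) if len(items) > 1 else 0
        holdout.extend(items[:n_holdout])
        train.extend(items[n_holdout:])
    return train, holdout


cases = [
    {"category": "billing", "input": "I was charged twice"},
    {"category": "billing", "input": "i was  charged twice"},  # duplicate after normalization
    {"category": "billing", "input": "Where is my invoice?"},
    {"category": "billing", "input": "Cancel my subscription"},
    {"category": "billing", "input": "Update my card"},
    {"category": "technical", "input": "App crashes on login"},
    {"category": "technical", "input": "Export button does nothing"},
    {"category": "technical", "input": "Sync is stuck"},
    {"category": "technical", "input": "Page loads forever"},
]

unique = dedupe(cases)
train, holdout = stratified_split(unique)
```

The optimizer would only ever see `train`; `holdout` is scored after each proposed edit. Catching reworded near-duplicates would need a fuzzier comparison than the exact match used here.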

 
The accepted prompt is served over HTTP. Your production code fetches the latest version at request time, so you can iterate without redeploying.
 
Three things that make this different from tools you've probably tried:
 
The eval set is real, not theater. Stratified by category with parallel generation and dedup, so you get coverage of edge cases instead of fifty rewordings of the happy path. Most tools either skip eval generation entirely, or give you one LLM call that quietly produces 40 near-duplicates.
 
Train and holdout stay separate, and the loop enforces it. The trajectory chart shows the gap widening the moment you start overfitting, and the loop halts itself when it does. The "best version" pick uses a lower confidence bound so a lucky high-variance run can't game the leaderboard. Most "optimizer" tools you've seen don't even have a holdout set.
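
The lower-confidence-bound pick can be illustrated with a short sketch. It assumes a normal approximation (mean minus z standard errors, with z = 1.96); the post does not say which bound PromptLabs uses, so the formula, the function names, and the scores below are assumptions.

```python
import math
import statistics


def lower_confidence_bound(scores, z=1.96):
    """Mean minus z standard errors; penalizes noisy or thinly sampled runs."""
    mean = statistics.mean(scores)
    if len(scores) < 2:
        return float("-inf")  # no variance estimate, so never prefer it
    std_err = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - z * std_err


def pick_best(versions):
    """Return the version name with the highest lower confidence bound."""
    return max(versions, key=lambda name: lower_confidence_bound(versions[name]))


# Made-up holdout scores: v2 has the higher mean but much higher variance.
holdout_scores = {
    "v1_steady": [0.80, 0.82, 0.81, 0.79, 0.80],
    "v2_lucky": [0.99, 0.60, 0.95, 0.62, 0.97],
}

best = pick_best(holdout_scores)
```

Here `v2_lucky` has the higher average, but its wide spread drags its lower bound below that of `v1_steady`, so the steadier version is picked.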
 
The Optimizer evolves your prompt, it doesn't replace it. A diff is reviewable. You can accept some edits and reject others. The domain knowledge you spent six months baking into your prompt isn't thrown out every iteration. DSPy-style frameworks regenerate; this one refines.
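
To show why a diff is easier to review than a regenerated prompt, here is a small sketch using Python's standard `difflib`. The two prompt versions are made up for illustration, and PromptLabs may represent its edits differently.

```python
import difflib

old_prompt = """Classify the email as billing, technical, account, or other.
Reply with the label only.
"""

new_prompt = """Classify the email as billing, technical, account, or other.
If the email mentions both a charge and a login problem, choose billing.
Reply with the label only.
"""

# Compare line by line and keep a pull-request-style unified diff.
diff_lines = list(
    difflib.unified_diff(
        old_prompt.splitlines(),
        new_prompt.splitlines(),
        fromfile="prompt_v1",
        tofile="prompt_v2",
        lineterm="",
    )
)

print("\n".join(diff_lines))
```

The output marks the single inserted rule with a leading `+` and leaves the other lines untouched, so a reviewer can accept or reject that one edit.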
 
If you've been gluing promptfoo + DSPy + Langfuse together to do what should be one workflow, this is one tool that does the whole thing. If you're treating prompts like config strings instead of like the production code they are, you're leaving accuracy on the table and inviting silent regressions you won't see until they hurt.
 
MIT, local, your keys.
 
https://github.com/temm1e-labs/promptlabs

u/No_Skill_8393 — 4 days ago
▲ 13 r/AI_developers+6 crossposts

I just made an AI that can switch to over 9 personalities including Tung Tung Tung Sahur!

I made a voice AI called ShiftAI. It's not an assistant; its thing is switching personalities. It has over 9 personalities, like Mean, Depressed, and Philosophical, and it can even turn into Tung Tung Tung Sahur! You change its personality by saying "change your personality to (the one you want)". All of the personalities are listed on the site, along with a better explanation. The site was made with HTML and CSS (obviously), and the app you DOWNLOAD was made with Python + tkinter and uses the Groq API for responses. The site might look messy on a phone, and I'm pretty sure tkinter won't work on phones, so if you're on a phone you unfortunately can't get this app. Link in the comments, and I would love feedback!!!

u/Next-Ad-4052 — 7 days ago

[For hire] AI/ML fresher seeking job

Hi guys, I'm an AI/ML engineer with 6 months of internship experience across 2 remote roles (3 months each). I have a PPO, but I'm seeking better job opportunities. I have worked with RAG, LLMs, AI workflows, ML pipelines, AI SaaS, Databricks, AWS (SageMaker), etc.

If you feel you might have something for me, feel free to DM me. I can share my resume there.

Thanks

u/Narrow-Win-969 — 7 days ago

Is there a conservative AI Developer who wants to help with a completely ethical project?

An insanely good use of AI would be to get it to go through as much data as is available about every entry-level women's and gender studies course in every accredited college or university in America, have it summarize the material, and keep track of categories of discussions and claims and how often they were made.

Literally everything they teach is lies, so if you can quantify everything that is taught, you have a list of nonsense that you can set right next to an explanation of why it's nonsense that no one without a serious personality disorder could deny.

You show people that, you get a cult of lunatics kicked out of the universities. Feminists can't actually defend their beliefs, they just use whatever power they can to bully anyone who disagrees with them into silence. They can't bully away the fact that everything they teach is objectively wrong.

u/SteelFox144 — 9 days ago
▲ 11 r/AI_developers+1 crossposts

How to Approach Learning in this Day and Age of AI?

I've always enjoyed learning about how systems work and how programmes operate but, in reality, I don't know syntax or specific file workflows.

I am still an amateur and all I really know how to do is write basic Python with few or no libraries. The best thing I have ever 'shipped' is a 2nd-year winning Python robot that used a custom-built competition library, which handled all the hardware actions for me. All we did was plug in the motors and assign values like 'R.motors[0] = 10'.

Not difficult, obviously. We just wrote navigation and logic.

I want to learn more and be able to make robots and programs but I don't know where to start with AI. Maybe 5 years ago, I'd know the route. Learn the syntax over months and build over years. But it's not like that anymore, it's not that simple because AI can do that anyway.

But it's also not "make this app" and then paste the errors into AI. That's vibecoding, right?

So far, with about 3 real but very unfinished projects under my belt, I have got AI to generate a tech stack based on inputs like 'this should be obtained from this API and then this should go to the frontend' rather than naming real libraries or packages, because I simply don't know them.

People say to get AI to generate the code and then get it to explain it, which I don't really believe is strong or sustainable advice. When it generates an entire codebase, it can't really explain all of it within a single reply, or even a full chat.

So, I suppose, the question I am asking is: Where do I actually learn in this day and age?

Thanks

u/Remarkable_Yak_8564 — 10 days ago
▲ 2 r/AI_developers+2 crossposts

[Synthetic][PAID][self-promotion] Made-to-order training data generator with web search and exports

Disclosure: I’m on the Abliteration team.

We just shipped a training-data generator for people who need specific examples rather than another generic public dataset.

You describe the examples you want and it generates structured synthetic data. If the dataset needs current or real-world facts, you can turn on web search. Exports are live for Hugging Face, Kaggle, S3, and OpenAI.

The first use cases we built around are classifier and eval datasets for trust and safety: grooming detection, harassment detection, security research evals, jailbreak and edge-case sets, and similar work where teams need examples that general-purpose models often refuse to generate.

I marked this as synthetic and paid because the outputs are generated and this is a commercial tool.

Product: https://abliteration.ai/

Synthetic data page: https://abliteration.ai/use-cases/synthetic-data

Launch video: https://x.com/abliteration_ai/status/2054675554138194178

For people who curate datasets: what export format or per-row provenance metadata do you usually need before a generated dataset is usable?

u/Effective_Attempt_72 — 8 days ago
▲ 20 r/AI_developers+5 crossposts

My Orange Pi 5 Plus just ran a full SEO campaign for a local business, $0 in hosting costs making $100/month for a pilot program!

The Orange Pi 5 Plus in my stack serves as the Ollama model server, running qwen3.5:4b at about 7GB RAM. This week it powered something I'm pretty proud of.

My AI crew ran a complete local SEO campaign for a tattoo shop in San Diego. Here's what it generated automatically:

→ 8 keyword-optimized landing pages targeting San Diego tattoo searches

→ 3 blog posts with proper meta descriptions and local keyword targeting

→ 4 weeks of Google Business Profile posts

→ Review request SMS templates for the artists to send after every appointment

→ Competitor research logged to PostgreSQL

The Orange Pi handled all the local inference. The only paid AI cost for the content generation was about 30 cents in Claude API calls for the final polish pass.

A traditional SEO agency charges $1,500-2,500/month for this. The whole pipeline runs automatically every Monday morning via cron job.

Just posted a full video walkthrough showing exactly how it works, the agent architecture, the PostgreSQL schema, the Squarespace implementation, everything.

https://www.youtube.com/watch?v=a0NXVsqu5jQ

What are you all running on your Orange Pi?

u/Weird_Night_2176 — 11 days ago
▲ 4 r/AI_developers+3 crossposts

Is anyone else drowning in AI context management on large codebases?

Working on a fairly large Azure microservices system (.NET, 40+ services, 5+ years old). We've adopted AI coding assistants across the team and there's genuine productivity gain for individual tasks.
 
But there's a problem nobody seems to talk about: every new chat session is a blank slate.
 
Our codebase has years of accumulated decisions:
• We use a specific handler pattern for vendor integrations
• Auth service has a specific cache-aside setup with historical reasons
• Service boundaries that look weird but make sense given our deployment constraints
• Interface conventions that all the senior engineers know but aren't written anywhere useful
 
When I open a new AI chat, none of that context exists. I either paste a context dump (expensive, eats token budget) or the AI generates code that's syntactically correct but architecturally wrong for our system.
 
We've tried:
• System prompts with architecture descriptions - partial help
• Cursor rules files - limited
• Just re-explaining every session - waste of time
 
I'm actually building a tool to solve this (happy to share more if there's interest) but first wanted to know — is this a widespread problem or specific to how we work?
 
How are experienced devs handling context management with AI assistants on mature codebases?

u/killerexelon — 11 days ago

CodeBase Understanding

It's probably not new to you all that AI is incredible for development speed.

Developers are shipping features faster than ever. In some teams, AI is already writing a large percentage of the codebase.

But I’m curious about something…

As AI-generated code grows, how important is code understanding and code quality becoming for engineering teams?

What I’m seeing more and more:

Developers shipping code they don’t fully understand

Code reviews becoming more superficial ("looks fine, ship it")

Team leaders losing visibility into what’s actually happening

Technical debt growing faster over time

Especially in production systems, this feels risky because every small mistake can become expensive later.

So I’m curious how other engineering leaders see this:

Do you think deep code understanding and ownership still matter as much when AI writes a large part of the code?

Or are we moving toward a world where understanding the codebase becomes less important?

Would love to hear how CTOs, Engineering Managers, and Tech Leads are thinking about this.

u/Ok-Condition7148 — 11 days ago
▲ 42 r/AI_developers+28 crossposts

This one is for all the broke college CS students out there <3

If you're like me, you don't want to pay $20 a month for Claude Code :(

It's an amazing tool I love, but a recurring expense is the last thing I need. That's why I find myself jumping from tool to tool, using the daily or monthly free tier limits and constantly having to find new free tools.

That's where "AI For Brokies" comes in. Just a simple GitHub repo with a README file listing some free AI tools you can use for building :)

https://github.com/Joe-Huber/AI-For-Brokies

The actual building behind this project was mostly the automatic tool adder, which follows an issue format! If you want to see it in action, please drop an issue explaining a tool you use and see the bot do its magic!

Please feel free to leave a star! ⭐️ (pretty please) You can use it to save the list of tools for whenever you run out of credits!

u/Joe-Codes — 13 days ago

I want honest feedback about my website - LifeFlow

What’s different (for us)

  • Brainstorm → gives you multiple perspectives on your idea and a ready-to-follow initial path for working on it
  • Ask AI → productivity-focused assistant that generates daily tasks suited to your situation and adds them to your todo list
  • Productivity score → derived from your dated todos, streak, and usage patterns (not just vanity metrics).

Links / privacy

Who it’s for
People who bounce between too many tools and want one calm place to plan the day and reflect.

What I want from you

  • Would you use this daily or bounce back to Notion/Obsidian/Todoist?
  • What’s missing on mobile/tablet (we’ve been tightening layouts)?
  • Any feature that feels gimmicky vs. useful?

Disclaimer
I’m the maker; posting for feedback, not ads. Happy to take criticism.

u/haajdr — 10 days ago
▲ 2 r/AI_developers+1 crossposts

Been running a distributed AI agent stack for 2 months with PostgreSQL as the persistent memory layer. Wanted to share the schema design since I haven't seen many examples of this pattern.

The setup: PostgreSQL 14 running on an Odroid XU4, accessed by 14 CrewAI agents running on a separate Jetson Orin Nano Super node. All connections go over the local network.

The schema:

conversations: stores all WhatsApp messages between me and the AI CEO. 54+ rows and growing.

agent memory: structured summaries the agents write after each session. 75+ entries. This is how agents remember context between runs without embedding search.

crew runs: logs every crew session with start time, status, and output summary.

paper trades: full trade ledger. symbol, entry/exit price, P&L, reasoning, agent that made the call.

treasury: tracks paper capital across USD, BTC, ETH, SOL.

seo campaigns, seo content, seo competitors: the content pipeline for a local business SEO service.

approvals, ventures, daily briefings: operational tables for the AI company structure.

The interesting design decision: agents write summaries to memory after each run rather than storing raw outputs. Keeps the table lean and queryable without needing a vector store. Works well for structured recall.
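
A rough sketch of the summary-based memory pattern, under stated assumptions: the post does not show the actual DDL, so the table name `agent_memory` and its columns are hypothetical, and Python's built-in `sqlite3` stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE agent_memory (
        id         INTEGER PRIMARY KEY,
        agent      TEXT NOT NULL,
        topic      TEXT NOT NULL,
        summary    TEXT NOT NULL,
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
    """
)


def remember(agent, topic, summary):
    """Store a short structured summary instead of the raw run output."""
    conn.execute(
        "INSERT INTO agent_memory (agent, topic, summary) VALUES (?, ?, ?)",
        (agent, topic, summary),
    )


def recall(agent, topic, limit=3):
    """Fetch the newest summaries for an agent and topic with plain SQL."""
    rows = conn.execute(
        "SELECT summary FROM agent_memory "
        "WHERE agent = ? AND topic = ? ORDER BY id DESC LIMIT ?",
        (agent, topic, limit),
    )
    return [row[0] for row in rows]


# Hypothetical entries written at the end of three agent runs.
remember("seo_writer", "tattoo-shop", "Drafted 8 landing pages; 2 need review.")
remember("seo_writer", "tattoo-shop", "Published 3 blog posts with meta descriptions.")
remember("trader", "btc", "Closed paper trade at a small loss; tighten stop.")

recent = recall("seo_writer", "tattoo-shop")
```

Recall here is an exact filter on agent and topic, which is the structured side of the tradeoff: cheap and predictable, but it cannot find a summary filed under a differently worded topic the way embedding search could.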

Anyone else using PostgreSQL as agent memory? Curious how others are handling the structured vs semantic tradeoff.

The full build is documented on YouTube, and the build document is in my bio!

u/Weird_Night_2176 — 14 days ago

Building a GenAI evaluation framework: a few honest observations

Currently interning as an AI/ML engineer in Brussels, working on a RAG evaluation framework using DeepEval. Still in progress but already learned a lot.
A few things that surprised me so far:
• LLM-as-judge is powerful but needs careful calibration against real human judgment
• Metrics can look good on paper while answers are still subtly wrong
• The hardest part isn’t technical; it’s getting stakeholders to actually trust the eval results
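
One way to calibrate an LLM judge against human judgment is to have both label the same sample of answers and measure their agreement. A minimal sketch using Cohen's kappa follows, with made-up pass/fail labels; it is plain Python, not DeepEval's API.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Agreement between two raters, corrected for chance agreement."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and the same length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(labels_a) | set(labels_b)
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for the same ten answers.
human = ["pass", "pass", "fail", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
judge = ["pass", "pass", "fail", "pass", "pass", "pass", "fail", "fail", "pass", "fail"]

kappa = cohens_kappa(human, judge)
```

With these labels the raters agree on 8 of 10 answers, but kappa comes out near 0.58 because part of that agreement is expected by chance. Reporting a number like this alongside the disagreeing examples may also help with the stakeholder trust problem.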
Anyone else built evaluation pipelines with DeepEval or similar tools? Curious what approaches others have used.

u/SecureShip5625 — 14 days ago