u/Proof-Possibility-54

Spring AI with local model through LM Studio

A couple of days ago I shared what I learned about Spring AI's chat memory. Today, here's what happened when I swapped the model behind it entirely.

Same Spring AI app. Same Java code. Same ChatClient, same @Tool annotations, same BeanOutputConverter for structured output. The only thing that changed: which model handled the requests.

OpenAI (GPT-4o) → Anthropic Claude Opus 4 → local Gemma 4 2B running through LM Studio.

The OpenAI → Claude switch was expected to work. Swap the starter dependency, update the config block, ship. Spring AI's provider abstraction is designed for this.
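To make the "provider abstraction" idea concrete, here is a minimal plain-Java sketch. It does not use Spring AI at all: `SketchChatModel`, `ProviderSwitchSketch` and the fake providers are hypothetical names invented for illustration, and the lambdas only tag the reply instead of calling a real API. The point is just that calling code depends on one interface and the provider is picked by a config-like key.

```java
import java.util.Map;

// Hypothetical stand-in for a provider-neutral chat interface.
// This is NOT Spring AI's ChatModel; it only illustrates the idea.
interface SketchChatModel {
    String call(String prompt);
}

final class ProviderSwitchSketch {

    // Each entry plays the role of one provider starter. The lambdas are
    // fakes that prefix the reply with the provider name instead of calling an API.
    static final Map<String, SketchChatModel> PROVIDERS = Map.of(
            "openai", prompt -> "[openai] " + prompt,
            "anthropic", prompt -> "[anthropic] " + prompt,
            "lmstudio", prompt -> "[lmstudio] " + prompt);

    // Application code depends only on the interface, so the provider
    // is chosen by a config value rather than by a code change.
    static String ask(String providerKey, String prompt) {
        SketchChatModel model = PROVIDERS.get(providerKey);
        if (model == null) {
            throw new IllegalArgumentException("Unknown provider: " + providerKey);
        }
        return model.call(prompt);
    }

    public static void main(String[] args) {
        System.out.println(ask("anthropic", "hello"));
    }
}
```

In the real app the equivalent of `providerKey` is the starter dependency plus the YAML block, and Spring wires the matching implementation for you.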

The local Gemma 4 2B switch was the interesting part. Same Anthropic starter dependency, just pointed at localhost:1234:

spring:
  application:
    name: spring-ai
  ai:
    anthropic:
      api-key: ${LM_STUDIO_API_KEY}
      base-url: http://127.0.0.1:1234
      chat:
        options:
          model: google/gemma-4-e2b
    chat:
      memory:
        repository:
          jdbc:
            initialize-schema: always

That's the entire config delta. LM Studio implements the Anthropic protocol, so Spring AI treats it as just another Anthropic-compatible endpoint. No separate "spring-ai-local" starter. No conditional Java code paths.

What I didn't expect — the 2B local model handled:

- Chat with memory (the same ChatMemoryAdvisor + JDBC repository setup from my previous post)

- Structured JSON output matching strict schemas

- Tool calling with proper parameter dispatch

- Code review (correctly identified a == vs .equals() bug in a real Java example)
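For anyone who hasn't been bitten by it, this is the class of bug from the last bullet. The snippet is a made-up minimal example, not the one from the repo or the video; `EqualsBugSketch` and the role strings are hypothetical.

```java
final class EqualsBugSketch {

    // Buggy: == compares object references, not the characters.
    static boolean sameRoleBuggy(String a, String b) {
        return a == b;
    }

    // Fixed: equals() compares contents (with a null guard on the left side).
    static boolean sameRoleFixed(String a, String b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        String fromConfig = "admin";
        // new String(...) forces a distinct object, similar to a value parsed at runtime.
        String fromRequest = new String("admin");
        System.out.println(sameRoleBuggy(fromConfig, fromRequest)); // false
        System.out.println(sameRoleFixed(fromConfig, fromRequest)); // true
    }
}
```

The buggy version can appear to work in tests that only use string literals, because literals are interned and share one reference, which is what makes it a nice code review probe.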

Quality wasn't quite GPT-4o level, but it was good enough that for what I'd guess is roughly 70% of business AI use cases (classification, summarization, structured extraction, simple agent loops) this would work in production, with zero per-request cost and full offline operation.

Recorded a walkthrough showing all three providers running the same demos (chat, memory, structured output, tool calling, code review) if you prefer video: https://youtu.be/lW0FMjDUzik

Repo with code: https://github.com/DmitryFinashkin/spring-ai

Has anyone here shipped multi-provider Spring AI in production yet? Curious how teams are handling provider routing — cost-based, latency-based, quality fallback, regional compliance — and what failure modes you're watching for.

u/Proof-Possibility-54 — 3 hours ago
▲ 2 r/SpringAIDev+1 crossposts

u/Proof-Possibility-54 — 44 minutes ago
▲ 40 r/SpringAIDev+1 crossposts

Built my first AI app entirely in Java using Spring AI (no Python involved)

I've been experimenting with Spring AI (the official Spring project for AI integration) and was surprised how little code it takes to get something working.

The whole setup is one Maven dependency and a few lines of YAML config. From there I built three things on top of the same project:

  • A simple chat endpoint using ChatClient — literally prompt(), call(), content()
  • Structured output that maps AI responses directly to Java records (no JSON parsing)
  • Tool calling where the AI invokes Java methods to get real data

The tool calling part was the most interesting — you annotate a method with @Tool and Spring AI handles the function-calling protocol with the model. The AI decides when to call your code and uses the result in its response.
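If you're curious what "handles the function-calling protocol" boils down to on the Java side, here is a rough plain-Java sketch of the dispatch step only. It is not Spring AI's implementation: `@SketchTool`, `WeatherTools`, `ToolDispatchSketch` and the canned temperature are all hypothetical, and the model round trip (sending tool schemas, receiving a tool name plus arguments) is replaced by a direct method call.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical annotation for this sketch; Spring AI ships its own @Tool.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SketchTool {
    String description();
}

// Example tool holder with a made-up method and canned data.
final class WeatherTools {
    @SketchTool(description = "Current temperature in Celsius for a city")
    public String currentTemperature(String city) {
        return city.equals("Berlin") ? "18" : "unknown";
    }
}

final class ToolDispatchSketch {
    private final Object target;
    private final Map<String, Method> tools = new HashMap<>();

    // Scan the target once and index annotated methods by name.
    ToolDispatchSketch(Object target) {
        this.target = target;
        for (Method m : target.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(SketchTool.class)) {
                m.setAccessible(true);
                tools.put(m.getName(), m);
            }
        }
    }

    // In a real exchange the model returns a tool name plus arguments;
    // here the caller passes them in directly.
    String dispatch(String toolName, String argument) {
        Method m = tools.get(toolName);
        if (m == null) {
            throw new IllegalArgumentException("No such tool: " + toolName);
        }
        try {
            return (String) m.invoke(target, argument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Tool call failed: " + toolName, e);
        }
    }
}
```

Calling `new ToolDispatchSketch(new WeatherTools()).dispatch("currentTemperature", "Berlin")` returns the canned value; the framework's job is everything around that call, including describing the tool to the model and feeding the result back into the conversation.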

I recorded the whole process if anyone wants to see the code in action: https://youtu.be/SiPq1i_0YgY

Anyone else using Spring AI in production or side projects? Curious what use cases people are finding beyond chat endpoints.

u/Proof-Possibility-54 — 9 days ago