u/jfowers_amd

▲ 62 r/StrixHalo+1 crossposts

Lemonade v10.5.1: an MTP + ROCm 7.13 quick start for Strix Halo

Update to Lemonade v10.5.1, then:

# Get the model
lemonade pull Qwen3.6-27B-MTP-GGUF

# Get ROCm 7.13
lemonade backends install llamacpp:rocm

# Load the model (MTP args auto-applied)
lemonade load Qwen3.6-27B-MTP-GGUF --llamacpp rocm --ctx-size 0

Shown in the video taking a look in the mirror with the help of Pi agent.

Github: https://github.com/lemonade-sdk/lemonade Discord: https://discord.gg/5xXzkMu8Zk

PS. u/lucifer-vali fixed Fedora 43 support in this release as well :)

u/jfowers_amd — 4 days ago
▲ 32 r/LocalLLM+1 crossposts

macOS support in Lemonade has graduated out of beta!

All major Lemonade capabilities, including OmniRouter, coding, image gen, speech gen, and transcription are all available on Lemonade for macOS thanks to the hard work of u/GeramyL.

If you're on macOS and just looking into Lemonade for the first time, we're a local AI solution similar in functionality to LM Studio or Ollama.

What sets us apart is:

  • Open source, community driven, zero telemetry
  • Focused on local with no cloud upsell
  • Omni-modal with the ability to input and output images and speech
  • Developer friendly with a 3 MB portable binary, code once and deploy across Linux/Windows/macOS

I hope this release brings more macOS users into the Lemonade community. Stay tuned for the update iPhone app, which can access all of this from your phone!

GitHub: https://github.com/lemonade-sdk/lemonade

Discord: https://discord.gg/5xXzkMu8Zk

u/jfowers_amd — 6 days ago
▲ 507 r/StrixHalo+1 crossposts

vLLM ROCm has been added to Lemonade as an experimental backend

vLLM has the ability to run .safetensors LLMs before they are converted to GGUF and represents a new engine to explore. I personally had never tried it out until u/krishna2910-amd/ u/mikkoph and u/sa1sr1 made it as easy as running llama.cpp in Lemonade:

lemonade backends install vllm:rocm
lemonade run Qwen3.5-0.8B-vLLM

This is an experimental backend for us in the sense that the essentials are implemented, but there are known rough edges. We want the community's feedback to see where and how far we should take this. If you find it interesting, please let us know your thoughts!

Quick start guide: https://lemonade-server.ai/news/vllm-rocm.html GitHub: https://github.com/lemonade-sdk/lemonade Discord: https://discord.gg/5xXzkMu8Zk

u/jfowers_amd — 14 days ago
▲ 272 r/MechKeyboards+1 crossposts

Keyboard is the Keychron Q65 Max. Switches are Keygeek Y2 linears. I love the way this thing sounds, it’s insane how far pre-builds (plus customizations) have come. I got a Keychron because I wanted to go wireless and I’m happy with it so far.

u/jfowers_amd — 19 days ago
▲ 94 r/StrixHalo+1 crossposts

I’ve always liked how if I ask ChatGPT to make or edit an image, it just does it. Local AI should be this convenient! One install, one endpoint. Ask for an image of a cat and it appears. Ask for a hat on the cat, with a narrated story. Now we can easily build immersive experiences.

Lemonade's OmniRouter brings that same pattern to local through built-in tools:

  • Image generation/ editing through sd.cpp
  • Text-to-speech through kokoros
  • Transcription through whisper.cpp
  • Vision through llama.cpp

Your workflow talks to Lemonade running on your own NPU/GPU through OpenAI-compatible tool calling.

How it works:

  1. Lemonade sets up all these local AI engines for your system.
  2. Add Lemonade’s tool definitions to your workflows.
  3. When your LLM triggers a tool call it gets routed to the corresponding engine (sd.cpp, whisper.cpp, kokoros).
  4. Feed the result back into your loop.

That’s it. No custom orchestration layer, no new abstractions to learn. Check it out in this 181-line e2e Python example.

We’ve added support for OmniRouter in our reference web ui (also available as a Tauri app), which is what you’re seeing in the video. But I’m much more excited to see what people build on top.

I know my next project is going to be some kind of TTRPG-style adventure game. It’s already surprisingly fun to ask OmniRouter to be a dungeon master who illustrates and narrates the story, and I think it can be enhanced quite a bit if I build an app/harness around it.

If you find this interesting, please drop us a star and say hi!

u/jfowers_amd — 24 days ago