u/Chance_Passion_2144

Hi,

I'm looking for an iOS app that supports both loading a custom local GGUF model and setting a GBNF grammar to constrain the model's output at inference time.

I know llama.cpp supports this natively, but I haven't found any iOS app that exposes the grammar parameter. LLMFarm and PocketPal don't seem to support it.

Is anyone aware of an app that does this?

u/Chance_Passion_2144 — 24 days ago