
Did you see this project "ext-infer"?
Just came across ext-infer, a pretty interesting project for PHP developers building AI-powered applications.
It’s a PHP 8.3+ extension that runs local LLM inference directly inside PHP using llama.cpp—no Python service, no API calls, no sidecar processes. It supports chat completions, embeddings, and reasoning models through a native PHP API.
Some highlights:
- Run GGUF models locally
- Built-in embeddings support
- Cosine similarity helpers
- Reasoning output extraction (<think>...</think>)
- Thread-safe and works with PHP workers
- Optional Apple Metal acceleration
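
For anyone curious what the cosine similarity and reasoning-extraction bullets amount to, here is a rough, language-neutral sketch of the two ideas in plain Python. It is not ext-infer's API: the names `cosine_similarity` and `split_reasoning` are made up for illustration, and returning 0.0 for a zero vector is just a convention assumed here.

```python
import math
import re


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


# Non-greedy match so each <think>...</think> block is captured separately;
# DOTALL lets the reasoning span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Split raw model output into (reasoning, answer) (hypothetical helper)."""
    reasoning = "\n".join(m.strip() for m in THINK_RE.findall(text))
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer
```

With embeddings, you would typically embed a query and each document, then rank documents by `cosine_similarity` against the query vector. `split_reasoning` shows why extraction is handy: the text inside the think tags can be logged or hidden while only the final answer is shown to the user.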
As someone who usually sees AI integrations in PHP implemented via external APIs or Python microservices, it's refreshing to see a native approach.