u/Immediate_Chef_8218

Torrix is a self hosted LLM observability, single Docker container, SQLite, zero cloud dependencies.

Main friction: you had to install it before seeing what it does. So I built a demo mode. preloaded with 30 days of simulated LLM traces. 640 runs, 5 models, cost calculations, anomaly detection, agent traces, evals, SQL query interface. All read-only.

→ demo.torrix.ai - no signup, no Docker

→ torrix.ai - Website

→ https://github.com/torrix-ai/install - Github

What's in it:

Cost spike: 3× normal volume, every anomalous run flagged automatically
claude-3-5-sonnet vs gpt-4o-mini cost breakdown - 20× price difference, visible instantly
5-step agent trace (Orchestrator → Researcher → Synthesizer → Formatter → Validator)
Eval results on 3 test datasets
Live SQL interface against the trace data

Still a single docker run for self-hosting. All data stays local.

I run a lot of local models with Ollama. After a while I wanted to answer basic questions: which model am I actually using most? How long do requests take? When things feel slow, is it the model or my prompt?

Cloud observability tools like Helicone exist but they require your prompts to pass through their servers, which defeats the point of running locally. So I built Torrix, a self-hosted LLM observability tool that runs entirely on your machine.

HOW IT WORKS:

It's an HTTP proxy. You point your app at Torrix instead of Ollama directly, and it forwards every request, measures latency, stores the prompt and response, and shows you a dashboard. Zero code changes to your app, just swap the base URL.

 from openai import OpenAI

    client = OpenAI(
        api_key="your-torrix-api-key",
        base_url="http://localhost:8088/proxy",
        default_headers={"x-target-url": "http://localhost:11434/v1"}
    )

WHAT YOU GET

- Every request logged: model, tokens, latency, full prompt and response

- Cost estimates (even for local models if you want them)

- p50/p95/p99 latency on the analytics page

- Session grouping: tag requests with x-torrix-session to see full conversation threads

- Agent trace grouping: use x-torrix-trace to see multi-step LLM chains as one timeline

- LLM judge: auto-score any run for quality, correctness, and reasoning depth

- Dataset export: filter by score and export as CSV for fine-tuning

SETUP

Single Docker container, SQLite, no external services:

docker run -d -p 8088:8088 -v torrix-data:/data torrixai/torrix:latest

Open http://localhost:8088 to set up your account and grab an API key.

Community edition is free with no time limit, no credit card. Limits: 100 runs shown, 7-day retention, 1 user.

GitHub + install docs: https://github.com/torrix-ai/install

website: torrix.ai

Happy to answer questions. Also curious, what would be most useful to add for local LLM workflows?

Added a live read-only demo to Torrix (self-hosted LLM observability) : no Docker needed to try it