u/Emotional-Try8717

Control your AI Request

I’ve been experimenting with building an OpenAI-compatible proxy layer using Docker for my AI projects.

Main reason:
I didn’t want every service directly talking to OpenAI/Anthropic separately.

Problems I kept facing:

  • provider API keys scattered everywhere
  • hard to monitor token usage
  • no centralized logging
  • difficult model/provider switching
  • no observability for requests/latency
  • repeated backend integration logic

So I started building a small gateway that sits between apps and LLM providers.

Architecture:

App → AI Gateway → OpenAI / Anthropic / Gemini / Ollama

The goal is:

  • OpenAI SDK compatibility
  • centralized analytics
  • request logging
  • provider routing
  • self-hosted deployment with Docker

What surprised me most is how useful the OpenAI-compatible approach is.

Most existing apps/tools continue working by only changing the base_url.

Example:

from openai import OpenAI

client = OpenAI(
    api_key="local-key",
    base_url="http://localhost:8080/v1"
)

Still experimenting with the architecture and learning a lot about AI infra along the way.

Curious:
How are others handling multi-provider AI infrastructure right now?

Are people building internal gateways/proxies too?

reddit.com
u/Emotional-Try8717 — 16 days ago