u/Aromatic-Ad-6711


I've been building ARK (AI Runtime Kernel) for the past 10 months. It's an open-source runtime that sits between your AI agent and the LLM, governing every decision the model makes.

The core idea: models shouldn't control the system. The runtime should.

What it does:

When you ask ARK to write Go code, it doesn't just pass the prompt to GPT and hand you back whatever comes out. The runtime classifies the task, optimizes the prompt, generates the code, then runs a 6-phase verification pipeline before you see anything:

├─ Step 1: ✓ Reasoning verified (confidence: 70%)
│  🧪 Verification: tested (score: 100%)
│  ✅ Compiled        ← go build
│  ✅ Executed         ← go run
│  ✅ Tests passed     ← auto-generated tests
│  ✅ Lint clean       ← go vet
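A staged check like the one above could be wired up roughly as follows. This is a minimal sketch, not ARK's actual code: `Phase`, `Result`, `RunPipeline`, and `goTool` are hypothetical names, and the only real commands assumed are `go build`, `go run`, `go test`, and `go vet`. It covers four of the phases and stops at the first failure.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// Phase is a hypothetical verification step: a label plus a check that
// returns nil on success.
type Phase struct {
	Name string
	Run  func(dir string) error
}

// Result records the outcome of one phase.
type Result struct {
	Name string
	Err  error
}

// RunPipeline runs phases in order and stops at the first failure, so later
// phases never run against code that already failed an earlier one.
func RunPipeline(phases []Phase, dir string) []Result {
	results := make([]Result, 0, len(phases))
	for _, p := range phases {
		err := p.Run(dir)
		results = append(results, Result{Name: p.Name, Err: err})
		if err != nil {
			break
		}
	}
	return results
}

// goTool returns a check that runs `go <args...>` inside dir and includes
// the tool's combined output in the returned error.
func goTool(args ...string) func(dir string) error {
	return func(dir string) error {
		cmd := exec.Command("go", args...)
		cmd.Dir = dir
		out, err := cmd.CombinedOutput()
		if err != nil {
			return fmt.Errorf("go %v: %w\n%s", args, err, out)
		}
		return nil
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: verify <dir-with-generated-code>")
		return
	}
	phases := []Phase{
		{"Compiled", goTool("build", "./...")},
		{"Executed", goTool("run", ".")},
		{"Tests passed", goTool("test", "./...")},
		{"Lint clean", goTool("vet", "./...")},
	}
	for _, r := range RunPipeline(phases, os.Args[1]) {
		if r.Err != nil {
			fmt.Printf("FAIL %s: %v\n", r.Name, r.Err)
			continue
		}
		fmt.Printf("ok   %s\n", r.Name)
	}
}
```

Because each phase is just a function, the phases can be swapped for fakes in unit tests without invoking the Go toolchain.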

If the code fails compilation, ARK feeds the compiler error back to the model, forces a stronger model, and retries. If it still fails after 2 attempts, it refuses to deliver broken code. It never claims success for code that doesn't compile.
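The retry behavior can be sketched like this. It is a simplified illustration under assumptions, not the real implementation: `Generator`, `Verifier`, `GenerateVerified`, and `ErrUnverified` are made-up names, the model call and compile step are passed in as plain functions, and "stronger model" is modeled as the next entry in a weakest-first list.

```go
package main

import (
	"errors"
	"fmt"
)

// Generator and Verifier are hypothetical stand-ins for the model call and
// the compile/test step.
type Generator func(model, prompt string) (code string, err error)
type Verifier func(code string) error

// ErrUnverified is returned when no attempt produced code that verified.
var ErrUnverified = errors.New("no verified code after all attempts")

// GenerateVerified tries each model in order (weakest first). After a failed
// attempt it appends the error text to the prompt, so the next model sees
// what went wrong. Code is returned only if verification passed.
func GenerateVerified(gen Generator, verify Verifier, models []string, task string) (string, error) {
	prompt := task
	var lastErr error
	for _, model := range models {
		code, err := gen(model, prompt)
		if err == nil {
			err = verify(code)
		}
		if err == nil {
			return code, nil
		}
		lastErr = err
		prompt = fmt.Sprintf("%s\n\nThe previous attempt failed with:\n%v\nFix the error.", task, err)
	}
	return "", fmt.Errorf("%w: %v", ErrUnverified, lastErr)
}

func main() {
	// Fake model: only the "strong" model produces code that verifies.
	gen := func(model, prompt string) (string, error) {
		if model == "strong" {
			return "fixed", nil
		}
		return "broken", nil
	}
	// Fake compile step standing in for `go build`.
	verify := func(code string) error {
		if code == "broken" {
			return errors.New("syntax error: unexpected EOF")
		}
		return nil
	}

	code, err := GenerateVerified(gen, verify, []string{"cheap", "strong"}, "read a CSV file")
	if err != nil {
		fmt.Println("refusing to deliver:", err)
		return
	}
	fmt.Println("delivering:", code)
}
```

With a two-entry model list this gives the two attempts described above; the function has no path that returns code which failed verification.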

The Go-specific stuff that might interest this community:

The entire runtime is pure Go, zero external dependencies (just stdlib). 35 files, ~16,000 lines, 156 tests, race detector clean. Some things I'm proud of:

Weighted tool ranking with 6 signals (relevance, success rate, Bayesian confidence, cost, latency, memory bonus) — all computed in microseconds
Context engine that reduces tool schema tokens from 60K to ~93 (about a 99.8% reduction) by only loading relevant tools
Per-step model routing: cheap model (gpt-4o-mini) handles tool calls, strong model (gpt-4o) handles reasoning. Cuts costs 80-90%
Cognitive Governor that verifies every output with calibrated confidence scores
Auto-fix for common model errors in generated Go code (orphan braces, missing error handling) — detects both tab and space indentation
Event emitter that writes JSONL for a separate Python memory layer to ingest
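To make the weighted tool ranking from the list above concrete, here is a rough sketch. Everything in it is an assumption for illustration: the `Tool` and `Weights` types, the weight values, the choice to subtract cost and latency, and the reading of "Bayesian confidence" as a Beta(1,1) posterior mean. ARK's actual formula may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Tool holds six signals, each assumed to be normalized to [0, 1].
type Tool struct {
	Name        string
	Relevance   float64
	SuccessRate float64
	Confidence  float64
	Cost        float64 // higher is worse
	Latency     float64 // higher is worse
	MemoryBonus float64
}

// Weights holds one weight per signal. The values used in main are
// illustrative placeholders.
type Weights struct {
	Relevance, SuccessRate, Confidence, Cost, Latency, MemoryBonus float64
}

// BetaMean is the posterior mean of a success probability under a uniform
// Beta(1,1) prior; with no observations it returns 0.5.
func BetaMean(successes, trials int) float64 {
	return float64(successes+1) / float64(trials+2)
}

// Score is a weighted sum where cost and latency count against a tool.
func Score(t Tool, w Weights) float64 {
	return w.Relevance*t.Relevance +
		w.SuccessRate*t.SuccessRate +
		w.Confidence*t.Confidence +
		w.MemoryBonus*t.MemoryBonus -
		w.Cost*t.Cost -
		w.Latency*t.Latency
}

// Rank returns a copy of tools sorted by descending score.
func Rank(tools []Tool, w Weights) []Tool {
	ranked := append([]Tool(nil), tools...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return Score(ranked[i], w) > Score(ranked[j], w)
	})
	return ranked
}

func main() {
	w := Weights{Relevance: 0.35, SuccessRate: 0.2, Confidence: 0.15, Cost: 0.1, Latency: 0.1, MemoryBonus: 0.1}
	tools := []Tool{
		{Name: "web_search", Relevance: 0.3, SuccessRate: 0.9, Confidence: BetaMean(90, 100), Cost: 0.5, Latency: 0.7},
		{Name: "csv_reader", Relevance: 0.9, SuccessRate: 0.8, Confidence: BetaMean(8, 10), Cost: 0.1, Latency: 0.2, MemoryBonus: 1},
	}
	for i, t := range Rank(tools, w) {
		fmt.Printf("%d. %-12s %.3f\n", i+1, t.Name, Score(t, w))
	}
}
```

The score is a handful of multiplications per tool, so ranking a few hundred tools this way is cheap compared with any model call.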

Cost: A typical task costs $0.002-$0.005. Not $0.05.
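The low per-task cost comes largely from per-step routing. A rough sketch of the idea follows, with loud caveats: `StepKind`, `Router`, and `Step` are hypothetical names, and the per-token prices in `main` are placeholders chosen for easy arithmetic, not real provider pricing.

```go
package main

import "fmt"

// StepKind distinguishes cheap mechanical steps from reasoning steps.
type StepKind int

const (
	ToolCall StepKind = iota
	Reasoning
)

// Model pairs a model name with a price per one million tokens.
type Model struct {
	Name          string
	USDPerMTokens float64
}

// Router sends reasoning steps to the strong model and everything else to
// the cheap one.
type Router struct {
	Cheap, Strong Model
}

// Route picks the model for a step kind.
func (r Router) Route(kind StepKind) Model {
	if kind == Reasoning {
		return r.Strong
	}
	return r.Cheap
}

// Step is one unit of work with its token usage.
type Step struct {
	Kind   StepKind
	Tokens int
}

// Cost sums the price of each step on the model it is routed to.
func (r Router) Cost(steps []Step) float64 {
	total := 0.0
	for _, s := range steps {
		total += float64(s.Tokens) / 1e6 * r.Route(s.Kind).USDPerMTokens
	}
	return total
}

func main() {
	// Placeholder prices, for illustration only.
	routed := Router{
		Cheap:  Model{Name: "gpt-4o-mini", USDPerMTokens: 1},
		Strong: Model{Name: "gpt-4o", USDPerMTokens: 10},
	}
	// Baseline: every step goes to the strong model.
	allStrong := Router{Cheap: routed.Strong, Strong: routed.Strong}

	steps := []Step{
		{Kind: ToolCall, Tokens: 400},
		{Kind: ToolCall, Tokens: 300},
		{Kind: Reasoning, Tokens: 200},
	}
	fmt.Printf("routed:     $%.6f\n", routed.Cost(steps))
	fmt.Printf("all strong: $%.6f\n", allStrong.Cost(steps))
}
```

How much routing saves depends on the real price ratio between the two models and on what share of tokens goes to tool calls rather than reasoning.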

Example output:

go run ./cmd/ark run agent.yaml --task "write a function in Go that reads CSV"

✅ Task completed successfully
Steps: 1 | Tokens: 637 | Time: 5.6s | Cost: $0.002

The generated code compiles, runs, and passes auto-generated tests before you see it.

GitHub: github.com/atripati/ark

I'm a CS undergrad at DePaul in Chicago building this solo. Applied to YC S26 with it. Happy to answer questions about the architecture, the verification pipeline, or why I chose Go for this.

I built an AI agent runtime in Go that compiles and tests generated code before delivering it: 35 files, 156 tests, zero dependencies
