
We are reaching the structural limits of probabilistic code generation
Spent the morning trying to debug a microservice where the original dev clearly just copy-pasted LLM output. It looked syntactically flawless, but the underlying state transitions were completely hallucinated.
I'm getting so exhausted by the industry pretending that slapping a "critic model" on top of a generator model actually solves the correctness problem. It's just probabilities all the way down. You can't brute-force reliability by adding more layers of guesswork.
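To put rough numbers on that, here is a back-of-the-envelope sketch using Bayes' rule. Everything in it is an assumption for illustration: the function name `p_correct_given_accepted`, the generator accuracy, and the critic's accept rates are all made up, not measurements of any real model.

```python
def p_correct_given_accepted(p_gen: float, critic_tpr: float, critic_fpr: float) -> float:
    """Bayes' rule: P(code is correct | critic accepted it).

    p_gen      -- assumed probability the generator's output is correct
    critic_tpr -- assumed probability the critic accepts correct code
    critic_fpr -- assumed probability the critic accepts WRONG code
    """
    accepted_and_correct = p_gen * critic_tpr
    accepted_and_wrong = (1.0 - p_gen) * critic_fpr
    return accepted_and_correct / (accepted_and_correct + accepted_and_wrong)


# Hypothetical numbers: a decent generator plus a fairly independent critic.
independent = p_correct_given_accepted(p_gen=0.80, critic_tpr=0.95, critic_fpr=0.20)

# Same generator, but the critic shares the generator's blind spots,
# so it waves through most of the wrong answers too.
correlated = p_correct_given_accepted(p_gen=0.80, critic_tpr=0.95, critic_fpr=0.90)

print(f"independent critic: {independent:.3f}")
print(f"correlated critic:  {correlated:.3f}")
```

Under these assumed numbers, the critic only helps to the extent that its mistakes are independent of the generator's, and even then the result is a higher probability, not a guarantee. If the critic tends to be fooled by the same plausible-looking bugs, the accepted output is barely more trustworthy than the raw output.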
If we actually want to use agentic systems in serious production engineering, and not just for spinning up standard frontend boilerplate, we have to move away from pure autoregressive text prediction. I've been reading about how some alternative architectures (like energy-based models) are being built specifically to produce code that compiles and passes strict formal verification, instead of just guessing the next most likely token.
If a system can't mathematically prove that the logic it writes is valid inside a strict compiler framework, why are we trusting it with anything resembling critical infrastructure? It just feels like we are building massive software houses of cards right now because it's cheap and fast.