
Intercepting LLM alignment drift in VRAM at token zero (~10µs latency)
Hey everyone, I’ve been benchmarking an alternative safety/fencing architecture on my 2x RTX 3060 setup and wanted to share the PoC results.
Instead of using classic prompt moderation or running a secondary guard model (like Llama-Guard) in parallel, which can double latency and eats a lot of VRAM, I built a low-level C++ bridge for llama.cpp that monitors latent-state activations directly in VRAM via an OpenCL kernel.
It hooks the memory loop, projects the token tensors using Hyperdimensional Computing (HDC), and forces an immediate process kill (exit code 137, i.e. 128 + SIGKILL) if a malicious loop or semantic drift is detected. Total overhead is under 0.1% (~10 µs per token).
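For anyone who hasn't played with HDC, here is a rough CPU-side sketch of the general idea in plain C++. To be clear, this is not the code from the repo: the hypervector width, the random bipolar projection, the Hamming-similarity threshold, and every name in it (`Projection`, `encode`, `drift_detected`, `hard_stop`) are assumptions I made up for illustration. It projects an activation vector into a binary hypervector, compares it against a reference "unsafe" hypervector, and shows how a SIGKILL on a POSIX system produces the exit code 137 that the shell reports.

```cpp
#include <bitset>
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdlib>
#include <random>
#include <vector>

// Hypervector width: an assumed value, chosen only for illustration.
constexpr std::size_t kHvBits = 1024;
using Hypervector = std::bitset<kHvBits>;

// Random bipolar (+1/-1) projection, row-major: kHvBits rows x dim columns.
struct Projection {
    std::size_t dim;
    std::vector<signed char> signs;
};

// Build a fixed pseudo-random projection for activations of length `dim`.
Projection make_projection(std::size_t dim, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution coin(0.5);
    Projection p{dim, std::vector<signed char>(kHvBits * dim)};
    for (auto& s : p.signs) s = coin(rng) ? 1 : -1;
    return p;
}

// Project an activation vector and binarize by sign (one bit per row).
Hypervector encode(const Projection& p, const std::vector<float>& act) {
    assert(act.size() == p.dim);
    Hypervector hv;
    for (std::size_t r = 0; r < kHvBits; ++r) {
        float acc = 0.0f;
        for (std::size_t c = 0; c < p.dim; ++c)
            acc += static_cast<float>(p.signs[r * p.dim + c]) * act[c];
        hv[r] = (acc >= 0.0f);
    }
    return hv;
}

// Normalized Hamming similarity: 1.0 = identical, 0.0 = every bit differs.
double similarity(const Hypervector& a, const Hypervector& b) {
    return 1.0 - static_cast<double>((a ^ b).count()) / kHvBits;
}

// Flag drift when the current state is too close to a known-bad reference.
bool drift_detected(const Hypervector& current,
                    const Hypervector& unsafe_ref,
                    double threshold) {
    return similarity(current, unsafe_ref) >= threshold;
}

// POSIX-only: SIGKILL is signal 9, so the shell reports 128 + 9 = 137.
[[noreturn]] void hard_stop() {
    std::raise(SIGKILL);
    std::abort();  // not reached on POSIX; keeps [[noreturn]] honest
}
```

In a real hook you would call `encode` on whatever activations you copy out per token and call `hard_stop()` when `drift_detected` returns true. On the GPU the inner loop is just a matrix-vector product followed by a sign test, which is why this kind of check can stay cheap relative to the forward pass.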
I uploaded the architectural specification (including notes on scaling this into an Echo State Network/Reservoir Computing layer using shared memory tiling) and a split-screen demo video showing the interlock striking down an unconstrained model.
You can check the validation protocol and the video here: https://github.com/johndoerch-eng/kinetix-latent-interlock
Curious to hear your thoughts on doing safety at the matrix level rather than the token-text level.