Hypnos-Q1: First open-weight LLM architecturally bonded to a quantum computer
Hey guys,
I'm releasing Hypnos-Q1, a 4B reasoning model that's
architecturally bonded to a real quantum computer (IBM Heron r2,
ibm_kingston). As far as I know, it's the first open-weight LLM where every
forward pass (in the full Python flow) incorporates a measurement from actual
quantum hardware via a learned embedding-level injection.
TL;DR
- 4B params, Qwen3.5-4B base, full safetensors + 5 GGUF variants in one repo
- Real IBM Quantum measurements (SYK scrambler circuits, depths 1-3)
injected through a `<|quantum_sig|>` special token
- Verifiable attestation: IBM job IDs are published, and the signature SHA-256 is
recorded in `quantum_attestation.json`, so you can independently verify which
quantum jobs produced the training corpus
- Benchmarks: 79.4 GPQA Diamond, 81.1 MMLU-Pro, 89.8 ParseBench Text
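For the attestation bullet, here's roughly what a check could look like. This is only a sketch: the JSON layout (`jobs`, `job_id`, `sha256`), the job IDs, and the choice to hash a compact JSON encoding of the signature values are all my assumptions for illustration, so look at the real `quantum_attestation.json` for the actual schema. In a real check you'd also pull the job results from IBM using the published job IDs instead of using toy data.

```python
import hashlib
import json


def signature_digest(values):
    """SHA-256 hex digest of a signature (assumed encoding: compact JSON)."""
    payload = json.dumps(values, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify(attestation, signatures_by_job):
    """Return the job IDs whose recomputed digest doesn't match the published one."""
    mismatched = []
    for entry in attestation["jobs"]:          # hypothetical schema
        job_id = entry["job_id"]
        values = signatures_by_job.get(job_id)
        if values is None or signature_digest(values) != entry["sha256"]:
            mismatched.append(job_id)
    return mismatched


# Toy data: made-up job IDs and values, not real measurements.
sigs = {
    "job-a": [0.91, 0.74, 0.52, 0.88, 0.69, 0.47],
    "job-b": [0.93, 0.71, 0.55, 0.86, 0.66, 0.49],
}
attestation = {
    "jobs": [{"job_id": j, "sha256": signature_digest(v)} for j, v in sigs.items()]
}

print(verify(attestation, sigs))   # [] (everything matches)

sigs["job-b"][0] = 0.0             # simulate a tampered signature
print(verify(attestation, sigs))   # ['job-b']
```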
What "quantum-bonded" actually means
Not marketing fluff. The base Qwen3.5-4B has a special token `<|quantum_sig|>` whose embedding is replaced at inference time by the output of a learned `quantum_proj: ℝ⁶ → ℝ²⁵⁶⁰` layer, fed real OTOC measurements from IBM hardware. The training corpus was generated with 64 unique quantum signatures from SYK scrambler circuits run on ibm_kingston.
The full Resonance flow requires Python + the `QuantumAwareEmbedding`
wrapper (provided in the README). The GGUF variants are
classical-deployment builds: they work as standard reasoning models
but without the live quantum injection (llama.cpp doesn't support
custom embedding wrappers).
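To make the injection idea concrete, here's a minimal NumPy sketch of swapping the `<|quantum_sig|>` embedding for a projected signature. It is not the `QuantumAwareEmbedding` wrapper from the README. The token id, the toy vocabulary, the random weights, and the assumption that `quantum_proj` is a single linear map are all placeholders I picked for illustration; only the 6 → 2560 shape comes from the description above.

```python
import numpy as np

HIDDEN = 2560        # embedding width (from the R^6 -> R^2560 projection)
SIG_DIM = 6          # length of one quantum signature
VOCAB = 32           # toy vocabulary size (hypothetical)
QUANTUM_SIG_ID = 7   # hypothetical id standing in for <|quantum_sig|>

rng = np.random.default_rng(0)
embed_table = rng.normal(size=(VOCAB, HIDDEN)).astype(np.float32)
# Stand-in for learned quantum_proj weights; a plain linear layer is assumed.
W = rng.normal(scale=0.02, size=(SIG_DIM, HIDDEN)).astype(np.float32)
b = np.zeros(HIDDEN, dtype=np.float32)


def embed_with_signature(token_ids, signature):
    """Look up embeddings, then overwrite each <|quantum_sig|> row
    with the projection of the signature."""
    token_ids = np.asarray(token_ids)
    signature = np.asarray(signature, dtype=np.float32)
    if signature.shape != (SIG_DIM,):
        raise ValueError(f"expected {SIG_DIM} values, got shape {signature.shape}")
    hidden = embed_table[token_ids]      # integer-array indexing returns a copy
    hidden[token_ids == QUANTUM_SIG_ID] = signature @ W + b
    return hidden


# Made-up signature values, not real OTOC measurements.
signature = [0.91, 0.74, 0.52, 0.88, 0.69, 0.47]
out = embed_with_signature([3, QUANTUM_SIG_ID, 5], signature)
print(out.shape)   # (3, 2560)
```

Only the row at the `<|quantum_sig|>` position changes. Every other token keeps its normal embedding and the embedding table itself is left untouched, which is why a runtime that can't hook the embedding lookup falls back to plain classical behavior.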
Available quantization variants
| Variant | Size | Use case |
|---------|------|----------|
| safetensors | 9 GB | Full quantum bonding (Python) |
| Q4_K_M | 2.5 GB | Default for most users |
| Q5_K_M | 2.9 GB | Better quality, still compact |
| Q6_K | 3.3 GB | Near-lossless |
| Q8_0 | 4.5 GB | Essentially lossless |
| F16 | 7.9 GB | Reference |
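If you're not sure which GGUF to grab, a quick helper like this picks the largest one that fits your memory. The file sizes are copied from the table; the 1.5 GB headroom for KV cache and runtime overhead is just a guess on my part, so tune it for your context length and setup.

```python
# GGUF file sizes in GB, copied from the table above.
VARIANT_SIZES_GB = {
    "Q4_K_M": 2.5,
    "Q5_K_M": 2.9,
    "Q6_K": 3.3,
    "Q8_0": 4.5,
    "F16": 7.9,
}


def pick_variant(free_memory_gb, headroom_gb=1.5):
    """Return the largest variant whose file fits in memory after
    reserving headroom (an assumed allowance), or None if nothing fits."""
    budget = free_memory_gb - headroom_gb
    fitting = [(size, name) for name, size in VARIANT_SIZES_GB.items() if size <= budget]
    if not fitting:
        return None
    return max(fitting)[1]


print(pick_variant(12.0))  # F16
print(pick_variant(8.0))   # Q8_0
print(pick_variant(4.0))   # Q4_K_M
print(pick_variant(3.0))   # None
```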
Try it in 30 seconds
`ollama run hf.co/squ11z1/Hypnos-Q1:Q4_K_M`
Repo: https://huggingface.co/squ11z1/Hypnos-Q1
A companion preprint on the broader quantum-classical research that led
to this architecture (OTOC scrambling statistics, operational ER=EPR,
discrete time crystal, Zeno freeze on the same hardware) is in
preparation.
All quantum compute ran on the IBM Open Plan (free tier), so anyone can reproduce this.
Happy to answer questions about the architecture, the quantum
signature pipeline, or anything else.