u/Disastrous_Bid5976

Hypnos-Q1: First open-weight LLM architecturally bonded to a quantum computer

Hey guys,

I'm releasing Hypnos-Q1 - a 4B reasoning model that's architecturally bonded to a real quantum computer (IBM Heron r2, ibm_kingston). It's the first open-weight LLM where every forward pass incorporates a measurement from actual quantum hardware via a learned embedding-level injection.

TL;DR

- 4B params, Qwen3.5-4B base, full safetensors + 5 GGUF variants in one repo
- Real IBM Quantum measurements (SYK scrambler circuits, depths 1-3) injected through a `<|quantum_sig|>` special token
- Verifiable attestation: IBM job IDs published, signature SHA-256 in `quantum_attestation.json` so you can independently verify which quantum jobs produced the training corpus
- Benchmarks: 79.4 GPQA Diamond, 81.1 MMLU-Pro, 89.8 ParseBench Text
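For anyone who wants to check the attestation bullet themselves, here is a rough sketch of what a digest check could look like. The field names (`signatures`, `signature_sha256`) and the canonical-JSON hashing scheme are assumptions made for illustration, not the actual layout of `quantum_attestation.json`, so adapt them to whatever the file in the repo really contains.

```python
import hashlib
import json


def signature_digest(signatures):
    """SHA-256 over a canonical JSON dump of the signature vectors.

    Assumption: the real attestation may hash a different serialization.
    """
    canonical = json.dumps(signatures, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_attestation(attestation):
    """Return True if the recorded digest matches the recomputed one."""
    expected = attestation["signature_sha256"]  # hypothetical field name
    actual = signature_digest(attestation["signatures"])  # hypothetical field name
    return actual == expected


# Toy attestation built locally so the example runs without the real file.
toy_signatures = [[0.12, -0.40, 0.88, 0.05, -0.31, 0.67]]
toy_attestation = {
    "signatures": toy_signatures,
    "signature_sha256": signature_digest(toy_signatures),
}
print(verify_attestation(toy_attestation))
```

With the real file you would load it via `json.load` and pass the resulting dict to `verify_attestation`. A matching digest only shows the published signatures were not altered after hashing; tying them to specific IBM jobs still depends on the published job IDs.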

What "quantum-bonded" actually means

This isn't marketing fluff. The base Qwen3.5-4B has a special token `<|quantum_sig|>` whose embedding is replaced at inference time by the output of a learned `quantum_proj: ℝ⁶ → ℝ²⁵⁶⁰` layer, which is fed real OTOC (out-of-time-order correlator) measurements from IBM hardware. The training corpus was generated with 64 unique quantum signatures from SYK scrambler circuits run on ibm_kingston.
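To make the injection mechanism concrete, here is a minimal pure-Python sketch of the idea: project a 6-value signature to 2560 dimensions and swap the result in wherever the special token appears. It is not the QuantumAwareEmbedding wrapper from the README. The token id, the random weights, the bias term, the toy vocabulary, and the signature values are all placeholders, and treating `quantum_proj` as a plain affine map is an assumption.

```python
import random

SIG_DIM = 6         # length of the quantum signature vector (from the post)
HIDDEN_DIM = 2560   # projection output width (from the post)
QUANTUM_SIG_ID = 7  # hypothetical token id for <|quantum_sig|>


def make_quantum_proj(seed=0):
    """Randomly initialised stand-in for the learned quantum_proj weights."""
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, 0.02) for _ in range(SIG_DIM)]
              for _ in range(HIDDEN_DIM)]
    bias = [0.0] * HIDDEN_DIM
    return weight, bias


def project(signature, weight, bias):
    """Affine map from R^6 to R^2560 (assumed form of quantum_proj)."""
    return [sum(w * x for w, x in zip(row, signature)) + b
            for row, b in zip(weight, bias)]


def embed_with_injection(token_ids, embedding_table, signature, weight, bias):
    """Look up embeddings, swapping in the projected signature at the special token."""
    injected = project(signature, weight, bias)
    return [injected if tok == QUANTUM_SIG_ID else embedding_table[tok]
            for tok in token_ids]


# Toy vocabulary of 8 tokens with all-zero embeddings.
embedding_table = [[0.0] * HIDDEN_DIM for _ in range(8)]
weight, bias = make_quantum_proj()
signature = [0.12, -0.40, 0.88, 0.05, -0.31, 0.67]  # made-up values, not real OTOC data
hidden = embed_with_injection([1, QUANTUM_SIG_ID, 3], embedding_table,
                              signature, weight, bias)
print(len(hidden), len(hidden[1]))
```

A real implementation would presumably do the same thing with tensors inside the model's embedding layer; the sketch only shows that ordinary tokens keep their table embedding while the special token's row is replaced by the projected signature.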

The full Resonance flow requires Python plus the QuantumAwareEmbedding wrapper (provided in the README). The GGUF variants are classical-deployment builds: they work as standard reasoning models but without the live quantum injection, because llama.cpp doesn't support custom embedding wrappers.

Available quantization variants

| Variant | Size | Use case |
|---------|------|----------|
| safetensors | 9 GB | Full quantum bonding (Python) |
| Q4_K_M | 2.5 GB | Default for most users |
| Q5_K_M | 2.9 GB | Better quality, still compact |
| Q6_K | 3.3 GB | Near-lossless |
| Q8_0 | 4.5 GB | Essentially lossless |
| F16 | 7.9 GB | Reference |
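As a rough sanity check on the table, you can turn each GGUF file size into an average number of bits per weight. This back-of-the-envelope sketch assumes 1 GB = 1e9 bytes and a nominal 4e9 parameters (the post only says "4B"), and it ignores file metadata, so treat the output as approximate.

```python
NOMINAL_PARAMS = 4e9  # "4B" from the post; the exact count is not stated

# File sizes in GB, copied from the table above.
SIZES_GB = {"Q4_K_M": 2.5, "Q5_K_M": 2.9, "Q6_K": 3.3, "Q8_0": 4.5, "F16": 7.9}


def bits_per_weight(size_gb, n_params=NOMINAL_PARAMS):
    """Average bits stored per parameter, assuming 1 GB = 1e9 bytes."""
    return size_gb * 1e9 * 8 / n_params


for name, size in SIZES_GB.items():
    print(f"{name}: {bits_per_weight(size):.1f} bits/weight")
```

The F16 row comes out close to 16 bits per weight, which suggests the nominal parameter count is a reasonable approximation for the GGUF files.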

Try it in 30 seconds

    ollama run hf.co/squ11z1/Hypnos-Q1:Q4_K_M

Repo: https://huggingface.co/squ11z1/Hypnos-Q1

A companion preprint on the broader quantum-classical research that led to this architecture (OTOC scrambling statistics, operational ER=EPR, discrete time crystal, Zeno freeze on the same hardware) is in preparation.

All quantum compute ran on the IBM Open Plan (free tier), so anyone can reproduce this.

Happy to answer questions about the architecture, the quantum signature pipeline, or anything else.

u/Disastrous_Bid5976 — 2 days ago