u/wunderstrudel

Modern E2EE systems

Hi cryptography nerds!

I am a full-stack / end-to-end developer "specializing" in SaaS.
I have previously worked with E2EE systems, but sadly that was 10+ years ago.
So I am a little rusty / outdated on the matter.

Recently I started a new project / platform where I wanna try playing around with making a modern E2EE / zero knowledge platform.
I have done quite a bit of research trying to figure out how to best follow modern standards. (Not including post-quantum encryption).
So I was hoping to lay out my findings in here and hopefully some of you guys would be willing to audit it.

Thanks a lot in advance!
(Let me know if I left out some important information)

------------------------------------------------------------------------------------------------------

Trust model

  • Server sees: UUIDs, timestamps, operational enums, public keys, signatures, opaque HPKE ciphertext blobs.
  • Server never sees: user private keys, password/recovery KEKs, tenant private keys, plaintext content of anything user-typed.
  • Browser compromise = game over. Explicitly out of scope. The user's password is their cryptographic root.
  • TLS assumed for transport.

------------------------------------------------------------------------------------------------------

Primitives

Purpose            │ What                 │ Params
───────────────────┼──────────────────────┼─────────────────────────────────────────────────────────
Password hardening │ Argon2id             │ t=3, m=65536, p=4, dkLen=32 minimum
KDF                │ HKDF-SHA256          │ inside HPKE + standalone for on-disk priv-key wrap
Symmetric AEAD     │ AES-256-GCM          │ 12-byte nonce, 16-byte tag
Hybrid PKE         │ HPKE Base (RFC 9180) │ DHKEM(X25519, HKDF-SHA256) / HKDF-SHA256 / AES-256-GCM
Signatures         │ Ed25519              │ identity, manifest, inter-service auth, grant signatures

Forbidden: RSA, AES-CBC, AES without an authenticated mode, scrypt, hand-rolled ECIES, P-256, separate encrypt+MAC. I chose HPKE specifically, rather than rolling our own X25519+HKDF+AES-GCM (which is structurally what HPKE is), so the implementation can be audited against the RFC 9180 test vectors.
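For the standalone HKDF-SHA256 row in the table, here is a minimal stdlib sketch of RFC 5869 Extract-then-Expand. It assumes the output is used as a 32-byte AES-256-GCM wrapping key. The function name derive_wrap_key and the label strings are hypothetical, and a vetted library implementation would be preferable in production.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 Extract: PRK = HMAC-SHA256(salt, IKM)."""
    if not salt:
        salt = b"\x00" * HASH_LEN  # RFC 5869 default when no salt is given
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 Expand: T(i) = HMAC-SHA256(PRK, T(i-1) || info || i)."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF-SHA256")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_wrap_key(ikm: bytes, salt: bytes, label: str) -> bytes:
    # `label` is a hypothetical domain-separation string, e.g. "priv_wrap:v1".
    return hkdf_expand(hkdf_extract(salt, ikm), label.encode("utf-8"), 32)
```

Different labels over the same input key material give unrelated 32-byte keys, which is the domain-separation property the info parameter is there for.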

------------------------------------------------------------------------------------------------------

Key hierarchy

How a user gets from a password to being able to open anything:

            password                 recovery code
                │                          │
                ▼ Argon2id                 ▼ Argon2id
            KEK_password             KEK_recovery
                │                          │
                ▼ AES-GCM (one of two wraps, same plaintext)
        ┌──────────────────────────────────────┐
        │     member priv (X25519)             │
        │     member signing priv (Ed25519)    │
        └──────────────────────────────────────┘
                │
                ▼ HPKE.Open (unwraps for each tenant the member belongs to)
        ┌──────────────────────────────────────┐
        │     tenant priv (X25519)             │
        └──────────────────────────────────────┘
                │
                ├──▶ HPKE.Open on tenant-recipient ciphertext
                │    (names, descriptions, payloads, audit log, etc.)
                │
                └──▶ HPKE.Open on the tenant's vault entries,
                     then HPKE.Seal under service pubkey for one call
                     (the structural plaintext exception)

------------------------------------------------------------------------------------------------------

Three recipient classes

Every HPKE envelope is sealed under exactly one of three recipient classes. The access boundaries are structural — they fall out of who holds which private key, not from RBAC:

                      │ Browser (signed-in │ Backend
                      │  member of tenant) │  service
──────────────────────┼────────────────────┼─────────────
Tenant-recipient      │  decrypt           │  no access
Service-recipient     │  encrypt only      │  decrypt
Member-recipient      │  decrypt (only the │  no access,
                      │   target member;   │   ever, by
                      │   others encrypt)  │   construction

The service holds exactly one private key — its own. It has no path to decrypt tenant-recipient or member-recipient envelopes, and that's not policy, it's that the bytes don't exist on that machine. Member-personal data lives in a separate table with no column shape that could ever hold a service-recipient envelope, so a future bug can't accidentally add that path.

------------------------------------------------------------------------------------------------------

Wire format

All HPKE envelopes are self-describing:

versioned_envelope =
   'v' '1'  ||  key_id (16-byte UUID)  ||  enc (32 bytes)  ||  ciphertext_with_tag
  • key_id resolves into the right keys table based on the recipient class implied by the call site.
  • enc is the RFC 9180 encapsulated key (X25519 ephemeral public).
  • AAD is not on the wire — reconstructed from authenticated context at decrypt time.
  • No nonce on the wire — HPKE derives the AEAD nonce from its KeySchedule + sequence; we use sequence 0 (one Seal per ciphertext).
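The layout above can be sketched as a pack/parse pair. This is a minimal illustration of the byte offsets only (2-byte version, 16-byte key_id, 32-byte enc, then ciphertext with tag); the Envelope class and function names are hypothetical, and no HPKE operation is performed here.

```python
import uuid
from dataclasses import dataclass

VERSION = b"v1"
KEY_ID_LEN = 16  # raw UUID bytes
ENC_LEN = 32     # X25519 encapsulated key
TAG_LEN = 16     # AES-256-GCM tag
HEADER_LEN = len(VERSION) + KEY_ID_LEN + ENC_LEN  # 50 bytes


@dataclass(frozen=True)
class Envelope:
    key_id: uuid.UUID
    enc: bytes
    ciphertext: bytes  # AEAD output, tag included


def pack_envelope(env: Envelope) -> bytes:
    if len(env.enc) != ENC_LEN:
        raise ValueError("enc must be 32 bytes")
    if len(env.ciphertext) < TAG_LEN:
        raise ValueError("ciphertext shorter than the AEAD tag")
    return VERSION + env.key_id.bytes + env.enc + env.ciphertext


def parse_envelope(blob: bytes) -> Envelope:
    # Reject anything that cannot hold a header plus at least a tag.
    if len(blob) < HEADER_LEN + TAG_LEN:
        raise ValueError("envelope too short")
    if blob[:2] != VERSION:
        raise ValueError("unsupported envelope version")
    key_id = uuid.UUID(bytes=blob[2:18])
    return Envelope(key_id, blob[18:50], blob[50:])
```

Because the version and key_id sit outside the AEAD, a parser like this only routes the blob; whether the header is also authenticated depends on what goes into the AAD.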

I picked envelope-embedded key_id over a row column to keep ciphertext self-describing. Both are compatible with RFC 9180, which leaves the message encoding to the application; the trade is "rotation queries scan a blob prefix" vs "indexed column lookup." Is there a reason this is the wrong trade?

------------------------------------------------------------------------------------------------------

AAD discipline (this is where I most want eyes)

Every Seal/Open passes AAD that binds the ciphertext to its specific row / field / tenant / purpose. AAD is reconstructed from authenticated context at Open time — never read from storage. Example recipes:

  • Generic tenant-readable data: tenant_data:v1:{tenant_id}:{resource_type}:{resource_id}:{field_name}
  • Wrapped tenant priv for member X: tenant_priv_wrap:v1:{tenant_id}:{tenant_key_id}:{member_id}
  • Service-recipient envelope: service_envelope:v1:{service_key_id}:{tenant_id}:{secret_id}:{binding_id}:{purpose_id}:{grant_id}:{expires_at}
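Since these recipes are colon-joined strings, a field that itself contains ":" could let two different tuples serialize to the same AAD. Here is a sketch of a builder that rejects such fields, assuming IDs are UUID strings and the remaining fields are short ASCII tokens; the helper names and the allowed character set are hypothetical.

```python
import re

# Allowed characters per field: no ':' so the join stays unambiguous.
_TOKEN = re.compile(r"[A-Za-z0-9_.-]+")


def _field(value) -> str:
    text = str(value)
    if not _TOKEN.fullmatch(text):
        raise ValueError(f"AAD field not allowed: {text!r}")
    return text


def build_aad(prefix: str, *fields) -> bytes:
    # prefix:v1:field1:field2:... with every part validated.
    parts = [_field(prefix), "v1"] + [_field(f) for f in fields]
    return ":".join(parts).encode("ascii")


def tenant_data_aad(tenant_id, resource_type, resource_id, field_name) -> bytes:
    return build_aad("tenant_data", tenant_id, resource_type, resource_id, field_name)
```

Under this sketch expires_at would have to be an integer such as epoch seconds, because ISO 8601 timestamps contain colons and would be rejected.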

Every HPKE info label is namespaced and domain-separated. HKDF inside HPKE mixes info into key derivation, so even with the same recipient pubkey, ciphertext sealed for one context cannot be Open'd against another's label.

The rule I'm enforcing: every AAD must include enough authenticated context that any cross-row / cross-tenant / cross-purpose substitution is detectable on Open. Nervous I've missed a substitution attack on a specific recipe.

Authorization scopes are plaintext canonical JSON, Ed25519-signed by the issuing user's identity key, byte-preserved end-to-end. Not encrypted, because the server has to enforce them at access time. The signed bytes bind a tenant-pubkey-hash, allowlists, and expiry. Is the signed-but-readable shape right here, or is there a cleaner primitive?
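As a rough illustration of the signed-but-readable shape, the sketch below serializes a scope deterministically and shows the server reading the exact signed bytes. The field names (tenant_pubkey_hash, allowed_actions, expires_at) and the use of SHA-256 for the pubkey hash are assumptions, and Ed25519 signing and verification are omitted because they are not in the Python standard library. Sorted keys with compact separators is a simplification, not full RFC 8785 (JCS) canonicalization.

```python
import hashlib
import json


def canonical_scope_bytes(scope: dict) -> bytes:
    # Deterministic for plain str/int/list/dict payloads; these are the
    # bytes that would be signed and then stored unchanged.
    return json.dumps(
        scope, sort_keys=True, separators=(",", ":"),
        ensure_ascii=False, allow_nan=False,
    ).encode("utf-8")


def tenant_pubkey_hash(tenant_pub: bytes) -> str:
    # Assumption: the scope binds SHA-256 of the raw tenant public key.
    return hashlib.sha256(tenant_pub).hexdigest()


def scope_allows(signed_bytes: bytes, action: str, now: int) -> bool:
    # Assumes the signature over signed_bytes was already verified.
    # The server parses the signed bytes themselves, never a re-serialized copy.
    scope = json.loads(signed_bytes.decode("utf-8"))
    return now < scope["expires_at"] and action in scope["allowed_actions"]
```

Keeping the signed bytes as an opaque blob and parsing them only for enforcement avoids any dependence on two JSON serializers agreeing byte for byte.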

------------------------------------------------------------------------------------------------------

The one structural plaintext exception

There is exactly one spot where plaintext briefly touches a server: a backend service must occasionally decrypt user-supplied secrets to forward them to a third party on the user's behalf. I can't see an architectural shape where "service makes the call on your behalf" and "service never sees the secret" are simultaneously true.

1. Browser opens secret under tenant priv (browser has tenant priv).
2. Browser HPKE-Seals secret under the SERVICE pubkey. AAD binds it
   to (service_key_id, tenant_id, secret_id, binding_id, purpose,
   one-of[execution_id, grant_id + expires_at]).
3. Ciphertext travels through intermediate hops — none of which can
   decrypt (they hold no service private key).
4. Service verifies its inter-service JWT, reconstructs AAD from
   JWT-authenticated values + its own service_key_id, and calls
   HPKE.OpenBase. Plaintext lives on ONE stack frame for ONE call
   and goes out of scope on return.
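Step 4's AAD reconstruction could look roughly like this for the grant variant. It assumes the JWT has already been verified, that the claim names match the field names in the service_envelope recipe, and that expires_at is an integer epoch timestamp; the function name and the expiry check before Open are illustrative choices, not a description of the actual service.

```python
import time
from typing import Optional

# Order follows the service_envelope:v1 recipe after service_key_id.
REQUIRED_CLAIMS = (
    "tenant_id", "secret_id", "binding_id",
    "purpose_id", "grant_id", "expires_at",
)


def reconstruct_service_aad(claims: dict, service_key_id: str,
                            now: Optional[int] = None) -> bytes:
    # Every value comes from verified JWT claims or local config,
    # never from the request body or from storage.
    missing = [name for name in REQUIRED_CLAIMS if name not in claims]
    if missing:
        raise ValueError(f"missing JWT claims: {missing}")
    expires_at = claims["expires_at"]
    if isinstance(expires_at, bool) or not isinstance(expires_at, int):
        raise ValueError("expires_at must be an integer timestamp")
    current = int(time.time()) if now is None else now
    if current >= expires_at:
        raise PermissionError("grant expired")  # refuse before attempting Open
    parts = ["service_envelope", "v1", service_key_id]
    parts += [str(claims[name]) for name in REQUIRED_CLAIMS]
    return ":".join(parts).encode("utf-8")
```

If any claim differs from what the browser bound at Seal time, the AAD differs and the AEAD tag check in Open fails.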

If that service private key leaks AND an attacker has DB access, AAD reconstruction is trivial, since every field is a plaintext column or a JWT claim derivable from one. Is there defense-in-depth at the protocol layer that I'm missing?

------------------------------------------------------------------------------------------------------

Rotation (full-re-encryption model)

Tenant-key rotation is eager full re-encryption, browser-driven:

  1. Admin browser unwraps current tenant priv.
  2. Generates new X25519 keypair.
  3. HPKE-Seals new tenant priv under every current member's pubkey (one wrap row per member).
  4. Atomic txn: insert new key as current, mark old as superseded, insert all new wrap rows.
  5. Browser walks every encrypted row, decrypts under old key, re-encrypts under new key. Resumable across sessions.
  6. When zero rows reference the old key: delete every wrap row for the old key, retire the old key, browser zeroes its copy.
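Steps 5 and 6 can be modeled in memory as below. This is a toy sketch: rows is a dict standing in for the encrypted table, reseal is a placeholder for "HPKE.Open under old key, HPKE.Seal under new key", and all names are hypothetical. Selecting rows by key_id rather than by a saved cursor is what makes the walk resumable here.

```python
from typing import Callable, Dict, Tuple

Row = Tuple[str, bytes]  # (key_id, ciphertext)


def reencrypt_batch(rows: Dict[str, Row], old_key_id: str, new_key_id: str,
                    reseal: Callable[[bytes], bytes], batch_size: int = 100) -> int:
    """Re-encrypt up to batch_size rows still under old_key_id; return the count."""
    pending = [rid for rid, (kid, _) in rows.items() if kid == old_key_id]
    pending = pending[:batch_size]
    for rid in pending:
        _, ciphertext = rows[rid]
        rows[rid] = (new_key_id, reseal(ciphertext))
    return len(pending)


def can_retire(rows: Dict[str, Row], old_key_id: str) -> bool:
    """Step 6 precondition: no row references the old key any more."""
    return all(kid != old_key_id for kid, _ in rows.values())
```

In a real database the can_retire check and the deletion of the old wrap rows would presumably need to share one transaction, and writes sealed under a superseded key would need to be rejected. Otherwise a stale browser tab could insert an old-key row between the check and the delete.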

------------------------------------------------------------------------------------------------------

What I'd love eyes on, in priority order

  1. AAD recipes — substitution / replay attack I'm missing? Especially the service-recipient envelope.
  2. Service-key-leak threat — defense-in-depth at the protocol layer?
  3. key_id in envelope vs row column — reason one is materially worse?
  4. Rotation correctness — race in steps 5/6 leaving a ciphertext referencing a retired key?
  5. PQ migration plan — is the suite_id swap actually as clean as I think?
  6. Signed-but-readable authorization scopes — cleaner primitive available?
u/wunderstrudel — 4 days ago