Threat-modeling your already-published post history: the adversary is a cheap AI that reads all of it
Most OPSEC threat modeling looks forward: what should I be careful about posting from now on? This post is about the half that prevention can't reach, the history you have already published, and the threat model people tend to miss.
Threat model. Asset: years of ordinary public posts on a pseudonymous account (Reddit/X). Adversary: someone motivated to link that pseudonym to your real identity (a harasser, an employer, a hostile party in a dispute), now armed with an off-the-shelf LLM. What changed: re-identification used to need a human spending hours; a model now reads your whole history cheaply and notices the intersection you can't feel from inside your own feed. Staab et al. (ICLR 2024) measured up to roughly 85% top-1 accuracy when LLMs inferred personal attributes from plain Reddit text.
The mechanism is the mosaic. Re-identification rarely comes from one careless post. It stacks weak signals (a commute, a slang word, a posting-time slot that betrays your timezone) until they intersect at one person. Judge each finding by its risk contribution, not by how revealing it feels on its own: twenty-eight posts mentioning a neighborhood landmark can be worse than a single post naming your employer, because the twenty-eight intersect with everything else you have posted.
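To make the intersection concrete, here is a rough back-of-the-envelope sketch. Every number in it is made up for illustration, the signal names are hypothetical, and it assumes the signals are independent, which real attributes are not. Treat the result as an order-of-magnitude intuition, not a measurement.

```python
import math

def remaining_candidates(population, signal_fractions):
    """Expected number of people still matching once every signal is applied.

    Assumes the signals are statistically independent (a simplification).
    """
    remaining = float(population)
    for fraction in signal_fractions.values():
        remaining *= fraction
    return remaining

def bits_revealed(fraction):
    """Identifying information in one signal, in bits: -log2(share of people who match)."""
    return -math.log2(fraction)

# Hypothetical numbers, for illustration only.
population = 1_000_000  # e.g. everyone in one metro area
signals = {
    "posts mostly in one timezone's evening": 0.5,
    "uses a regional slang word": 0.1,
    "commutes on one transit line": 0.02,
    "keeps mentioning one neighborhood landmark": 0.005,
}

for name, fraction in signals.items():
    print(f"{name}: {bits_revealed(fraction):.1f} bits")

print(f"candidates left: {remaining_candidates(population, signals):.1f}")
```

No single fraction here looks alarming, but multiplied together they shrink a million people to a handful. That is the sense in which you score a finding by what it contributes to the product, not by how it reads in isolation.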
How to actually work it:
- Pull your export (Reddit: request a copy; X: download archive) and read it adversarially, by category (location, employer, family, schedule, identity links). Ask "does this narrow who I am," not "is this embarrassing."
- On X the leak is usually metadata, not words: the self-set location field, posting-time concentration, image EXIF/GPS, outbound links, the reply graph. A text-only mental model is dangerous reassurance.
- Remediate generalize-first, not mass-delete. Deletion is not erasure (caches, archives, screenshots persist), and removing one post rarely removes the pattern. Edit the highest-contribution items and change what you publish next.
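A minimal sketch of the first pass, assuming you have already loaded your export into a list of posts with a timezone-aware `time` and a `text` field (the loading step depends on the export format and is left out). The category patterns are hypothetical starters, not a vetted list; swap in terms from your own life. It only covers text and timestamps, so EXIF, the location field, and the reply graph still need a separate look. It uses only the standard library, so nothing leaves your machine.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Hypothetical starter patterns; replace with terms specific to you.
CATEGORY_PATTERNS = {
    "location": re.compile(r"\b(neighborhood|station|street|commute)\b", re.I),
    "employer": re.compile(r"\b(my (boss|manager|company|office)|at work)\b", re.I),
    "family": re.compile(r"\bmy (wife|husband|partner|kid|son|daughter|mom|dad)\b", re.I),
    "schedule": re.compile(r"\b(every (morning|evening|monday|friday)|night shift)\b", re.I),
    "identity links": re.compile(r"https?://\S+", re.I),
}

def tag_categories(posts):
    """Map each category name to the texts of the posts that match its pattern."""
    hits = {name: [] for name in CATEGORY_PATTERNS}
    for post in posts:
        for name, pattern in CATEGORY_PATTERNS.items():
            if pattern.search(post["text"]):
                hits[name].append(post["text"])
    return hits

def hour_concentration(posts, top_n=3):
    """Return (share of posts in the top_n busiest UTC hours, those hours sorted)."""
    hours = Counter(post["time"].astimezone(timezone.utc).hour for post in posts)
    busiest = hours.most_common(top_n)
    share = sum(count for _, count in busiest) / len(posts)
    return share, sorted(hour for hour, _ in busiest)

# Tiny made-up sample standing in for a real export.
posts = [
    {"time": datetime(2023, 3, 1, 22, 5, tzinfo=timezone.utc),
     "text": "Commute was slow past the station again"},
    {"time": datetime(2023, 3, 2, 22, 40, tzinfo=timezone.utc),
     "text": "my manager moved the meeting"},
    {"time": datetime(2023, 3, 4, 23, 10, tzinfo=timezone.utc),
     "text": "Great recipe, thanks"},
    {"time": datetime(2023, 3, 5, 9, 0, tzinfo=timezone.utc),
     "text": "my daughter loved this"},
]

hits = tag_categories(posts)
# Largest categories first: a crude stand-in for "highest contribution".
for name, texts in sorted(hits.items(), key=lambda item: len(item[1]), reverse=True):
    print(f"{name}: {len(texts)} post(s)")

share, hours = hour_concentration(posts, top_n=1)
print(f"{share:.0%} of posts fall in UTC hour(s) {hours}")
```

Hit counts are only a proxy for risk contribution. A category with many hits is where the pattern lives, so that is where generalizing a few posts and changing future habits pays off more than deleting any single one.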
The trap specific to this crowd: the obvious way to run the audit is to paste your history into a capable AI and ask what it reveals. If the account is a pseudonym you keep apart from your legal name, and you are using that AI under your real-name account, you just handed one provider both halves of the link you were protecting. For a strictly anonymous account, keep the analysis local and offline.
Happy to get into any category, or the local-versus-cloud trade-off.
i have read the rules