u/Intrepid_Frosting_66

How are you handling PII when sending user data to an LLM? Curious what others are doing

I'm working on Effitrio, an AI-supported web app that helps freelancers and small business owners manage invoices and expenses.

Early on I got stuck on a conflict: the LLM needs to reason about real business data, but I wasn't comfortable sending actual client names, emails, and amounts to an external model.

What I ended up building: before the LLM call, every PII field gets swapped for a stable placeholder token like __PII_1__, __PII_2__, and so on. The real values sit in an AsyncLocalStorage map that lives only for the duration of that call. The LLM sees tokens, reasons with tokens, and returns tokens. After generateText() finishes, I run a restore pass and swap everything back.
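
Here is a minimal sketch of that flow in TypeScript, for anyone who wants something concrete to poke at. It is not the actual Effitrio code: the names piiStore, withPiiScope, maskValue, restoreText, and summarizeInvoice are hypothetical, and fakeModel is a stand-in for the real generateText() call so the example runs without a provider.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Per-call state (hypothetical shape): token -> real value, plus a reverse
// lookup so the same value always maps to the same token within one call.
interface PiiContext {
  tokenToValue: Map<string, string>;
  valueToToken: Map<string, string>;
}

const piiStore = new AsyncLocalStorage<PiiContext>();

// Run fn with a fresh, empty PII map. The map is only reachable from code
// running inside fn, including across awaits.
function withPiiScope<T>(fn: () => T): T {
  return piiStore.run({ tokenToValue: new Map(), valueToToken: new Map() }, fn);
}

// Swap a real value for a stable placeholder such as __PII_1__.
function maskValue(value: string): string {
  const ctx = piiStore.getStore();
  if (!ctx) throw new Error("maskValue called outside withPiiScope");
  const existing = ctx.valueToToken.get(value);
  if (existing) return existing;
  const token = `__PII_${ctx.tokenToValue.size + 1}__`;
  ctx.tokenToValue.set(token, value);
  ctx.valueToToken.set(value, token);
  return token;
}

// Restore pass: replace every known token with its real value.
// Unknown tokens are left as-is rather than guessed.
function restoreText(text: string): string {
  const ctx = piiStore.getStore();
  if (!ctx) throw new Error("restoreText called outside withPiiScope");
  return text.replace(/__PII_\d+__/g, (token) => ctx.tokenToValue.get(token) ?? token);
}

// Stand-in for the real model call (an assumption, not a real SDK function).
// It only ever sees placeholder tokens and echoes the first one back.
async function fakeModel(prompt: string): Promise<string> {
  const firstToken = prompt.match(/__PII_\d+__/)?.[0] ?? "unknown";
  return `Reminder drafted for ${firstToken}.`;
}

async function summarizeInvoice(invoice: {
  clientName: string;
  clientEmail: string;
  amount: string;
}): Promise<string> {
  return withPiiScope(async () => {
    const prompt =
      `Draft a payment reminder for ${maskValue(invoice.clientName)} ` +
      `(${maskValue(invoice.clientEmail)}), amount due ${maskValue(invoice.amount)}.`;
    const reply = await fakeModel(prompt);
    return restoreText(reply);
  });
}
```

Two things this sketch makes visible: if the model rewrites or truncates a token, the restore pass leaves it untouched, so the output is worth checking for leftover __PII_n__ strings. And masking amounts means the model cannot do arithmetic on them, which may or may not matter depending on the prompt.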

Not sure if this is the cleanest approach or if I'm overengineering it.

Are you masking at the prompt level, running a local proxy, or just trusting your LLM provider's data policies? Genuinely curious what others have landed on.

reddit.com
u/Intrepid_Frosting_66 — 4 days ago