▲ 31 r/datascience
How does your team handle the security issues of coding agents on real data?
Been thinking about this a lot lately. We use coding agents daily on real datasets.
Two things I read recently that made me uncomfortable:
- Prompt injection: basically, the agent reads a website or a file from the Internet, and if it contains hidden instructions the agent may just execute them and exfiltrate data to an external server?
- Slopsquatting: LLMs hallucinate package names that don't exist. Attackers pre-register the most-hallucinated names on PyPI with malware.
These are just the few I can think of, but it makes me wonder how other teams manage this. Do you believe these are real risks, or just some security researchers' fantasy?
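To make the slopsquatting case concrete, here is a rough sketch of one possible guard: check every package an agent asks to install against a team-vetted allowlist before anything reaches `pip`. The `APPROVED` set and the `unapproved` helper are hypothetical names for illustration, not something from a real tool, and the name parsing is deliberately simplistic.

```python
import re

# Hypothetical allowlist of packages the team has already vetted (example names only).
APPROVED = {"pandas", "numpy", "scikit-learn"}


def normalize(name: str) -> str:
    """Normalize a package name as in PEP 503: lowercase, runs of -, _, . become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unapproved(requirements: list[str]) -> list[str]:
    """Return the requested package names that are not on the allowlist."""
    approved = {normalize(p) for p in APPROVED}
    flagged = []
    for line in requirements:
        # Drop comments and surrounding whitespace; skip empty lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Keep only the name before any version specifier, extras, or marker.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        if normalize(name) not in approved:
            flagged.append(name)
    return flagged


# Example: requirements an agent proposed; the last name is made up.
proposed = ["pandas==2.2.0", "Scikit_Learn", "pandas-profiler-utils>=1.0"]
print(unapproved(proposed))
```

Anything flagged would go to a human for review instead of being installed automatically. This only covers the package-name side; it does nothing about prompt injection.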
u/SummerElectrical3642 — 3 days ago