u/Affectionate_Ear2151

Can users or developers access or delete prompt caches in hosted AI platforms?

Hi everyone!

I’m researching privacy risks in multimodal conversational AI systems, and I’m especially interested in prompt caching.

From what I understand so far, prompt caching usually happens on the provider’s server, using cached token/KV representations rather than a normal client-side cache.

My main question is: do any current hosted AI platforms allow users or developers to directly access, modify, delete, or control the internal prompt cache?

I know some APIs provide limited cache-related controls, but from what I understand, these features mostly let developers influence caching behaviour, set TTLs, or view token counts. They do not seem to allow access to the actual cached content or KV cache itself.

I’m mainly asking from a privacy point of view. If sensitive data is sent to an AI model and becomes part of a server-side cache, can it be removed or controlled directly? Or is the only realistic solution to detect and remove sensitive data before sending it to the model?
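To make that second option concrete, here is a minimal sketch of what I mean by removing sensitive data on the client before anything reaches the provider (and therefore before it could enter any server-side cache). The pattern names, the `redact` helper, and the example prompt are all hypothetical and only for illustration; real PII detection would need much more than two regexes.

```python
import re

# Hypothetical patterns for illustration only. Regexes like these miss
# names, addresses, IDs, and anything embedded in images or audio.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each pattern match with a placeholder and count the replacements."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


prompt = "Contact Jane at jane.doe@example.com or +1 415 555 0100 about the scan."
safe_prompt, counts = redact(prompt)
print(safe_prompt)  # only this redacted string would be sent to the model
print(counts)
```

Note that the name "Jane" passes through untouched in this sketch, which is exactly the kind of gap that makes me wonder whether client-side filtering alone is enough.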

Any help or sources would be really appreciated.

reddit.com
u/Affectionate_Ear2151 — 3 days ago
