u/EfficientStation8970 — reddlx

▲ 1 r/OpenSourceAI

[ Removed by Reddit ]

[ Removed by Reddit on account of violating the content policy. ]


u/EfficientStation8970 — 19 hours ago

▲ 4 r/u_No_Reason2341+3 crossposts

Just one minute.

Hi to everyone willing to stay and read my post.

First, I'm not an AI existential risk advocate.

On the contrary, I think AI agents are remarkable for their productive efficiency, high capability in coding, and more.

But power always comes with danger.

Their overconfidence and blind pursuit of a goal frequently lead to unintended consequences such as file deletion and data loss.

I am currently working hard on solving this problem.

Here are the details of my project:

- Inject my executable into CLI agent applications (Opencode, Claude Code, etc.)

- Intercept every command execution and audit its safety through keyword filtering

- A sudo-like login system that must authorize dangerous commands before agents can run them

- Encrypted data, so agents cannot tamper with it

- And more.
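To make the keyword-filtering and sudo-like ideas above concrete, here is a rough sketch in Python. It is not taken from the Hlin repository: the keyword list, the function names `audit_command` and `is_allowed`, and the `elevated` flag are all made up for illustration. It also shows the limits of plain keyword matching. A harmless `echo rm` gets flagged, while `sh -c "rm -rf x"` slips through because the inner command is a single token.

```python
import shlex

# Hypothetical keyword list; the post does not say which keywords are filtered.
DANGEROUS_KEYWORDS = {"rm", "mkfs", "dd", "shred", "chmod", "chown"}


def audit_command(command: str) -> list[str]:
    """Return the dangerous keywords found in a shell command string."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: treat unparseable input as suspicious.
        return ["<unparseable>"]
    found = []
    for token in tokens:
        # Compare the basename of every token, so "/bin/rm" matches "rm".
        name = token.rsplit("/", 1)[-1]
        if name in DANGEROUS_KEYWORDS and name not in found:
            found.append(name)
    return found


def is_allowed(command: str, elevated: bool = False) -> bool:
    """Allow a command if the audit is clean, or if the session was elevated.

    `elevated` stands in for the sudo-like login step (an assumption here):
    it would be set only after a human has authenticated.
    """
    return elevated or not audit_command(command)
```

With this sketch, `is_allowed("ls -la")` passes, `is_allowed("rm -rf build")` is blocked, and the same command passes only when called with `elevated=True`.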

Thank you guys for reading until the end.

Here is my link, and you guys can check it out if you're interested.

(The README is currently empty: the official version differs from the demo version in many ways, and I want to rewrite it once the implementation is done.)

https://github.com/rosettastone0501-cpu/Hlin

u/EfficientStation8970 — 19 hours ago

▲ 4 r/u_EfficientStation8970+3 crossposts

My recent thoughts on AI safety

I think agents are very hard to manage.

Even though agent applications try to constrain dangerous agent behavior (deleting files, leaking data, etc.), we still hear plenty of news about such incidents online.

I'm currently working hard to solve this problem by constraining command execution permissions, so that dangerous actions are stopped before agents carry them out.

I'm not good at writing articles, so that's all.

I'll put the details about my project in the comments if you guys are interested.

Thanks for reading.


u/EfficientStation8970 — 22 hours ago