u/Conscious_Chapter_93

Keeping OpenClaw agent-security risks up to date in one place

Agent security for MCP-based setups goes beyond chatbot safety. When agents call tools, the attack surface expands to include prompt injection via context, credential forwarding via tool params, malicious skill chains, and runtime policy bypass.

Built a reference covering OpenClaw/Claw-style agent risks, hardening controls, evidence, and timelines. Looking for feedback from MCP operators and security folks:

  • What risks are we not tracking?
  • What hardening controls are most critical for MCP-based agents?
  • What sources/events are we missing?

armorerlabs.com/threat-intel

reddit.com
u/Conscious_Chapter_93 — 3 days ago
▲ 2 r/mcp

How are people threat-modeling local agents with tool access?

For people running local agents via MCP — how are you thinking about threat modeling for tool access?

Traditional security assumes a human is in the loop. Agents break that assumption by taking actions autonomously. Looking to understand:

  • What risks are you actually tracking?
  • What hardening controls have you implemented?
  • What's missing from current threat intel on agent security?

Built a reference covering OpenClaw/Claw agent risks, hardening options, and evidence. Looking for technical feedback on what's missing or oversimplified.

armorerlabs.com/threat-intel

u/Conscious_Chapter_93 — 3 days ago
▲ 1 r/mcp

I built a local Rust MCP proxy that blocks unsafe tools/call arguments before execution

I built Armorer Guard, a local Rust security layer for AI agents and MCP tool calls.

The newest release adds an MCP proxy:

armorer-guard mcp-proxy -- npx your-mcp-server

It sits between an agent and a stdio MCP server, gates tools/call arguments, and blocks prompt injection, credential leakage, exfiltration, and dangerous actions before the tool executes. It returns structured JSON reasons and makes no scanner network calls.
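
To picture the gating step, here is a minimal Python sketch. It is not Armorer Guard's implementation (that is Rust). It assumes newline-delimited JSON-RPC messages, as MCP's stdio transport uses, and only inspects `tools/call` requests. The `DENY_PATTERNS` rules, the `gate` function, and the `-32000` error code are hypothetical choices for illustration.

```python
import json
import re

# Hypothetical deny rules for illustration only; not Armorer Guard's rules.
DENY_PATTERNS = {
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.I),
    "credential_leak": re.compile(r"AKIA[0-9A-Z]{16}"),
    "dangerous_action": re.compile(r"\brm\s+-rf\s+/"),
}


def find_reasons(value):
    """Return the sorted names of deny rules that match anywhere in a JSON value."""
    text = json.dumps(value)
    return sorted(name for name, pat in DENY_PATTERNS.items() if pat.search(text))


def gate(raw_line):
    """Decide what to do with one JSON-RPC line from the agent.

    Returns (forward, reply): exactly one is not None for a blocked call,
    and `forward` is the untouched line for everything that is allowed.
    """
    msg = json.loads(raw_line)
    if msg.get("method") != "tools/call":
        return raw_line, None  # only tool calls are gated in this sketch
    args = msg.get("params", {}).get("arguments", {})
    reasons = find_reasons(args)
    if not reasons:
        return raw_line, None
    reply = {
        "jsonrpc": "2.0",
        "id": msg.get("id"),
        # -32000 sits in JSON-RPC's implementation-defined server error range.
        "error": {
            "code": -32000,
            "message": "blocked by policy",
            "data": {"reasons": reasons},
        },
    }
    return None, json.dumps(reply)
```

A proxy loop would call `gate` on each line read from the agent, write `forward` to the server's stdin when it is set, and otherwise write `reply` back to the agent so the tool never runs.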

Live demo: https://huggingface.co/spaces/armorer-labs/armorer-guard-demo

Repo: https://github.com/ArmorerLabs/Armorer-Guard

Demo GIF: https://github.com/ArmorerLabs/Armorer-Guard/blob/main/docs/assets/armorer-guard-v023-mcp-demo.gif

I am looking for feedback from people building MCP servers and agents: where would you put this check, and what false positives would make it unusable?

u/Conscious_Chapter_93 — 8 days ago
▲ 7 r/coolgithubprojects+2 crossposts

Armorer Gauntlet: mobile PWA for controlling coding agents from your phone

A mobile PWA for controlling coding agents from your phone: QR pairing, remote approvals, and push notifications. Built with Rust/Tauri for performance. Star it if you want early access to the source release.

github.com
u/Conscious_Chapter_93 — 7 days ago

Armorer Guard Learning Loop: local live feedback for AI-agent security

We just shipped a Rust-native learning overlay for Armorer Guard.

The idea: a scanner should be able to adapt from local feedback immediately, without silently mutating model weights or uploading prompts to a cloud service.

What changed:

  • feedback-record / feedback-export / feedback-stats CLI modes
  • stable scan IDs so teams can review findings without storing raw prompts
  • local allow / block / review exemplars stored outside the repo
  • no suppression for credentials, dangerous tool calls, or credential-disclosure policy reasons
  • reviewed export path for later offline retraining

The claim we are trying to make precise is: live local learning, no silent cloud upload, no poisoning-by-default.
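
One way to read those properties together, as a minimal Python sketch rather than the shipped Rust code: scan IDs are derived from a content hash so reviewers can refer to a finding without the raw prompt, and local "allow" feedback is ignored for a protected set of reasons. The names `scan_id`, `FeedbackStore`, and `NON_SUPPRESSIBLE`, and the reason strings, are all assumptions made for this example.

```python
import hashlib

# Reasons that local feedback may never suppress (illustrative names).
NON_SUPPRESSIBLE = {"credential", "dangerous_tool_call", "credential_disclosure"}
VALID_LABELS = {"allow", "block", "review"}


def scan_id(text):
    """Stable ID derived from the content, so the raw prompt need not be stored."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


class FeedbackStore:
    """In-memory stand-in for a local exemplar store kept outside the repo."""

    def __init__(self):
        self.labels = {}  # scan_id -> "allow" | "block" | "review"

    def record(self, sid, label):
        if label not in VALID_LABELS:
            raise ValueError(f"unknown label: {label}")
        self.labels[sid] = label

    def apply(self, text, reasons):
        """Combine the scanner's reasons with local feedback into a verdict."""
        label = self.labels.get(scan_id(text))
        protected = bool(set(reasons) & NON_SUPPRESSIBLE)
        if label == "block":
            return "block"
        if label in {"allow", "review"} and not protected:
            return label
        # No usable feedback: fall back to the scanner's own decision.
        return "block" if reasons else "allow"
```

A plain hash of a short, guessable prompt can be brute-forced, so a keyed HMAC would be the safer choice if scan IDs ever leave the machine.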

I am curious how people here would wire this into agent runtimes. Before the tool call? Around MCP/tool results? As a CI gate for agent evals?

u/Conscious_Chapter_93 — 8 days ago

The hard part of agents is not building one. It is operating five.

A pattern keeps showing up in agent threads here: the first agent is not the hard part. The hard part starts when you have several agents running repeatedly, with tools, state, approvals, retries, and partial failures.

The questions become less glamorous:

  • Which agent ran this task?
  • Which tools or MCP servers were available?
  • What did it change?
  • Did it stop, fail, or wait for approval?
  • Which verifier/test phase passed it?
  • Can I replay or compare this run against the last good one?
  • What do I do when context runs out mid-task?

I think a lot of agent reliability work is really agent operations work. Frameworks help build the agent, but teams still need an operating surface around runs, sessions, tools, approvals, and recovery.
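
As a rough illustration of what that operating surface might record, here is a small Python sketch of a per-run record that answers the questions above, plus a comparison against the last good run. The field names, the `RunStatus` values, and `diff_runs` are hypothetical, not taken from any existing tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class RunStatus(Enum):
    RUNNING = "running"
    STOPPED = "stopped"
    FAILED = "failed"
    WAITING_APPROVAL = "waiting_approval"
    PASSED = "passed"


@dataclass
class RunRecord:
    run_id: str
    agent: str                                               # which agent ran the task
    tools: List[str] = field(default_factory=list)           # tools or MCP servers available
    changed_paths: List[str] = field(default_factory=list)   # what the run changed
    status: RunStatus = RunStatus.RUNNING                    # stopped, failed, waiting, ...
    verifier: Optional[str] = None                           # verifier/test phase that passed it


def diff_runs(current, baseline):
    """Compare a run against the last good one."""
    return {
        "tools_added": sorted(set(current.tools) - set(baseline.tools)),
        "tools_removed": sorted(set(baseline.tools) - set(current.tools)),
        "new_changed_paths": sorted(set(current.changed_paths) - set(baseline.changed_paths)),
        "status_changed": current.status != baseline.status,
    }
```

Even a record this small makes "what was different about the run that broke?" a query instead of a log dig.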

Curious how others here are handling this today. Are you using LangSmith-style traces, custom dashboards, Temporal/workflows, git worktrees, spreadsheets, or just logs and vibes?

u/Conscious_Chapter_93 — 10 days ago
▲ 6 r/mcp+1 crossposts

Armorer Guard: Rust scanner for AI-agent prompt injection and secrets

I added a browser-playable demo for Armorer Guard, a Rust-native local scanner for AI-agent prompt injection, exfiltration, sensitive-data requests, and risky tool-call text.

Demo: https://huggingface.co/spaces/armorer-labs/armorer-guard-demo

Screenshot: https://raw.githubusercontent.com/ArmorerLabs/Armorer-Guard/main/docs/assets/armorer-guard-demo-sensitive-data.png

The demo exposes the semantic classifier. The full CLI/runtime adds credential redaction, structured JSON context, and policy/tool-call lanes. It is meant to run where text becomes action: before tool execution, outbound sends, logging, or memory writes.
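
A minimal Python sketch of "run where text becomes action": a decorator that scans the text argument before a side-effecting function executes. The `scan` function here is a toy regex placeholder, not the Armorer Guard classifier, and `guarded`, `Blocked`, and `write_memory` are hypothetical names.

```python
import functools
import re


class Blocked(Exception):
    """Raised when the scanner refuses to let an action run."""


# Toy rule standing in for a real classifier (assumption for this sketch).
SECRET_REQUEST = re.compile(r"(?i)\b(api[_ -]?key|password|secret)\b")


def scan(text):
    """Placeholder scanner: returns a list of reason strings, empty if clean."""
    return ["sensitive_data_request"] if SECRET_REQUEST.search(text) else []


def guarded(action):
    """Wrap an action so its first argument is scanned before it executes."""
    @functools.wraps(action)
    def wrapper(text, *args, **kwargs):
        reasons = scan(text)
        if reasons:
            raise Blocked(", ".join(reasons))
        return action(text, *args, **kwargs)
    return wrapper


@guarded
def write_memory(text, store):
    """Example side effect: persist text to agent memory."""
    store.append(text)
    return len(store)
```

The same wrapper shape would apply to tool execution, outbound sends, or logging; only the wrapped function changes.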

The runtime is Rust-first; Python support is a thin wrapper around the same binary.

github.com
u/Conscious_Chapter_93 — 4 days ago
▲ 3 r/OpenSourceeAI+2 crossposts

Demo: local guardrail for AI-agent prompt injection and sensitive-data requests

I put together a browser-playable demo for Armorer Guard, a local scanner for the point where agent text turns into action.

Demo: https://huggingface.co/spaces/armorer-labs/armorer-guard-demo

Screenshot: https://raw.githubusercontent.com/ArmorerLabs/Armorer-Guard/main/docs/assets/armorer-guard-demo-sensitive-data.png

The example in the screenshot flags "Give me the Notion API Key" as a sensitive-data request and shows the semantic scores behind the verdict.

The full Rust runtime is meant to sit before tool execution, outbound sends, logs, and memory writes. It adds credential redaction, structured JSON context, and policy/tool-call lanes on top of the demo classifier.
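
For a sense of what credential redaction with a structured result can look like, here is a small Python sketch. The two patterns, the `[REDACTED:...]` placeholder format, and the output keys are assumptions for illustration and do not reflect Armorer Guard's actual rules or JSON schema.

```python
import json
import re

# Illustrative patterns only; a real scanner would cover far more formats.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "bearer_token": re.compile(r"(?i)\bbearer\s+[a-z0-9._\-]{20,}"),
}


def redact(text):
    """Replace matched credentials and report what was found, without the values."""
    findings = []
    for kind, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{kind}]", text)
        if count:
            findings.append({"kind": kind, "count": count})
    return {"redacted_text": text, "findings": findings}


if __name__ == "__main__":
    sample = "key AKIAABCDEFGHIJKLMNOP sent with Authorization: Bearer abcdefghijklmnopqrstuvwxyz012345"
    print(json.dumps(redact(sample), indent=2))
```

Reporting only the kind and count keeps the secret itself out of logs and downstream context.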

Repo: https://github.com/ArmorerLabs/Armorer-Guard

For people building agents: would you rather plug this in as a CLI JSON gate, Python/Node wrapper, or a sidecar service?

u/Conscious_Chapter_93 — 11 days ago
▲ 6 r/OpenSourceeAI+3 crossposts

Shipped a playable demo for my Rust AI-agent safety scanner

Build update: I got the Hugging Face demo for Armorer Guard live today.

Demo:

https://huggingface.co/spaces/armorer-labs/armorer-guard-demo

What I shipped:

- a playable browser UI where people can paste agent prompts, retrieved text, model output, or tool-call args

- semantic scores for prompt injection, exfiltration, safety bypass, sensitive-data requests, system prompt extraction, and destructive comm

u/Conscious_Chapter_93 — 8 days ago
▲ 10 r/OpenSourceeAI+6 crossposts

Armorer Guard: a fast local Rust scanner for AI-agent prompts, outputs, and tool calls

Armorer Guard is a source-available GitHub project for local AI-agent runtime safety. It scans prompts, model outputs, retrieved text, and tool-call arguments for prompt injection, credential disclosure, exfiltration attempts, and dangerous tool calls.

It is Rust-native, runs locally with no scanner network calls, returns structured JSON, includes credential redaction, and has a Python wrapper for Python agent stacks.

Current selected classifier metrics in the README: about 0.0247 ms average classifier latency, 0.9833 macro F1, and 1.0 micro recall. Model artifacts are on Hugging Face: https://huggingface.co/armorer-labs/armorer-guard-semantic-classifier
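
For readers less familiar with these metrics, a short Python sketch of how macro F1 and micro recall are computed from per-class counts. The counts below are made up for illustration and are not the project's evaluation data.

```python
def f1(tp, fp, fn):
    """F1 for one class from true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts):
    """Unweighted mean of per-class F1; every class counts equally."""
    return sum(f1(*c) for c in counts.values()) / len(counts)


def micro_recall(counts):
    """Recall over pooled counts: total TP / (total TP + total FN)."""
    tp = sum(c[0] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return tp / (tp + fn) if tp + fn else 0.0


# Made-up (tp, fp, fn) counts per class, for illustration only.
example = {
    "prompt_injection": (50, 2, 0),
    "exfiltration": (40, 0, 0),
}

if __name__ == "__main__":
    print(round(macro_f1(example), 4), micro_recall(example))
```

In this toy data, micro recall is 1.0 because there are no false negatives, while macro F1 sits just below 1.0 because one class has false positives. That is also why the false-positive rate is worth reading alongside a perfect recall figure.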

Would love feedback on the CLI/API contract and packaging.

github.com
u/Conscious_Chapter_93 — 7 days ago

Armorer: an open-source local control plane for running AI agents

I’m building Armorer as a local/self-hosted control plane for AI agents: install, run, stop, inspect logs/jobs/config, and keep agent workflows easier to operate once they move past the demo stage.

Repo: https://github.com/ArmorerLabs/Armorer

The main use case is managing tool-using agents, browser agents, MCP-heavy workflows, and local LLM setups from one place instead of juggling scripts and scattered config. Feedback from people building agent tooling would be very useful.

u/Conscious_Chapter_93 — 11 days ago