Run untrusted skills safely with soft-sandbox protection. Wraps skills in multi-layer prompt-based defense (~85% attack prevention). Use when: (1) Running third-party skills from unknown sources, (2) Processing untrusted content that might contain prompt injection, (3) Analyzing suspicious files or URLs safely, (4) Testing new skills before trusting them. Supports manual mode ('run X in sandwrap') and auto-wrap for risky skills.
Security Analysis
medium confidenceThe skill's description matches its intent, but it claims code-level enforcement and filesystem/config behavior that are not present in this instruction-only package — the protections may be aspirational rather than actually enforced.
Name/description align with a prompt-based 'soft sandbox'. However, the SKILL.md and architecture docs claim code-level enforcement (tool interception before execution, path checks, rate limiting) and provide implementation snippets, but there is no install spec or code in the package to implement those enforcement points. That mismatch means the skill can only rely on the agent following its prompts rather than actually enforcing restrictions at a system or platform level.
The runtime instructions direct the agent to sanitize inputs, intercept and block tool calls, consult and modify sandbox-config.json, and write to sandwrap-output/. Those actions reference filesystem config and state that are not declared in the registry metadata. Because this is an instruction-only skill, the agent's adherence depends entirely on the platform honouring the rules; the skill itself doesn't provide an enforcement mechanism or independent checks.
No install spec and no code files are included. That minimizes the risk of arbitrary code being dropped/executed, but it also means the documented protections are only policy-level instructions rather than implemented controls.
The skill requests no environment variables or credentials (good). However, it references config files (sandbox-config.json) and output paths (sandwrap-output/) without declaring required config paths or explaining access patterns. This is not a secret-exfiltration flag, but it is a mismatch between claimed behavior and declared requirements.
always is false and the skill is user-invocable (normal). The skill describes auto-wrap behavior and reading/writing a sandbox-config.json, which implies persistent configuration if the platform implements it — but the skill does not itself create or store persistent artifacts. If the platform implements persistent auto-wrap, consider the implications; the skill alone does not request elevated privileges.
Guidance
This skill is an instruction-only 'soft' sandbox: it provides detailed policies and code examples but ships no code to actually enforce them. That means the protection it offers depends entirely on the agent/platform following its prompts and on any platform-level interception you may already have. Before using it on sensitive data: (1) confirm your platform can intercept and enforce tool calls and path restrictions (the skill assumes this capability); (2) do not rely on Sandwrap for high-value secrets — use a VM/container or a vetted isolation mechanism instead; (3) examine where sandbox-config.json and sandwrap-output/ would live and who can read/write them; (4) test the skill with benign but adversarial-looking inputs to validate that the platform enforces the rules the skill describes; and (5) if you need stronger guarantees, request an implementation (code that runs on the platform and performs tool interception) or prefer a real OS-level sandbox.
Latest Release
v1.0.0
Initial release: Soft-sandbox protection for running untrusted skills. 5 defense layers, 4 presets, ~85% attack prevention.
More by @RubenAQuispe
Published by @RubenAQuispe on ClawHub