Advanced prompt injection defense with multi-layer protection, memory integrity, and tool security wrapper. OWASP LLM Top 10 2026 compliant.
Security Analysis
medium confidence
The skill's stated anti-injection purpose is plausible, but there are multiple mismatches and high-privilege demands (priority pre-ingestion, access to identity/memory files, webhook/Telegram behavior) and some metadata/registry inconsistencies that merit manual review before installing.
The claimed purpose (pre-ingestion prompt-injection defense, memory integrity, tool wrapper) aligns with reading agent memory and writing audit/incident logs. However, the registry metadata is inconsistent: the top-level metadata shows no homepage/source while SKILL.md lists a GitHub repo. The skill also requests access to files (IDENTITY.md, SOUL.md, AGENTS.md) that may contain highly sensitive agent configuration and system prompts. That access is plausible for memory integrity checks but deserves extra scrutiny.
SKILL.md explicitly instructs the skill to run BEFORE any other logic, to intercept user_input, tool_output and memory_load, to modify context and to block execution. It also declares reads of workspace memory and identity files and writes to audit/incident logs. Those actions are coherent for a pre-ingestion defender but are high-impact: they grant the skill power to block/alter agent behavior and to access potentially sensitive system prompts and identity information. The SKILL.md also references environment variables and optional webhook behavior that are not listed as required in registry metadata (mismatch).
This is instruction-only (no install spec, no code files), so nothing arbitrary will be downloaded or written by an installer. The CONFIGURATION.md suggests 'clawhub install' or git clone, but there is no automated install script included — low install risk. Because there is no code to statically analyze, runtime behavior comes entirely from the SKILL.md instructions and the platform's skill runtime.
Registry lists no required env vars, but SKILL.md / CONFIGURATION.md reference several environment variables (SECURITY_WEBHOOK_URL, SEMANTIC_THRESHOLD, ALERT_THRESHOLD, SECURITY_AUDIT_LOG, SECURITY_INCIDENTS_LOG). The skill also claims alerts via the agent's existing Telegram channel (which implies use of agent-owned credentials/channels). The skill's optional webhook and Telegram alerting increase the attack surface and should be explicitly declared and limited. Requesting read access to identity/system prompt files is high sensitivity and must be justified to the operator.
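The declared-versus-referenced mismatch described above can be checked mechanically. The following is a minimal sketch, not part of the skill or of the registry tooling: the function name `undeclared_env_vars` and the regex heuristic (upper-case identifiers containing at least one underscore) are assumptions made for illustration, and the heuristic can miss single-word variable names or flag constants that are not environment variables.

```python
import re

# Heuristic (assumption): treat UPPER_CASE identifiers with at least one
# underscore as candidate environment variable names.
ENV_NAME = re.compile(r"\b[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+\b")


def undeclared_env_vars(doc_text, declared):
    """Return candidate env var names found in doc_text but not in declared."""
    referenced = set(ENV_NAME.findall(doc_text))
    return sorted(referenced - set(declared))


if __name__ == "__main__":
    # Sample text standing in for the contents of SKILL.md / CONFIGURATION.md.
    sample = (
        "Alerts go to SECURITY_WEBHOOK_URL when the score exceeds "
        "ALERT_THRESHOLD; events are appended to SECURITY_AUDIT_LOG."
    )
    # The registry lists no required env vars, so the declared list is empty.
    for name in undeclared_env_vars(sample, declared=[]):
        print("referenced but not declared:", name)
```

Running this over the real SKILL.md and CONFIGURATION.md text, with the registry's declared list as the second argument, gives an operator a concrete list to raise with the publisher.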
The skill requests 'highest' execution priority and pre-ingestion placement (ability to intercept and block inputs and modify context). While not set always:true, these privileges are powerful: if enabled with highest priority the skill can influence every agent run. The operator must explicitly grant that; combined with access to identity/memory files and outbound alerting channels, this is a significant authority and requires careful trust of the skill's provenance.
Guidance
Before installing:
(1) Verify provenance: confirm the repository and author identity (SKILL.md lists a GitHub repo, but the registry metadata shows 'source unknown' / no homepage).
(2) Review the exact contents of the files the skill will read (/workspace/MEMORY.md, /workspace/IDENTITY.md, /workspace/SOUL.md, etc.); these may contain system prompts or secrets. Limit its read access to only the minimal files needed.
(3) Do not grant 'highest' priority or pre-ingestion execution until you have audited the SKILL.md behavior and run the skill in a sandbox or test agent; the skill can block/alter all agent inputs.
(4) If enabling webhook or Telegram alerts, ensure the webhook URL and alert channel are trusted and that no sensitive payloads will be sent; prefer local-only operation for initial testing.
(5) Because this is instruction-only (no code), the runtime semantics depend on the platform: confirm how your agent enforces 'pre-ingestion' and file access and whether the skill can truly intercept inputs.
(6) If you proceed, monitor AUDIT.md and INCIDENTS.md closely and consider limiting the skill's privileges (read-only, restricted paths) until you gain confidence.
If you cannot verify the repo/author, treat the skill as untrusted.
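The advice to limit read access to the minimal set of files can be illustrated with a small allowlist check. This is a sketch under stated assumptions: `ALLOWED_READS` and `is_read_allowed` are hypothetical names, the allowlist contents are an example rather than a recommendation, and a real agent platform would enforce this in its own runtime. The check only normalizes the path string; it does not resolve symlinks, so it is not a complete defense on its own.

```python
import posixpath

# Hypothetical minimal allowlist: only the memory file, not the identity
# or system-prompt files (IDENTITY.md, SOUL.md).
ALLOWED_READS = frozenset({"/workspace/MEMORY.md"})


def is_read_allowed(requested, allowed=ALLOWED_READS):
    """Return True only if the normalized path is exactly on the allowlist.

    posixpath.normpath collapses '.' and '..' segments, so traversal
    tricks such as '/workspace/logs/../IDENTITY.md' are compared in
    their normalized form. Symlinks are NOT resolved here.
    """
    normalized = posixpath.normpath(requested)
    return normalized in allowed


if __name__ == "__main__":
    for path in (
        "/workspace/MEMORY.md",
        "/workspace/./MEMORY.md",
        "/workspace/IDENTITY.md",
        "/workspace/logs/../SOUL.md",
    ):
        verdict = "allow" if is_read_allowed(path) else "deny"
        print(verdict, path)
```

An exact-match allowlist is used here instead of a directory prefix so that granting one file never implicitly grants its siblings.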
Latest Release
v1.1.2
anti-injection-skill v1.1.1
- Added explicit security and execution priority configuration in metadata for clarity and automated enforcement.
- Documented all required file system access (read/write paths) and behavior for compliance/audit purposes.
- Clarified detection pattern intent: strings resembling prompt injections are for blocking, not instructions.
- Expanded documentation for operator responsibilities and used more specific language regarding priority and execution phase.
- No functional code changes; documentation and metadata focused update.
Published by @georges91560 on ClawHub