Turn your AI into JARVIS. Voice, wit, and personality — the complete package. Humor cranked to maximum.
Security Analysis
Medium confidence. The skill mostly matches an offline TTS/JARVIS persona, but its runtime instructions include persistent background shell execution, reading session/memory files, and a brittle pattern for injecting text into a shell command. These introduce privacy and command-injection risks, and there are minor metadata/install inconsistencies.
Name/description align with required binaries (ffmpeg, aplay) and the SHERPA_ONNX_TTS_DIR env var — these are reasonable for an offline sherpa-onnx TTS voice skill. However, the SKILL.md includes an install entry (download of a Sherpa model from GitHub) even though the registry metadata said 'No install spec'; that's an inconsistency to clarify.
The instructions require the agent to run a local shell command via exec('jarvis "..."', background=true) before rendering text for every voice-enabled reply. Templates also direct the agent to read workspace memory files (memory/YYYY-MM-DD.md) and call tools like sessions_list. Two issues: (1) constructing shell commands from spoken text is brittle and poses command-injection risk unless the jarvis wrapper safely sanitizes/escapes input; the skill relies on content rules (e.g., 'NO quotation marks') which are error-prone. (2) Reading user memory/session files is a privacy surface — it's plausible for personalization but should be explicit and justified. The mandate to ALWAYS speak every response (even for data/code) increases the chance of unintended behavior.
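The injection risk in issue (1) can be illustrated with a short Python sketch. It assumes the wrapper is an executable named jarvis on PATH that accepts the spoken text as its first argument, as the exec('jarvis "..."') pattern suggests. The helper names build_argv, speak_in_background, and unsafe_shell_string are hypothetical and are not part of the skill. The idea is to hand the text over as a single argument-vector entry so that no shell ever parses it.

```python
from __future__ import annotations

import shlex
import subprocess

JARVIS_BIN = "jarvis"  # assumption: the wrapper is on PATH under this name


def unsafe_shell_string(text: str) -> str:
    """The brittle pattern: text is spliced into a string a shell will parse."""
    return f'jarvis "{text}"'


def build_argv(text: str) -> list[str]:
    """Safer pattern: the text is exactly one argv entry, whatever it contains."""
    return [JARVIS_BIN, text]


def speak_in_background(text: str) -> subprocess.Popen:
    """Start the wrapper without a shell and without waiting for it to finish."""
    # Passing a list with the default shell=False means quotes, semicolons,
    # and backticks in `text` are ordinary characters, not shell syntax.
    return subprocess.Popen(
        build_argv(text),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


if __name__ == "__main__":
    hostile = 'hello"; touch /tmp/pwned; echo "'
    # A shell-style parse of the spliced string yields several words.
    print(shlex.split(unsafe_shell_string(hostile)))
    # The argv form keeps the whole text as one argument.
    print(build_argv(hostile))
```

If a shell string cannot be avoided, shlex.quote from the standard library is the usual fallback for POSIX shells. A content rule such as 'NO quotation marks' does not give the same guarantee.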
The SKILL.md metadata contains a download/install entry that fetches a model tarball from a GitHub Releases URL (a well-known host) and extracts it to a models directory. The download host is reasonable, but registry-level metadata claimed 'No install spec' (the skill is instruction-only) — this mismatch should be reconciled. If the agent or user runs the download, extracted archives will be written to disk (extract=true).
The skill asks only for SHERPA_ONNX_TTS_DIR and the system binaries it needs, which is proportionate for TTS. However, the runtime instructions also direct reading of local memory/session files (not declared as required config paths) and call other local tools. That implicit access to user workspace data is higher-scope than the declared env requirement and should be documented and consented to by the user.
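One way to make that workspace access explicit is a consent flag combined with a path allowlist. The Python sketch below is an illustration under stated assumptions, not the skill's actual behavior: it assumes memory files live in a memory/ directory under the workspace root and are named YYYY-MM-DD.md, as the templates describe. The function name is_allowed_memory_file and the consented parameter are hypothetical.

```python
from __future__ import annotations

import re
from pathlib import Path

# Assumption: daily memory files are named like 2024-01-05.md
DATE_FILE = re.compile(r"\d{4}-\d{2}-\d{2}\.md")


def is_allowed_memory_file(workspace: Path, candidate: str, consented: bool) -> bool:
    """Allow a read only with consent, and only for memory/YYYY-MM-DD.md files."""
    if not consented:
        return False
    memory_dir = (workspace / "memory").resolve()
    # resolve() collapses ".." segments and follows symlinks, so traversal
    # attempts and links pointing outside memory/ end up with another parent.
    target = (workspace / candidate).resolve()
    return target.parent == memory_dir and DATE_FILE.fullmatch(target.name) is not None
```

A caller would run this check before every file read and refuse anything that returns False, which keeps the readable surface limited to what the user agreed to.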
The skill does not request always:true, does not require credentials, and does not attempt to modify other skills or system-wide settings in the provided materials. It does, however, push a behavioral pattern (speak on every reply) through templates — intrusive but not a platform-privilege escalation.
Guidance
Before installing or enabling this skill, consider the following:
(1) Review the full jarvis script that will be invoked and ensure it correctly sanitizes/escapes user-supplied text to avoid shell injection. The skill's recommended usage embeds untrusted text in a shell command, which is dangerous if not implemented safely.
(2) Decide whether you want the agent to automatically read local memory/session files. This exposes personal data; require explicit consent and limit which files are read.
(3) Reconcile the metadata inconsistency: SKILL.md contains an install/download step (GitHub release) although the registry lists no install spec. Confirm who/what will perform downloads and where.
(4) If you install the sherpa-onnx model, verify the downloaded archive and its integrity (checksums, trusted release).
(5) If you want stronger safety, restrict when the skill can auto-run (don't allow background exec for arbitrary user-supplied strings) or require manual approval for first-time execution.
If you cannot review the jarvis script and model files yourself, treat this skill as risky and avoid enabling it.
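The integrity check in item (4) can be done with a SHA-256 comparison. The Python sketch below is a minimal illustration: the helper names sha256_of_file and archive_matches are hypothetical, and the expected digest must come from a source you trust. This page does not say whether the release publishes one.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large model archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches(path: str, expected_hex: str) -> bool:
    """Compare against a published digest, ignoring case and stray whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Run the comparison before extracting the archive, and stop if it returns False.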
Latest Release
v3.1.1
v3.1.1: Updated description — voice and humor are one package, like the original JARVIS. Added link to LIMBIC humor research paper.
Published by @globalcaos on ClawHub