Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High-quality transcription with multiple model sizes.
Security Analysis
High confidence: The skill appears to do what it claims, a local Whisper-based STT tool, with no unexplained credentials or network endpoints in the code; minor documentation mismatches and expected model-download behavior are noted.
Name, description, declared binary (ffmpeg), package dependencies (openai-whisper, torch) and the included Python transcription code all align with a local Whisper STT tool. There are no unrelated credentials or config paths requested.
SKILL.md stays within the STT task, showing venv creation and pip installation. Two small inconsistencies: the README examples call ~/.clawdbot/skills/local-whisper/scripts/local-whisper but the repository provides scripts/transcribe.py (no wrapper named local-whisper included), and the instructions use the 'uv' command (uv venv, uv pip) but 'uv' is not listed in required binaries. Also note: models are downloaded at runtime by whisper.load_model(), so an initial internet connection is required to fetch model weights.
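Because whisper.load_model() fetches weights on first use, it can be useful to check whether a model is already cached before running offline. A minimal stdlib-only sketch, assuming Whisper's default cache location of ~/.cache/whisper (or $XDG_CACHE_HOME/whisper):

```python
import os
from pathlib import Path

# Assumption: openai-whisper caches downloaded model weights under
# ~/.cache/whisper (or $XDG_CACHE_HOME/whisper) as *.pt files.
cache_root = Path(os.getenv("XDG_CACHE_HOME", Path.home() / ".cache")) / "whisper"

def cached_models(cache_dir: Path = cache_root) -> list[str]:
    """Return the names of model weight files already downloaded, if any."""
    if not cache_dir.is_dir():
        return []
    return sorted(p.name for p in cache_dir.glob("*.pt"))

print(cached_models() or "no models cached yet; first run will need internet access")
```

An empty result means the first transcription run will trigger a download; after that, the tool runs fully offline as described.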
No install spec in the registry (instruction-only), so nothing is forced onto disk by the registry. SKILL.md recommends pip installing openai-whisper and torch (torch download uses the official PyTorch index URL). This is a standard approach; the user will execute these installs locally in a venv.
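Since 'uv' is not declared as a required binary, the same install can be expressed with the standard venv module and pip. A sketch of the command that would be run inside an isolated venv; the PyTorch index URL here is an assumption based on SKILL.md's mention of the official index, so verify it against the actual file:

```python
from pathlib import Path

# Hypothetical helper: build the pip invocation that installs the skill's
# declared dependencies into an isolated virtual environment.
# The --extra-index-url value is an assumption; check SKILL.md for the real one.
def install_command(venv_dir: str) -> list[str]:
    pip = str(Path(venv_dir) / "bin" / "pip")
    return [pip, "install", "openai-whisper", "torch",
            "--extra-index-url", "https://download.pytorch.org/whl/cpu"]

# Typical usage (not executed here):
#   python3 -m venv .venv
#   then run the command returned by install_command(".venv")
```

Keeping the install inside a dedicated venv limits the blast radius of any supply-chain issue in the installed packages.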
The skill requests no environment variables or credentials. That is proportionate for a local transcription utility.
The 'always' flag is false, and the skill does not request elevated or persistent platform-wide privileges. At runtime, downloaded model weights are cached on the host (normal for ML models), but the skill does not modify other skills or system-wide agent settings.
Guidance
This skill is internally coherent for local Whisper STT, but check a few things before installing:
1. The SKILL.md references a scripts/local-whisper wrapper but only transcribe.py is included; you may need to run the Python file directly or add a small launcher.
2. The instructions use the 'uv' helper tool, but 'uv' is not declared as a required binary; make sure you understand those commands or replace them (python -m venv / pip work equally well).
3. Whisper downloads model weights the first time you run it (large models are gigabytes), which requires internet access and disk space; after download it runs offline.
4. Installing packages with pip runs arbitrary code from PyPI and the torch index; install into an isolated venv and review packages if you have supply-chain concerns.
5. Audio data is processed locally, but if you later modify the code or install different packages, re-check for network calls or endpoints.
If those points are acceptable, the skill is consistent with its stated purpose.
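The missing local-whisper wrapper mentioned above can be bridged with a one-file launcher. A hypothetical sketch, assuming the wrapper sits next to transcribe.py in the skill's scripts directory:

```python
import sys
from pathlib import Path

# Hypothetical launcher: SKILL.md documents a `local-whisper` command, but only
# scripts/transcribe.py ships with the skill. This wrapper forwards its CLI
# arguments to transcribe.py, assumed to live in the same directory.
def build_argv(args: list[str], script: str = "transcribe.py") -> list[str]:
    """Build the command line that hands all arguments to transcribe.py."""
    target = Path(sys.argv[0]).resolve().with_name(script)
    return [sys.executable, str(target), *args]

# In the real wrapper, one would finish with:
#     os.execv(sys.executable, build_argv(sys.argv[1:]))
# which replaces the wrapper process with the transcriber.
```

Saving this as scripts/local-whisper and marking it executable would make the README examples work as written.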
Latest Release
v1.0.0
- Initial release of local-whisper: local speech-to-text using OpenAI Whisper, fully offline after model download.
- Supports multiple model sizes for different speed/quality needs: tiny, base (default), small, turbo, large-v3.
- Includes options for language selection, timestamps, JSON output, and quiet mode.
- Provides clear setup instructions using a uv-managed Python virtual environment.
- Requires ffmpeg for audio processing.
Published by @araa47 on ClawHub