Extract text from images using Tesseract.js OCR (100% local, no API key required). Supports Chinese (simplified/traditional) and English.
Security Analysis
High confidence: The skill's code, instructions, and requirements are consistent with a local Tesseract.js-based OCR tool; nothing requests unrelated credentials or suspicious system access.
Name/description (local OCR for Chinese/English) align with the actual files: a Node script that uses tesseract.js. Required binary (node) and the dependency (tesseract.js) are appropriate and expected.
SKILL.md and scripts/ocr.js limit behavior to reading a user-supplied image file and running Tesseract.recognize. The instructions do not reference unrelated files or environment variables, and they do not transmit results to external endpoints. The README and notes explicitly mention that language data is downloaded on first run.
No install spec in the registry, but package.json lists tesseract.js and SKILL.md metadata suggests installing it via npm — this is a standard, low-risk installation route. Note: at runtime Tesseract.js will fetch language model files from remote hosts (first-run download ~20MB per language); that network fetch is expected for this library and not an unexpected exfiltration endpoint.
The skill requests no environment variables, credentials, or config paths. package.json dependency only on tesseract.js is proportionate to OCR functionality.
Skill is not always-enabled and does not request system-wide configuration changes. The only persistent behavior is caching/downloading language data for later runs (expected and limited scope).
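The behavior described in this analysis (read a user-supplied image file, then run Tesseract.recognize on it) can be sketched roughly as follows. This is a minimal illustration, not the contents of scripts/ocr.js: the helper name extractText and the default language list are assumptions, and the recognize function is passed in as a parameter so the sketch does not depend on a particular tesseract.js version.

```javascript
const fs = require("node:fs");

// Hypothetical helper, not taken from scripts/ocr.js.
// `recognize` is assumed to behave like Tesseract.recognize(image, langs),
// i.e. to resolve to an object of the form { data: { text } }.
async function extractText(recognize, imagePath, langs = ["chi_sim", "eng"]) {
  // Fail early with a clear message if the image path does not exist.
  if (!fs.existsSync(imagePath)) {
    throw new Error(`Image not found: ${imagePath}`);
  }
  // Tesseract joins multiple language codes with "+", e.g. "chi_sim+eng".
  const result = await recognize(imagePath, langs.join("+"));
  // Only the recognized text is returned; nothing is sent anywhere else.
  return result.data.text.trim();
}
```

With tesseract.js installed, a call might look like `extractText((img, l) => Tesseract.recognize(img, l), "photo.png")`, where `Tesseract` is the result of `require("tesseract.js")` and "photo.png" is a placeholder path.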
Guidance
This skill appears to do what it claims: local OCR via Node + Tesseract.js. Before installing, note:
(1) You need Node and should run npm install to obtain tesseract.js (package.json lists the dependency).
(2) On first run the library downloads language model files (~20MB per language) from Tesseract.js-hosted URLs. The images themselves are processed locally and the script does not post results to third-party APIs.
(3) If you require full offline operation, be aware of the runtime model downloads and consider pre-downloading the traineddata files into a controlled location.
(4) As with any third-party package, ensure you trust the tesseract.js version and review upstream package provenance if you have strict supply-chain requirements.
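For point (3), one way to confirm that a pre-downloaded model directory is complete before running offline is a small check like the one below. It is a sketch under stated assumptions: the helper name missingLanguageFiles and the directory layout are hypothetical, and the file names follow the usual Tesseract convention of `<lang>.traineddata`, optionally gzipped. Pointing Tesseract.js at such a directory is typically done through its `langPath` option; the exact call shape depends on the installed version, so check its documentation.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: returns the languages that have no model file in `dir`.
// Accepts either "<lang>.traineddata" or the gzipped "<lang>.traineddata.gz".
function missingLanguageFiles(dir, langs) {
  return langs.filter((lang) => {
    const plain = path.join(dir, `${lang}.traineddata`);
    return !fs.existsSync(plain) && !fs.existsSync(`${plain}.gz`);
  });
}
```

For example, `missingLanguageFiles("./tessdata", ["chi_sim", "chi_tra", "eng"])` returns the codes whose files still need to be fetched into the (assumed) `./tessdata` directory; an empty array means every requested model is already present locally.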
Latest Release
v1.0.0
Local OCR skill using Tesseract.js, no API key required
Published by @shaw555 on ClawHub