Extract PDF content to Markdown using MinerU API. Supports formulas, tables, OCR. Provides both local file and online URL parsing methods.
Security Analysis
High confidence: The skill does what its name/description claim (uses the MinerU API to extract PDFs) and its scripts only require a MinerU API token and standard CLI tools; the bundle is internally coherent aside from a minor metadata mismatch about required env vars and a small tooling note.
The skill's name/description (PDF → Markdown using MinerU) matches the included scripts and docs: they call MinerU API endpoints, upload files to presigned OSS URLs, poll results and download a ZIP with parsed Markdown. One inconsistency: the registry metadata at the top states "Required env vars: none", but the SKILL.md and all scripts clearly require an API token (MINERU_TOKEN or MINERU_API_KEY). This is likely an authoring/metadata omission rather than malicious behavior, but users should be aware the token is required.
The runtime instructions and scripts operate within stated scope: reading a local PDF path (when using local flow), validating/sanitizing inputs, calling MinerU API endpoints under MINERU_BASE_URL, uploading to presigned OSS URLs and downloading results from the official CDN host. Scripts include input sanitization, ZIP validation and directory traversal checks. They do not attempt to read unrelated system files or send data to unexpected external endpoints. Minor tooling note: scripts optionally pipe responses to `python3 -m json.tool` for pretty-printing but SKILL.md does not list python3 as a recommended/required tool.
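The ZIP validation and directory traversal checks mentioned above could look roughly like the following. This is a minimal Python sketch, not the skill's actual shell code: the function name `safe_extract` is hypothetical, and the choice to reject the whole archive when any entry resolves outside the destination is an assumption.

```python
import os
import zipfile


def safe_extract(zip_path, dest_dir):
    """Extract zip_path into dest_dir, refusing entries that escape dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    # ZipFile raises zipfile.BadZipFile if the download is not a ZIP at all.
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() returns the first member with a bad CRC, or None if all pass.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise ValueError(f"corrupt ZIP member: {bad_member}")
        for member in archive.namelist():
            # Resolve where the entry would land and require it to stay
            # under the destination directory.
            target = os.path.realpath(os.path.join(dest_root, member))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in ZIP: {member}")
        archive.extractall(dest_root)
    return dest_root
```

Checking every entry before extracting anything means a single suspicious path leaves the output directory untouched.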
There is no install spec; this is an instruction-only skill with included shell scripts. Nothing in the bundle downloads arbitrary code at install time. Risk is low from the install mechanism itself. However, running the provided scripts will execute code included in the repo, so users should review them before executing.
The scripts require a single service credential (MINERU_TOKEN or MINERU_API_KEY) and optionally MINERU_BASE_URL. That is proportional for a MinerU API integration. The only notable mismatch is registry metadata claiming no required env vars while SKILL.md and scripts require the token—this should be corrected. No unrelated secrets or broad cloud credentials (AWS, GCP, etc.) are requested.
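To illustrate the credential requirement described above, here is a minimal Python sketch of resolving the token from either variable. The function name `resolve_mineru_token` is hypothetical, and the precedence (MINERU_TOKEN before MINERU_API_KEY) is an assumption; the skill's scripts may order them differently.

```python
import os


def resolve_mineru_token(env=None):
    """Return the token from MINERU_TOKEN or MINERU_API_KEY, or raise."""
    env = os.environ if env is None else env
    # Assumed precedence: MINERU_TOKEN first, then MINERU_API_KEY.
    for name in ("MINERU_TOKEN", "MINERU_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError("Set MINERU_TOKEN or MINERU_API_KEY before running the scripts")
```

Failing early with a clear message avoids sending unauthenticated requests when the variable is missing, which is the case the registry metadata currently hides.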
The skill does not request permanent/always-on privileges, does not alter other skills or system-wide configs, and is user-invocable only. Autonomous invocation is allowed by default, which is normal for the platform, but the skill itself does not request elevated persistence.
Guidance
This skill appears to be what it claims: a set of scripts to call the MinerU API to parse PDFs. Before installing/running:
1) Be sure to set MINERU_TOKEN (or MINERU_API_KEY). SKILL.md requires it even though the top-level registry metadata omitted it.
2) Review the shell scripts (they are included) and only run them if you trust the source and the MinerU endpoints listed (mineru.net, mineru.oss-cn-shanghai.aliyuncs.com, cdn-mineru.openxlab.org.cn).
3) The scripts use curl and unzip, and may use jq or python3 if present; install the latter two if you want improved JSON handling.
4) Treat your MinerU token as sensitive: do not expose it in public repos or logs, and consider least-privilege options with the provider.
5) If you will process sensitive PDFs, verify the provider's privacy policy before uploading.
Overall: coherent and low-risk for its stated purpose, with only the metadata accuracy and tooling notes mentioned above to fix.
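When reviewing the scripts, one way to check that a URL points at one of the hosts named above is a simple allowlist test. The following Python sketch is illustrative only: `is_expected_endpoint` is a hypothetical helper, and requiring HTTPS and exact host matches (no subdomains) are assumptions rather than documented behavior of the skill.

```python
from urllib.parse import urlsplit

# Hosts taken from the guidance above.
ALLOWED_HOSTS = {
    "mineru.net",
    "mineru.oss-cn-shanghai.aliyuncs.com",
    "cdn-mineru.openxlab.org.cn",
}


def is_expected_endpoint(url):
    """Return True if url uses HTTPS and its host is in ALLOWED_HOSTS."""
    parts = urlsplit(url)
    # urlsplit lowercases the hostname, so the comparison is case-insensitive.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Matching the parsed hostname rather than searching the URL string avoids false positives such as `https://mineru.net.evil.example/`.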
Latest Release
v1.0.5
No file changes were detected in this version.
- Version metadata updated without modifications to any skill files.
- No user-facing features, documentation, or code have changed.
Published by @A-I-R on ClawHub