Chemistry agent skill for PubChem API queries (compound info/properties, structures/SMILES/images, synthesis routes/references) + RDKit cheminformatics (SMIL...
Security Analysis
medium confidence
The code and runtime instructions are internally consistent with a PubChem+RDKit chemistry tool; no obvious attempts to exfiltrate credentials or access unrelated systems, but there are a few operational and privacy-related issues to review before running.
Name/description (PubChem queries + RDKit analysis/retrosynthesis) match the included scripts: query_pubchem.py, rdkit_mol.py, admet_predict.py, chembl and PubMed query modules, templates.json of reaction SMARTS, and a Gradio UI. Network calls are limited to public chemistry/data APIs (PubChem, ChEMBL, NCBI) which are expected for the stated purpose.
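One way to hold the skill to the network scope described above is a host allowlist checked before any outbound request. The following is a minimal sketch, not code from the package: the hostnames are assumptions based on the public endpoints of the three named services, and the helper name `is_allowed_url` is hypothetical. The bundled scripts may contact other hosts.

```python
from urllib.parse import urlsplit

# Assumed hostnames for the three public services named in the review;
# the skill's scripts may use different or additional endpoints.
ALLOWED_HOSTS = frozenset({
    "pubchem.ncbi.nlm.nih.gov",  # PubChem PUG REST
    "www.ebi.ac.uk",             # ChEMBL web services
    "eutils.ncbi.nlm.nih.gov",   # NCBI E-utilities (PubMed)
})


def is_allowed_url(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    # Exact host match, so look-alike suffixes such as
    # "pubchem.ncbi.nlm.nih.gov.evil.example" are rejected.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

A wrapper around the HTTP client could call this check and refuse anything it rejects, which would make unexpected destinations visible during review.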
SKILL.md and scripts instruct the agent to run local Python scripts that call external chemistry APIs and perform RDKit processing. The code writes image files under a viz directory, spawns subprocesses to run bundled scripts (e.g., query_pubchem.py), and may create files in the skill directory. Two notable operational items: chem_ui.py contains a hard-coded WORK_DIR path (/home/democritus/..., likely the packager's local path) and launches Gradio with share=True (which will try to create a public link when run). These are not malicious but are privacy/operational concerns and may cause failures or unintended network exposure.
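The two operational items could be addressed along these lines. This is a sketch under stated assumptions rather than a patch to chem_ui.py: the environment variable `CHEM_UI_WORK_DIR` and both helper names are hypothetical, and the internal structure of chem_ui.py is not shown in this review. `share` and `server_name` are existing keyword arguments of Gradio's `launch()`.

```python
import os
import tempfile
from pathlib import Path


def resolve_work_dir(env_var: str = "CHEM_UI_WORK_DIR") -> Path:
    """Pick a working directory from the environment, else a fresh temp dir.

    The variable name is a hypothetical replacement for the hard-coded path.
    """
    configured = os.environ.get(env_var)
    if configured:
        work_dir = Path(configured).expanduser().resolve()
        work_dir.mkdir(parents=True, exist_ok=True)
    else:
        # No configuration given: use an isolated temporary directory.
        work_dir = Path(tempfile.mkdtemp(prefix="chem_ui_"))
    return work_dir


def local_launch_kwargs() -> dict:
    """Keyword arguments for a Gradio launch() that stays on this machine."""
    return {
        "share": False,              # do not request a public link
        "server_name": "127.0.0.1",  # listen on loopback only
    }
```

Assuming the UI object is called `demo`, the launch call would then read `demo.launch(**local_launch_kwargs())`, with `WORK_DIR = resolve_work_dir()` replacing the packager's path.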
There is no install spec and no remote downloads executed automatically. A pyvenv.cfg file is included (indicating a packaged virtualenv metadata entry) but no installer that fetches arbitrary code. scripts/opsin_name_to_smiles.py references opsin.jar and prints a wget URL if missing rather than auto-downloading it. Overall the package does not perform any high-risk remote installs by itself.
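If the OPSIN JAR is fetched manually, its checksum can be compared against a value obtained from a trusted source before the JAR is ever executed. The sketch below uses only the standard library. The helper names are hypothetical, and the expected digest must be supplied by the reviewer; none is given in this review.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def jar_is_trusted(path: Path, expected_sha256: str) -> bool:
    """True only if the file exists and matches the expected digest."""
    if not path.is_file():
        return False
    return sha256_of(path) == expected_sha256.strip().lower()
```

A caller would refuse to invoke Java when `jar_is_trusted` returns False.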
The skill declares no required environment variables, no credentials, and no config paths. All external access is to public chemical data APIs (PubChem, ChEMBL, NCBI), which matches the skill purpose. No secret-exposing env vars are requested. Note: running the Gradio UI (share=True) will expose a public endpoint if executed, which could leak data you send to the UI.
The skill's always setting is false, and the skill does not request persistent elevated privileges or modify other skills. It writes outputs (images) to a local viz directory inside the skill workspace and invokes local subprocesses; this is normal for a tool of this type. Autonomous invocation is allowed by default but is not combined with other concerning privileges here.
Guidance
What to consider before installing/running:
- Functionality and network use: The skill calls PubChem, ChEMBL, and NCBI (PubMed) APIs and runs RDKit locally; this is expected. If you need to avoid external network calls, do not run it or sandbox its network access.
- Dependencies: RDKit, Pillow, pandas, gradio, and possibly Java (for OPSIN) are required. The package itself does not install them automatically; install and verify these in a controlled environment.
- Gradio exposure: chem_ui.py launches Gradio with share=True and a hard-coded WORK_DIR. If you run the UI, remove or change share=True and adjust WORK_DIR to a safe path. share=True will try to create a public URL and may expose data you input to external services.
- Files written: scripts create image files under a viz directory and use a WORK_DIR for the UI. Ensure the skill runs in an isolated workspace so these files don't overwrite or leak host data.
- OPSIN: opsin.jar is not bundled; the script prints a wget command if the JAR is missing. The skill does not automatically fetch/execute arbitrary binaries, but if you follow that wget instruction you will download and run third-party Java code. Review that separately.
- Dual-use / safety: The skill includes retrosynthesis templates and multi-step planning (BRICS/disconnects and named reaction templates). These are legitimate chemistry capabilities but could be sensitive (dual-use). Consider institutional policies and legal/regulatory constraints before performing synthesis planning for hazardous or controlled compounds.
- Recommendation: run the skill in a sandboxed environment (container/VM) with limited network access while you verify behavior, remove or change Gradio share=True, check WORK_DIR, and ensure required Python packages are installed from trusted sources.
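Part of the verification recommended above can be done before anything is executed, by scanning the bundled Python sources for the patterns this review flags. The following is a rough sketch: the two regular expressions and the function names are illustrative assumptions, and a text scan like this is no substitute for reading the code.

```python
import re
from pathlib import Path

# Illustrative patterns only; this is not an exhaustive audit.
RISK_PATTERNS = {
    "gradio public share": re.compile(r"share\s*=\s*True"),
    "hard-coded home path": re.compile(r"""["']/home/[^"']*["']"""),
}


def scan_source(text: str) -> list[tuple[int, str]]:
    """Return (line number, label) for each flagged line of one source text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RISK_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


def scan_tree(root: Path) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py file under root and report only files with findings."""
    report = {}
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        findings = scan_source(text)
        if findings:
            report[str(path)] = findings
    return report
```

Running `scan_tree(Path("."))` from the unpacked skill directory would list the files and line numbers to inspect by hand.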
Latest Release
v1.0.0
chemistry-query v1.0.0
- Initial release of full-stack chemistry toolkit integrating PubChem API queries and RDKit cheminformatics.
- Supports compound lookup, molecular properties, 2D visualization, retrosynthesis, multi-step synthesis planning, and reaction simulation.
- Provides structured JSON outputs, PNG/SVG molecule images, and automatic chemical name resolution.
- Includes 21 named reaction templates and chaining capabilities for seamless chemistry workflows.
Published by @Cheminem on ClawHub