Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.
Security Analysis
High confidence
The skill's purpose (browser automation with Gemini Computer Use) is plausible and most of the code matches that purpose, but there are clear inconsistencies (registry says no env vars while the code requires GEMINI_API_KEY), a truncated/possibly-buggy script fragment, and privacy/operational risks (screenshots sent to an external model, browser automation can act on web pages) that you should understand before installing.
The name/description (Gemini Computer Use browser-control agents) matches the included script and instructions: it uses Playwright and the Google GenAI client to run a screenshot → function_call → action → function_response loop. However the registry metadata claims 'Required env vars: none' while both the SKILL.md quickstart and the script require a GEMINI_API_KEY (and optionally COMPUTER_USE_BROWSER_CHANNEL / COMPUTER_USE_BROWSER_EXECUTABLE). That registry vs implementation mismatch is inconsistent and should be corrected/clarified.
SKILL.md tells the user to set an API key and run the provided script. The runtime instructions and code capture full-page screenshots and send them (as inline image/png parts) along with the user prompt to the external Gemini model (Google GenAI). This is expected for the skill's purpose, but it means screenshots, which may contain sensitive information, are transmitted off-host. The instructions also allow the model to emit function_call actions that the script executes directly in Playwright; the script supports a user confirmation flow for 'require_confirmation', but most actions will execute without prompting. Additionally, the script included in the package is truncated near the model call (it ends in a cut-off fragment referencing 'MOD', apparently a variable-name error), which could hide additional behavior or indicate that the shipped script will fail or behave unexpectedly.
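The loop described above can be sketched without any real model or browser by passing each stage in as a callable. This is an illustrative sketch only, not the shipped script: `run_agent_loop`, its parameters, the dict shape of a call, and the `denied_by_user` status are all hypothetical, and the only detail taken from the review is that confirmation is requested when the model marks a call with 'safety_decision: require_confirmation'.

```python
from typing import Callable, Optional


def run_agent_loop(
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[bytes, Optional[dict]], Optional[dict]],
    execute_action: Callable[[dict], dict],
    confirm: Callable[[dict], bool],
    max_steps: int = 10,
) -> list:
    """Run screenshot -> function_call -> action -> function_response.

    All four callables are hypothetical stand-ins: a real script would
    back them with Playwright and the model client.
    """
    history = []
    last_response = None
    for _ in range(max_steps):  # hard cap so a confused model cannot loop forever
        screenshot = take_screenshot()
        call = ask_model(screenshot, last_response)
        if call is None:  # no further action requested: the task is finished
            break
        needs_ok = call.get("safety_decision") == "require_confirmation"
        if needs_ok and not confirm(call):
            # Report the refusal back instead of silently skipping the action.
            last_response = {"name": call["name"], "status": "denied_by_user"}
        else:
            last_response = {"name": call["name"], **execute_action(call)}
        history.append(last_response)
    return history
```

In this shape, every call without a 'require_confirmation' marker runs immediately, which is the behaviour the review flags: the confirmation callback is consulted only when the model asks for it.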
There is no automated install spec (instruction-only install). The SKILL.md instructs the user to create a virtualenv and pip install google-genai and playwright, then run 'playwright install chromium'. This is a standard, low-risk approach compared to bundled downloads from arbitrary URLs. The package includes a Python script; no external downloads or extract/install steps are declared in the skill bundle itself.
The code legitimately requires GEMINI_API_KEY to call the Gemini Computer Use model and optionally COMPUTER_USE_BROWSER_CHANNEL and COMPUTER_USE_BROWSER_EXECUTABLE to control browser selection. Those env vars are proportional to the stated purpose. However the public registry metadata incorrectly lists no required env vars, which is misleading. Also note that transmitting screenshots to the external API is intrinsic to functionality but is a privacy-sensitive operation — the skill will send image data to Google's API, and users should consider whether that exposure is acceptable for the pages/screens they automate.
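As a minimal sketch of the environment handling described here, the check below treats GEMINI_API_KEY as required and the two browser variables as optional. The function name `check_env` and its return shape are hypothetical and are not taken from the shipped script, which may structure this differently.

```python
import os
from typing import Mapping

# Variable names come from the review above; the grouping is an assumption.
REQUIRED_ENV = ("GEMINI_API_KEY",)
OPTIONAL_ENV = ("COMPUTER_USE_BROWSER_CHANNEL", "COMPUTER_USE_BROWSER_EXECUTABLE")


def check_env(environ: Mapping[str, str] = os.environ):
    """Return (missing_required, optional_values).

    Only the names of missing variables are reported, so the API key
    itself is never echoed back.
    """
    missing = [name for name in REQUIRED_ENV if not environ.get(name)]
    optional = {name: environ[name] for name in OPTIONAL_ENV if environ.get(name)}
    return missing, optional
```

A caller could stop early when the first element is non-empty, which would match the reported behaviour of the script exiting when GEMINI_API_KEY is not set.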
The skill is not always-enabled and does not request special platform privileges. The skill is allowed to be invoked autonomously (disable-model-invocation is false), which is the platform default; combined with broad browser control capabilities, autonomous invocation increases the blast radius (the agent could autonomously navigate, click, and type). SKILL.md does recommend running in a sandboxed profile or container. There is no evidence the skill modifies other skills or system settings.
Guidance
Before installing or running this skill:
- Expect to set GEMINI_API_KEY (the code will exit if GEMINI_API_KEY is not set). The registry metadata incorrectly claimed no env vars, so don't trust that field alone.
- Screenshots of the browser are sent to the Gemini/Google GenAI endpoint as part of normal operation. Those screenshots can contain sensitive data (credentials, personal info, 2FA codes). Only run this against pages you are comfortable sending to an external API.
- Run the agent in a sandboxed environment or container and avoid pointing COMPUTER_USE_BROWSER_EXECUTABLE at a browser that uses your real profile (bookmarks/cookies/sessions); otherwise the agent could act using your authenticated sessions.
- The included Python script appears to be truncated near the model invocation (a fragment referencing 'MOD' was cut off). That may be a bug or hide additional behavior. Inspect the full script locally before running; fix the apparent variable name and ensure the model call and loop are readable.
- Verify the safety confirmation flow: the script will prompt via input() only when the model provides 'safety_decision: require_confirmation'; many actions will execute without prompting. If you need stricter controls, modify the code to enforce confirmation or block lists before running.
- If you are uncertain about network exposure, run the script in an isolated VM/container and review network traffic to confirm only expected calls to Google GenAI occur.
- If you want to proceed, obtain the env.example referenced in SKILL.md, set GEMINI_API_KEY, inspect and possibly patch the script, and test on non-sensitive sites first.
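One way the stricter controls mentioned above might look is a small policy function consulted before each action. This is a sketch under stated assumptions: the function `decide`, the host names, the action names in `ALWAYS_CONFIRM`, and the dict shape of `call` are all hypothetical placeholders, and none of them come from the shipped script.

```python
from urllib.parse import urlparse

# Hypothetical policy data: replace with hosts and action names that matter to you.
BLOCKED_HOSTS = {"accounts.example.com", "bank.example.com"}
ALWAYS_CONFIRM = {"type_text_at", "navigate"}


def decide(call: dict, current_url: str) -> str:
    """Return "block", "confirm", or "allow" for a proposed action."""
    host = urlparse(current_url).hostname or ""
    # Block the listed hosts and any of their subdomains outright.
    if host in BLOCKED_HOSTS or any(host.endswith("." + b) for b in BLOCKED_HOSTS):
        return "block"
    # Confirm when the model asks for it, and also for locally flagged actions.
    model_wants_ok = call.get("safety_decision") == "require_confirmation"
    if model_wants_ok or call.get("name") in ALWAYS_CONFIRM:
        return "confirm"
    return "allow"
```

The point of the design is that the decision no longer depends solely on the model's own 'safety_decision': a local rule can demand confirmation or refuse an action even when the model did not flag it.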
Latest Release
v1.0.0
Initial release - Gemini 2.5 Computer Use browser-control agents with Playwright
Published by @am-will on ClawHub