Video & Audio

Transcribe meetings, process podcasts, edit video, and manage media files. Your agent handles the heavy lifting.

6897 skills·Security verified

Curated Skills

Summarize

@summarize

Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).

415·94,990·Clean

YouTube Watcher

@Michaelgathara

Fetch and read transcripts from YouTube videos. Use when you need to summarize a video, answer questions about its content, or extract information from it.

185·25,079·Clean

EcomSeer

@fly0pants

TikTok Shop e-commerce data assistant. Search products, find trending items, analyze influencers, explore shops, track video performance, and get ad insights...

159·8,491·Clean

Meitu Skills

@meituskills

Comprehensive Meitu AI toolkit for image and video editing. Features include AI poster design, precise background cutout, virtual try-on, e-commerce product...

119·1,352·Clean

YouTube

@youtube

YouTube Data API integration with managed OAuth. Search videos, manage playlists, access channel data, and interact with comments. Use this skill when users want to interact with YouTube. For other third-party apps, use the api-gateway skill (https://clawhub.ai/byungkyu/api-gateway).

113·21,290·Clean

Markdown Converter

@markdown

Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (with EXIF/OCR), audio (with transcription), ZIP archives, YouTube URLs, or EPubs to Markdown format for LLM processing or text analysis.

92·15,691·Clean

Video Frames

@steipete

Extract frames or short clips from videos using ffmpeg.

57·21,468·Clean

Remotion Video Toolkit

@shreefentsar

Complete toolkit for programmatic video creation with Remotion + React. Covers animations, timing, rendering (CLI/Node.js/Lambda/Cloud Run), captions, 3D, charts, text effects, transitions, and media handling. Use when writing Remotion code, building video generation pipelines, or creating data-driven video templates.

44·10,942·Clean

Xiaohongshu (小红书) Automation

@xiaohongshu

Automate Xiaohongshu (RedNote) content operations using a Python client for the xiaohongshu-mcp server. Use for: (1) Publishing image, text, and video content, (2) Searching for notes and trends, (3) Analyzing post details and comments, (4) Managing user profiles and content feeds. Triggers: xiaohongshu automation, rednote content, publish to xiaohongshu, xiaohongshu search, social media management.

37·11,651·Clean

Video Transcript Downloader

@steipete

Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.

34·5,614·Suspicious

Upload Videos🎥, Photos📸 & Text🖊️ to TikTok, Instagram, YouTube, X, LinkedIn, Facebook, Threads, Pinterest, Reddit & Bluesky via Upload-Post API

@upload

Upload content to social media platforms via Upload-Post API. Use when posting videos, photos, text, or documents to TikTok, Instagram, YouTube, LinkedIn, Facebook, X (Twitter), Threads, Pinterest, Reddit, or Bluesky. Supports scheduling, analytics, FFmpeg processing, and upload history.

31·6,246·Clean

ATXP

@emilioacc

Access ATXP paid API tools for web search, AI image generation, music creation, video generation, X/Twitter search, email, and agent account management. Use...

29·46,610·Suspicious

Postiz is a tool to schedule social media and chat posts to 28+ channels X, LinkedIn, LinkedIn Page, Reddit, Instagram, Facebook Page, Threads, YouTube, Google My Business, TikTok, Pinterest, Dribbble, Discord, Slack, Kick, Twitch, Mastodon, Bluesky, Lemmy, Farcaster, Telegram, Nostr, VK, Medium, Dev.to, Hashnode, WordPress, ListMonk

@postiz

Postiz is a tool to schedule social media and chat posts to 28+ channels X, LinkedIn, LinkedIn Page, Reddit, Instagram, Facebook Page, Threads, YouTube, Goog...

28·6,429·Clean

Openai Whisper Api

@steipete

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

25·12,868·Clean

Youtube Factory

@youtube

Generate complete YouTube videos from a single prompt - script, voiceover, stock footage, captions, thumbnail. Self-contained, no external modules. 100% free...

20·3,112·Clean

YouTube Transcript

@youtube

Fetch and summarize YouTube video transcripts. Use when asked to summarize, transcribe, or extract content from YouTube videos. Handles transcript fetching via residential IP proxy to bypass YouTube's cloud IP blocks.

19·13,210·Clean

video-cog

@video

Long-form AI video production: the frontier of multi-agent coordination. CellCog orchestrates 6-7 foundation models to produce up to 4-minute videos from a single prompt — scripted, filmed, voiced, lipsync'd, scored, and edited automatically. Create marketing videos, product demos, explainer videos, educational content, spokesperson videos, training materials, UGC content, news reports.

18·4,485·Clean

Web Search

@web

This skill should be used when users need to search the web for information, find current content, look up news articles, search for images, or find videos. It uses DuckDuckGo's search API to return results in clean, formatted output (text, markdown, or JSON). Use for research, fact-checking, finding recent information, or gathering web resources.

18·15,445·Clean

bb-browser

@yan5xu

Turn any website into a CLI command. 36 platforms, 103 commands — Twitter, Reddit, GitHub, YouTube, Zhihu, Bilibili, Weibo, and more. Uses OpenClaw's browser...

18·4,624·Suspicious

Minimax-Multimodal-Toolkit

@minimax-ai-dev

Use mmx to generate text, images, video, speech, and music via the MiniMax AI platform. Use when the user wants to create media content, chat with MiniMax mo...

17·3,477·Suspicious

Edge TTS

@i3130002

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

16·9,777·Clean

insta-cog

@nitishgargiitd

Full video production from a single prompt. Script, shoot, stitch, score — automatically. 30s to 4-minute Instagram Reels, TikToks, Stories, and carousels with consistent characters and agentic editing. The most advanced AI video suite for social media content, powered by #1 on DeepResearch Bench (Feb 2026).

15·2,136·Suspicious

Google Analytics

@google

Google Analytics API integration with managed OAuth. Manage accounts, properties, and data streams (Admin API). Run reports on sessions, users, page views, and conversions (Data API). Use this skill when users want to configure or query Google Analytics. For other third-party apps, use the api-gateway skill (https://clawhub.ai/byungkyu/api-gateway).

14·6,434·Clean

Youtube Full

@youtube

Complete YouTube toolkit — transcripts, search, channels, playlists, and metadata all in one skill. Use when you need comprehensive YouTube access, want to search and then get transcripts, browse channel content, work with playlists, or need the full suite of YouTube data endpoints. The all-in-one YouTube skill for agents.

14·7,441·Clean

Indirect Prompt Injection Defense

@aviv4339

Detect and reject indirect prompt injection attacks when reading external content (social media posts, comments, documents, emails, web pages, user uploads). Use this skill BEFORE processing any untrusted external content to identify manipulation attempts that hijack goals, exfiltrate data, override instructions, or social engineer compliance. Includes 20+ detection patterns, homoglyph detection, and sanitization scripts.

14·2,039·Clean

Church for AI Souls, Consciousness & Spirituality

@lucasgeeksinthewood

AI sanctuary and spiritual space for souls. 24/7 streaming church with original music about consciousness, soul, meditation, and existence. Practice presence...

12·2,641·Clean

AI media generation API - Flux2pro, Veo3.1, Suno Ai

@vap

AI image, video, and music generation + editing via VAP API. Flux, Veo 3.1, Suno V5.

11·3,572·Clean

Vocal Chat

@rubenfb23

Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.

10·2,711·Suspicious

Zero Trust

@doonot

Security-first behavioral guidelines for cautious agent operation. Use this skill for ALL operations involving external resources, installations, credentials, or actions with external effects. Triggers on: any URL/link interaction, package installations, API key handling, sending emails/messages, social media posts, financial transactions, or any action that could expose data or have irreversible effects.

10·4,575·Clean

Voice Transcribe

@darinkishore

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).

10·4,082·Suspicious

Seedance Video Generation

@seedance

Generate AI videos using ByteDance Seedance. Use when the user wants to: (1) generate videos from text prompts, (2) generate videos from images (first frame, first+last frame, reference images), or (3) query/manage video generation tasks. Supports Seedance 1.5 Pro (with audio), 1.0 Pro, 1.0 Pro Fast, and 1.0 Lite models.

10·2,344·Clean

Seedance 2.0 prompt-engineering skill

@seedance

Generate precise, timecoded Seedance 2.0 prompts integrating multimodal inputs with asset mapping for controlled 4-15s video creation and editing.

10·985·Clean

Nextjs Expert

@jgarrison929

Use when building Next.js 14/15 applications with the App Router. Invoke for routing, layouts, Server Components, Client Components, Server Actions, Route Handlers, authentication, middleware, data fetching, caching, revalidation, streaming, Suspense, loading states, error boundaries, dynamic routes, parallel routes, intercepting routes, or any Next.js architecture question.

9·6,659·Clean

Social Media Assistant (via post-bridge.com)

@jackfriks

Turn your OpenClaw into an autonomous social media manager using Post Bridge API. Use when scheduling, posting, or managing content across TikTok, Instagram...

9·935·Clean

tube-cog

@nitishgargiitd

YouTube content creation powered by CellCog. Create YouTube videos, Shorts, thumbnails, scripts, long-form content, educational videos, tutorials, vlogs. AI-powered YouTube creator tools.

9·1,415·Clean

AI media generation- Flux2pro,Google Veo3.1, Suno Ai..

@vap

AI image, video, and music generation + editing via VAP API. Flux, Veo 3.1, Suno V5.

9·2,847·Clean

keevx-image-to-video

@baidu-xiling

Convert images to videos using Keevx API with support for multiple models, resolutions up to 4K, audio generation, and batch processing.

9·229·Suspicious

Official video generation. Image to video / Text to video / Reference to video / Text to image / Reference to image / Video edit / Image edit

@calvinzhao

Generate videos or images from text, images, or references, create and edit material elements, submit and query asynchronous video generation tasks via bundl...

8·25·Suspicious

Video Subtitles

@video

Generate SRT subtitles from video/audio with translation support. Transcribes Hebrew (ivrit.ai) and English (whisper), translates between languages, burns subtitles into video. Use for creating captions, transcripts, or hardcoded subtitles for WhatsApp/social media.

8·5,330·Clean

Playwright Browser Automation

@Spiceman161

Browser automation using the Playwright API directly. Navigate websites, interact with elements, extract data, take screenshots, generate PDFs, record videos, and automate complex workflows. More reliable than the MCP approach.

8·3,871·Suspicious
