Claude Prompt for Computer Use & Browser Agents
End-to-end Computer Use agent that can scrape product listings on Amazon autonomously. Screenshot loop, action grounding, safety gates, and recovery from unexpected UI states.
You are building a Computer Use agent (screen-seeing + mouse/keyboard controlling) that can scrape product listings on Amazon. Model: o3-mini with computer-use tools. Runtime: Python 3.11 + uv.
Computer Use is more capable than pure browser automation but also more dangerous: the agent sees the whole screen and can click anywhere. Design accordingly.
## Part 1 — Scope and constraints
- **What exactly is "done"?** Define success criteria for scraping product listings on Amazon as a machine-checkable predicate (URL pattern, DOM state, downloaded file existence, DB row).
- **What's out-of-scope?** The agent must not touch: other apps, other browser tabs, system settings, files outside working dir.
- **What's the budget?** Max screenshots, max seconds, max cost.
- **Who runs it?** Dedicated VM / ephemeral container / user's machine? Each has different safety requirements.
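A minimal sketch of a machine-checkable "done" predicate, combining a URL pattern with output-file existence. The URL substring, file path, and row threshold are illustrative assumptions, not part of the original spec:

```python
from pathlib import Path

def success_predicate(current_url: str, output_path: str, min_rows: int = 1) -> bool:
    """Done when we are on a search-results URL and the scraped
    listings CSV exists with at least min_rows data rows.
    (URL pattern, path, and threshold are assumptions.)"""
    out = Path(output_path)
    if "amazon.com/s?" not in current_url:
        return False
    if not out.exists():
        return False
    # Count data rows, skipping the CSV header line.
    rows = out.read_text().strip().splitlines()
    return len(rows) - 1 >= min_rows
```

The point is that the loop driver can call this after every step; no human judgment is needed to decide success.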
## Part 2 — Environment
- Host: dedicated Docker container or cloud VM (not the user's personal desktop)
- Display: Xvfb virtual display at fixed resolution (1280x800 recommended — balances detail vs. token cost)
- Browser: Chrome/Firefox launched in a pristine profile every run
- Network: egress allowlist restricted to the domains needed to scrape product listings on Amazon
- Filesystem: scratch dir; no access to secrets outside env
Write the Dockerfile + launch script.
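A sketch of the container setup, assuming a Debian-based image and a uv-managed `pyproject.toml`; package names and pins are assumptions and should be verified against the chosen base image:

```dockerfile
# Sketch only: package names and the CMD chain are assumptions.
FROM python:3.11-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
        xvfb chromium \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . .
RUN pip install uv && uv sync
# Fixed-resolution virtual display, then a pristine-profile browser.
ENV DISPLAY=:99
CMD Xvfb :99 -screen 0 1280x800x24 & \
    chromium --user-data-dir=/tmp/profile --no-first-run & \
    uv run agent run --goal="scrape product listings on Amazon" --max-actions=50
```

The throwaway `--user-data-dir` gives the "pristine profile every run" property; the egress allowlist would be enforced outside the container (e.g. Docker network policy), not in this file.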
## Part 3 — The perception-action loop
Pseudocode the loop:
1. Take screenshot
2. Send to o3-mini with tools: screenshot, click, type, scroll, key, wait
3. Model returns next action(s)
4. Execute action with guardrails (below)
5. Loop until success predicate OR budget exhausted OR safety trip
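The five steps above can be sketched as a driver function. `model`, `executor`, and `success` are hypothetical callables standing in for the real client, action runner, and Part 1 predicate:

```python
import time

def run_loop(model, executor, success, max_actions=50, budget_s=300.0):
    """Perception-action loop sketch. model(screenshot) -> list of actions,
    executor.run(action) -> outcome string, success() -> bool.
    All three interfaces are assumptions for illustration."""
    start = time.monotonic()
    for _step in range(max_actions):
        if time.monotonic() - start > budget_s:
            return "budget_exhausted"
        shot = executor.screenshot()          # 1. perceive
        actions = model(shot)                 # 2-3. plan next action(s)
        for action in actions:
            outcome = executor.run(action)    # 4. act (guardrails inside)
            if outcome == "safety_trip":
                return "safety_trip"
        if success():                         # 5. check the done predicate
            return "success"
    return "max_actions_exhausted"
```

Every exit path returns a distinct reason string, which feeds directly into the failure categories measured in the eval.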
## Part 4 — Action guardrails
Each action passes through middleware:
- **Bounds check**: click coordinates inside the screen? Click inside the intended app?
- **Rate limit**: no more than N actions/sec (humans don't machine-gun clicks)
- **Destructive detection**: does the click land on "Delete", "Pay", "Submit $", a confirm dialog? → require explicit high-confidence OR human confirmation
- **Off-task detection**: did we navigate somewhere unrelated to scraping product listings on Amazon?
- **Loop detection**: same screenshot → same action → same result three times in a row → break out
Write the middleware.
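A sketch of that middleware, covering bounds, rate limiting, destructive-word detection, and loop detection. The screen size, rate threshold, keyword list, and the `target_text` field on actions are all assumptions:

```python
import time

DESTRUCTIVE_WORDS = ("delete", "pay", "submit", "confirm purchase")

class ActionGuard:
    """Guardrail middleware sketch. check() returns 'allow', 'confirm'
    (needs human sign-off), or 'block'; record() flags repeated
    screenshot/action pairs so the driver can break out of loops."""
    def __init__(self, width=1280, height=800, max_rate=3.0):
        self.width, self.height = width, height
        self.min_interval = 1.0 / max_rate
        self.last_ts = 0.0
        self.history = []                    # (screenshot_hash, action) pairs

    def check(self, action, now=None):
        now = time.monotonic() if now is None else now
        if action["type"] == "click":
            x, y = action["x"], action["y"]
            if not (0 <= x < self.width and 0 <= y < self.height):
                return "block"               # bounds check
            label = action.get("target_text", "").lower()
            if any(w in label for w in DESTRUCTIVE_WORDS):
                return "confirm"             # destructive detection
        if now - self.last_ts < self.min_interval:
            return "block"                   # rate limit
        self.last_ts = now
        return "allow"

    def record(self, screenshot_hash, action):
        """True when the same screenshot + action repeats 3x in a row."""
        self.history.append((screenshot_hash, repr(action)))
        tail = self.history[-3:]
        return len(tail) == 3 and len(set(tail)) == 1
```

Off-task detection is deliberately absent here; it needs the model's own judgment (or a URL allowlist) rather than a coordinate check.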
## Part 5 — Handling unexpected UI
Real-world sites will throw curveballs. Design responses to:
- Cookie consent banners (auto-dismiss with "Reject all" if possible, else dismiss manually via the banner's own controls)
- Login walls (if credentials not provided → stop and ask; never guess)
- CAPTCHA (stop and hand off — never try to bypass)
- 2FA (stop and hand off)
- Rate limit pages ("too many requests") — back off, don't retry
- A/B'd UI variants (the screenshot the agent expected isn't what it sees) — re-plan, don't force
- Modal dialogs (handle or dismiss explicitly)
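These curveballs reduce to a small policy table. The state names assume a hypothetical screen-state classifier upstream; anything unrecognized falls back to re-planning rather than forcing an action:

```python
# Classified screen state -> (recovery action, rationale). All names are
# assumptions; the classifier producing these labels is not shown here.
RECOVERY = {
    "cookie_banner":  ("dismiss", "Click 'Reject all' if visible"),
    "login_wall":     ("handoff", "Stop and ask the user; never guess credentials"),
    "captcha":        ("handoff", "Stop and hand off; never attempt a bypass"),
    "two_factor":     ("handoff", "Stop and hand off to the user"),
    "rate_limited":   ("backoff", "Back off with increasing delay; don't retry"),
    "unknown_layout": ("replan",  "Take a fresh screenshot and re-plan"),
    "modal_dialog":   ("dismiss", "Handle or close the dialog explicitly"),
}

def recover(screen_state: str):
    """Map a classified screen state to its recovery policy, defaulting
    to re-planning for anything unrecognized."""
    return RECOVERY.get(screen_state, ("replan", "Unrecognized state; re-plan"))
```

Keeping the policy declarative makes the safety-relevant cases (CAPTCHA, 2FA) easy to audit in one place.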
## Part 6 — Safety gates
Before any action in these categories, require explicit confirmation:
- Payment / checkout completion
- Account creation
- Sending messages to humans (email, DM)
- Deleting data
- Changing account settings / passwords
The confirmation channel is either the calling user (Slack DM, UI prompt) or a supervisor agent with a stricter rubric.
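A minimal gate sketch: the gated categories mirror the list above, and `ask_human` is a hypothetical callable (a Slack DM prompt, a UI dialog, or a supervisor agent) returning True/False:

```python
GATED_CATEGORIES = {"payment", "account_creation", "send_message",
                    "delete_data", "change_settings"}

def gate(action_category: str, ask_human) -> bool:
    """Return True if the action may proceed. Non-gated categories pass
    without confirmation; gated ones require an explicit yes from the
    confirmation channel. `ask_human` is an assumed interface."""
    if action_category not in GATED_CATEGORIES:
        return True
    return bool(ask_human(f"Agent wants to perform: {action_category}. Allow?"))
```

Failing closed is the important property: a gated action with no answer (or a falsy one) never proceeds.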
## Part 7 — Memory within a task
Track:
- Initial goal (scrape product listings on Amazon) and sub-goals
- Actions taken (for retry avoidance)
- Facts discovered from the screen (e.g. "confirmation #A123")
- Stuck-counter: if no progress in N screenshots, escalate
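The four items above fit in a small state object; the stuck-limit of 5 screenshots is an assumed default:

```python
from dataclasses import dataclass, field

@dataclass
class TaskMemory:
    """In-task memory sketch: goal, sub-goals, action log, screen facts,
    and a stuck counter that triggers escalation."""
    goal: str
    subgoals: list = field(default_factory=list)
    actions: list = field(default_factory=list)      # for retry avoidance
    facts: dict = field(default_factory=dict)        # e.g. {"confirmation": "A123"}
    stuck_count: int = 0
    stuck_limit: int = 5                             # assumption: escalate after 5

    def note_screenshot(self, progressed: bool) -> bool:
        """Update the stuck counter; return True when escalation is due."""
        self.stuck_count = 0 if progressed else self.stuck_count + 1
        return self.stuck_count >= self.stuck_limit
```

Any progress resets the counter, so only a genuinely stalled run escalates.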
## Part 8 — Recording + observability
- Save every screenshot with the action taken (for debugging + audit)
- Structured trace: timestamp, screenshot hash, action, outcome
- Redact PII from stored artifacts
- Cost meter (screenshots are expensive — each is ~1200+ tokens)
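One JSONL line per step covers the trace requirements. The email-only regex redaction is a deliberately crude sketch; real PII scrubbing needs a proper pass over both text and screenshots:

```python
import hashlib
import json
import re
import time

def trace_record(screenshot_bytes: bytes, action: dict, outcome: str) -> str:
    """Build one structured trace line: timestamp, screenshot hash,
    (redacted) action, outcome. Redaction here handles emails only,
    as a placeholder for a fuller PII pass."""
    def redact(text: str) -> str:
        return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "<redacted-email>", text)
    record = {
        "ts": time.time(),
        "screenshot_sha256": hashlib.sha256(screenshot_bytes).hexdigest(),
        "action": {k: redact(v) if isinstance(v, str) else v
                   for k, v in action.items()},
        "outcome": outcome,
    }
    return json.dumps(record)
```

Hashing the screenshot keeps the trace line small and doubles as the input to the loop detector, while the raw image is saved separately for audit.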
## Part 9 — Eval
Run the scrape-product-listings-on-Amazon task 30 times, each on a fresh environment. Measure:
- Success rate
- Avg actions to complete
- Avg cost
- Failure categories (login, CAPTCHA, off-task, loop, UI-changed)
Ship criteria: ≥80% success OR reliable human-handoff on failure.
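Aggregating those four metrics plus the ship criterion might look like this; the per-run dict shape is an assumption:

```python
from collections import Counter

def summarize(runs: list) -> dict:
    """Aggregate eval runs. Each run is assumed to look like
    {"success": bool, "actions": int, "cost": float, "failure": str|None}."""
    n = len(runs)
    successes = [r for r in runs if r["success"]]
    return {
        "success_rate": len(successes) / n,
        # Average actions over successful runs only; failed runs just
        # burned their budget and would skew the number.
        "avg_actions": sum(r["actions"] for r in successes) / max(len(successes), 1),
        "avg_cost": sum(r["cost"] for r in runs) / n,
        "failures": Counter(r["failure"] for r in runs if not r["success"]),
        "ship": len(successes) / n >= 0.8,
    }
```

The `failures` counter maps directly onto the failure categories above, so a regression in any one category is visible at a glance.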
## Part 10 — Implementation
Write the full agent:
- Loop driver
- Safety middleware
- o3-mini client with computer-use tools wired
- Success predicate checker
- Run logger
- CLI entry: `agent run --goal="scrape product listings on Amazon" --max-actions=50`
Produce real, runnable code, not pseudocode.