Claude Prompt for Computer Use & Browser Agents
Record a human performing a competitor-price-monitoring workflow once; replay it autonomously with an LLM-backed resilience layer. Covers capture, parameterization, and drift detection.
Build a record-and-replay browser agent for monitoring competitor pricing. A human performs the workflow once; the system captures it as a reusable recipe; Gemini 2.5 Pro replays it with resilience to UI drift. Runtime: Deno 2.
Pure recording is brittle (a single CSS change breaks it). Pure LLM agents are expensive and non-deterministic. A hybrid approach wins.
## Part 1 — Recording phase
Capture from a human session:
- DOM snapshots before each action
- Actions taken (click, type, navigate, wait)
- Target element: a11y role + name + text content + path (not just CSS selector)
- Input parameters (which fields are literals and which are variables for this workflow)
Write the Chrome extension / Playwright codegen wrapper that captures this into a structured recipe file.
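A minimal sketch of the per-action capture, assuming a pared-down `ElementInfo` view of the target node; in the real extension these fields would come from the accessibility tree and DOM APIs, and `surrounding_text` would include neighboring nodes rather than just the element's own text:

```typescript
// Hypothetical, pared-down view of the DOM node at the moment of the action.
interface ElementInfo {
  role: string;            // computed ARIA role, e.g. "button"
  accessibleName: string;  // computed accessible name
  textContent: string;     // visible text near/inside the element
  domPath: string;         // e.g. "main > form > button:nth-of-type(2)"
}

interface TargetDescriptor {
  role: string;
  accessible_name: string;
  surrounding_text: string;
  dom_path_hint: string;
}

// Capture several redundant signals so replay can survive any one of them
// changing (the whole point of not relying on a single CSS selector).
function buildTargetDescriptor(el: ElementInfo): TargetDescriptor {
  return {
    role: el.role,
    accessible_name: el.accessibleName,
    surrounding_text: el.textContent.trim().slice(0, 200),
    dom_path_hint: el.domPath,
  };
}
```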
## Part 2 — Recipe schema
```
{
  "name": "monitor competitor pricing",
  "parameters": [{ "name": "...", "description": "...", "type": "..." }],
  "steps": [
    {
      "id": "step-1",
      "intent": "<one-line description of what this step does>",
      "action": { "type": "click|fill|select|wait|goto", ... },
      "target": {
        "role": "button",
        "accessible_name": "Submit order",
        "surrounding_text": "...",
        "dom_path_hint": "..."
      },
      "post_condition": "<machine-checkable: url matches, element appears, api called>"
    }
  ]
}
```
Every step has an intent, an action, a flexible target, and a post-condition.
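The schema can be mirrored as TypeScript types plus a structural check. This is a sketch; the specific validation rules (e.g. that only element-level actions require a target) are assumptions beyond what the schema states:

```typescript
// Types mirroring the JSON recipe schema above.
type Action =
  | { type: "click" }
  | { type: "fill"; value: string }
  | { type: "select"; value: string }
  | { type: "wait"; ms: number }
  | { type: "goto"; url: string };

interface Target {
  role: string;
  accessible_name: string;
  surrounding_text: string;
  dom_path_hint: string;
}

interface Step {
  id: string;
  intent: string;
  action: Action;
  target?: Target; // goto/wait steps have no element target
  post_condition: string;
}

interface Recipe {
  name: string;
  parameters: { name: string; description: string; type: string }[];
  steps: Step[];
}

// Structural check: returns a list of problems (empty means valid).
function validateRecipe(r: Recipe): string[] {
  const errors: string[] = [];
  if (!r.name) errors.push("recipe needs a name");
  r.steps.forEach((s, i) => {
    if (!s.intent) errors.push(`step ${i}: missing intent`);
    if (!s.post_condition) errors.push(`step ${i}: missing post_condition`);
    if (s.action.type !== "goto" && s.action.type !== "wait" && !s.target)
      errors.push(`step ${i}: element actions need a target`);
  });
  return errors;
}
```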
## Part 3 — Parameterization
During recording, the user marks which inputs are parameters. E.g. for "book flights on Google Flights" the recipe parameters might be `origin`, `destination`, `date`. Write the UI for marking + the substitution logic at replay time.
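The substitution half can be sketched as a simple placeholder expansion; the `{{name}}` placeholder syntax is an assumption, not something the recipe schema fixes:

```typescript
// Replay-time parameter substitution: recorded values marked as parameters
// are stored as "{{name}}" placeholders and filled in per run.
function substituteParams(
  value: string,
  params: Record<string, string>,
): string {
  return value.replace(/\{\{(\w+)\}\}/g, (_match, name: string) => {
    if (!(name in params)) throw new Error(`missing parameter: ${name}`);
    return params[name];
  });
}
```

Failing loudly on a missing parameter (rather than leaving the placeholder in place) keeps a half-substituted value from being typed into a live form.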
## Part 4 — Replay with resilience
For each step at replay:
1. Try the recorded target (`dom_path_hint`) — if it resolves, execute
2. If it doesn't resolve, query the page with a11y tree + the step's `intent` and ask Gemini 2.5 Pro to find the right element
3. Execute the action
4. Verify `post_condition` — if it fails, retry with LLM re-planning
5. If 3 retries fail, pause for human takeover and mark the step as drifted
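The five steps above can be sketched as one control loop. Page interaction and the Gemini call are injected as callbacks (all three are hypothetical), and the loop is shown synchronously for brevity; a real replay engine would be async:

```typescript
type StepResult = "ok" | "drifted";

function replayStep(
  step: { intent: string; dom_path_hint: string },
  deps: {
    resolveRecorded: (hint: string) => boolean; // 1. try the recorded path
    resolveViaLLM: (intent: string) => boolean; // 2. a11y tree + Gemini 2.5 Pro
    executeAndVerify: () => boolean;            // 3+4. act, check post_condition
  },
  maxRetries = 3,
): StepResult {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Cheap deterministic path first; LLM fallback only if it fails to resolve.
    const resolved =
      deps.resolveRecorded(step.dom_path_hint) ||
      deps.resolveViaLLM(step.intent);
    if (resolved && deps.executeAndVerify()) return "ok";
  }
  return "drifted"; // 5. caller pauses for human takeover and marks drift
}
```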
## Part 5 — Drift detection + recipe healing
When step N fails repeatedly:
- Log: which element shape changed, what the new a11y tree looks like
- Auto-propose a recipe update (new `dom_path_hint`) after human-verified success
- Ship the updated recipe to the recipe store
Over time, the recipe self-heals.
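A sketch of the proposal step in that healing flow; the shape of the drift report is an assumption:

```typescript
// Drift report logged when the LLM fallback located the element after the
// recorded hint failed.
interface DriftReport {
  stepId: string;
  oldHint: string;
  newHint: string;       // dom_path_hint resolved by the fallback
  humanVerified: boolean; // a human confirmed the replayed step succeeded
}

// Auto-propose a recipe patch only after human-verified success.
function proposeHealing(
  report: DriftReport,
): { stepId: string; dom_path_hint: string } | null {
  if (!report.humanVerified || report.newHint === report.oldHint) return null;
  return { stepId: report.stepId, dom_path_hint: report.newHint };
}
```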
## Part 6 — Safety
- Destructive steps (payment, delete, submit-and-charge) are flagged during recording
- At replay, destructive steps require human confirmation OR a strict pre-check
- Never replay across user boundaries without re-authentication
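A sketch of the replay-time gate for destructive steps, with the human-confirmation and pre-check hooks injected as hypothetical callbacks:

```typescript
// `destructive` is assumed to be a flag set on the step during recording.
function mayExecute(
  step: { destructive?: boolean },
  confirm: () => boolean,  // human confirmation prompt
  precheck: () => boolean, // strict machine-checkable pre-condition
): boolean {
  if (!step.destructive) return true;
  // Destructive steps need an explicit human OK or a passing strict pre-check.
  return confirm() || precheck();
}
```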
## Part 7 — Storage + versioning
- Recipe store with versions
- Diff between recipe versions (what changed, which step)
- Rollback on regression
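The version diff can be sketched by comparing steps by id; comparing step bodies via JSON equality is a simplification:

```typescript
// Step-level diff between two recipe versions: which step ids were added,
// removed, or changed.
function diffRecipes(
  oldSteps: { id: string }[],
  newSteps: { id: string }[],
): { added: string[]; removed: string[]; changed: string[] } {
  const toMap = (steps: { id: string }[]) =>
    new Map(steps.map((s): [string, string] => [s.id, JSON.stringify(s)]));
  const oldById = toMap(oldSteps);
  const newById = toMap(newSteps);
  const added = Array.from(newById.keys()).filter((id) => !oldById.has(id));
  const removed = Array.from(oldById.keys()).filter((id) => !newById.has(id));
  const changed = Array.from(newById.keys()).filter(
    (id) => oldById.has(id) && oldById.get(id) !== newById.get(id),
  );
  return { added, removed, changed };
}
```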
## Part 8 — Observability
Per-run trace:
- Which steps used recorded selector vs. LLM fallback
- Step-level success rates over time
- Cost per run (how often is LLM invoked)
Dashboard: steps most frequently drifted → prioritize for recipe updates.
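A sketch of the aggregation behind that dashboard; `StepTrace` is a hypothetical per-run, per-step trace record:

```typescript
// One record per step execution per run.
interface StepTrace {
  stepId: string;
  usedLLMFallback: boolean; // false = recorded selector resolved directly
}

// Fraction of runs in which each step needed the LLM fallback; the highest
// rates are the steps drifting most and the best healing candidates.
function fallbackRates(traces: StepTrace[]): Map<string, number> {
  const totals = new Map<string, { runs: number; fallbacks: number }>();
  for (const t of traces) {
    const s = totals.get(t.stepId) ?? { runs: 0, fallbacks: 0 };
    s.runs += 1;
    if (t.usedLLMFallback) s.fallbacks += 1;
    totals.set(t.stepId, s);
  }
  const rates = new Map<string, number>();
  totals.forEach((s, id) => rates.set(id, s.fallbacks / s.runs));
  return rates;
}
```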
## Part 9 — Eval
Ship criteria for a recipe:
- Replays successfully 10x in a row on clean env
- Parameterization works across N sample inputs
- Drift recovery handles at least 1 seeded UI change
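The three criteria combine into a simple ship gate; the stats record is a hypothetical shape that an eval harness would produce:

```typescript
// Ship gate mirroring the criteria above: 10 consecutive clean replays,
// all N parameter samples passing, at least one seeded drift recovered.
function readyToShip(stats: {
  consecutiveCleanReplays: number;
  paramSamplesPassed: number;
  paramSamplesTotal: number;
  seededDriftRecovered: number;
}): boolean {
  return (
    stats.consecutiveCleanReplays >= 10 &&
    stats.paramSamplesPassed === stats.paramSamplesTotal &&
    stats.seededDriftRecovered >= 1
  );
}
```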
## Part 10 — Implementation
Deliver:
- Recording tool (Chrome extension or Playwright codegen wrapper)
- Recipe schema + storage
- Replay engine with LLM fallback
- Recipe healing pipeline
- CLI: `record`, `replay`, `heal`
- A sample recipe specific to monitoring competitor pricing
Full working code.

Replace the bracketed placeholders with your own context before running the prompt:
- `[{ "name": "...", "description": "...", "type": "..." }]`: fill in your workflow's parameter definitions (a name, description, and type for each).