Observe
Read current page state, visible text, forms, actions, tabs, and blockers.
The web has never been agent-native. Until now.
WebCLI is your agent's local browser loop: observe page state, enumerate actions, act by numbers, recover from blockers, pause for human help, and leave a redacted transcript.
web open http://localhost:3000 --json{ "ok": true, "url": "http://localhost:3000", "state": "complete" }
web observe --json{
"summary": "Sign-in page loaded",
"actions": ["1: Sign in", "2: Create account"],
"inputs": ["3: email", "4: password"],
"state": "ready"
}
web type 3 cris@example.com --json{ "ok": true, "message": "typed into email" }
web do 1 --json{ "ok": true, "message": "clicked Sign in" }
The split-screen launch demo shows an agent driving GitHub from the terminal while the real browser moves beside it: create a fine-grained token for a throwaway repo, hit a passkey blocker, pause for human approval, resume, and print a redacted transcript.
web open github.com/settings/tokens --json{ "ok": true, "state": "complete" }
web observe --json{ "summary": "GitHub token settings", "actions": ["3: Generate new token"], "state": "ready" }
web do 3 --json{ "ok": true, "message": "clicked Generate new token" }
web state --json{ "state": "blocked", "reason": "passkey confirmation required" }
web pause "Need human approval for GitHub passkey"Paused. Waiting for human to join.
web resume --json{ "ok": true, "message": "resumed after human interaction" }
web transcript --last 20 --json{ "events": ["redacted transcript with blocker, pause, and resume recorded"] }
Real launch work still happens on websites: GitHub, Stripe, Cloudflare, token pages, admin dashboards, auth prompts, file choosers, popups, and changing UIs. Screenshots are useful for humans. Test frameworks are excellent for known scripts. WebCLI is for contact with reality — when an agent must inspect, decide, act, recover, and sometimes dial-a-human to help.
WebCLI turns browsing into a repeatable shell-native loop an agent can follow without guessing or hallucinating.
Read current page state, visible text, forms, actions, tabs, and blockers.
Pick from numbered actions instead of inventing selectors or coordinates.
Click, type, submit, choose, press, scroll, or navigate from the terminal.
Detect weird states, auth prompts, dialogs, file choosers, and degraded sessions.
Pause cleanly when human judgment is required. Join the session, fix it, then resume.
Print a redacted transcript so you can audit exactly what happened.
WebCLI turns the live page into numbered actions. The agent acts on the latest observed state instead of hallucinating selectors or coordinates.
web observe --json returns compact, actionable state: summary, actions, inputs, forms, tabs, and blockers — enough to decide, without dumping the whole DOM.
web state surfaces native and heuristic blockers so the agent stops blindly retrying when it hits auth, approval dialogs, file choosers, or ambiguous conditions.
web pause, web join, and web resume create a clean, auditable handoff when a workflow needs a person.
web transcript records redacted command history so you can see what the agent did without secrets leaking into logs.
One binary. JSON output. No Python or TypeScript framework required. Works with Claude Code, Cursor, Aider, OpenHands, custom shell loops, or CI.
Starts with your local browser, your machine, and named persistent profiles when your agent needs continuity across runs.
Open, observe, read, find, click, type, choose, submit, press keys, scroll, capture, tab management, and page evaluation.
When a site needs a person — passkeys, MFA, sensitive approvals, or ambiguous decisions — the agent can pause cleanly instead of guessing or looping.
Connect to CDP endpoints or BrowserBox sessions for shared, remote, or isolated browser surfaces.
Run web agents-md to generate compact rules your agent can follow: observe first, use numbered actions, prefer JSON, recover cleanly, and produce transcripts.
When a workflow reaches passkeys, MFA, sensitive approvals, or ambiguous states, the agent can pause instead of pretending it can continue.
Transcripts show what happened without dumping secrets, tokens, passwords, or sensitive field values.
Start on your own machine and browser context. Move to runners or BrowserBox-backed sessions when the workflow needs it.
Use Playwright or Cypress when you know the app and the script. Use WebCLI when an agent must inspect an unknown or changing website, decide what to do, act, observe again, and recover without writing a full test suite first.
Screenshots are useful for human verification. But weak as the primary control loop for your agent friends — shots are token-heavy, easy to misread, and disconnected from actionable page state. WebCLI gives agents structured state and stable numbered actions.
MCP is useful when you want a tool server. WebCLI is a local binary optimized for shell-based agents, terminals, scripts, and CI. They complement each other.
Those are frameworks for building agents inside specific stacks. WebCLI is the shell-native layer: one binary any coding agent (or human) can use to drive web actions without adopting a framework.
No. WebCLI is designed to detect blockers and create a clean human handoff. When a workflow needs a person, the agent pauses, you join, and the transcript records what happened.
WebCLI is built around redacted transcripts and explicit human handoff. For sensitive workflows, pause for human approval instead of letting the agent run unsupervised.
WebCLI is a paid developer tool with a short full-feature trial. Try the browser loop locally, then activate a license to keep your agents operating real websites.
Full local trial for evaluating WebCLI with a verified work email.
Using a free email provider, disposable email, or an org that has already used its 3 free trials? Start a paid trial pass, credited toward Solo Dev.
For one developer using WebCLI commercially with local agents.
For headless, CI, multi-machine, and production agent workflows.
Runner trial pass available for blocked trial domains and credited toward Pro Runner.
For redistribution, bundling, team platforms, and BrowserBox-backed integrations.
When the trial ends or a license is invalid, browser commands stop until a valid license is activated.
Drop WebCLI instructions into your repo so your coding agent knows how to browse safely: observe first, use numbered actions, prefer JSON, pause on blockers, and report with transcripts. AGENTS.md and SKILL.md formats are included.
web agents-md > AGENTS.mdweb skill-md > SKILL.mdcurl -fsSL webcli.sh/agents/SKILL.md -o SKILL.md