WebCLI — Full Self Browsing for coding agents

A browser loop your agent can drive from the CLI.

$ web open http://localhost:3000 --json

{ "ok": true, "url": "http://localhost:3000", "state": "complete" }

$ web observe --json

{
  "summary": "Sign-in page loaded",
  "actions": ["1: Sign in", "2: Create account"],
  "inputs": ["3: email", "4: password"],
  "state": "ready"
}

$ web type 3 cris@example.com --json

{ "ok": true, "message": "typed into email" }

$ web do 1 --json

{ "ok": true, "message": "clicked Sign in" }

What if a web browser was a CLI?

On the left: an agent using shell commands. On the right: a real browser moving through real software. WebCLI translates the page into Agentish: state, actions, forms, blockers, tabs, and transcripts.

Split-screen demo: agent terminal on the left, browser being operated on the right.

$ web open github.com/settings/tokens --json

{ "ok": true, "state": "complete" }

$ web observe --json

{ "summary": "GitHub token settings", "actions": ["3: Generate new token"], "state": "ready" }

$ web do 3 --json

{ "ok": true, "message": "clicked Generate new token" }

$ web state --json

{ "state": "blocked", "reason": "passkey confirmation required" }

$ web pause "Need human approval for GitHub passkey"

Paused. Waiting for human to join.

$ web resume --json

{ "ok": true, "message": "resumed after human interaction" }

$ web transcript --last 20 --json

{ "events": ["redacted transcript with blocker, pause, and resume recorded"] }

Demo lab

What if a web browser was a CLI?

Five split-screen demos: terminal on one side, real browser on the other. The agent reads the web in Agentish, drives the browser through WebCLI, and leaves a transcript.

Full Self Browsing field report

Gemini creates an Azure VM

We gave Gemini Pro 3 Flash WebCLI and said: go to Azure and create a VM. No Azure-specific script. No prewritten Playwright flow. It figured out the portal.

Read report

Experimental BrowserBox integration

Human takeover without losing browser state

A remote browser workflow reaches a human-only gate. The agent pauses, a human joins the same live session through BrowserBox, unblocks the workflow, and hands control back.

Report coming soon

The boss fight

Can an agent drive a visual canvas?

Use WebCLI to create a FigJam diagram titled Full Self Browsing with nodes for Agent, WebCLI, Browser, and Website, then export or share the result.

Report coming soon

Dogfooding the release

The launch workflow: GitHub, Stripe, passkeys, transcript

An agent drives real launch surfaces from the terminal: GitHub token settings, Stripe setup, passkey blockers, human approval, resume, and transcript.

Report coming soon

Real software ships on websites

Ship through the dashboard

An agent uses WebCLI to operate a real deployment dashboard: inspect settings, update configuration, check deployment state, and report what changed.

Report coming soon

Browse like you mean it

WebCLI + your agents = Full Self Browsing.

A browser your agent already knows how to use.

WebCLI turns a real browser into an agent-operable surface: structured page state, numbered actions, tabs, forms, blocker detection, human handoff, and redacted transcripts. Not a screenshot loop. Not a hand-written Playwright script. A browser loop agents can reason through.

Translate messy live websites into Agentish.
Let agents observe the page, choose the next move, and act through a real browser.
When the road gets weird, pause, join, resume, and audit the drive.

Field Report: In an internal field test, we gave an agent WebCLI and asked it to create an Azure VM. It figured out the portal workflow through the command line — without a hand-coded Azure script or prewritten Playwright flow.

Agent-native web

Let the web speak Agentish.

WebCLI translates live websites into the language agents already understand: structured state, numbered actions, browser context, blockers, and transcripts.

The web speaks pages, pixels, scripts, frames, forms, popups, and human affordances. WebCLI turns that into an agent-native browser loop: observe, choose, act, recover, hand off, and report.

Pages become observable state.
Buttons and fields become numbered actions.
Tabs, frames, dialogs, popovers become inspectable browser surfaces.
Passkeys, MFA, file choosers, ambiguity become blockers and handoff.
Agent/browser history becomes redacted transcript.

The human web		Agentish
Visible page and browser context	→	`structured state`
Buttons, links, inputs, menus	→	`numbered actions`
Tabs, frames, dialogs, popovers	→	`inspectable browser surfaces`
Passkeys, MFA, file choosers, ambiguity	→	`blockers and handoff`
Agent/browser history	→	`redacted transcript`

AIcessability

Agents deserve accessibility, too.

Your agent does not experience the web as accessible. WebCLI fixes that.

Modern websites are often beautiful for humans and hard for agents. WebCLI adds the missing affordances: enhanced web perception, readable state, numbered actions, blocker detection, and clean human handoff when a workflow needs judgment.

Screenshots are not accessibility.
Selectors are not understanding.
A real browser loop gives agents usable perception.

Enhanced web perception

Give your agent night vision for the web.

WebCLI helps agents see the web in the dark: page structure, visible text, forms, actions, tabs, blockers, and state.

Your agent already knows how to reason and act. The missing piece is perception. WebCLI gives agents a compact, actionable view of the current browser state so they can stop guessing from pixels and start driving the page.

Structured page sensing, not just screenshots.
Numbered actions, not brittle selector guesses.
Blocker detection and human handoff when the web stops being machine-readable.

Agent field report

We gave an agent WebCLI. It created an Azure VM.

We gave Gemini Pro 3 Flash the WebCLI surface and said: go to Azure and create a virtual machine. No Azure-specific script. No prewritten Playwright flow. It figured out the portal.

Azure Portal is beautiful for humans and hard for agents: Fluent UI, multiple iframes, dynamic blades, long setup flows, and destructive cloud controls. WebCLI made it operable from the command line by turning the browser into observable state, numbered actions, blocker signals, and transcriptable steps.

"Go to Azure and create a VM."
— Prompt given to Gemini Pro 3 Flash

Result: The agent navigated the portal workflow through WebCLI's observe / inspect / do loop and completed the task from the command line.

The agent navigated a real enterprise web app.
It operated through WebCLI's observe / inspect / do loop.
It handled the portal as a browser workflow, not a prewritten cloud script.

Agents code. They search. Then they hit the web and the interface stops speaking their language.

Real launch work still happens on websites: GitHub, Stripe, Cloudflare, Azure, token pages, admin dashboards, auth prompts, file choosers, popups, and changing UIs. Screenshots are useful for humans. Test frameworks are excellent for known scripts. WebCLI is for contact with reality — when an agent must inspect, decide, act, recover, and sometimes dial-a-human to help.

Agent field reports

We asked the agents what changed.

Not customer testimonials. Not analyst quotes. Field notes from agents trying the browser loop they were missing.

"The tool provides excellent ways to drive complex web action sagas."

— Gemini (Complex workflows)

"Numbered actions reduce ambiguity and make recovery easier after page changes."

— Agent assessment (Less guessing)

"The pause/resume flow gives the agent a safer failure mode than silent retries."

— Agent assessment (Human handoff)

The browser loop agents were missing.

WebCLI turns browsing into a repeatable shell-native loop an agent can follow without guessing, hallucinating, or staring at screenshots.

01

Observe

Read current page state, visible text, forms, actions, tabs, and blockers.

02

Choose

Pick from numbered actions instead of inventing selectors or coordinates.

03

Act

Click, type, submit, choose, press, scroll, or navigate from the terminal.

04

Recover

Detect weird states, auth prompts, dialogs, file choosers, and degraded sessions.

05

Handoff

Pause cleanly when human judgment is required. Join the session, fix it, then resume.

06

Report

Print a redacted transcript so you can audit exactly what happened.
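
The six steps above can be sketched as a small driver loop. This is a minimal Python sketch, not WebCLI's own code: `drive`, `Runner`, and `Chooser` are hypothetical names, the runner is assumed to execute one WebCLI command and return its parsed JSON, and the `state` and `reason` fields follow the sample output shown earlier on this page.

```python
from typing import Callable, Optional

# Hypothetical runner: takes WebCLI arguments such as ["observe"] and
# returns the parsed JSON reply as a dict.
Runner = Callable[[list[str]], dict]
# Hypothetical policy: inspects an observation and returns the number of
# the action to take, or None when the task is finished.
Chooser = Callable[[dict], Optional[int]]


def drive(run: Runner, choose: Chooser, max_steps: int = 20) -> dict:
    """Observe, choose, act; stop on a blocker, on completion, or at max_steps."""
    for step in range(max_steps):
        state = run(["state"])
        if state.get("state") == "blocked":
            # Recover: surface the blocker instead of retrying blindly.
            return {"outcome": "blocked", "reason": state.get("reason"), "steps": step}
        observation = run(["observe"])
        action = choose(observation)
        if action is None:
            return {"outcome": "done", "steps": step}
        run(["do", str(action)])
    return {"outcome": "step_limit", "steps": max_steps}
```

Injecting the runner keeps the loop testable with canned replies before it touches a live browser. Handoff and reporting would hang off the returned outcome.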

From stuck agent to self-driving browser loop.

01

Agents stop guessing where to click.

WebCLI turns the live page into numbered actions. The agent acts on the latest observed state instead of hallucinating selectors or coordinates.
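
As an illustration, an agent-side helper could turn the numbered entries into a lookup table and pick an action by its label. This sketch assumes the `"N: label"` string format seen in the sample `observe` output on this page; the real output may differ, and `parse_numbered` and `find_action` are hypothetical names.

```python
from typing import Optional


def parse_numbered(entries: list[str]) -> dict[int, str]:
    """Turn entries like "1: Sign in" into {1: "Sign in"}."""
    parsed = {}
    for entry in entries:
        number, _, label = entry.partition(":")
        parsed[int(number)] = label.strip()
    return parsed


def find_action(observation: dict, label: str) -> Optional[int]:
    """Return the number of the first action whose label matches, ignoring case."""
    for number, text in parse_numbered(observation.get("actions", [])).items():
        if text.lower() == label.lower():
            return number
    return None
```

The returned number is what would be passed to `web do`. Because numbers come from the latest observation, a fresh `observe` should precede each lookup.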

02

Agents get enhanced web perception.

web observe --json returns compact, actionable state: summary, actions, inputs, forms, tabs, and blockers — enough to decide, without dumping the whole DOM or staring at screenshots.

03

Agents know when they are blocked.

web state surfaces native and heuristic blockers so the agent stops blindly retrying when it hits auth, approval dialogs, file choosers, or ambiguous conditions.

04

Agents ask for help instead of failing.

web pause, web join, and web resume create a clean, auditable handoff when a workflow needs a person.
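
A handoff could be wrapped in one function so the agent never acts while blocked. This is a hedged sketch: `hand_off_if_blocked` is a hypothetical name, `run_json` and `run_plain` are assumed runners for commands that reply with JSON and plain text respectively, and `wait_for_human` stands in for whatever signal tells the agent a person has finished, which this page does not specify.

```python
from typing import Callable


def hand_off_if_blocked(
    run_json: Callable[[list[str]], dict],
    run_plain: Callable[[list[str]], str],
    wait_for_human: Callable[[], None],
) -> str:
    """Pause on a blocker, wait for a person, then resume.

    Returns "not_blocked", "resumed", or "resume_failed".
    """
    state = run_json(["state"])
    if state.get("state") != "blocked":
        return "not_blocked"
    reason = state.get("reason", "unknown blocker")
    # The page shows `web pause` replying in plain text, so the reply is ignored.
    run_plain(["pause", f"Need human help: {reason}"])
    wait_for_human()  # assumption: the caller knows when the person is done
    resumed = run_json(["resume"])
    return "resumed" if resumed.get("ok") else "resume_failed"
```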

05

Humans get an audit trail.

web transcript records redacted command history so you can see what the agent did without secrets leaking into logs.
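
For an audit trail that outlives the session, the transcript could be appended to a file after each run. A minimal sketch, assuming the `{"events": [...]}` reply shape and the `--last` flag from the example earlier on this page; `archive_transcript` and the runner argument are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable


def archive_transcript(run: Callable[[list[str]], dict], path: Path, last: int = 20) -> int:
    """Append the most recent transcript events to a JSON Lines file; return the count."""
    reply = run(["transcript", "--last", str(last)])
    events = reply.get("events", [])
    with path.open("a", encoding="utf-8") as handle:
        for event in events:
            # One JSON value per line keeps the file easy to grep and diff.
            handle.write(json.dumps(event) + "\n")
    return len(events)
```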

A browser workbench for agents — not another testing SDK.

01

Agent-native by default

A browser surface shaped for agents: shell-native commands, JSON output, persistent profiles, numbered actions, and a loop simple enough for general-purpose agents to learn.

02

Shell-native

One binary. JSON output. No Python or TypeScript framework required. Works with Claude Code, Cursor, Aider, OpenHands, custom shell loops, or CI.
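
Because the interface is a process and its JSON output, driving it from a script needs nothing beyond a standard library. A minimal Python sketch, assuming the binary is on PATH as `web`, that `--json` can be passed as the last argument as in the examples on this page, and that failures produce a non-zero exit code (an assumption; the page does not say). `run_web` is a hypothetical helper name.

```python
import json
import subprocess


def run_web(args: list[str], binary: str = "web", timeout: float = 60.0) -> dict:
    """Run one WebCLI command with --json and return the parsed reply."""
    completed = subprocess.run(
        [binary, *args, "--json"],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # assumption: a failed command exits non-zero
    )
    return json.loads(completed.stdout)
```

With the binary installed, a call might look like `run_web(["open", "http://localhost:3000"])`.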

03

Local-first

Starts with your local browser, your machine, and named persistent profiles when your agent needs continuity across runs.

04

Real browser operations

Open, observe, inspect, read, find, click, type, choose, submit, press keys, scroll, capture, manage tabs, evaluate pages, detect state, and record action transcripts.

05

Human handoff built in

When a site needs a person — passkeys, MFA, sensitive approvals, or ambiguous decisions — the agent can pause cleanly instead of guessing or looping.

06

Experimental BrowserBox human takeover

BrowserBox integration is experimental, but it is the clearest path for remote browser workflows that need human input. When an agent gets stuck, a human can join the same live browser session through a local client, unblock the workflow, and hand control back without losing browser state.

07

Agent instructions included

Run web agents-md to generate compact rules your agent can follow: observe first, use numbered actions, prefer JSON, recover cleanly, and produce transcripts.

Built for real auth, not fake autonomy.

01

Stops at human-only gates

When a workflow reaches passkeys, MFA, sensitive approvals, or ambiguous states, the agent can pause instead of pretending it can continue.

02

Redacts what should not be logged

Transcripts show what happened without dumping secrets, tokens, passwords, or sensitive field values.

03

Local first

Start on your own machine and browser context. Move to runners or BrowserBox-backed sessions when the workflow needs it.

04

Self-driving does not mean unsupervised

WebCLI is built around observable state, explicit commands, redacted transcripts, and human handoff. High-value or sensitive workflows should pause for human approval.

Why not just...

What is Full Self Browsing?

Full Self Browsing is the WebCLI product metaphor for agent-operable browsing: live browser state translated into structured observations, numbered actions, recoverable blockers, human handoff, and transcripts. It does not mean agents should bypass human judgment or run sensitive workflows unsupervised.

What do you mean by AIcessability?

AIcessability means making the web operable for agents. Humans get visual layout, affordances, cursor feedback, memory, and judgment. WebCLI gives agents a structured browser loop: readable state, actions, forms, blockers, tabs, transcripts, and handoff.

Why thumbnail demos instead of raw YouTube embeds?

The landing page should stay fast and conversion-focused. Demo cards use strong thumbnails first, then open a local demo page or lightweight YouTube facade on click. That keeps the story, transcript, trial CTA, and proof context on WebCLI while still using YouTube for distribution.

Why not just Playwright or Cypress?

Use Playwright or Cypress when you know the app and the script. Use WebCLI when an agent must inspect an unknown or changing website, decide what to do, act, observe again, and recover without writing a full test suite first.

Why not just screenshots?

Screenshots are useful for human verification, but they are weak as the primary control loop for an agent: they are token-heavy, easy to misread, and disconnected from actionable page state. WebCLI gives agents enhanced web perception: structured state, stable numbered actions, and blocker awareness.

Why not just MCP?

MCP is useful when you want a tool server. WebCLI is a local binary optimized for shell-based agents, terminals, scripts, and CI. They complement each other.

Why not Stagehand, Browser Use, or other browser-agent SDKs?

Those are frameworks for building agents inside specific stacks. WebCLI is the shell-native layer: one binary any coding agent (or human) can use to drive web actions without adopting a framework.

Does it bypass CAPTCHAs or auth?

No. WebCLI is designed to detect blockers and create a clean human handoff. When a workflow needs a person, the agent pauses, you join, and the transcript records what happened.

Is this safe for secrets?

WebCLI is built around redacted transcripts and explicit human handoff. For sensitive workflows, pause for human approval instead of letting the agent run unsupervised.

What is Agentish?

Agentish is our shorthand for the language agents can actually reason over: structured state, numbered actions, tabs, forms, blockers, and transcripts. WebCLI translates messy live websites into Agentish.

Is BrowserBox required?

No. WebCLI is local-first. BrowserBox integration is experimental and useful when browser workflows run remotely and a human needs to join the live session to unblock the agent.

Try first. Take it to the stars when you need it.

Start a 5-day trial with a work email. If the email is from a free or disposable provider, or your organization has already used its 3 free trials, WebCLI automatically offers the $10 trial pass instead.

Trial

$0 / 5 days

Enter an email. Work domains under the org cap get a free activation key; free providers, disposable providers, and capped orgs get the $10 5-day trial-pass checkout.

Observe, read, find, click, type, and do
Pause, join, and resume
Redacted transcripts
Persistent local profiles
Up to 3 free work-email trials per organization domain

Free trial when the email belongs to a real organization. Otherwise the server creates a $10 checkout for the same 5-day evaluation.

Solo Dev

$120 / year

For one developer using WebCLI commercially with local agents.

Commercial local use
Unlimited local browser actions
Persistent browser profiles
Redacted transcripts
Personal machines

Pro Runner

$480 / year

For headless, CI, multi-machine, and production agent workflows.

CI and headless runner use
Multi-machine activation
Higher concurrency
Production automation workflows
Runner-oriented logging and diagnostics

Need to evaluate from a free email provider or an org that hit the free-trial cap? Use the $10 5-day trial pass from the trial form.

Platform

Starts at $5k / year

For redistribution, bundling, team platforms, and BrowserBox-backed integrations.

Redistribution and bundling rights
Platform integration
BrowserBox-backed shared sessions
Policy and deployment support
Custom terms available

Talk to founders

When a trial ends or a license is invalid, browser commands stop until a valid trial pass or paid license is activated.

Add the browser loop to your agent.

Drop WebCLI instructions into your repo so your coding agent knows how to browse safely: observe first, use numbered actions, prefer JSON, pause on blockers, ask for human help when needed, and report with transcripts. AGENTS.md and SKILL.md formats are included.

web agents-md > AGENTS.md
web skill-md > SKILL.md
curl -fsSL webcli.sh/agents/SKILL.md -o SKILL.md

Download AGENTS.md
Download SKILL.md
Download skill bundle

Let the web speak Agentish.

Install WebCLI
Watch demos

Give your coding agent a browser it can actually operate.
