Observe the current page
web observe --json gives the agent a compact view of summary text, candidate actions, inputs, forms, refs, and state.
Local-first browser control for coding agents
A local-first browser CLI for shell-based coding agents and custom agent loops. Open pages, inspect state, preserve refs, click, and type — all through token-efficient JSON, without framework lock-in.
web open http://localhost:3000 --json{ "ok": true, "url": "http://localhost:3000", "state": "complete" }
web observe --json{
"summary": "Sign-in page loaded",
"actions": ["a1: Sign in", "a2: Create account"],
"inputs": ["t1: email", "t2: password"],
"forms": ["f1: login"]
}
web type 1 cris@example.com --json{ "ok": true, "command": "type", "message": "typed into candidate 1" }
web do 2 --json{ "ok": true, "command": "act", "message": "clicked tab=1 Sign in" }
Screenshots are useful for visual confirmation, but they are a poor primary interface for browser control. Test frameworks are powerful, but often too much ceremony for exploratory agent work. SDKs require choosing a stack. Hosted browsers are not always the right default for local development.
web observe --json gives the agent a compact view of summary text, candidate actions, inputs, forms, refs, and state.
web maps interactive elements to short refs like t4, so the agent can act on the latest observed page without inventing selectors.
When a page gets weird, agents should not blindly retry clicks. web state, web pause, and web transcript make the blocker explicit.
web transcript --last 20 --json records redacted command output so humans can audit what the agent did.
Agents need to know what the page says, what can be acted on, what changed, and why they are blocked.
Every important command can return machine-readable output for coding agents, scripts, CI, and debugging.
No Python framework. No TypeScript framework. No app rewrite. A shell-based agent can use it immediately.
Run web agents-md to generate a compact instruction sheet for your agent: observe first, use refs, prefer JSON, recover cleanly.
Playwright is the gold standard for deterministic web testing. web is for a different loop: shell-driven agents that need to inspect the current page, use refs, act once, observe again, and recover without writing a test suite.
MCP is excellent when you want a tool server. web is useful when your agent can run shell commands and you want one local binary with JSON output.
Those are useful SDKs for building browser agents inside Node or Python workflows. web is a CLI, so a bash script, CI job, or coding agent terminal can use it without adopting a framework.
Screenshots are useful for human verification and visual checks. They are a weak primary control loop for agents. web gives agents dense, actionable browser state.
Hosted browsers are useful for some workloads. web starts local: your machine, your browser context, your shell, and your agent workflow.
Read-only browser understanding stays free. Licenses unlock write actions, CI, concurrency, and commercial runner workflows.
Inspect and understand local browser state before buying.
One developer using WebCLI commercially with local coding agents.
Live-payment evaluation license with Solo behavior for launch smoke tests.
Headless operation, CI workflows, higher concurrency, and multi-machine execution.
After the trial, read-only commands keep working. Your agent can still inspect and understand pages for free. A license is required when you want it to act.
Run web agents-md and give your agent the rules. It learns to observe first, use refs, prefer JSON, recover from blocked states, and produce redacted reports when something fails.
web agents-md > AGENTS.mdweb pause "Need human passkey approval"web transcript --last 20 --json