[{"data":1,"prerenderedAt":3311},["ShallowReactive",2],{"/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control":3,"related-/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control":942},{"id":4,"title":5,"authorId":6,"body":7,"category":893,"created":894,"description":895,"extension":896,"faqs":897,"featurePriority":916,"head":917,"landingPath":917,"meta":918,"navigation":403,"ogImage":917,"path":931,"robots":917,"schemaOrg":917,"seo":932,"sitemap":933,"stem":934,"tags":935,"__hash__":941},"blog/blog/1044.browser-use-vs-playwright-which-is-better-for-ai-agent-control.md","Browser-Use vs. Playwright: Which is Better for AI Agent Control?","salome-koshadze",{"type":8,"value":9,"toc":879},"minimark",[10,23,45,66,69,74,82,91,94,98,101,106,113,116,124,130,133,172,175,179,183,186,197,200,240,248,253,257,260,265,268,271,280,283,286,290,293,298,302,305,310,340,343,551,555,562,566,586,589,820,824,827,838,842,845,872,875],[11,12,13,14,18,19,22],"p",{},"If you want the short answer first: ",[15,16,17],"strong",{},"Browser-Use is the better default for autonomous, goal-driven agents",", while ",[15,20,21],{},"Playwright is the better default for deterministic, production-grade automation"," — and the strongest production systems combine the two.",[24,25,27,33,39],"tldr-box",{"title":26},"TL;DR",[11,28,29,32],{},[15,30,31],{},"Pick Browser-Use"," when the task is open-ended, the UI changes often, or writing selectors is the bottleneck.",[11,34,35,38],{},[15,36,37],{},"Pick Playwright"," when the flow is well-defined and you need speed, cost control, and clear debugging.",[11,40,41,44],{},[15,42,43],{},"Combine them"," when most of the script can be deterministic and only the messy parts need an LLM.",[11,46,47,48,57,58,65],{},"In 2026, AI agents driving real browsers have moved from novelty to standard component of modern software. These agents handle data extraction, QA, research, and personal-assistant workflows that traverse the open web. 
Two names dominate the conversation: ",[15,49,50],{},[51,52,56],"a",{"href":53,"rel":54},"https://github.com/browser-use/browser-use",[55],"nofollow","Browser-Use",", a high-level framework designed for agent autonomy, and ",[15,59,60],{},[51,61,64],{"href":62,"rel":63},"https://playwright.dev/",[55],"Playwright",", the established browser automation library from Microsoft, now with its own AI extensions.",[11,67,68],{},"Choosing between them is not a simple matter of one being better than the other. They represent different philosophies and serve different developer needs. Browser-Use prioritizes goal-oriented, autonomous operation with minimal code, letting an LLM handle the planning. Playwright offers deterministic, low-level control and extreme reliability — a solid base on which agentic logic can be built. This article compares them from a developer's perspective: architecture, performance trade-offs, and ideal use cases.",[70,71,73],"h2",{"id":72},"core-philosophies-autonomy-vs-control","Core Philosophies: Autonomy vs. Control",[11,75,76,77,81],{},"The primary distinction lies in their approach to automation. Browser-Use is designed with an \"agent-first\" mentality. A developer provides a high-level objective in natural language — ",[78,79,80],"em",{},"\"Research the top three competitors for our product and summarize their pricing\""," — and the framework, powered by an LLM, plans and executes the steps: opening tabs, searching Google, navigating to pricing pages, identifying the relevant tables, and compiling the result. 
It is a goal-oriented system where the \"how\" is largely determined by the AI.",[83,84],"nuxt-picture",{":height":85,":width":86,"alt":87,"loading":88,"src":89,"provider":90},"640","820","Side-by-side comparison of the Browser-Use and Playwright philosophies — autonomy and goal-driven planning versus deterministic, step-by-step control","lazy","/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control/1.svg","none",[11,92,93],{},"Playwright, in contrast, originates from end-to-end testing and deterministic scripting. Its core philosophy is precise, reliable control over every browser action. You write explicit code to navigate, locate elements, and perform actions. The newer AI extensions — most notably the Model Context Protocol (MCP) server — do not replace this philosophy; they extend it. MCP lets an AI \"see\" the page in a structured, efficient form and request actions, but the developer still wires the system and retains control over the execution loop. It offers a path to agentic behavior without sacrificing the granular control and testability that define the library.",[70,95,97],{"id":96},"architecture-and-key-features","Architecture and Key Features",[11,99,100],{},"To see the practical differences, look at the architecture of each tool and the features they offer for building AI agents.",[102,103,105],"h3",{"id":104},"browser-use-the-high-level-abstraction-layer","Browser-Use: The High-Level Abstraction Layer",[83,107],{":height":108,":width":109,"alt":110,"loading":88,"src":111,"format":112},"600","1200","Browser-Use GitHub repository social preview","/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control/browser-use-og.png","webp",[11,114,115],{},"Browser-Use positions itself as an intelligent layer that sits atop a powerful engine. It uses Playwright (and Patchright in some configurations) for underlying browser interactions — launching browsers, executing JavaScript, taking screenshots. 
Its contribution is the agent-oriented abstraction built on top.",[11,117,118,119,123],{},"The design centers on simplifying the interaction between an LLM and a web page. When the agent observes a page, Browser-Use does not just send a raw DOM tree or a screenshot. Instead, it processes the page and extracts a clean, structured list of interactive elements — buttons, inputs, links — annotated with accessibility information, text content, and a unique identifier. This structured data is far more token-efficient and easier for an LLM to parse than raw HTML, which lets the model make more accurate decisions about which action to take next. (For background on why this matters, see ",[51,120,122],{"href":121},"/blog/dom-downsampling-for-llm-based-web-agents","DOM downsampling for LLM web agents",".)",[83,125],{":height":126,":width":127,"alt":128,"loading":88,"src":129,"provider":90},"620","760","Diagram of the Browser-Use agent loop — the LLM receives a structured snapshot of interactive elements and decides the next action against the browser","/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control/2.svg",[11,131,132],{},"Key components of the Browser-Use architecture include:",[134,135,136,143,154,160,166],"ul",{},[137,138,139,142],"li",{},[15,140,141],{},"Agent-first interface."," The primary interaction model is giving the agent a natural-language task. The internal loop handles planning, observation, and action execution until the task is complete.",[137,144,145,148,149,153],{},[15,146,147],{},"LLM integration."," Broad compatibility with major providers — OpenAI, Anthropic, Google — plus local models through services like Ollama, and its own ",[150,151,152],"code",{},"ChatBrowserUse"," LLM class for the hosted models.",[137,155,156,159],{},[15,157,158],{},"Self-healing harness."," Web pages are dynamic and selectors break. 
When an action fails, the agent can re-observe the page and select a new element from context, which makes it resilient to minor UI changes.",[137,161,162,165],{},[15,163,164],{},"Cloud platform."," The open-source version can be self-hosted, which means you manage your own browser infrastructure and LLM API keys. The Browser-Use Cloud offering adds fingerprint spoofing, a global residential proxy network, and integrated CAPTCHA-solving for production workloads.",[137,167,168,171],{},[15,169,170],{},"Extensible tools."," Developers can equip the agent with custom tools beyond standard browser actions — calling a private API, querying a database, or writing files locally — which expands the agent's capabilities.",[11,173,174],{},"This architecture suits tasks where the exact workflow is not known in advance or is likely to change. The Browser-Use team has reported strong WebVoyager benchmark results for complex, adaptive multi-step tasks, though benchmark numbers move quickly and are best treated as a directional signal rather than a fixed score.",[102,176,178],{"id":177},"playwright-precision-control-with-ai-extensions","Playwright: Precision Control with AI Extensions",[83,180],{":height":108,":width":109,"alt":181,"loading":88,"src":182,"format":112},"Playwright GitHub repository social preview","/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control/playwright-og.png",[11,184,185],{},"Playwright's core is a mature, battle-tested library for reliable browser automation. Its architecture is built around three pillars: cross-browser compatibility, reliability features, and a rich API for granular control. 
It supports Chromium, Firefox, and WebKit through a single API, which makes it the standard for QA and cross-browser testing.",[11,187,188,189,192,193,196],{},"For AI agent development, its native features — auto-waiting (Playwright waits for an element to be actionable before proceeding) and resilient locators like ",[150,190,191],{},"getByRole"," and ",[150,194,195],{},"getByLabel"," — eliminate much of the flakiness common in browser automation. The AI-specific extensions build on this foundation.",[11,198,199],{},"Playwright's AI-centric features:",[134,201,202,216,222],{},[137,203,204,207,208,211,212,215],{},[15,205,206],{},"Model Context Protocol (MCP) server."," The Playwright MCP server runs alongside the browser and provides a structured snapshot of the page's accessibility tree to an LLM — element roles, names, and references. Because the snapshot is plain text, the LLM can reason about its next action without a vision model, which would be slower and more expensive. The agent responds with a tool call like ",[150,209,210],{},"click(\"ref-123\")"," or ",[150,213,214],{},"fill(\"ref-456\", \"some text\")",", which Playwright executes. The protocol is integrated into editors like VS Code and Cursor for AI-assisted coding.",[137,217,218,221],{},[15,219,220],{},"Command-line interface."," A token-efficient CLI is available for coding agents (for example, those built on GitHub Copilot) to drive a browser during development tasks — verifying that a code change had the intended effect on a web application.",[137,223,224,227,228,231,232,235,236,239],{},[15,225,226],{},"Test Agents."," A suite of AI tools focused on the testing lifecycle. The ",[15,229,230],{},"Planner"," explores an application and produces a test plan. The ",[15,233,234],{},"Generator"," turns that plan into Playwright test code. The ",[15,237,238],{},"Healer"," analyzes test failures, proposes fixes, and repairs broken tests. 
This suite is newer and evolving — capabilities depend on the Playwright release you are on.",[11,241,242,243,247],{},"Playwright's approach enables a hybrid model. You can write a deterministic script for a login sequence and then hand control to an LLM via MCP to navigate a dynamic dashboard. That combination of precise scripting and AI-driven exploration is flexible enough for most production systems. For a deeper look at the underlying protocols, see ",[51,244,246],{"href":245},"/blog/cdp-vs-playwright-vs-puppeteer","CDP vs Playwright vs Puppeteer",".",[249,250],"article-signup-cta",{"heading":251,"subtitle":252},"Run AI Browser Agents on Any Web App","Webfuse lets you embed AI-driven automation directly into any web application without browser extensions or backend rebuilds. Pair deterministic Playwright scripts with intent-driven agents and ship reliable, production-grade workflows on top of websites you do not control.",[70,254,256],{"id":255},"a-developers-look-at-performance-and-trade-offs","A Developer's Look at Performance and Trade-offs",[11,258,259],{},"The choice between Browser-Use and Playwright comes down to balancing autonomy against predictability and performance. Each has a profile better suited to certain projects.",[83,261],{":height":108,":width":262,"alt":263,"loading":88,"src":264,"provider":90},"740","Performance and trade-off comparison between Browser-Use and Playwright across autonomy, speed, cost, and reliability for AI agent workloads","/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control/3.svg",[11,266,267],{},"For rapid prototyping of autonomous agents, Browser-Use is hard to beat. Expressing a complex goal in a single line of English and having an agent attempt it immediately is a major accelerator. 
It excels in dynamic environments where layouts change — e-commerce checkout flows, social media sites — because the self-healing mechanism and the LLM's ability to adapt mean the agent is less likely to fail on a small front-end change. That adaptability makes it a strong choice for open-ended research tasks or personal assistants.",[11,269,270],{},"The trade-offs are real. The agent's performance is directly tied to the quality of the underlying LLM; a weaker model generates poor plans or gets stuck in loops. The reasoning step at each turn adds latency and token consumption, which becomes expensive for high-volume, repetitive tasks. Scraping thousands of product pages with an identical structure will be slower and costlier with a fully autonomous agent than with a deterministic script. The open-source version also puts the burden of managing infrastructure (browsers, proxies, LLMs) on the developer; the cloud version trades that for a service fee.",[272,273],"article-cheatsheet-card",{"description":274,"href":275,"image":276,"imageAlt":277,"label":278,"title":279},"Quick reference for Playwright locators, contexts, debugging tools, and best practices.","/playwright-cheat-sheet","/misc/playwright-cheatsheet.png","Playwright Cheat Sheet preview","Cheat Sheet","Playwright Cheat Sheet",[11,281,282],{},"Playwright, on the other hand, is built for speed and reliability in scripted or hybrid flows. When you have a defined, repeatable process, a Playwright script will execute it faster and more predictably than an LLM-driven agent. Its debugging tools — particularly the Trace Viewer, which captures DOM snapshots, console logs, and network requests for every run — make failure inspection straightforward. That is a major benefit for maintaining automation in a CI/CD pipeline.",[11,284,285],{},"The MCP extension makes Playwright more agent-friendly without the full overhead of an autonomous loop. 
Because it relies on the accessibility tree, it avoids the cost and latency of vision models for many interactions, which makes it a cost-efficient way to add intelligence to deterministic scripts. The main drawback from a pure agentic perspective is increased development effort. Building a fully autonomous agent requires more code — developers often wrap Playwright in a framework like LangChain or CrewAI and write the observation-planning-action loop themselves.",[70,287,289],{"id":288},"practical-use-cases-and-code-philosophy","Practical Use Cases and Code Philosophy",[11,291,292],{},"The right tool depends heavily on the problem. Concrete scenarios make the trade-offs easier to feel.",[83,294],{":height":295,":width":86,"alt":296,"loading":88,"src":297,"provider":90},"560","Decision guide showing when to reach for Browser-Use, Playwright, or a hybrid approach across job applications, e-commerce, enterprise automation, and AI-assisted coding scenarios","/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control/4.svg",[102,299,301],{"id":300},"when-to-reach-for-browser-use","When to Reach for Browser-Use",[11,303,304],{},"Browser-Use is the go-to choice when the task is defined by a goal rather than a sequence of steps — when the agent needs to reason, research, and adapt.",[11,306,307],{},[15,308,309],{},"Example scenarios:",[134,311,312,322,331],{},[137,313,314,317,318,321],{},[15,315,316],{},"Automated job applications."," ",[78,319,320],{},"\"Fill in the job application at this URL using the information from my resume.\""," The agent navigates the multi-page form, identifies fields for name, email, and work experience, and handles the file upload.",[137,323,324,317,327,330],{},[15,325,326],{},"E-commerce and research.",[78,328,329],{},"\"Go to pcpartpicker.com, build a gaming PC with an NVIDIA 4080 GPU and an Intel CPU under $2000, and save the build link.\""," The agent navigates categories, applies filters, makes constrained selections, and finds 
the shareable URL.",[137,332,333,317,336,339],{},[15,334,335],{},"Personal assistants.",[78,337,338],{},"\"Find reviews for the top three Italian restaurants in downtown San Francisco, check their availability for two this Friday at 7pm, and send me the options.\""," Multi-site searching, navigation, and structured extraction.",[11,341,342],{},"The developer mindset is one of delegation: define the desired outcome and trust the agent to figure out the path. A typical interaction looks like this:",[344,345,350],"pre",{"className":346,"code":347,"language":348,"meta":349,"style":349},"language-python shiki shiki-themes catppuccin-latte night-owl","# Browser-Use: define a task, run the agent\nfrom browser_use import Agent, Browser, ChatBrowserUse\nimport asyncio\n\nasync def main():\n    browser = Browser()\n    agent = Agent(\n        task=\"Add three boxes of organic pasta and a jar of marinara sauce to my Instacart cart.\",\n        llm=ChatBrowserUse(),\n        browser=browser,\n    )\n    await agent.run()\n\nasyncio.run(main())\n","python","",[150,351,352,361,390,398,405,426,442,455,477,491,505,511,527,532],{"__ignoreMap":349},[353,354,357],"span",{"class":355,"line":356},"line",1,[353,358,360],{"class":359},"sDmS1","# Browser-Use: define a task, run the agent\n",[353,362,364,368,372,375,378,382,385,387],{"class":355,"line":363},2,[353,365,367],{"class":366},"srhcd","from",[353,369,371],{"class":370},"s2kId"," browser_use ",[353,373,374],{"class":366},"import",[353,376,377],{"class":370}," Agent",[353,379,381],{"class":380},"scGhl",",",[353,383,384],{"class":370}," Browser",[353,386,381],{"class":380},[353,388,389],{"class":370}," ChatBrowserUse\n",[353,391,393,395],{"class":355,"line":392},3,[353,394,374],{"class":366},[353,396,397],{"class":370}," 
asyncio\n",[353,399,401],{"class":355,"line":400},4,[353,402,404],{"emptyLinePlaceholder":403},true,"\n",[353,406,408,412,415,419,423],{"class":355,"line":407},5,[353,409,411],{"class":410},"s76yb","async",[353,413,414],{"class":410}," def",[353,416,418],{"class":417},"sNstc"," main",[353,420,422],{"class":421},"sMtgK","()",[353,424,425],{"class":380},":\n",[353,427,429,432,436,439],{"class":355,"line":428},6,[353,430,431],{"class":370},"    browser ",[353,433,435],{"class":434},"s-_ek","=",[353,437,384],{"class":438},"s75IF",[353,440,441],{"class":380},"()\n",[353,443,445,448,450,452],{"class":355,"line":444},7,[353,446,447],{"class":370},"    agent ",[353,449,435],{"class":434},[353,451,377],{"class":438},[353,453,454],{"class":380},"(\n",[353,456,458,462,464,468,472,474],{"class":355,"line":457},8,[353,459,461],{"class":460},"sIhCM","        task",[353,463,435],{"class":434},[353,465,467],{"class":466},"sbuKk","\"",[353,469,471],{"class":470},"sfrMT","Add three boxes of organic pasta and a jar of marinara sauce to my Instacart cart.",[353,473,467],{"class":466},[353,475,476],{"class":421},",\n",[353,478,480,483,485,487,489],{"class":355,"line":479},9,[353,481,482],{"class":460},"        llm",[353,484,435],{"class":434},[353,486,152],{"class":438},[353,488,422],{"class":380},[353,490,476],{"class":421},[353,492,494,497,499,503],{"class":355,"line":493},10,[353,495,496],{"class":460},"        browser",[353,498,435],{"class":434},[353,500,502],{"class":501},"sqxXB","browser",[353,504,476],{"class":421},[353,506,508],{"class":355,"line":507},11,[353,509,510],{"class":380},"    )\n",[353,512,514,517,520,522,525],{"class":355,"line":513},12,[353,515,516],{"class":366},"    await",[353,518,519],{"class":370}," 
agent",[353,521,247],{"class":380},[353,523,524],{"class":438},"run",[353,526,441],{"class":380},[353,528,530],{"class":355,"line":529},13,[353,531,404],{"emptyLinePlaceholder":403},[353,533,535,538,540,542,545,548],{"class":355,"line":534},14,[353,536,537],{"class":370},"asyncio",[353,539,247],{"class":380},[353,541,524],{"class":438},[353,543,544],{"class":380},"(",[353,546,547],{"class":438},"main",[353,549,550],{"class":380},"())\n",[102,552,554],{"id":553},"when-to-reach-for-playwright","When to Reach for Playwright",[11,556,557,558,123],{},"Playwright excels where reliability, speed, and testability matter. It is the tool of choice for production-grade automation of well-defined workflows and for hybrid systems where AI assists a deterministic process. (For a side-by-side with Puppeteer in the same space, see ",[51,559,561],{"href":560},"/blog/playwright-vs-puppeteer-which-is-better-for-ai-agent-control","Playwright vs. Puppeteer for AI agent control",[11,563,564],{},[15,565,309],{},[134,567,568,574,580],{},[137,569,570,573],{},[15,571,572],{},"E2E test generation and maintenance."," In a QA pipeline, Playwright's Test Agents can explore a new feature, generate corresponding E2E code, and commit it. When a test fails later because of a UI change, the Healer can attempt to fix the broken locators automatically.",[137,575,576,579],{},[15,577,578],{},"Enterprise data automation."," Logging into a legacy financial portal, navigating to a report generator, filling in date ranges, and downloading a CSV — a Playwright script executes this with high reliability, using persistent authentication state to handle logins efficiently.",[137,581,582,585],{},[15,583,584],{},"AI-assisted coding."," A developer using an AI coding assistant asks it to verify a change. 
The assistant, using Playwright with MCP, opens a browser, navigates to the local dev server, and confirms a new button renders correctly — immediate feedback without leaving the editor.",[11,587,588],{},"The developer mindset is one of precise instruction: define each step explicitly to ensure a consistent outcome. The code is direct:",[344,590,592],{"className":346,"code":591,"language":348,"meta":349,"style":349},"# Playwright: explicit steps, resilient locators\nfrom playwright.sync_api import sync_playwright\n\nwith sync_playwright() as p:\n    browser = p.chromium.launch()\n    page = browser.new_page()\n    page.goto(\"https://shopping.example.com\")\n\n    page.get_by_label(\"Search products\").fill(\"organic pasta\")\n    page.get_by_role(\"button\", name=\"Search\").click()\n    page.get_by_alt_text(\"Image of organic pasta box\").first.click()\n\n    browser.close()\n",[150,593,594,599,616,620,638,658,675,697,701,736,775,804,808],{"__ignoreMap":349},[353,595,596],{"class":355,"line":356},[353,597,598],{"class":359},"# Playwright: explicit steps, resilient locators\n",[353,600,601,603,606,608,611,613],{"class":355,"line":363},[353,602,367],{"class":366},[353,604,605],{"class":370}," playwright",[353,607,247],{"class":380},[353,609,610],{"class":370},"sync_api ",[353,612,374],{"class":366},[353,614,615],{"class":370}," sync_playwright\n",[353,617,618],{"class":355,"line":392},[353,619,404],{"emptyLinePlaceholder":403},[353,621,622,625,628,630,633,636],{"class":355,"line":400},[353,623,624],{"class":366},"with",[353,626,627],{"class":438}," sync_playwright",[353,629,422],{"class":380},[353,631,632],{"class":366}," as",[353,634,635],{"class":370}," 
p",[353,637,425],{"class":380},[353,639,640,642,644,646,648,651,653,656],{"class":355,"line":407},[353,641,431],{"class":370},[353,643,435],{"class":434},[353,645,635],{"class":370},[353,647,247],{"class":380},[353,649,650],{"class":370},"chromium",[353,652,247],{"class":380},[353,654,655],{"class":438},"launch",[353,657,441],{"class":380},[353,659,660,663,665,668,670,673],{"class":355,"line":428},[353,661,662],{"class":370},"    page ",[353,664,435],{"class":434},[353,666,667],{"class":370}," browser",[353,669,247],{"class":380},[353,671,672],{"class":438},"new_page",[353,674,441],{"class":380},[353,676,677,680,682,685,687,689,692,694],{"class":355,"line":444},[353,678,679],{"class":370},"    page",[353,681,247],{"class":380},[353,683,684],{"class":438},"goto",[353,686,544],{"class":380},[353,688,467],{"class":466},[353,690,691],{"class":470},"https://shopping.example.com",[353,693,467],{"class":466},[353,695,696],{"class":380},")\n",[353,698,699],{"class":355,"line":457},[353,700,404],{"emptyLinePlaceholder":403},[353,702,703,705,707,710,712,714,717,719,722,725,727,729,732,734],{"class":355,"line":479},[353,704,679],{"class":370},[353,706,247],{"class":380},[353,708,709],{"class":438},"get_by_label",[353,711,544],{"class":380},[353,713,467],{"class":466},[353,715,716],{"class":470},"Search products",[353,718,467],{"class":466},[353,720,721],{"class":380},").",[353,723,724],{"class":438},"fill",[353,726,544],{"class":380},[353,728,467],{"class":466},[353,730,731],{"class":470},"organic pasta",[353,733,467],{"class":466},[353,735,696],{"class":380},[353,737,738,740,742,745,747,749,752,754,756,759,761,763,766,768,770,773],{"class":355,"line":493},[353,739,679],{"class":370},[353,741,247],{"class":380},[353,743,744],{"class":438},"get_by_role",[353,746,544],{"class":380},[353,748,467],{"class":466},[353,750,751],{"class":470},"button",[353,753,467],{"class":466},[353,755,381],{"class":421},[353,757,758],{"class":460}," 
name",[353,760,435],{"class":434},[353,762,467],{"class":466},[353,764,765],{"class":470},"Search",[353,767,467],{"class":466},[353,769,721],{"class":380},[353,771,772],{"class":438},"click",[353,774,441],{"class":380},[353,776,777,779,781,784,786,788,791,793,795,798,800,802],{"class":355,"line":507},[353,778,679],{"class":370},[353,780,247],{"class":380},[353,782,783],{"class":438},"get_by_alt_text",[353,785,544],{"class":380},[353,787,467],{"class":466},[353,789,790],{"class":470},"Image of organic pasta box",[353,792,467],{"class":466},[353,794,721],{"class":380},[353,796,797],{"class":370},"first",[353,799,247],{"class":380},[353,801,772],{"class":438},[353,803,441],{"class":380},[353,805,806],{"class":355,"line":513},[353,807,404],{"emptyLinePlaceholder":403},[353,809,810,813,815,818],{"class":355,"line":529},[353,811,812],{"class":370},"    browser",[353,814,247],{"class":380},[353,816,817],{"class":438},"close",[353,819,441],{"class":380},[70,821,823],{"id":822},"combining-strengths-the-hybrid-approach","Combining Strengths: The Hybrid Approach",[11,825,826],{},"Many production agentic systems do not use one tool exclusively. They combine the strengths of both philosophies. A developer might build a custom agent that uses an LLM for high-level planning but relies on Playwright for executing actions.",[11,828,829,830,833,834,247],{},"In such a system, the agent observes a page using Playwright's MCP. The LLM receives the structured accessibility data and decides the next logical step is to \"click the login button.\" Instead of letting the LLM generate a brittle selector, the system maps that intent to a robust, predefined Playwright call like ",[150,831,832],{},"page.get_by_role(\"button\", name=\"Login\").click()",". This combines the reasoning capability of an LLM with the reliability of Playwright's execution engine and reduces the chance of errors. 
Frameworks like Stagehand from Browserbase formalize this pattern — a managed Playwright infrastructure with an AI layer on top. For a closer look, see ",[51,835,837],{"href":836},"/blog/lightpanda-vs-browser-use-vs-stagehand-2026","Lightpanda vs Browser-Use vs Stagehand",[70,839,841],{"id":840},"key-decision-factors-for-your-project","Key Decision Factors for Your Project",[11,843,844],{},"To choose the right tool, consider the requirements of your project:",[846,847,848,854,860,866],"ol",{},[137,849,850,853],{},[15,851,852],{},"Autonomy vs. precision."," Is the task open-ended and in need of adaptation, or is it a fixed, repeatable workflow? Exploratory research and personal assistants need Browser-Use's flexibility; CI/CD testing and fixed business processes are a better fit for Playwright.",[137,855,856,859],{},[15,857,858],{},"Speed and cost."," How sensitive is the application to latency and operational cost? Playwright scripts are faster and cheaper to run at scale for deterministic work. Browser-Use's LLM reasoning loop adds latency and token cost, which can be prohibitive at high volume.",[137,861,862,865],{},[15,863,864],{},"Developer experience and prototyping speed."," How quickly do you need a working prototype? Browser-Use allows very fast iteration on natural-language tasks. Playwright requires more initial code to set up a full agentic loop, though its codegen tools speed up basic script creation.",[137,867,868,871],{},[15,869,870],{},"Stealth and scaling."," Do you need to automate sites protected by services like Cloudflare? Browser-Use Cloud provides built-in stealth — fingerprint spoofing and residential proxies. With Playwright you would set up that infrastructure yourself or integrate a third-party service.",[11,873,874],{},"The space is moving quickly. Both Browser-Use and Playwright are under active development, with new features and models landing regularly. The best choice today may be different tomorrow. 
A practical strategy for most teams is to start with Playwright for its reliable foundation and progressively introduce agentic layers — Browser-Use, Playwright MCP, or custom abstractions — as the project's need for autonomy grows.",[876,877,878],"style",{},"html pre.shiki code .sDmS1, html code.shiki .sDmS1{--shiki-default:#7C7F93;--shiki-default-font-style:italic;--shiki-dark:#637777;--shiki-dark-font-style:italic}html pre.shiki code .srhcd, html code.shiki .srhcd{--shiki-default:#8839EF;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sNstc, html code.shiki .sNstc{--shiki-default:#1E66F5;--shiki-default-font-style:italic;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .sMtgK, html code.shiki .sMtgK{--shiki-default:#7C7F93;--shiki-dark:#D9F5DD}html pre.shiki code .s-_ek, html code.shiki .s-_ek{--shiki-default:#179299;--shiki-dark:#C792EA}html pre.shiki code .s75IF, html code.shiki .s75IF{--shiki-default:#1E66F5;--shiki-dark:#B2CCD6}html pre.shiki code .sIhCM, html code.shiki .sIhCM{--shiki-default:#E64553;--shiki-default-font-style:italic;--shiki-dark:#D7DBE0;--shiki-dark-font-style:inherit}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sfrMT, html code.shiki .sfrMT{--shiki-default:#40A02B;--shiki-dark:#ECC48D}html pre.shiki code .sqxXB, html code.shiki .sqxXB{--shiki-default:#4C4F69;--shiki-dark:#82AAFF}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: 
var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}",{"title":349,"searchDepth":363,"depth":363,"links":880},[881,882,886,887,891,892],{"id":72,"depth":363,"text":73},{"id":96,"depth":363,"text":97,"children":883},[884,885],{"id":104,"depth":392,"text":105},{"id":177,"depth":392,"text":178},{"id":255,"depth":363,"text":256},{"id":288,"depth":363,"text":289,"children":888},[889,890],{"id":300,"depth":392,"text":301},{"id":553,"depth":392,"text":554},{"id":822,"depth":363,"text":823},{"id":840,"depth":363,"text":841},"ai-agents","2026-05-20","Browser-Use vs Playwright in 2026 for AI agent control: compare autonomy vs determinism, architecture, MCP integration, performance, cost, and when to choose each (or combine them).","md",[898,901,904,907,910,913],{"question":899,"answer":900},"Is Browser-Use built on top of Playwright?","Yes. Browser-Use uses Playwright (and Patchright in some configurations) for the underlying browser interactions and adds an agent-oriented abstraction layer on top — an LLM-friendly element tree, a planning loop, self-healing selectors, and integrations with model providers.",{"question":902,"answer":903},"When should I choose Browser-Use over Playwright?","Reach for Browser-Use when the task is open-ended, hard to script in advance, or runs against UIs that change often. 
It is especially useful for research agents, personal assistants, and exploratory workflows where you would otherwise rewrite selectors every week.",{"question":905,"answer":906},"When should I choose Playwright over Browser-Use?","Choose Playwright when the workflow is well-defined, runs at high volume, or sits inside a CI/CD pipeline. It is faster, cheaper, and easier to debug for deterministic flows like data extraction, regression testing, and back-office automation.",{"question":908,"answer":909},"What is the Playwright MCP server?","Playwright MCP is a Model Context Protocol server that exposes a structured accessibility-tree snapshot of the page to an LLM. The model can decide actions like 'click ref-123' without needing screenshots or vision models, and Playwright executes those actions deterministically.",{"question":911,"answer":912},"Can I combine Browser-Use and Playwright in one system?","Yes, and most production agents do. A common pattern is to use Playwright for deterministic steps (login, navigation, data extraction) and hand control to an LLM via Browser-Use or Playwright MCP only for the unpredictable parts of the flow.",{"question":914,"answer":915},"Which is cheaper to run at scale?","Playwright. A deterministic script has no per-step LLM cost. Browser-Use calls a model on every observation and decision, which adds both latency and token spend — fine for low-volume agents, painful for thousands of pages a day.",0,null,{"shortTitle":919,"relatedLinks":920},"Browser-Use vs. Playwright: AI Agents",[921,924,926,928],{"text":922,"href":560,"description":923},"Playwright vs. 
Puppeteer: AI Agent Control","How Playwright and Puppeteer compare specifically for AI agent workloads, not just testing.",{"text":837,"href":836,"description":925},"A broader three-way comparison covering a lightweight engine, an autonomous agent, and a hybrid framework.",{"text":246,"href":245,"description":927},"How the underlying protocols and frameworks compare for everyday browser automation.",{"text":929,"href":121,"description":930},"DOM Downsampling for LLM Web Agents","How to reduce DOM complexity so LLM-based agents can process web pages more efficiently.","/blog/browser-use-vs-playwright-which-is-better-for-ai-agent-control",{"title":5,"description":895},{"loc":931},"blog/1044.browser-use-vs-playwright-which-is-better-for-ai-agent-control",[936,937,938,893,939,940],"browser-use","playwright","browser-automation","web-agents","mcp","PPicqMl7w5OYQpYzt95WWqtCtQJzWwvYLuLDncLIkpw",[943,2564],{"id":944,"title":945,"authorId":946,"body":947,"category":893,"created":2541,"description":2542,"extension":896,"faqs":917,"featurePriority":917,"head":917,"landingPath":917,"meta":2543,"navigation":403,"ogImage":917,"path":121,"robots":917,"schemaOrg":917,"seo":2555,"sitemap":2556,"stem":2557,"tags":2558,"__hash__":2563},"blog/blog/1012.dom-downsampling-for-llm-based-web-agents.md","DOM Downsampling for LLM-Based Web Agents","thassilo-schiepanski",{"type":8,"value":948,"toc":2526},[949,954,977,981,988,992,1007,1011,1017,1021,1039,1063,1066,1070,1073,1084,1090,1121,1125,1145,1157,1162,1177,1191,1194,1198,1218,1222,1230,1242,1246,1249,1620,1626,1633,1797,1804,1895,1902,1974,1983,1989,1998,2002,2008,2018,2030,2253,2271,2293,2299,2342,2346,2358,2367,2372,2377,2380,2384,2390,2395,2433,2437,2443,2447,2457,2461,2464,2523],[83,950],{":width":951,"alt":952,"format":112,"loading":88,"src":953},"900","Downsampling visualised for digital images and 
HTML","/blog/dom-downsampling-for-web-agents/1.png",[11,955,956,961,962,961,967,972,973,976],{},[51,957,960],{"href":958,"rel":959},"https://operator.chatgpt.com",[55],"Operator (OpenAI)",", ",[51,963,966],{"href":964,"rel":965},"https://www.director.ai",[55],"Director (Browserbase)",[51,968,971],{"href":969,"rel":970},"https://browser-use.com",[55],"Browser Use"," – we are currently witnessing the rise of ",[15,974,975],{},"web AI agents",". The first iteration of serviceable web agents was enabled by frontier LLMs, which act as instantaneous domain model backends. The domain, hereby, corresponds to the landscape of web application UIs.",[70,978,980],{"id":979},"what-is-a-snapshot","What is a Snapshot?",[11,982,983,984,987],{},"Web agents provide an LLM with a task, and serialised runtime state of a currently browsed web application (e.g., a screenshot). The LLM is expected to suggest relevant actions to perform in the web application. Serialisation of such runtime state is referred to as a ",[15,985,986],{},"snapshot",". And the snapshot technique primarily decides the quality of LLM interaction suggestions.",[102,989,991],{"id":990},"gui-snapshots","GUI Snapshots",[11,993,994,995,998,999,1002,1003,1006],{},"Screenshots – for consistency reasons referred to as ",[15,996,997],{},"GUI snapshots"," – resemble how humans visually perceive web application UIs. LLM APIs subsidise the use of image input through upstream compression. Compression, however, irreversibly affects image dimensions, which takes away pixel precision; no way to suggest interactions like ",[78,1000,1001],{},"“click at 100, 735”",". As a workaround, early web agents used ",[78,1004,1005],{},"grounded"," GUI snapshots. Grounding describes adding visual cues to the GUI, such as bounding boxes with numerical identifiers. 
Grounding lets the LLM refer to specific parts of the page by identifier, so the agent can trace back interaction targets.",[83,1008],{":width":951,"alt":1009,"format":112,"loading":88,"src":1010},"Grounded GUI snapshot as implemented by Browser Use","/blog/dom-downsampling-for-web-agents/2.png",[11,1012,1013],{},[1014,1015,1016],"small",{},"Grounded GUI snapshot as implemented by Browser Use.",[102,1018,1020],{"id":1019},"dom-snapshots","DOM Snapshots",[11,1022,1023,1024,1034,1035,1038],{},"LLMs arguably are much better at understanding code than images. Research suggests they excel at describing and classifying HTML, and also navigating an inherent UI",[1025,1026,1027],"sup",{},[51,1028,1033],{"href":1029,"ariaDescribedBy":1030,"dataFootnoteRef":349,"id":1032},"#user-content-fn-1",[1031],"footnote-label","user-content-fnref-1","1",". The DOM (document object model) – a web browser's runtime state model of a web application – translates back to HTML. For this reason, ",[15,1036,1037],{},"DOM snapshots"," offer a compelling alternative to GUI snapshots. DOM snapshots offer a handful of key advantages:",[846,1040,1041,1044,1047,1050,1053],{},[137,1042,1043],{},"DOM snapshots connect with LLM code (HTML) interpretation abilities.",[137,1045,1046],{},"DOM snapshots can be compiled from deep clones, hidden from supervision (unlike GUI grounding).",[137,1048,1049],{},"DOM snapshots render text input that on average consumes less bandwidth than screenshots.",[137,1051,1052],{},"DOM snapshots allow for exact programmatic targeting of elements (e.g., via CSS selectors).",[137,1054,1055,1056,1059,1060,721],{},"DOM snapshots are available with the ",[150,1057,1058],{},"DOMContentLoaded"," event (whereas the GUI completes initial rendering with ",[150,1061,1062],{},"load",[11,1064,1065],{},"Yet, DOM snapshots have a major problem: potentially exhaustive model context. 
Whereas a GUI snapshot commonly costs a four-figure number of tokens, a raw DOM snapshot can run into the hundreds of thousands of tokens. To connect with LLM code interpretation abilities, however, developers have used element extraction techniques – picking only (likely) important elements from the DOM. Element extraction flattens the DOM tree, which disregards hierarchy as a potential UI feature (how do elements relate to each other?).",[70,1067,1069],{"id":1068},"dom-downsampling-a-novel-approach","DOM Downsampling: A Novel Approach",[11,1071,1072],{},"Enabling DOM snapshots for use with web agents requires client-side pre-processing – similar to how LLM vision APIs process image input. Downsampling is a fundamental signal processing technique that reduces data that scales out of time or space constraints under the assumption that the majority of relevant features is retained. Picture JPEG compression as an example: put simply, a JPEG image stores only an average colour for patches of pixels. The bigger the patches, the smaller the file. Although some detail is lost, key image features – colours, edges, objects – remain recognisable – up to a large patch size.",[11,1074,1075,1076,1079,1080,1083],{},"We transfer the concept of ",[15,1077,1078],{},"downsampling"," to ",[15,1081,1082],{},"DOMs",". Particularly, since such an approach retains HTML characteristics that might be valuable for an LLM backend. 
We define UI features as concepts that, to a substantial degree, facilitate LLM suggestions on how to act in the UI in order to solve related web-based tasks.",[70,1085,1087],{"id":1086},"d2snap",[78,1088,1089],{},"D2Snap",[11,1091,1092,1093,1101,1109,1117,1118,1120],{},"We recently proposed ",[51,1094,1097],{"href":1095,"rel":1096},"https://arxiv.org/abs/2508.04412",[55],[15,1098,1099],{},[78,1100,1089],{},[1025,1102,1103],{},[51,1104,1108],{"href":1105,"ariaDescribedBy":1106,"dataFootnoteRef":349,"id":1107},"#user-content-fn-2",[1031],"user-content-fnref-2","2",[1025,1110,1111],{},[51,1112,1116],{"href":1113,"ariaDescribedBy":1114,"dataFootnoteRef":349,"id":1115},"#user-content-fn-3",[1031],"user-content-fnref-3","3"," – a first-of-its-kind downsampling algorithm for DOMs. Herein, we'll briefly explain how the ",[78,1119,1089],{}," algorithm works, and how it can be utilised to build efficient and performant web agents.",[102,1122,1124],{"id":1123},"how-it-works","How it works",[11,1126,1127,1128,1130,1131,961,1134,1137,1138,1141,1142,721],{},"There are basically three redundant types of DOM nodes, and HTML concepts: elements, text, and attributes. We defined and empirically adjusted three node-specific procedures. 
",[78,1129,1089],{}," downsamples at a variable ratio, configured through procedure-specific parameters ",[150,1132,1133],{},"k",[150,1135,1136],{},"l",", and ",[150,1139,1140],{},"m"," (",[150,1143,1144],{},"∈ [0, 1]",[1146,1147,1148],"blockquote",{},[11,1149,1150,1151,1156],{},"We used ",[51,1152,1155],{"href":1153,"rel":1154},"https://openai.com/index/hello-gpt-4o/",[55],"GPT-4o"," to create a downsampling ground truth dataset by having it classify HTML elements and scoring semantics regarding relevance for understanding the inherent UI – a UI feature degree.",[1158,1159,1161],"h4",{"id":1160},"procedure-elements","Procedure: Elements",[11,1163,1164,1166,1167,192,1170,1173,1174,1176],{},[78,1165,1089],{}," downsamples (simplifies) elements by merging container elements like ",[150,1168,1169],{},"section",[150,1171,1172],{},"div"," together. A parameter ",[150,1175,1133],{}," controls the merge ratio depending on the total DOM tree height. For competing concepts, such as element name, the ground truth determines which element's characteristics to keep – comparing UI feature scores.",[11,1178,1179,1180,961,1182,1184,1185,1190],{},"Elements in content elements (",[150,1181,11],{},[150,1183,1146],{},", ...) are translated to a more comprehensive ",[51,1186,1189],{"href":1187,"rel":1188},"https://www.markdownguide.org/basic-syntax/",[55],"Markdown"," representation.",[11,1192,1193],{},"Interactive elements, definite interaction target candidates, are kept as is.",[1158,1195,1197],{"id":1196},"procedure-text","Procedure: Text",[11,1199,1200,1202,1203,1206,1214,1215,1217],{},[78,1201,1089],{}," downsamples text by dropping a fraction. Natural units of text are space-separated words, or punctuation-separated sentences. We reuse the ",[78,1204,1205],{},"TextRank",[1025,1207,1208],{},[51,1209,1213],{"href":1210,"ariaDescribedBy":1211,"dataFootnoteRef":349,"id":1212},"#user-content-fn-4",[1031],"user-content-fnref-4","4"," algorithm to rank sentences in text nodes. 
The lowest-ranking fraction of sentences, denoted by parameter ",[150,1216,1136],{},", is dropped.",[1158,1219,1221],{"id":1220},"procedure-attributes","Procedure: Attributes",[11,1223,1224,1226,1227,1229],{},[78,1225,1089],{}," downsamples attributes by dropping those with a name that, according to ground truth, holds a UI feature degree below a threshold. Parameter ",[150,1228,1140],{}," denotes this threshold.",[1146,1231,1232],{},[11,1233,1234,1235,1241],{},"Check out the ",[51,1236,1238,1240],{"href":1095,"rel":1237},[55],[78,1239,1089],{}," paper"," to learn about the algorithm in-depth.",[102,1243,1245],{"id":1244},"example-of-a-downsampled-dom","Example of a Downsampled DOM",[11,1247,1248],{},"Consider a partial DOM state, serialised as HTML:",[344,1250,1254],{"className":1251,"code":1252,"language":1253,"meta":349,"style":349},"language-html shiki shiki-themes catppuccin-latte night-owl","\u003Csection class=\"container\" tabindex=\"3\" required=\"true\" type=\"example\">\n  \u003Cdiv class=\"mx-auto\" data-topic=\"products\" required=\"false\">\n    \u003Ch1>Our Pizza\u003C/h1>\n    \u003Cdiv>\n      \u003Cdiv class=\"shadow-lg\">\n        \u003Ch2>Margherita\u003C/h2>\n        \u003Cp>\n          A simple classic: mozzarela, tomatoes and basil.\n          An everyday choice!\n        \u003C/p>\n        \u003Cbutton type=\"button\">Add\u003C/button>\n      \u003C/div>\n      \u003Cdiv class=\"shadow-lg\">\n        \u003Ch2>Capricciosa\u003C/h2>\n        \u003Cp>\n          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n          A true favourite!\n          \u003C/p>\n        \u003Cbutton type=\"button\">Add\u003C/button>\n      \u003C/div>\n    \u003C/div>\n  
\u003C/div>\n\u003C/section>\n","html",[150,1255,1256,1316,1359,1380,1388,1408,1426,1434,1439,1444,1453,1480,1489,1507,1524,1533,1539,1545,1555,1582,1591,1601,1611],{"__ignoreMap":349},[353,1257,1258,1262,1265,1269,1271,1273,1276,1278,1281,1283,1285,1287,1289,1292,1294,1296,1299,1301,1304,1306,1308,1311,1313],{"class":355,"line":356},[353,1259,1261],{"class":1260},"s9rnR","\u003C",[353,1263,1169],{"class":1264},"sY2RG",[353,1266,1268],{"class":1267},"swkLt"," class",[353,1270,435],{"class":1260},[353,1272,467],{"class":466},[353,1274,1275],{"class":470},"container",[353,1277,467],{"class":466},[353,1279,1280],{"class":1267}," tabindex",[353,1282,435],{"class":1260},[353,1284,467],{"class":466},[353,1286,1116],{"class":470},[353,1288,467],{"class":466},[353,1290,1291],{"class":1267}," required",[353,1293,435],{"class":1260},[353,1295,467],{"class":466},[353,1297,1298],{"class":470},"true",[353,1300,467],{"class":466},[353,1302,1303],{"class":1267}," type",[353,1305,435],{"class":1260},[353,1307,467],{"class":466},[353,1309,1310],{"class":470},"example",[353,1312,467],{"class":466},[353,1314,1315],{"class":1260},">\n",[353,1317,1318,1321,1323,1325,1327,1329,1332,1334,1337,1339,1341,1344,1346,1348,1350,1352,1355,1357],{"class":355,"line":363},[353,1319,1320],{"class":1260},"  \u003C",[353,1322,1172],{"class":1264},[353,1324,1268],{"class":1267},[353,1326,435],{"class":1260},[353,1328,467],{"class":466},[353,1330,1331],{"class":470},"mx-auto",[353,1333,467],{"class":466},[353,1335,1336],{"class":1267}," data-topic",[353,1338,435],{"class":1260},[353,1340,467],{"class":466},[353,1342,1343],{"class":470},"products",[353,1345,467],{"class":466},[353,1347,1291],{"class":1267},[353,1349,435],{"class":1260},[353,1351,467],{"class":466},[353,1353,1354],{"class":470},"false",[353,1356,467],{"class":466},[353,1358,1315],{"class":1260},[353,1360,1361,1364,1367,1370,1373,1376,1378],{"class":355,"line":392},[353,1362,1363],{"class":1260},"    
\u003C",[353,1365,1366],{"class":1264},"h1",[353,1368,1369],{"class":1260},">",[353,1371,1372],{"class":370},"Our Pizza",[353,1374,1375],{"class":1260},"\u003C/",[353,1377,1366],{"class":1264},[353,1379,1315],{"class":1260},[353,1381,1382,1384,1386],{"class":355,"line":400},[353,1383,1363],{"class":1260},[353,1385,1172],{"class":1264},[353,1387,1315],{"class":1260},[353,1389,1390,1393,1395,1397,1399,1401,1404,1406],{"class":355,"line":407},[353,1391,1392],{"class":1260},"      \u003C",[353,1394,1172],{"class":1264},[353,1396,1268],{"class":1267},[353,1398,435],{"class":1260},[353,1400,467],{"class":466},[353,1402,1403],{"class":470},"shadow-lg",[353,1405,467],{"class":466},[353,1407,1315],{"class":1260},[353,1409,1410,1413,1415,1417,1420,1422,1424],{"class":355,"line":428},[353,1411,1412],{"class":1260},"        \u003C",[353,1414,70],{"class":1264},[353,1416,1369],{"class":1260},[353,1418,1419],{"class":370},"Margherita",[353,1421,1375],{"class":1260},[353,1423,70],{"class":1264},[353,1425,1315],{"class":1260},[353,1427,1428,1430,1432],{"class":355,"line":444},[353,1429,1412],{"class":1260},[353,1431,11],{"class":1264},[353,1433,1315],{"class":1260},[353,1435,1436],{"class":355,"line":457},[353,1437,1438],{"class":370},"          A simple classic: mozzarela, tomatoes and basil.\n",[353,1440,1441],{"class":355,"line":479},[353,1442,1443],{"class":370},"          An everyday choice!\n",[353,1445,1446,1449,1451],{"class":355,"line":493},[353,1447,1448],{"class":1260},"        
\u003C/",[353,1450,11],{"class":1264},[353,1452,1315],{"class":1260},[353,1454,1455,1457,1459,1461,1463,1465,1467,1469,1471,1474,1476,1478],{"class":355,"line":507},[353,1456,1412],{"class":1260},[353,1458,751],{"class":1264},[353,1460,1303],{"class":1267},[353,1462,435],{"class":1260},[353,1464,467],{"class":466},[353,1466,751],{"class":470},[353,1468,467],{"class":466},[353,1470,1369],{"class":1260},[353,1472,1473],{"class":370},"Add",[353,1475,1375],{"class":1260},[353,1477,751],{"class":1264},[353,1479,1315],{"class":1260},[353,1481,1482,1485,1487],{"class":355,"line":513},[353,1483,1484],{"class":1260},"      \u003C/",[353,1486,1172],{"class":1264},[353,1488,1315],{"class":1260},[353,1490,1491,1493,1495,1497,1499,1501,1503,1505],{"class":355,"line":529},[353,1492,1392],{"class":1260},[353,1494,1172],{"class":1264},[353,1496,1268],{"class":1267},[353,1498,435],{"class":1260},[353,1500,467],{"class":466},[353,1502,1403],{"class":470},[353,1504,467],{"class":466},[353,1506,1315],{"class":1260},[353,1508,1509,1511,1513,1515,1518,1520,1522],{"class":355,"line":534},[353,1510,1412],{"class":1260},[353,1512,70],{"class":1264},[353,1514,1369],{"class":1260},[353,1516,1517],{"class":370},"Capricciosa",[353,1519,1375],{"class":1260},[353,1521,70],{"class":1264},[353,1523,1315],{"class":1260},[353,1525,1527,1529,1531],{"class":355,"line":1526},15,[353,1528,1412],{"class":1260},[353,1530,11],{"class":1264},[353,1532,1315],{"class":1260},[353,1534,1536],{"class":355,"line":1535},16,[353,1537,1538],{"class":370},"          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[353,1540,1542],{"class":355,"line":1541},17,[353,1543,1544],{"class":370},"          A true favourite!\n",[353,1546,1548,1551,1553],{"class":355,"line":1547},18,[353,1549,1550],{"class":1260},"          
\u003C/",[353,1552,11],{"class":1264},[353,1554,1315],{"class":1260},[353,1556,1558,1560,1562,1564,1566,1568,1570,1572,1574,1576,1578,1580],{"class":355,"line":1557},19,[353,1559,1412],{"class":1260},[353,1561,751],{"class":1264},[353,1563,1303],{"class":1267},[353,1565,435],{"class":1260},[353,1567,467],{"class":466},[353,1569,751],{"class":470},[353,1571,467],{"class":466},[353,1573,1369],{"class":1260},[353,1575,1473],{"class":370},[353,1577,1375],{"class":1260},[353,1579,751],{"class":1264},[353,1581,1315],{"class":1260},[353,1583,1585,1587,1589],{"class":355,"line":1584},20,[353,1586,1484],{"class":1260},[353,1588,1172],{"class":1264},[353,1590,1315],{"class":1260},[353,1592,1594,1597,1599],{"class":355,"line":1593},21,[353,1595,1596],{"class":1260},"    \u003C/",[353,1598,1172],{"class":1264},[353,1600,1315],{"class":1260},[353,1602,1604,1607,1609],{"class":355,"line":1603},22,[353,1605,1606],{"class":1260},"  \u003C/",[353,1608,1172],{"class":1264},[353,1610,1315],{"class":1260},[353,1612,1614,1616,1618],{"class":355,"line":1613},23,[353,1615,1375],{"class":1260},[353,1617,1169],{"class":1264},[353,1619,1315],{"class":1260},[11,1621,1622,1623,1625],{},"Here are some ",[78,1624,1089],{}," downsampling results, which are based on different parametric configurations. 
A percentage denotes the reduced size.",[1158,1627,1629,1632],{"id":1628},"k3-l3-m3-55",[150,1630,1631],{},"k=.3, l=.3, m=.3"," (55%)",[344,1634,1636],{"className":1251,"code":1635,"language":1253,"meta":349,"style":349},"\u003Csection tabindex=\"3\" type=\"example\" class=\"container\" required=\"true\">\n  # Our Pizza\n  \u003Cdiv class=\"shadow-lg\">\n    ## Margherita\n    A simple classic: mozzarela, tomatoes, and basil.\n    \u003Cbutton type=\"button\">Add\u003C/button>\n    ## Capricciosa\n    A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n    \u003Cbutton type=\"button\">Add\u003C/button>\n  \u003C/div>\n\u003C/section>\n",[150,1637,1638,1686,1691,1709,1714,1719,1745,1750,1755,1781,1789],{"__ignoreMap":349},[353,1639,1640,1642,1644,1646,1648,1650,1652,1654,1656,1658,1660,1662,1664,1666,1668,1670,1672,1674,1676,1678,1680,1682,1684],{"class":355,"line":356},[353,1641,1261],{"class":1260},[353,1643,1169],{"class":1264},[353,1645,1280],{"class":1267},[353,1647,435],{"class":1260},[353,1649,467],{"class":466},[353,1651,1116],{"class":470},[353,1653,467],{"class":466},[353,1655,1303],{"class":1267},[353,1657,435],{"class":1260},[353,1659,467],{"class":466},[353,1661,1310],{"class":470},[353,1663,467],{"class":466},[353,1665,1268],{"class":1267},[353,1667,435],{"class":1260},[353,1669,467],{"class":466},[353,1671,1275],{"class":470},[353,1673,467],{"class":466},[353,1675,1291],{"class":1267},[353,1677,435],{"class":1260},[353,1679,467],{"class":466},[353,1681,1298],{"class":470},[353,1683,467],{"class":466},[353,1685,1315],{"class":1260},[353,1687,1688],{"class":355,"line":363},[353,1689,1690],{"class":370},"  # Our 
Pizza\n",[353,1692,1693,1695,1697,1699,1701,1703,1705,1707],{"class":355,"line":392},[353,1694,1320],{"class":1260},[353,1696,1172],{"class":1264},[353,1698,1268],{"class":1267},[353,1700,435],{"class":1260},[353,1702,467],{"class":466},[353,1704,1403],{"class":470},[353,1706,467],{"class":466},[353,1708,1315],{"class":1260},[353,1710,1711],{"class":355,"line":400},[353,1712,1713],{"class":370},"    ## Margherita\n",[353,1715,1716],{"class":355,"line":407},[353,1717,1718],{"class":370},"    A simple classic: mozzarela, tomatoes, and basil.\n",[353,1720,1721,1723,1725,1727,1729,1731,1733,1735,1737,1739,1741,1743],{"class":355,"line":428},[353,1722,1363],{"class":1260},[353,1724,751],{"class":1264},[353,1726,1303],{"class":1267},[353,1728,435],{"class":1260},[353,1730,467],{"class":466},[353,1732,751],{"class":470},[353,1734,467],{"class":466},[353,1736,1369],{"class":1260},[353,1738,1473],{"class":370},[353,1740,1375],{"class":1260},[353,1742,751],{"class":1264},[353,1744,1315],{"class":1260},[353,1746,1747],{"class":355,"line":444},[353,1748,1749],{"class":370},"    ## Capricciosa\n",[353,1751,1752],{"class":355,"line":457},[353,1753,1754],{"class":370},"    A rich taste: mozzarella, ham, mushrooms, artichokes, and 
olives.\n",[353,1756,1757,1759,1761,1763,1765,1767,1769,1771,1773,1775,1777,1779],{"class":355,"line":479},[353,1758,1363],{"class":1260},[353,1760,751],{"class":1264},[353,1762,1303],{"class":1267},[353,1764,435],{"class":1260},[353,1766,467],{"class":466},[353,1768,751],{"class":470},[353,1770,467],{"class":466},[353,1772,1369],{"class":1260},[353,1774,1473],{"class":370},[353,1776,1375],{"class":1260},[353,1778,751],{"class":1264},[353,1780,1315],{"class":1260},[353,1782,1783,1785,1787],{"class":355,"line":493},[353,1784,1606],{"class":1260},[353,1786,1172],{"class":1264},[353,1788,1315],{"class":1260},[353,1790,1791,1793,1795],{"class":355,"line":507},[353,1792,1375],{"class":1260},[353,1794,1169],{"class":1264},[353,1796,1315],{"class":1260},[1158,1798,1800,1803],{"id":1799},"k4-l6-m8-27",[150,1801,1802],{},"k=.4, l=.6, m=.8"," (27%)",[344,1805,1807],{"className":1251,"code":1806,"language":1253,"meta":349,"style":349},"\u003Csection>\n  # Our Pizza\n  \u003Cdiv>\n    ## Margherita\n    A simple classic:\n    \u003Cbutton>Add\u003C/button>\n    ## Capricciosa\n    A rich taste:\n    \u003Cbutton>Add\u003C/button>\n  \u003C/div>\n\u003C/section>\n",[150,1808,1809,1817,1821,1829,1833,1838,1854,1858,1863,1879,1887],{"__ignoreMap":349},[353,1810,1811,1813,1815],{"class":355,"line":356},[353,1812,1261],{"class":1260},[353,1814,1169],{"class":1264},[353,1816,1315],{"class":1260},[353,1818,1819],{"class":355,"line":363},[353,1820,1690],{"class":370},[353,1822,1823,1825,1827],{"class":355,"line":392},[353,1824,1320],{"class":1260},[353,1826,1172],{"class":1264},[353,1828,1315],{"class":1260},[353,1830,1831],{"class":355,"line":400},[353,1832,1713],{"class":370},[353,1834,1835],{"class":355,"line":407},[353,1836,1837],{"class":370},"    A simple 
classic:\n",[353,1839,1840,1842,1844,1846,1848,1850,1852],{"class":355,"line":428},[353,1841,1363],{"class":1260},[353,1843,751],{"class":1264},[353,1845,1369],{"class":1260},[353,1847,1473],{"class":370},[353,1849,1375],{"class":1260},[353,1851,751],{"class":1264},[353,1853,1315],{"class":1260},[353,1855,1856],{"class":355,"line":444},[353,1857,1749],{"class":370},[353,1859,1860],{"class":355,"line":457},[353,1861,1862],{"class":370},"    A rich taste:\n",[353,1864,1865,1867,1869,1871,1873,1875,1877],{"class":355,"line":479},[353,1866,1363],{"class":1260},[353,1868,751],{"class":1264},[353,1870,1369],{"class":1260},[353,1872,1473],{"class":370},[353,1874,1375],{"class":1260},[353,1876,751],{"class":1264},[353,1878,1315],{"class":1260},[353,1880,1881,1883,1885],{"class":355,"line":493},[353,1882,1606],{"class":1260},[353,1884,1172],{"class":1264},[353,1886,1315],{"class":1260},[353,1888,1889,1891,1893],{"class":355,"line":507},[353,1890,1375],{"class":1260},[353,1892,1169],{"class":1264},[353,1894,1315],{"class":1260},[1158,1896,1898,1901],{"id":1897},"k-l0-m-35",[150,1899,1900],{},"k→∞, l=0, ∀m"," (35%)",[344,1903,1905],{"className":1251,"code":1904,"language":1253,"meta":349,"style":349},"# Our Pizza\n## Margherita\nA simple classic: mozzarela, tomatoes, and basil.\nAn everyday choice!\n\u003Cbutton>Add\u003C/button>\n## Capricciosa\nA rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\nA true favourite!\n\u003Cbutton>Add\u003C/button>\n",[150,1906,1907,1912,1917,1922,1927,1943,1948,1953,1958],{"__ignoreMap":349},[353,1908,1909],{"class":355,"line":356},[353,1910,1911],{"class":370},"# Our Pizza\n",[353,1913,1914],{"class":355,"line":363},[353,1915,1916],{"class":370},"## Margherita\n",[353,1918,1919],{"class":355,"line":392},[353,1920,1921],{"class":370},"A simple classic: mozzarela, tomatoes, and basil.\n",[353,1923,1924],{"class":355,"line":400},[353,1925,1926],{"class":370},"An everyday 
choice!\n",[353,1928,1929,1931,1933,1935,1937,1939,1941],{"class":355,"line":407},[353,1930,1261],{"class":1260},[353,1932,751],{"class":1264},[353,1934,1369],{"class":1260},[353,1936,1473],{"class":370},[353,1938,1375],{"class":1260},[353,1940,751],{"class":1264},[353,1942,1315],{"class":1260},[353,1944,1945],{"class":355,"line":428},[353,1946,1947],{"class":370},"## Capricciosa\n",[353,1949,1950],{"class":355,"line":444},[353,1951,1952],{"class":370},"A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[353,1954,1955],{"class":355,"line":457},[353,1956,1957],{"class":370},"A true favourite!\n",[353,1959,1960,1962,1964,1966,1968,1970,1972],{"class":355,"line":479},[353,1961,1261],{"class":1260},[353,1963,751],{"class":1264},[353,1965,1369],{"class":1260},[353,1967,1473],{"class":370},[353,1969,1375],{"class":1260},[353,1971,751],{"class":1264},[353,1973,1315],{"class":1260},[11,1975,1976,1977,1979,1980,1982],{},"Asymptotic ",[150,1978,1133],{}," (kind of 'infinite' ",[150,1981,1133],{},") completely flattens the DOM, that is, leads to a full content linearisation similar to reader views as present in most browsers. Notably, it preserves all interactive elements like buttons – which are essential for a web agent.",[102,1984,1986],{"id":1985},"adaptived2snap",[78,1987,1988],{},"AdaptiveD2Snap",[11,1990,1991,1992,1994,1995,1997],{},"Fixed parameters might not be ideal for arbitrary DOMs – sourced from a landscape of web applications. We created ",[78,1993,1988],{}," – a wrapper for ",[78,1996,1089],{}," that infers suitable parameters from a given DOM in order to hit a certain token budget.",[102,1999,2001],{"id":2000},"implementation-integration","Implementation & Integration",[11,2003,2004,2005,2007],{},"Picture an LLM-based web agent that is premised on DOM snapshots. Implementing ",[78,2006,1089],{}," is simple: Deep clone the DOM, and feed it to the algorithm. Now, take the snapshot; that is, serialise the resulting DOM. 
Done.",[1146,2009,2010],{},[11,2011,2012,2013,2017],{},"Read our ",[51,2014,2016],{"href":2015},"/blog/a-gentle-introduction-to-ai-agents-for-the-web","gentle introduction to AI agents for the web"," to get started with high-level web agent concepts.",[11,2019,2020,2021,2023,2024,2029],{},"The open source ",[78,2022,1089],{}," API, provided as a ",[51,2025,2028],{"href":2026,"rel":2027},"https://github.com/webfuse-com/D2Snap",[55],"package on GitHub"," provides the following signature:",[344,2031,2035],{"className":2032,"code":2033,"language":2034,"meta":349,"style":349},"language-ts shiki shiki-themes catppuccin-latte night-owl","type DOM = Document | Element | string;\ntype Options = {\n  assignUniqueIDs?: boolean; // false\n  debug?: boolean;           // true\n};\n\nD2Snap.d2Snap(\n  dom: DOM,\n  k: number, l: number, m: number,\n  options?: Options\n): Promise\u003Cstring>\n\nD2Snap.adaptiveD2Snap(\n  dom: DOM,\n  maxTokens: number = 4096,\n  maxIterations: number = 5,\n  options?: Options\n): Promise\u003Cstring>\n\n","ts",[150,2036,2037,2067,2079,2097,2111,2116,2120,2132,2142,2159,2169,2185,2189,2200,2208,2221,2233,2241],{"__ignoreMap":349},[353,2038,2039,2042,2046,2048,2052,2055,2058,2060,2064],{"class":355,"line":356},[353,2040,2041],{"class":410},"type",[353,2043,2045],{"class":2044},"sXbZB"," DOM ",[353,2047,435],{"class":434},[353,2049,2051],{"class":2050},"s-DR7"," Document",[353,2053,2054],{"class":1260}," |",[353,2056,2057],{"class":2050}," Element",[353,2059,2054],{"class":1260},[353,2061,2063],{"class":2062},"scrte"," string",[353,2065,2066],{"class":380},";\n",[353,2068,2069,2071,2074,2076],{"class":355,"line":363},[353,2070,2041],{"class":410},[353,2072,2073],{"class":2044}," Options ",[353,2075,435],{"class":434},[353,2077,2078],{"class":380}," {\n",[353,2080,2081,2085,2088,2091,2094],{"class":355,"line":392},[353,2082,2084],{"class":2083},"swl0y","  assignUniqueIDs",[353,2086,2087],{"class":1260},"?:",[353,2089,2090],{"class":2062}," 
boolean",[353,2092,2093],{"class":380},";",[353,2095,2096],{"class":359}," // false\n",[353,2098,2099,2102,2104,2106,2108],{"class":355,"line":400},[353,2100,2101],{"class":2083},"  debug",[353,2103,2087],{"class":1260},[353,2105,2090],{"class":2062},[353,2107,2093],{"class":380},[353,2109,2110],{"class":359},"           // true\n",[353,2112,2113],{"class":355,"line":407},[353,2114,2115],{"class":380},"};\n",[353,2117,2118],{"class":355,"line":428},[353,2119,404],{"emptyLinePlaceholder":403},[353,2121,2122,2124,2127,2130],{"class":355,"line":444},[353,2123,1089],{"class":370},[353,2125,247],{"class":2126},"s5FwJ",[353,2128,2129],{"class":417},"d2Snap",[353,2131,454],{"class":370},[353,2133,2134,2137,2140],{"class":355,"line":457},[353,2135,2136],{"class":370},"  dom: ",[353,2138,2139],{"class":501},"DOM",[353,2141,476],{"class":380},[353,2143,2144,2147,2149,2152,2154,2157],{"class":355,"line":479},[353,2145,2146],{"class":370},"  k: number",[353,2148,381],{"class":380},[353,2150,2151],{"class":370}," l: number",[353,2153,381],{"class":380},[353,2155,2156],{"class":370}," m: number",[353,2158,476],{"class":380},[353,2160,2161,2164,2166],{"class":355,"line":493},[353,2162,2163],{"class":370},"  options",[353,2165,2087],{"class":434},[353,2167,2168],{"class":370}," Options\n",[353,2170,2171,2174,2178,2180,2183],{"class":355,"line":507},[353,2172,2173],{"class":370},"): 
",[353,2175,2177],{"class":2176},"s8Irk","Promise",[353,2179,1261],{"class":434},[353,2181,2182],{"class":370},"string",[353,2184,1315],{"class":434},[353,2186,2187],{"class":355,"line":513},[353,2188,404],{"emptyLinePlaceholder":403},[353,2190,2191,2193,2195,2198],{"class":355,"line":529},[353,2192,1089],{"class":370},[353,2194,247],{"class":2126},[353,2196,2197],{"class":417},"adaptiveD2Snap",[353,2199,454],{"class":370},[353,2201,2202,2204,2206],{"class":355,"line":534},[353,2203,2136],{"class":370},[353,2205,2139],{"class":501},[353,2207,476],{"class":380},[353,2209,2210,2213,2215,2219],{"class":355,"line":1526},[353,2211,2212],{"class":370},"  maxTokens: number ",[353,2214,435],{"class":434},[353,2216,2218],{"class":2217},"sZ_Zo"," 4096",[353,2220,476],{"class":380},[353,2222,2223,2226,2228,2231],{"class":355,"line":1535},[353,2224,2225],{"class":370},"  maxIterations: number ",[353,2227,435],{"class":434},[353,2229,2230],{"class":2217}," 5",[353,2232,476],{"class":380},[353,2234,2235,2237,2239],{"class":355,"line":1541},[353,2236,2163],{"class":370},[353,2238,2087],{"class":434},[353,2240,2168],{"class":370},[353,2242,2243,2245,2247,2249,2251],{"class":355,"line":1547},[353,2244,2173],{"class":370},[353,2246,2177],{"class":2176},[353,2248,1261],{"class":434},[353,2250,2182],{"class":370},[353,2252,1315],{"class":434},[11,2254,2255,2256,2258,2259,2264,2265,2270],{},"Moreover, ",[78,2257,1089],{}," is available on the ",[51,2260,2263],{"href":2261,"rel":2262},"https://dev.webfuse.com/automation-api",[55],"Webfuse Automation API",". 
",[51,2266,2269],{"href":2267,"rel":2268},"https://www.webfuse.com",[55],"Webfuse"," essentially is a proxy to seamlessly serve any existing web application with custom augmentations, such as a web agent widget.",[344,2272,2276],{"className":2273,"code":2274,"language":2275,"meta":349,"style":349},"language-js shiki shiki-themes catppuccin-latte night-owl","const domSnapshot = await browser.webfuseSession\n    .automation\n    .take_dom_snapshot({ modifier: 'downsample' })\n","js",[150,2277,2278,2283,2288],{"__ignoreMap":349},[353,2279,2280],{"class":355,"line":356},[353,2281,2282],{},"const domSnapshot = await browser.webfuseSession\n",[353,2284,2285],{"class":355,"line":363},[353,2286,2287],{},"    .automation\n",[353,2289,2290],{"class":355,"line":392},[353,2291,2292],{},"    .take_dom_snapshot({ modifier: 'downsample' })\n",[11,2294,2295,2296,2298],{},"Need precise control over the underlying ",[78,2297,1089],{}," invocation? Configure it exactly how you want:",[344,2300,2302],{"className":2273,"code":2301,"language":2275,"meta":349,"style":349},"const domSnapshot = await browser.webfuseSession\n    .automation\n    .take_dom_snapshot({\n        modifier: {\n            name: 'D2Snap',\n            params: { hierarchyRatio: 0.6, textRatio: 0.2, attributeRatio: 0.8 }\n        }\n    })\n",[150,2303,2304,2308,2312,2317,2322,2327,2332,2337],{"__ignoreMap":349},[353,2305,2306],{"class":355,"line":356},[353,2307,2282],{},[353,2309,2310],{"class":355,"line":363},[353,2311,2287],{},[353,2313,2314],{"class":355,"line":392},[353,2315,2316],{},"    .take_dom_snapshot({\n",[353,2318,2319],{"class":355,"line":400},[353,2320,2321],{},"        modifier: {\n",[353,2323,2324],{"class":355,"line":407},[353,2325,2326],{},"            name: 'D2Snap',\n",[353,2328,2329],{"class":355,"line":428},[353,2330,2331],{},"            params: { hierarchyRatio: 0.6, textRatio: 0.2, attributeRatio: 0.8 }\n",[353,2333,2334],{"class":355,"line":444},[353,2335,2336],{},"        
}\n",[353,2338,2339],{"class":355,"line":457},[353,2340,2341],{},"    })\n",[102,2343,2345],{"id":2344},"performance-evaluation","Performance Evaluation",[11,2347,2348,2349,2351,2352,2354,2355,2357],{},"Now for the moment of truth: How does ",[78,2350,1089],{}," stack up against the industry standard? We evaluated ",[78,2353,1089],{}," in comparison to a grounded GUI snapshot baseline close to those used by ",[78,2356,971],{}," – coloured bounding boxes around visible interactive elements.",[11,2359,2360,2361,2366],{},"To evaluate snapshots isolated from specific agent logic, we crafted a dataset that spans all UI states that occur while solving a related task. We sampled our dataset from the existing ",[51,2362,2365],{"href":2363,"rel":2364},"https://github.com/OSU-NLP-Group/Online-Mind2Web",[55],"Online-Mind2Web"," dataset.",[83,2368],{":width":2369,"alt":2370,"format":112,"loading":88,"src":2371},"800","Exemplary solution UI state trajectory of a defined web-based task","/blog/dom-downsampling-for-web-agents/3.png",[11,2373,2374],{},[1014,2375,2376],{},"Exemplary solution UI state trajectory for the task: “View the pricing plan for 'Business'. Specifically, we have 100 users. We need a 1PB storage quota and a 50 TB transfer quota.”",[11,2378,2379],{},"These are our key findings...",[1158,2381,2383],{"id":2382},"substantial-success-rates","Substantial Success Rates",[11,2385,2386,2387,2389],{},"The results exceeded our expectations. Not only did ",[78,2388,1089],{}," meet the baseline's performance – our best configuration outperformed it by a significant margin. 
Full linearisation matches the baseline in both performance and estimated model input token size order.",[83,2391],{":width":2392,"alt":2393,"format":112,"loading":88,"src":2394},"550","Success rate per web agent snapshot subject evaluated across the dataset","/blog/dom-downsampling-for-web-agents/4.png",[1014,2396,2397,2398,2405,2406,2408,2409,2412,2413,2416,2417,2420,2421,2424,2425,2428,2429,2432],{},"\n  Success rate per web agent snapshot subject evaluated across the dataset.\n  Labels: ",[150,2399,2400,2401],{},"GUI",[2402,2403,2404],"sub",{}," gr.",": Baseline, ",[150,2407,2139],{},": Raw DOM (cut-off at ~8K tokens), ",[150,2410,2411],{},"k( l m)",": Parameter values (e.g., ",[150,2414,2415],{},".9 .3 .6",", or ",[150,2418,2419],{},".4"," if equal). ",[150,2422,2423],{},"∞",": Linearisation, ",[150,2426,2427],{},"8192 / 32768",": via token-limited (resp.) ",[2430,2431,1988],"i",{},".\n",[1158,2434,2436],{"id":2435},"containable-token-and-byte-size","Containable Token and Byte Size",[11,2438,2439,2440,2442],{},"Even light downsampling delivers dramatic size reductions. Most ",[78,2441,1089],{}," configurations average just one token order above the baseline – a massive improvement over raw DOM snapshots. Better yet, most DOMs from the dataset could actually be downsampled to the baseline order. 
And while image data balloons in file size, our text-based approach stays lean and efficient.",[83,2444],{":width":2369,"alt":2445,"format":112,"loading":88,"src":2446},"Comparison of mean input size across and per subject","/blog/dom-downsampling-for-web-agents/5.png",[1014,2448,2449,2450,2453,2454,2456],{},"\n  Left: Comparison of mean input size (tokens vs bytes) across and per subject.",[2451,2452],"br",{},"\n  Right: Estimated input token size across the dataset created by a single ",[2430,2455,1089],{}," evaluation subject.\n",[1158,2458,2460],{"id":2459},"hierarchy-actually-matters","Hierarchy Actually Matters",[11,2462,2463],{},"Which UI feature matters most for LLM web agent backend performance? We alternated parameter configurations to find out. Interestingly, hierarchy reveals itself as the strongest of the three assessed features. Element extraction throws away hierarchy, which suggests that downsampling is a superior technique.",[1169,2465,2468,2473],{"className":2466,"dataFootnotes":349},[2467],"footnotes",[70,2469,2472],{"className":2470,"id":1031},[2471],"sr-only","Footnotes",[846,2474,2475,2489,2500,2511],{},[137,2476,2478,317,2482],{"id":2477},"user-content-fn-1",[51,2479,2480],{"href":2480,"rel":2481},"https://arxiv.org/abs/2210.03945",[55],[51,2483,2488],{"href":2484,"ariaLabel":2485,"className":2486,"dataFootnoteBackref":349},"#user-content-fnref-1","Back to reference 1",[2487],"data-footnote-backref","↩",[137,2490,2492,317,2495],{"id":2491},"user-content-fn-2",[51,2493,1095],{"href":1095,"rel":2494},[55],[51,2496,2488],{"href":2497,"ariaLabel":2498,"className":2499,"dataFootnoteBackref":349},"#user-content-fnref-2","Back to reference 2",[2487],[137,2501,2503,317,2506],{"id":2502},"user-content-fn-3",[51,2504,2026],{"href":2026,"rel":2505},[55],[51,2507,2488],{"href":2508,"ariaLabel":2509,"className":2510,"dataFootnoteBackref":349},"#user-content-fnref-3","Back to reference 
3",[2487],[137,2512,2514,317,2518],{"id":2513},"user-content-fn-4",[51,2515,2516],{"href":2516,"rel":2517},"https://aclanthology.org/W04-3252",[55],[51,2519,2488],{"href":2520,"ariaLabel":2521,"className":2522,"dataFootnoteBackref":349},"#user-content-fnref-4","Back to reference 4",[2487],[876,2524,2525],{},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .sY2RG, html code.shiki .sY2RG{--shiki-default:#1E66F5;--shiki-dark:#CAECE6}html pre.shiki code .swkLt, html code.shiki .swkLt{--shiki-default:#DF8E1D;--shiki-default-font-style:inherit;--shiki-dark:#C5E478;--shiki-dark-font-style:italic}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sfrMT, html code.shiki .sfrMT{--shiki-default:#40A02B;--shiki-dark:#ECC48D}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html code.shiki 
.sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s-_ek, html code.shiki .s-_ek{--shiki-default:#179299;--shiki-dark:#C792EA}html pre.shiki code .s-DR7, html code.shiki .s-DR7{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#FFCB8B;--shiki-dark-font-style:inherit}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .swl0y, html code.shiki .swl0y{--shiki-default:#4C4F69;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .sDmS1, html code.shiki .sDmS1{--shiki-default:#7C7F93;--shiki-default-font-style:italic;--shiki-dark:#637777;--shiki-dark-font-style:italic}html pre.shiki code .s5FwJ, html code.shiki .s5FwJ{--shiki-default:#179299;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .sNstc, html code.shiki .sNstc{--shiki-default:#1E66F5;--shiki-default-font-style:italic;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .sqxXB, html code.shiki .sqxXB{--shiki-default:#4C4F69;--shiki-dark:#82AAFF}html pre.shiki code .s8Irk, html code.shiki .s8Irk{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#C5E478;--shiki-dark-font-style:inherit}html pre.shiki code .sZ_Zo, html code.shiki 
.sZ_Zo{--shiki-default:#FE640B;--shiki-dark:#F78C6C}",{"title":349,"searchDepth":363,"depth":363,"links":2527},[2528,2532,2533,2540],{"id":979,"depth":363,"text":980,"children":2529},[2530,2531],{"id":990,"depth":392,"text":991},{"id":1019,"depth":392,"text":1020},{"id":1068,"depth":363,"text":1069},{"id":1086,"depth":363,"text":1089,"children":2534},[2535,2536,2537,2538,2539],{"id":1123,"depth":392,"text":1124},{"id":1244,"depth":392,"text":1245},{"id":1985,"depth":392,"text":1988},{"id":2000,"depth":392,"text":2001},{"id":2344,"depth":392,"text":2345},{"id":1031,"depth":363,"text":2472},"2025-08-18","We propose D2Snap – a first-of-its-kind downsampling algorithm for DOMs. D2Snap can be used as a pre-processing technique for DOM snapshots to optimise web agency context quality and token costs.",{"homepage":403,"relatedLinks":2544},[2545,2549,2552],{"text":2546,"href":2547,"description":2548},"What is a Website Snapshot?","/blog/snapshots-provide-llms-with-website-state","Learn what a website snapshot is and how to utilise it for web agents",{"text":2550,"href":2015,"description":2551},"What is a Web Agent?","Learn the basics of web agents",{"text":2263,"href":2553,"external":403,"description":2554},"https://dev.webfuse.com/automation-api#take_dom_snapshot","Check out the Webfuse Automation API",{"title":945,"description":2542},{"loc":121},"blog/1012.dom-downsampling-for-llm-based-web-agents",[893,2559,2560,2561,939,2562],"browser-agents","llms","llm-context","web-automation","bGJtg_9k7O95O2CJswaRFj4ONGhX4hGr_8aL5dhDZms",{"id":2565,"title":2566,"authorId":946,"body":2567,"category":893,"created":3295,"description":3296,"extension":896,"faqs":917,"featurePriority":917,"head":917,"landingPath":917,"meta":3297,"navigation":403,"ogImage":917,"path":2015,"robots":917,"schemaOrg":917,"seo":3306,"sitemap":3307,"stem":3308,"tags":3309,"__hash__":3310},"blog/blog/1011.a-gentle-introduction-to-ai-agents-for-the-web.md","A Gentle Introduction to AI Agents for the 
Web",{"type":8,"value":2568,"toc":3276},[2569,2583,2586,2593,2599,2603,2606,2621,2625,2635,2639,2643,2656,2660,2664,2667,2672,2676,2685,2689,2700,2705,2709,2727,2731,2737,2840,2843,3076,3092,3096,3099,3104,3108,3111,3115,3133,3158,3165,3169,3207,3210,3221,3225,3228,3256,3260,3268,3273],[11,2570,2571,2572,961,2576,1137,2579,2582],{},"In no time, AI became a natural part of modern web interfaces. AI agents for the web are enjoying recent hype, sparked by the likes of ",[51,2573,960],{"href":2574,"rel":2575},"https://openai.com/index/introducing-operator/",[55],[51,2577,966],{"href":964,"rel":2578},[55],[51,2580,971],{"href":969,"rel":2581},[55],". By now, it is within reach to automate arbitrary web-based tasks, such as booking the cheapest flight from Berlin to Amsterdam.",[70,2584,2550],{"id":2585},"what-is-a-web-agent",[11,2587,2588,2589,2592],{},"For starters, let us break down the term ",[15,2590,2591],{},"web AI agent",": An agent is an entity that autonomously acts on behalf of another entity. An artificially intelligent agent is an application that acts on behalf of a human. In contrast to non-AI computer agents, it solves complex tasks with at least human-grade effectiveness and efficiency. For a human-centric web, web agents have deliberately been designed to browse the web in a human fashion – through UIs rather than APIs.",[83,2594],{":width":2595,"alt":2596,"format":2597,"loading":88,"src":2598},"610","High-level agent description comparing human and computer agents","svg","/blog/a-gentle-introduction-to-ai-agents-for-the-web/1.svg",[102,2600,2602],{"id":2601},"the-role-of-frontier-llms","The Role of Frontier LLMs",[11,2604,2605],{},"Web agents have been a vague desire for a long time. AI agents used to rely on complete models of a problem domain in order to allow (heuristic) search through problem states. 
Such models would comprise the problem world (e.g., a chessboard), actors (pawns, rooks, etc.), possible actions per actor (rook moves straight), and constraints (e.g., at most one piece per square). A heterogeneous space of web application UIs describes the problem domain of a web agent: how to understand a web page, and how to interact with it to solve the declared task?",[11,2607,2608,2609,2616,2617,2620],{},"Frontier LLMs disrupted the AI agent world: explicit problem domain models, often infeasible to build, can now be replaced by an LLM. The LLM thereby acts as an instantaneous domain model backend that can be consulted with twofold context: serialised problem state, such as a chess position code (",[78,2610,2611,2612,2615],{},"“",[353,2613,2614],{},"..."," e4 e5 2. Nc3 f5”","), and the respective task (",[78,2618,2619],{},"“What is the best move for white?”","). For web agents, problem state corresponds to the currently browsed web application's runtime state, for instance, a screenshot.",[102,2622,2624],{"id":2623},"generalist-web-agents","Generalist Web Agents",[11,2626,2627,2628,1137,2631,2634],{},"Generalist web agents are supposed to solve arbitrary tasks through a web browser. Web-based tasks can be as diverse as ",[78,2629,2630],{},"“Find a picture of a cat.”",[78,2632,2633],{},"“Book the cheapest flight from Berlin to Amsterdam tomorrow afternoon (business class, window seat).”"," In reality, generalist agents still fail at uncommon or overly precise tasks. While they have been critically acclaimed, they mainly act as early proofs-of-concept. 
Tasks that a generalist agent can indeed solve promise great results with a corresponding specialist agent.",[83,2636],{":width":951,"alt":2637,"format":112,"loading":88,"src":2638},"Screenshot of a generalist web agent UI (Director)","/blog/a-gentle-introduction-to-ai-agents-for-the-web/2.png",[102,2640,2642],{"id":2641},"specialist-web-agents","Specialist Web Agents",[11,2644,2645,2646,2649,2650,2655],{},"Unlike generalist agents, specialist web agents are constrained to a certain task and application domain. Specialist agents bear the major share of commercial value. The most prominent examples are modal chat agents that provide users with on-page help. Picture a little floating widget that can be chatted to via text or voice input. In most cases, in fact, the term ",[78,2647,2648],{},"web (AI) agent"," refers to chat agents. Chat agents – text or voice – can be implemented on top of virtually any existing website. Frontier LLMs provide a lot of common sense out of the box. A ",[51,2651,2654],{"href":2652,"rel":2653},"https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/system-prompts",[55],"system prompt"," can, moreover, be leveraged to drive specialist agent quality for the respective problem domain.",[83,2657],{":width":951,"alt":2658,"format":112,"loading":88,"src":2659},"Screenshots of two modal specialist web agent UIs augmenting an underlying website's UI","/blog/a-gentle-introduction-to-ai-agents-for-the-web/3.png",[70,2661,2663],{"id":2662},"how-does-a-web-agent-work","How Does a Web Agent Work?",[11,2665,2666],{},"LLM-based web agents are premised on a more or less uniform architecture. 
The agent application acts as a mediator between a web browser (environment) and the LLM backend (model).",[83,2668],{":width":2669,"alt":2670,"format":2597,"loading":88,"src":2671},"480","High-level web agent architecture component view","/blog/a-gentle-introduction-to-ai-agents-for-the-web/4.svg",[102,2673,2675],{"id":2674},"the-agent-lifecycle","The Agent Lifecycle",[11,2677,2678,2679,2684],{},"To reduce a user's cognitive load, solving a web-based task is usually chunked into a sequence of UI states. Consider looking for rental apartments on ",[51,2680,2683],{"href":2681,"rel":2682},"https://www.redfin.com",[55],"redfin.com",": In the first step, you specify a location. Only subsequently are you provided with a grid of available apartments for that location.",[83,2686],{":width":951,"alt":2687,"format":112,"loading":88,"src":2688},"Example of separated UI states in a rental home search application","/blog/a-gentle-introduction-to-ai-agents-for-the-web/5.png",[11,2690,2691,2692,2699],{},"Web agent logic is iterative, both because web interaction is sequential and because agent interaction is conversational. Browsing the web, human and computer agents represent users alike. That said, Norman's well-known ",[51,2693,2696],{"href":2694,"rel":2695},"https://mitpress.mit.edu/9780262640374/the-design-of-everyday-things/",[55],[78,2697,2698],{},"Seven Stages of Action",", which hierarchically model the human cognition cycle, transfer to the web agent lifecycle. For each UI state in a web browser (environment) and web-based task (action intention), decide where to click, type, etc. (action planning), and perform those clicks, etc. (action execution). Afterwards, perceive, interpret, and evaluate the results of those actions in the web browser (state). As long as there is a mismatch between the evaluated state and the declared goal state, repeat that cycle. 
If necessary, prompt the user for additional required information.",[83,2701],{":width":2702,"alt":2703,"format":2597,"loading":88,"src":2704},"580","Donald Norman's 'Seven Stages of Action' model of the human cognition cycle that transfers to non-human agents","/blog/a-gentle-introduction-to-ai-agents-for-the-web/6.svg",[102,2706,2708],{"id":2707},"web-context-for-llms","Web Context for LLMs",[11,2710,2711,2712,2714,2715,2718,2719,2722,2723,2726],{},"The gap from an agent towards the environment, according to ",[78,2713,2698],{},", is known as the ",[78,2716,2717],{},"gulf of execution",". In real-world scenarios, how to act in the environment with respect to a planned sequence of actions might be difficult (e.g., how to actually open the trunk of a new car?). Arguably, web agents face a novel ",[78,2720,2721],{},"gulf of intention"," towards the action planning stage: how to serialise a currently browsed web page's runtime state for LLMs? ",[78,2724,2725],{},"Snapshot"," is a more comprehensive term to describe the serialisation of a web page's current runtime state. Screenshots, for instance, represent a type of snapshot that closely resembles how humans perceive a web page at a given point in time. But are they as accessible to LLMs?",[102,2728,2730],{"id":2729},"agentic-ui-interaction","Agentic UI Interaction",[11,2732,2733,2734,2736],{},"With a qualified set of well-defined actuation methods, web agents are able to close the ",[78,2735,2717],{}," quite well. HTML element types strongly afford a certain action (e.g., click a button, type into a field). 
Below is what an actuation schema presented to the LLM backend could look like:",[344,2738,2740],{"className":2032,"code":2739,"language":2034,"meta":349,"style":349},"type ActuationSchema = {\n    thought: string;\n    action: \"click\"\n        | \"scroll\"\n        | \"type\";\n    cssSelector: string;\n    data?: string;\n}[];\n",[150,2741,2742,2756,2768,2784,2796,2808,2819,2830],{"__ignoreMap":349},[353,2743,2744,2747,2750,2753],{"class":355,"line":356},[353,2745,2746],{"class":410},"type",[353,2748,2749],{"class":2044}," ActuationSchema",[353,2751,2752],{"class":370}," = ",[353,2754,2755],{"class":380},"{\n",[353,2757,2758,2761,2764,2766],{"class":355,"line":363},[353,2759,2760],{"class":370},"    thought",[353,2762,2763],{"class":1260},":",[353,2765,2063],{"class":2062},[353,2767,2066],{"class":380},[353,2769,2770,2773,2775,2778,2781],{"class":355,"line":392},[353,2771,2772],{"class":370},"    action",[353,2774,2763],{"class":1260},[353,2776,2777],{"class":466}," \"",[353,2779,772],{"class":2780},"sgAC-",[353,2782,2783],{"class":466},"\"\n",[353,2785,2786,2789,2791,2794],{"class":355,"line":400},[353,2787,2788],{"class":1260},"        |",[353,2790,2777],{"class":466},[353,2792,2793],{"class":2780},"scroll",[353,2795,2783],{"class":466},[353,2797,2798,2800,2802,2804,2806],{"class":355,"line":407},[353,2799,2788],{"class":1260},[353,2801,2777],{"class":466},[353,2803,2041],{"class":2780},[353,2805,467],{"class":466},[353,2807,2066],{"class":380},[353,2809,2810,2813,2815,2817],{"class":355,"line":428},[353,2811,2812],{"class":370},"    cssSelector",[353,2814,2763],{"class":1260},[353,2816,2063],{"class":2062},[353,2818,2066],{"class":380},[353,2820,2821,2824,2826,2828],{"class":355,"line":444},[353,2822,2823],{"class":370},"    
data",[353,2825,2087],{"class":1260},[353,2827,2063],{"class":2062},[353,2829,2066],{"class":380},[353,2831,2832,2835,2838],{"class":355,"line":457},[353,2833,2834],{"class":380},"}",[353,2836,2837],{"class":370},"[]",[353,2839,2066],{"class":380},[11,2841,2842],{},"And a suggested actions response could, in turn, look as follows:",[344,2844,2848],{"className":2845,"code":2846,"language":2847,"meta":349,"style":349},"language-json shiki shiki-themes catppuccin-latte night-owl","[\n    {\n        \"thought\": \"Scroll newsletter cta into view\",\n        \"action\": \"scroll\",\n        \"cssSelector\": \"section#newsletter\"\n    },\n    {\n        \"thought\": \"Type email address to newsletter cta\",\n        \"action\": \"type\",\n        \"cssSelector\": \"section#newsletter > input\",\n        \"data\": \"user@example.org\"\n    },\n    {\n        \"thought\": \"Submit newsletter sign up\",\n        \"action\": \"click\",\n        \"cssSelector\": \"section#newsletter > button\"\n    }\n]\n","json",[150,2849,2850,2855,2860,2884,2903,2921,2926,2930,2949,2967,2986,3004,3008,3012,3031,3049,3066,3071],{"__ignoreMap":349},[353,2851,2852],{"class":355,"line":356},[353,2853,2854],{"class":380},"[\n",[353,2856,2857],{"class":355,"line":363},[353,2858,2859],{"class":380},"    {\n",[353,2861,2862,2866,2870,2872,2874,2876,2880,2882],{"class":355,"line":392},[353,2863,2865],{"class":2864},"srFR9","        \"",[353,2867,2869],{"class":2868},"s30W1","thought",[353,2871,467],{"class":2864},[353,2873,2763],{"class":380},[353,2875,2777],{"class":466},[353,2877,2879],{"class":2878},"sCC8C","Scroll newsletter cta into 
view",[353,2881,467],{"class":466},[353,2883,476],{"class":380},[353,2885,2886,2888,2891,2893,2895,2897,2899,2901],{"class":355,"line":400},[353,2887,2865],{"class":2864},[353,2889,2890],{"class":2868},"action",[353,2892,467],{"class":2864},[353,2894,2763],{"class":380},[353,2896,2777],{"class":466},[353,2898,2793],{"class":2878},[353,2900,467],{"class":466},[353,2902,476],{"class":380},[353,2904,2905,2907,2910,2912,2914,2916,2919],{"class":355,"line":407},[353,2906,2865],{"class":2864},[353,2908,2909],{"class":2868},"cssSelector",[353,2911,467],{"class":2864},[353,2913,2763],{"class":380},[353,2915,2777],{"class":466},[353,2917,2918],{"class":2878},"section#newsletter",[353,2920,2783],{"class":466},[353,2922,2923],{"class":355,"line":428},[353,2924,2925],{"class":380},"    },\n",[353,2927,2928],{"class":355,"line":444},[353,2929,2859],{"class":380},[353,2931,2932,2934,2936,2938,2940,2942,2945,2947],{"class":355,"line":457},[353,2933,2865],{"class":2864},[353,2935,2869],{"class":2868},[353,2937,467],{"class":2864},[353,2939,2763],{"class":380},[353,2941,2777],{"class":466},[353,2943,2944],{"class":2878},"Type email address to newsletter cta",[353,2946,467],{"class":466},[353,2948,476],{"class":380},[353,2950,2951,2953,2955,2957,2959,2961,2963,2965],{"class":355,"line":479},[353,2952,2865],{"class":2864},[353,2954,2890],{"class":2868},[353,2956,467],{"class":2864},[353,2958,2763],{"class":380},[353,2960,2777],{"class":466},[353,2962,2041],{"class":2878},[353,2964,467],{"class":466},[353,2966,476],{"class":380},[353,2968,2969,2971,2973,2975,2977,2979,2982,2984],{"class":355,"line":493},[353,2970,2865],{"class":2864},[353,2972,2909],{"class":2868},[353,2974,467],{"class":2864},[353,2976,2763],{"class":380},[353,2978,2777],{"class":466},[353,2980,2981],{"class":2878},"section#newsletter > 
input",[353,2983,467],{"class":466},[353,2985,476],{"class":380},[353,2987,2988,2990,2993,2995,2997,2999,3002],{"class":355,"line":507},[353,2989,2865],{"class":2864},[353,2991,2992],{"class":2868},"data",[353,2994,467],{"class":2864},[353,2996,2763],{"class":380},[353,2998,2777],{"class":466},[353,3000,3001],{"class":2878},"user@example.org",[353,3003,2783],{"class":466},[353,3005,3006],{"class":355,"line":513},[353,3007,2925],{"class":380},[353,3009,3010],{"class":355,"line":529},[353,3011,2859],{"class":380},[353,3013,3014,3016,3018,3020,3022,3024,3027,3029],{"class":355,"line":534},[353,3015,2865],{"class":2864},[353,3017,2869],{"class":2868},[353,3019,467],{"class":2864},[353,3021,2763],{"class":380},[353,3023,2777],{"class":466},[353,3025,3026],{"class":2878},"Submit newsletter sign up",[353,3028,467],{"class":466},[353,3030,476],{"class":380},[353,3032,3033,3035,3037,3039,3041,3043,3045,3047],{"class":355,"line":1526},[353,3034,2865],{"class":2864},[353,3036,2890],{"class":2868},[353,3038,467],{"class":2864},[353,3040,2763],{"class":380},[353,3042,2777],{"class":466},[353,3044,772],{"class":2878},[353,3046,467],{"class":466},[353,3048,476],{"class":380},[353,3050,3051,3053,3055,3057,3059,3061,3064],{"class":355,"line":1535},[353,3052,2865],{"class":2864},[353,3054,2909],{"class":2868},[353,3056,467],{"class":2864},[353,3058,2763],{"class":380},[353,3060,2777],{"class":466},[353,3062,3063],{"class":2878},"section#newsletter > button",[353,3065,2783],{"class":466},[353,3067,3068],{"class":355,"line":1541},[353,3069,3070],{"class":380},"    }\n",[353,3072,3073],{"class":355,"line":1547},[353,3074,3075],{"class":380},"]\n",[1146,3077,3078],{},[11,3079,3080,3085,3086,3091],{},[51,3081,3084],{"href":3082,"rel":3083},"https://platform.openai.com/docs/guides/function-calling",[55],"Function Calling"," and the ",[51,3087,3090],{"href":3088,"rel":3089},"https://modelcontextprotocol.io",[55],"Model Context Protocol"," represent two ends to outsource an explicit 
actuation model – server- and client-side, respectively.",[102,3093,3095],{"id":3094},"agentic-ui-augmentation","Agentic UI Augmentation",[11,3097,3098],{},"An agent represents yet another feature to integrate with an application and its UI. Discoverability and availability, however, are among the most fundamental requirements of a web agent. Evidently, when a user experiences UI/UX friction, at least the agent should be interactive. That said, a scrolling modal web agent UI has been the go-to approach, that is, a little floating widget on top of the underlying application's UI. It comes with a major advantage: the agent application can be decoupled from the underlying, self-contained application.",[83,3100],{":width":3101,"alt":3102,"format":2597,"loading":88,"src":3103},"360","Depiction of a web agent application augmenting an underlying application in an isolated layer","/blog/a-gentle-introduction-to-ai-agents-for-the-web/7.svg",[70,3105,3107],{"id":3106},"how-to-build-a-web-agent","How to Build a Web Agent?",[11,3109,3110],{},"Believe it or not: enhancing an existing web application with a purposeful agent is a lower-hanging fruit. The evolving agent ecosystem provides you with a spectrum of solutions: instantly use a pre-compiled agent, tweak a templated agent, or develop an agent from scratch. Either way, LLMs and web browsers exist for reuse, boiling down agent development to LLM context engineering, and UI augmentation.",[102,3112,3114],{"id":3113},"develop-a-web-agent","Develop a Web Agent",[11,3116,3117,3118,3121,3122,1137,3127,3132],{},"Opting for a ",[15,3119,3120],{},"pre-compiled agent"," does not necessarily involve any actual development step. Instead, pre-compiled agents allow for high-level configuration through an agent-as-a-service provider's interface. 
Popular agent-as-a-service providers are, i.a., ",[51,3123,3126],{"href":3124,"rel":3125},"https://elevenlabs.io/conversational-ai",[55],"ElevenLabs",[51,3128,3131],{"href":3129,"rel":3130},"https://www.intercom.com/drlp/ai-agent",[55],"Intercom",". Serviced agents hide LLM communication and potentially interaction with a web browser behind the configuration interface.",[11,3134,3135,3136,3139,3140,3145,3146,3151,3152,3157],{},"Using a ",[15,3137,3138],{},"templated agent"," resembles the agent-as-a-service approach on a lower level. Openly sourced from a ",[51,3141,3144],{"href":3142,"rel":3143},"https://github.com/webfuse-com/agent-extension-blueprint",[55],"code repository",", templated agents allow for any kind of development tweaks. Favourably, agent templates shortcut integration with ",[51,3147,3150],{"href":3148,"rel":3149},"https://openai.com/api/",[55],"LLM APIs"," and web ",[51,3153,3156],{"href":3154,"rel":3155},"https://developer.mozilla.org/en-US/docs/Web/API",[55],"browser APIs",". Using a templated agent usually represents the preferable, best-of-both-worlds approach; common- and best-practice code snippets are available from the beginning, but everything can be customised as desired.",[11,3159,3160,3161,3164],{},"Of course, developing an ",[15,3162,3163],{},"agent from scratch"," is always an option. It is preferable whenever agent requirements deviate to a large extent from what exists in the service or template landscape.",[102,3166,3168],{"id":3167},"deploy-a-web-agent","Deploy a Web Agent",[11,3170,3171,3172,192,3177,3182,3183,3188,3189,3194,3195,3200,3201,3206],{},"When web agent code lives side-by-side with the augmented application's code, agent deployment is covered by a generic pipeline. 
For example: ",[51,3173,3176],{"href":3174,"rel":3175},"https://eslint.org",[55],"linting",[51,3178,3181],{"href":3179,"rel":3180},"https://prettier.io",[55],"formatting"," agent code, ",[51,3184,3187],{"href":3185,"rel":3186},"https://esbuild.github.io",[55],"transpiling and bundling"," agent modules, ",[51,3190,3193],{"href":3191,"rel":3192},"https://www.cypress.io",[55],"testing"," the agent, ",[51,3196,3199],{"href":3197,"rel":3198},"https://pages.cloudflare.com",[55],"hosting"," the agent bundle, and ",[51,3202,3205],{"href":3203,"rel":3204},"https://docs.github.com/en/actions/get-started/continuous-integration",[55],"triggering"," post-deployment events. In that case, an agent represents a modular feature component in the application, no different from, for instance, a sign-up component.",[11,3208,3209],{},"Web agent source code right inside the application codebase comes at a cost:",[134,3211,3212,3215,3218],{},[137,3213,3214],{},"Agent developers can manipulate the source code of the underlying application.",[137,3216,3217],{},"Agent functionality could introduce side effects on the underlying application.",[137,3219,3220],{},"Agent changes require deployment of the entire application.",[102,3222,3224],{"id":3223},"best-practices-of-agentic-ux","Best Practices of Agentic UX",[11,3226,3227],{},"When designing user experiences for agent-enhanced applications, there are a few things to consider:",[134,3229,3230,3231,3230,3240,3230,3248],{},"\n    ",[137,3232,3233,3234,3233,3237,3239],{},"\n        ",[15,3235,3236],{},"Stream input and output to reduce latency",[2451,3238],{},"\n        LLMs (re-)introduce noticeable communication round-trip time. 
To reduce wait time for the human user, stream chunks of data as soon as they are available.\n    ",[137,3241,3233,3242,3233,3245,3247],{},[15,3243,3244],{},"Provide fine-grained feedback to bridge high latency",[2451,3246],{},"\n        Human attention is sensitive to several seconds of [system response time](https://www.nngroup.com/articles/response-times-3-important-limits/). Periodically provide agent _thoughts_ as feedback to perceptibly break down round-trip time.\n    ",[137,3249,3233,3250,3233,3253,3255],{},[15,3251,3252],{},"Always prompt the human user for consent to perform critical actions",[2451,3254],{},"\n        Some actions in a web application lead to irreversible or significant changes of state. Never have the agent perform such actions on behalf of the user without explicitly asking for permission.\n    ",[102,3257,3259],{"id":3258},"non-invasive-web-agents-with-webfuse","Non-Invasive Web Agents with Webfuse",[11,3261,3262,3267],{},[51,3263,3265],{"href":2267,"rel":3264},[55],[15,3266,2269],{}," is a configurable web proxy that lets you augment any web application. As pictured, web agents represent highly self-contained applications. Moreover, web agents and underlying applications communicate at runtime in the client. This opens up opportunities to overcome the above-mentioned drawbacks with Webfuse: develop web agents with a sandbox extension methodology, and deploy them through the low-latency proxy layer. On demand, seamlessly serve users with your agent-enhanced website. 
Benefit from information hiding, safe code, and fewer deployments.",[249,3269],{":demoAction":3270,"heading":3271,"subtitle":3272},"{\"text\":\"Read more\",\"showIcon\":false,\"href\":\"https://www.webfuse.com/blog/category/ai-agents\"}","Deploy Web Agents with Webfuse","Develop or deploy web agents in minutes; serve agent-enhanced websites through an isolated application layer.",[876,3274,3275],{},"html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html code.shiki .sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sgAC-, html code.shiki .sgAC-{--shiki-default:#40A02B;--shiki-default-font-style:italic;--shiki-dark:#ECC48D;--shiki-dark-font-style:inherit}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark 
.shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .srFR9, html code.shiki .srFR9{--shiki-default:#7C7F93;--shiki-dark:#7FDBCA}html pre.shiki code .s30W1, html code.shiki .s30W1{--shiki-default:#1E66F5;--shiki-dark:#7FDBCA}html pre.shiki code .sCC8C, html code.shiki .sCC8C{--shiki-default:#40A02B;--shiki-dark:#C789D6}",{"title":349,"searchDepth":363,"depth":363,"links":3277},[3278,3283,3289],{"id":2585,"depth":363,"text":2550,"children":3279},[3280,3281,3282],{"id":2601,"depth":392,"text":2602},{"id":2623,"depth":392,"text":2624},{"id":2641,"depth":392,"text":2642},{"id":2662,"depth":363,"text":2663,"children":3284},[3285,3286,3287,3288],{"id":2674,"depth":392,"text":2675},{"id":2707,"depth":392,"text":2708},{"id":2729,"depth":392,"text":2730},{"id":3094,"depth":392,"text":3095},{"id":3106,"depth":363,"text":3107,"children":3290},[3291,3292,3293,3294],{"id":3113,"depth":392,"text":3114},{"id":3167,"depth":392,"text":3168},{"id":3223,"depth":392,"text":3224},{"id":3258,"depth":392,"text":3259},"2025-06-15","LLMs only recently enabled serviceable web agents: autonomous systems that browse web on behalf of a human. 
Get started with fundamental methodology, key design challenges, and technological opportunities.",{"homepage":403,"relatedLinks":3298},[3299,3300,3304],{"text":2546,"href":2547,"description":2548},{"text":3301,"href":3302,"description":3303},"Develop an AI Agent for Any Website with Webfuse","/blog/develop-an-ai-agent-for-any-website-with-webfuse","Learn how to develop and deploy a web agent for any website with Webfuse",{"text":2263,"href":3305,"external":403,"description":2554},"https://dev.webfuse.com/automation-api/",{"title":2566,"description":3296},{"loc":2015},"blog/1011.a-gentle-introduction-to-ai-agents-for-the-web",[893,2559,2560,939,2562],"9anWTMfg6llLSdye3e9qWZZZcEAZcELLMk_vpnixn3M",1780390077360]