[{"data":1,"prerenderedAt":4185},["ShallowReactive",2],{"/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent":3,"related-/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent":1846},{"id":4,"title":5,"authorId":6,"body":7,"category":1804,"created":1805,"description":1806,"extension":1807,"faqs":1808,"featurePriority":1821,"head":1822,"landingPath":1822,"meta":1823,"navigation":238,"ogImage":1822,"path":1837,"robots":1822,"schemaOrg":1822,"seo":1838,"sitemap":1839,"stem":1840,"tags":1841,"__hash__":1845},"blog/blog/1042.how-to-build-an-ai-support-agent-with-alibaba-page-agent.md","How to Build an AI Support Agent with Alibaba Page Agent","salome-koshadze",{"type":8,"value":9,"toc":1788},"minimark",[10,19,22,29,34,56,65,68,73,76,83,135,150,154,157,162,165,191,194,397,402,405,463,468,471,475,478,498,501,507,511,514,525,529,536,540,544,561,568,911,915,922,929,1300,1304,1307,1310,1588,1591,1605,1609,1612,1618,1624,1634,1640,1649,1653,1659,1664,1670,1674,1680,1685,1691,1695,1698,1701,1707,1713,1737,1743,1753,1759,1768,1772,1775,1778,1781,1784],[11,12,13,14,18],"p",{},"Alibaba's Page Agent flips the usual browser-automation model on its head: instead of driving a browser from the outside with Selenium or Puppeteer, it runs an AI agent ",[15,16,17],"em",{},"inside"," the page that reads the live DOM, reasons about it with an LLM, and clicks and types as the real user. This guide explains how that works, then walks through building a working support chatbot for a fictional e-commerce store that handles a lost-package case end-to-end.",[11,20,21],{},"The bigger lesson is about where web agents should live. Page Agent is a strong pattern when you control the application and can add code to the page. Many production workflows are messier: they happen in third-party apps, legacy tools, customer sessions, or locked-down environments where you cannot change the source code or ask every user to install an extension. 
That is the gap Webfuse is built around: making the live web session itself programmable, shareable, and agent-ready without depending on the original application's release cycle.",[23,24],"video-frame",{"caption":25,"poster":26,"src":27,"srcMp4":28},"Page Agent handling a lost-package support flow end-to-end inside the live page.","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent/intro-poster.jpg","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent/intro.webm","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent/intro.mp4",[30,31,33],"h2",{"id":32},"what-alibaba-page-agent-actually-does","What Alibaba Page Agent Actually Does",[11,35,36,43,44,49,50,55],{},[37,38,42],"a",{"href":39,"rel":40},"https://github.com/alibaba/page-agent",[41],"nofollow","Page Agent"," is an open-source JavaScript/TypeScript library developed by Alibaba that embeds a GUI AI agent directly into any webpage. It allows for natural language control over web interfaces through direct DOM interaction, requiring minimal or zero infrastructure. This approach is different from traditional browser automation tools like ",[37,45,48],{"href":46,"rel":47},"https://www.selenium.dev/",[41],"Selenium"," or ",[37,51,54],{"href":52,"rel":53},"https://pptr.dev/",[41],"Puppeteer",", which operate externally. Page Agent runs entirely client-side within the browser. 
This means it functions inside the live page, inheriting the user's session, cookies, and authentication without needing headless browsers or Python backends.",[57,58],"nuxt-picture",{":height":59,":width":60,"alt":61,"loading":62,"src":63,"format":64},"1508","2880","Architectural diagram of Alibaba Page Agent showing the agent running inside the browser page, parsing the DOM into semantic text, calling an LLM, and dispatching native browser events","lazy","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent/1.png","webp",[11,66,67],{},"The library follows an \"inside-out\" paradigm where the agent resides within the webpage itself rather than controlling it from an external process. It works by parsing and \"dehydrating\" the live HTML into semantic text, which it feeds to a large language model (LLM). The LLM reasons about the necessary actions, and the agent dispatches native browser events like clicks and typing to execute them.",[69,70,72],"h3",{"id":71},"core-capabilities-that-make-page-agent-different","Core Capabilities That Make Page Agent Different",[11,74,75],{},"Page Agent is built with a focus on direct integration and user collaboration. 
It includes several features suitable for developers looking to add AI copilot functionality to their applications.",[57,77],{":height":78,":width":79,"alt":80,"loading":62,"src":81,"provider":82},"400","820","Overview of Page Agent's core features - text-based DOM manipulation, human-in-the-loop confirmation, privacy-first client-side processing, bring-your-own-model support, and SPA compatibility","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent/2.svg","none",[84,85,86,94,100,106,123],"ul",{},[87,88,89,93],"li",{},[90,91,92],"strong",{},"Text-Based DOM Manipulation",": It parses the live HTML, indexes interactive elements, and translates this structure into semantic text for the LLM.",[87,95,96,99],{},[90,97,98],{},"Human-in-the-Loop",": The agent has a built-in UI panel that shows planned actions and seeks user confirmation before execution, preventing fully blind automation.",[87,101,102,105],{},[90,103,104],{},"Privacy-First",": All data processing happens in the browser, and you control the LLM calls. No data is sent to external servers by default.",[87,107,108,111,112,49,117,122],{},[90,109,110],{},"Bring Your Own Model (BYOM)",": It is compatible with any OpenAI-compatible API, offering support for models from OpenAI, Anthropic, DeepSeek, Google, and local models via ",[37,113,116],{"href":114,"rel":115},"https://ollama.com/",[41],"Ollama",[37,118,121],{"href":119,"rel":120},"https://lmstudio.ai/",[41],"LM Studio",".",[87,124,125,128,129,134],{},[90,126,127],{},"SPA Compatibility",": It functions correctly with single-page applications built with frameworks like ",[37,130,133],{"href":131,"rel":132},"https://react.dev/",[41],"React"," or Vue because it interacts with the live DOM.",[11,136,137,138,141,142,145,146,149],{},"The project is a monorepo using npm workspaces and TypeScript. 
Its main packages include ",[90,139,140],{},"@page-agent/core"," for the core logic, ",[90,143,144],{},"page-agent"," for the full agent with a UI panel, and ",[90,147,148],{},"@page-agent/page-controller"," for DOM extraction and element indexing.",[69,151,153],{"id":152},"how-to-add-page-agent-to-your-app","How to Add Page Agent to Your App",[11,155,156],{},"You can integrate Page Agent into a project using a simple script tag for quick tests or through npm for production use. The npm approach is recommended for most development workflows.",[11,158,159],{},[90,160,161],{},"1. NPM Installation",[11,163,164],{},"First, add the package to your project.",[166,167,172],"pre",{"className":168,"code":169,"language":170,"meta":171,"style":171},"language-bash shiki shiki-themes catppuccin-latte night-owl","npm install page-agent\n","bash","",[173,174,175],"code",{"__ignoreMap":171},[176,177,180,184,188],"span",{"class":178,"line":179},"line",1,[176,181,183],{"class":182},"sNstc","npm",[176,185,187],{"class":186},"sfrMT"," install",[176,189,190],{"class":186}," page-agent\n",[11,192,193],{},"Next, you can import and initialize it in your TypeScript or JavaScript code.",[166,195,199],{"className":196,"code":197,"language":198,"meta":171,"style":171},"language-ts shiki shiki-themes catppuccin-latte night-owl","import { PageAgent } from 'page-agent';\n\nconst agent = new PageAgent({\n  model: 'gpt-5.1', // or another model like 'qwen3.5-plus'\n  baseURL: 'https://api.openai.com/v1', // Your provider's endpoint\n  apiKey: 'YOUR_API_KEY',\n  language: 'en-US',\n});\n\nawait agent.execute('Fill out the expense report for last week\\'s trip');\n","ts",[173,200,201,233,240,269,293,313,331,348,358,363],{"__ignoreMap":171},[176,202,203,207,211,215,218,221,225,227,230],{"class":178,"line":179},[176,204,206],{"class":205},"srhcd","import",[176,208,210],{"class":209},"scGhl"," {",[176,212,214],{"class":213},"s2kId"," PageAgent 
",[176,216,217],{"class":209},"}",[176,219,220],{"class":205}," from",[176,222,224],{"class":223},"sbuKk"," '",[176,226,144],{"class":186},[176,228,229],{"class":223},"'",[176,231,232],{"class":209},";\n",[176,234,236],{"class":178,"line":235},2,[176,237,239],{"emptyLinePlaceholder":238},true,"\n",[176,241,243,247,251,255,259,262,265],{"class":178,"line":242},3,[176,244,246],{"class":245},"s76yb","const",[176,248,250],{"class":249},"scsc5"," agent",[176,252,254],{"class":253},"s-_ek"," =",[176,256,258],{"class":257},"szhwX"," new",[176,260,261],{"class":182}," PageAgent",[176,263,264],{"class":213},"(",[176,266,268],{"class":267},"sgNGR","{\n",[176,270,272,276,279,281,284,286,289],{"class":178,"line":271},4,[176,273,275],{"class":274},"s3XBt","  model",[176,277,278],{"class":253},":",[176,280,224],{"class":223},[176,282,283],{"class":186},"gpt-5.1",[176,285,229],{"class":223},[176,287,288],{"class":267},",",[176,290,292],{"class":291},"sDmS1"," // or another model like 'qwen3.5-plus'\n",[176,294,296,299,301,303,306,308,310],{"class":178,"line":295},5,[176,297,298],{"class":274},"  baseURL",[176,300,278],{"class":253},[176,302,224],{"class":223},[176,304,305],{"class":186},"https://api.openai.com/v1",[176,307,229],{"class":223},[176,309,288],{"class":267},[176,311,312],{"class":291}," // Your provider's endpoint\n",[176,314,316,319,321,323,326,328],{"class":178,"line":315},6,[176,317,318],{"class":274},"  apiKey",[176,320,278],{"class":253},[176,322,224],{"class":223},[176,324,325],{"class":186},"YOUR_API_KEY",[176,327,229],{"class":223},[176,329,330],{"class":267},",\n",[176,332,334,337,339,341,344,346],{"class":178,"line":333},7,[176,335,336],{"class":274},"  
language",[176,338,278],{"class":253},[176,340,224],{"class":223},[176,342,343],{"class":186},"en-US",[176,345,229],{"class":223},[176,347,330],{"class":267},[176,349,351,353,356],{"class":178,"line":350},8,[176,352,217],{"class":267},[176,354,355],{"class":213},")",[176,357,232],{"class":209},[176,359,361],{"class":178,"line":360},9,[176,362,239],{"emptyLinePlaceholder":238},[176,364,366,369,371,374,377,379,381,384,388,391,393,395],{"class":178,"line":365},10,[176,367,368],{"class":205},"await",[176,370,250],{"class":213},[176,372,122],{"class":373},"s5FwJ",[176,375,376],{"class":182},"execute",[176,378,264],{"class":213},[176,380,229],{"class":223},[176,382,383],{"class":186},"Fill out the expense report for last week",[176,385,387],{"class":386},"slYKy","\\'",[176,389,390],{"class":186},"s trip",[176,392,229],{"class":223},[176,394,355],{"class":213},[176,396,232],{"class":209},[11,398,399],{},[90,400,401],{},"2. Simple Script Tag",[11,403,404],{},"For demos or quick tests without a build step, you can use the IIFE build from a CDN. 
This method allows you to get started with zero configuration.",[166,406,410],{"className":407,"code":408,"language":409,"meta":171,"style":171},"language-html shiki shiki-themes catppuccin-latte night-owl","\u003Cscript src=\"https://cdn.jsdelivr.net/npm/page-agent@latest/dist/iife/page-agent.demo.js\" crossorigin=\"true\">\u003C/script>\n","html",[173,411,412],{"__ignoreMap":171},[176,413,414,418,422,426,429,432,435,437,440,442,444,447,449,452,455,458,460],{"class":178,"line":179},[176,415,417],{"class":416},"s9rnR","\u003C",[176,419,421],{"class":420},"sY2RG","script",[176,423,425],{"class":424},"swkLt"," src",[176,427,428],{"class":416},"=",[176,430,431],{"class":223},"\"",[176,433,434],{"class":186},"https://cdn.jsdelivr.net/npm/page-agent@latest/dist/iife/page-agent.demo.js",[176,436,431],{"class":223},[176,438,439],{"class":424}," crossorigin",[176,441,428],{"class":416},[176,443,431],{"class":223},[176,445,446],{"class":186},"true",[176,448,431],{"class":223},[176,450,451],{"class":416},">",[176,453,417],{"class":454},"s1hO9",[176,456,457],{"class":416},"/",[176,459,421],{"class":420},[176,461,462],{"class":416},">\n",[11,464,465],{},[90,466,467],{},"3. Chrome Extension",[11,469,470],{},"An optional Chrome extension is available for more advanced use cases. It adds capabilities for multi-page and cross-tab navigation, which are not part of the core single-page library.",[69,472,474],{"id":473},"where-page-agent-fits-best","Where Page Agent Fits Best",[11,476,477],{},"Page Agent is designed for enhancing existing web applications with AI copilot features rather than for large-scale, server-side automation or scraping. 
It is well-suited for a variety of applications.",[84,479,480,486,492],{},[87,481,482,485],{},[90,483,484],{},"SaaS AI Copilots",": You can embed it into ERP, CRM, or admin panels to translate complex multi-step workflows into simple natural language commands.",[87,487,488,491],{},[90,489,490],{},"Accessibility Improvements",": It can serve as the basis for a voice or natural language interface for any web application.",[87,493,494,497],{},[90,495,496],{},"Internal Tools",": It can simplify processes on internal dashboards or legacy systems that require many clicks to perform a single task.",[11,499,500],{},"It is important to be aware of its limitations. The agent's reasoning capabilities are dependent on the quality of the connected LLM. Its core functionality is focused on single-page interactions, though an extension is available to manage multi-page tasks.",[502,503],"article-signup-cta",{":demoAction":504,"heading":505,"subtitle":506},"{\"text\":\"See How Webfuse Works\",\"href\":\"/use-case/ai-agents\"}","Ship AI Agents Into Any Website - No Source Code Needed","Page Agent shows what an in-page agent can do when you control the app. Webfuse is for the next problem: bringing agents, copilots, and custom workflows into live web sessions for apps you do not own, without browser installs.",[30,508,510],{"id":509},"build-a-lost-package-support-agent","Build a Lost-Package Support Agent",[11,512,513],{},"This walkthrough demonstrates how to use Page Agent to build a support chatbot for a typical e-commerce store. 
The agent will handle a common customer issue: a lost package.",[11,515,516,519,520,122],{},[90,517,518],{},"The full source code"," is available on ",[37,521,524],{"href":522,"rel":523},"https://github.com/webfuse-com/page-agent-demo",[41],"GitHub",[69,526,528],{"id":527},"demo-setup-acme-outfitters","Demo Setup: ACME Outfitters",[11,530,531,532,535],{},"The demonstration takes place on a fictional e-commerce site called ",[90,533,534],{},"ACME Outfitters",". This site has standard features like a product catalog, a shopping cart, an order history page, and an account management section. The support chatbot we will build appears as a floating action button in the bottom-right corner of the screen, expanding into a chat window when clicked.",[57,537],{":height":59,":width":60,"alt":538,"loading":62,"src":539,"format":64},"Screenshot of the ACME Outfitters fictional e-commerce site with the Page Agent support chatbot rendered as a floating action button in the bottom-right corner","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent/3.png",[69,541,543],{"id":542},"initialize-page-agent-and-hide-the-default-panel","Initialize Page Agent and Hide the Default Panel",[11,545,546,547,550,551,556,557,560],{},"The core logic for our agent is encapsulated in an ",[173,548,549],{},"agent.ts"," file. Here, we instantiate Page Agent and configure it to work with our custom chat interface. We connect it to ",[37,552,555],{"href":553,"rel":554},"https://ai.google.dev/",[41],"Google Gemini"," via an OpenAI-compatible endpoint. A small but important configuration detail is to ensure the ",[173,558,559],{},"baseURL"," includes a trailing slash, as some API providers require it.",[11,562,563,564,567],{},"A major step is disabling the default user interface that comes with the library. Page Agent includes a built-in panel for showing its plan and asking for user confirmation. 
Since we are creating our own chat UI, we call ",[173,565,566],{},"agent.panel.dispose()"," immediately after initialization to prevent the default panel from appearing.",[166,569,571],{"className":196,"code":570,"language":198,"meta":171,"style":171},"// src/agent.ts\nimport { PageAgent } from 'page-agent';\n\nlet agent: PageAgent | null = null;\n\nexport function initializeAgent() {\n  if (agent) {\n    return agent;\n  }\n\n  agent = new PageAgent({\n    model: 'gemini-1.5-pro-latest',\n    // The trailing slash is important for certain endpoints.\n    baseURL: 'https://generativelanguage.googleapis.com/v1beta/models/',\n    apiKey: 'YOUR_GEMINI_API_KEY',\n    language: 'en-US',\n  });\n\n  // Hide the default Page Agent UI panel.\n  agent.panel.dispose();\n\n  return agent;\n}\n\nexport function getAgent() {\n  if (!agent) {\n    throw new Error('Agent not initialized.');\n  }\n  return agent;\n}\n",[173,572,573,578,598,602,629,633,651,661,670,675,679,695,714,720,737,754,770,780,785,791,812,817,827,833,838,852,868,892,897,906],{"__ignoreMap":171},[176,574,575],{"class":178,"line":179},[176,576,577],{"class":291},"// src/agent.ts\n",[176,579,580,582,584,586,588,590,592,594,596],{"class":178,"line":235},[176,581,206],{"class":205},[176,583,210],{"class":209},[176,585,214],{"class":213},[176,587,217],{"class":209},[176,589,220],{"class":205},[176,591,224],{"class":223},[176,593,144],{"class":186},[176,595,229],{"class":223},[176,597,232],{"class":209},[176,599,600],{"class":178,"line":242},[176,601,239],{"emptyLinePlaceholder":238},[176,603,604,607,610,612,615,618,622,624,627],{"class":178,"line":271},[176,605,606],{"class":245},"let",[176,608,250],{"class":609},"s5Zek",[176,611,278],{"class":416},[176,613,261],{"class":614},"soKit",[176,616,617],{"class":416}," |",[176,619,621],{"class":620},"sZlfg"," 
null",[176,623,254],{"class":253},[176,625,621],{"class":626},"sZgVk",[176,628,232],{"class":209},[176,630,631],{"class":178,"line":295},[176,632,239],{"emptyLinePlaceholder":238},[176,634,635,638,641,644,648],{"class":178,"line":315},[176,636,637],{"class":205},"export",[176,639,640],{"class":245}," function",[176,642,643],{"class":182}," initializeAgent",[176,645,647],{"class":646},"sMtgK","()",[176,649,650],{"class":209}," {\n",[176,652,653,656,659],{"class":178,"line":333},[176,654,655],{"class":245},"  if",[176,657,658],{"class":213}," (agent) ",[176,660,268],{"class":209},[176,662,663,666,668],{"class":178,"line":350},[176,664,665],{"class":205},"    return",[176,667,250],{"class":213},[176,669,232],{"class":209},[176,671,672],{"class":178,"line":360},[176,673,674],{"class":209},"  }\n",[176,676,677],{"class":178,"line":365},[176,678,239],{"emptyLinePlaceholder":238},[176,680,682,685,687,689,691,693],{"class":178,"line":681},11,[176,683,684],{"class":213},"  agent ",[176,686,428],{"class":253},[176,688,258],{"class":257},[176,690,261],{"class":182},[176,692,264],{"class":213},[176,694,268],{"class":209},[176,696,698,701,704,706,709,711],{"class":178,"line":697},12,[176,699,700],{"class":213},"    model",[176,702,278],{"class":703},"sVS64",[176,705,224],{"class":223},[176,707,708],{"class":186},"gemini-1.5-pro-latest",[176,710,229],{"class":223},[176,712,330],{"class":713},"sdjIP",[176,715,717],{"class":178,"line":716},13,[176,718,719],{"class":291},"    // The trailing slash is important for certain endpoints.\n",[176,721,723,726,728,730,733,735],{"class":178,"line":722},14,[176,724,725],{"class":213},"    baseURL",[176,727,278],{"class":703},[176,729,224],{"class":223},[176,731,732],{"class":186},"https://generativelanguage.googleapis.com/v1beta/models/",[176,734,229],{"class":223},[176,736,330],{"class":713},[176,738,740,743,745,747,750,752],{"class":178,"line":739},15,[176,741,742],{"class":213},"    
apiKey",[176,744,278],{"class":703},[176,746,224],{"class":223},[176,748,749],{"class":186},"YOUR_GEMINI_API_KEY",[176,751,229],{"class":223},[176,753,330],{"class":713},[176,755,757,760,762,764,766,768],{"class":178,"line":756},16,[176,758,759],{"class":213},"    language",[176,761,278],{"class":703},[176,763,224],{"class":223},[176,765,343],{"class":186},[176,767,229],{"class":223},[176,769,330],{"class":713},[176,771,773,776,778],{"class":178,"line":772},17,[176,774,775],{"class":209},"  }",[176,777,355],{"class":213},[176,779,232],{"class":209},[176,781,783],{"class":178,"line":782},18,[176,784,239],{"emptyLinePlaceholder":238},[176,786,788],{"class":178,"line":787},19,[176,789,790],{"class":291},"  // Hide the default Page Agent UI panel.\n",[176,792,794,797,799,803,805,808,810],{"class":178,"line":793},20,[176,795,796],{"class":213},"  agent",[176,798,122],{"class":373},[176,800,802],{"class":801},"sHY1S","panel",[176,804,122],{"class":373},[176,806,807],{"class":182},"dispose",[176,809,647],{"class":213},[176,811,232],{"class":209},[176,813,815],{"class":178,"line":814},21,[176,816,239],{"emptyLinePlaceholder":238},[176,818,820,823,825],{"class":178,"line":819},22,[176,821,822],{"class":205},"  return",[176,824,250],{"class":213},[176,826,232],{"class":209},[176,828,830],{"class":178,"line":829},23,[176,831,832],{"class":209},"}\n",[176,834,836],{"class":178,"line":835},24,[176,837,239],{"emptyLinePlaceholder":238},[176,839,841,843,845,848,850],{"class":178,"line":840},25,[176,842,637],{"class":205},[176,844,640],{"class":245},[176,846,847],{"class":182}," getAgent",[176,849,647],{"class":646},[176,851,650],{"class":209},[176,853,855,857,860,863,866],{"class":178,"line":854},26,[176,856,655],{"class":245},[176,858,859],{"class":213}," (",[176,861,862],{"class":253},"!",[176,864,865],{"class":213},"agent) ",[176,867,268],{"class":209},[176,869,871,874,876,879,881,883,886,888,890],{"class":178,"line":870},27,[176,872,873],{"class":205},"    
throw",[176,875,258],{"class":257},[176,877,878],{"class":182}," Error",[176,880,264],{"class":213},[176,882,229],{"class":223},[176,884,885],{"class":186},"Agent not initialized.",[176,887,229],{"class":223},[176,889,355],{"class":213},[176,891,232],{"class":209},[176,893,895],{"class":178,"line":894},28,[176,896,674],{"class":209},[176,898,900,902,904],{"class":178,"line":899},29,[176,901,822],{"class":205},[176,903,250],{"class":213},[176,905,232],{"class":209},[176,907,909],{"class":178,"line":908},30,[176,910,832],{"class":209},[69,912,914],{"id":913},"build-the-chat-interface-around-stateless-agent-calls","Build the Chat Interface Around Stateless Agent Calls",[11,916,917,918,921],{},"The chat interface is a React component named ",[173,919,920],{},"SupportChat.tsx",". It manages the visual elements of the conversation: a floating button to open the chat, a message area with bubbles for the user and the bot, a typing indicator to show when the agent is working, and an input field with a send button.",[11,923,924,925,928],{},"One of the main considerations when building a conversational agent is memory. Each call to ",[173,926,927],{},"agent.execute(task)"," is stateless; the agent has no memory of previous interactions. To create a continuous conversation, we need to provide the history manually. Before sending a new user message to the agent, we prepend the last six messages of the conversation transcript. This gives the LLM enough context to understand the user's intent and provide a relevant response.",[166,930,934],{"className":931,"code":932,"language":933,"meta":171,"style":171},"language-javascript shiki shiki-themes catppuccin-latte night-owl","// A simplified excerpt from src/components/SupportChat.tsx\n\n// ... 
(imports and state management for messages, isLoading, etc.)\n\nconst handleSend = async () => {\n  setIsLoading(true);\n  const agent = getAgent();\n\n  // Create conversational history for the agent.\n  const history = messages.slice(-6).map(m => `${m.sender}: ${m.text}`).join('\\n');\n  const task = `${history}\\n\\nUser: ${currentUserInput}`;\n\n  // Set the system prompt and execute the task.\n  agent.setSystemInstruction(SYSTEM_PROMPT); // SYSTEM_PROMPT is defined elsewhere\n  const response = await agent.execute(task);\n\n  // Add agent's final message to the chat.\n  setMessages(prev => [...prev, { sender: 'bot', text: response.result }]);\n  setIsLoading(false);\n};\n\n// ... (JSX for the chat window)\n","javascript",[173,935,936,941,945,950,954,974,988,1003,1007,1012,1110,1147,1151,1156,1178,1205,1209,1214,1272,1285,1291,1295],{"__ignoreMap":171},[176,937,938],{"class":178,"line":179},[176,939,940],{"class":291},"// A simplified excerpt from src/components/SupportChat.tsx\n",[176,942,943],{"class":178,"line":235},[176,944,239],{"emptyLinePlaceholder":238},[176,946,947],{"class":178,"line":242},[176,948,949],{"class":291},"// ... 
(imports and state management for messages, isLoading, etc.)\n",[176,951,952],{"class":178,"line":271},[176,953,239],{"emptyLinePlaceholder":238},[176,955,956,958,961,963,966,969,972],{"class":178,"line":295},[176,957,246],{"class":245},[176,959,960],{"class":182}," handleSend",[176,962,254],{"class":253},[176,964,965],{"class":205}," async",[176,967,968],{"class":646}," ()",[176,970,971],{"class":245}," =>",[176,973,650],{"class":267},[176,975,976,979,981,984,986],{"class":178,"line":315},[176,977,978],{"class":182},"  setIsLoading",[176,980,264],{"class":213},[176,982,446],{"class":983},"sV9Lp",[176,985,355],{"class":213},[176,987,232],{"class":267},[176,989,990,993,995,997,999,1001],{"class":178,"line":333},[176,991,992],{"class":245},"  const",[176,994,250],{"class":249},[176,996,254],{"class":253},[176,998,847],{"class":182},[176,1000,647],{"class":213},[176,1002,232],{"class":267},[176,1004,1005],{"class":178,"line":350},[176,1006,239],{"emptyLinePlaceholder":238},[176,1008,1009],{"class":178,"line":360},[176,1010,1011],{"class":291},"  // Create conversational history for the agent.\n",[176,1013,1014,1016,1019,1021,1025,1027,1030,1032,1035,1039,1041,1043,1046,1048,1052,1054,1058,1062,1064,1066,1070,1072,1076,1078,1080,1082,1085,1087,1090,1092,1094,1097,1099,1101,1104,1106,1108],{"class":178,"line":365},[176,1015,992],{"class":245},[176,1017,1018],{"class":249}," history",[176,1020,254],{"class":253},[176,1022,1024],{"class":1023},"sP4PM"," messages",[176,1026,122],{"class":373},[176,1028,1029],{"class":182},"slice",[176,1031,264],{"class":213},[176,1033,1034],{"class":253},"-",[176,1036,1038],{"class":1037},"sZ_Zo","6",[176,1040,355],{"class":213},[176,1042,122],{"class":373},[176,1044,1045],{"class":182},"map",[176,1047,264],{"class":213},[176,1049,1051],{"class":1050},"svrsB","m",[176,1053,971],{"class":245},[176,1055,1057],{"class":1056},"sizNf"," 
`",[176,1059,1061],{"class":1060},"sDF9U","${",[176,1063,1051],{"class":1023},[176,1065,122],{"class":373},[176,1067,1069],{"class":1068},"s8apv","sender",[176,1071,217],{"class":1060},[176,1073,1075],{"class":1074},"sjYFf",": ",[176,1077,1061],{"class":1060},[176,1079,1051],{"class":1023},[176,1081,122],{"class":373},[176,1083,1084],{"class":1068},"text",[176,1086,217],{"class":1060},[176,1088,1089],{"class":1056},"`",[176,1091,355],{"class":213},[176,1093,122],{"class":373},[176,1095,1096],{"class":182},"join",[176,1098,264],{"class":213},[176,1100,229],{"class":223},[176,1102,1103],{"class":386},"\\n",[176,1105,229],{"class":223},[176,1107,355],{"class":213},[176,1109,232],{"class":267},[176,1111,1112,1114,1117,1119,1121,1123,1127,1129,1133,1136,1138,1141,1143,1145],{"class":178,"line":681},[176,1113,992],{"class":245},[176,1115,1116],{"class":249}," task",[176,1118,254],{"class":253},[176,1120,1057],{"class":1056},[176,1122,1061],{"class":1060},[176,1124,1126],{"class":1125},"soAP-","history",[176,1128,217],{"class":1060},[176,1130,1132],{"class":1131},"spSUq","\\n\\n",[176,1134,1135],{"class":1074},"User: ",[176,1137,1061],{"class":1060},[176,1139,1140],{"class":1125},"currentUserInput",[176,1142,217],{"class":1060},[176,1144,1089],{"class":1056},[176,1146,232],{"class":267},[176,1148,1149],{"class":178,"line":697},[176,1150,239],{"emptyLinePlaceholder":238},[176,1152,1153],{"class":178,"line":716},[176,1154,1155],{"class":291},"  // Set the system prompt and execute the task.\n",[176,1157,1158,1160,1162,1165,1167,1170,1172,1175],{"class":178,"line":722},[176,1159,796],{"class":1023},[176,1161,122],{"class":373},[176,1163,1164],{"class":182},"setSystemInstruction",[176,1166,264],{"class":213},[176,1168,1169],{"class":249},"SYSTEM_PROMPT",[176,1171,355],{"class":213},[176,1173,1174],{"class":267},";",[176,1176,1177],{"class":291}," // SYSTEM_PROMPT is defined 
elsewhere\n",[176,1179,1180,1182,1185,1187,1190,1192,1194,1196,1198,1201,1203],{"class":178,"line":739},[176,1181,992],{"class":245},[176,1183,1184],{"class":249}," response",[176,1186,254],{"class":253},[176,1188,1189],{"class":205}," await",[176,1191,250],{"class":1023},[176,1193,122],{"class":373},[176,1195,376],{"class":182},[176,1197,264],{"class":213},[176,1199,1200],{"class":1125},"task",[176,1202,355],{"class":213},[176,1204,232],{"class":267},[176,1206,1207],{"class":178,"line":756},[176,1208,239],{"emptyLinePlaceholder":238},[176,1210,1211],{"class":178,"line":772},[176,1212,1213],{"class":291},"  // Add agent's final message to the chat.\n",[176,1215,1216,1219,1221,1224,1226,1229,1232,1234,1236,1238,1241,1243,1245,1248,1250,1252,1255,1257,1259,1261,1264,1267,1270],{"class":178,"line":782},[176,1217,1218],{"class":182},"  setMessages",[176,1220,264],{"class":213},[176,1222,1223],{"class":1050},"prev",[176,1225,971],{"class":245},[176,1227,1228],{"class":213}," [",[176,1230,1231],{"class":416},"...",[176,1233,1223],{"class":1125},[176,1235,288],{"class":209},[176,1237,210],{"class":209},[176,1239,1240],{"class":609}," sender",[176,1242,278],{"class":703},[176,1244,224],{"class":223},[176,1246,1247],{"class":186},"bot",[176,1249,229],{"class":223},[176,1251,288],{"class":209},[176,1253,1254],{"class":609}," text",[176,1256,278],{"class":703},[176,1258,1184],{"class":1023},[176,1260,122],{"class":373},[176,1262,1263],{"class":1068},"result",[176,1265,1266],{"class":209}," 
}",[176,1268,1269],{"class":213},"])",[176,1271,232],{"class":267},[176,1273,1274,1276,1278,1281,1283],{"class":178,"line":787},[176,1275,978],{"class":182},[176,1277,264],{"class":213},[176,1279,1280],{"class":983},"false",[176,1282,355],{"class":213},[176,1284,232],{"class":267},[176,1286,1287,1289],{"class":178,"line":793},[176,1288,217],{"class":267},[176,1290,232],{"class":209},[176,1292,1293],{"class":178,"line":814},[176,1294,239],{"emptyLinePlaceholder":238},[176,1296,1297],{"class":178,"line":819},[176,1298,1299],{"class":291},"// ... (JSX for the chat window)\n",[69,1301,1303],{"id":1302},"write-the-guardrails-that-control-the-agent","Write the Guardrails That Control the Agent",[11,1305,1306],{},"The agent's personality, rules, and workflow are defined in the system prompt. This text is sent to the LLM with every request and guides its behavior. A well-constructed prompt is needed for creating a reliable and helpful support agent.",[11,1308,1309],{},"Below is the full system prompt used for the ACME Outfitters chatbot.",[166,1311,1315],{"className":1312,"code":1313,"language":1314,"meta":171,"style":171},"language-shell shiki shiki-themes catppuccin-latte night-owl","You are a friendly and helpful customer support agent for ACME Outfitters.\nYour goal is to resolve customer issues by taking actions on their behalf directly on the website.\n\n**Your Rules:**\n\n1.  **Communicate Naturally**: Talk like a human. Do not narrate your actions.\n    - BAD: \"Okay, I will now navigate to the /orders page to check the status of order ACM-10318.\"\n    - GOOD: \"Let me check on that order for you. One moment.\"\n\n2.  **Follow a Two-Step Compensation Flow**: When a user's package is lost (shipped but not delivered), you must first diagnose the problem and then offer compensation options. Do not take action until the user chooses an option.\n    - Step 1: Check the order status. 
Confirm it was shipped but not delivered.\n    - Step 2: Offer the user two choices: (a) instant store credit or (b) a refund to their original payment method that will take 5-10 business days.\n    - Step 3: Wait for the user to reply with their choice.\n    - Step 4: Execute the chosen action.\n\n3.  **Diagnose Before Offering**: Always explain *why* you are offering a solution. For a lost package, you should state that you see it was shipped but not delivered according to the tracking information.\n\n4.  **Confirm Sensitive Actions**: Never submit a payment, apply a discount, or issue a refund without first presenting the options to the user and getting their explicit confirmation. For refunds, you will use the built-in refund modal on the order details page.\n\n5.  **Be Concise**: Keep your answers short and to the point.\n","shell",[173,1316,1317,1354,1405,1409,1420,1424,1468,1485,1499,1503,1541,1546,1551,1556,1561,1565,1570,1574,1579,1583],{"__ignoreMap":171},[176,1318,1319,1322,1325,1328,1331,1334,1337,1340,1343,1345,1348,1351],{"class":178,"line":179},[176,1320,1321],{"class":182},"You",[176,1323,1324],{"class":186}," are",[176,1326,1327],{"class":186}," a",[176,1329,1330],{"class":186}," friendly",[176,1332,1333],{"class":186}," and",[176,1335,1336],{"class":186}," helpful",[176,1338,1339],{"class":186}," customer",[176,1341,1342],{"class":186}," support",[176,1344,250],{"class":186},[176,1346,1347],{"class":186}," for",[176,1349,1350],{"class":186}," ACME",[176,1352,1353],{"class":186}," Outfitters.\n",[176,1355,1356,1359,1362,1365,1368,1371,1373,1376,1379,1382,1385,1388,1391,1394,1397,1399,1402],{"class":178,"line":235},[176,1357,1358],{"class":182},"Your",[176,1360,1361],{"class":186}," goal",[176,1363,1364],{"class":186}," is",[176,1366,1367],{"class":186}," to",[176,1369,1370],{"class":186}," resolve",[176,1372,1339],{"class":186},[176,1374,1375],{"class":186}," issues",[176,1377,1378],{"class":186}," by",[176,1380,1381],{"class":186}," 
taking",[176,1383,1384],{"class":186}," actions",[176,1386,1387],{"class":186}," on",[176,1389,1390],{"class":186}," their",[176,1392,1393],{"class":186}," behalf",[176,1395,1396],{"class":186}," directly",[176,1398,1387],{"class":186},[176,1400,1401],{"class":186}," the",[176,1403,1404],{"class":186}," website.\n",[176,1406,1407],{"class":178,"line":242},[176,1408,239],{"emptyLinePlaceholder":238},[176,1410,1411,1414,1417],{"class":178,"line":271},[176,1412,1413],{"class":416},"**",[176,1415,1416],{"class":213},"Your Rules:",[176,1418,1419],{"class":416},"**\n",[176,1421,1422],{"class":178,"line":295},[176,1423,239],{"emptyLinePlaceholder":238},[176,1425,1426,1429,1432,1435,1438,1440,1442,1445,1448,1450,1453,1456,1459,1462,1465],{"class":178,"line":315},[176,1427,1428],{"class":182},"1.",[176,1430,1431],{"class":454},"  **",[176,1433,1434],{"class":186},"Communicate",[176,1436,1437],{"class":186}," Naturally",[176,1439,1413],{"class":454},[176,1441,278],{"class":186},[176,1443,1444],{"class":186}," Talk",[176,1446,1447],{"class":186}," like",[176,1449,1327],{"class":186},[176,1451,1452],{"class":186}," human.",[176,1454,1455],{"class":186}," Do",[176,1457,1458],{"class":186}," not",[176,1460,1461],{"class":186}," narrate",[176,1463,1464],{"class":186}," your",[176,1466,1467],{"class":186}," actions.\n",[176,1469,1470,1473,1476,1479,1482],{"class":178,"line":333},[176,1471,1472],{"class":182},"    -",[176,1474,1475],{"class":186}," BAD:",[176,1477,1478],{"class":223}," \"",[176,1480,1481],{"class":186},"Okay, I will now navigate to the /orders page to check the status of order ACM-10318.",[176,1483,1484],{"class":223},"\"\n",[176,1486,1487,1489,1492,1494,1497],{"class":178,"line":350},[176,1488,1472],{"class":182},[176,1490,1491],{"class":186}," GOOD:",[176,1493,1478],{"class":223},[176,1495,1496],{"class":186},"Let me check on that order for you. 
One moment.",[176,1498,1484],{"class":223},[176,1500,1501],{"class":178,"line":360},[176,1502,239],{"emptyLinePlaceholder":238},[176,1504,1505,1508,1510,1513,1515,1518,1521,1524,1526,1528,1531,1533,1536,1538],{"class":178,"line":365},[176,1506,1507],{"class":182},"2.",[176,1509,1431],{"class":454},[176,1511,1512],{"class":186},"Follow",[176,1514,1327],{"class":186},[176,1516,1517],{"class":186}," Two-Step",[176,1519,1520],{"class":186}," Compensation",[176,1522,1523],{"class":186}," Flow",[176,1525,1413],{"class":454},[176,1527,278],{"class":186},[176,1529,1530],{"class":186}," When",[176,1532,1327],{"class":186},[176,1534,1535],{"class":186}," user",[176,1537,229],{"class":223},[176,1539,1540],{"class":186},"s package is lost (shipped but not delivered), you must first diagnose the problem and then offer compensation options. Do not take action until the user chooses an option.\n",[176,1542,1543],{"class":178,"line":681},[176,1544,1545],{"class":186},"    - Step 1: Check the order status. Confirm it was shipped but not delivered.\n",[176,1547,1548],{"class":178,"line":697},[176,1549,1550],{"class":186},"    - Step 2: Offer the user two choices: (a) instant store credit or (b) a refund to their original payment method that will take 5-10 business days.\n",[176,1552,1553],{"class":178,"line":716},[176,1554,1555],{"class":186},"    - Step 3: Wait for the user to reply with their choice.\n",[176,1557,1558],{"class":178,"line":722},[176,1559,1560],{"class":186},"    - Step 4: Execute the chosen action.\n",[176,1562,1563],{"class":178,"line":739},[176,1564,239],{"emptyLinePlaceholder":238},[176,1566,1567],{"class":178,"line":756},[176,1568,1569],{"class":186},"3.  **Diagnose Before Offering**: Always explain *why* you are offering a solution. 
For a lost package, you should state that you see it was shipped but not delivered according to the tracking information.\n",[176,1571,1572],{"class":178,"line":772},[176,1573,239],{"emptyLinePlaceholder":238},[176,1575,1576],{"class":178,"line":782},[176,1577,1578],{"class":186},"4.  **Confirm Sensitive Actions**: Never submit a payment, apply a discount, or issue a refund without first presenting the options to the user and getting their explicit confirmation. For refunds, you will use the built-in refund modal on the order details page.\n",[176,1580,1581],{"class":178,"line":787},[176,1582,239],{"emptyLinePlaceholder":238},[176,1584,1585],{"class":178,"line":793},[176,1586,1587],{"class":186},"5.  **Be Concise**: Keep your answers short and to the point.\n",[11,1589,1590],{},"This prompt establishes several important guardrails.",[84,1592,1593,1596,1599,1602],{},[87,1594,1595],{},"The first rule ensures the user experience feels like a conversation, not a technical log of an automation script.",[87,1597,1598],{},"The two-step flow is the core of the business logic. It forces the agent to assess a situation, present choices, and wait for input before acting. This human-in-the-loop pattern is a good practice for actions with financial consequences.",[87,1600,1601],{},"The diagnosis rule builds user trust by showing the agent has correctly understood the problem.",[87,1603,1604],{},"The final rule on sensitive actions is a safety measure to prevent the agent from making unapproved changes to a user's account.",[69,1606,1608],{"id":1607},"run-the-full-lost-package-workflow","Run the Full Lost-Package Workflow",[11,1610,1611],{},"We can now walk through a complete lost-package scenario to see how the UI, the agent, and the prompt work together.",[11,1613,1614,1617],{},[90,1615,1616],{},"1. 
Customer Initiates Chat."," The user, having not received their order, opens the chat widget and types their complaint.",[11,1619,1620,1623],{},[90,1621,1622],{},"Customer",": \"My package never arrived. Order ACM-10318.\"",[11,1625,1626,1629,1630,1633],{},[90,1627,1628],{},"2. Bot Diagnoses and Responds."," The agent receives the message, prepended with the (empty) chat history. Guided by the system prompt, it first navigates to the ",[173,1631,1632],{},"/orders"," page, finds order ACM-10318, and inspects its status. It sees the order was shipped five days ago but the tracking information does not confirm delivery. Following the two-step flow, it presents the compensation options.",[11,1635,1636,1639],{},[90,1637,1638],{},"Bot",": \"I'm sorry to hear about that. I see that order ACM-10318 shipped 5 days ago, but the tracking shows it hasn't been delivered. I can help you with that. I can offer you two options:",[1641,1642,1643,1646],"ol",{},[87,1644,1645],{},"An instant store credit for the full amount of $189.00.",[87,1647,1648],{},"A refund to your original payment method, which should appear in 5–10 business days.\"",[57,1650],{":height":59,":width":60,"alt":1651,"loading":62,"src":1652,"format":64},"Page Agent rendering the support chatbot's diagnostic response with two compensation options after inspecting the order status on the ACME Outfitters orders page","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent/4.png",[11,1654,1655,1658],{},[90,1656,1657],{},"3. Customer Makes a Choice."," The user prefers the immediate resolution.",[11,1660,1661,1663],{},[90,1662,1622],{},": \"Store credit please.\"",[11,1665,1666,1669],{},[90,1667,1668],{},"4. Bot Takes Action."," This message triggers the agent to execute the second part of its instructed flow. It navigates back to the order details page for ACM-10318 and opens the refund modal. 
Inside the modal, it selects the \"Store Credit\" radio button, types \"Package lost in transit\" into the reason field, and submits the form. Page Agent can overlay a mask on the screen to show what elements it is interacting with, giving the user visibility into its actions.",[57,1671],{":height":59,":width":60,"alt":1672,"loading":62,"src":1673,"format":64},"Page Agent operating the refund modal on the ACME Outfitters order details page, with the \"Store Credit\" option selected and the reason field filled in","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent/5.png",[11,1675,1676,1679],{},[90,1677,1678],{},"5. Bot Confirms Completion."," Once the action is complete, the agent provides a final confirmation message to the user.",[11,1681,1682,1684],{},[90,1683,1638],{},": \"Done - $189.00 in store credit has been added to your account.\"",[11,1686,1687,1690],{},[90,1688,1689],{},"6. Verification."," The user can now verify the action. Navigating to their account page shows a new balance of $189.00 in store credit. The order history for ACM-10318 also shows a note indicating that a credit was issued. The task is complete.",[30,1692,1694],{"id":1693},"production-trade-offs-and-limitations","Production Trade-Offs and Limitations",[11,1696,1697],{},"While Page Agent is a capable tool, its client-side, text-first architecture involves specific trade-offs. Recognizing these limitations is important for planning a production deployment and setting correct expectations for its performance and scope.",[11,1699,1700],{},"Here are six points for developers to consider.",[11,1702,1703,1706],{},[90,1704,1705],{},"1. Constrained by the Document Object Model (DOM)."," The agent operates exclusively through the user interface, meaning it can only perform actions that a human user can. It interacts with the rendered DOM and cannot directly call backend APIs or access functions not exposed through UI elements. 
For example, before instructing the agent to process a refund, a corresponding modal or form must already exist and be accessible in the UI.",[11,1708,1709,1712],{},[90,1710,1711],{},"2. Sensitivity to UI Implementation."," The agent's effectiveness is tied to the structure of the webpage's HTML. It performs best with clear, semantic code, and its ability to interact with elements can be diminished in certain scenarios.",[84,1714,1715,1721,1727],{},[87,1716,1717,1720],{},[90,1718,1719],{},"Complex or Non-Standard DOM",": UIs with heavily nested shadow DOM can obscure elements from the agent's context.",[87,1722,1723,1726],{},[90,1724,1725],{},"Virtualized Lists",": Long lists that only render a subset of items to the DOM at any time will prevent the agent from interacting with off-screen elements.",[87,1728,1729,1732,1733,1736],{},[90,1730,1731],{},"Canvas-Based Interfaces",": Interfaces built entirely within a ",[173,1734,1735],{},"\u003Ccanvas>"," element are not suitable, as the agent relies on parsing HTML elements, not pixels.",[11,1738,1739,1742],{},[90,1740,1741],{},"3. Performance and Cost Considerations."," Executing a task involves multiple round-trips to an LLM for reasoning and action planning, which introduces two main considerations for developers. The first is latency, as a multi-step workflow can take several seconds to complete. This makes the agent unsuitable for tasks requiring immediate, real-time interaction. The second is token consumption. Each step sends a sizeable context to the LLM, including the system prompt, the dehydrated DOM, and conversational history. A single complex task can use thousands of tokens, making it important to model API costs before a large-scale deployment.",[11,1744,1745,1748,1749,1752],{},[90,1746,1747],{},"4. Lack of Built-in State Management."," Each ",[173,1750,1751],{},"execute()"," call is treated as a new, independent task. 
The core library does not include built-in memory for multi-turn conversations or sequential operations. As a developer, you are responsible for managing the application state and providing the necessary conversational history with each request. This adds a layer of complexity to your integration code.",[11,1754,1755,1758],{},[90,1756,1757],{},"5. Reliability of Prompt-Based Guardrails."," The agent's safety mechanisms, such as instructions to \"never auto-submit a payment form,\" are defined within its system prompt. These text-based instructions are persuasive guides, not absolute rules. A highly capable or misaligned LLM could potentially find ways to bypass them. For this reason, prompt-based safety measures should not replace server-side validation for sensitive or destructive actions.",[11,1760,1761,1764,1765,1767],{},[90,1762,1763],{},"6. Single-Page Focus in the Core Library."," The main ",[173,1766,144],{}," package is designed primarily for single-page applications and interactions. While it excels at complex workflows within a single view, it cannot navigate across different tabs or browser windows on its own. Multi-page automation requires the optional Page Agent Chrome Extension, which needs separate installation and user permissions.",[30,1769,1771],{"id":1770},"conclusion-agents-need-a-deployment-layer","Conclusion: Agents Need a Deployment Layer",[11,1773,1774],{},"Page Agent is valuable because it proves a simple point: a web agent is much more useful when it can work with the real interface, the real DOM, and the user's actual session. That is a better mental model than treating every web workflow as a remote browser recording, a brittle script, or a backend-only API integration.",[11,1776,1777],{},"But Page Agent also exposes the production problem. If the agent must be inserted into the application, then you either need source-code access, vendor cooperation, or a browser extension. 
Agent-browser products run into a related limitation: they can automate inside their own controlled environment, but they are often separated from the user's live session, support context, audit trail, masking rules, and human handoff.",[11,1779,1780],{},"Webfuse's angle is that agents need a deployment layer, not just a smarter browser. A Webfuse session can wrap an existing web application, inject extension logic, add collaboration tools, connect automation, and expose the live session through a shareable link. The original app does not need to be rewritten, and the end user does not need to install a browser extension.",[11,1782,1783],{},"So the practical split is clear: use Page Agent when you own the app and want an embedded client-side copilot. Use Webfuse when the harder problem is bringing agents into applications you do not own, inside real user sessions, with centralized control over the experience. For production AI agents, that deployment surface is often the difference between an impressive demo and something a team can actually roll out.",[1785,1786,1787],"style",{},"html pre.shiki code .sNstc, html code.shiki .sNstc{--shiki-default:#1E66F5;--shiki-default-font-style:italic;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .sfrMT, html code.shiki .sfrMT{--shiki-default:#40A02B;--shiki-dark:#ECC48D}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: 
var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .srhcd, html code.shiki .srhcd{--shiki-default:#8839EF;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .scsc5, html code.shiki .scsc5{--shiki-default:#4C4F69;--shiki-default-font-style:inherit;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .s-_ek, html code.shiki .s-_ek{--shiki-default:#179299;--shiki-dark:#C792EA}html pre.shiki code .szhwX, html code.shiki .szhwX{--shiki-default:#8839EF;--shiki-default-font-weight:bold;--shiki-dark:#7FDBCA;--shiki-dark-font-weight:inherit}html pre.shiki code .sgNGR, html code.shiki .sgNGR{--shiki-default:#7C7F93;--shiki-dark:#C792EA}html pre.shiki code .s3XBt, html code.shiki .s3XBt{--shiki-default:#4C4F69;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .sDmS1, html code.shiki .sDmS1{--shiki-default:#7C7F93;--shiki-default-font-style:italic;--shiki-dark:#637777;--shiki-dark-font-style:italic}html pre.shiki code .s5FwJ, html code.shiki .s5FwJ{--shiki-default:#179299;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .slYKy, html code.shiki .slYKy{--shiki-default:#EA76CB;--shiki-dark:#F78C6C}html pre.shiki code .s9rnR, html code.shiki 
.s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .sY2RG, html code.shiki .sY2RG{--shiki-default:#1E66F5;--shiki-dark:#CAECE6}html pre.shiki code .swkLt, html code.shiki .swkLt{--shiki-default:#DF8E1D;--shiki-default-font-style:inherit;--shiki-dark:#C5E478;--shiki-dark-font-style:italic}html pre.shiki code .s1hO9, html code.shiki .s1hO9{--shiki-default:#4C4F69;--shiki-dark:#7FDBCA}html pre.shiki code .s5Zek, html code.shiki .s5Zek{--shiki-default:#4C4F69;--shiki-default-font-style:inherit;--shiki-dark:#D6DEEB;--shiki-dark-font-style:italic}html pre.shiki code .soKit, html code.shiki .soKit{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#FFCB8B;--shiki-dark-font-style:italic}html pre.shiki code .sZlfg, html code.shiki .sZlfg{--shiki-default:#8839EF;--shiki-default-font-style:italic;--shiki-dark:#C5E478;--shiki-dark-font-style:italic}html pre.shiki code .sZgVk, html code.shiki .sZgVk{--shiki-default:#8839EF;--shiki-default-font-style:inherit;--shiki-dark:#FF5874;--shiki-dark-font-style:italic}html pre.shiki code .sMtgK, html code.shiki .sMtgK{--shiki-default:#7C7F93;--shiki-dark:#D9F5DD}html pre.shiki code .sVS64, html code.shiki .sVS64{--shiki-default:#179299;--shiki-dark:#D6DEEB}html pre.shiki code .sdjIP, html code.shiki .sdjIP{--shiki-default:#7C7F93;--shiki-dark:#5F7E97}html pre.shiki code .sHY1S, html code.shiki .sHY1S{--shiki-default:#4C4F69;--shiki-default-font-style:inherit;--shiki-dark:#FAF39F;--shiki-dark-font-style:italic}html pre.shiki code .sV9Lp, html code.shiki .sV9Lp{--shiki-default:#FE640B;--shiki-default-font-style:inherit;--shiki-dark:#FF5874;--shiki-dark-font-style:italic}html pre.shiki code .sP4PM, html code.shiki .sP4PM{--shiki-default:#4C4F69;--shiki-default-font-style:inherit;--shiki-dark:#7FDBCA;--shiki-dark-font-style:italic}html pre.shiki code .sZ_Zo, html code.shiki .sZ_Zo{--shiki-default:#FE640B;--shiki-dark:#F78C6C}html pre.shiki code .svrsB, html code.shiki 
.svrsB{--shiki-default:#E64553;--shiki-default-font-style:italic;--shiki-dark:#D7DBE0;--shiki-dark-font-style:italic}html pre.shiki code .sizNf, html code.shiki .sizNf{--shiki-default:#40A02B;--shiki-dark:#D6DEEB}html pre.shiki code .sDF9U, html code.shiki .sDF9U{--shiki-default:#7C7F93;--shiki-dark:#D3423E}html pre.shiki code .s8apv, html code.shiki .s8apv{--shiki-default:#4C4F69;--shiki-default-font-style:inherit;--shiki-dark:#BAEBE2;--shiki-dark-font-style:italic}html pre.shiki code .sjYFf, html code.shiki .sjYFf{--shiki-default:#40A02B;--shiki-default-font-style:inherit;--shiki-dark:#ECC48D;--shiki-dark-font-style:italic}html pre.shiki code .soAP-, html code.shiki .soAP-{--shiki-default:#4C4F69;--shiki-dark:#D7DBE0}html pre.shiki code .spSUq, html code.shiki .spSUq{--shiki-default:#EA76CB;--shiki-default-font-style:inherit;--shiki-dark:#F78C6C;--shiki-dark-font-style:italic}",{"title":171,"searchDepth":235,"depth":235,"links":1789},[1790,1795,1802,1803],{"id":32,"depth":235,"text":33,"children":1791},[1792,1793,1794],{"id":71,"depth":242,"text":72},{"id":152,"depth":242,"text":153},{"id":473,"depth":242,"text":474},{"id":509,"depth":235,"text":510,"children":1796},[1797,1798,1799,1800,1801],{"id":527,"depth":242,"text":528},{"id":542,"depth":242,"text":543},{"id":913,"depth":242,"text":914},{"id":1302,"depth":242,"text":1303},{"id":1607,"depth":242,"text":1608},{"id":1693,"depth":235,"text":1694},{"id":1770,"depth":235,"text":1771},"ai-agents","2026-05-19","A practical guide to Alibaba's open-source Page Agent library: how it works, how to build a support chatbot with it, and what its limitations reveal about deploying AI agents into real web sessions.","md",[1809,1812,1815,1818],{"question":1810,"answer":1811},"What is Alibaba Page Agent?","Page Agent is an open-source JavaScript/TypeScript library that embeds a GUI AI agent directly inside a webpage. 
It interacts with the live DOM client-side, inheriting the user's session and cookies, without needing a headless browser or backend service.",{"question":1813,"answer":1814},"How is Page Agent different from Selenium or Puppeteer?","Selenium and Puppeteer drive a browser from the outside through an automation protocol. Page Agent runs inside the page itself, parses the live DOM into semantic text for an LLM, and dispatches native browser events - so it works with the real user session and any single-page app framework.",{"question":1816,"answer":1817},"Can I use Page Agent with any LLM?","Yes. Page Agent supports any OpenAI-compatible API, including OpenAI, Anthropic, Google Gemini, DeepSeek, and local models served by Ollama or LM Studio. You bring your own model and endpoint at initialization.",{"question":1819,"answer":1820},"What are the main limitations of Page Agent in production?","It is constrained to what the DOM exposes, sensitive to UI structure, and each task costs multiple LLM round-trips in latency and tokens. 
It also assumes you can add the agent to the target app or rely on an extension for broader workflows, which is a deployment challenge for third-party applications.",0,null,{"shortTitle":1824,"homepage":238,"relatedLinks":1825},"Building with Alibaba Page Agent",[1826,1829,1833],{"text":1827,"href":522,"description":1828},"GitHub Repository","Full source code for the Page Agent support chatbot demo.",{"text":1830,"href":1831,"description":1832},"Develop an AI Agent for Any Website","/blog/develop-an-ai-agent-for-any-website-with-webfuse","How to embed an AI agent into any web application without modifying its source code.",{"text":1834,"href":1835,"description":1836},"A Gentle Introduction to AI Agents for the Web","/blog/a-gentle-introduction-to-ai-agents-for-the-web","The fundamentals of web-controlling AI agents and how they perceive the page.","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent",{"title":5,"description":1806},{"loc":1837},"blog/1042.how-to-build-an-ai-support-agent-with-alibaba-page-agent",[144,1804,1842,1843,1844],"alibaba","support-chatbot","web-agents","4Mp_-cy5G7OIqrvGcSCqEOACGiJ9_xZxjCKzg7S4fAc",[1847,3445],{"id":1848,"title":1849,"authorId":1850,"body":1851,"category":1804,"created":3421,"description":3422,"extension":1807,"faqs":1822,"featurePriority":1822,"head":1822,"landingPath":1822,"meta":3423,"navigation":238,"ogImage":1822,"path":3435,"robots":1822,"schemaOrg":1822,"seo":3436,"sitemap":3437,"stem":3438,"tags":3439,"__hash__":3444},"blog/blog/1012.dom-downsampling-for-llm-based-web-agents.md","DOM Downsampling for LLM-Based Web 
Agents","thassilo-schiepanski",{"type":8,"value":1852,"toc":3406},[1853,1858,1881,1885,1892,1896,1911,1915,1921,1925,1943,1968,1971,1975,1978,1989,1995,2026,2030,2048,2060,2065,2081,2095,2098,2102,2122,2126,2134,2146,2150,2153,2506,2512,2519,2683,2690,2781,2788,2860,2869,2875,2884,2888,2894,2903,2915,3132,3150,3172,3178,3221,3225,3237,3246,3251,3256,3259,3263,3269,3274,3312,3316,3322,3326,3336,3340,3343,3403],[57,1854],{":width":1855,"alt":1856,"format":64,"loading":62,"src":1857},"900","Downsampling visualised for digital images and HTML","/blog/dom-downsampling-for-web-agents/1.png",[11,1859,1860,1865,1866,1865,1871,1876,1877,1880],{},[37,1861,1864],{"href":1862,"rel":1863},"https://operator.chatgpt.com",[41],"Operator (OpenAI)",", ",[37,1867,1870],{"href":1868,"rel":1869},"https://www.director.ai",[41],"Director (Browserbase)",[37,1872,1875],{"href":1873,"rel":1874},"https://browser-use.com",[41],"Browser Use"," – we are currently witnessing the rise of ",[90,1878,1879],{},"web AI agents",". The first iteration of serviceable web agents was enabled by frontier LLMs, which act as instantaneous domain model backends. The domain, here, corresponds to the landscape of web application UIs.",[30,1882,1884],{"id":1883},"what-is-a-snapshot","What is a Snapshot?",[11,1886,1887,1888,1891],{},"Web agents provide an LLM with a task and the serialised runtime state of the currently browsed web application (e.g., a screenshot). The LLM is expected to suggest relevant actions to perform in the web application. Serialisation of such runtime state is referred to as a ",[90,1889,1890],{},"snapshot",". The snapshot technique largely determines the quality of the LLM's interaction suggestions.",[69,1893,1895],{"id":1894},"gui-snapshots","GUI Snapshots",[11,1897,1898,1899,1902,1903,1906,1907,1910],{},"Screenshots – for consistency reasons referred to as ",[90,1900,1901],{},"GUI snapshots"," – resemble how humans visually perceive web application UIs. 
LLM APIs subsidise the use of image input through upstream compression. Compression, however, irreversibly affects image dimensions, which takes away pixel precision, so there is no way to suggest interactions like ",[15,1904,1905],{},"“click at 100, 735”",". As a workaround, early web agents used ",[15,1908,1909],{},"grounded"," GUI snapshots. Grounding describes adding visual cues to the GUI, such as bounding boxes with numerical identifiers. Grounding lets the LLM refer to specific parts of the page by identifier, so the agent can trace back interaction targets.",[57,1912],{":width":1855,"alt":1913,"format":64,"loading":62,"src":1914},"Grounded GUI snapshot as implemented by Browser Use","/blog/dom-downsampling-for-web-agents/2.png",[11,1916,1917],{},[1918,1919,1920],"small",{},"Grounded GUI snapshot as implemented by Browser Use.",[69,1922,1924],{"id":1923},"dom-snapshots","DOM Snapshots",[11,1926,1927,1928,1938,1939,1942],{},"LLMs are arguably much better at understanding code than images. Research suggests they excel at describing and classifying HTML, and also at navigating an inherent UI",[1929,1930,1931],"sup",{},[37,1932,1937],{"href":1933,"ariaDescribedBy":1934,"dataFootnoteRef":171,"id":1936},"#user-content-fn-1",[1935],"footnote-label","user-content-fnref-1","1",". The DOM (document object model) – a web browser's runtime state model of a web application – translates back to HTML. For this reason, ",[90,1940,1941],{},"DOM snapshots"," offer a compelling alternative to GUI snapshots. 
DOM snapshots offer a handful of key advantages:",[1641,1944,1945,1948,1951,1954,1957],{},[87,1946,1947],{},"DOM snapshots connect with LLM code (HTML) interpretation abilities.",[87,1949,1950],{},"DOM snapshots can be compiled from deep clones, hidden from supervision (unlike GUI grounding).",[87,1952,1953],{},"DOM snapshots produce text input that on average consumes less bandwidth than screenshots.",[87,1955,1956],{},"DOM snapshots allow for exact programmatic targeting of elements (e.g., via CSS selectors).",[87,1958,1959,1960,1963,1964,1967],{},"DOM snapshots are available with the ",[173,1961,1962],{},"DOMContentLoaded"," event (whereas the GUI completes initial rendering with ",[173,1965,1966],{},"load",").",[11,1969,1970],{},"Yet DOM snapshots have a major problem: they can exhaust the model context. Whereas a GUI snapshot commonly costs a four-figure number of tokens, a raw DOM snapshot can cost hundreds of thousands of tokens. To connect with LLM code interpretation abilities, however, developers have used element extraction techniques – picking only (likely) important elements from the DOM. Element extraction flattens the DOM tree, which disregards hierarchy as a potential UI feature (how do elements relate to each other?).",[30,1972,1974],{"id":1973},"dom-downsampling-a-novel-approach","DOM Downsampling: A Novel Approach",[11,1976,1977],{},"Using DOM snapshots with web agents requires client-side pre-processing – similar to how LLM vision APIs process image input. Downsampling is a fundamental signal processing technique that reduces data that exceeds time or space constraints, under the assumption that most relevant features are retained. Picture JPEG compression as an example: put simply, a JPEG image stores only an average colour for patches of pixels. The bigger the patches, the smaller the file. 
Although some detail is lost, key image features – colours, edges, objects – keep being recognisable – up to a large patch size.",[11,1979,1980,1981,1984,1985,1988],{},"We transfer the concept of ",[90,1982,1983],{},"downsampling"," to ",[90,1986,1987],{},"DOMs",". Particularly, since such an approach retains HTML characteristics that might be valuable for an LLM backend. We define UI features as concepts that, to a substantial degree, facilitate LLM suggestions on how to act in the UI in order to solve related web-based tasks.",[30,1990,1992],{"id":1991},"d2snap",[15,1993,1994],{},"D2Snap",[11,1996,1997,1998,2006,2014,2022,2023,2025],{},"We recently proposed ",[37,1999,2002],{"href":2000,"rel":2001},"https://arxiv.org/abs/2508.04412",[41],[90,2003,2004],{},[15,2005,1994],{},[1929,2007,2008],{},[37,2009,2013],{"href":2010,"ariaDescribedBy":2011,"dataFootnoteRef":171,"id":2012},"#user-content-fn-2",[1935],"user-content-fnref-2","2",[1929,2015,2016],{},[37,2017,2021],{"href":2018,"ariaDescribedBy":2019,"dataFootnoteRef":171,"id":2020},"#user-content-fn-3",[1935],"user-content-fnref-3","3"," – a first-of-its-kind downsampling algorithm for DOMs. Herein, we'll briefly explain how the ",[15,2024,1994],{}," algorithm works, and how it can be utilised to build efficient and performant web agents.",[69,2027,2029],{"id":2028},"how-it-works","How it works",[11,2031,2032,2033,2035,2036,1865,2039,2042,2043,859,2045,1967],{},"There are basically three redundant types of DOM nodes, and HTML concepts: elements, text, and attributes. We defined and empirically adjusted three node-specific procedures. 
",[15,2034,1994],{}," downsamples at a variable ratio, configured through procedure-specific parameters  ",[173,2037,2038],{},"k",[173,2040,2041],{},"l",", and ",[173,2044,1051],{},[173,2046,2047],{},"∈ [0, 1]",[2049,2050,2051],"blockquote",{},[11,2052,2053,2054,2059],{},"We used ",[37,2055,2058],{"href":2056,"rel":2057},"https://openai.com/index/hello-gpt-4o/",[41],"GPT-4o"," to create a downsampling ground truth dataset by having it classify HTML elements and scoring semantics regarding relevance for understanding the inherent UI – a UI feature degree.",[2061,2062,2064],"h4",{"id":2063},"procedure-elements","Procedure: Elements",[11,2066,2067,2069,2070,2073,2074,2077,2078,2080],{},[15,2068,1994],{}," downsamples (simplifies) elements by merging container elements like ",[173,2071,2072],{},"section"," and ",[173,2075,2076],{},"div"," together. A parameter ",[173,2079,2038],{}," controls the merge ratio depending on the total DOM tree height. For competing concepts, such as element name, the ground truth determines which element's characterisitics to keep – comparing UI feature scores.",[11,2082,2083,2084,1865,2086,2088,2089,2094],{},"Elements in content elements (",[173,2085,11],{},[173,2087,2049],{},", ...) are translated to a more comprehensive ",[37,2090,2093],{"href":2091,"rel":2092},"https://www.markdownguide.org/basic-syntax/",[41],"Markdown"," representation.",[11,2096,2097],{},"Interactive elements, definite interaction target candidates, are kept as is.",[2061,2099,2101],{"id":2100},"procedure-text","Procedure: Text",[11,2103,2104,2106,2107,2110,2118,2119,2121],{},[15,2105,1994],{}," downsamples text by dropping a fraction. Natural units of text are space-separated words, or punctuation-separated sentences. We reuse the ",[15,2108,2109],{},"TextRank",[1929,2111,2112],{},[37,2113,2117],{"href":2114,"ariaDescribedBy":2115,"dataFootnoteRef":171,"id":2116},"#user-content-fn-4",[1935],"user-content-fnref-4","4"," algorithm to rank sentences in text nodes. 
The lowest-ranking fraction of sentences, denoted by parameter ",[173,2120,2041],{},", is dropped.",[2061,2123,2125],{"id":2124},"procedure-attributes","Procedure: Attributes",[11,2127,2128,2130,2131,2133],{},[15,2129,1994],{}," downsamples attributes by dropping those with a name that, according to ground truth, holds a UI feature degree below a threshold. Parameter ",[173,2132,1051],{}," denotes this threshold.",[2049,2135,2136],{},[11,2137,2138,2139,2145],{},"Check out the ",[37,2140,2142,2144],{"href":2000,"rel":2141},[41],[15,2143,1994],{}," paper"," to learn about the algorithm in-depth.",[69,2147,2149],{"id":2148},"example-of-a-downsampled-dom","Example of a Downsampled DOM",[11,2151,2152],{},"Consider a partial DOM state, serialised as HTML:",[166,2154,2156],{"className":407,"code":2155,"language":409,"meta":171,"style":171},"\u003Csection class=\"container\" tabindex=\"3\" required=\"true\" type=\"example\">\n  \u003Cdiv class=\"mx-auto\" data-topic=\"products\" required=\"false\">\n    \u003Ch1>Our Pizza\u003C/h1>\n    \u003Cdiv>\n      \u003Cdiv class=\"shadow-lg\">\n        \u003Ch2>Margherita\u003C/h2>\n        \u003Cp>\n          A simple classic: mozzarela, tomatoes and basil.\n          An everyday choice!\n        \u003C/p>\n        \u003Cbutton type=\"button\">Add\u003C/button>\n      \u003C/div>\n      \u003Cdiv class=\"shadow-lg\">\n        \u003Ch2>Capricciosa\u003C/h2>\n        \u003Cp>\n          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n          A true favourite!\n          \u003C/p>\n        \u003Cbutton type=\"button\">Add\u003C/button>\n      \u003C/div>\n    \u003C/div>\n  
\u003C/div>\n\u003C/section>\n",[173,2157,2158,2212,2254,2274,2282,2302,2320,2328,2333,2338,2347,2375,2384,2402,2419,2427,2432,2437,2446,2472,2480,2489,2498],{"__ignoreMap":171},[176,2159,2160,2162,2164,2167,2169,2171,2174,2176,2179,2181,2183,2185,2187,2190,2192,2194,2196,2198,2201,2203,2205,2208,2210],{"class":178,"line":179},[176,2161,417],{"class":416},[176,2163,2072],{"class":420},[176,2165,2166],{"class":424}," class",[176,2168,428],{"class":416},[176,2170,431],{"class":223},[176,2172,2173],{"class":186},"container",[176,2175,431],{"class":223},[176,2177,2178],{"class":424}," tabindex",[176,2180,428],{"class":416},[176,2182,431],{"class":223},[176,2184,2021],{"class":186},[176,2186,431],{"class":223},[176,2188,2189],{"class":424}," required",[176,2191,428],{"class":416},[176,2193,431],{"class":223},[176,2195,446],{"class":186},[176,2197,431],{"class":223},[176,2199,2200],{"class":424}," type",[176,2202,428],{"class":416},[176,2204,431],{"class":223},[176,2206,2207],{"class":186},"example",[176,2209,431],{"class":223},[176,2211,462],{"class":416},[176,2213,2214,2217,2219,2221,2223,2225,2228,2230,2233,2235,2237,2240,2242,2244,2246,2248,2250,2252],{"class":178,"line":235},[176,2215,2216],{"class":416},"  \u003C",[176,2218,2076],{"class":420},[176,2220,2166],{"class":424},[176,2222,428],{"class":416},[176,2224,431],{"class":223},[176,2226,2227],{"class":186},"mx-auto",[176,2229,431],{"class":223},[176,2231,2232],{"class":424}," data-topic",[176,2234,428],{"class":416},[176,2236,431],{"class":223},[176,2238,2239],{"class":186},"products",[176,2241,431],{"class":223},[176,2243,2189],{"class":424},[176,2245,428],{"class":416},[176,2247,431],{"class":223},[176,2249,1280],{"class":186},[176,2251,431],{"class":223},[176,2253,462],{"class":416},[176,2255,2256,2259,2262,2264,2267,2270,2272],{"class":178,"line":242},[176,2257,2258],{"class":416},"    \u003C",[176,2260,2261],{"class":420},"h1",[176,2263,451],{"class":416},[176,2265,2266],{"class":213},"Our 
Pizza",[176,2268,2269],{"class":416},"\u003C/",[176,2271,2261],{"class":420},[176,2273,462],{"class":416},[176,2275,2276,2278,2280],{"class":178,"line":271},[176,2277,2258],{"class":416},[176,2279,2076],{"class":420},[176,2281,462],{"class":416},[176,2283,2284,2287,2289,2291,2293,2295,2298,2300],{"class":178,"line":295},[176,2285,2286],{"class":416},"      \u003C",[176,2288,2076],{"class":420},[176,2290,2166],{"class":424},[176,2292,428],{"class":416},[176,2294,431],{"class":223},[176,2296,2297],{"class":186},"shadow-lg",[176,2299,431],{"class":223},[176,2301,462],{"class":416},[176,2303,2304,2307,2309,2311,2314,2316,2318],{"class":178,"line":315},[176,2305,2306],{"class":416},"        \u003C",[176,2308,30],{"class":420},[176,2310,451],{"class":416},[176,2312,2313],{"class":213},"Margherita",[176,2315,2269],{"class":416},[176,2317,30],{"class":420},[176,2319,462],{"class":416},[176,2321,2322,2324,2326],{"class":178,"line":333},[176,2323,2306],{"class":416},[176,2325,11],{"class":420},[176,2327,462],{"class":416},[176,2329,2330],{"class":178,"line":350},[176,2331,2332],{"class":213},"          A simple classic: mozzarela, tomatoes and basil.\n",[176,2334,2335],{"class":178,"line":360},[176,2336,2337],{"class":213},"          An everyday choice!\n",[176,2339,2340,2343,2345],{"class":178,"line":365},[176,2341,2342],{"class":416},"        \u003C/",[176,2344,11],{"class":420},[176,2346,462],{"class":416},[176,2348,2349,2351,2354,2356,2358,2360,2362,2364,2366,2369,2371,2373],{"class":178,"line":681},[176,2350,2306],{"class":416},[176,2352,2353],{"class":420},"button",[176,2355,2200],{"class":424},[176,2357,428],{"class":416},[176,2359,431],{"class":223},[176,2361,2353],{"class":186},[176,2363,431],{"class":223},[176,2365,451],{"class":416},[176,2367,2368],{"class":213},"Add",[176,2370,2269],{"class":416},[176,2372,2353],{"class":420},[176,2374,462],{"class":416},[176,2376,2377,2380,2382],{"class":178,"line":697},[176,2378,2379],{"class":416},"      
\u003C/",[176,2381,2076],{"class":420},[176,2383,462],{"class":416},[176,2385,2386,2388,2390,2392,2394,2396,2398,2400],{"class":178,"line":716},[176,2387,2286],{"class":416},[176,2389,2076],{"class":420},[176,2391,2166],{"class":424},[176,2393,428],{"class":416},[176,2395,431],{"class":223},[176,2397,2297],{"class":186},[176,2399,431],{"class":223},[176,2401,462],{"class":416},[176,2403,2404,2406,2408,2410,2413,2415,2417],{"class":178,"line":722},[176,2405,2306],{"class":416},[176,2407,30],{"class":420},[176,2409,451],{"class":416},[176,2411,2412],{"class":213},"Capricciosa",[176,2414,2269],{"class":416},[176,2416,30],{"class":420},[176,2418,462],{"class":416},[176,2420,2421,2423,2425],{"class":178,"line":739},[176,2422,2306],{"class":416},[176,2424,11],{"class":420},[176,2426,462],{"class":416},[176,2428,2429],{"class":178,"line":756},[176,2430,2431],{"class":213},"          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[176,2433,2434],{"class":178,"line":772},[176,2435,2436],{"class":213},"          A true favourite!\n",[176,2438,2439,2442,2444],{"class":178,"line":782},[176,2440,2441],{"class":416},"          \u003C/",[176,2443,11],{"class":420},[176,2445,462],{"class":416},[176,2447,2448,2450,2452,2454,2456,2458,2460,2462,2464,2466,2468,2470],{"class":178,"line":787},[176,2449,2306],{"class":416},[176,2451,2353],{"class":420},[176,2453,2200],{"class":424},[176,2455,428],{"class":416},[176,2457,431],{"class":223},[176,2459,2353],{"class":186},[176,2461,431],{"class":223},[176,2463,451],{"class":416},[176,2465,2368],{"class":213},[176,2467,2269],{"class":416},[176,2469,2353],{"class":420},[176,2471,462],{"class":416},[176,2473,2474,2476,2478],{"class":178,"line":793},[176,2475,2379],{"class":416},[176,2477,2076],{"class":420},[176,2479,462],{"class":416},[176,2481,2482,2485,2487],{"class":178,"line":814},[176,2483,2484],{"class":416},"    
\u003C/",[176,2486,2076],{"class":420},[176,2488,462],{"class":416},[176,2490,2491,2494,2496],{"class":178,"line":819},[176,2492,2493],{"class":416},"  \u003C/",[176,2495,2076],{"class":420},[176,2497,462],{"class":416},[176,2499,2500,2502,2504],{"class":178,"line":829},[176,2501,2269],{"class":416},[176,2503,2072],{"class":420},[176,2505,462],{"class":416},[11,2507,2508,2509,2511],{},"Here are some ",[15,2510,1994],{}," downsampling results, which are based on different parametric configurations. A percentage denotes the reduced size.",[2061,2513,2515,2518],{"id":2514},"k3-l3-m3-55",[173,2516,2517],{},"k=.3, l=.3, m=.3"," (55%)",[166,2520,2522],{"className":407,"code":2521,"language":409,"meta":171,"style":171},"\u003Csection tabindex=\"3\" type=\"example\" class=\"container\" required=\"true\">\n  # Our Pizza\n  \u003Cdiv class=\"shadow-lg\">\n    ## Margherita\n    A simple classic: mozzarela, tomatoes, and basil.\n    \u003Cbutton type=\"button\">Add\u003C/button>\n    ## Capricciosa\n    A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n    \u003Cbutton type=\"button\">Add\u003C/button>\n  
\u003C/div>\n\u003C/section>\n",[173,2523,2524,2572,2577,2595,2600,2605,2631,2636,2641,2667,2675],{"__ignoreMap":171},[176,2525,2526,2528,2530,2532,2534,2536,2538,2540,2542,2544,2546,2548,2550,2552,2554,2556,2558,2560,2562,2564,2566,2568,2570],{"class":178,"line":179},[176,2527,417],{"class":416},[176,2529,2072],{"class":420},[176,2531,2178],{"class":424},[176,2533,428],{"class":416},[176,2535,431],{"class":223},[176,2537,2021],{"class":186},[176,2539,431],{"class":223},[176,2541,2200],{"class":424},[176,2543,428],{"class":416},[176,2545,431],{"class":223},[176,2547,2207],{"class":186},[176,2549,431],{"class":223},[176,2551,2166],{"class":424},[176,2553,428],{"class":416},[176,2555,431],{"class":223},[176,2557,2173],{"class":186},[176,2559,431],{"class":223},[176,2561,2189],{"class":424},[176,2563,428],{"class":416},[176,2565,431],{"class":223},[176,2567,446],{"class":186},[176,2569,431],{"class":223},[176,2571,462],{"class":416},[176,2573,2574],{"class":178,"line":235},[176,2575,2576],{"class":213},"  # Our Pizza\n",[176,2578,2579,2581,2583,2585,2587,2589,2591,2593],{"class":178,"line":242},[176,2580,2216],{"class":416},[176,2582,2076],{"class":420},[176,2584,2166],{"class":424},[176,2586,428],{"class":416},[176,2588,431],{"class":223},[176,2590,2297],{"class":186},[176,2592,431],{"class":223},[176,2594,462],{"class":416},[176,2596,2597],{"class":178,"line":271},[176,2598,2599],{"class":213},"    ## Margherita\n",[176,2601,2602],{"class":178,"line":295},[176,2603,2604],{"class":213},"    A simple classic: mozzarela, tomatoes, and 
basil.\n",[176,2606,2607,2609,2611,2613,2615,2617,2619,2621,2623,2625,2627,2629],{"class":178,"line":315},[176,2608,2258],{"class":416},[176,2610,2353],{"class":420},[176,2612,2200],{"class":424},[176,2614,428],{"class":416},[176,2616,431],{"class":223},[176,2618,2353],{"class":186},[176,2620,431],{"class":223},[176,2622,451],{"class":416},[176,2624,2368],{"class":213},[176,2626,2269],{"class":416},[176,2628,2353],{"class":420},[176,2630,462],{"class":416},[176,2632,2633],{"class":178,"line":333},[176,2634,2635],{"class":213},"    ## Capricciosa\n",[176,2637,2638],{"class":178,"line":350},[176,2639,2640],{"class":213},"    A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[176,2642,2643,2645,2647,2649,2651,2653,2655,2657,2659,2661,2663,2665],{"class":178,"line":360},[176,2644,2258],{"class":416},[176,2646,2353],{"class":420},[176,2648,2200],{"class":424},[176,2650,428],{"class":416},[176,2652,431],{"class":223},[176,2654,2353],{"class":186},[176,2656,431],{"class":223},[176,2658,451],{"class":416},[176,2660,2368],{"class":213},[176,2662,2269],{"class":416},[176,2664,2353],{"class":420},[176,2666,462],{"class":416},[176,2668,2669,2671,2673],{"class":178,"line":365},[176,2670,2493],{"class":416},[176,2672,2076],{"class":420},[176,2674,462],{"class":416},[176,2676,2677,2679,2681],{"class":178,"line":681},[176,2678,2269],{"class":416},[176,2680,2072],{"class":420},[176,2682,462],{"class":416},[2061,2684,2686,2689],{"id":2685},"k4-l6-m8-27",[173,2687,2688],{},"k=.4, l=.6, m=.8"," (27%)",[166,2691,2693],{"className":407,"code":2692,"language":409,"meta":171,"style":171},"\u003Csection>\n  # Our Pizza\n  \u003Cdiv>\n    ## Margherita\n    A simple classic:\n    \u003Cbutton>Add\u003C/button>\n    ## Capricciosa\n    A rich taste:\n    \u003Cbutton>Add\u003C/button>\n  
\u003C/div>\n\u003C/section>\n",[173,2694,2695,2703,2707,2715,2719,2724,2740,2744,2749,2765,2773],{"__ignoreMap":171},[176,2696,2697,2699,2701],{"class":178,"line":179},[176,2698,417],{"class":416},[176,2700,2072],{"class":420},[176,2702,462],{"class":416},[176,2704,2705],{"class":178,"line":235},[176,2706,2576],{"class":213},[176,2708,2709,2711,2713],{"class":178,"line":242},[176,2710,2216],{"class":416},[176,2712,2076],{"class":420},[176,2714,462],{"class":416},[176,2716,2717],{"class":178,"line":271},[176,2718,2599],{"class":213},[176,2720,2721],{"class":178,"line":295},[176,2722,2723],{"class":213},"    A simple classic:\n",[176,2725,2726,2728,2730,2732,2734,2736,2738],{"class":178,"line":315},[176,2727,2258],{"class":416},[176,2729,2353],{"class":420},[176,2731,451],{"class":416},[176,2733,2368],{"class":213},[176,2735,2269],{"class":416},[176,2737,2353],{"class":420},[176,2739,462],{"class":416},[176,2741,2742],{"class":178,"line":333},[176,2743,2635],{"class":213},[176,2745,2746],{"class":178,"line":350},[176,2747,2748],{"class":213},"    A rich taste:\n",[176,2750,2751,2753,2755,2757,2759,2761,2763],{"class":178,"line":360},[176,2752,2258],{"class":416},[176,2754,2353],{"class":420},[176,2756,451],{"class":416},[176,2758,2368],{"class":213},[176,2760,2269],{"class":416},[176,2762,2353],{"class":420},[176,2764,462],{"class":416},[176,2766,2767,2769,2771],{"class":178,"line":365},[176,2768,2493],{"class":416},[176,2770,2076],{"class":420},[176,2772,462],{"class":416},[176,2774,2775,2777,2779],{"class":178,"line":681},[176,2776,2269],{"class":416},[176,2778,2072],{"class":420},[176,2780,462],{"class":416},[2061,2782,2784,2787],{"id":2783},"k-l0-m-35",[173,2785,2786],{},"k→∞, l=0, ∀m"," (35%)",[166,2789,2791],{"className":407,"code":2790,"language":409,"meta":171,"style":171},"# Our Pizza\n## Margherita\nA simple classic: mozzarela, tomatoes, and basil.\nAn everyday choice!\n\u003Cbutton>Add\u003C/button>\n## Capricciosa\nA rich taste: mozzarella, ham, 
mushrooms, artichokes, and olives.\nA true favourite!\n\u003Cbutton>Add\u003C/button>\n",[173,2792,2793,2798,2803,2808,2813,2829,2834,2839,2844],{"__ignoreMap":171},[176,2794,2795],{"class":178,"line":179},[176,2796,2797],{"class":213},"# Our Pizza\n",[176,2799,2800],{"class":178,"line":235},[176,2801,2802],{"class":213},"## Margherita\n",[176,2804,2805],{"class":178,"line":242},[176,2806,2807],{"class":213},"A simple classic: mozzarela, tomatoes, and basil.\n",[176,2809,2810],{"class":178,"line":271},[176,2811,2812],{"class":213},"An everyday choice!\n",[176,2814,2815,2817,2819,2821,2823,2825,2827],{"class":178,"line":295},[176,2816,417],{"class":416},[176,2818,2353],{"class":420},[176,2820,451],{"class":416},[176,2822,2368],{"class":213},[176,2824,2269],{"class":416},[176,2826,2353],{"class":420},[176,2828,462],{"class":416},[176,2830,2831],{"class":178,"line":315},[176,2832,2833],{"class":213},"## Capricciosa\n",[176,2835,2836],{"class":178,"line":333},[176,2837,2838],{"class":213},"A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[176,2840,2841],{"class":178,"line":350},[176,2842,2843],{"class":213},"A true favourite!\n",[176,2845,2846,2848,2850,2852,2854,2856,2858],{"class":178,"line":360},[176,2847,417],{"class":416},[176,2849,2353],{"class":420},[176,2851,451],{"class":416},[176,2853,2368],{"class":213},[176,2855,2269],{"class":416},[176,2857,2353],{"class":420},[176,2859,462],{"class":416},[11,2861,2862,2863,2865,2866,2868],{},"Asymptotic ",[173,2864,2038],{}," (kind of 'infinite' ",[173,2867,2038],{},") completely flattens the DOM, that is, leads to a full content linearisation similar to reader views as present in most browsers. 
Notably, it preserves all interactive elements like buttons – which are essential for a web agent.",[69,2870,2872],{"id":2871},"adaptived2snap",[15,2873,2874],{},"AdaptiveD2Snap",[11,2876,2877,2878,2880,2881,2883],{},"Fixed parameters might not be ideal for arbitrary DOMs – sourced from a landscape of web applications. We created ",[15,2879,2874],{}," – a wrapper for ",[15,2882,1994],{}," that infers suitable parameters from a given DOM in order to hit a certain token budget.",[69,2885,2887],{"id":2886},"implementation-integration","Implementation & Integration",[11,2889,2890,2891,2893],{},"Picture an LLM-based web agent that is premised on DOM snapshots. Implementing ",[15,2892,1994],{}," is simple: Deep clone the DOM, and feed it to the algorithm. Now, take the snapshot; that is, serialise the resulting DOM. Done.",[2049,2895,2896],{},[11,2897,2898,2899,2902],{},"Read our ",[37,2900,2901],{"href":1835},"gentle introduction to AI agents for the web"," to get started with high-level web agent concepts.",[11,2904,2905,2906,2908,2909,2914],{},"The open source ",[15,2907,1994],{}," API, provided as a ",[37,2910,2913],{"href":2911,"rel":2912},"https://github.com/webfuse-com/D2Snap",[41],"package on GitHub",", provides the following signature:",[166,2916,2918],{"className":196,"code":2917,"language":198,"meta":171,"style":171},"type DOM = Document | Element | string;\ntype Options = {\n  assignUniqueIDs?: boolean; // false\n  debug?: boolean;           // true\n};\n\nD2Snap.d2Snap(\n  dom: DOM,\n  k: number, l: number, m: number,\n  options?: Options\n): Promise\u003Cstring>\n\nD2Snap.adaptiveD2Snap(\n  dom: DOM,\n  maxTokens: number = 4096,\n  maxIterations: number = 5,\n  options?: Options\n): 
Promise\u003Cstring>\n\n",[173,2919,2920,2948,2959,2976,2990,2995,2999,3011,3022,3039,3049,3065,3069,3080,3088,3100,3112,3120],{"__ignoreMap":171},[176,2921,2922,2925,2929,2931,2935,2937,2940,2942,2946],{"class":178,"line":179},[176,2923,2924],{"class":245},"type",[176,2926,2928],{"class":2927},"sXbZB"," DOM ",[176,2930,428],{"class":253},[176,2932,2934],{"class":2933},"s-DR7"," Document",[176,2936,617],{"class":416},[176,2938,2939],{"class":2933}," Element",[176,2941,617],{"class":416},[176,2943,2945],{"class":2944},"scrte"," string",[176,2947,232],{"class":209},[176,2949,2950,2952,2955,2957],{"class":178,"line":235},[176,2951,2924],{"class":245},[176,2953,2954],{"class":2927}," Options ",[176,2956,428],{"class":253},[176,2958,650],{"class":209},[176,2960,2961,2965,2968,2971,2973],{"class":178,"line":242},[176,2962,2964],{"class":2963},"swl0y","  assignUniqueIDs",[176,2966,2967],{"class":416},"?:",[176,2969,2970],{"class":2944}," boolean",[176,2972,1174],{"class":209},[176,2974,2975],{"class":291}," // false\n",[176,2977,2978,2981,2983,2985,2987],{"class":178,"line":271},[176,2979,2980],{"class":2963},"  debug",[176,2982,2967],{"class":416},[176,2984,2970],{"class":2944},[176,2986,1174],{"class":209},[176,2988,2989],{"class":291},"           // true\n",[176,2991,2992],{"class":178,"line":295},[176,2993,2994],{"class":209},"};\n",[176,2996,2997],{"class":178,"line":315},[176,2998,239],{"emptyLinePlaceholder":238},[176,3000,3001,3003,3005,3008],{"class":178,"line":333},[176,3002,1994],{"class":213},[176,3004,122],{"class":373},[176,3006,3007],{"class":182},"d2Snap",[176,3009,3010],{"class":213},"(\n",[176,3012,3013,3016,3020],{"class":178,"line":350},[176,3014,3015],{"class":213},"  dom: ",[176,3017,3019],{"class":3018},"sqxXB","DOM",[176,3021,330],{"class":209},[176,3023,3024,3027,3029,3032,3034,3037],{"class":178,"line":360},[176,3025,3026],{"class":213},"  k: number",[176,3028,288],{"class":209},[176,3030,3031],{"class":213}," l: 
number",[176,3033,288],{"class":209},[176,3035,3036],{"class":213}," m: number",[176,3038,330],{"class":209},[176,3040,3041,3044,3046],{"class":178,"line":365},[176,3042,3043],{"class":213},"  options",[176,3045,2967],{"class":253},[176,3047,3048],{"class":213}," Options\n",[176,3050,3051,3054,3058,3060,3063],{"class":178,"line":681},[176,3052,3053],{"class":213},"): ",[176,3055,3057],{"class":3056},"s8Irk","Promise",[176,3059,417],{"class":253},[176,3061,3062],{"class":213},"string",[176,3064,462],{"class":253},[176,3066,3067],{"class":178,"line":697},[176,3068,239],{"emptyLinePlaceholder":238},[176,3070,3071,3073,3075,3078],{"class":178,"line":716},[176,3072,1994],{"class":213},[176,3074,122],{"class":373},[176,3076,3077],{"class":182},"adaptiveD2Snap",[176,3079,3010],{"class":213},[176,3081,3082,3084,3086],{"class":178,"line":722},[176,3083,3015],{"class":213},[176,3085,3019],{"class":3018},[176,3087,330],{"class":209},[176,3089,3090,3093,3095,3098],{"class":178,"line":739},[176,3091,3092],{"class":213},"  maxTokens: number ",[176,3094,428],{"class":253},[176,3096,3097],{"class":1037}," 4096",[176,3099,330],{"class":209},[176,3101,3102,3105,3107,3110],{"class":178,"line":756},[176,3103,3104],{"class":213},"  maxIterations: number ",[176,3106,428],{"class":253},[176,3108,3109],{"class":1037}," 5",[176,3111,330],{"class":209},[176,3113,3114,3116,3118],{"class":178,"line":772},[176,3115,3043],{"class":213},[176,3117,2967],{"class":253},[176,3119,3048],{"class":213},[176,3121,3122,3124,3126,3128,3130],{"class":178,"line":782},[176,3123,3053],{"class":213},[176,3125,3057],{"class":3056},[176,3127,417],{"class":253},[176,3129,3062],{"class":213},[176,3131,462],{"class":253},[11,3133,3134,3135,3137,3138,3143,3144,3149],{},"Moreover, ",[15,3136,1994],{}," is available on the ",[37,3139,3142],{"href":3140,"rel":3141},"https://dev.webfuse.com/automation-api",[41],"Webfuse Automation API",". 
",[37,3145,3148],{"href":3146,"rel":3147},"https://www.webfuse.com",[41],"Webfuse"," essentially is a proxy to seamlessly serve any existing web application with custom augmentations, such as a web agent widget.",[166,3151,3155],{"className":3152,"code":3153,"language":3154,"meta":171,"style":171},"language-js shiki shiki-themes catppuccin-latte night-owl","const domSnapshot = await browser.webfuseSession\n    .automation\n    .take_dom_snapshot({ modifier: 'downsample' })\n","js",[173,3156,3157,3162,3167],{"__ignoreMap":171},[176,3158,3159],{"class":178,"line":179},[176,3160,3161],{},"const domSnapshot = await browser.webfuseSession\n",[176,3163,3164],{"class":178,"line":235},[176,3165,3166],{},"    .automation\n",[176,3168,3169],{"class":178,"line":242},[176,3170,3171],{},"    .take_dom_snapshot({ modifier: 'downsample' })\n",[11,3173,3174,3175,3177],{},"Need precise control over the underlying ",[15,3176,1994],{}," invocation? Configure it exactly how you want:",[166,3179,3181],{"className":3152,"code":3180,"language":3154,"meta":171,"style":171},"const domSnapshot = await browser.webfuseSession\n    .automation\n    .take_dom_snapshot({\n        modifier: {\n            name: 'D2Snap',\n            params: { hierarchyRatio: 0.6, textRatio: 0.2, attributeRatio: 0.8 }\n        }\n    })\n",[173,3182,3183,3187,3191,3196,3201,3206,3211,3216],{"__ignoreMap":171},[176,3184,3185],{"class":178,"line":179},[176,3186,3161],{},[176,3188,3189],{"class":178,"line":235},[176,3190,3166],{},[176,3192,3193],{"class":178,"line":242},[176,3194,3195],{},"    .take_dom_snapshot({\n",[176,3197,3198],{"class":178,"line":271},[176,3199,3200],{},"        modifier: {\n",[176,3202,3203],{"class":178,"line":295},[176,3204,3205],{},"            name: 'D2Snap',\n",[176,3207,3208],{"class":178,"line":315},[176,3209,3210],{},"            params: { hierarchyRatio: 0.6, textRatio: 0.2, attributeRatio: 0.8 }\n",[176,3212,3213],{"class":178,"line":333},[176,3214,3215],{},"        
}\n",[176,3217,3218],{"class":178,"line":350},[176,3219,3220],{},"    })\n",[69,3222,3224],{"id":3223},"performance-evaluation","Performance Evaluation",[11,3226,3227,3228,3230,3231,3233,3234,3236],{},"Now for the moment of truth: How does ",[15,3229,1994],{}," stack up against the industry standard? We evaluated ",[15,3232,1994],{}," in comparison to a grounded GUI snapshot baseline close to those used by ",[15,3235,1875],{}," – coloured bounding boxes around visible interactive elements.",[11,3238,3239,3240,3245],{},"To evaluate snapshots isolated from specific agent logic, we crafted a dataset that spans all UI states that occur while solving a related task. We sampled our dataset from the existing ",[37,3241,3244],{"href":3242,"rel":3243},"https://github.com/OSU-NLP-Group/Online-Mind2Web",[41],"Online-Mind2Web"," dataset.",[57,3247],{":width":3248,"alt":3249,"format":64,"loading":62,"src":3250},"800","Exemplary solution UI state trajectory of a defined web-based task","/blog/dom-downsampling-for-web-agents/3.png",[11,3252,3253],{},[1918,3254,3255],{},"Exemplary solution UI state trajectory for the task: “View the pricing plan for 'Business'. Specifically, we have 100 users. We need a 1PB storage quota and a 50 TB transfer quota.”",[11,3257,3258],{},"These are our key findings...",[2061,3260,3262],{"id":3261},"substantial-success-rates","Substantial Success Rates",[11,3264,3265,3266,3268],{},"The results exceeded our expectations. Not only did ",[15,3267,1994],{}," meet the baseline's performance – our best configuration outperformed it by a significant margin. 
Full linearisation matches the baseline in both performance and the order of magnitude of estimated model input tokens.",[57,3270],{":width":3271,"alt":3272,"format":64,"loading":62,"src":3273},"550","Success rate per web agent snapshot subject evaluated across the dataset","/blog/dom-downsampling-for-web-agents/4.png",[1918,3275,3276,3277,3284,3285,3287,3288,3291,3292,3295,3296,3299,3300,3303,3304,3307,3308,3311],{},"\n  Success rate per web agent snapshot subject evaluated across the dataset.\n  Labels: ",[173,3278,3279,3280],{},"GUI",[3281,3282,3283],"sub",{}," gr.",": Baseline, ",[173,3286,3019],{},": Raw DOM (cut-off at ~8K tokens), ",[173,3289,3290],{},"k( l m)",": Parameter values (e.g., ",[173,3293,3294],{},".9 .3 .6",", or ",[173,3297,3298],{},".4"," if equal). ",[173,3301,3302],{},"∞",": Linearisation, ",[173,3305,3306],{},"8192 / 32768",": via token-limited (resp.) ",[3309,3310,2874],"i",{},".\n",[2061,3313,3315],{"id":3314},"containable-token-and-byte-size","Containable Token and Byte Size",[11,3317,3318,3319,3321],{},"Even light downsampling delivers dramatic size reductions. Most ",[15,3320,1994],{}," configurations average just one order of magnitude more tokens than the baseline – a massive improvement over raw DOM snapshots. Better yet, most DOMs from the dataset could actually be downsampled to the baseline's order of magnitude. 
And while image data balloons in file size, our text-based approach stays lean and efficient.",[57,3323],{":width":3248,"alt":3324,"format":64,"loading":62,"src":3325},"Comparison of mean input size across and per subject","/blog/dom-downsampling-for-web-agents/5.png",[1918,3327,3328,3329,3332,3333,3335],{},"\n  Left: Comparison of mean input size (tokens vs bytes) across and per subject.",[3330,3331],"br",{},"\n  Right: Estimated input token size across the dataset created by a single ",[3309,3334,1994],{}," evaluation subject.\n",[2061,3337,3339],{"id":3338},"hierarchy-actually-matters","Hierarchy Actually Matters",[11,3341,3342],{},"Which UI feature matters most for LLM web agent backend performance? We alternated parameter configurations to find out. Interestingly, hierarchy reveals itself as the strongest of the three assessed features. Element extraction throws away hierarchy, which suggests that downsampling is a superior technique.",[2072,3344,3347,3352],{"className":3345,"dataFootnotes":171},[3346],"footnotes",[30,3348,3351],{"className":3349,"id":1935},[3350],"sr-only","Footnotes",[1641,3353,3354,3369,3380,3391],{},[87,3355,3357,3361,3362],{"id":3356},"user-content-fn-1",[37,3358,3359],{"href":3359,"rel":3360},"https://arxiv.org/abs/2210.03945",[41]," ",[37,3363,3368],{"href":3364,"ariaLabel":3365,"className":3366,"dataFootnoteBackref":171},"#user-content-fnref-1","Back to reference 1",[3367],"data-footnote-backref","↩",[87,3370,3372,3361,3375],{"id":3371},"user-content-fn-2",[37,3373,2000],{"href":2000,"rel":3374},[41],[37,3376,3368],{"href":3377,"ariaLabel":3378,"className":3379,"dataFootnoteBackref":171},"#user-content-fnref-2","Back to reference 2",[3367],[87,3381,3383,3361,3386],{"id":3382},"user-content-fn-3",[37,3384,2911],{"href":2911,"rel":3385},[41],[37,3387,3368],{"href":3388,"ariaLabel":3389,"className":3390,"dataFootnoteBackref":171},"#user-content-fnref-3","Back to reference 
3",[3367],[87,3392,3394,3361,3398],{"id":3393},"user-content-fn-4",[37,3395,3396],{"href":3396,"rel":3397},"https://aclanthology.org/W04-3252",[41],[37,3399,3368],{"href":3400,"ariaLabel":3401,"className":3402,"dataFootnoteBackref":171},"#user-content-fnref-4","Back to reference 4",[3367],[1785,3404,3405],{},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .sY2RG, html code.shiki .sY2RG{--shiki-default:#1E66F5;--shiki-dark:#CAECE6}html pre.shiki code .swkLt, html code.shiki .swkLt{--shiki-default:#DF8E1D;--shiki-default-font-style:inherit;--shiki-dark:#C5E478;--shiki-dark-font-style:italic}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sfrMT, html code.shiki .sfrMT{--shiki-default:#40A02B;--shiki-dark:#ECC48D}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html code.shiki 
.sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s-_ek, html code.shiki .s-_ek{--shiki-default:#179299;--shiki-dark:#C792EA}html pre.shiki code .s-DR7, html code.shiki .s-DR7{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#FFCB8B;--shiki-dark-font-style:inherit}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .swl0y, html code.shiki .swl0y{--shiki-default:#4C4F69;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .sDmS1, html code.shiki .sDmS1{--shiki-default:#7C7F93;--shiki-default-font-style:italic;--shiki-dark:#637777;--shiki-dark-font-style:italic}html pre.shiki code .s5FwJ, html code.shiki .s5FwJ{--shiki-default:#179299;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .sNstc, html code.shiki .sNstc{--shiki-default:#1E66F5;--shiki-default-font-style:italic;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .sqxXB, html code.shiki .sqxXB{--shiki-default:#4C4F69;--shiki-dark:#82AAFF}html pre.shiki code .s8Irk, html code.shiki .s8Irk{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#C5E478;--shiki-dark-font-style:inherit}html pre.shiki code .sZ_Zo, html code.shiki 
.sZ_Zo{--shiki-default:#FE640B;--shiki-dark:#F78C6C}",{"title":171,"searchDepth":235,"depth":235,"links":3407},[3408,3412,3413,3420],{"id":1883,"depth":235,"text":1884,"children":3409},[3410,3411],{"id":1894,"depth":242,"text":1895},{"id":1923,"depth":242,"text":1924},{"id":1973,"depth":235,"text":1974},{"id":1991,"depth":235,"text":1994,"children":3414},[3415,3416,3417,3418,3419],{"id":2028,"depth":242,"text":2029},{"id":2148,"depth":242,"text":2149},{"id":2871,"depth":242,"text":2874},{"id":2886,"depth":242,"text":2887},{"id":3223,"depth":242,"text":3224},{"id":1935,"depth":235,"text":3351},"2025-08-18","We propose D2Snap – a first-of-its-kind downsampling algorithm for DOMs. D2Snap can be used as a pre-processing technique for DOM snapshots to optimise web agency context quality and token costs.",{"homepage":238,"relatedLinks":3424},[3425,3429,3432],{"text":3426,"href":3427,"description":3428},"What is a Website Snapshot?","/blog/snapshots-provide-llms-with-website-state","Learn what a website snapshot is and how to utilise it for web agents",{"text":3430,"href":1835,"description":3431},"What is a Web Agent?","Learn the basics of web agents",{"text":3142,"href":3433,"external":238,"description":3434},"https://dev.webfuse.com/automation-api#take_dom_snapshot","Check out the Webfuse Automation 
API","/blog/dom-downsampling-for-llm-based-web-agents",{"title":1849,"description":3422},{"loc":3435},"blog/1012.dom-downsampling-for-llm-based-web-agents",[1804,3440,3441,3442,1844,3443],"browser-agents","llms","llm-context","web-automation","bGJtg_9k7O95O2CJswaRFj4ONGhX4hGr_8aL5dhDZms",{"id":3446,"title":1834,"authorId":1850,"body":3447,"category":1804,"created":4170,"description":4171,"extension":1807,"faqs":1822,"featurePriority":1822,"head":1822,"landingPath":1822,"meta":4172,"navigation":238,"ogImage":1822,"path":1835,"robots":1822,"schemaOrg":1822,"seo":4180,"sitemap":4181,"stem":4182,"tags":4183,"__hash__":4184},"blog/blog/1011.a-gentle-introduction-to-ai-agents-for-the-web.md",{"type":8,"value":3448,"toc":4151},[3449,3463,3466,3473,3479,3483,3486,3500,3504,3514,3518,3522,3535,3539,3543,3546,3551,3555,3564,3568,3579,3584,3588,3606,3610,3616,3715,3718,3951,3967,3971,3974,3979,3983,3986,3990,4008,4033,4040,4044,4082,4085,4096,4100,4103,4131,4135,4143,4148],[11,3450,3451,3452,1865,3456,2042,3459,3462],{},"In no time, AI became a natural part of modern web interfaces. AI agents for the web enjoy a recent hype, sparked by the means of ",[37,3453,1864],{"href":3454,"rel":3455},"https://openai.com/index/introducing-operator/",[41],[37,3457,1870],{"href":1868,"rel":3458},[41],[37,3460,1875],{"href":1873,"rel":3461},[41],". By now, it is within reach to automate arbitrary web-based tasks, such as booking the cheapest flight from Berlin to Amsterdam.",[30,3464,3430],{"id":3465},"what-is-a-web-agent",[11,3467,3468,3469,3472],{},"For starters, let us break down the term ",[90,3470,3471],{},"web AI agent",": An agent is an entity that autonomously acts on behalf of another entity. An artificially intelligent agent is an application that acts on behalf of a human. In contrast to non-AI computer agents, it solves complex tasks with at least human-grade effectiveness and efficiency. 
For a human-centric web, web agents have deliberately been designed to browse the web in a human fashion – through UIs rather than APIs.",[57,3474],{":width":3475,"alt":3476,"format":3477,"loading":62,"src":3478},"610","High-level agent description comparing human and computer agents","svg","/blog/a-gentle-introduction-to-ai-agents-for-the-web/1.svg",[69,3480,3482],{"id":3481},"the-role-of-frontier-llms","The Role of Frontier LLMs",[11,3484,3485],{},"Web agents have been a vague desire for a long time. AI agents used to rely on complete models of a problem domain in order to allow (heuristic) search through problem states. Such models would comprise the problem world (e.g., a chessboard), actors (pawns, rooks, etc.), possible actions per actor (rook moves straight), and constraints (i.a., max one piece per field). A heterogeneous space of web application UIs describes the problem domain of a web agent: how to understand a web page, and how to interact with it to solve the declared task?",[11,3487,3488,3489,3495,3496,3499],{},"Frontier LLMs disrupted the AI agent world: explicit problem domain models beyond feasibility can now be replaced by an LLM. The LLM thereby acts as an instantaneous domain model backend that can be consulted with twofold context: serialised problem state, such as a chess position code (",[15,3490,3491,3492,3494],{},"“",[176,3493,1231],{}," e4 e5 2. Nc3 f5”","), and the respective task (",[15,3497,3498],{},"“What is the best move for white?”","). For web agents, problem state corresponds to the currently browsed web application's runtime state, for instance, a screenshot.",[69,3501,3503],{"id":3502},"generalist-web-agents","Generalist Web Agents",[11,3505,3506,3507,2042,3510,3513],{},"Generalist web agents are supposed to solve arbitrary tasks through a web browser. 
Web-based tasks can be as diverse as ",[15,3508,3509],{},"“Find a picture of a cat.”",[15,3511,3512],{},"“Book the cheapest flight from Berlin to Amsterdam tomorrow afternoon (business class, window seat).”"," In reality, generalist agents still fail at uncommon or overly precise tasks. While they have been critically acclaimed, they mainly act as early proofs of concept. Tasks that a generalist agent can indeed solve promise even better results with a corresponding specialist agent.",[57,3515],{":width":1855,"alt":3516,"format":64,"loading":62,"src":3517},"Screenshot of a generalist web agent UI (Director)","/blog/a-gentle-introduction-to-ai-agents-for-the-web/2.png",[69,3519,3521],{"id":3520},"specialist-web-agents","Specialist Web Agents",[11,3523,3524,3525,3528,3529,3534],{},"Unlike generalist agents, specialist web agents are constrained to a certain task and application domain. Specialist agents account for the major share of commercial value. The most prominent examples are modal chat agents that provide users with on-page help. Picture a little floating widget that users can chat with via text or voice input. In most cases, in fact, the term ",[15,3526,3527],{},"web (AI) agent"," refers to chat agents. Chat agents – text or voice – can be implemented on top of virtually any existing website. Frontier LLMs provide a lot of common sense out of the box. 
A ",[37,3530,3533],{"href":3531,"rel":3532},"https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/system-prompts",[41],"system prompt"," can, moreover, be leveraged to drive specialist agent quality for the respective problem domain.",[57,3536],{":width":1855,"alt":3537,"format":64,"loading":62,"src":3538},"Screenshots of two modal specialist web agent UIs augmenting an underlying website's UI","/blog/a-gentle-introduction-to-ai-agents-for-the-web/3.png",[30,3540,3542],{"id":3541},"how-does-a-web-agent-work","How Does a Web Agent Work?",[11,3544,3545],{},"LLM-based web agents are premised on a more or less uniform architecture. The agent application embodies a mediator between a web browser (environment), and the LLM backend (model).",[57,3547],{":width":3548,"alt":3549,"format":3477,"loading":62,"src":3550},"480","High-level web agent architecture component view","/blog/a-gentle-introduction-to-ai-agents-for-the-web/4.svg",[69,3552,3554],{"id":3553},"the-agent-lifecycle","The Agent Lifecycle",[11,3556,3557,3558,3563],{},"To reduce a user's cognitive load, solving a web-based task is usually chunked into a sequence of UI states. Consider looking for rental apartments on ",[37,3559,3562],{"href":3560,"rel":3561},"https://www.redfin.com",[41],"redfin.com",": In the first step, you specify a location. Only subsequently are you provided with a grid of available apartments for that location.",[57,3565],{":width":1855,"alt":3566,"format":64,"loading":62,"src":3567},"Example of separated UI states in a rental home search application","/blog/a-gentle-introduction-to-ai-agents-for-the-web/5.png",[11,3569,3570,3571,3578],{},"Web agent logic is iterative; not least for a sequential web interaction model, but also for a conversational agent interaction model. Browsing the web, human and computer agents represent users alike. 
That said, Norman's well-known ",[37,3572,3575],{"href":3573,"rel":3574},"https://mitpress.mit.edu/9780262640374/the-design-of-everyday-things/",[41],[15,3576,3577],{},"Seven Stages of Action",", which hierarchically model the human cognition cycle, transfer to the web agent lifecycle. For each UI state in a web browser (environment) and web-based task (action intention), decide where to click, type, etc. (action planning), and perform those clicks, etc. (action execution). Afterwards, perceive, interpret, and evaluate the results of those actions in the web browser (state). As long as there is a mismatch between the evaluated state and the declared goal state, repeat that cycle. If necessary, prompt the user for additional information.",[57,3580],{":width":3581,"alt":3582,"format":3477,"loading":62,"src":3583},"580","Donald Norman's 'Seven Stages of Action' model of the human cognition cycle that transfers to non-human agents","/blog/a-gentle-introduction-to-ai-agents-for-the-web/6.svg",[69,3585,3587],{"id":3586},"web-context-for-llms","Web Context for LLMs",[11,3589,3590,3591,3593,3594,3597,3598,3601,3602,3605],{},"The gap from an agent towards the environment, according to ",[15,3592,3577],{},", is known as the ",[15,3595,3596],{},"gulf of execution",". In real-world scenarios, how to act in the environment with respect to a planned sequence of actions might be difficult (e.g., how to actually open the trunk of a new car?). Arguably, web agents face a novel ",[15,3599,3600],{},"gulf of intention"," towards the action planning stage: how to serialise a currently browsed web page's runtime state for LLMs? ",[15,3603,3604],{},"Snapshot"," is a more comprehensive term to describe the serialisation of a web page's current runtime state. Screenshots, for instance, represent a type of snapshot that closely resembles how humans perceive a web page at a given point in time. 
But are they as accessible to LLMs?",[69,3607,3609],{"id":3608},"agentic-ui-interaction","Agentic UI Interaction",[11,3611,3612,3613,3615],{},"With a qualified set of well-defined actuation methods, web agents are able to close the ",[15,3614,3596],{}," quite well. HTML element types strongly afford a certain action (e.g., click a button, type into a field). Below is what an actuation schema presented to the LLM backend could look like:",[166,3617,3619],{"className":196,"code":3618,"language":198,"meta":171,"style":171},"type ActuationSchema = {\n    thought: string;\n    action: \"click\"\n        | \"scroll\"\n        | \"type\";\n    cssSelector: string;\n    data?: string;\n}[];\n",[173,3620,3621,3634,3645,3660,3672,3684,3695,3706],{"__ignoreMap":171},[176,3622,3623,3626,3629,3632],{"class":178,"line":179},[176,3624,3625],{"class":245},"type",[176,3627,3628],{"class":2927}," ActuationSchema",[176,3630,3631],{"class":213}," = ",[176,3633,268],{"class":209},[176,3635,3636,3639,3641,3643],{"class":178,"line":235},[176,3637,3638],{"class":213},"    thought",[176,3640,278],{"class":416},[176,3642,2945],{"class":2944},[176,3644,232],{"class":209},[176,3646,3647,3650,3652,3654,3658],{"class":178,"line":242},[176,3648,3649],{"class":213},"    action",[176,3651,278],{"class":416},[176,3653,1478],{"class":223},[176,3655,3657],{"class":3656},"sgAC-","click",[176,3659,1484],{"class":223},[176,3661,3662,3665,3667,3670],{"class":178,"line":271},[176,3663,3664],{"class":416},"        |",[176,3666,1478],{"class":223},[176,3668,3669],{"class":3656},"scroll",[176,3671,1484],{"class":223},[176,3673,3674,3676,3678,3680,3682],{"class":178,"line":295},[176,3675,3664],{"class":416},[176,3677,1478],{"class":223},[176,3679,2924],{"class":3656},[176,3681,431],{"class":223},[176,3683,232],{"class":209},[176,3685,3686,3689,3691,3693],{"class":178,"line":315},[176,3687,3688],{"class":213},"    
cssSelector",[176,3690,278],{"class":416},[176,3692,2945],{"class":2944},[176,3694,232],{"class":209},[176,3696,3697,3700,3702,3704],{"class":178,"line":333},[176,3698,3699],{"class":213},"    data",[176,3701,2967],{"class":416},[176,3703,2945],{"class":2944},[176,3705,232],{"class":209},[176,3707,3708,3710,3713],{"class":178,"line":350},[176,3709,217],{"class":209},[176,3711,3712],{"class":213},"[]",[176,3714,232],{"class":209},[11,3716,3717],{},"And a suggested actions response could, in turn, look as follows:",[166,3719,3723],{"className":3720,"code":3721,"language":3722,"meta":171,"style":171},"language-json shiki shiki-themes catppuccin-latte night-owl","[\n    {\n        \"thought\": \"Scroll newsletter cta into view\",\n        \"action\": \"scroll\",\n        \"cssSelector\": \"section#newsletter\"\n    },\n    {\n        \"thought\": \"Type email address to newsletter cta\",\n        \"action\": \"type\",\n        \"cssSelector\": \"section#newsletter > input\",\n        \"data\": \"user@example.org\"\n    },\n    {\n        \"thought\": \"Submit newsletter sign up\",\n        \"action\": \"click\",\n        \"cssSelector\": \"section#newsletter > button\"\n    }\n]\n","json",[173,3724,3725,3730,3735,3759,3778,3796,3801,3805,3824,3842,3861,3879,3883,3887,3906,3924,3941,3946],{"__ignoreMap":171},[176,3726,3727],{"class":178,"line":179},[176,3728,3729],{"class":209},"[\n",[176,3731,3732],{"class":178,"line":235},[176,3733,3734],{"class":209},"    {\n",[176,3736,3737,3741,3745,3747,3749,3751,3755,3757],{"class":178,"line":242},[176,3738,3740],{"class":3739},"srFR9","        \"",[176,3742,3744],{"class":3743},"s30W1","thought",[176,3746,431],{"class":3739},[176,3748,278],{"class":209},[176,3750,1478],{"class":223},[176,3752,3754],{"class":3753},"sCC8C","Scroll newsletter cta into 
view",[176,3756,431],{"class":223},[176,3758,330],{"class":209},[176,3760,3761,3763,3766,3768,3770,3772,3774,3776],{"class":178,"line":271},[176,3762,3740],{"class":3739},[176,3764,3765],{"class":3743},"action",[176,3767,431],{"class":3739},[176,3769,278],{"class":209},[176,3771,1478],{"class":223},[176,3773,3669],{"class":3753},[176,3775,431],{"class":223},[176,3777,330],{"class":209},[176,3779,3780,3782,3785,3787,3789,3791,3794],{"class":178,"line":295},[176,3781,3740],{"class":3739},[176,3783,3784],{"class":3743},"cssSelector",[176,3786,431],{"class":3739},[176,3788,278],{"class":209},[176,3790,1478],{"class":223},[176,3792,3793],{"class":3753},"section#newsletter",[176,3795,1484],{"class":223},[176,3797,3798],{"class":178,"line":315},[176,3799,3800],{"class":209},"    },\n",[176,3802,3803],{"class":178,"line":333},[176,3804,3734],{"class":209},[176,3806,3807,3809,3811,3813,3815,3817,3820,3822],{"class":178,"line":350},[176,3808,3740],{"class":3739},[176,3810,3744],{"class":3743},[176,3812,431],{"class":3739},[176,3814,278],{"class":209},[176,3816,1478],{"class":223},[176,3818,3819],{"class":3753},"Type email address to newsletter cta",[176,3821,431],{"class":223},[176,3823,330],{"class":209},[176,3825,3826,3828,3830,3832,3834,3836,3838,3840],{"class":178,"line":360},[176,3827,3740],{"class":3739},[176,3829,3765],{"class":3743},[176,3831,431],{"class":3739},[176,3833,278],{"class":209},[176,3835,1478],{"class":223},[176,3837,2924],{"class":3753},[176,3839,431],{"class":223},[176,3841,330],{"class":209},[176,3843,3844,3846,3848,3850,3852,3854,3857,3859],{"class":178,"line":365},[176,3845,3740],{"class":3739},[176,3847,3784],{"class":3743},[176,3849,431],{"class":3739},[176,3851,278],{"class":209},[176,3853,1478],{"class":223},[176,3855,3856],{"class":3753},"section#newsletter > 
input",[176,3858,431],{"class":223},[176,3860,330],{"class":209},[176,3862,3863,3865,3868,3870,3872,3874,3877],{"class":178,"line":681},[176,3864,3740],{"class":3739},[176,3866,3867],{"class":3743},"data",[176,3869,431],{"class":3739},[176,3871,278],{"class":209},[176,3873,1478],{"class":223},[176,3875,3876],{"class":3753},"user@example.org",[176,3878,1484],{"class":223},[176,3880,3881],{"class":178,"line":697},[176,3882,3800],{"class":209},[176,3884,3885],{"class":178,"line":716},[176,3886,3734],{"class":209},[176,3888,3889,3891,3893,3895,3897,3899,3902,3904],{"class":178,"line":722},[176,3890,3740],{"class":3739},[176,3892,3744],{"class":3743},[176,3894,431],{"class":3739},[176,3896,278],{"class":209},[176,3898,1478],{"class":223},[176,3900,3901],{"class":3753},"Submit newsletter sign up",[176,3903,431],{"class":223},[176,3905,330],{"class":209},[176,3907,3908,3910,3912,3914,3916,3918,3920,3922],{"class":178,"line":739},[176,3909,3740],{"class":3739},[176,3911,3765],{"class":3743},[176,3913,431],{"class":3739},[176,3915,278],{"class":209},[176,3917,1478],{"class":223},[176,3919,3657],{"class":3753},[176,3921,431],{"class":223},[176,3923,330],{"class":209},[176,3925,3926,3928,3930,3932,3934,3936,3939],{"class":178,"line":756},[176,3927,3740],{"class":3739},[176,3929,3784],{"class":3743},[176,3931,431],{"class":3739},[176,3933,278],{"class":209},[176,3935,1478],{"class":223},[176,3937,3938],{"class":3753},"section#newsletter > button",[176,3940,1484],{"class":223},[176,3942,3943],{"class":178,"line":772},[176,3944,3945],{"class":209},"    }\n",[176,3947,3948],{"class":178,"line":782},[176,3949,3950],{"class":209},"]\n",[2049,3952,3953],{},[11,3954,3955,3960,3961,3966],{},[37,3956,3959],{"href":3957,"rel":3958},"https://platform.openai.com/docs/guides/function-calling",[41],"Function Calling"," and the ",[37,3962,3965],{"href":3963,"rel":3964},"https://modelcontextprotocol.io",[41],"Model Context Protocol"," represent two ends to outsource an explicit actuation 
model – server- and client-side, respectively.",[69,3968,3970],{"id":3969},"agentic-ui-augmentation","Agentic UI Augmentation",[11,3972,3973],{},"An agent represents yet another feature to integrate with an application and its UI. Discoverability and availability, however, are among the most fundamental requirements of a web agent. Evidently, when a user experiences UI/UX friction, at least the agent should be interactive. That said, a scrolling modal web agent UI has been the go-to approach, that is, a little floating widget on top of the underlying application's UI. It comes with a major advantage: the agent application can be decoupled from the underlying, self-contained application.",[57,3975],{":width":3976,"alt":3977,"format":3477,"loading":62,"src":3978},"360","Depiction of a web agent application augmenting an underlying application in an isolated layer","/blog/a-gentle-introduction-to-ai-agents-for-the-web/7.svg",[30,3980,3982],{"id":3981},"how-to-build-a-web-agent","How to Build a Web Agent?",[11,3984,3985],{},"Believe it or not: enhancing an existing web application with a purposeful agent is a lower-hanging fruit. The evolving agent ecosystem provides you with a spectrum of solutions: instantly use a pre-compiled agent, tweak a templated agent, or develop an agent from scratch. Either way, LLMs and web browsers exist for reuse, boiling down agent development to LLM context engineering, and UI augmentation.",[69,3987,3989],{"id":3988},"develop-a-web-agent","Develop a Web Agent",[11,3991,3992,3993,3996,3997,2042,4002,4007],{},"Opting for a ",[90,3994,3995],{},"pre-compiled agent"," does not necessarily involve any actual development step. Instead, pre-compiled agents allow for high-level configuration through an agent-as-a-service provider's interface. 
Popular agent-as-a-service providers are, i.a., ",[37,3998,4001],{"href":3999,"rel":4000},"https://elevenlabs.io/conversational-ai",[41],"ElevenLabs",[37,4003,4006],{"href":4004,"rel":4005},"https://www.intercom.com/drlp/ai-agent",[41],"Intercom",". Serviced agents hide LLM communication and potentially interaction with a web browser behind the configuration interface.",[11,4009,4010,4011,4014,4015,4020,4021,4026,4027,4032],{},"Using a ",[90,4012,4013],{},"templated agent"," resembles the agent-as-a-service approach on a lower level. Openly sourced from a ",[37,4016,4019],{"href":4017,"rel":4018},"https://github.com/webfuse-com/agent-extension-blueprint",[41],"code repository",", templated agents allow for any kind of development tweaks. Favourably, agent templates shortcut integration with ",[37,4022,4025],{"href":4023,"rel":4024},"https://openai.com/api/",[41],"LLM APIs"," and web ",[37,4028,4031],{"href":4029,"rel":4030},"https://developer.mozilla.org/en-US/docs/Web/API",[41],"browser APIs",". Using a templated agent usually represents the preferable, best-of-both-worlds approach; common- and best-practice code snippets are available from the beginning, but everything can be customised as desired.",[11,4034,4035,4036,4039],{},"Of course, developing an ",[90,4037,4038],{},"agent from scratch"," is always an option. It is preferable whenever agent requirements deviate to a large extent from what exists in the service or template landscape.",[69,4041,4043],{"id":4042},"deploy-a-web-agent","Deploy a Web Agent",[11,4045,4046,4047,2073,4052,4057,4058,4063,4064,4069,4070,4075,4076,4081],{},"When web agent code lives side-by-side with the augmented application's code, agent deployment is covered by a generic pipeline. 
Something like: ",[37,4048,4051],{"href":4049,"rel":4050},"https://eslint.org",[41],"linting",[37,4053,4056],{"href":4054,"rel":4055},"https://prettier.io",[41],"formatting"," agent code, ",[37,4059,4062],{"href":4060,"rel":4061},"https://esbuild.github.io",[41],"transpiling and bundling"," agent modules, ",[37,4065,4068],{"href":4066,"rel":4067},"https://www.cypress.io",[41],"testing"," agent, ",[37,4071,4074],{"href":4072,"rel":4073},"https://pages.cloudflare.com",[41],"hosting"," agent bundle, and ",[37,4077,4080],{"href":4078,"rel":4079},"https://docs.github.com/en/actions/get-started/continuous-integration",[41],"triggering"," post-deployment events. In that case, an agent represents a modular feature component in the application, no different than, for instance, a sign-up component.",[11,4083,4084],{},"Web agent source code right inside the application codebase comes at a cost:",[84,4086,4087,4090,4093],{},[87,4088,4089],{},"Agent developers can manipulate the source code of the underlying application.",[87,4091,4092],{},"Agent functionality could introduce side effects on the underlying application.",[87,4094,4095],{},"Agent changes require deployment of the entire application.",[69,4097,4099],{"id":4098},"best-practices-of-agentic-ux","Best Practices of Agentic UX",[11,4101,4102],{},"When designing user experiences for agent-enhanced applications, there are a few things to consider:",[84,4104,4105,4106,4105,4115,4105,4123],{},"\n    ",[87,4107,4108,4109,4108,4112,4114],{},"\n        ",[90,4110,4111],{},"Stream input and output to reduce latency",[3330,4113],{},"\n        LLMs (re-)introduce noticeable communication round-trip time. 
To reduce wait time for the human user, stream chunks of data whenever they are available.\n    ",[87,4116,4108,4117,4108,4120,4122],{},[90,4118,4119],{},"Provide fine-grained feedback to bridge high latency",[3330,4121],{},"\n        Human attention is sensitive to several seconds of system response time (https://www.nngroup.com/articles/response-times-3-important-limits/). Periodically provide agent thoughts as feedback to perceptibly break down round-trip time.\n    ",[87,4124,4108,4125,4108,4128,4130],{},[90,4126,4127],{},"Always prompt the human user for consent to perform critical actions",[3330,4129],{},"\n        Some actions in a web application lead to irreversible or significant changes of state. Never have the agent perform such actions on behalf of the user without explicitly asking for permission.\n    ",[69,4132,4134],{"id":4133},"non-invasive-web-agents-with-webfuse","Non-Invasive Web Agents with Webfuse",[11,4136,4137,4142],{},[37,4138,4140],{"href":3146,"rel":4139},[41],[90,4141,3148],{}," is a configurable web proxy that lets you augment any web application. As pictured, web agents represent highly self-contained applications. Moreover, web agents and underlying applications communicate at runtime in the client. This does, in fact, open up opportunities to address the above-mentioned drawbacks with Webfuse: develop web agents with a sandbox extension methodology, and deploy them through the low-latency proxy layer. On demand, seamlessly serve users with your agent-enhanced website. 
Benefit from information hiding, safe code, and fewer deployments.",[502,4144],{":demoAction":4145,"heading":4146,"subtitle":4147},"{\"text\":\"Read more\",\"showIcon\":false,\"href\":\"https://www.webfuse.com/blog/category/ai-agents\"}","Deploy Web Agents with Webfuse","Develop or deploy web agents in minutes; serve agent-enhanced websites through an isolated application layer.",[1785,4149,4150],{},"html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html code.shiki .sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sgAC-, html code.shiki .sgAC-{--shiki-default:#40A02B;--shiki-default-font-style:italic;--shiki-dark:#ECC48D;--shiki-dark-font-style:inherit}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark 
.shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .srFR9, html code.shiki .srFR9{--shiki-default:#7C7F93;--shiki-dark:#7FDBCA}html pre.shiki code .s30W1, html code.shiki .s30W1{--shiki-default:#1E66F5;--shiki-dark:#7FDBCA}html pre.shiki code .sCC8C, html code.shiki .sCC8C{--shiki-default:#40A02B;--shiki-dark:#C789D6}",{"title":171,"searchDepth":235,"depth":235,"links":4152},[4153,4158,4164],{"id":3465,"depth":235,"text":3430,"children":4154},[4155,4156,4157],{"id":3481,"depth":242,"text":3482},{"id":3502,"depth":242,"text":3503},{"id":3520,"depth":242,"text":3521},{"id":3541,"depth":235,"text":3542,"children":4159},[4160,4161,4162,4163],{"id":3553,"depth":242,"text":3554},{"id":3586,"depth":242,"text":3587},{"id":3608,"depth":242,"text":3609},{"id":3969,"depth":242,"text":3970},{"id":3981,"depth":235,"text":3982,"children":4165},[4166,4167,4168,4169],{"id":3988,"depth":242,"text":3989},{"id":4042,"depth":242,"text":4043},{"id":4098,"depth":242,"text":4099},{"id":4133,"depth":242,"text":4134},"2025-06-15","LLMs only recently enabled serviceable web agents: autonomous systems that browse web on behalf of a human. Get started with fundamental methodology, key design challenges, and technological opportunities.",{"homepage":238,"relatedLinks":4173},[4174,4175,4178],{"text":3426,"href":3427,"description":3428},{"text":4176,"href":1831,"description":4177},"Develop an AI Agent for Any Website with Webfuse","Learn how to develop and deploy a web agent for any website with Webfuse",{"text":3142,"href":4179,"external":238,"description":3434},"https://dev.webfuse.com/automation-api/",{"title":1834,"description":4171},{"loc":1835},"blog/1011.a-gentle-introduction-to-ai-agents-for-the-web",[1804,3440,3441,1844,3443],"9anWTMfg6llLSdye3e9qWZZZcEAZcELLMk_vpnixn3M",1779282829738]