[{"data":1,"prerenderedAt":2899},["ShallowReactive",2],{"/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026":3,"related-/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026":491},{"id":4,"title":5,"authorId":6,"body":7,"category":453,"created":454,"description":455,"extension":456,"faqs":457,"featurePriority":467,"head":468,"landingPath":468,"meta":469,"navigation":480,"ogImage":468,"path":481,"robots":468,"schemaOrg":468,"seo":482,"sitemap":483,"stem":484,"tags":485,"__hash__":490},"blog/blog/1046.5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026.md","5 Best Tools to Let AI Agents Guide Users Through a Website (2026)","salome-koshadze",{"type":8,"value":9,"toc":418},"minimark",[10,14,43,46,49,52,61,64,70,73,78,83,86,93,98,101,104,108,142,146,149,153,156,162,166,169,173,176,179,182,214,217,220,223,226,230,233,237,240,243,246,278,281,284,287,290,294,297,301,304,307,310,342,345,348,351,354,358,361,365,368,371,374,406,409,412,415],[11,12,13],"p",{},"A customer calls their bank about a mortgage. A voice agent answers, not a menu, not holding music. An AI understands the question instantly. \"I can help you calculate that. I'm opening the mortgage calculator on your screen now.\" In the customer's browser, the bank's mortgage portal loads. The voice agent reads the page, understands the form, and begins a conversation. As the customer answers each question, the agent fills the form in real-time, explaining what each field means and scrolling to show the results. 
The entire interaction happens on the bank's own web infrastructure, fully audited and governed.",[15,16,18,25,31,37],"tldr-box",{"title":17},"TL;DR",[11,19,20,24],{},[21,22,23],"strong",{},"The hard part is execution, not reasoning."," AI models can plan a task but often fail to act reliably on the live web - the \"brain-body disconnect.\"",[11,26,27,30],{},[21,28,29],{},"Latency decides the architecture."," Voice-driven guidance needs responses under ~800ms, so where the agent runs (in-session vs. remote) matters more than how smart it is.",[11,32,33,36],{},[21,34,35],{},"The five tools take different approaches."," Webfuse (in-session augmentation), Tidio (conversational support), Intercom Fin (complex query resolution), Voiceflow (conversation builder), and MultiOn (autonomous navigation).",[11,38,39,42],{},[21,40,41],{},"Pick by job."," Real-time voice guidance on regulated apps → Webfuse. SMB support → Tidio. High-volume complex support → Intercom Fin. Custom conversational flows → Voiceflow. Background autonomous tasks → MultiOn.",[11,44,45],{},"This capability is now possible at production scale. Contact Center as a Service (CCaaS) and Voice AI platforms are evolving from conversation layers into orchestration layers, responsible for the full customer outcome. 
To make that shift, they need an execution layer that can act inside live web sessions with low latency and strong governance.",[11,47,48],{},"The main difficulty is the \"brain-body disconnect\": AI models can create sophisticated plans but often fail to execute them reliably on the live web.",[11,50,51],{},"For voice-driven interactions, where the tolerance for delay is under 800 milliseconds, adding latency for browser round-trips is not an option.",[53,54],"nuxt-picture",{":height":55,":width":56,"alt":57,"loading":58,"src":59,"provider":60},"500","900","Diagram showing the sub-800-millisecond latency budget for voice-driven interactions and why browser round-trips break it","lazy","/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026/3.svg","none",[11,62,63],{},"A new category of tools has emerged to solve this execution problem, each with a different architecture for letting AI agents see and interact with websites.",[53,65],{":height":66,":width":67,"alt":68,"loading":58,"src":69,"provider":60},"560","960","Diagram comparing the architectures of five tools for letting AI agents see and act on a live website","/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026/1.svg",[11,71,72],{},"This article examines five of the best tools available in 2026 for guiding users through a website with AI.",[53,74],{":height":75,":width":67,"alt":76,"loading":58,"src":77,"provider":60},"710","Overview of the five best tools for guiding users through a website with AI in 2026","/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026/4.svg",[79,80,82],"h2",{"id":81},"webfuse","Webfuse",[11,84,85],{},"Webfuse is a proxy-based augmentation platform that allows developers to add custom AI agents and extensions to any website without altering the original application's code.",[53,87],{":height":88,":width":89,"alt":90,"loading":58,"src":91,"format":92},"1508","2880","Webfuse adding a custom AI agent to a live website through its 
proxy-based augmentation layer, with no changes to the original application's code","/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026/8.png","webp",[94,95,97],"h3",{"id":96},"how-it-works","How It Works",[11,99,100],{},"Webfuse operates as a configurable web proxy that sits between the user's browser and the target website. When a page is requested, the proxy injects a virtualization layer that sandboxes the application and provides programmatic control. This \"in-session augmentation\" means the agent's logic runs inside the user's own browser session, not on a remote server. This architecture minimizes latency for actions like clicks and typing, as no network round-trip is needed for execution once the session is live.",[11,102,103],{},"For agents, Webfuse provides an Automation API that exposes tools to \"see\" the page (via DOM snapshots or screenshots) and \"act\" on it (clicking, typing, scrolling). It has a deep awareness of modern web frameworks, allowing it to interact reliably with components in React, Angular, and Salesforce Lightning, even traversing closed Shadow DOM and iFrame boundaries.",[94,105,107],{"id":106},"key-features","Key Features",[109,110,111,118,124,130,136],"ul",{},[112,113,114,117],"li",{},[21,115,116],{},"In-Session Execution:"," Agent actions execute directly in the user's browser, offering very low latency suitable for real-time voice guidance.",[112,119,120,123],{},[21,121,122],{},"Framework Awareness:"," Hooks into the internal state of frameworks like React and Angular to ensure actions are only performed when components are fully ready.",[112,125,126,129],{},[21,127,128],{},"Enterprise Governance:"," Provides a full visual audit trail, session recording, policy enforcement to restrict agent capabilities, and PII masking to protect sensitive data.",[112,131,132,135],{},[21,133,134],{},"No-Code/No-Infrastructure Change:"," Can be deployed on any existing website without requiring source code access, DNS 
changes, or other infrastructure modifications from the website owner.",[112,137,138,141],{},[21,139,140],{},"Human-in-the-Loop Handoff:"," Supports seamless escalation from an AI agent to a human agent within the same session, with full co-browsing capabilities.",[94,143,145],{"id":144},"best-use-cases","Best Use Cases",[11,147,148],{},"Webfuse is built for enterprise and regulated industries like banking, insurance, and government, where security, compliance, and auditability are major requirements. Its low-latency architecture makes it highly suitable for powering voice agents that provide real-time, interactive guidance on complex web applications, such as filling out mortgage applications, insurance claims, or internal enterprise software.",[94,150,152],{"id":151},"limitations","Limitations",[11,154,155],{},"The proxy model means the origin server sees requests from Webfuse's IP address, not the user's, which may require IP whitelisting. As it operates by rewriting the domain, migrating existing user session tokens requires some initial configuration. While it doesn't require infrastructure changes, a trust relationship with the proxy is a necessary security and compliance consideration.",[157,158],"article-signup-cta",{":demoAction":159,"heading":160,"subtitle":161},"{\"text\":\"See How Webfuse Works\",\"href\":\"/use-case/ai-agents\"}","Let AI Agents Guide Users on Any Website - No Code Changes","Webfuse makes any live web session programmable, shareable, and agent-ready. 
Add AI agents, copilots, and real-time voice guidance to apps you do not own, with full audit trails and PII masking built in.",[79,163,165],{"id":164},"tidio-lyro-ai","Tidio (Lyro AI)",[11,167,168],{},"Tidio is a customer communication platform that provides an AI chatbot widget for small and medium businesses to automate support and sales conversations.",[53,170],{":height":88,":width":89,"alt":171,"loading":58,"src":172,"format":92},"Tidio AI chatbot widget on a website, automating support and sales conversations for a small business","/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026/5.png",[94,174,97],{"id":175},"how-it-works-1",[11,177,178],{},"Tidio's primary AI offering, Lyro, is a conversational agent that trains on a company's website content, FAQs, and knowledge bases. Once trained, it can be deployed through a chat widget on the company's website. Lyro uses Natural Language Processing (NLP) to understand user questions and provide answers based on its training data. If it cannot answer a question, it can seamlessly hand the conversation over to a human agent. 
The focus is on conversational guidance and information retrieval rather than direct website interaction.",[94,180,107],{"id":181},"key-features-1",[109,183,184,190,196,202,208],{},[112,185,186,189],{},[21,187,188],{},"Fast Setup:"," Tidio is known for its ease of use, with a clean interface and no-code installations for major platforms like Shopify and WordPress.",[112,191,192,195],{},[21,193,194],{},"AI Training on Content:"," Lyro can be quickly trained by pointing it to a URL or uploading documents, allowing it to start answering questions within minutes.",[112,197,198,201],{},[21,199,200],{},"Visual Flow Builder:"," Includes a no-code visual editor with over 40 templates to design automated chatbot flows for lead qualification, FAQs, and proactive engagement.",[112,203,204,207],{},[21,205,206],{},"Hybrid AI and Live Chat:"," Combines the Lyro AI agent with a live chat inbox, allowing for smooth handoffs to human agents when needed.",[112,209,210,213],{},[21,211,212],{},"Omnichannel Support:"," Manages conversations from a website widget, email, Facebook Messenger, Instagram, and WhatsApp in a single dashboard.",[94,215,145],{"id":216},"best-use-cases-1",[11,218,219],{},"Tidio is well-suited for small to medium-sized e-commerce stores and SaaS companies that need to offer 24/7 customer support and automate lead generation. It excels at answering common questions, recommending products, tracking orders through integrations, and qualifying leads before passing them to a sales team. Its ease of use makes it accessible for non-technical teams.",[94,221,152],{"id":222},"limitations-1",[11,224,225],{},"Tidio's AI capabilities are primarily conversational; it does not autonomously navigate the website or fill out forms on behalf of the user. Its integration library is less extensive than some enterprise-focused competitors. 
The pricing model, which bills for AI conversations and chatbot flow visitors as separate add-ons, can become complex and costly at higher volumes.",[79,227,229],{"id":228},"intercom-fin","Intercom Fin",[11,231,232],{},"Intercom Fin is an advanced AI agent designed for customer service teams to resolve complex support queries and guide users through sales and e-commerce journeys.",[53,234],{":height":88,":width":89,"alt":235,"loading":58,"src":236,"format":92},"Intercom Fin AI agent resolving a complex customer support query and guiding a user through a sales journey","/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026/7.png",[94,238,97],{"id":239},"how-it-works-2",[11,241,242],{},"Fin operates as a highly integrated AI agent within the Intercom customer service platform (and can also be layered on top of other helpdesks like Zendesk and Salesforce). It trains on a company's help center articles, public website content, and past support conversations to deliver accurate, contextual answers. Unlike simpler chatbots, Fin is designed to handle multi-turn conversations, understand complex queries, and execute actions through deep integrations with other systems like Salesforce and HubSpot. 
In June 2026, it was reported that Salesforce signed an agreement to acquire Fin (formerly Intercom).",[94,244,107],{"id":245},"key-features-2",[109,247,248,254,260,266,272],{},[112,249,250,253],{},[21,251,252],{},"Complex Query Resolution:"," Built to handle complicated, multi-step customer issues end-to-end, such as updating account information or troubleshooting technical problems.",[112,255,256,259],{},[21,257,258],{},"Role-Based Optimization:"," Can be configured for different roles, such as \"Fin for Service\" to resolve support tickets or \"Fin for Sales\" to qualify and convert leads.",[112,261,262,265],{},[21,263,264],{},"Deep Integrations:"," Connects with external systems like CRMs and e-commerce platforms to perform actions like processing returns or updating customer records, not just provide information.",[112,267,268,271],{},[21,269,270],{},"Human Agent Collaboration:"," Works alongside human agents in a shared inbox, providing a \"Copilot\" to help agents respond faster and ensuring seamless handoffs with full context.",[112,273,274,277],{},[21,275,276],{},"Enterprise-Grade Security:"," Offers SOC 2 Type II and ISO 27001 certifications, along with regional data hosting options, making it suitable for larger, security-conscious organizations.",[94,279,145],{"id":280},"best-use-cases-2",[11,282,283],{},"Intercom Fin is designed for growing startups and established enterprises that need a powerful AI agent to handle a sizeable volume of complex customer support interactions. It is particularly effective for SaaS and financial services companies where support queries often require looking up account data or interacting with other business systems. 
Its \"Fin for Sales\" role also makes it a strong choice for B2B companies looking to automate their inbound lead qualification process.",[94,285,152],{"id":286},"limitations-2",[11,288,289],{},"The pricing is value-aligned, charging per resolved conversation, which can become expensive for businesses with very high support volume. While it can be used with other helpdesks, the deepest integration and best experience are achieved when using the full Intercom suite. Some tests indicate that the real-world resolution rate is highly dependent on the quality and comprehensiveness of the training documentation.",[79,291,293],{"id":292},"voiceflow","Voiceflow",[11,295,296],{},"Voiceflow is an enterprise-grade conversational AI platform for building, launching, and scaling chat and voice AI agents for a variety of use cases, including on-site guidance.",[53,298],{":height":88,":width":89,"alt":299,"loading":58,"src":300,"format":92},"Voiceflow visual builder for designing and scaling chat and voice AI agents across multiple channels","/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026/6.png",[94,302,97],{"id":303},"how-it-works-3",[11,305,306],{},"Voiceflow provides a visual, drag-and-drop canvas that allows teams to design complex conversational workflows without extensive coding. It is designed for collaboration between designers, developers, and CX leaders. The platform is highly flexible, supporting both structured, flow-based conversations and more open-ended, AI-driven responses. Agents built on Voiceflow can be deployed across multiple channels, including as a web chat widget, over the phone, or on mobile apps. 
It integrates with external APIs and databases, allowing agents to retrieve information and trigger actions in other systems.",[94,308,107],{"id":309},"key-features-3",[109,311,312,318,324,330,336],{},[112,313,314,317],{},[21,315,316],{},"Visual Conversation Builder:"," A user-friendly, no-code/low-code interface allows non-technical users to design and prototype sophisticated conversation flows.",[112,319,320,323],{},[21,321,322],{},"Omnichannel Deployment:"," Build an agent once and deploy it across web chat, voice channels (IVR), mobile apps, and more, ensuring a consistent user experience.",[112,325,326,329],{},[21,327,328],{},"Flexible Integrations:"," Strong API capabilities and JavaScript blocks allow developers to connect agents to internal systems, CRMs, and any external service to perform actions.",[112,331,332,335],{},[21,333,334],{},"Designed for Team Collaboration:"," The platform is built to support multidisciplinary teams, making it easier for designers, developers, and product managers to work together on agent development.",[112,337,338,341],{},[21,339,340],{},"Enterprise-Ready:"," Offers strong governance, security, and scalability features required by large organizations.",[94,343,145],{"id":344},"best-use-cases-3",[11,346,347],{},"Voiceflow is ideal for teams that want to build custom, highly sophisticated conversational agents for specific tasks. It is used by customer support teams to automate troubleshooting and account management, by marketing teams to create guided selling quizzes, and by product teams to build interactive onboarding tours. Its flexibility makes it a powerful choice for enterprises that need to create tailored AI experiences rather than using a pre-built solution.",[94,349,152],{"id":350},"limitations-3",[11,352,353],{},"While it's a powerful builder, Voiceflow is not an out-of-the-box agent. It requires a team to design, build, and maintain the conversational flows and integrations. 
Its focus is on conversational experience and workflow orchestration; it does not provide autonomous web navigation capabilities out of the box. Pricing scales with usage and team size, which can be a consideration for larger deployments.",[79,355,357],{"id":356},"multion-with-browserbase","MultiOn (with Browserbase)",[11,359,360],{},"MultiOn is an autonomous AI agent designed to perform web-based tasks by navigating and interacting with websites much like a human would.",[53,362],{":height":88,":width":89,"alt":363,"loading":58,"src":364,"format":92},"MultiOn autonomous AI agent navigating and interacting with a website to complete a web-based task","/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026/9.png",[94,366,97],{"id":367},"how-it-works-4",[11,369,370],{},"MultiOn operates as an \"intent layer\" for the web. Instead of giving it step-by-step instructions, a user or developer gives it a high-level goal (e.g., \"Find the cheapest flight to Tokyo\"). The AI agent then figures out the necessary steps: navigating to a travel site, entering the search criteria, filtering the results, and extracting the information. It uses vision-capable models that can understand a website's layout visually, making it resilient to minor UI changes that would break traditional automation scripts. MultiOn offers an API for developers and can be run in two modes: in cloud-hosted virtual browser sessions or locally via a Chrome extension. 
For scalable cloud automation, it can be paired with infrastructure like Browserbase.",[94,372,107],{"id":373},"key-features-4",[109,375,376,382,388,394,400],{},[112,377,378,381],{},[21,379,380],{},"Autonomous Web Navigation:"," Capable of independently navigating complex websites, filling multi-page forms, and completing tasks without pre-scripted workflows.",[112,383,384,387],{},[21,385,386],{},"Vision-Based Reasoning:"," Understands web pages visually, allowing it to adapt to changes in website layout and design.",[112,389,390,393],{},[21,391,392],{},"Goal-Oriented Action:"," Operates on high-level objectives (\"intent\") rather than specific instructions, reasoning through the steps required to complete a task.",[112,395,396,399],{},[21,397,398],{},"API for Developers:"," Provides a powerful API for developers to programmatically dispatch agents to perform tasks like data collection, account management, and e-commerce transactions.",[112,401,402,405],{},[21,403,404],{},"Self-Healing Workflows:"," Its ability to adapt to UI changes makes its automation more durable than traditional RPA or scripting tools.",[94,407,145],{"id":408},"best-use-cases-4",[11,410,411],{},"MultiOn is built for developers and operations professionals who need to automate complex, multi-step web workflows that are too intricate for simple tools. It excels at tasks like automated competitor monitoring, vendor onboarding, proactive lead generation, and due diligence research. It's the right choice when the goal is to fully automate a web-based process that requires reasoning and adaptation, often running in the background without direct user supervision.",[94,413,152],{"id":414},"limitations-4",[11,416,417],{},"The primary trade-off is latency. Because the agent has to reason about every step, it is noticeably slower than a deterministic script and not suitable for real-time, voice-driven guidance where immediate responses are required. 
As a developer-focused tool, it has a steeper learning curve than no-code chatbot builders. Its pricing model, a flat monthly subscription, is geared towards power users and may be expensive for occasional use.",{"title":419,"searchDepth":420,"depth":420,"links":421},"",2,[422,429,435,441,447],{"id":81,"depth":420,"text":82,"children":423},[424,426,427,428],{"id":96,"depth":425,"text":97},3,{"id":106,"depth":425,"text":107},{"id":144,"depth":425,"text":145},{"id":151,"depth":425,"text":152},{"id":164,"depth":420,"text":165,"children":430},[431,432,433,434],{"id":175,"depth":425,"text":97},{"id":181,"depth":425,"text":107},{"id":216,"depth":425,"text":145},{"id":222,"depth":425,"text":152},{"id":228,"depth":420,"text":229,"children":436},[437,438,439,440],{"id":239,"depth":425,"text":97},{"id":245,"depth":425,"text":107},{"id":280,"depth":425,"text":145},{"id":286,"depth":425,"text":152},{"id":292,"depth":420,"text":293,"children":442},[443,444,445,446],{"id":303,"depth":425,"text":97},{"id":309,"depth":425,"text":107},{"id":344,"depth":425,"text":145},{"id":350,"depth":425,"text":152},{"id":356,"depth":420,"text":357,"children":448},[449,450,451,452],{"id":367,"depth":425,"text":97},{"id":373,"depth":425,"text":107},{"id":408,"depth":425,"text":145},{"id":414,"depth":425,"text":152},"ai-agents","2026-06-23","The five best tools for letting AI agents guide users through a live website in 2026. Compare Webfuse, Tidio, Intercom Fin, Voiceflow, and MultiOn on latency, governance, autonomy, and real-time voice guidance to find the right execution layer.","md",[458,461,464],{"question":459,"answer":460},"What does it mean for an AI agent to guide a user through a website?","It means the agent can see the live page and act on it in real time - clicking, typing, scrolling, and filling forms inside the user's own session - while explaining each step. 
This goes beyond a chatbot that only answers questions; the agent drives the interface alongside the user to complete a task end-to-end.",{"question":462,"answer":463},"Which tool is best for real-time voice-guided web interactions?","Latency is the deciding factor. Voice interactions need responses under roughly 800 milliseconds, so tools that execute actions inside the user's browser session, like Webfuse, are better suited than autonomous agents that reason about every step on a remote server and add network round-trips.",{"question":465,"answer":466},"What is the brain-body disconnect in web agents?","It describes the gap between an AI model's ability to plan a task and its ability to reliably execute that plan on the live web. The reasoning (the brain) is strong, but acting on real, dynamic pages (the body) is where most agents fail. The execution layer is what closes this gap.",0,null,{"shortTitle":470,"relatedLinks":471},"5 Tools for AI Website Guidance",[472,476],{"text":473,"href":474,"description":475},"Top 5 Voice AI Agents for Website Integration in 2026","/blog/top-5-voice-ai-agents-for-website-integration-in-2026","A comparison of the leading voice AI platforms for building conversational experiences that act on your website.",{"text":477,"href":478,"description":479},"How to Build an AI Support Agent with Alibaba Page Agent","/blog/how-to-build-an-ai-support-agent-with-alibaba-page-agent","A practical guide to embedding an in-page AI agent into a live web session, and what its limits reveal about 
deployment.",true,"/blog/5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026",{"title":5,"description":455},{"loc":481},"blog/1046.5-best-tools-to-let-ai-agents-guide-users-through-a-website-2026",[453,486,487,488,489],"web-agents","voice-agents","customer-experience","web-automation","n5Q5m_kEr1IYLdIqTXn-HvUK4b2WoLLEo4SD23Q0ssU",[492,2151],{"id":493,"title":494,"authorId":495,"body":496,"category":453,"created":2128,"description":2129,"extension":456,"faqs":468,"featurePriority":468,"head":468,"landingPath":468,"meta":2130,"navigation":480,"ogImage":468,"path":2142,"robots":468,"schemaOrg":468,"seo":2143,"sitemap":2144,"stem":2145,"tags":2146,"__hash__":2150},"blog/blog/1012.dom-downsampling-for-llm-based-web-agents.md","DOM Downsampling for LLM-Based Web Agents","thassilo-schiepanski",{"type":8,"value":497,"toc":2113},[498,502,527,531,538,542,558,562,568,572,590,617,620,624,627,638,644,675,678,698,710,715,731,745,748,752,772,776,784,796,800,803,1195,1201,1208,1372,1379,1470,1477,1549,1558,1564,1573,1577,1583,1593,1605,1839,1856,1878,1884,1927,1931,1943,1952,1957,1962,1965,1969,1975,1980,2018,2022,2028,2032,2042,2046,2049,2109],[53,499],{":width":56,"alt":500,"format":92,"loading":58,"src":501},"Downsampling visualised for digital images and HTML","/blog/dom-downsampling-for-web-agents/1.png",[11,503,504,511,512,511,517,522,523,526],{},[505,506,510],"a",{"href":507,"rel":508},"https://operator.chatgpt.com",[509],"nofollow","Operator (OpenAI)",", ",[505,513,516],{"href":514,"rel":515},"https://www.director.ai",[509],"Director (Browserbase)",[505,518,521],{"href":519,"rel":520},"https://browser-use.com",[509],"Browser Use"," – we are currently witnessing the rise of ",[21,524,525],{},"web AI agents",". The first iteration of serviceable web agents was enabled by frontier LLMs, which act as instantaneous domain model backends. 
The domain, hereby, corresponds to the landscape of web application UIs.",[79,528,530],{"id":529},"what-is-a-snapshot","What is a Snapshot?",[11,532,533,534,537],{},"Web agents provide an LLM with a task, and serialised runtime state of a currently browsed web application (e.g., a screenshot). The LLM is expected to suggest relevant actions to perform in the web application. Serialisation of such runtime state is referred to as a ",[21,535,536],{},"snapshot",". And the snapshot technique primarily decides the quality of LLM interaction suggestions.",[94,539,541],{"id":540},"gui-snapshots","GUI Snapshots",[11,543,544,545,548,549,553,554,557],{},"Screenshots – for consistency reasons referred to as ",[21,546,547],{},"GUI snapshots"," – resemble how humans visually perceive web application UIs. LLM APIs subsidise the use of image input through upstream compression. Compression, however, irreversibly affects image dimensions, which takes away pixel precision; no way to suggest interactions like ",[550,551,552],"em",{},"“click at 100, 735”",". As a workaround, early web agents used ",[550,555,556],{},"grounded"," GUI snapshots. Grounding describes adding visual cues to the GUI, such as bounding boxes with numerical identifiers. Grounding lets the LLM refer to specific parts of the page by identifier, so the agent can trace back interaction targets.",[53,559],{":width":56,"alt":560,"format":92,"loading":58,"src":561},"Grounded GUI snapshot as implemented by Browser Use","/blog/dom-downsampling-for-web-agents/2.png",[11,563,564],{},[565,566,567],"small",{},"Grounded GUI snapshot as implemented by Browser Use.",[94,569,571],{"id":570},"dom-snapshots","DOM Snapshots",[11,573,574,575,585,586,589],{},"LLMs arguably are much better at understanding code than images. 
Research supports that they excel at describing and classifying HTML, and also navigating an inherent UI",[576,577,578],"sup",{},[505,579,584],{"href":580,"ariaDescribedBy":581,"dataFootnoteRef":419,"id":583},"#user-content-fn-1",[582],"footnote-label","user-content-fnref-1","1",". The DOM (document object model) – a web browser's runtime state model of a web application – translates back to HTML. For this reason, ",[21,587,588],{},"DOM snapshots"," offer a compelling alternative to GUI snapshots. DOM snapshots offer a handful of key advantages:",[591,592,593,596,599,602,605],"ol",{},[112,594,595],{},"DOM snapshots connect with LLM code (HTML) interpretation abilities.",[112,597,598],{},"DOM snapshots can be compiled from deep clones, hidden from supervision (unlike GUI grounding).",[112,600,601],{},"DOM snapshots render text input that on average consumes less bandwidth than screenshots.",[112,603,604],{},"DOM snapshots allow for exact programmatic targeting of elements (e.g., via CSS selectors).",[112,606,607,608,612,613,616],{},"DOM snapshots are available with the ",[609,610,611],"code",{},"DOMContentLoaded"," event (whereas the GUI completes initial rendering with ",[609,614,615],{},"load",").",[11,618,619],{},"Yet, DOM snapshots have a major problem: potentially exhaustive model context. Whereas GUI snapshots commonly cost four figures of tokens, a raw DOM snapshot can cost hundreds of thousands of tokens. To connect with LLM code interpretation abilities, however, developers have used element extraction techniques – picking only (likely) important elements from the DOM. Element extraction flattens the DOM tree, which disregards hierarchy as a potential UI feature (how do elements relate to each other?).",[79,621,623],{"id":622},"dom-downsampling-a-novel-approach","DOM Downsampling: A Novel Approach",[11,625,626],{},"Enabling DOM snapshots for use with web agents requires client-side pre-processing – similar to how LLM vision APIs process image input. 
Downsampling is a fundamental signal processing technique that reduces data that scales out of time or space constraints under the assumption that the majority of relevant features is retained. Picture JPEG compression as an example: put simply, a JPEG image stores only an average colour for patches of pixels. The bigger the patches, the smaller the file. Although some detail is lost, key image features – colours, edges, objects – remain recognisable up to a large patch size.",[11,628,629,630,633,634,637],{},"We transfer the concept of ",[21,631,632],{},"downsampling"," to ",[21,635,636],{},"DOMs",", particularly since such an approach retains HTML characteristics that might be valuable for an LLM backend. We define UI features as concepts that, to a substantial degree, facilitate LLM suggestions on how to act in the UI in order to solve related web-based tasks.",[79,639,641],{"id":640},"d2snap",[550,642,643],{},"D2Snap",[11,645,646,647,655,663,671,672,674],{},"We recently proposed ",[505,648,651],{"href":649,"rel":650},"https://arxiv.org/abs/2508.04412",[509],[21,652,653],{},[550,654,643],{},[576,656,657],{},[505,658,662],{"href":659,"ariaDescribedBy":660,"dataFootnoteRef":419,"id":661},"#user-content-fn-2",[582],"user-content-fnref-2","2",[576,664,665],{},[505,666,670],{"href":667,"ariaDescribedBy":668,"dataFootnoteRef":419,"id":669},"#user-content-fn-3",[582],"user-content-fnref-3","3"," – a first-of-its-kind downsampling algorithm for DOMs. Herein, we'll briefly explain how the ",[550,673,643],{}," algorithm works, and how it can be utilised to build efficient and performant web agents.",[94,676,677],{"id":96},"How it works",[11,679,680,681,683,684,511,687,690,691,694,695,616],{},"There are basically three redundant types of DOM nodes, and HTML concepts: elements, text, and attributes. We defined and empirically adjusted three node-specific procedures. 
",[550,682,643],{}," downsamples at a variable ratio, configured through procedure-specific parameters  ",[609,685,686],{},"k",[609,688,689],{},"l",", and ",[609,692,693],{},"m"," (",[609,696,697],{},"∈ [0, 1]",[699,700,701],"blockquote",{},[11,702,703,704,709],{},"We used ",[505,705,708],{"href":706,"rel":707},"https://openai.com/index/hello-gpt-4o/",[509],"GPT-4o"," to create a downsampling ground truth dataset by having it classify HTML elements and scoring semantics regarding relevance for understanding the inherent UI – a UI feature degree.",[711,712,714],"h4",{"id":713},"procedure-elements","Procedure: Elements",[11,716,717,719,720,723,724,727,728,730],{},[550,718,643],{}," downsamples (simplifies) elements by merging container elements like ",[609,721,722],{},"section"," and ",[609,725,726],{},"div"," together. A parameter ",[609,729,686],{}," controls the merge ratio depending on the total DOM tree height. For competing concepts, such as element name, the ground truth determines which element's characterisitics to keep – comparing UI feature scores.",[11,732,733,734,511,736,738,739,744],{},"Elements in content elements (",[609,735,11],{},[609,737,699],{},", ...) are translated to a more comprehensive ",[505,740,743],{"href":741,"rel":742},"https://www.markdownguide.org/basic-syntax/",[509],"Markdown"," representation.",[11,746,747],{},"Interactive elements, definite interaction target candidates, are kept as is.",[711,749,751],{"id":750},"procedure-text","Procedure: Text",[11,753,754,756,757,760,768,769,771],{},[550,755,643],{}," downsamples text by dropping a fraction. Natural units of text are space-separated words, or punctuation-separated sentences. We reuse the ",[550,758,759],{},"TextRank",[576,761,762],{},[505,763,767],{"href":764,"ariaDescribedBy":765,"dataFootnoteRef":419,"id":766},"#user-content-fn-4",[582],"user-content-fnref-4","4"," algorithm to rank sentences in text nodes. 
The lowest-ranking fraction of sentences, denoted by parameter ",[609,770,689],{},", is dropped.",[711,773,775],{"id":774},"procedure-attributes","Procedure: Attributes",[11,777,778,780,781,783],{},[550,779,643],{}," downsamples attributes by dropping those with a name that, according to ground truth, holds a UI feature degree below a threshold. Parameter ",[609,782,693],{}," denotes this threshold.",[699,785,786],{},[11,787,788,789,795],{},"Check out the ",[505,790,792,794],{"href":649,"rel":791},[509],[550,793,643],{}," paper"," to learn about the algorithm in-depth.",[94,797,799],{"id":798},"example-of-a-downsampled-dom","Example of a Downsampled DOM",[11,801,802],{},"Consider a partial DOM state, serialised as HTML:",[804,805,809],"pre",{"className":806,"code":807,"language":808,"meta":419,"style":419},"language-html shiki shiki-themes catppuccin-latte night-owl","\u003Csection class=\"container\" tabindex=\"3\" required=\"true\" type=\"example\">\n  \u003Cdiv class=\"mx-auto\" data-topic=\"products\" required=\"false\">\n    \u003Ch1>Our Pizza\u003C/h1>\n    \u003Cdiv>\n      \u003Cdiv class=\"shadow-lg\">\n        \u003Ch2>Margherita\u003C/h2>\n        \u003Cp>\n          A simple classic: mozzarela, tomatoes and basil.\n          An everyday choice!\n        \u003C/p>\n        \u003Cbutton type=\"button\">Add\u003C/button>\n      \u003C/div>\n      \u003Cdiv class=\"shadow-lg\">\n        \u003Ch2>Capricciosa\u003C/h2>\n        \u003Cp>\n          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n          A true favourite!\n          \u003C/p>\n        \u003Cbutton type=\"button\">Add\u003C/button>\n      \u003C/div>\n    \u003C/div>\n  
\u003C/div>\n\u003C/section>\n","html",[609,810,811,878,921,943,952,973,992,1001,1007,1013,1023,1052,1062,1081,1099,1108,1114,1120,1130,1157,1166,1176,1186],{"__ignoreMap":419},[812,813,816,820,823,827,830,834,838,840,843,845,847,849,851,854,856,858,861,863,866,868,870,873,875],"span",{"class":814,"line":815},"line",1,[812,817,819],{"class":818},"s9rnR","\u003C",[812,821,722],{"class":822},"sY2RG",[812,824,826],{"class":825},"swkLt"," class",[812,828,829],{"class":818},"=",[812,831,833],{"class":832},"sbuKk","\"",[812,835,837],{"class":836},"sfrMT","container",[812,839,833],{"class":832},[812,841,842],{"class":825}," tabindex",[812,844,829],{"class":818},[812,846,833],{"class":832},[812,848,670],{"class":836},[812,850,833],{"class":832},[812,852,853],{"class":825}," required",[812,855,829],{"class":818},[812,857,833],{"class":832},[812,859,860],{"class":836},"true",[812,862,833],{"class":832},[812,864,865],{"class":825}," type",[812,867,829],{"class":818},[812,869,833],{"class":832},[812,871,872],{"class":836},"example",[812,874,833],{"class":832},[812,876,877],{"class":818},">\n",[812,879,880,883,885,887,889,891,894,896,899,901,903,906,908,910,912,914,917,919],{"class":814,"line":420},[812,881,882],{"class":818},"  \u003C",[812,884,726],{"class":822},[812,886,826],{"class":825},[812,888,829],{"class":818},[812,890,833],{"class":832},[812,892,893],{"class":836},"mx-auto",[812,895,833],{"class":832},[812,897,898],{"class":825}," data-topic",[812,900,829],{"class":818},[812,902,833],{"class":832},[812,904,905],{"class":836},"products",[812,907,833],{"class":832},[812,909,853],{"class":825},[812,911,829],{"class":818},[812,913,833],{"class":832},[812,915,916],{"class":836},"false",[812,918,833],{"class":832},[812,920,877],{"class":818},[812,922,923,926,929,932,936,939,941],{"class":814,"line":425},[812,924,925],{"class":818},"    \u003C",[812,927,928],{"class":822},"h1",[812,930,931],{"class":818},">",[812,933,935],{"class":934},"s2kId","Our 
Pizza",[812,937,938],{"class":818},"\u003C/",[812,940,928],{"class":822},[812,942,877],{"class":818},[812,944,946,948,950],{"class":814,"line":945},4,[812,947,925],{"class":818},[812,949,726],{"class":822},[812,951,877],{"class":818},[812,953,955,958,960,962,964,966,969,971],{"class":814,"line":954},5,[812,956,957],{"class":818},"      \u003C",[812,959,726],{"class":822},[812,961,826],{"class":825},[812,963,829],{"class":818},[812,965,833],{"class":832},[812,967,968],{"class":836},"shadow-lg",[812,970,833],{"class":832},[812,972,877],{"class":818},[812,974,976,979,981,983,986,988,990],{"class":814,"line":975},6,[812,977,978],{"class":818},"        \u003C",[812,980,79],{"class":822},[812,982,931],{"class":818},[812,984,985],{"class":934},"Margherita",[812,987,938],{"class":818},[812,989,79],{"class":822},[812,991,877],{"class":818},[812,993,995,997,999],{"class":814,"line":994},7,[812,996,978],{"class":818},[812,998,11],{"class":822},[812,1000,877],{"class":818},[812,1002,1004],{"class":814,"line":1003},8,[812,1005,1006],{"class":934},"          A simple classic: mozzarela, tomatoes and basil.\n",[812,1008,1010],{"class":814,"line":1009},9,[812,1011,1012],{"class":934},"          An everyday choice!\n",[812,1014,1016,1019,1021],{"class":814,"line":1015},10,[812,1017,1018],{"class":818},"        \u003C/",[812,1020,11],{"class":822},[812,1022,877],{"class":818},[812,1024,1026,1028,1031,1033,1035,1037,1039,1041,1043,1046,1048,1050],{"class":814,"line":1025},11,[812,1027,978],{"class":818},[812,1029,1030],{"class":822},"button",[812,1032,865],{"class":825},[812,1034,829],{"class":818},[812,1036,833],{"class":832},[812,1038,1030],{"class":836},[812,1040,833],{"class":832},[812,1042,931],{"class":818},[812,1044,1045],{"class":934},"Add",[812,1047,938],{"class":818},[812,1049,1030],{"class":822},[812,1051,877],{"class":818},[812,1053,1055,1058,1060],{"class":814,"line":1054},12,[812,1056,1057],{"class":818},"      
\u003C/",[812,1059,726],{"class":822},[812,1061,877],{"class":818},[812,1063,1065,1067,1069,1071,1073,1075,1077,1079],{"class":814,"line":1064},13,[812,1066,957],{"class":818},[812,1068,726],{"class":822},[812,1070,826],{"class":825},[812,1072,829],{"class":818},[812,1074,833],{"class":832},[812,1076,968],{"class":836},[812,1078,833],{"class":832},[812,1080,877],{"class":818},[812,1082,1084,1086,1088,1090,1093,1095,1097],{"class":814,"line":1083},14,[812,1085,978],{"class":818},[812,1087,79],{"class":822},[812,1089,931],{"class":818},[812,1091,1092],{"class":934},"Capricciosa",[812,1094,938],{"class":818},[812,1096,79],{"class":822},[812,1098,877],{"class":818},[812,1100,1102,1104,1106],{"class":814,"line":1101},15,[812,1103,978],{"class":818},[812,1105,11],{"class":822},[812,1107,877],{"class":818},[812,1109,1111],{"class":814,"line":1110},16,[812,1112,1113],{"class":934},"          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[812,1115,1117],{"class":814,"line":1116},17,[812,1118,1119],{"class":934},"          A true favourite!\n",[812,1121,1123,1126,1128],{"class":814,"line":1122},18,[812,1124,1125],{"class":818},"          \u003C/",[812,1127,11],{"class":822},[812,1129,877],{"class":818},[812,1131,1133,1135,1137,1139,1141,1143,1145,1147,1149,1151,1153,1155],{"class":814,"line":1132},19,[812,1134,978],{"class":818},[812,1136,1030],{"class":822},[812,1138,865],{"class":825},[812,1140,829],{"class":818},[812,1142,833],{"class":832},[812,1144,1030],{"class":836},[812,1146,833],{"class":832},[812,1148,931],{"class":818},[812,1150,1045],{"class":934},[812,1152,938],{"class":818},[812,1154,1030],{"class":822},[812,1156,877],{"class":818},[812,1158,1160,1162,1164],{"class":814,"line":1159},20,[812,1161,1057],{"class":818},[812,1163,726],{"class":822},[812,1165,877],{"class":818},[812,1167,1169,1172,1174],{"class":814,"line":1168},21,[812,1170,1171],{"class":818},"    
\u003C/",[812,1173,726],{"class":822},[812,1175,877],{"class":818},[812,1177,1179,1182,1184],{"class":814,"line":1178},22,[812,1180,1181],{"class":818},"  \u003C/",[812,1183,726],{"class":822},[812,1185,877],{"class":818},[812,1187,1189,1191,1193],{"class":814,"line":1188},23,[812,1190,938],{"class":818},[812,1192,722],{"class":822},[812,1194,877],{"class":818},[11,1196,1197,1198,1200],{},"Here are some ",[550,1199,643],{}," downsampling results, which are based on different parametric configurations. A percentage denotes the reduced size.",[711,1202,1204,1207],{"id":1203},"k3-l3-m3-55",[609,1205,1206],{},"k=.3, l=.3, m=.3"," (55%)",[804,1209,1211],{"className":806,"code":1210,"language":808,"meta":419,"style":419},"\u003Csection tabindex=\"3\" type=\"example\" class=\"container\" required=\"true\">\n  # Our Pizza\n  \u003Cdiv class=\"shadow-lg\">\n    ## Margherita\n    A simple classic: mozzarela, tomatoes, and basil.\n    \u003Cbutton type=\"button\">Add\u003C/button>\n    ## Capricciosa\n    A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n    \u003Cbutton type=\"button\">Add\u003C/button>\n  
\u003C/div>\n\u003C/section>\n",[609,1212,1213,1261,1266,1284,1289,1294,1320,1325,1330,1356,1364],{"__ignoreMap":419},[812,1214,1215,1217,1219,1221,1223,1225,1227,1229,1231,1233,1235,1237,1239,1241,1243,1245,1247,1249,1251,1253,1255,1257,1259],{"class":814,"line":815},[812,1216,819],{"class":818},[812,1218,722],{"class":822},[812,1220,842],{"class":825},[812,1222,829],{"class":818},[812,1224,833],{"class":832},[812,1226,670],{"class":836},[812,1228,833],{"class":832},[812,1230,865],{"class":825},[812,1232,829],{"class":818},[812,1234,833],{"class":832},[812,1236,872],{"class":836},[812,1238,833],{"class":832},[812,1240,826],{"class":825},[812,1242,829],{"class":818},[812,1244,833],{"class":832},[812,1246,837],{"class":836},[812,1248,833],{"class":832},[812,1250,853],{"class":825},[812,1252,829],{"class":818},[812,1254,833],{"class":832},[812,1256,860],{"class":836},[812,1258,833],{"class":832},[812,1260,877],{"class":818},[812,1262,1263],{"class":814,"line":420},[812,1264,1265],{"class":934},"  # Our Pizza\n",[812,1267,1268,1270,1272,1274,1276,1278,1280,1282],{"class":814,"line":425},[812,1269,882],{"class":818},[812,1271,726],{"class":822},[812,1273,826],{"class":825},[812,1275,829],{"class":818},[812,1277,833],{"class":832},[812,1279,968],{"class":836},[812,1281,833],{"class":832},[812,1283,877],{"class":818},[812,1285,1286],{"class":814,"line":945},[812,1287,1288],{"class":934},"    ## Margherita\n",[812,1290,1291],{"class":814,"line":954},[812,1292,1293],{"class":934},"    A simple classic: mozzarela, tomatoes, and 
basil.\n",[812,1295,1296,1298,1300,1302,1304,1306,1308,1310,1312,1314,1316,1318],{"class":814,"line":975},[812,1297,925],{"class":818},[812,1299,1030],{"class":822},[812,1301,865],{"class":825},[812,1303,829],{"class":818},[812,1305,833],{"class":832},[812,1307,1030],{"class":836},[812,1309,833],{"class":832},[812,1311,931],{"class":818},[812,1313,1045],{"class":934},[812,1315,938],{"class":818},[812,1317,1030],{"class":822},[812,1319,877],{"class":818},[812,1321,1322],{"class":814,"line":994},[812,1323,1324],{"class":934},"    ## Capricciosa\n",[812,1326,1327],{"class":814,"line":1003},[812,1328,1329],{"class":934},"    A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[812,1331,1332,1334,1336,1338,1340,1342,1344,1346,1348,1350,1352,1354],{"class":814,"line":1009},[812,1333,925],{"class":818},[812,1335,1030],{"class":822},[812,1337,865],{"class":825},[812,1339,829],{"class":818},[812,1341,833],{"class":832},[812,1343,1030],{"class":836},[812,1345,833],{"class":832},[812,1347,931],{"class":818},[812,1349,1045],{"class":934},[812,1351,938],{"class":818},[812,1353,1030],{"class":822},[812,1355,877],{"class":818},[812,1357,1358,1360,1362],{"class":814,"line":1015},[812,1359,1181],{"class":818},[812,1361,726],{"class":822},[812,1363,877],{"class":818},[812,1365,1366,1368,1370],{"class":814,"line":1025},[812,1367,938],{"class":818},[812,1369,722],{"class":822},[812,1371,877],{"class":818},[711,1373,1375,1378],{"id":1374},"k4-l6-m8-27",[609,1376,1377],{},"k=.4, l=.6, m=.8"," (27%)",[804,1380,1382],{"className":806,"code":1381,"language":808,"meta":419,"style":419},"\u003Csection>\n  # Our Pizza\n  \u003Cdiv>\n    ## Margherita\n    A simple classic:\n    \u003Cbutton>Add\u003C/button>\n    ## Capricciosa\n    A rich taste:\n    \u003Cbutton>Add\u003C/button>\n  
\u003C/div>\n\u003C/section>\n",[609,1383,1384,1392,1396,1404,1408,1413,1429,1433,1438,1454,1462],{"__ignoreMap":419},[812,1385,1386,1388,1390],{"class":814,"line":815},[812,1387,819],{"class":818},[812,1389,722],{"class":822},[812,1391,877],{"class":818},[812,1393,1394],{"class":814,"line":420},[812,1395,1265],{"class":934},[812,1397,1398,1400,1402],{"class":814,"line":425},[812,1399,882],{"class":818},[812,1401,726],{"class":822},[812,1403,877],{"class":818},[812,1405,1406],{"class":814,"line":945},[812,1407,1288],{"class":934},[812,1409,1410],{"class":814,"line":954},[812,1411,1412],{"class":934},"    A simple classic:\n",[812,1414,1415,1417,1419,1421,1423,1425,1427],{"class":814,"line":975},[812,1416,925],{"class":818},[812,1418,1030],{"class":822},[812,1420,931],{"class":818},[812,1422,1045],{"class":934},[812,1424,938],{"class":818},[812,1426,1030],{"class":822},[812,1428,877],{"class":818},[812,1430,1431],{"class":814,"line":994},[812,1432,1324],{"class":934},[812,1434,1435],{"class":814,"line":1003},[812,1436,1437],{"class":934},"    A rich taste:\n",[812,1439,1440,1442,1444,1446,1448,1450,1452],{"class":814,"line":1009},[812,1441,925],{"class":818},[812,1443,1030],{"class":822},[812,1445,931],{"class":818},[812,1447,1045],{"class":934},[812,1449,938],{"class":818},[812,1451,1030],{"class":822},[812,1453,877],{"class":818},[812,1455,1456,1458,1460],{"class":814,"line":1015},[812,1457,1181],{"class":818},[812,1459,726],{"class":822},[812,1461,877],{"class":818},[812,1463,1464,1466,1468],{"class":814,"line":1025},[812,1465,938],{"class":818},[812,1467,722],{"class":822},[812,1469,877],{"class":818},[711,1471,1473,1476],{"id":1472},"k-l0-m-35",[609,1474,1475],{},"k→∞, l=0, ∀m"," (35%)",[804,1478,1480],{"className":806,"code":1479,"language":808,"meta":419,"style":419},"# Our Pizza\n## Margherita\nA simple classic: mozzarela, tomatoes, and basil.\nAn everyday choice!\n\u003Cbutton>Add\u003C/button>\n## Capricciosa\nA rich taste: mozzarella, ham, mushrooms, 
artichokes, and olives.\nA true favourite!\n\u003Cbutton>Add\u003C/button>\n",[609,1481,1482,1487,1492,1497,1502,1518,1523,1528,1533],{"__ignoreMap":419},[812,1483,1484],{"class":814,"line":815},[812,1485,1486],{"class":934},"# Our Pizza\n",[812,1488,1489],{"class":814,"line":420},[812,1490,1491],{"class":934},"## Margherita\n",[812,1493,1494],{"class":814,"line":425},[812,1495,1496],{"class":934},"A simple classic: mozzarela, tomatoes, and basil.\n",[812,1498,1499],{"class":814,"line":945},[812,1500,1501],{"class":934},"An everyday choice!\n",[812,1503,1504,1506,1508,1510,1512,1514,1516],{"class":814,"line":954},[812,1505,819],{"class":818},[812,1507,1030],{"class":822},[812,1509,931],{"class":818},[812,1511,1045],{"class":934},[812,1513,938],{"class":818},[812,1515,1030],{"class":822},[812,1517,877],{"class":818},[812,1519,1520],{"class":814,"line":975},[812,1521,1522],{"class":934},"## Capricciosa\n",[812,1524,1525],{"class":814,"line":994},[812,1526,1527],{"class":934},"A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[812,1529,1530],{"class":814,"line":1003},[812,1531,1532],{"class":934},"A true favourite!\n",[812,1534,1535,1537,1539,1541,1543,1545,1547],{"class":814,"line":1009},[812,1536,819],{"class":818},[812,1538,1030],{"class":822},[812,1540,931],{"class":818},[812,1542,1045],{"class":934},[812,1544,938],{"class":818},[812,1546,1030],{"class":822},[812,1548,877],{"class":818},[11,1550,1551,1552,1554,1555,1557],{},"Asymptotic ",[609,1553,686],{}," (kind of 'infinite' ",[609,1556,686],{},") completely flattens the DOM, that is, leads to a full content linearisation similar to reader views as present in most browsers. Notably, it preserves all interactive elements like buttons – which are essential for a web agent.",[94,1559,1561],{"id":1560},"adaptived2snap",[550,1562,1563],{},"AdaptiveD2Snap",[11,1565,1566,1567,1569,1570,1572],{},"Fixed parameters might not be ideal for arbitrary DOMs – sourced from a landscape of web applications. 
We created ",[550,1568,1563],{}," – a wrapper for ",[550,1571,643],{}," that infers suitable parameters from a given DOM in order to hit a certain token budget.",[94,1574,1576],{"id":1575},"implementation-integration","Implementation & Integration",[11,1578,1579,1580,1582],{},"Picture an LLM-based web agent that is premised on DOM snapshots. Implementing ",[550,1581,643],{}," is simple: Deep clone the DOM, and feed it to the algorithm. Now, take the snapshot; that is, serialise the resulting DOM. Done.",[699,1584,1585],{},[11,1586,1587,1588,1592],{},"Read our ",[505,1589,1591],{"href":1590},"/blog/a-gentle-introduction-to-ai-agents-for-the-web","gentle introduction to AI agents for the web"," to get started with high-level web agent concepts.",[11,1594,1595,1596,1598,1599,1604],{},"The open source ",[550,1597,643],{}," API, provided as a ",[505,1600,1603],{"href":1601,"rel":1602},"https://github.com/webfuse-com/D2Snap",[509],"package on GitHub",", provides the following signature:",[804,1606,1610],{"className":1607,"code":1608,"language":1609,"meta":419,"style":419},"language-ts shiki shiki-themes catppuccin-latte night-owl","type DOM = Document | Element | string;\ntype Options = {\n  assignUniqueIDs?: boolean; // false\n  debug?: boolean;           // true\n};\n\nD2Snap.d2Snap(\n  dom: DOM,\n  k: number, l: number, m: number,\n  options?: Options\n): Promise\u003Cstring>\n\nD2Snap.adaptiveD2Snap(\n  dom: DOM,\n  maxTokens: number = 4096,\n  maxIterations: number = 5,\n  options?: Options\n): Promise\u003Cstring>\n\n","ts",[609,1611,1612,1645,1657,1676,1690,1695,1700,1715,1727,1745,1755,1771,1775,1786,1794,1807,1819,1827],{"__ignoreMap":419},[812,1613,1614,1618,1622,1625,1629,1632,1635,1637,1641],{"class":814,"line":815},[812,1615,1617],{"class":1616},"s76yb","type",[812,1619,1621],{"class":1620},"sXbZB"," DOM ",[812,1623,829],{"class":1624},"s-_ek",[812,1626,1628],{"class":1627},"s-DR7"," Document",[812,1630,1631],{"class":818}," 
|",[812,1633,1634],{"class":1627}," Element",[812,1636,1631],{"class":818},[812,1638,1640],{"class":1639},"scrte"," string",[812,1642,1644],{"class":1643},"scGhl",";\n",[812,1646,1647,1649,1652,1654],{"class":814,"line":420},[812,1648,1617],{"class":1616},[812,1650,1651],{"class":1620}," Options ",[812,1653,829],{"class":1624},[812,1655,1656],{"class":1643}," {\n",[812,1658,1659,1663,1666,1669,1672],{"class":814,"line":425},[812,1660,1662],{"class":1661},"swl0y","  assignUniqueIDs",[812,1664,1665],{"class":818},"?:",[812,1667,1668],{"class":1639}," boolean",[812,1670,1671],{"class":1643},";",[812,1673,1675],{"class":1674},"sDmS1"," // false\n",[812,1677,1678,1681,1683,1685,1687],{"class":814,"line":945},[812,1679,1680],{"class":1661},"  debug",[812,1682,1665],{"class":818},[812,1684,1668],{"class":1639},[812,1686,1671],{"class":1643},[812,1688,1689],{"class":1674},"           // true\n",[812,1691,1692],{"class":814,"line":954},[812,1693,1694],{"class":1643},"};\n",[812,1696,1697],{"class":814,"line":975},[812,1698,1699],{"emptyLinePlaceholder":480},"\n",[812,1701,1702,1704,1708,1712],{"class":814,"line":994},[812,1703,643],{"class":934},[812,1705,1707],{"class":1706},"s5FwJ",".",[812,1709,1711],{"class":1710},"sNstc","d2Snap",[812,1713,1714],{"class":934},"(\n",[812,1716,1717,1720,1724],{"class":814,"line":1003},[812,1718,1719],{"class":934},"  dom: ",[812,1721,1723],{"class":1722},"sqxXB","DOM",[812,1725,1726],{"class":1643},",\n",[812,1728,1729,1732,1735,1738,1740,1743],{"class":814,"line":1009},[812,1730,1731],{"class":934},"  k: number",[812,1733,1734],{"class":1643},",",[812,1736,1737],{"class":934}," l: number",[812,1739,1734],{"class":1643},[812,1741,1742],{"class":934}," m: number",[812,1744,1726],{"class":1643},[812,1746,1747,1750,1752],{"class":814,"line":1015},[812,1748,1749],{"class":934},"  options",[812,1751,1665],{"class":1624},[812,1753,1754],{"class":934}," 
Options\n",[812,1756,1757,1760,1764,1766,1769],{"class":814,"line":1025},[812,1758,1759],{"class":934},"): ",[812,1761,1763],{"class":1762},"s8Irk","Promise",[812,1765,819],{"class":1624},[812,1767,1768],{"class":934},"string",[812,1770,877],{"class":1624},[812,1772,1773],{"class":814,"line":1054},[812,1774,1699],{"emptyLinePlaceholder":480},[812,1776,1777,1779,1781,1784],{"class":814,"line":1064},[812,1778,643],{"class":934},[812,1780,1707],{"class":1706},[812,1782,1783],{"class":1710},"adaptiveD2Snap",[812,1785,1714],{"class":934},[812,1787,1788,1790,1792],{"class":814,"line":1083},[812,1789,1719],{"class":934},[812,1791,1723],{"class":1722},[812,1793,1726],{"class":1643},[812,1795,1796,1799,1801,1805],{"class":814,"line":1101},[812,1797,1798],{"class":934},"  maxTokens: number ",[812,1800,829],{"class":1624},[812,1802,1804],{"class":1803},"sZ_Zo"," 4096",[812,1806,1726],{"class":1643},[812,1808,1809,1812,1814,1817],{"class":814,"line":1110},[812,1810,1811],{"class":934},"  maxIterations: number ",[812,1813,829],{"class":1624},[812,1815,1816],{"class":1803}," 5",[812,1818,1726],{"class":1643},[812,1820,1821,1823,1825],{"class":814,"line":1116},[812,1822,1749],{"class":934},[812,1824,1665],{"class":1624},[812,1826,1754],{"class":934},[812,1828,1829,1831,1833,1835,1837],{"class":814,"line":1122},[812,1830,1759],{"class":934},[812,1832,1763],{"class":1762},[812,1834,819],{"class":1624},[812,1836,1768],{"class":934},[812,1838,877],{"class":1624},[11,1840,1841,1842,1844,1845,1850,1851,1855],{},"Moreover, ",[550,1843,643],{}," is available on the ",[505,1846,1849],{"href":1847,"rel":1848},"https://dev.webfuse.com/automation-api",[509],"Webfuse Automation API",". 
",[505,1852,82],{"href":1853,"rel":1854},"https://www.webfuse.com",[509]," essentially is a proxy to seamlessly serve any existing web application with custom augmentations, such as a web agent widget.",[804,1857,1861],{"className":1858,"code":1859,"language":1860,"meta":419,"style":419},"language-js shiki shiki-themes catppuccin-latte night-owl","const domSnapshot = await browser.webfuseSession\n    .automation\n    .take_dom_snapshot({ modifier: 'downsample' })\n","js",[609,1862,1863,1868,1873],{"__ignoreMap":419},[812,1864,1865],{"class":814,"line":815},[812,1866,1867],{},"const domSnapshot = await browser.webfuseSession\n",[812,1869,1870],{"class":814,"line":420},[812,1871,1872],{},"    .automation\n",[812,1874,1875],{"class":814,"line":425},[812,1876,1877],{},"    .take_dom_snapshot({ modifier: 'downsample' })\n",[11,1879,1880,1881,1883],{},"Need precise control over the underlying ",[550,1882,643],{}," invocation? Configure it exactly how you want:",[804,1885,1887],{"className":1858,"code":1886,"language":1860,"meta":419,"style":419},"const domSnapshot = await browser.webfuseSession\n    .automation\n    .take_dom_snapshot({\n        modifier: {\n            name: 'D2Snap',\n            params: { hierarchyRatio: 0.6, textRatio: 0.2, attributeRatio: 0.8 }\n        }\n    })\n",[609,1888,1889,1893,1897,1902,1907,1912,1917,1922],{"__ignoreMap":419},[812,1890,1891],{"class":814,"line":815},[812,1892,1867],{},[812,1894,1895],{"class":814,"line":420},[812,1896,1872],{},[812,1898,1899],{"class":814,"line":425},[812,1900,1901],{},"    .take_dom_snapshot({\n",[812,1903,1904],{"class":814,"line":945},[812,1905,1906],{},"        modifier: {\n",[812,1908,1909],{"class":814,"line":954},[812,1910,1911],{},"            name: 'D2Snap',\n",[812,1913,1914],{"class":814,"line":975},[812,1915,1916],{},"            params: { hierarchyRatio: 0.6, textRatio: 0.2, attributeRatio: 0.8 }\n",[812,1918,1919],{"class":814,"line":994},[812,1920,1921],{},"        
}\n",[812,1923,1924],{"class":814,"line":1003},[812,1925,1926],{},"    })\n",[94,1928,1930],{"id":1929},"performance-evaluation","Performance Evaluation",[11,1932,1933,1934,1936,1937,1939,1940,1942],{},"Now for the moment of truth: How does ",[550,1935,643],{}," stack up against the industry standard? We evaluated ",[550,1938,643],{}," in comparison to a grounded GUI snapshot baseline close to those used by ",[550,1941,521],{}," – coloured bounding boxes around visible interactive elements.",[11,1944,1945,1946,1951],{},"To evaluate snapshots isolated from specific agent logic, we crafted a dataset that spans all UI states that occur while solving a related task. We sampled our dataset from the existing ",[505,1947,1950],{"href":1948,"rel":1949},"https://github.com/OSU-NLP-Group/Online-Mind2Web",[509],"Online-Mind2Web"," dataset.",[53,1953],{":width":1954,"alt":1955,"format":92,"loading":58,"src":1956},"800","Exemplary solution UI state trajectory of a defined web-based task","/blog/dom-downsampling-for-web-agents/3.png",[11,1958,1959],{},[565,1960,1961],{},"Exemplary solution UI state trajectory for the task: “View the pricing plan for 'Business'. Specifically, we have 100 users. We need a 1PB storage quota and a 50 TB transfer quota.”",[11,1963,1964],{},"These are our key findings...",[711,1966,1968],{"id":1967},"substantial-success-rates","Substantial Success Rates",[11,1970,1971,1972,1974],{},"The results exceeded our expectations. Not only did ",[550,1973,643],{}," meet the baseline's performance – our best configuration outperformed it by a significant margin. 
Full linearisation matches the baseline in both performance and order of magnitude of estimated model input token size.",[53,1976],{":width":1977,"alt":1978,"format":92,"loading":58,"src":1979},"550","Success rate per web agent snapshot subject evaluated across the dataset","/blog/dom-downsampling-for-web-agents/4.png",[565,1981,1982,1983,1990,1991,1993,1994,1997,1998,2001,2002,2005,2006,2009,2010,2013,2014,2017],{},"\n  Success rate per web agent snapshot subject evaluated across the dataset.\n  Labels: ",[609,1984,1985,1986],{},"GUI",[1987,1988,1989],"sub",{}," gr.",": Baseline, ",[609,1992,1723],{},": Raw DOM (cut-off at ~8K tokens), ",[609,1995,1996],{},"k( l m)",": Parameter values (e.g., ",[609,1999,2000],{},".9 .3 .6",", or ",[609,2003,2004],{},".4"," if equal). ",[609,2007,2008],{},"∞",": Linearisation, ",[609,2011,2012],{},"8192 / 32768",": via token-limited (resp.) ",[2015,2016,1563],"i",{},".\n",[711,2019,2021],{"id":2020},"containable-token-and-byte-size","Containable Token and Byte Size",[11,2023,2024,2025,2027],{},"Even light downsampling delivers dramatic size reductions. Most ",[550,2026,643],{}," configurations average just one order of magnitude in tokens above the baseline – a massive improvement over raw DOM snapshots. Better yet, most DOMs from the dataset could actually be downsampled to the baseline's order of magnitude. 
And while image data balloons in file size, our text-based approach stays lean and efficient.",[53,2029],{":width":1954,"alt":2030,"format":92,"loading":58,"src":2031},"Comparison of mean input size across and per subject","/blog/dom-downsampling-for-web-agents/5.png",[565,2033,2034,2035,2038,2039,2041],{},"\n  Left: Comparison of mean input size (tokens vs bytes) across and per subject.",[2036,2037],"br",{},"\n  Right: Estimated input token size across the dataset created by a single ",[2015,2040,643],{}," evaluation subject.\n",[711,2043,2045],{"id":2044},"hierarchy-actually-matters","Hierarchy Actually Matters",[11,2047,2048],{},"Which UI feature matters most for LLM web agent backend performance? We alternated parameter configurations to find out. Interestingly, hierarchy reveals itself as the strongest of the three assessed features. Element extraction throws away hierarchy, which suggests that downsampling is a superior technique.",[722,2050,2053,2058],{"className":2051,"dataFootnotes":419},[2052],"footnotes",[79,2054,2057],{"className":2055,"id":582},[2056],"sr-only","Footnotes",[591,2059,2060,2075,2086,2097],{},[112,2061,2063,2067,2068],{"id":2062},"user-content-fn-1",[505,2064,2065],{"href":2065,"rel":2066},"https://arxiv.org/abs/2210.03945",[509]," ",[505,2069,2074],{"href":2070,"ariaLabel":2071,"className":2072,"dataFootnoteBackref":419},"#user-content-fnref-1","Back to reference 1",[2073],"data-footnote-backref","↩",[112,2076,2078,2067,2081],{"id":2077},"user-content-fn-2",[505,2079,649],{"href":649,"rel":2080},[509],[505,2082,2074],{"href":2083,"ariaLabel":2084,"className":2085,"dataFootnoteBackref":419},"#user-content-fnref-2","Back to reference 2",[2073],[112,2087,2089,2067,2092],{"id":2088},"user-content-fn-3",[505,2090,1601],{"href":1601,"rel":2091},[509],[505,2093,2074],{"href":2094,"ariaLabel":2095,"className":2096,"dataFootnoteBackref":419},"#user-content-fnref-3","Back to reference 
3",[2073],[112,2098,2100,2067,2104],{"id":2099},"user-content-fn-4",[505,2101,2102],{"href":2102,"rel":2103},"https://aclanthology.org/W04-3252",[509],[505,2105,2074],{"href":2106,"ariaLabel":2107,"className":2108,"dataFootnoteBackref":419},"#user-content-fnref-4","Back to reference 4",[2073],[2110,2111,2112],"style",{},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .sY2RG, html code.shiki .sY2RG{--shiki-default:#1E66F5;--shiki-dark:#CAECE6}html pre.shiki code .swkLt, html code.shiki .swkLt{--shiki-default:#DF8E1D;--shiki-default-font-style:inherit;--shiki-dark:#C5E478;--shiki-dark-font-style:italic}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sfrMT, html code.shiki .sfrMT{--shiki-default:#40A02B;--shiki-dark:#ECC48D}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html 
code.shiki .sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s-_ek, html code.shiki .s-_ek{--shiki-default:#179299;--shiki-dark:#C792EA}html pre.shiki code .s-DR7, html code.shiki .s-DR7{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#FFCB8B;--shiki-dark-font-style:inherit}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .swl0y, html code.shiki .swl0y{--shiki-default:#4C4F69;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .sDmS1, html code.shiki .sDmS1{--shiki-default:#7C7F93;--shiki-default-font-style:italic;--shiki-dark:#637777;--shiki-dark-font-style:italic}html pre.shiki code .s5FwJ, html code.shiki .s5FwJ{--shiki-default:#179299;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .sNstc, html code.shiki .sNstc{--shiki-default:#1E66F5;--shiki-default-font-style:italic;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .sqxXB, html code.shiki .sqxXB{--shiki-default:#4C4F69;--shiki-dark:#82AAFF}html pre.shiki code .s8Irk, html code.shiki .s8Irk{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#C5E478;--shiki-dark-font-style:inherit}html pre.shiki code .sZ_Zo, html code.shiki 
.sZ_Zo{--shiki-default:#FE640B;--shiki-dark:#F78C6C}",{"title":419,"searchDepth":420,"depth":420,"links":2114},[2115,2119,2120,2127],{"id":529,"depth":420,"text":530,"children":2116},[2117,2118],{"id":540,"depth":425,"text":541},{"id":570,"depth":425,"text":571},{"id":622,"depth":420,"text":623},{"id":640,"depth":420,"text":643,"children":2121},[2122,2123,2124,2125,2126],{"id":96,"depth":425,"text":677},{"id":798,"depth":425,"text":799},{"id":1560,"depth":425,"text":1563},{"id":1575,"depth":425,"text":1576},{"id":1929,"depth":425,"text":1930},{"id":582,"depth":420,"text":2057},"2025-08-18","We propose D2Snap – a first-of-its-kind downsampling algorithm for DOMs. D2Snap can be used as a pre-processing technique for DOM snapshots to optimise web agency context quality and token costs.",{"homepage":480,"relatedLinks":2131},[2132,2136,2139],{"text":2133,"href":2134,"description":2135},"What is a Website Snapshot?","/blog/snapshots-provide-llms-with-website-state","Learn what a website snapshot is and how to utilise it for web agents",{"text":2137,"href":1590,"description":2138},"What is a Web Agent?","Learn the basics of web agents",{"text":1849,"href":2140,"external":480,"description":2141},"https://dev.webfuse.com/automation-api#take_dom_snapshot","Check out the Webfuse Automation API","/blog/dom-downsampling-for-llm-based-web-agents",{"title":494,"description":2129},{"loc":2142},"blog/1012.dom-downsampling-for-llm-based-web-agents",[453,2147,2148,2149,486,489],"browser-agents","llms","llm-context","bGJtg_9k7O95O2CJswaRFj4ONGhX4hGr_8aL5dhDZms",{"id":2152,"title":2153,"authorId":495,"body":2154,"category":453,"created":2883,"description":2884,"extension":456,"faqs":468,"featurePriority":468,"head":468,"landingPath":468,"meta":2885,"navigation":480,"ogImage":468,"path":1590,"robots":468,"schemaOrg":468,"seo":2894,"sitemap":2895,"stem":2896,"tags":2897,"__hash__":2898},"blog/blog/1011.a-gentle-introduction-to-ai-agents-for-the-web.md","A Gentle Introduction to AI Agents 
for the Web",{"type":8,"value":2155,"toc":2864},[2156,2170,2173,2180,2186,2190,2193,2208,2212,2222,2226,2230,2243,2247,2251,2254,2259,2263,2272,2276,2287,2292,2296,2314,2318,2324,2428,2431,2664,2680,2684,2687,2692,2696,2699,2703,2721,2746,2753,2757,2795,2798,2809,2813,2816,2844,2848,2856,2861],[11,2157,2158,2159,511,2163,690,2166,2169],{},"In no time, AI became a natural part of modern web interfaces. AI agents for the web enjoy a recent hype, sparked by the means of ",[505,2160,510],{"href":2161,"rel":2162},"https://openai.com/index/introducing-operator/",[509],[505,2164,516],{"href":514,"rel":2165},[509],[505,2167,521],{"href":519,"rel":2168},[509],". By now, it is within reach to automate arbitrary web-based tasks, such as booking the cheapest flight from Berlin to Amsterdam.",[79,2171,2137],{"id":2172},"what-is-a-web-agent",[11,2174,2175,2176,2179],{},"For starters, let us break down the term ",[21,2177,2178],{},"web AI agent",": An agent is an entity that autonomously acts on behalf of another entity. An artificially intelligent agent is an application that acts on behalf of a human. In contrast to non-AI computer agents, it solves complex tasks with at least human-grade effectiveness and efficiency. For a human-centric web, web agents have deliberately been designed to browse the web in a human fashion – through UIs rather than APIs.",[53,2181],{":width":2182,"alt":2183,"format":2184,"loading":58,"src":2185},"610","High-level agent description comparing human and computer agents","svg","/blog/a-gentle-introduction-to-ai-agents-for-the-web/1.svg",[94,2187,2189],{"id":2188},"the-role-of-frontier-llms","The Role of Frontier LLMs",[11,2191,2192],{},"Web agents have been a vague desire for a long time. AI agents used to rely on complete models of a problem domain in order to allow (heuristic) search through problem states. 
Such models would comprise the problem world (e.g., a chessboard), actors (pawns, rooks, etc.), possible actions per actor (rook moves straight), and constraints (i.a., max one piece per field). A heterogeneous space of web application UIs describes the problem domain of a web agent: how to understand a web page, and how to interact with it to solve the declared task?",[11,2194,2195,2196,2203,2204,2207],{},"Frontier LLMs disrupted the AI agent world: explicit problem domain models beyond feasibility can now be replaced by an LLM. The LLM thereby acts as an instantaneous domain model backend that can be consulted with twofold context: serialised problem state, such as a chess position code (",[550,2197,2198,2199,2202],{},"“",[812,2200,2201],{},"..."," e4 e5 2. Nc3 f5”","), and the respective task (",[550,2205,2206],{},"“What is the best move for white?”","). For web agents, problem state corresponds to the currently browsed web application's runtime state, for instance, a screenshot.",[94,2209,2211],{"id":2210},"generalist-web-agents","Generalist Web Agents",[11,2213,2214,2215,690,2218,2221],{},"Generalist web agents are supposed to solve arbitrary tasks through a web browser. Web-based tasks can be as diverse as ",[550,2216,2217],{},"“Find a picture of a cat.”",[550,2219,2220],{},"“Book the cheapest flight from Berlin to Amsterdam tomorrow afternoon (business class, window seat).”"," In reality, generalist agents still fail uncommon or too precise tasks. While they have been critically acclaimed, they mainly act as early proofs-of-concept. 
Tasks that are indeed solvable with a generalist agent promise great results with a corresponding specialist agent.",[53,2223],{":width":56,"alt":2224,"format":92,"loading":58,"src":2225},"Screenshot of a generalist web agent UI (Director)","/blog/a-gentle-introduction-to-ai-agents-for-the-web/2.png",[94,2227,2229],{"id":2228},"specialist-web-agents","Specialist Web Agents",[11,2231,2232,2233,2236,2237,2242],{},"Unlike generalist agents, specialist web agents are constrained to a certain task and application domain. Specialist agents account for the major share of commercial value, most prominently modal chat agents that provide users with on-page help. Picture a little floating widget that can be chatted with via text or voice input. In most cases, in fact, the term ",[550,2234,2235],{},"web (AI) agent"," refers to chat agents. Chat agents – text or voice – can be implemented on top of virtually any existing website. Frontier LLMs provide a lot of common sense out of the box. A ",[505,2238,2241],{"href":2239,"rel":2240},"https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/system-prompts",[509],"system prompt"," can, moreover, be leveraged to drive specialist agent quality for the respective problem domain.",[53,2244],{":width":56,"alt":2245,"format":92,"loading":58,"src":2246},"Screenshots of two modal specialist web agent UIs augmenting an underlying website's UI","/blog/a-gentle-introduction-to-ai-agents-for-the-web/3.png",[79,2248,2250],{"id":2249},"how-does-a-web-agent-work","How Does a Web Agent Work?",[11,2252,2253],{},"LLM-based web agents are premised on a more or less uniform architecture. 
The agent application embodies a mediator between a web browser (environment), and the LLM backend (model).",[53,2255],{":width":2256,"alt":2257,"format":2184,"loading":58,"src":2258},"480","High-level web agent architecture component view","/blog/a-gentle-introduction-to-ai-agents-for-the-web/4.svg",[94,2260,2262],{"id":2261},"the-agent-lifecycle","The Agent Lifecycle",[11,2264,2265,2266,2271],{},"To reduce a user's cognitive load, solving a web-based task is usually chunked into a sequence of UI states. Consider looking for rental apartments on ",[505,2267,2270],{"href":2268,"rel":2269},"https://www.redfin.com",[509],"redfin.com",": In the first step, you specify a location. Only subsequently are you provided with a grid of available apartments for that location.",[53,2273],{":width":56,"alt":2274,"format":92,"loading":58,"src":2275},"Example of separated UI states in a rental home search application","/blog/a-gentle-introduction-to-ai-agents-for-the-web/5.png",[11,2277,2278,2279,2286],{},"Web agent logic is iterative; not least for a sequential web interaction model, but also for a conversational agent interaction model. Browsing the web, human and computer agents represent users alike. That said, Norman's well-known ",[505,2280,2283],{"href":2281,"rel":2282},"https://mitpress.mit.edu/9780262640374/the-design-of-everyday-things/",[509],[550,2284,2285],{},"Seven Stages of Action",", which hierarchically model the human cognition cycle, transfer to the web agent lifecycle. For each UI state in a web browser (environment) and web-based task (action intention); decide where to click, type, etc. (action planning), and perform those clicks, etc. (action execution). Afterwards, perceive, interpret, and evaluate the results of those actions in the web browser (state). As long as there is a mismatch between the evaluated state and the declared goal state, repeat that cycle. 
If necessary, prompt the user for further required information.",[53,2288],{":width":2289,"alt":2290,"format":2184,"loading":58,"src":2291},"580","Donald Norman's 'Seven Stages of Action' model of the human cognition cycle that transfers to non-human agents","/blog/a-gentle-introduction-to-ai-agents-for-the-web/6.svg",[94,2293,2295],{"id":2294},"web-context-for-llms","Web Context for LLMs",[11,2297,2298,2299,2301,2302,2305,2306,2309,2310,2313],{},"The gap from an agent towards the environment, according to ",[550,2300,2285],{},", is known as the ",[550,2303,2304],{},"gulf of execution",". In real-world scenarios, how to act in the environment with respect to a planned sequence of actions might be difficult (e.g., how to actually open the trunk of a new car?). Arguably, web agents face a novel ",[550,2307,2308],{},"gulf of intention"," towards the action planning stage: how to serialise a currently browsed web page's runtime state for LLMs? ",[550,2311,2312],{},"Snapshot"," is a more comprehensive term to describe the serialisation of a web page's current runtime state. Screenshots, for instance, represent a type of snapshot that closely resembles how humans perceive a web page at a given point in time. But are they as accessible to LLMs?",[94,2315,2317],{"id":2316},"agentic-ui-interaction","Agentic UI Interaction",[11,2319,2320,2321,2323],{},"With a qualified set of well-defined actuation methods, web agents are able to close the ",[550,2322,2304],{}," quite well. HTML element types strongly afford a certain action (e.g., click a button, type into a field). 
Below is what an actuation schema presented to the LLM backend could look like:",[804,2325,2327],{"className":1607,"code":2326,"language":1609,"meta":419,"style":419},"type ActuationSchema = {\n    thought: string;\n    action: \"click\"\n        | \"scroll\"\n        | \"type\";\n    cssSelector: string;\n    data?: string;\n}[];\n",[609,2328,2329,2343,2355,2372,2384,2396,2407,2418],{"__ignoreMap":419},[812,2330,2331,2334,2337,2340],{"class":814,"line":815},[812,2332,2333],{"class":1616},"type",[812,2335,2336],{"class":1620}," ActuationSchema",[812,2338,2339],{"class":934}," = ",[812,2341,2342],{"class":1643},"{\n",[812,2344,2345,2348,2351,2353],{"class":814,"line":420},[812,2346,2347],{"class":934},"    thought",[812,2349,2350],{"class":818},":",[812,2352,1640],{"class":1639},[812,2354,1644],{"class":1643},[812,2356,2357,2360,2362,2365,2369],{"class":814,"line":425},[812,2358,2359],{"class":934},"    action",[812,2361,2350],{"class":818},[812,2363,2364],{"class":832}," \"",[812,2366,2368],{"class":2367},"click",[812,2370,2371],{"class":832},"\"\n",[812,2373,2374,2377,2379,2382],{"class":814,"line":945},[812,2375,2376],{"class":818},"        |",[812,2378,2364],{"class":832},[812,2380,2381],{"class":2367},"scroll",[812,2383,2371],{"class":832},[812,2385,2386,2388,2390,2392,2394],{"class":814,"line":954},[812,2387,2376],{"class":818},[812,2389,2364],{"class":832},[812,2391,1617],{"class":2367},[812,2393,833],{"class":832},[812,2395,1644],{"class":1643},[812,2397,2398,2401,2403,2405],{"class":814,"line":975},[812,2399,2400],{"class":934},"    cssSelector",[812,2402,2350],{"class":818},[812,2404,1640],{"class":1639},[812,2406,1644],{"class":1643},[812,2408,2409,2412,2414,2416],{"class":814,"line":994},[812,2410,2411],{"class":934},"    
data",[812,2413,1665],{"class":818},[812,2415,1640],{"class":1639},[812,2417,1644],{"class":1643},[812,2419,2420,2423,2426],{"class":814,"line":1003},[812,2421,2422],{"class":1643},"}",[812,2424,2425],{"class":934},"[]",[812,2427,1644],{"class":1643},[11,2429,2430],{},"And a suggested actions response could, in turn, look as follows:",[804,2432,2436],{"className":2433,"code":2434,"language":2435,"meta":419,"style":419},"language-json shiki shiki-themes catppuccin-latte night-owl","[\n    {\n        \"thought\": \"Scroll newsletter cta into view\",\n        \"action\": \"scroll\",\n        \"cssSelector\": \"section#newsletter\"\n    },\n    {\n        \"thought\": \"Type email address to newsletter cta\",\n        \"action\": \"type\",\n        \"cssSelector\": \"section#newsletter > input\",\n        \"data\": \"user@example.org\"\n    },\n    {\n        \"thought\": \"Submit newsletter sign up\",\n        \"action\": \"click\",\n        \"cssSelector\": \"section#newsletter > button\"\n    }\n]\n","json",[609,2437,2438,2443,2448,2472,2491,2509,2514,2518,2537,2555,2574,2592,2596,2600,2619,2637,2654,2659],{"__ignoreMap":419},[812,2439,2440],{"class":814,"line":815},[812,2441,2442],{"class":1643},"[\n",[812,2444,2445],{"class":814,"line":420},[812,2446,2447],{"class":1643},"    {\n",[812,2449,2450,2454,2458,2460,2462,2464,2468,2470],{"class":814,"line":425},[812,2451,2453],{"class":2452},"srFR9","        \"",[812,2455,2457],{"class":2456},"s30W1","thought",[812,2459,833],{"class":2452},[812,2461,2350],{"class":1643},[812,2463,2364],{"class":832},[812,2465,2467],{"class":2466},"sCC8C","Scroll newsletter cta into 
view",[812,2469,833],{"class":832},[812,2471,1726],{"class":1643},[812,2473,2474,2476,2479,2481,2483,2485,2487,2489],{"class":814,"line":945},[812,2475,2453],{"class":2452},[812,2477,2478],{"class":2456},"action",[812,2480,833],{"class":2452},[812,2482,2350],{"class":1643},[812,2484,2364],{"class":832},[812,2486,2381],{"class":2466},[812,2488,833],{"class":832},[812,2490,1726],{"class":1643},[812,2492,2493,2495,2498,2500,2502,2504,2507],{"class":814,"line":954},[812,2494,2453],{"class":2452},[812,2496,2497],{"class":2456},"cssSelector",[812,2499,833],{"class":2452},[812,2501,2350],{"class":1643},[812,2503,2364],{"class":832},[812,2505,2506],{"class":2466},"section#newsletter",[812,2508,2371],{"class":832},[812,2510,2511],{"class":814,"line":975},[812,2512,2513],{"class":1643},"    },\n",[812,2515,2516],{"class":814,"line":994},[812,2517,2447],{"class":1643},[812,2519,2520,2522,2524,2526,2528,2530,2533,2535],{"class":814,"line":1003},[812,2521,2453],{"class":2452},[812,2523,2457],{"class":2456},[812,2525,833],{"class":2452},[812,2527,2350],{"class":1643},[812,2529,2364],{"class":832},[812,2531,2532],{"class":2466},"Type email address to newsletter cta",[812,2534,833],{"class":832},[812,2536,1726],{"class":1643},[812,2538,2539,2541,2543,2545,2547,2549,2551,2553],{"class":814,"line":1009},[812,2540,2453],{"class":2452},[812,2542,2478],{"class":2456},[812,2544,833],{"class":2452},[812,2546,2350],{"class":1643},[812,2548,2364],{"class":832},[812,2550,1617],{"class":2466},[812,2552,833],{"class":832},[812,2554,1726],{"class":1643},[812,2556,2557,2559,2561,2563,2565,2567,2570,2572],{"class":814,"line":1015},[812,2558,2453],{"class":2452},[812,2560,2497],{"class":2456},[812,2562,833],{"class":2452},[812,2564,2350],{"class":1643},[812,2566,2364],{"class":832},[812,2568,2569],{"class":2466},"section#newsletter > 
input",[812,2571,833],{"class":832},[812,2573,1726],{"class":1643},[812,2575,2576,2578,2581,2583,2585,2587,2590],{"class":814,"line":1025},[812,2577,2453],{"class":2452},[812,2579,2580],{"class":2456},"data",[812,2582,833],{"class":2452},[812,2584,2350],{"class":1643},[812,2586,2364],{"class":832},[812,2588,2589],{"class":2466},"user@example.org",[812,2591,2371],{"class":832},[812,2593,2594],{"class":814,"line":1054},[812,2595,2513],{"class":1643},[812,2597,2598],{"class":814,"line":1064},[812,2599,2447],{"class":1643},[812,2601,2602,2604,2606,2608,2610,2612,2615,2617],{"class":814,"line":1083},[812,2603,2453],{"class":2452},[812,2605,2457],{"class":2456},[812,2607,833],{"class":2452},[812,2609,2350],{"class":1643},[812,2611,2364],{"class":832},[812,2613,2614],{"class":2466},"Submit newsletter sign up",[812,2616,833],{"class":832},[812,2618,1726],{"class":1643},[812,2620,2621,2623,2625,2627,2629,2631,2633,2635],{"class":814,"line":1101},[812,2622,2453],{"class":2452},[812,2624,2478],{"class":2456},[812,2626,833],{"class":2452},[812,2628,2350],{"class":1643},[812,2630,2364],{"class":832},[812,2632,2368],{"class":2466},[812,2634,833],{"class":832},[812,2636,1726],{"class":1643},[812,2638,2639,2641,2643,2645,2647,2649,2652],{"class":814,"line":1110},[812,2640,2453],{"class":2452},[812,2642,2497],{"class":2456},[812,2644,833],{"class":2452},[812,2646,2350],{"class":1643},[812,2648,2364],{"class":832},[812,2650,2651],{"class":2466},"section#newsletter > button",[812,2653,2371],{"class":832},[812,2655,2656],{"class":814,"line":1116},[812,2657,2658],{"class":1643},"    }\n",[812,2660,2661],{"class":814,"line":1122},[812,2662,2663],{"class":1643},"]\n",[699,2665,2666],{},[11,2667,2668,2673,2674,2679],{},[505,2669,2672],{"href":2670,"rel":2671},"https://platform.openai.com/docs/guides/function-calling",[509],"Function Calling"," and the ",[505,2675,2678],{"href":2676,"rel":2677},"https://modelcontextprotocol.io",[509],"Model Context Protocol"," represent two ends to 
outsource an explicit actuation model – server- and client-side, respectively.",[94,2681,2683],{"id":2682},"agentic-ui-augmentation","Agentic UI Augmentation",[11,2685,2686],{},"An agent represents yet another feature to integrate with an application and its UI. Discoverability and availability, however, are among the most fundamental requirements of a web agent. Evidently, when a user experiences UI/UX friction, at least the agent should be interactive. That said, a scrolling modal web agent UI has been the go-to approach, that is, a little floating widget on top of the underlying application's UI. It comes with a major advantage: the agent application can be decoupled from the underlying, self-contained application.",[53,2688],{":width":2689,"alt":2690,"format":2184,"loading":58,"src":2691},"360","Depiction of a web agent application augmenting an underlying application in an isolated layer","/blog/a-gentle-introduction-to-ai-agents-for-the-web/7.svg",[79,2693,2695],{"id":2694},"how-to-build-a-web-agent","How to Build a Web Agent?",[11,2697,2698],{},"Believe it or not: enhancing an existing web application with a purposeful agent is a lower-hanging fruit. The evolving agent ecosystem provides you with a spectrum of solutions: instantly use a pre-compiled agent, tweak a templated agent, or develop an agent from scratch. Either way, LLMs and web browsers exist for reuse, boiling down agent development to LLM context engineering, and UI augmentation.",[94,2700,2702],{"id":2701},"develop-a-web-agent","Develop a Web Agent",[11,2704,2705,2706,2709,2710,690,2715,2720],{},"Opting for a ",[21,2707,2708],{},"pre-compiled agent"," does not necessarily involve any actual development step. Instead, pre-compiled agents allow for high-level configuration through an agent-as-a-service provider's interface. 
Popular agent-as-a-service providers are, i.a., ",[505,2711,2714],{"href":2712,"rel":2713},"https://elevenlabs.io/conversational-ai",[509],"ElevenLabs",[505,2716,2719],{"href":2717,"rel":2718},"https://www.intercom.com/drlp/ai-agent",[509],"Intercom",". Serviced agents hide LLM communication and potentially interaction with a web browser behind the configuration interface.",[11,2722,2723,2724,2727,2728,2733,2734,2739,2740,2745],{},"Using a ",[21,2725,2726],{},"templated agent"," resembles the agent-as-a-service approach on a lower level. Openly sourced from a ",[505,2729,2732],{"href":2730,"rel":2731},"https://github.com/webfuse-com/agent-extension-blueprint",[509],"code repository",", templated agents allow for any kind of development tweaks. Favourably, agent templates shortcut integration with ",[505,2735,2738],{"href":2736,"rel":2737},"https://openai.com/api/",[509],"LLM APIs"," and web ",[505,2741,2744],{"href":2742,"rel":2743},"https://developer.mozilla.org/en-US/docs/Web/API",[509],"browser APIs",". Using a templated agent usually represents the preferable, best-of-both-worlds approach; common- and best-practice code snippets are available from the beginning, but everything can be customised as desired.",[11,2747,2748,2749,2752],{},"Of course, developing an ",[21,2750,2751],{},"agent from scratch"," is always an option. It is preferable whenever agent requirements deviate to a large extent from what exists in the service or template landscape.",[94,2754,2756],{"id":2755},"deploy-a-web-agent","Deploy a Web Agent",[11,2758,2759,2760,723,2765,2770,2771,2776,2777,2782,2783,2788,2789,2794],{},"When web agent code lives side-by-side with the augmented application's code, agent deployment is covered by a generic pipeline. 
Something like: ",[505,2761,2764],{"href":2762,"rel":2763},"https://eslint.org",[509],"linting",[505,2766,2769],{"href":2767,"rel":2768},"https://prettier.io",[509],"formatting"," agent code, ",[505,2772,2775],{"href":2773,"rel":2774},"https://esbuild.github.io",[509],"transpiling and bundling"," agent modules, ",[505,2778,2781],{"href":2779,"rel":2780},"https://www.cypress.io",[509],"testing"," agent, ",[505,2784,2787],{"href":2785,"rel":2786},"https://pages.cloudflare.com",[509],"hosting"," agent bundle, and ",[505,2790,2793],{"href":2791,"rel":2792},"https://docs.github.com/en/actions/get-started/continuous-integration",[509],"triggering"," post-deployment events. In that case, an agent represents a modular feature component in the application, no different than, for instance, a sign-up component.",[11,2796,2797],{},"Web agent source code right inside the application codebase comes at a cost:",[109,2799,2800,2803,2806],{},[112,2801,2802],{},"Agent developers can manipulate the source code of the underlying application.",[112,2804,2805],{},"Agent functionality could introduce side effects on the underlying application.",[112,2807,2808],{},"Agent changes require deployment of the entire application.",[94,2810,2812],{"id":2811},"best-practices-of-agentic-ux","Best Practices of Agentic UX",[11,2814,2815],{},"When designing user experiences for agent-enhanced applications, there are a few things to consider:",[109,2817,2818,2819,2818,2828,2818,2836],{},"\n    ",[112,2820,2821,2822,2821,2825,2827],{},"\n        ",[21,2823,2824],{},"Stream input and output to reduce latency",[2036,2826],{},"\n        LLMs (re-)introduce noticeable communication round-trip time. 
To reduce wait time for the human user, stream chunks of data whenever they are available.\n    ",[112,2829,2821,2830,2821,2833,2835],{},[21,2831,2832],{},"Provide fine-grained feedback to bridge high-latency",[2036,2834],{},"\n        Human attention is sensitive to several seconds of [system response time](https://www.nngroup.com/articles/response-times-3-important-limits/). Periodically provide agent _thoughts_ as feedback to perceptibly break down round-trip time.\n    ",[112,2837,2821,2838,2821,2841,2843],{},[21,2839,2840],{},"Always prompt the human user for consent to perform critical actions",[2036,2842],{},"\n        Some actions in a web application lead to irreversible or significant changes of state. Never have the agent perform such actions on behalf of the user without explicitly asking for the permission.\n    ",[94,2845,2847],{"id":2846},"non-invasive-web-agents-with-webfuse","Non-Invasive Web Agents with Webfuse",[11,2849,2850,2855],{},[505,2851,2853],{"href":1853,"rel":2852},[509],[21,2854,82],{}," is a configurable web proxy that lets you augment any web application. As pictured, web agents represent highly self-contained applications. Moreover, web agents and underlying applications communicate at runtime in the client. This does, in fact, render opportunities to bridge the above-mentioned drawbacks with Webfuse: Develop web agents with a sandbox extension methodology, and deploy them through the low-latency proxy layer. On demand, seamlessly serve users with your agent-enhanced website. 
Benefit from information hiding, safe code, and fewer deployments.",[157,2857],{":demoAction":2858,"heading":2859,"subtitle":2860},"{\"text\":\"Read more\",\"showIcon\":false,\"href\":\"https://www.webfuse.com/blog/category/ai-agents\"}","Deploy Web Agents with Webfuse","Develop or deploy web agents in minutes; serve agent-enhanced websites through an isolated application layer.",[2110,2862,2863],{},"html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html code.shiki .sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sgAC-, html code.shiki .sgAC-{--shiki-default:#40A02B;--shiki-default-font-style:italic;--shiki-dark:#ECC48D;--shiki-dark-font-style:inherit}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark 
.shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .srFR9, html code.shiki .srFR9{--shiki-default:#7C7F93;--shiki-dark:#7FDBCA}html pre.shiki code .s30W1, html code.shiki .s30W1{--shiki-default:#1E66F5;--shiki-dark:#7FDBCA}html pre.shiki code .sCC8C, html code.shiki .sCC8C{--shiki-default:#40A02B;--shiki-dark:#C789D6}",{"title":419,"searchDepth":420,"depth":420,"links":2865},[2866,2871,2877],{"id":2172,"depth":420,"text":2137,"children":2867},[2868,2869,2870],{"id":2188,"depth":425,"text":2189},{"id":2210,"depth":425,"text":2211},{"id":2228,"depth":425,"text":2229},{"id":2249,"depth":420,"text":2250,"children":2872},[2873,2874,2875,2876],{"id":2261,"depth":425,"text":2262},{"id":2294,"depth":425,"text":2295},{"id":2316,"depth":425,"text":2317},{"id":2682,"depth":425,"text":2683},{"id":2694,"depth":420,"text":2695,"children":2878},[2879,2880,2881,2882],{"id":2701,"depth":425,"text":2702},{"id":2755,"depth":425,"text":2756},{"id":2811,"depth":425,"text":2812},{"id":2846,"depth":425,"text":2847},"2025-06-15","LLMs only recently enabled serviceable web agents: autonomous systems that browse web on behalf of a human. 
Get started with fundamental methodology, key design challenges, and technological opportunities.",{"homepage":480,"relatedLinks":2886},[2887,2888,2892],{"text":2133,"href":2134,"description":2135},{"text":2889,"href":2890,"description":2891},"Develop an AI Agent for Any Website with Webfuse","/blog/develop-an-ai-agent-for-any-website-with-webfuse","Learn how to develop and deploy a web agent for any website with Webfuse",{"text":1849,"href":2893,"external":480,"description":2141},"https://dev.webfuse.com/automation-api/",{"title":2153,"description":2884},{"loc":1590},"blog/1011.a-gentle-introduction-to-ai-agents-for-the-web",[453,2147,2148,486,489],"9anWTMfg6llLSdye3e9qWZZZcEAZcELLMk_vpnixn3M",1782224733880]