[{"data":1,"prerenderedAt":2120},["ShallowReactive",2],{"/blog/the-web-is-the-platform-how-webmcp-makes-agentic-customer-journeys-instant":3,"related-/blog/the-web-is-the-platform-how-webmcp-makes-agentic-customer-journeys-instant":603},{"id":4,"title":5,"authorId":6,"body":7,"category":557,"created":558,"description":559,"extension":560,"faqs":561,"featurePriority":158,"head":580,"landingPath":580,"meta":581,"navigation":592,"ogImage":580,"path":593,"robots":580,"schemaOrg":580,"seo":594,"sitemap":595,"stem":596,"tags":597,"__hash__":602},"blog/blog/1043.the-web-is-the-platform-how-webmcp-makes-agentic-customer-journeys-instant.md","The Web Is the Platform: How WebMCP Makes Agentic Customer Journeys Instant","nicholas-piel",{"type":8,"value":9,"toc":543},"minimark",[10,17,33,72,75,80,88,101,105,108,121,124,129,133,142,145,194,197,203,208,212,217,224,227,232,236,239,246,250,253,259,266,286,292,296,302,313,319,326,330,337,404,407,411,505,509,512,519,524,539],[11,12,13],"blockquote",{},[14,15,16],"p",{},"What if a customer could complete a loan application, an onboarding flow, or a purchase just by talking to you inside a chat? No downloads. No app installs. No friction.",[14,18,19,20,24,25,28,29,32],{},"That's not a hypothetical future. 
It's what happens when you combine ",[21,22,23],"strong",{},"Webfuse's zero-code co-browsing"," with ",[21,26,27],{},"Chrome's WebMCP protocol"," and a ",[21,30,31],{},"conversational voice agent",", all running in the browser.",[34,35,37,43,54,60,66],"tldr-box",{"title":36},"TL;DR",[14,38,39,42],{},[21,40,41],{},"The browser is the platform."," Around 90% of customer journeys happen on the web, including inside messaging-app in-app browsers.",[14,44,45,48,49,53],{},[21,46,47],{},"WebMCP replaces screenshots and DOM scraping."," Websites expose structured tools to AI agents through ",[50,51,52],"code",{},"navigator.modelContext",", so the agent calls actions instead of guessing pixels.",[14,55,56,59],{},[21,57,58],{},"Webfuse makes any site agent-ready in one click."," A proxy-based co-browsing layer works on any website, with no SDK, no app store review, and no native code.",[14,61,62,65],{},[21,63,64],{},"The voice layer is pluggable."," Ship today with ElevenLabs, or go fully Google-native with ADK + Gemini Audio.",[14,67,68,71],{},[21,69,70],{},"The result is days, not months."," What used to need SDK integrations, screenshot agents, and bespoke tool servers is now a composable web-native stack.",[14,73,74],{},"Let me show you.",[76,77,79],"h2",{"id":78},"try-it-live","Try It Live",[81,82],"iframe",{"src":83,"width":84,"height":85,"frameBorder":86,"style":87},"https://webfu.se/+11labs-nice-world-demo/","100%","700px","0","border-radius: 12px; box-shadow: 0 4px 24px rgba(0,0,0,0.1);",[14,89,90],{},[91,92,93,94,100],"em",{},"Interact with the demo above, or ",[95,96,99],"a",{"href":83,"rel":97},[98],"nofollow","open it in a new tab",".",[76,102,104],{"id":103},"the-problem-agents-are-still-pretending-to-be-humans","The Problem: Agents Are Still Pretending to Be Humans",[14,106,107],{},"If you've watched an AI agent \"use\" a website, you know the absurdity:",[109,110,111,115,118],"ul",{},[112,113,114],"li",{},"Taking screenshots of pages and burning through 
thousands of tokens to guess which button is which",[112,116,117],{},"Scraping raw HTML, hoping nothing changed since the last crawl",[112,119,120],{},"Clicking around until something works",[14,122,123],{},"This is billion-parameter models pretending to be humans, pixel by pixel. It's fragile, expensive, and slow. A single form fill that takes a human ten seconds might require dozens of sequential agent interactions, each one an inference call that adds latency and cost.",[14,125,126],{},[21,127,128],{},"Web UI is designed for humans. AI agents need structure.",[76,130,132],{"id":131},"the-solution-webmcp-gives-agents-structured-access-to-the-web","The Solution: WebMCP Gives Agents Structured Access to the Web",[14,134,135,136,139,140,100],{},"Enter ",[21,137,138],{},"WebMCP"," (Web Model Context Protocol), a browser-native standard from Google that lets websites expose structured, callable tools directly to AI agents through ",[50,141,52],{},[14,143,144],{},"Instead of an agent guessing which blue rectangle is the \"Submit\" button, the website publishes a contract:",[146,147,152],"pre",{"className":148,"code":149,"language":150,"meta":151,"style":151},"language-javascript shiki shiki-themes catppuccin-latte night-owl","// Instead of: \"find submit button and click it\"\n// With WebMCP, the agent calls:\nsubmit_form({ data })\n","javascript","",[50,153,154,163,169],{"__ignoreMap":151},[155,156,159],"span",{"class":157,"line":158},"line",1,[155,160,162],{"class":161},"sDmS1","// Instead of: \"find submit button and click it\"\n",[155,164,166],{"class":157,"line":165},2,[155,167,168],{"class":161},"// With WebMCP, the agent calls:\n",[155,170,172,176,180,184,188,191],{"class":157,"line":171},3,[155,173,175],{"class":174},"sNstc","submit_form",[155,177,179],{"class":178},"s2kId","(",[155,181,183],{"class":182},"scGhl","{",[155,185,187],{"class":186},"soAP-"," data",[155,189,190],{"class":182}," }",[155,192,193],{"class":178},")\n",[14,195,196],{},"The agent knows 
exactly what actions are available, what parameters they accept, and what results they return. No screenshots. No DOM scraping. No guessing.",[14,198,199,202],{},[21,200,201],{},"The difference is stark:"," WebMCP transforms every website into a structured tool for AI agents, turning the browser from a passive display into an interactive platform.",[204,205],"article-signup-cta",{"heading":206,"subtitle":207},"Make Any Website Agent-Ready","Webfuse turns any website into a shared, agent-controllable surface with zero code. Pair it with WebMCP and a voice agent to ship agentic customer journeys in days, not months, on the platform every customer already has - the browser.",[76,209,211],{"id":210},"how-it-works-in-practice","How It Works in Practice",[213,214,216],"h3",{"id":215},"layer-1-webfuse-instant-co-browsing-zero-code","Layer 1: Webfuse - Instant Co-Browsing, Zero Code",[14,218,219,220,223],{},"Webfuse is a proxy-based co-browsing platform. It works on ",[21,221,222],{},"any website",", including third-party sites, without requiring SDK installation, code changes, or app wrapper integration.",[14,225,226],{},"When a customer taps a button inside their messaging app, the session opens in the in-app browser. Both the agent and the user see the same page simultaneously. The agent can navigate, highlight, and guide. The user can interact freely.",[14,228,229],{},[21,230,231],{},"No join codes. No downloads. No QR codes. One tap.",[213,233,235],{"id":234},"layer-2-webmcp-structured-agent-to-web-interaction","Layer 2: WebMCP - Structured Agent-to-Web Interaction",[14,237,238],{},"While the user and agent share the viewport, WebMCP gives the agent structured access to page actions. 
The agent doesn't need to \"see\" the page visually to know what to do; it calls the tools the website exposes.",[14,240,241,242,245],{},"This is the critical shift: the agent acts on the page ",[91,243,244],{},"while"," the user watches and guides via voice.",[213,247,249],{"id":248},"layer-3-voice-natural-conversational-control","Layer 3: Voice - Natural, Conversational Control",[14,251,252],{},"The voice interface makes the interaction feel natural. The customer speaks. The agent responds. And while they're having this conversation, the agent is actively navigating the page, filling forms, clicking buttons, guiding the user through the journey.",[14,254,255,258],{},[21,256,257],{},"And here's the key:"," the voice layer is pluggable.",[14,260,261,262,265],{},"You can use ElevenLabs for production-ready voice today. Or you can swap in ",[21,263,264],{},"Google ADK + Gemini Audio"," for a fully Google-native stack:",[109,267,268,274,280],{},[112,269,270,273],{},[21,271,272],{},"Google ADK"," provides bidirectional voice streaming with Gemini-powered reasoning",[112,275,276,279],{},[21,277,278],{},"WebMCP tools"," integrate directly into ADK agents, no bridging layer needed",[112,281,282,285],{},[21,283,284],{},"Gemini multimodal audio"," handles both speech synthesis and understanding in one model",[14,287,288,289],{},"The narrative becomes clear: ",[91,290,291],{},"Webfuse on the web layer, ADK + WebMCP on the agent layer, Gemini for voice, all Google infrastructure.",[76,293,295],{"id":294},"why-web-first-wins","Why Web-First Wins",[14,297,298,299],{},"You might be wondering: ",[91,300,301],{},"why not use native app co-browsing?",[14,303,304,305,308,309,312],{},"Because ",[21,306,307],{},"90% of customer journeys happen on the web",". Even inside messaging apps, the in-app browser is your canvas. WebMCP makes that canvas ",[91,310,311],{},"agent-ready",". 
You don't need a native SDK to guide a customer through a form, a purchase, or an onboarding flow.",[14,314,315,316,318],{},"The browser is the universal runtime. And with WebMCP, it's also the most ",[91,317,311],{}," runtime.",[14,320,321,322,325],{},"For the handful of cases where you absolutely need native UI access, SDK-based tools fill that gap. But for the vast majority of customer interactions, forms, transactions, onboarding, guided sales, web is not \"good enough.\" It's the ",[91,323,324],{},"right"," platform.",[76,327,329],{"id":328},"speed-to-market-is-the-whole-point","Speed-to-Market Is the Whole Point",[14,331,332,333,336],{},"The demo you just saw above took ",[21,334,335],{},"days"," to put together, not weeks. Here's why:",[338,339,340,356],"table",{},[341,342,343],"thead",{},[344,345,346,350,353],"tr",{},[347,348,349],"th",{},"Layer",[347,351,352],{},"Traditional Approach",[347,354,355],{},"WebMCP + Webfuse Approach",[357,358,359,371,382,393],"tbody",{},[344,360,361,365,368],{},[362,363,364],"td",{},"Co-Browsing",[362,366,367],{},"SDK integration, app store review, native code",[362,369,370],{},"Zero-code proxy, works instantly on any website",[344,372,373,376,379],{},[362,374,375],{},"Agent Interaction",[362,377,378],{},"Screenshot-based parsing, DOM scraping, custom tool servers",[362,380,381],{},"WebMCP, structured tool registration in the browser",[344,383,384,387,390],{},[362,385,386],{},"Voice Interface",[362,388,389],{},"Third-party API integration",[362,391,392],{},"Pluggable: ElevenLabs or Google ADK + Gemini",[344,394,395,398,401],{},[362,396,397],{},"Total Time",[362,399,400],{},"Weeks to months",[362,402,403],{},"Days",[14,405,406],{},"The integration point is the agent, and with WebMCP, that integration is structured tool registration, not DOM parsing and screenshot analysis.",[76,408,410],{"id":409},"the-stack","The 
Stack",[338,412,413,425],{},[341,414,415],{},[344,416,417,419,422],{},[347,418,349],{},[347,420,421],{},"Technology",[347,423,424],{},"Role",[357,426,427,439,454,467,479,492],{},[344,428,429,433,436],{},[362,430,431],{},[21,432,364],{},[362,434,435],{},"Webfuse / Surfly",[362,437,438],{},"Proxy-based co-browsing, zero SDK, works on any website",[344,440,441,446,449],{},[362,442,443],{},[21,444,445],{},"Agent Protocol",[362,447,448],{},"Chrome WebMCP",[362,450,451,452],{},"Structured tool exposure for AI agents via ",[50,453,52],{},[344,455,456,461,464],{},[362,457,458],{},[21,459,460],{},"Voice (Option A)",[362,462,463],{},"ElevenLabs",[362,465,466],{},"Production-ready conversational voice",[344,468,469,474,476],{},[362,470,471],{},[21,472,473],{},"Voice (Option B)",[362,475,264],{},[362,477,478],{},"Fully Google-native stack with bidirectional streaming",[344,480,481,486,489],{},[362,482,483],{},[21,484,485],{},"Messaging",[362,487,488],{},"Google Messages / RCS",[362,490,491],{},"Entry point - in-app browser renders the web session",[344,493,494,499,502],{},[362,495,496],{},[21,497,498],{},"Session Control",[362,500,501],{},"Webfuse API",[362,503,504],{},"Start/stop sessions, one-click mobile launch",[76,506,508],{"id":507},"this-isnt-hypothetical","This Isn't Hypothetical",[14,510,511],{},"WebMCP is already in Chrome Canary and stabilizing. Google ADK is open-source and actively developed. Webfuse is production-ready today.",[14,513,514,515,518],{},"The infrastructure for web-first agentic customer experiences exists ",[91,516,517],{},"now",". It's composable. It's pluggable. And it runs on the platform everyone already has: the browser.",[14,520,521],{},[21,522,523],{},"The question isn't whether the web can be the platform for AI agents. The question is: how fast can you build on it?",[14,525,526],{},[91,527,528,529,534,535,100],{},"Want to see this in action for your use case? 
",[95,530,533],{"href":531,"rel":532},"https://www.webfuse.com/demo/",[98],"Book a demo"," or ",[95,536,538],{"href":537},"mailto:hello@webfuse.com","reach out to our team",[540,541,542],"style",{},"html pre.shiki code .sDmS1, html code.shiki .sDmS1{--shiki-default:#7C7F93;--shiki-default-font-style:italic;--shiki-dark:#637777;--shiki-dark-font-style:italic}html pre.shiki code .sNstc, html code.shiki .sNstc{--shiki-default:#1E66F5;--shiki-default-font-style:italic;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .soAP-, html code.shiki .soAP-{--shiki-default:#4C4F69;--shiki-dark:#D7DBE0}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: 
var(--shiki-dark-text-decoration);}",{"title":151,"searchDepth":165,"depth":165,"links":544},[545,546,547,548,553,554,555,556],{"id":78,"depth":165,"text":79},{"id":103,"depth":165,"text":104},{"id":131,"depth":165,"text":132},{"id":210,"depth":165,"text":211,"children":549},[550,551,552],{"id":215,"depth":171,"text":216},{"id":234,"depth":171,"text":235},{"id":248,"depth":171,"text":249},{"id":294,"depth":165,"text":295},{"id":328,"depth":165,"text":329},{"id":409,"depth":165,"text":410},{"id":507,"depth":165,"text":508},"ai-agents","2026-05-18","How combining Webfuse zero-code co-browsing, Chrome's WebMCP protocol, and a conversational voice agent turns the browser into the fastest path to agentic customer journeys - no SDKs, no installs, no friction.","md",[562,565,568,571,574,577],{"question":563,"answer":564},"What is WebMCP and how does it differ from MCP?","WebMCP is a browser-native protocol that lets websites publish structured, callable tools to AI agents through navigator.modelContext. Unlike server-side MCP, WebMCP runs entirely inside the page, so an agent can act on the same DOM the user sees without screenshots or DOM scraping.",{"question":566,"answer":567},"Why use the browser instead of a native app for agentic experiences?","Around 90% of customer journeys still happen on the web, and even inside messaging apps the in-app browser is the canvas. With Webfuse and WebMCP, that canvas becomes agent-ready instantly - no SDK, no app store review, no native code.",{"question":569,"answer":570},"Can I swap the voice layer for a Google-native stack?","Yes. The voice layer is pluggable. You can use ElevenLabs for production-ready voice today, or swap in Google ADK with Gemini Audio for a fully Google-native stack where WebMCP tools integrate directly into ADK agents with no bridging layer.",{"question":572,"answer":573},"Why not native app co-browsing?","Because 90% of customer journeys happen on the web. 
Even inside messaging apps, the in-app browser is your canvas. WebMCP makes that canvas agent-ready. You don't need a native SDK to guide a customer through a form, a purchase, or an onboarding flow. The browser is the universal runtime.",{"question":575,"answer":576},"What about complex app workflows?","For the handful of cases where you absolutely need native UI access, tools like Cobrowse.io fill that gap. But for the vast majority of customer interactions (forms, transactions, onboarding, guided sales), the web is not merely 'good enough'; it's the right platform, especially with WebMCP giving agents structured access to page actions.",{"question":578,"answer":579},"How fast can we build something like this?","The demo took days to put together, not weeks. Webfuse is zero-code for the co-browsing layer. WebMCP is already in Chrome Canary and stabilizing. The integration point is the agent, and with WebMCP, that integration is structured tool registration, not DOM parsing and screenshot analysis. 
Speed-to-market is the whole point.",null,{"shortTitle":582,"relatedLinks":583},"The Web Is the Platform",[584,588],{"text":585,"href":586,"description":587},"What Is WebMCP?","/blog/what-is-webmcp-the-practical-guide-to-the-web-model-context-protocol","A practical guide to the Web Model Context Protocol and how websites can expose tools directly to AI agents.",{"text":589,"href":590,"description":591},"Architecture of a Web-Controlling Voice Agent","/blog/architecture-of-a-web-controlling-voice-agent","How a voice agent and a live browser session combine into a single shared workspace for the user.",true,"/blog/the-web-is-the-platform-how-webmcp-makes-agentic-customer-journeys-instant",{"title":5,"description":559},{"loc":593},"blog/1043.the-web-is-the-platform-how-webmcp-makes-agentic-customer-journeys-instant",[598,557,599,600,601],"webmcp","co-browsing","voice-ai","agentic-web","J_keadtSf4duCvt1wnPmpolXh5lTfomTbF0GAgJ_bYI",[604,1406],{"id":605,"title":606,"authorId":607,"body":608,"category":557,"created":1380,"description":1381,"extension":560,"faqs":580,"featurePriority":580,"head":580,"landingPath":580,"meta":1382,"navigation":592,"ogImage":580,"path":1396,"robots":580,"schemaOrg":580,"seo":1397,"sitemap":1398,"stem":1399,"tags":1400,"__hash__":1405},"blog/blog/1011.a-gentle-introduction-to-ai-agents-for-the-web.md","A Gentle Introduction to AI Agents for the Web","thassilo-schiepanski",{"type":8,"value":609,"toc":1361},[610,631,635,642,650,654,657,672,676,686,692,696,709,713,717,720,725,729,738,742,753,758,762,780,784,790,911,914,1158,1174,1178,1181,1186,1190,1193,1197,1214,1239,1246,1250,1289,1292,1303,1307,1310,1339,1343,1353,1358],[14,611,612,613,618,619,624,625,630],{},"In no time, AI became a natural part of modern web interfaces. 
AI agents for the web enjoy a recent hype, sparked by the means of ",[95,614,617],{"href":615,"rel":616},"https://openai.com/index/introducing-operator/",[98],"Operator (OpenAI)",", ",[95,620,623],{"href":621,"rel":622},"https://www.director.ai",[98],"Director (Browserbase)",", and ",[95,626,629],{"href":627,"rel":628},"https://browser-use.com",[98],"Browser Use",". By now, it is within reach to automate arbitrary web-based tasks, such as booking the cheapest flight from Berlin to Amsterdam.",[76,632,634],{"id":633},"what-is-a-web-agent","What is a Web Agent?",[14,636,637,638,641],{},"For starters, let us break down the term ",[21,639,640],{},"web AI agent",": An agent is an entity that autonomously acts on behalf of another entity. An artificially intelligent agent is an application that acts on behalf of a human. In contrast to non-AI computer agents, it solves complex tasks with at least human-grade effectiveness and efficiency. For a human-centric web, web agents have deliberately been designed to browse the web in a human fashion – through UIs rather than APIs.",[643,644],"nuxt-picture",{":width":645,"alt":646,"format":647,"loading":648,"src":649},"610","High-level agent description comparing human and computer agents","svg","lazy","/blog/a-gentle-introduction-to-ai-agents-for-the-web/1.svg",[213,651,653],{"id":652},"the-role-of-frontier-llms","The Role of Frontier LLMs",[14,655,656],{},"Web agents have been a vague desire for a long time. AI agents used to rely on complete models of a problem domain in order to allow (heuristic) search through problem states. Such models would comprise the problem world (e.g., a chessboard), actors (pawns, rooks, etc.), possible actions per actor (rook moves straight), and constraints (i.a., max one piece per field). 
A heterogeneous space of web application UIs describes the problem domain of a web agent: how to understand a web page, and how to interact with it to solve the declared task?",[14,658,659,660,667,668,671],{},"Frontier LLMs disrupted the AI agent world: explicit problem domain models beyond feasibility can now be replaced by an LLM. The LLM thereby acts as an instantaneous domain model backend that can be consulted with twofold context: serialised problem state, such as a chess position code (",[91,661,662,663,666],{},"“",[155,664,665],{},"..."," e4 e5 2. Nc3 f5”","), and the respective task (",[91,669,670],{},"“What is the best move for white?”","). For web agents, problem state corresponds to the currently browsed web application's runtime state, for instance, a screenshot.",[213,673,675],{"id":674},"generalist-web-agents","Generalist Web Agents",[14,677,678,679,624,682,685],{},"Generalist web agents are supposed to solve arbitrary tasks through a web browser. Web-based tasks can be as diverse as ",[91,680,681],{},"“Find a picture of a cat.”",[91,683,684],{},"“Book the cheapest flight from Berlin to Amsterdam tomorrow afternoon (business class, window seat).”"," In reality, generalist agents still fail uncommon or too precise tasks. While they have been critically acclaimed, they mainly act as early proofs-of-concept. Tasks that are indeed solvable with a generalist agent promise great results with an according specialist agent.",[643,687],{":width":688,"alt":689,"format":690,"loading":648,"src":691},"900","Screenshot of a generalist web agent UI (Director)","webp","/blog/a-gentle-introduction-to-ai-agents-for-the-web/2.png",[213,693,695],{"id":694},"specialist-web-agents","Specialist Web Agents",[14,697,698,699,702,703,708],{},"Other than generalist agents, specialist web agents are constrained to a certain task and application domain. Specialist agents bear the major share of commercial value. 
The most prominent example is the modal chat agent that provides users with on-page help. Picture a little floating widget that can be chatted to via text or voice input. In most cases, in fact, the term ",[91,700,701],{},"web (AI) agent"," refers to chat agents. Chat agents – text or voice – can be implemented on top of virtually any existing website. Frontier LLMs provide a lot of common sense out of the box. A ",[95,704,707],{"href":705,"rel":706},"https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/system-prompts",[98],"system prompt"," can, moreover, be leveraged to drive specialist agent quality for the respective problem domain.",[643,710],{":width":688,"alt":711,"format":690,"loading":648,"src":712},"Screenshots of two modal specialist web agent UIs augmenting an underlying website's UI","/blog/a-gentle-introduction-to-ai-agents-for-the-web/3.png",[76,714,716],{"id":715},"how-does-a-web-agent-work","How Does a Web Agent Work?",[14,718,719],{},"LLM-based web agents are premised on a more or less uniform architecture. The agent application acts as a mediator between a web browser (environment) and the LLM backend (model).",[643,721],{":width":722,"alt":723,"format":647,"loading":648,"src":724},"480","High-level web agent architecture component view","/blog/a-gentle-introduction-to-ai-agents-for-the-web/4.svg",[213,726,728],{"id":727},"the-agent-lifecycle","The Agent Lifecycle",[14,730,731,732,737],{},"To reduce a user's cognitive load, solving a web-based task is usually chunked into a sequence of UI states. Consider looking for rental apartments on ",[95,733,736],{"href":734,"rel":735},"https://www.redfin.com",[98],"redfin.com",": In the first step, you specify a location. 
Only subsequently are you provided with a grid of available apartments for that location.",[643,739],{":width":688,"alt":740,"format":690,"loading":648,"src":741},"Example of separated UI states in a rental home search application","/blog/a-gentle-introduction-to-ai-agents-for-the-web/5.png",[14,743,744,745,752],{},"Web agent logic is iterative; not least for a sequential web interaction model, but also for a conversational agent interaction model. Browsing the web, human and computer agents represent users alike. That said, Norman's well-known ",[95,746,749],{"href":747,"rel":748},"https://mitpress.mit.edu/9780262640374/the-design-of-everyday-things/",[98],[91,750,751],{},"Seven Stages of Action",", which hierarchically model the human cognition cycle, transfer to the web agent lifecycle. For each UI state in a web browser (environment) and web-based task (action intention); decide where to click, type, etc. (action planning), and perform those clicks, etc. (action execution). Afterwards, perceive, interpret, and evaluate the results of those actions in the web browser (state). As long as there is a mismatch between the evaluated state and the declared goal state, repeat that cycle. Potentially prompt the user with more required information.",[643,754],{":width":755,"alt":756,"format":647,"loading":648,"src":757},"580","Donald 'Norman's Seven Stages of Action' model of the human cognition cycle that transfers to non-human agents","/blog/a-gentle-introduction-to-ai-agents-for-the-web/6.svg",[213,759,761],{"id":760},"web-context-for-llms","Web Context for LLMs",[14,763,764,765,767,768,771,772,775,776,779],{},"The gap from an agent towards the environment, according to ",[91,766,751],{},", is known as the ",[91,769,770],{},"gulf of execution",". In real-world scenarios, how to act in the environment in respect to a planned sequence of actions might be difficult (e.g., how to actually open the trunk of a new car?). 
Arguably, web agents face a novel ",[91,773,774],{},"gulf of intention"," towards the action planning stage: how to serialise a currently browsed web page's runtime state for LLMs? ",[91,777,778],{},"Snapshot"," is a more comprehensive term to describe the serialisation of a web page's current runtime state. Screenshots, for instance, represent a type of snapshot that closely resembles how humans perceive a web page at a given point in time. But are they as accessible to LLMs?",[213,781,783],{"id":782},"agentic-ui-interaction","Agentic UI Interaction",[14,785,786,787,789],{},"With a qualified set of well-defined actuation methods, web agents are able to close the ",[91,788,770],{}," quite well. HTML element types strongly afford a certain action (e.g., click a button, type into a field). Below is what an actuation schema presented to the LLM backend could look like:",[146,791,795],{"className":792,"code":793,"language":794,"meta":151,"style":151},"language-ts shiki shiki-themes catppuccin-latte night-owl","type ActuationSchema = {\n    thought: string;\n    action: \"click\"\n        | \"scroll\"\n        | \"type\";\n    cssSelector: string;\n    data?: string;\n}[];\n","ts",[50,796,797,813,829,847,860,875,887,900],{"__ignoreMap":151},[155,798,799,803,807,810],{"class":157,"line":158},[155,800,802],{"class":801},"s76yb","type",[155,804,806],{"class":805},"sXbZB"," ActuationSchema",[155,808,809],{"class":178}," = ",[155,811,812],{"class":182},"{\n",[155,814,815,818,822,826],{"class":157,"line":165},[155,816,817],{"class":178},"    thought",[155,819,821],{"class":820},"s9rnR",":",[155,823,825],{"class":824},"scrte"," string",[155,827,828],{"class":182},";\n",[155,830,831,834,836,840,844],{"class":157,"line":171},[155,832,833],{"class":178},"    action",[155,835,821],{"class":820},[155,837,839],{"class":838},"sbuKk"," 
\"",[155,841,843],{"class":842},"sgAC-","click",[155,845,846],{"class":838},"\"\n",[155,848,850,853,855,858],{"class":157,"line":849},4,[155,851,852],{"class":820},"        |",[155,854,839],{"class":838},[155,856,857],{"class":842},"scroll",[155,859,846],{"class":838},[155,861,863,865,867,870,873],{"class":157,"line":862},5,[155,864,852],{"class":820},[155,866,839],{"class":838},[155,868,869],{"class":842},"type",[155,871,872],{"class":838},"\"",[155,874,828],{"class":182},[155,876,878,881,883,885],{"class":157,"line":877},6,[155,879,880],{"class":178},"    cssSelector",[155,882,821],{"class":820},[155,884,825],{"class":824},[155,886,828],{"class":182},[155,888,890,893,896,898],{"class":157,"line":889},7,[155,891,892],{"class":178},"    data",[155,894,895],{"class":820},"?:",[155,897,825],{"class":824},[155,899,828],{"class":182},[155,901,903,906,909],{"class":157,"line":902},8,[155,904,905],{"class":182},"}",[155,907,908],{"class":178},"[]",[155,910,828],{"class":182},[14,912,913],{},"And a suggested actions response could, in turn, look as follows:",[146,915,919],{"className":916,"code":917,"language":918,"meta":151,"style":151},"language-json shiki shiki-themes catppuccin-latte night-owl","[\n    {\n        \"thought\": \"Scroll newsletter cta into view\",\n        \"action\": \"scroll\",\n        \"cssSelector\": \"section#newsletter\"\n    },\n    {\n        \"thought\": \"Type email address to newsletter cta\",\n        \"action\": \"type\",\n        \"cssSelector\": \"section#newsletter > input\",\n        \"data\": \"user@example.org\"\n    },\n    {\n        \"thought\": \"Submit newsletter sign up\",\n        \"action\": \"click\",\n        \"cssSelector\": \"section#newsletter > button\"\n    
}\n]\n","json",[50,920,921,926,931,956,975,993,998,1002,1021,1040,1060,1079,1084,1089,1109,1128,1146,1152],{"__ignoreMap":151},[155,922,923],{"class":157,"line":158},[155,924,925],{"class":182},"[\n",[155,927,928],{"class":157,"line":165},[155,929,930],{"class":182},"    {\n",[155,932,933,937,941,943,945,947,951,953],{"class":157,"line":171},[155,934,936],{"class":935},"srFR9","        \"",[155,938,940],{"class":939},"s30W1","thought",[155,942,872],{"class":935},[155,944,821],{"class":182},[155,946,839],{"class":838},[155,948,950],{"class":949},"sCC8C","Scroll newsletter cta into view",[155,952,872],{"class":838},[155,954,955],{"class":182},",\n",[155,957,958,960,963,965,967,969,971,973],{"class":157,"line":849},[155,959,936],{"class":935},[155,961,962],{"class":939},"action",[155,964,872],{"class":935},[155,966,821],{"class":182},[155,968,839],{"class":838},[155,970,857],{"class":949},[155,972,872],{"class":838},[155,974,955],{"class":182},[155,976,977,979,982,984,986,988,991],{"class":157,"line":862},[155,978,936],{"class":935},[155,980,981],{"class":939},"cssSelector",[155,983,872],{"class":935},[155,985,821],{"class":182},[155,987,839],{"class":838},[155,989,990],{"class":949},"section#newsletter",[155,992,846],{"class":838},[155,994,995],{"class":157,"line":877},[155,996,997],{"class":182},"    },\n",[155,999,1000],{"class":157,"line":889},[155,1001,930],{"class":182},[155,1003,1004,1006,1008,1010,1012,1014,1017,1019],{"class":157,"line":902},[155,1005,936],{"class":935},[155,1007,940],{"class":939},[155,1009,872],{"class":935},[155,1011,821],{"class":182},[155,1013,839],{"class":838},[155,1015,1016],{"class":949},"Type email address to newsletter 
cta",[155,1018,872],{"class":838},[155,1020,955],{"class":182},[155,1022,1024,1026,1028,1030,1032,1034,1036,1038],{"class":157,"line":1023},9,[155,1025,936],{"class":935},[155,1027,962],{"class":939},[155,1029,872],{"class":935},[155,1031,821],{"class":182},[155,1033,839],{"class":838},[155,1035,869],{"class":949},[155,1037,872],{"class":838},[155,1039,955],{"class":182},[155,1041,1043,1045,1047,1049,1051,1053,1056,1058],{"class":157,"line":1042},10,[155,1044,936],{"class":935},[155,1046,981],{"class":939},[155,1048,872],{"class":935},[155,1050,821],{"class":182},[155,1052,839],{"class":838},[155,1054,1055],{"class":949},"section#newsletter > input",[155,1057,872],{"class":838},[155,1059,955],{"class":182},[155,1061,1063,1065,1068,1070,1072,1074,1077],{"class":157,"line":1062},11,[155,1064,936],{"class":935},[155,1066,1067],{"class":939},"data",[155,1069,872],{"class":935},[155,1071,821],{"class":182},[155,1073,839],{"class":838},[155,1075,1076],{"class":949},"user@example.org",[155,1078,846],{"class":838},[155,1080,1082],{"class":157,"line":1081},12,[155,1083,997],{"class":182},[155,1085,1087],{"class":157,"line":1086},13,[155,1088,930],{"class":182},[155,1090,1092,1094,1096,1098,1100,1102,1105,1107],{"class":157,"line":1091},14,[155,1093,936],{"class":935},[155,1095,940],{"class":939},[155,1097,872],{"class":935},[155,1099,821],{"class":182},[155,1101,839],{"class":838},[155,1103,1104],{"class":949},"Submit newsletter sign 
up",[155,1106,872],{"class":838},[155,1108,955],{"class":182},[155,1110,1112,1114,1116,1118,1120,1122,1124,1126],{"class":157,"line":1111},15,[155,1113,936],{"class":935},[155,1115,962],{"class":939},[155,1117,872],{"class":935},[155,1119,821],{"class":182},[155,1121,839],{"class":838},[155,1123,843],{"class":949},[155,1125,872],{"class":838},[155,1127,955],{"class":182},[155,1129,1131,1133,1135,1137,1139,1141,1144],{"class":157,"line":1130},16,[155,1132,936],{"class":935},[155,1134,981],{"class":939},[155,1136,872],{"class":935},[155,1138,821],{"class":182},[155,1140,839],{"class":838},[155,1142,1143],{"class":949},"section#newsletter > button",[155,1145,846],{"class":838},[155,1147,1149],{"class":157,"line":1148},17,[155,1150,1151],{"class":182},"    }\n",[155,1153,1155],{"class":157,"line":1154},18,[155,1156,1157],{"class":182},"]\n",[11,1159,1160],{},[14,1161,1162,1167,1168,1173],{},[95,1163,1166],{"href":1164,"rel":1165},"https://platform.openai.com/docs/guides/function-calling",[98],"Function Calling"," and the ",[95,1169,1172],{"href":1170,"rel":1171},"https://modelcontextprotocol.io",[98],"Model Context Protocol"," represent two ends to outsource an explicit actuation model – server- and client-side, respectively.",[213,1175,1177],{"id":1176},"agentic-ui-augmentation","Agentic UI Augmentation",[14,1179,1180],{},"An agent represents yet another feature to integrate with an application and its UI. Discoverability and availability, however, are among the most fundamental requirements of a web agent. Evidently, when a user experiences UI/UX friction, at least the agent should be interactive. That said, a scrolling modal web agent UI has been the go-to approach, that is, a little floating widget on top of the underlying application's UI. 
It comes with a major advantage: the agent application can be decoupled from the underlying, self-contained application.",[643,1182],{":width":1183,"alt":1184,"format":647,"loading":648,"src":1185},"360","Depiction of a web agent application augmenting an underlying application in an isolated layer","/blog/a-gentle-introduction-to-ai-agents-for-the-web/7.svg",[76,1187,1189],{"id":1188},"how-to-build-a-web-agent","How to Build a Web Agent?",[14,1191,1192],{},"Believe it or not: enhancing an existing web application with a purposeful agent is low-hanging fruit. The evolving agent ecosystem provides you with a spectrum of solutions: instantly use a pre-compiled agent, tweak a templated agent, or develop an agent from scratch. Either way, LLMs and web browsers exist for reuse, which boils agent development down to LLM context engineering and UI augmentation.",[213,1194,1196],{"id":1195},"develop-a-web-agent","Develop a Web Agent",[14,1198,1199,1200,1203,1204,624,1208,1213],{},"Opting for a ",[21,1201,1202],{},"pre-compiled agent"," does not necessarily involve any actual development step. Instead, pre-compiled agents allow for high-level configuration through an agent-as-a-service provider's interface. Popular agent-as-a-service providers include, among others, ",[95,1205,463],{"href":1206,"rel":1207},"https://elevenlabs.io/conversational-ai",[98],[95,1209,1212],{"href":1210,"rel":1211},"https://www.intercom.com/drlp/ai-agent",[98],"Intercom",". Serviced agents hide LLM communication and potentially interaction with a web browser behind the configuration interface.",[14,1215,1216,1217,1220,1221,1226,1227,1232,1233,1238],{},"Using a ",[21,1218,1219],{},"templated agent"," resembles the agent-as-a-service approach on a lower level. Openly sourced from a ",[95,1222,1225],{"href":1223,"rel":1224},"https://github.com/webfuse-com/agent-extension-blueprint",[98],"code repository",", templated agents allow for any kind of development tweaks. 
Favourably, agent templates shortcut integration with ",[95,1228,1231],{"href":1229,"rel":1230},"https://openai.com/api/",[98],"LLM APIs"," and web ",[95,1234,1237],{"href":1235,"rel":1236},"https://developer.mozilla.org/en-US/docs/Web/API",[98],"browser APIs",". Using a templated agent usually represents the preferable, best-of-both-worlds approach; common- and best-practice code snippets are available from the beginning, but everything can be customised as desired.",[14,1240,1241,1242,1245],{},"Of course, developing an ",[21,1243,1244],{},"agent from scratch"," is always an option. It is preferable whenever agent requirements deviate to a large extent from what exists in the service or template landscape.",[213,1247,1249],{"id":1248},"deploy-a-web-agent","Deploy a Web Agent",[14,1251,1252,1253,1258,1259,1264,1265,1270,1271,1276,1277,1282,1283,1288],{},"When web agent code lives side-by-side with the augmented application's code, agent deployment is covered by a generic pipeline. Something like: ",[95,1254,1257],{"href":1255,"rel":1256},"https://eslint.org",[98],"linting"," and ",[95,1260,1263],{"href":1261,"rel":1262},"https://prettier.io",[98],"formatting"," agent code, ",[95,1266,1269],{"href":1267,"rel":1268},"https://esbuild.github.io",[98],"transpiling and bundling"," agent modules, ",[95,1272,1275],{"href":1273,"rel":1274},"https://www.cypress.io",[98],"testing"," agent, ",[95,1278,1281],{"href":1279,"rel":1280},"https://pages.cloudflare.com",[98],"hosting"," agent bundle, and ",[95,1284,1287],{"href":1285,"rel":1286},"https://docs.github.com/en/actions/get-started/continuous-integration",[98],"triggering"," post-deployment events. 
In that case, an agent represents a modular feature component in the application, no different than, for instance, a sign-up component.",[14,1290,1291],{},"Web agent source code right inside the application codebase comes at a cost:",[109,1293,1294,1297,1300],{},[112,1295,1296],{},"Agent developers can manipulate the source code of the underlying application.",[112,1298,1299],{},"Agent functionality could introduce side effects on the underlying application.",[112,1301,1302],{},"Agent changes require deployment of the entire application.",[213,1304,1306],{"id":1305},"best-practices-of-agentic-ux","Best Practices of Agentic UX",[14,1308,1309],{},"When designing user experiences for agent-enhanced applications, there are a few things to consider:",[109,1311,1312,1313,1312,1323,1312,1331],{},"\n    ",[112,1314,1315,1316,1315,1319,1322],{},"\n        ",[21,1317,1318],{},"Stream input and output to reduce latency",[1320,1321],"br",{},"\n        LLMs (re-)introduce noticeable communication round-trip time. To reduce wait time for the human user, stream chunks of data whenever they are available.\n    ",[112,1324,1315,1325,1315,1328,1330],{},[21,1326,1327],{},"Provide fine-grained feedback to bridge high-latency",[1320,1329],{},"\n        Human attention is sensitive to several seconds of [system response time](https://www.nngroup.com/articles/response-times-3-important-limits/). Periodically provide agent _thoughts_ as feedback to perceptibly break down round-trip time.\n    ",[112,1332,1315,1333,1315,1336,1338],{},[21,1334,1335],{},"Always prompt the human user for consent to perform critical actions",[1320,1337],{},"\n        Some actions in a web application lead to irreversible or significant changes of state. 
Never have the agent perform such actions on behalf of the user without explicitly asking for permission.\n    ",[213,1340,1342],{"id":1341},"non-invasive-web-agents-with-webfuse","Non-Invasive Web Agents with Webfuse",[14,1344,1345,1352],{},[95,1346,1349],{"href":1347,"rel":1348},"https://www.webfuse.com",[98],[21,1350,1351],{},"Webfuse"," is a configurable web proxy that lets you augment any web application. As pictured, web agents represent highly self-contained applications. Moreover, web agents and underlying applications communicate at runtime in the client. This does, in fact, open up opportunities to overcome the above-mentioned drawbacks with Webfuse: Develop web agents with a sandbox extension methodology, and deploy them through the low-latency proxy layer. On demand, seamlessly serve users with your agent-enhanced website. Benefit from information hiding, safe code, and fewer deployments.",[204,1354],{":demoAction":1355,"heading":1356,"subtitle":1357},"{\"text\":\"Read more\",\"showIcon\":false,\"href\":\"https://www.webfuse.com/blog/category/ai-agents\"}","Deploy Web Agents with Webfuse","Develop or deploy web agents in minutes; serve agent-enhanced websites through an isolated application layer.",[540,1359,1360],{},"html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html code.shiki .sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html 
pre.shiki code .sgAC-, html code.shiki .sgAC-{--shiki-default:#40A02B;--shiki-default-font-style:italic;--shiki-dark:#ECC48D;--shiki-dark-font-style:inherit}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .srFR9, html code.shiki .srFR9{--shiki-default:#7C7F93;--shiki-dark:#7FDBCA}html pre.shiki code .s30W1, html code.shiki .s30W1{--shiki-default:#1E66F5;--shiki-dark:#7FDBCA}html pre.shiki code .sCC8C, html code.shiki 
.sCC8C{--shiki-default:#40A02B;--shiki-dark:#C789D6}",{"title":151,"searchDepth":165,"depth":165,"links":1362},[1363,1368,1374],{"id":633,"depth":165,"text":634,"children":1364},[1365,1366,1367],{"id":652,"depth":171,"text":653},{"id":674,"depth":171,"text":675},{"id":694,"depth":171,"text":695},{"id":715,"depth":165,"text":716,"children":1369},[1370,1371,1372,1373],{"id":727,"depth":171,"text":728},{"id":760,"depth":171,"text":761},{"id":782,"depth":171,"text":783},{"id":1176,"depth":171,"text":1177},{"id":1188,"depth":165,"text":1189,"children":1375},[1376,1377,1378,1379],{"id":1195,"depth":171,"text":1196},{"id":1248,"depth":171,"text":1249},{"id":1305,"depth":171,"text":1306},{"id":1341,"depth":171,"text":1342},"2025-06-15","LLMs only recently enabled serviceable web agents: autonomous systems that browse web on behalf of a human. Get started with fundamental methodology, key design challenges, and technological opportunities.",{"homepage":592,"relatedLinks":1383},[1384,1388,1392],{"text":1385,"href":1386,"description":1387},"What is a Website Snapshot?","/blog/snapshots-provide-llms-with-website-state","Learn what a website snapshot is and how to utilise it for web agents",{"text":1389,"href":1390,"description":1391},"Develop an AI Agent for Any Website with Webfuse","/blog/develop-an-ai-agent-for-any-website-with-webfuse","Learn how to develop and deploy a web agent for any website with Webfuse",{"text":1393,"href":1394,"external":592,"description":1395},"Webfuse Automation API","https://dev.webfuse.com/automation-api/","Check out the Webfuse Automation 
API","/blog/a-gentle-introduction-to-ai-agents-for-the-web",{"title":606,"description":1381},{"loc":1396},"blog/1011.a-gentle-introduction-to-ai-agents-for-the-web",[557,1401,1402,1403,1404],"browser-agents","llms","web-agents","web-automation","9anWTMfg6llLSdye3e9qWZZZcEAZcELLMk_vpnixn3M",{"id":1407,"title":1408,"authorId":1409,"body":1410,"category":2084,"created":2085,"description":2086,"extension":560,"faqs":2087,"featurePriority":2109,"head":580,"landingPath":580,"meta":2110,"navigation":592,"ogImage":580,"path":2111,"robots":580,"schemaOrg":580,"seo":2112,"sitemap":2113,"stem":2114,"tags":2115,"__hash__":2119},"blog/blog/1007.web-augmentation-comprehensive-guide.md","Web Augmentation: The Comprehensive Guide","salome-koshadze",{"type":8,"value":1411,"toc":2063},[1412,1415,1418,1422,1425,1428,1431,1463,1476,1486,1490,1493,1496,1516,1519,1522,1526,1529,1606,1609,1613,1637,1641,1663,1667,1686,1690,1693,1719,1722,1726,1733,1742,1746,1753,1756,1789,1792,1799,1802,1822,1825,1836,1840,1843,1846,1850,1883,1887,1914,1918,1949,1953,1980,1984,2011,2014,2018,2021,2054],[14,1413,1414],{},"We interact with countless websites every day and the way we use them is evolving beyond passive consumption. For a long time, we've accepted web applications as they were given to us. But what if the next development allowed you to layer your own specific requirements onto any website and share it with others?",[14,1416,1417],{},"Web augmentation makes this possible. In this comprehensive guide, I’ll walk you through its key aspects and how you can apply them to your own web interactions.",[76,1419,1421],{"id":1420},"what-is-web-augmentation","What is Web Augmentation?",[14,1423,1424],{},"Web augmentation in simple terms means changing or adding to existing websites and web applications.",[14,1426,1427],{},"Think of it as giving any site a tune-up or some new features, rather than building a whole new one from the ground up. 
The idea is to make these online tools work better for the people using them.",[14,1429,1430],{},"Modification can take many forms, from small visual adjustments to adding major new capabilities. Here are a few examples of what web augmentation can do:",[109,1432,1433,1439,1445,1451,1457],{},[112,1434,1435,1438],{},[21,1436,1437],{},"Visual Tweaks:"," Changing how a site looks, such as text sizes, colors or the layout of items.",[112,1440,1441,1444],{},[21,1442,1443],{},"Adding Features:"," Introducing a video chat window, a notes panel or an automated process directly onto a website.",[112,1446,1447,1450],{},[21,1448,1449],{},"Task Automation:"," Setting up scripts or tools to do repetitive tasks, like filling in forms or clicking through a series of pages for you.",[112,1452,1453,1456],{},[21,1454,1455],{},"Showing Extra Information:"," Displaying helpful tips, internal company notes or relevant data on top of the web page you’re looking at.",[112,1458,1459,1462],{},[21,1460,1461],{},"Connecting Two Services:"," Making two separate web tools work more smoothly, perhaps by allowing them to share information or be viewed side by side more easily.",[1464,1465,1469,1470,1475],"video",{"style":1466,"className":1467,"autoPlay":592,"muted":592,"controls":592},"width: 100%; height: auto; display: block;",[1468],"rounded-md","\n  ",[1471,1472],"source",{"src":1473,"type":1474},"https://surfly-public.s3.eu-west-1.amazonaws.com/blog/augmentation-demo.mp4","video/mp4","\n  Your browser does not support the video tag.\n",[14,1477,1478,1479,1485],{},"For a practical example of web augmentation in action, check out ",[95,1480,1484],{"href":1481,"rel":1482},"https://annotateweb.com",[1483],"dofollow","AnnotateWeb.com",", which demonstrates how you can add annotations and modifications to any website.",[76,1487,1489],{"id":1488},"why-augment-the-web","Why Augment the Web?",[14,1491,1492],{},"Standard websites and SaaS tools provide a set experience, but what happens when that 
experience doesn't quite align with our evolving needs or internal processes? This is where the real value of web augmentation begins to show.",[14,1494,1495],{},"Web augmentation offers a way to take back control and reshape these digital environments. Imagine being able to:",[109,1497,1498,1504,1510],{},[112,1499,1500,1503],{},[21,1501,1502],{},"Instantly tailor any website or SaaS application"," you use. You could modify its functions or adjust workflows to perfectly match your specific requirements, all without needing to dive into its original code or ask the vendor for changes.",[112,1505,1506,1509],{},[21,1507,1508],{},"Bypass vendor limitations and slow development cycles."," Instead of waiting for official updates, you can add important missing features, UI improvements, or custom integrations to third-party SaaS or even older legacy applications right when you need them.",[112,1511,1512,1515],{},[21,1513,1514],{},"Deploy powerful, 'browser extension-like' functionality centrally."," This means providing custom tools, data overlays, or UI enhancements directly within any website. Your team or users access these via a simple link, with zero software for them to install, removing IT headaches and user friction.",[14,1517,1518],{},"Beyond these, augmentation techniques also make it possible to integrate applications that were previously difficult to combine, such as embedding a specific CRM view into your support portal, or provide secure, interactive guidance on any site.",[14,1520,1521],{},"It's about making the web work for you, not the other way around.",[76,1523,1525],{"id":1524},"conventional-methods","Conventional Methods",[14,1527,1528],{},"Website customization isn't a new idea. Various methods have been developed over time, each with its own benefits and drawbacks. 
But the main challenges remain the same: creating modifications that are shareable, universally compatible, and scalable.",[338,1530,1531,1549],{},[341,1532,1533],{},[344,1534,1535,1539,1544],{},[347,1536,1538],{"align":1537},"left","Feature",[347,1540,1541],{"align":1537},[21,1542,1543],{},"Web Augmentation",[347,1545,1546],{"align":1537},[21,1547,1548],{},"Custom Development",[357,1550,1551,1562,1573,1584,1595],{},[344,1552,1553,1556,1559],{},[362,1554,1555],{"align":1537},"Implementation Speed",[362,1557,1558],{"align":1537},"Hours to Days",[362,1560,1561],{"align":1537},"Months to Years",[344,1563,1564,1567,1570],{},[362,1565,1566],{"align":1537},"Costs",[362,1568,1569],{"align":1537},"Low(er)",[362,1571,1572],{"align":1537},"High(er)",[344,1574,1575,1578,1581],{},[362,1576,1577],{"align":1537},"App Control",[362,1579,1580],{"align":1537},"Frontend / UI",[362,1582,1583],{"align":1537},"Fullstack",[344,1585,1586,1589,1592],{},[362,1587,1588],{"align":1537},"3rd-Party App Control",[362,1590,1591],{"align":1537},"Yes",[362,1593,1594],{"align":1537},"No",[344,1596,1597,1600,1603],{},[362,1598,1599],{"align":1537},"Long-Term Strategy",[362,1601,1602],{"align":1537},"Adaptive",[362,1604,1605],{"align":1537},"Gradual",[14,1607,1608],{},"Let's look at some of these common approaches and where they tend to fall short.",[213,1610,1612],{"id":1611},"_1-direct-source-code-modification","1. Direct Source Code Modification",[109,1614,1615,1621,1627],{},[112,1616,1617,1620],{},[21,1618,1619],{},"What it is:"," Direct source modification means getting into the actual code of a website – the HTML, CSS, JavaScript, or backend programming – and changing it.",[112,1622,1623,1626],{},[21,1624,1625],{},"The Upside:"," You have complete control. 
You can change anything and everything about how the website looks and works.",[112,1628,1629,1632,1633,1636],{},[21,1630,1631],{},"The Downside:"," You can only do this if you ",[91,1634,1635],{},"own"," the website or have explicit permission and access to its code. For any site you don't own, this method is off the table.",[213,1638,1640],{"id":1639},"_2-browser-extensions-userscripts","2. Browser Extensions & Userscripts",[109,1642,1643,1649,1654],{},[112,1644,1645,1648],{},[21,1646,1647],{},"What they are:"," These are small programs or scripts (like Chrome Extensions, Firefox Add-ons, or scripts run with tools like Tampermonkey) that you install in your own web browser. They then make changes to websites as you view them on your computer.",[112,1650,1651,1653],{},[21,1652,1625],{}," They can be great for personalizing your own browsing. There's a huge library of existing extensions, and it's relatively easy for developers to create simple ones.",[112,1655,1656,1658,1659,1662],{},[21,1657,1631],{}," The main problem is that ",[91,1660,1661],{},"every single person"," who wants the augmented experience needs to find, install, and manage the extension or script themselves. This is a huge barrier for sharing it with others.",[213,1664,1666],{"id":1665},"_3-apis-official-integrations","3. APIs & Official Integrations",[109,1668,1669,1674,1680],{},[112,1670,1671,1673],{},[21,1672,1647],{}," APIs are official ways provided by a web service owner for other applications to interact with their service, often used for data exchange or triggering specific actions.",[112,1675,1676,1679],{},[21,1677,1678],{},"Pros:"," Generally stable, secure, and supported by the service provider; designed for reliable interaction.",[112,1681,1682,1685],{},[21,1683,1684],{},"Cons:"," You are limited to only what the API developer chooses to expose. APIs often don't allow for modifying the user interface or augmenting aspects outside the specific functions they cover. 
Implementing API integrations can still be complex and require development resources.",[213,1687,1689],{"id":1688},"challenges-of-conventional-methods","Challenges of Conventional Methods",[14,1691,1692],{},"Looking at these methods, a few common problems emerge when we think about truly effective and shareable web augmentation:",[109,1694,1695,1701,1707,1713],{},[112,1696,1697,1700],{},[21,1698,1699],{},"Friction for End-Users:"," Asking people to install software (like browser extensions) is often a deal-breaker.",[112,1702,1703,1706],{},[21,1704,1705],{},"Inconsistent Experiences:"," If everyone has a different setup, the augmented experience won't be reliable.",[112,1708,1709,1712],{},[21,1710,1711],{},"Scalability and Management Problems:"," Deploying and managing augmentations for many users is tough.",[112,1714,1715,1718],{},[21,1716,1717],{},"Embedding Difficulties:"," None of these older methods easily solve the common issue of trying to embed one website inside another when security settings prevent it.",[14,1720,1721],{},"These challenges highlight why there's a need for a more advanced approach – one that can provide powerful, consistent, and easily shared augmentations without burdening the end-user or requiring access to the original website's code.",[76,1723,1725],{"id":1724},"how-web-augmentation-works","How Web Augmentation Works",[14,1727,1728,1729,1732],{},"So, how do we get past these limitations and really tap into what web augmentation can do? 
That's where ",[21,1730,1731],{},"web augmentation platforms"," like Webfuse come in.",[14,1734,1735,1736,1258,1739],{},"Webfuse is built on two core technologies: the ",[21,1737,1738],{},"Augmented Web Proxy (AWP)",[21,1740,1741],{},"Virtual Web Sessions (VWS)",[76,1743,1745],{"id":1744},"the-augmented-web-proxy-awp","The Augmented Web Proxy (AWP)",[14,1747,1748,1749,1752],{},"You can think of the ",[21,1750,1751],{},"Augmented Web Proxy",", or AWP, as the primary engine that makes web augmentation possible. It is a proxy engine within the Webfuse platform that acts as an intermediary between an end-user's browser and the target web application's servers.",[14,1754,1755],{},"AWP performs several actions in real time before content reaches the user:",[109,1757,1758,1775,1783],{},[112,1759,1760,1763,1764,1767,1768,1771,1772],{},[21,1761,1762],{},"Dynamic URL Rewriting:"," It maps original application domains to Webfuse-controlled domains. For example, ",[50,1765,1766],{},"targetapp.com"," becomes ",[50,1769,1770],{},"targetapp.webfuse.com"," or a custom domain you've configured.",[21,1773,1774],{},"‍",[112,1776,1777,1780,1781],{},[21,1778,1779],{},"Content Transformation:"," The AWP actively modifies resource paths for elements like scripts, images, links, and CSS files. This rewriting routes them correctly through the Webfuse proxy.",[21,1782,1774],{},[112,1784,1785,1788],{},[21,1786,1787],{},"Header Adjustments:"," It changes HTTP headers, such as Content-Security-Policy (CSP), X-Frame-Options, and Cross-Origin Resource Sharing (CORS) headers. These adjustments are often what allow for embedding or modifying websites that would otherwise restrict such actions.",[76,1790,1741],{"id":1791},"virtual-web-sessions-vws",[14,1793,1794,1795,1798],{},"A ",[21,1796,1797],{},"Virtual Web Session",", or VWS, is the actual interactive instance of a proxied and augmented web application that a user experiences. 
When you use a Webfuse link to access a site, the VWS is the environment where this modified version of the site is loaded and displayed.",[14,1800,1801],{},"Several things characterize a Virtual Web Session:",[109,1803,1804,1810,1816],{},[112,1805,1806,1809],{},[21,1807,1808],{},"Augmented Content Delivery:"," The VWS presents the web content after the AWP has rewritten it. This means users see the HTML, CSS, and JavaScript with all intended modifications and augmentations applied.",[112,1811,1812,1815],{},[21,1813,1814],{},"Isolated State Management:"," Each VWS maintains its own separate context for cookies, local storage, and session storage. This isolation ensures that data from one VWS does not affect other browser tabs or Webfuse sessions, even for the same website.",[112,1817,1818,1821],{},[21,1819,1820],{},"Secure JavaScript Execution:"," The original JavaScript from the target website, as well as any custom JavaScript introduced by Webfuse Session Extensions or Apps, runs within a sandboxed environment inside the VWS. This controlled execution helps manage how scripts interact with the page and each other.",[14,1823,1824],{},"Webfuse uses a stateless proxying approach for these sessions. This means it transparently forwards requests, helping to maintain user authentication and context with the original web application while the VWS itself remains a distinct, virtualized environment.",[14,1826,1827,1828,1831,1832,1835],{},"Those technologies together unlock advanced web capabilities. 
For example, because the platform has such deep insight into the session, it can generate detailed ",[21,1829,1830],{},"audit logs"," of user interactions (like form changes or clicks) and even create ",[21,1833,1834],{},"full video recordings"," of sessions for review or compliance, often without needing any changes to the original application.",[76,1837,1839],{"id":1838},"real-world-applications","Real-World Applications",[14,1841,1842],{},"Overcoming traditional web augmentation limitations, Webfuse opens up possibilities that weren't feasible before. Companies can add compliance tracking to any web app, teams can customize third-party tools without breaking them, and anyone can make websites work exactly how they need them to.",[14,1844,1845],{},"Here are some specific examples of these use cases:",[213,1847,1849],{"id":1848},"enterprise-compliance-security-overlays","Enterprise Compliance & Security Overlays",[109,1851,1852,1858,1877],{},[112,1853,1854,1857],{},[21,1855,1856],{},"The Challenge:"," Businesses need to ensure their teams follow strict data handling procedures, security protocols, and industry regulations (like GDPR, HIPAA, or PCI DSS) when using various web-based tools. 
This is especially tricky with third-party applications where the company doesn't control the source code.",[112,1859,1860,1863],{},[21,1861,1862],{},"The Augmentation Solution:",[109,1864,1865,1868,1871,1874],{},[112,1866,1867],{},"Injecting real-time data validation rules directly into web forms to prevent incorrect or non-compliant data entry.",[112,1869,1870],{},"Using element masking features to automatically hide or redact Personally Identifiable Information (PII) or other sensitive data from view or from being logged, even if the original site displays it.",[112,1872,1873],{},"Implementing detailed audit logging for specific user actions (e.g., viewing a customer record, exporting data) within any web application.",[112,1875,1876],{},"Displaying mandatory security warnings, disclaimers, or internal policy reminders contextually within relevant web applications.",[112,1878,1879,1882],{},[21,1880,1881],{},"The Advantage:"," Consistent policy enforcement across different applications without needing to modify their core systems. The zero end-user install nature ensures all relevant users automatically get the compliant, augmented version. 
This can drastically reduce risk and simplify compliance audits.",[213,1884,1886],{"id":1885},"automated-website-workflows-ai-powered-assistance","Automated Website Workflows & AI-Powered Assistance",[109,1888,1889,1894,1909],{},[112,1890,1891,1893],{},[21,1892,1856],{}," Users often struggle with complex, multi-step processes on websites, or teams spend time on repetitive data entry, navigation, or information retrieval tasks.",[112,1895,1896,1898],{},[21,1897,1862],{},[109,1899,1900,1903,1906],{},[112,1901,1902],{},"Deploying scripts that automate form filling, button clicking, or navigation through a sequence of pages based on predefined logic or user triggers.",[112,1904,1905],{},"Enabling \"virtual participants\" or bots to perform routine tasks 24/7 within an augmented session, such as checking for updates, extracting data, or monitoring information.",[112,1907,1908],{},"Embedding AI-powered agents directly into the session. These agents could provide real-time guidance, answer user questions contextually by understanding the content of the page, summarize information, or even assist in completing complex tasks.",[112,1910,1911,1913],{},[21,1912,1881],{}," Simplifies complex user journeys, significantly reduces manual effort and the potential for human error, and provides intelligent, in-context assistance, all delivered seamlessly via a shared link.",[213,1915,1917],{"id":1916},"secure-real-time-support-co-browsing","Secure Real-Time Support & Co-browsing",[109,1919,1920,1925,1944],{},[112,1921,1922,1924],{},[21,1923,1856],{}," Providing effective, hands-on support for customers or employees navigating web applications. 
Traditional screen sharing tools can be invasive, raise privacy concerns, require software installations, and might not work well in restricted environments.",[112,1926,1927,1929],{},[21,1928,1862],{},[109,1930,1931,1938,1941],{},[112,1932,1933,1934,1937],{},"Secure \"co-browsing\" sessions where a support agent and a user interact with the ",[91,1935,1936],{},"same"," virtual web session through a shared platform link.",[112,1939,1940],{},"Using built-in data masking to hide sensitive user information (like account balances or personal details) from the support agent's view, even while co-browsing.",[112,1942,1943],{},"Logging all actions during the support session for security, training, and audit purposes.",[112,1945,1946,1948],{},[21,1947,1881],{}," A much smoother, more secure, and context-aware support experience. Issues are resolved faster, improving customer/employee satisfaction. No software installation is required for the person needing help, making it highly accessible.",[213,1950,1952],{"id":1951},"integrating-apps-for-unified-dashboards","Integrating Apps for Unified Dashboards",[109,1954,1955,1960,1975],{},[112,1956,1957,1959],{},[21,1958,1856],{}," Many organizations want to create unified digital workplaces or custom dashboards that bring together various essential web applications (e.g., CRM, support ticketing, project management, financial tools). However, they are often blocked because these applications use security headers (like X-Frame-Options or Content-Security-Policy) that prevent them from being embedded in an iframe on another site.",[112,1961,1962,1964],{},[21,1963,1862],{},[109,1965,1966,1972],{},[112,1967,1968,1969,100],{},"The augmentation platform wraps the target web application (e.g., Salesforce, Zendesk, Jira) within its virtual session technology. 
This process can effectively manage the problematic embedding-prevention headers ",[91,1970,1971],{},"within the context of that session",[112,1973,1974],{},"The resulting augmented session, now containing the fully functional target application, can then be embedded using a standard iframe into a central portal, intranet, or another web application.",[112,1976,1977,1979],{},[21,1978,1881],{}," Enables the creation of truly integrated workspaces. Users can access and interact with multiple, full-featured web applications from a single interface, streamlining workflows and reducing the need to constantly switch tabs.",[213,1981,1983],{"id":1982},"upgrading-legacy-systems-without-rewrites","Upgrading Legacy Systems Without Rewrites",[109,1985,1986,1991,2006],{},[112,1987,1988,1990],{},[21,1989,1856],{}," Many organizations rely on older, internal web-based systems that are critical to their operations but have outdated user interfaces, clunky navigation, or lack modern functionalities. Rewriting or replacing these legacy systems can be prohibitively expensive, time-consuming, and risky.",[112,1992,1993,1995],{},[21,1994,1862],{},[109,1996,1997,2000,2003],{},[112,1998,1999],{},"Creating an augmentation layer that overlays a more modern, intuitive UI on top of the legacy system.",[112,2001,2002],{},"Simplifying navigation by hiding unused elements, rearranging fields for better workflow, or adding helpful tooltips and guides.",[112,2004,2005],{},"Injecting new functionalities using custom scripts – such as data lookups from other systems, quick links to related resources, or integrations with modern messaging tools – all without touching the legacy system's underlying code.",[112,2007,2008,2010],{},[21,2009,1881],{}," Extends the useful life of critical legacy systems, dramatically improves user experience and efficiency, and reduces training time for new employees. 
It offers a cost-effective alternative to a full rewrite, allowing organizations to get more value from their existing investments.",[14,2012,2013],{},"These examples are just a starting point. The core idea is that advanced web augmentation technology enables users and organizations to overcome common barriers related to access, control, usability, and integration, fundamentally changing how we can shape and interact with virtually any web application or website.",[76,2015,2017],{"id":2016},"final-take-aways","Final Takeaways",[14,2019,2020],{},"Throughout this guide, we've explored the concept of web augmentation and how it allows for the modification and enhancement of existing web platforms. Here’s a quick look back at the main points:",[109,2022,2023,2028,2034,2040,2048],{},[112,2024,2025,2027],{},[21,2026,1619],{}," Web augmentation involves changing or adding functionalities to websites and web applications to improve their utility or fit specific needs, without altering the original source code.",[112,2029,2030,2033],{},[21,2031,2032],{},"Why it's needed:"," Many standard web tools offer a \"one-size-fits-all\" experience. Augmentation allows users and organizations to customize these tools, adapt to new requirements quickly, and gain more control over their digital environment.",[112,2035,2036,2039],{},[21,2037,2038],{},"Conventional methods:"," Earlier approaches like direct source code modification, browser extensions, and APIs have their uses but often come with limitations in shareability, scalability, and end-user friction.",[112,2041,2042,2045,2046],{},[21,2043,2044],{},"Next-gen solutions:"," Modern web augmentation platforms utilize technologies like the Augmented Web Proxy (AWP) and Virtual Web Sessions (VWS). 
These allow for changes to be applied centrally and experienced seamlessly by users through a simple link, without any installation.",[21,2047,1774],{},[112,2049,2050,2053],{},[21,2051,2052],{},"Real-world value:"," We saw several practical applications, including enforcing enterprise compliance, automating workflows, providing secure co-browsing support, integrating disparate applications into unified dashboards, and upgrading legacy systems.",[14,2055,2056,2057,2062],{},"Ready to customize your web applications? ",[95,2058,2061],{"href":2059,"rel":2060},"https://webfuse.com/studio/auth/signup",[98],"Sign up"," and begin adapting web tools to your needs.",{"title":151,"searchDepth":165,"depth":165,"links":2064},[2065,2066,2067,2073,2074,2075,2076,2083],{"id":1420,"depth":165,"text":1421},{"id":1488,"depth":165,"text":1489},{"id":1524,"depth":165,"text":1525,"children":2068},[2069,2070,2071,2072],{"id":1611,"depth":171,"text":1612},{"id":1639,"depth":171,"text":1640},{"id":1665,"depth":171,"text":1666},{"id":1688,"depth":171,"text":1689},{"id":1724,"depth":165,"text":1725},{"id":1744,"depth":165,"text":1745},{"id":1791,"depth":165,"text":1741},{"id":1838,"depth":165,"text":1839,"children":2077},[2078,2079,2080,2081,2082],{"id":1848,"depth":171,"text":1849},{"id":1885,"depth":171,"text":1886},{"id":1916,"depth":171,"text":1917},{"id":1951,"depth":171,"text":1952},{"id":1982,"depth":171,"text":1983},{"id":2016,"depth":165,"text":2017},"web-augmentation","2025-05-26","Discover how to enhance any website with web augmentation. How Augmented Web Proxies and Virtual Web Sessions enable secure co-browsing, automation, compliance overlays, and more.",[2088,2091,2094,2097,2100,2103,2106],{"question":2089,"answer":2090},"What is web augmentation in simple terms?","Web augmentation means making changes or adding new capabilities to existing websites and web applications. 
You can think of it as giving a website a custom tune-up to better suit specific user or business requirements, without needing to build a new site from scratch.",{"question":2092,"answer":2093},"How is web augmentation different from directly changing a website's source code?","Direct source code modification requires you to own the website or have access to its original code. Web augmentation, on the other hand, allows you to apply changes to websites, even those you don't own, by layering modifications on top of the existing site as it's viewed.",{"question":2095,"answer":2096},"What are the main advantages of web augmentation platforms compared to browser extensions?","Browser extensions typically require each user to install them individually, which can be a barrier for widespread use. Web augmentation platforms can deliver a modified experience to many users through a simple link, with no software installation needed by the end-users, making sharing and management much simpler.",{"question":2098,"answer":2099},"How do modern web augmentation systems deliver changes without users installing software?","These systems often use a type of intermediary server. When you access a website through a link provided by the augmentation platform, this server modifies the website's content in real time before it reaches your browser. This way, you see the changed version without needing any special software on your device.",{"question":2101,"answer":2102},"Can web augmentation help businesses apply compliance rules to third-party web tools?","Yes, web augmentation can be very useful here. For instance, companies can add real-time data checks to forms on any web application, mask sensitive information from view, or create detailed logs of user actions. 
This helps enforce company policies and industry regulations across various web tools without altering the original applications.",{"question":2104,"answer":2105},"Is it possible to improve old or outdated web systems using web augmentation?","Absolutely. Web augmentation can provide a much-needed update to legacy web systems without costly rewrites. You could overlay a more modern user interface, simplify complex navigation, or inject new functionalities like connections to other services, all while the original underlying system remains untouched.",{"question":2107,"answer":2108},"What are some major benefits of using web augmentation for organizations?","Organizations benefit by customizing third-party apps to fit their workflows, adding features without vendor delays, deploying tools centrally via links, enhancing security compliance, and extending the lifespan of older systems with modern interfaces.",0,{"homepage":592},"/blog/web-augmentation-comprehensive-guide",{"title":1408,"description":2086},{"loc":2111},"blog/1007.web-augmentation-comprehensive-guide",[2084,2116,2117,599,2118],"augmented-web-proxies","virtual-web-sessions","compliance-overlays","3pRJ8uLYLV7YdJxdIr22Pnm_-8tj_zs2fPF_6qtQpY8",1779282829675]