Puppeteer Cheat Sheet

Complete quick reference for Puppeteer - the high-level Node.js library for driving Chrome and Firefox over the DevTools Protocol or WebDriver BiDi. Headless by default, scripted automation, scraping, and PDF rendering.

Chrome DevTools team · Apache 2.0
Node.js · cross-platform
Last updated: April 28, 2026

What Is Puppeteer?

High-level Node.js library for headless Chrome and Firefox - scripted automation, scraping, testing, and PDF generation

The Core Idea

Puppeteer is a Node.js library (Apache 2.0), maintained by the Chrome DevTools team, that drives a real browser - Chrome for Testing by default, or Firefox via WebDriver BiDi. With Chrome it speaks the Chrome DevTools Protocol (CDP) directly, giving fine-grained control of page events: navigation, network, console, performance, accessibility. Modern Puppeteer favours the Locator API (auto-waiting plus actionability checks) over the legacy $/waitForSelector flow, and runs headless by default.

puppeteer vs puppeteer-core

puppeteer

Full package. Downloads a matching Chrome for Testing build (~170-282 MB) into ~/.cache/puppeteer on install. Pick this for local dev and self-contained CI.

$ npm i puppeteer
import puppeteer from 'puppeteer';
puppeteer-core

Library only - no browser download. Use when you bring your own Chrome (system install, Lambda layer, container) or connect to a remote endpoint.

$ npm i puppeteer-core
import puppeteer from 'puppeteer-core';
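
Because puppeteer-core ships no browser, each launch needs an explicit executablePath or channel. The sketch below shows one way to resolve that choice; the helper name resolveCoreLaunchOptions and the CHROME_PATH variable are assumptions made for illustration, not something Puppeteer defines or reads.

```javascript
// Hypothetical helper: decide how puppeteer-core should find a browser.
// CHROME_PATH is an illustrative variable name, not one Puppeteer reads itself.
function resolveCoreLaunchOptions(env = process.env) {
  if (env.CHROME_PATH) {
    // Explicit binary, e.g. a Lambda layer or a container path.
    return { executablePath: env.CHROME_PATH };
  }
  // Otherwise ask Puppeteer to locate an installed Chrome release channel.
  return { channel: 'chrome' };
}

// Usage (assumes puppeteer-core is installed):
//   import puppeteer from 'puppeteer-core';
//   const browser = await puppeteer.launch(resolveCoreLaunchOptions());
```

Keeping the decision in one function makes it easy to swap the binary per environment without touching the scripts that launch the browser.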

Core Strengths

Direct CDP access - low-level control over every Chrome subsystem (Network, Page, Runtime, Performance).
Modern Locator API - page.locator() auto-waits for visibility, enabledness, stable bbox.
P-pseudo-selectors - ::-p-text, ::-p-aria, ::-p-xpath, >>> shadow-piercing.
BiDi support - cross-browser via WebDriver BiDi (default for Firefox).
Cooperative request interception - block, modify, mock; multiple handlers compose cleanly.
First-class PDF - print-fidelity rendering with font-loading guarantees.
Browser isolation - createBrowserContext() for clean session boundaries.

Browsers & Protocols

Chrome (default)

Drives Chrome for Testing over CDP. Full feature surface - extensions, accessibility, heap snapshots, screencast.

Firefox

Stable Firefox via WebDriver BiDi (default). No browser patching, standards-based protocol.

Headless modes

headless: true (new headless, real Chrome rendering pipeline) is the default. Set false to run headed for debugging.

Protocols

Pick per-launch with protocol: 'cdp' or 'webDriverBiDi'.
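
The browser, headless, and protocol choices above map onto a handful of launch options. A small sketch follows; launchOptionsFor is a hypothetical helper, and it assumes a recent release in which the option is named browser (older releases used product).

```javascript
// Hypothetical helper: translate a browser name into launch options.
function launchOptionsFor(browserName, { headed = false } = {}) {
  if (browserName !== 'chrome' && browserName !== 'firefox') {
    throw new Error(`Unsupported browser: ${browserName}`);
  }
  return {
    browser: browserName,
    headless: !headed,
    // Firefox is driven over WebDriver BiDi; Chrome uses CDP here.
    protocol: browserName === 'firefox' ? 'webDriverBiDi' : 'cdp',
  };
}

// Usage (assumes puppeteer is installed and Firefox was fetched with
// `npx puppeteer browsers install firefox`):
//   const browser = await puppeteer.launch(launchOptionsFor('firefox'));
```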

Key Terminology

Browser
A single browser process. Spawn via puppeteer.launch() or attach with connect().
BrowserContext
An isolated session - separate cookies, storage, permissions. Like an incognito window. Replaces the old createIncognitoBrowserContext.
Page
A single tab inside a context. Most user-facing API hangs off Page.
Locator
Lazy, retrying handle to an element. Auto-waits for visibility, enabledness, stable bounding box, and viewport before acting.
ElementHandle
Eager handle to a specific DOM node. Must be disposed (.dispose()) to avoid leaks. Prefer Locator.
CDP Session
Direct DevTools Protocol channel via page.createCDPSession(). Escape hatch for features not in the high-level API.
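
To make the CDP Session entry concrete, here is a minimal sketch that reads Chrome's performance counters through a raw session. The helper name readPerformanceMetrics is an assumption; Performance.enable and Performance.getMetrics are standard CDP methods, and the sketch applies to Chrome over CDP only.

```javascript
// Hypothetical helper: collect CDP performance metrics as a plain object.
// `page` is a Puppeteer Page supplied by the caller.
async function readPerformanceMetrics(page) {
  const client = await page.createCDPSession();
  try {
    await client.send('Performance.enable');
    const { metrics } = await client.send('Performance.getMetrics');
    // metrics is an array of { name, value } pairs; flatten it for convenience.
    return Object.fromEntries(metrics.map((m) => [m.name, m.value]));
  } finally {
    await client.detach(); // release the session even if a command fails
  }
}

// Usage (assumes an open page):
//   const metrics = await readPerformanceMetrics(page);
//   console.log(metrics.JSHeapUsedSize);
```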

Installation & Setup

Install, configure browser caching, and run reliably in containers

Hello-World Script

Minimal end-to-end: install, launch, navigate, screenshot, close.

$ npm i puppeteer

// hello.mjs
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch();
const page = await browser.newPage();

await page.goto('https://pptr.dev', { waitUntil: 'networkidle2' });
await page.screenshot({ path: 'pptr.png', fullPage: true });

await browser.close();

Configuration File

Puppeteer searches up the directory tree for .puppeteerrc.cjs, .puppeteerrc.json, or puppeteer.config.js. Use it to control browser downloads and the cache directory.

// .puppeteerrc.cjs
const { join } = require('path');

module.exports = {
  chrome:  { skipDownload: false },
  firefox: { skipDownload: false },
  cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
};

After changing config, sync browsers:

$ npx puppeteer browsers install

Environment Variables

Override config without touching files - handy for CI, Docker, and serverless.

PUPPETEER_CACHE_DIR

Where browsers are stored (default: ~/.cache/puppeteer).

PUPPETEER_EXECUTABLE_PATH

Use a system Chrome instead of the downloaded build. puppeteer-core does not read PUPPETEER_* variables, so pass executablePath or channel to launch() there.

PUPPETEER_SKIP_DOWNLOAD

Skip the post-install browser download (older releases used PUPPETEER_SKIP_CHROMIUM_DOWNLOAD).

HTTP_PROXY · HTTPS_PROXY

Tunnel browser downloads through a corporate proxy.

DEBUG=puppeteer:*

Verbose protocol/runtime logs - turn on for diagnostics.

@puppeteer/browsers CLI

Standalone CLI for managing Chrome / Firefox builds outside the Puppeteer install lifecycle. Useful for prebuilt CI images.

# Install latest stable Chrome for Testing
$ npx @puppeteer/browsers install chrome@stable

# Install Firefox stable
$ npx @puppeteer/browsers install firefox@stable

# List installed builds
$ npx @puppeteer/browsers list

# Pin a specific revision (use the version you want)
$ npx @puppeteer/browsers install chrome@<version>

Docker & CI

Containerised Chrome needs sandbox flags relaxed and a larger /dev/shm. Use the official image to skip system-dep wrangling.

Official image
# Dockerfile
FROM ghcr.io/puppeteer/puppeteer:latest

WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
CMD ["node", "index.mjs"]
Run flags
$ docker run --rm \
    --shm-size=2gb \
    --cap-add=SYS_ADMIN \
    -v $PWD:/app \
    my-pptr-image
Launch args (the two sandbox flags are only needed when the container runs without --cap-add=SYS_ADMIN)
args: [
  '--no-sandbox',
  '--disable-setuid-sandbox',
  '--disable-dev-shm-usage',
  '--disable-gpu',
]
Only disable the sandbox in trusted environments. The Chrome sandbox is a critical security boundary - never run untrusted URLs with --no-sandbox on a host that holds secrets.
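
One way to honour that warning is to add the sandbox-disabling flags only when the container genuinely cannot run the sandbox. A minimal sketch follows; containerLaunchOptions and its sandboxAvailable flag are hypothetical names, and how you detect sandbox support (for example, knowing the container was started with --cap-add=SYS_ADMIN) is left to the caller.

```javascript
// Hypothetical helper: build launch options for a containerised Chrome.
function containerLaunchOptions({ sandboxAvailable = false } = {}) {
  // Avoid the small default /dev/shm by letting Chrome use /tmp instead.
  const args = ['--disable-dev-shm-usage'];
  if (!sandboxAvailable) {
    // Only relax the sandbox when it cannot work in this container.
    args.push('--no-sandbox', '--disable-setuid-sandbox');
  }
  return { headless: true, args };
}

// Usage (assumes puppeteer is installed):
//   const browser = await puppeteer.launch(
//     containerLaunchOptions({ sandboxAvailable: true })
//   );
```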

Core API

Browser → BrowserContext → Page hierarchy, launch options, and remote connections

The Object Hierarchy

Each level adds isolation. Reuse the Browser, create fresh BrowserContexts for parallel work, and Pages for tabs.

const browser  = await puppeteer.launch({ headless: true });
const context  = await browser.createBrowserContext();   // isolated session
const page     = await context.newPage();                  // or browser.newPage() for default ctx

await page.goto('https://example.com');
// ... do work ...

await page.close();
await context.close();      // drops cookies + storage
await browser.close();

Launch Options

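Commonly used options with a comment on each knob. Treat it as a menu rather than a recommended set: acceptInsecureCerts, --no-sandbox, and dumpio are debugging or container conveniences, not production defaults.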

await puppeteer.launch({
  headless: true,        // false for visual debug
  slowMo: 0,             // ms between steps
  protocol: 'cdp',        // 'webDriverBiDi' for Firefox
  browser: 'chrome',      // or 'firefox' (older releases: product)
  channel: 'chrome',      // use an installed Chrome release channel...
  // executablePath: '/usr/bin/chrome', // ...or point at a binary instead of channel
  userDataDir: './profile',// persistent cookies
  defaultViewport: { width: 1280, height: 800 },
  acceptInsecureCerts: true,
  ignoreDefaultArgs: ['--enable-automation'],
  args: [
    '--no-sandbox',
    '--disable-dev-shm-usage',
    '--disable-blink-features=AutomationControlled',
  ],
  dumpio: true,           // pipe browser stdio
  timeout: 30_000,
});

BrowserContext - Isolation

Cheap session boundary - separate cookies, storage, permissions, even per-context proxies. Replaces the deprecated createIncognitoBrowserContext.

// Per-context proxy + clean session
const ctx = await browser.createBrowserContext({
  proxyServer: 'http://proxy:8080',
  proxyBypassList: ['localhost'],
});

await ctx.overridePermissions(
  'https://app.example.com',
  ['geolocation', 'notifications']
);

const page = await ctx.newPage();
// ... when done
await ctx.close();
Pattern: one Browser, N contexts, M pages per context. Reuse the Browser - launch is the expensive part.
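
The pattern can be sketched as a small wrapper that hands each task a page in a fresh context and always tears the context down. withContext is a hypothetical helper name; createBrowserContext(), newPage(), and close() are the Puppeteer calls already shown above.

```javascript
// Hypothetical helper: run `task` against a page in a fresh, isolated context.
// `browser` is an already-launched Puppeteer Browser that the caller keeps reusing.
async function withContext(browser, task) {
  const context = await browser.createBrowserContext();
  try {
    const page = await context.newPage();
    return await task(page);
  } finally {
    // Closing the context also closes its pages and drops cookies and storage.
    await context.close();
  }
}

// Usage (assumes a launched browser):
//   const title = await withContext(browser, async (page) => {
//     await page.goto('https://example.com');
//     return page.title();
//   });
```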

Connecting to a Running Browser

Reuse a long-lived browser process - faster cold starts and ideal for browser farms.

// 1. Start Chrome with a remote port
$ chrome --remote-debugging-port=9222

// 2. Attach
const browser = await puppeteer.connect({
  browserURL: 'http://localhost:9222',
  // or browserWSEndpoint: 'ws://...'
});

// disconnect WITHOUT closing the browser process (returns a promise)
await browser.disconnect();

Use puppeteer-core for connect-only setups - skips the bundled Chrome download.

Bulletproof Cleanup

Always close the browser even on errors - orphaned Chrome processes leak memory and file descriptors.

// Wrapped in a function so that `return` and `url` are valid
async function scrape(url) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2', timeout: 30_000 });
    return await page.content();
  } catch (err) {
    console.error('Scrape failed:', err);
    throw err;
  } finally {
    await browser.close();   // runs on success AND failure
  }
}

Locators & Selectors

Modern Locator API with auto-waiting, P-pseudo selectors, and shadow-DOM piercing

Preferred - Locator API

page.locator() returns a lazy handle. Before each action it auto-waits for the element to be visible, enabled, have a stable bounding box, and be in the viewport. No more waitForSelector + click dance.

Actions
await page.locator('button.submit').click();
await page.locator('#email').fill('a@b.co');
await page.locator('.tooltip').hover();
await page.locator('#footer').scroll();
await page.locator('.spinner').wait();
Configuration (per locator)
await page.locator('#late')
  .setTimeout(10_000)
  .setVisibility(null) // don't require visible
  .setEnsureElementIsInTheViewport(false)
  .setWaitForEnabled(false)
  .click();

P-Pseudo Selectors

Puppeteer-only selector extensions - text, ARIA, XPath, all composable with regular CSS.

// Visible text (works inside shadow DOM)
page.locator('::-p-text(Checkout)');

// ARIA role + accessible name
page.locator(
  '::-p-aria([name="Submit"][role="button"])'
);

// XPath as a selector
page.locator('::-p-xpath(//h1[1])');

// Compose with regular CSS
page.locator(
  'form >>> ::-p-text(Save)'
);

P-selectors prefer user-facing attributes over fragile CSS paths - much more resilient to DOM refactors.

Shadow DOM Piercing

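Web Components hide internals behind shadow roots. Puppeteer pierces them with two combinators: >>> (deep descendant - searches every shadow root beneath the element) and >>>> (deep child - enters only the element's immediate shadow root).

// Deep descendant: cross any number of shadow boundaries
page.locator(
  'my-app >>> my-button'
);

// Deep child: only my-app's own shadow root (narrower match)
page.locator(
  'my-app >>>> .label'
);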

// Combined with text selector
page.locator(
  'my-modal >>> ::-p-text(Confirm)'
).click();

Filtering, Custom Predicates & Conditional Wait

Locators accept callbacks for filtering and even arbitrary conditions on the page.

// Filter by predicate
await page
  .locator('button')
  .filter(el => el.textContent === 'My button')
  .click();

// Wait for an arbitrary DOM condition
await page
  .locator(() => document.querySelectorAll('p').length >= 3)
  .wait();

// Race multiple locators (first to resolve wins)
// needs: import { Locator } from 'puppeteer';
await Locator.race([
  page.locator('::-p-text(Accept)'),
  page.locator('::-p-text(Allow)'),
]).click();

Legacy Selectors (still supported)

Use these only when you need the eager ElementHandle - e.g. passing an element into page.evaluate or computing client-side metrics.

// Single match (or null)
const el = await page.$('#main h1');

// All matches
const rows = await page.$$('tr.row');

// Wait for selector + return handle
// disabled is a boolean attribute, so match its absence
const btn = await page.waitForSelector(
  'button:not([disabled])',
  { visible: true, timeout: 5000 }
);
// Quick eval shortcuts
const title = await page.$eval(
  'h1', el => el.textContent
);

const hrefs = await page.$$eval(
  'a', links => links.map(a => a.href)
);

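// IMPORTANT: dispose every handle you obtained (page.$ may return null)
await el?.dispose();
await btn?.dispose();
await Promise.all(rows.map(row => row.dispose()));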
Memory leak warning: every $/$$/waitForSelector call returns a handle that pins JS objects in the browser. Always .dispose() or use the Locator API.

Network, Navigation & Waits

Request interception, navigation, cookies, and reliable waiting strategies

Navigation

Pick the right waitUntil for the page you're on. networkidle2 is a good default for SPAs that fire trickling analytics requests.

await page.goto(url, {
  waitUntil: 'networkidle2', // or 'load' / 'domcontentloaded' / 'networkidle0'
  timeout: 30_000,
  referer: 'https://example.com',
});

await page.goBack();
await page.goForward();
await page.reload();

load: load event fired. domcontentloaded: DOMContentLoaded event fired. networkidle0: no network connections for at least 500 ms. networkidle2: no more than 2 network connections for at least 500 ms.

Waiting Strategies

Always wait for a condition, not a fixed sleep. Most waitFor* APIs accept an AbortSignal.

// Until predicate is true in the page
await page.waitForFunction(
  () => document.title.includes('Loaded'),
  { polling: 'raf', timeout: 10_000 }
);

// Until network quiets down
await page.waitForNetworkIdle({
  idleTime: 500,
  concurrency: 0,
});

// Until matching response
await page.waitForResponse(
  res => res.url().includes('/api/user') && res.ok()
);

// AbortSignal for app-level timeout
const ac = new AbortController();
setTimeout(() => ac.abort(), 5000);
await page.waitForSelector('#ok', { signal: ac.signal });

Cooperative Request Interception

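Block heavy assets, modify headers, or stub responses. Handlers compose only in cooperative mode, where every handler passes a priority to continue(), abort(), or respond(). Without a priority, the first call resolves the request immediately and any later call on the same request throws.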

Block heavy resources (scraper speedup)
await page.setRequestInterception(true);

page.on('request', req => {
  const blocked = ['image', 'media',
                   'font', 'stylesheet'];
  if (blocked.includes(req.resourceType())) {
    req.abort();
  } else {
    req.continue();
  }
});
Mock a response
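// Needs page.setRequestInterception(true); register before other handlers
page.on('request', req => {
  if (req.isInterceptResolutionHandled()) return; // already resolved elsewhere
  if (req.url().endsWith('/api/me')) {
    return req.respond({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ id: 1, name: 'Ada' }),
    });
  }
  req.continue();
});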
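Each request must end up resolved by exactly one continue, abort, or respond (in cooperative mode the highest priority wins); guard with request.isInterceptResolutionHandled() when several handlers are attached. Resource types: document, stylesheet, image, media, font, script, xhr, fetch, websocket, ...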
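
A minimal sketch of two handlers composing in cooperative mode, where each passes a priority as the last argument. The handler names and the priority values are illustrative assumptions; isInterceptResolutionHandled(), continueRequestOverrides(), and the priority parameters are part of Puppeteer's HTTPRequest API.

```javascript
// Cooperative mode: every resolution call passes a priority (0 is the baseline).
const DEFAULT_PRIORITY = 0;
const HEAVY_TYPES = new Set(['image', 'media', 'font']);

// Handler 1: vote to abort heavy assets, otherwise vote to continue.
function blockHeavyAssets(request) {
  if (request.isInterceptResolutionHandled()) return; // resolved by a legacy handler
  if (HEAVY_TYPES.has(request.resourceType())) {
    request.abort('blockedbyclient', DEFAULT_PRIORITY);
  } else {
    request.continue(request.continueRequestOverrides(), DEFAULT_PRIORITY);
  }
}

// Handler 2: stub one endpoint at a higher priority so it outranks handler 1.
function mockProfile(request) {
  if (request.isInterceptResolutionHandled()) return;
  if (request.url().endsWith('/api/me')) {
    request.respond(
      {
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify({ id: 1, name: 'Ada' }),
      },
      DEFAULT_PRIORITY + 1
    );
  }
  // No else branch: handler 1 already votes "continue" for everything else.
}

// Wiring: `page` is a Puppeteer Page supplied by the caller.
async function enableInterception(page) {
  await page.setRequestInterception(true);
  page.on('request', blockHeavyAssets);
  page.on('request', mockProfile);
}
```

Because both handlers only cast prioritised votes, their registration order does not matter; Puppeteer applies the highest-priority resolution once all handlers have run.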

Cookies, Headers & Auth

// Custom HTTP headers
await page.setExtraHTTPHeaders({
  'X-Custom': 'value',
  'Accept-Language': 'en-US',
});

// Basic auth
await page.authenticate({
  username: 'user', password: 'pass'
});

// Cookies
await page.setCookie({
  name: 'session', value: 'abc',
  domain: '.example.com', secure: true,
});

await browser.deleteMatchingCookies({ name: 'session' });

Evaluating in the Page

Code passed to evaluate runs in the browser. Variables don't cross - pass them as args.

const count = await page.evaluate(
  (selector) => document.querySelectorAll(selector).length,
  '.row'      // passed as arg
);

// Use a handle in evaluate
const btn = await page.$('button');
const rect = await page.evaluate(
  el => el.getBoundingClientRect().toJSON(),
  btn
);
await btn.dispose();
Don't return DOM nodes from evaluate - results are JSON-serialised. Return primitives or use evaluateHandle.
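
When a node reference is genuinely needed, evaluateHandle keeps it in the browser and gives Node a handle instead of a serialised copy. A minimal sketch follows; countChildren is a hypothetical helper name.

```javascript
// Hypothetical helper: count the child elements of the first match for `selector`.
// `page` is a Puppeteer Page supplied by the caller.
async function countChildren(page, selector) {
  // The handle references the in-page node (or null); nothing is serialised yet.
  const handle = await page.evaluateHandle(
    (sel) => document.querySelector(sel),
    selector
  );
  try {
    // Pass the handle back into the page; only the number crosses to Node.
    return await page.evaluate(
      (node) => (node ? node.childElementCount : 0),
      handle
    );
  } finally {
    await handle.dispose(); // release the in-page reference
  }
}

// Usage (assumes an open page):
//   const n = await countChildren(page, 'ul.results');
```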

Screenshots, PDF & Emulation

Capture, render print-quality PDFs, and emulate devices, geolocation, network, and permissions

Screenshots

// Full-page PNG
await page.screenshot({
  path: 'shot.png',
  fullPage: true,
  omitBackground: true, // transparent BG
});

// Region clip
await page.screenshot({
  path: 'hero.jpg',
  type: 'jpeg',
  quality: 85,
  clip: { x: 0, y: 0, width: 1200, height: 600 },
});

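// Single element (Locator has no screenshot(); resolve a handle first)
const card = await page.locator('.card').waitHandle();
await card.screenshot({ path: 'card.png' });
await card.dispose();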

Returns the image bytes (a Uint8Array in recent releases; older ones returned a Buffer). Omit path to skip writing a file - useful for streaming uploads.

PDF Generation

Chrome's print pipeline. waitForFonts defaults to true - no more flash-of-fallback-font.

await page.pdf({
  path: 'invoice.pdf',
  format: 'A4',           // or 'Letter', etc.
  printBackground: true,
  preferCSSPageSize: true, // honour @page CSS
  margin: { top: '1cm', bottom: '1cm' },
  displayHeaderFooter: true,
  headerTemplate: '<div>Header</div>',
  footerTemplate: '<div><span class="pageNumber"></span></div>',
  waitForFonts: true,
  tagged: true,         // accessible PDF
});

Device Emulation

import { KnownDevices } from 'puppeteer';

const iPhone = KnownDevices['iPhone 15 Pro'];
await page.emulate(iPhone);

// Manual override
await page.setViewport({
  width: 390, height: 844,
  deviceScaleFactor: 3, isMobile: true,
  hasTouch: true,
});

await page.setUserAgent(
  'Mozilla/5.0 (iPhone; ...)'
);

Geolocation, Permissions, Network

// Per-context permissions
await context.overridePermissions(
  'https://maps.example.com',
  ['geolocation', 'clipboard-read']
);

await page.setGeolocation({
  latitude: 52.520008, longitude: 13.404954
});

// Time zone
await page.emulateTimezone('Europe/Berlin');

// Throttle network
await page.emulateNetworkConditions({
  offline: false,
  download: 1.5 * 1024 * 1024 / 8,
  upload:   750 * 1024 / 8,
  latency:  40,
});

await page.emulateCPUThrottling(4); // 4× slower
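
The download and upload values are in bytes per second, which is why the snippet divides by 8. The arithmetic can be wrapped as below; the helper names are hypothetical, and the 1024-based conversion simply mirrors the convention used in the snippet above.

```javascript
// Convert link speeds to the bytes-per-second values emulateNetworkConditions expects.
function mbpsToBytesPerSecond(mbps) {
  return (mbps * 1024 * 1024) / 8; // megabits -> bits -> bytes
}

function kbpsToBytesPerSecond(kbps) {
  return (kbps * 1024) / 8; // kilobits -> bits -> bytes
}

// Hypothetical helper: build a conditions object from human-friendly units.
function networkProfile({ downMbps, upKbps, latencyMs }) {
  return {
    download: mbpsToBytesPerSecond(downMbps),
    upload: kbpsToBytesPerSecond(upKbps),
    latency: latencyMs,
  };
}

// Usage (assumes an open page):
//   await page.emulateNetworkConditions(
//     networkProfile({ downMbps: 1.5, upKbps: 750, latencyMs: 40 })
//   );
```

With these inputs, download works out to 196608 bytes per second and upload to 96000, the same values the inline expressions above produce.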

Keyboard, Mouse & File Uploads

Keyboard & mouse
await page.keyboard.type('hello', { delay: 50 });
await page.keyboard.press('Enter');
await page.keyboard.down('Shift');

await page.mouse.move(100, 200);
await page.mouse.click(100, 200, { button: 'right' });
await page.mouse.wheel({ deltaY: 500 });
File upload & download
// File input
const input = await page.$('input[type=file]');
await input.uploadFile('./report.pdf');

// Set download dir (CDP)
const client = await page.createCDPSession();
await client.send('Browser.setDownloadBehavior', {
  behavior: 'allow',
  downloadPath: '/tmp/dl',
});

Best Practices & Gotchas

Performance, debugging, anti-detection reality, and the pitfalls that bite everyone exactly once

Performance & Scaling

Reuse the Browser. launch() is the heaviest call - 1-3s. Spin up many BrowserContexts instead.
Block heavy resources when scraping - images, fonts, media, stylesheets. 2-5× speedup is common.
Cap concurrency. Each page is a real Chrome tab - 4-8 in parallel per CPU is a sane ceiling.
Close everything. page.close(), context.close(), dispose handles. Watch for orphan Chromes in ps.
Recycle long-running browsers. Restart every N pages or M minutes - Chrome leaks slowly under load.
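
Capping concurrency can be sketched with a small worker pool in plain JavaScript. mapWithConcurrency is a hypothetical helper, not a Puppeteer API; the worker callback is where page work would go.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` in flight.
// Results keep the order of `items`; the first rejection rejects the whole call.
async function mapWithConcurrency(items, limit, worker) {
  if (limit < 1) throw new RangeError('limit must be at least 1');
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function run() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}

// Usage (assumes a launched browser and a list of URLs):
//   const titles = await mapWithConcurrency(urls, 4, async (url) => {
//     const page = await browser.newPage();
//     try {
//       await page.goto(url);
//       return await page.title();
//     } finally {
//       await page.close();
//     }
//   });
```

A fixed pool of runners avoids opening one tab per URL up front, which is the usual cause of memory spikes in batch scrapes.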

Debugging

# Watch what the browser is doing
$ DEBUG=puppeteer:* node script.mjs

// Run headed + slowed
puppeteer.launch({
  headless: false,
  slowMo: 250,
  devtools: true,
  dumpio: true, // pipe browser stderr/stdout
});

// Browser console → Node logs
page.on('console', m => console.log(m.type(), m.text()));
page.on('pageerror', e => console.error(e));
page.on('requestfailed', r => console.warn(r.url(), r.failure()));

Add a debugger; statement inside a page.evaluate callback and launch with devtools: true to step through page-side code in DevTools.

Common Pitfalls

Race conditions

Acting before the element is ready. Fix: prefer the Locator API or explicit waitFor* - never setTimeout.

Memory leaks from handles

Every $/$$/waitForSelector returns an ElementHandle that pins JS objects. Always .dispose().

Returning DOM nodes from evaluate

Results are JSON-serialised. Return primitives or use evaluateHandle if you really need a node.

Closure variables in evaluate

Outer-scope variables aren't visible in the page. Pass them as args after the function.

Brittle CSS selectors

Long descendant chains break. Prefer ::-p-text, ::-p-aria, or data-testid.

DevTools ≠ Puppeteer

Selector tools in DevTools give plain CSS - they don't know about ::-p-* or shadow piercing.

Orphan browser processes

Forgetting browser.close() in error paths. Always wrap in try / finally.

Docker /dev/shm too small

Crashes on large pages. Use --shm-size=2gb or pass --disable-dev-shm-usage.

Anti-Detection Reality (2026)

Stealth is an arms race - none of these are permanent. Use them only for legitimate automation against systems where you have permission.

What helps
  • Coherent fingerprint: UA + viewport + locale + timezone + proxy region all match.
  • Persistent profile via userDataDir - reuse cookies, history, fonts.
  • Human pacing: randomised delays, mouse movement, scrolling before clicks.
  • WebDriver BiDi - smaller automation footprint than legacy CDP flags.
  • --disable-blink-features=AutomationControlled + drop --enable-automation from default args.
What's flaky
  • puppeteer-extra-plugin-stealth still works for many targets, but breaks intermittently after major Chrome rolls.
  • Static UA spoofing - bot detectors check Client Hints and TLS fingerprints too.
  • Pure-headless against fingerprint-heavy targets - run headed in Xvfb if needed.
Respect robots.txt, ToS, and rate limits. Anti-detection is a tool, not a license.

Official Resources

Primary sources - docs, API references, GitHub, and the surrounding ecosystem

GitHub Repository

Core Repo (Apache 2.0)
github.com/puppeteer/puppeteer
Official Docker image
ghcr.io/puppeteer/puppeteer
Repo structure
/packages/puppeteer/ - main library
/packages/puppeteer-core/ - browser-less variant
/packages/browsers/ - download CLI
/docs/ - generated reference