Engineering

From the trenches

Technical deep-dives on the infrastructure problems we solve to ship browser agents that survive production.

01 / 05

Stealth infrastructure for browser agents

Every mainstream anti-bot system—Cloudflare Turnstile, Akamai Bot Manager, PerimeterX (now HUMAN), DataDome, Kasada—ships a JavaScript challenge that fingerprints the browser environment at multiple layers simultaneously. Headless Chrome, the default runtime for Playwright, Puppeteer, and every agent framework built on top of them, fails these checks within milliseconds. The navigator.webdriver flag is the most obvious tell, but it is nowhere near the only one.

This post breaks down exactly how detection works at each layer—JavaScript API surface, rendering pipeline, TLS handshake, and behavioral biometrics—and the architecture decisions we made to solve each one at the runtime level rather than with fragile userland patches.

The fingerprinting stack

Anti-bot vendors fingerprint across four independent layers. A browser session must pass all four simultaneously; failing any single one triggers a block or CAPTCHA. Understanding the full stack is essential because point-fixes at one layer (e.g., overriding navigator.webdriver) are meaningless if TLS or canvas fingerprints are inconsistent.

Layer          | Signal                    | What it detects
JS API surface | navigator.webdriver       | Automation flag set when the browser is launched under automation
JS API surface | navigator.plugins.length  | Headless has 0 plugins; real Chrome has 3+
JS API surface | navigator.languages       | Headless returns empty or inconsistent locale arrays
Rendering      | Canvas hash               | GPU-dependent pixel output; headless renders differently
Rendering      | WebGL renderer string     | Real GPUs report an ANGLE (... string; SwiftShader is flagged
Rendering      | AudioContext fingerprint  | DynamicsCompressor output varies by audio stack
TLS            | JA3 / JA4 hash            | ClientHello cipher suites + extensions must match declared UA
TLS            | HTTP/2 SETTINGS frame     | Akamai fingerprints H2 initial window size + priority tree
Behavioral     | Mouse / scroll entropy    | Synthetic events have zero jitter; human input has Gaussian noise
Behavioral     | Keystroke cadence         | Programmatic input fires at constant intervals

Why userland patches fail

The most common approach to anti-detection is monkey-patching JavaScript globals. Libraries like puppeteer-extra-plugin-stealth override navigator.webdriver, inject fake plugin arrays, and spoof window.chrome. This worked in 2020. It does not work today.

Modern challenge scripts use multiple detection vectors that cannot be patched from userland:

  • CDP timing side-channel. Page scripts cannot call Runtime.evaluate themselves, but they can time operations whose latency changes when a CDP client is attached. With CDP connected, the latency profile is measurably different (<1 ms vs. 2-5 ms for real user-initiated evaluation). This timing difference persists regardless of flag overrides.
  • Iframe sandbox probing. A fresh same-origin iframe is created and its contentWindow.chrome object is inspected (a cross-origin frame's contentWindow is not readable from the parent). Stealth plugins patch the parent frame but miss child iframes, exposing the real automation state.
  • Stack trace analysis. Error().stack is parsed inside getter traps placed on navigator properties. If the stack trace originates from a CDP evaluation context (identifiable by __puppeteer_evaluation_script__ frames), the session is flagged.
  • WebGL parameter consistency. Even if you spoof the WebGL renderer string via getParameter(UNMASKED_RENDERER_WEBGL), challenge scripts cross-check against MAX_TEXTURE_SIZE, MAX_RENDERBUFFER_SIZE, and ALIASED_LINE_WIDTH_RANGE. These must be internally consistent with the declared GPU. SwiftShader reports parameter combinations that no real GPU produces.
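
The stack-trace check from the list above can be sketched in a few lines. This is a minimal illustration, not any vendor's actual code: the function names and the single-entry marker list are assumptions, with the marker string taken from the bullet above.

```javascript
// Hypothetical marker list; the one entry comes from the text above.
const CDP_STACK_MARKERS = ["__puppeteer_evaluation_script__"];

// Returns true when a stack trace contains a frame that names a CDP evaluation script.
function looksLikeCdpEvaluation(stack) {
  return CDP_STACK_MARKERS.some((marker) => stack.includes(marker));
}

// A getter trap in the style the text describes: it captures the stack at the
// moment a property is read and reports whether the reader looks automated.
function trapProperty(target, prop, value, onRead) {
  Object.defineProperty(target, prop, {
    configurable: true,
    get() {
      onRead(looksLikeCdpEvaluation(new Error().stack || ""));
      return value;
    },
  });
}
```

Because the trap runs inside the page, overriding the property value does nothing to hide who asked for it.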

The patch treadmill. Cloudflare updates its challenge script every 7-14 days, and each update introduces new detection vectors. Maintaining a userland stealth plugin means reverse-engineering minified challenge code on a biweekly cadence: a full-time job for a single vendor, and one that does not scale across several.

Runtime-level architecture

Instead of patching after the fact, StableBrowse operates at the Chromium runtime layer. We maintain a modified Chromium build where automation artifacts are structurally absent from the source, not overridden after initialization.

1. CDP channel restructuring.

The Chrome DevTools Protocol typically runs over a bidirectional WebSocket channel (or a pipe, with --remote-debugging-pipe) between the automation client and the browser process. Challenge scripts detect this channel through timing analysis of the Mojo IPC pipe. Our build restructures the CDP transport to use a Unix domain socket with kernel-level buffering that eliminates the measurable latency signature. The browser process sees the same IPC characteristics as a standalone Chrome instance with DevTools closed.

2. Fingerprint coherence engine.

Each agent session draws from a database of real-device fingerprints collected from opt-in enterprise device fleets. A fingerprint includes:

  • GPU renderer string + all WebGL parameter values (cross-validated for consistency)
  • Canvas hash (pre-computed from the actual GPU output for that hardware profile)
  • AudioContext DynamicsCompressor output (recorded from real hardware)
  • Screen resolution, devicePixelRatio, available fonts
  • Timezone, locale, language preferences (correlated to GeoIP of the exit proxy)

The key insight is coherence. A fingerprint from an M1 MacBook Pro must report the Apple M1 GPU renderer, a MAX_TEXTURE_SIZE of 16384, a 2560x1600 panel at 2x DPR, and an AudioContext output that matches that machine's audio stack. Any inconsistency, even in a single parameter, is a detection signal.

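// Fingerprint coherence validation (simplified).
// GPU_PROFILES, TIMEZONE_GEO, and KNOWN_DPR are lookup tables defined elsewhere.
const assert = require("node:assert");

function validateFingerprint(fp) {
  const gpu = GPU_PROFILES[fp.webgl.renderer];
  assert(gpu, `unknown renderer: ${fp.webgl.renderer}`);
  // WebGL limits must match the declared GPU.
  assert(fp.webgl.maxTextureSize  === gpu.maxTextureSize);
  assert(fp.webgl.maxRenderBuffer === gpu.maxRenderBuffer);
  assert(fp.webgl.aliasedLineRange[0] === gpu.aliasedLineMin);
  assert(fp.webgl.aliasedLineRange[1] === gpu.aliasedLineMax);
  // Canvas output depends on both the GPU and the OS.
  assert(fp.canvas.hash === gpu.expectedCanvasHash[fp.os]);
  // Timezone must be plausible for the exit proxy's country.
  assert((TIMEZONE_GEO[fp.timezone] || []).includes(fp.proxy.country));
  assert(fp.screen.dpr === KNOWN_DPR[fp.device]);
}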
Fig 1. Every fingerprint parameter is validated against known hardware profiles before injection.

3. TLS fingerprint alignment.

JA3 is a method for creating SSL/TLS client fingerprints by hashing fields of the TLS ClientHello message: the SSL version, accepted ciphers, extensions, elliptic curves, and point formats. JA4 refines this by sorting ciphers and extensions before hashing, so randomized extension order does not change the fingerprint, and by adding fields such as the ALPN value and SNI presence. The problem: headless Chromium's TLS stack produces a different JA3 hash than Chrome stable on the same OS, because the build flags differ.

Our Chromium build uses the exact same BoringSSL configuration and cipher suite ordering as the declared Chrome version. We maintain a CI pipeline that compares our JA3/JA4 output against the official Chrome Canary, Beta, and Stable channels for each platform. Any drift triggers a build failure.
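
To make the drift check concrete, here is a minimal sketch of how a JA3 string and hash are derived from ClientHello fields, and how two builds could be compared. The `hello` object shape and the `assertNoDrift` helper are assumptions for illustration; this is not our CI code.

```javascript
const nodeCrypto = require("node:crypto");

// GREASE values (RFC 8701) look like 0x0a0a, 0x1a1a, ... 0xfafa.
// JA3 ignores them because clients randomize them per connection.
function isGrease(value) {
  return (value & 0x0f0f) === 0x0a0a && (value >> 8) === (value & 0xff);
}

// Builds the JA3 string and its MD5 hash from parsed ClientHello fields.
// The shape of `hello` is an assumption made for this sketch.
function ja3(hello) {
  const join = (values) => values.filter((v) => !isGrease(v)).join("-");
  const text = [
    hello.version,
    join(hello.ciphers),
    join(hello.extensions),
    join(hello.curves),
    join(hello.pointFormats),
  ].join(",");
  const hash = nodeCrypto.createHash("md5").update(text).digest("hex");
  return { text, hash };
}

// A CI-style drift check: throw when our build's hash differs from the reference.
function assertNoDrift(ours, reference) {
  if (ja3(ours).hash !== ja3(reference).hash) {
    throw new Error("TLS fingerprint drift detected");
  }
}
```

Note that JA3 is order-sensitive, so a client that shuffles its extension list yields a different hash per connection; a comparison like this only makes sense over fields that are stable for the client being compared.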

4. HTTP/2 fingerprint matching.

Akamai's bot detection fingerprints the HTTP/2 SETTINGS frame that the client sends on connection establishment. This includes:

  • SETTINGS_HEADER_TABLE_SIZE
  • SETTINGS_ENABLE_PUSH
  • SETTINGS_MAX_CONCURRENT_STREAMS
  • SETTINGS_INITIAL_WINDOW_SIZE
  • SETTINGS_MAX_FRAME_SIZE
  • SETTINGS_MAX_HEADER_LIST_SIZE
  • The WINDOW_UPDATE and PRIORITY frames sent after SETTINGS

Standard headless Chrome sends different values from headed Chrome. Our build patches the HTTP/2 session initialization in Chromium's net stack to replicate the exact SETTINGS and priority tree of headed Chrome for the declared version/platform combination.
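
One way to compare two clients is to serialize what each one sends into a single string and diff the strings. The sketch below does that; the string layout is an assumption modeled on published HTTP/2 fingerprint formats, the setting identifiers come from RFC 9113, and the values in the usage example are illustrative rather than verified for any specific Chrome version.

```javascript
// HTTP/2 SETTINGS identifiers from RFC 9113.
const SETTINGS_IDS = {
  SETTINGS_HEADER_TABLE_SIZE: 1,
  SETTINGS_ENABLE_PUSH: 2,
  SETTINGS_MAX_CONCURRENT_STREAMS: 3,
  SETTINGS_INITIAL_WINDOW_SIZE: 4,
  SETTINGS_MAX_FRAME_SIZE: 5,
  SETTINGS_MAX_HEADER_LIST_SIZE: 6,
};

// Serializes what a client sent into one comparable string:
// settings (in the order sent) | WINDOW_UPDATE increment | PRIORITY frames | pseudo-header order.
function h2Fingerprint({ settings, windowUpdate, priorities, pseudoHeaders }) {
  const settingsPart = settings
    .map(([name, value]) => `${SETTINGS_IDS[name]}:${value}`)
    .join(";");
  const priorityPart = priorities.length ? priorities.join(",") : "0";
  // ":method" -> "m", ":authority" -> "a", and so on.
  const headerPart = pseudoHeaders.map((h) => h[1]).join(",");
  return [settingsPart, windowUpdate, priorityPart, headerPart].join("|");
}

// Illustrative values only.
const exampleClient = {
  settings: [
    ["SETTINGS_HEADER_TABLE_SIZE", 65536],
    ["SETTINGS_ENABLE_PUSH", 0],
    ["SETTINGS_INITIAL_WINDOW_SIZE", 6291456],
    ["SETTINGS_MAX_HEADER_LIST_SIZE", 262144],
  ],
  windowUpdate: 15663105,
  priorities: [],
  pseudoHeaders: [":method", ":authority", ":scheme", ":path"],
};
```

Both which settings are present and the order they are sent in end up in the string, which is why a library default that sends the same values in a different order still stands out.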

5. Behavioral injection.

Even with a perfect static fingerprint, synthetic input events are detectable. Real mouse movements follow a minimum-jerk trajectory with Gaussian noise. Scroll events have momentum and deceleration curves that match the OS's scroll physics. Keystroke intervals are uneven and roughly log-normal, unlike the constant spacing of programmatic input.

We maintain a library of recorded human interaction templates (anonymized, opt-in enterprise users). When an agent performs an action, the raw CDP Input.dispatchMouseEvent calls are modulated through these templates. Mouse moves follow Bezier curves with randomized control points. Scroll events include the OS-level momentum phase. Keystroke timing is drawn from a per-character log-normal model trained on real typing data.
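
The two statistical ideas here, Bezier paths with randomized control points and log-normal keystroke delays, can be sketched without any recorded templates. Everything below is an assumption for illustration: the jitter scale, the median delay of 120 ms, and the sigma of 0.4 are made-up parameters, and a seeded PRNG is used only so the output is reproducible.

```javascript
// Small seeded PRNG (mulberry32) so runs are reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal sample via the Box-Muller transform.
function gaussian(rng) {
  const u = 1 - rng(); // shift to (0, 1] so Math.log never sees 0
  const v = rng();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Point on a cubic Bezier curve at parameter t in [0, 1].
function cubicBezier(p0, p1, p2, p3, t) {
  const s = 1 - t;
  const w = [s * s * s, 3 * s * s * t, 3 * s * t * t, t * t * t];
  return {
    x: w[0] * p0.x + w[1] * p1.x + w[2] * p2.x + w[3] * p3.x,
    y: w[0] * p0.y + w[1] * p1.y + w[2] * p2.y + w[3] * p3.y,
  };
}

// Mouse path from `from` to `to`; control points sit near the straight line
// at 1/3 and 2/3 of the way, displaced by Gaussian noise.
function mousePath(from, to, steps, rng) {
  const lerp = (k) => ({ x: from.x + (to.x - from.x) * k, y: from.y + (to.y - from.y) * k });
  const jitter = (p) => ({ x: p.x + gaussian(rng) * 40, y: p.y + gaussian(rng) * 40 });
  const c1 = jitter(lerp(1 / 3));
  const c2 = jitter(lerp(2 / 3));
  const path = [];
  for (let i = 0; i <= steps; i++) path.push(cubicBezier(from, c1, c2, to, i / steps));
  return path;
}

// Inter-key delay in milliseconds drawn from a log-normal distribution.
function keystrokeDelayMs(rng, mu = Math.log(120), sigma = 0.4) {
  return Math.exp(mu + sigma * gaussian(rng));
}
```

Each point in the path would map to one mouse-move event, and each sampled delay to the gap before the next key event.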

Proxy and network identity

IP reputation is the fifth detection layer. Datacenter IPs are flagged by every major anti-bot vendor on sight. Residential proxy pools solve this but introduce new problems: latency variance, session instability, and IP rotation that triggers re-authentication.

StableBrowse maintains dedicated residential IP pools per enterprise client. IPs are pre-warmed with browser traffic to build reputation before agent sessions use them. Each IP is geo-fenced to match the fingerprint's timezone and locale. Session persistence is maintained through sticky IP routing—the same IP is reused for the entire multi-page workflow to avoid mid-session fingerprint changes that trigger Akamai's "impossible travel" heuristic.
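
Sticky, geo-matched routing reduces to a small amount of bookkeeping. The sketch below is a hypothetical illustration: the class name, pool shape, and field names are assumptions, and the addresses are documentation-range placeholders.

```javascript
// Sticky, geo-matched IP assignment. Pool contents and field names are hypothetical.
class StickyIpRouter {
  constructor(pool) {
    this.pool = pool;           // e.g. [{ ip, country }]
    this.bySession = new Map(); // sessionId -> ip
    this.inUse = new Set();     // IPs currently bound to a session
  }

  // Returns the same IP for every call with the same sessionId.
  assign(sessionId, fingerprintCountry) {
    const existing = this.bySession.get(sessionId);
    if (existing) return existing;
    const match = this.pool.find(
      (entry) => entry.country === fingerprintCountry && !this.inUse.has(entry.ip)
    );
    if (!match) throw new Error(`no free IP for ${fingerprintCountry}`);
    this.bySession.set(sessionId, match.ip);
    this.inUse.add(match.ip);
    return match.ip;
  }

  // Frees the IP once the multi-page workflow has finished.
  release(sessionId) {
    const ip = this.bySession.get(sessionId);
    if (ip) {
      this.inUse.delete(ip);
      this.bySession.delete(sessionId);
    }
  }
}
```

Failing loudly when no matching IP is free is a deliberate choice in this sketch: falling back to an IP in another country would break the timezone and locale match.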

Defense in depth. No single stealth technique is sufficient. Detection is a conjunction: challenge scripts flag sessions that fail any check. Our architecture ensures coherence across all layers simultaneously—JS APIs, rendering, TLS, HTTP/2, behavioral, and network identity—because that's how real browsers work.


02 / 05

Knowledge graphs: teaching agents to navigate, extract, and interact

For a browser agent to reliably operate on the web, it needs to be capable of three primitives: Navigation, Extraction, and Interaction. Every agent task—booking a flight, filing an insurance claim, completing a purchase—is a composition of these three operations. Get any one of them wrong, and the task fails.

The standard approach is to drop an LLM into a browser, hand it the raw DOM or a screenshot, and hope it figures things out. This works on demos. It does not work in production. Raw HTML pages can exceed a million tokens. CSS selectors break on every redesign. The agent has no map of where it is, what it can do, or where it needs to go. It is navigating blind.

Knowledge graphs solve this. They convert a website from an opaque rendering surface into a structured, traversable map. Pages become nodes. Transitions become edges. The agent stops guessing and starts following a graph.

The three primitives

Every browser task decomposes into a sequence of three operations. Agents that treat the browser as a single undifferentiated action space fail because they conflate fundamentally different problems. Each primitive has its own failure modes and requires its own solution.

Primitive   | What the agent does                               | Failure without a graph
Navigation  | Move between pages and application states         | Gets lost in multi-step flows, loops back to visited pages, can't find the right portal
Extraction  | Pull structured data from the current page        | Hallucinates field values, misses data behind dynamic rendering, breaks on every redesign
Interaction | Fill forms, click buttons, manipulate UI controls | Clicks the wrong element, misses required fields, doesn't handle complex UI controls correctly

Primitive 1: Navigation

Navigation is the most underestimated primitive. The core problem is topological blindness—an agent that only sees the current page has no knowledge of the site's complete state space. It doesn't know how many steps remain, whether it's on the right path, or how to recover when something loads unexpectedly.

Consider a multi-step insurance quoting flow: broker portal → carrier selection → risk details → coverage options → quote summary → bind. Without a map, the agent is forced into trial-and-error exploration, burning tokens and time on dead ends.

A knowledge graph encodes the full site topology upfront. Each page state is a node. Each transition—clicking a link, submitting a form, opening a modal—is a directed edge with a specific action and a predicted destination. Navigation becomes deterministic pathfinding rather than probabilistic exploration.

The result: the agent traverses the graph instead of reasoning about where to go next. The LLM is invoked once to understand the task, not at every step to decide where to click.
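
As a rough illustration of pathfinding over a site graph, the quoting flow above can be written as a list of directed edges and searched breadth-first. The action strings and the back edge are hypothetical; only the page names come from the example.

```javascript
// Directed edges for the quoting flow described above. Action strings are hypothetical.
const EDGES = [
  { from: "broker portal", to: "carrier selection", action: "click:start-quote" },
  { from: "carrier selection", to: "risk details", action: "submit:carrier-form" },
  { from: "risk details", to: "coverage options", action: "submit:risk-form" },
  { from: "coverage options", to: "risk details", action: "click:back" },
  { from: "coverage options", to: "quote summary", action: "submit:coverage-form" },
  { from: "quote summary", to: "bind", action: "click:bind" },
];

// Breadth-first search: returns the shortest list of actions from start to goal,
// or null when the goal is unreachable. The visited set prevents loops.
function planRoute(edges, start, goal) {
  const queue = [[start, []]];
  const visited = new Set([start]);
  while (queue.length > 0) {
    const [node, actions] = queue.shift();
    if (node === goal) return actions;
    for (const edge of edges) {
      if (edge.from === node && !visited.has(edge.to)) {
        visited.add(edge.to);
        queue.push([edge.to, [...actions, edge.action]]);
      }
    }
  }
  return null;
}
```

The plan is computed before the first click, so the agent also knows how many steps remain at any point in the flow.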

Primitive 2: Extraction

Extraction is the reason most agents exist—pulling structured data out of unstructured web pages. The standard DOM-based approach is fragile because the DOM is a rendering interface, not a data interface.

By the time data reaches the DOM, it has been fragmented across elements, hidden behind dynamic rendering, and decorated with framework-specific artifacts that change between versions. Selectors that work today break tomorrow.

Knowledge graphs solve extraction by attaching data schemas directly to graph nodes. Each region of the page knows what fields it contains and their types. The agent knows what to extract before it touches the page, and can target specific regions rather than sending the entire DOM to an LLM.

This schema-first approach means extraction is precise and efficient. Instead of asking an LLM “what data is on this page?” at massive token cost, the agent runs targeted extraction against known regions with known schemas.
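
A minimal sketch of schema-first extraction, assuming the raw text for each field has already been located in its page region. The schema, field names, and type set are hypothetical.

```javascript
// A schema attached to one graph node. Field names and types are hypothetical.
const QUOTE_SUMMARY_SCHEMA = {
  carrier: "string",
  premium: "number",
  effectiveDate: "date",
};

// One coercion function per declared type; each throws on input it cannot parse.
const COERCE = {
  string: (raw) => raw.trim(),
  number: (raw) => {
    const n = Number(raw.replace(/[^0-9.-]/g, ""));
    if (Number.isNaN(n)) throw new Error(`not a number: ${raw}`);
    return n;
  },
  date: (raw) => {
    const d = new Date(raw);
    if (Number.isNaN(d.getTime())) throw new Error(`not a date: ${raw}`);
    return d.toISOString().slice(0, 10);
  },
};

// `region` maps field names to the raw text found in that page region.
function extractRegion(region, schema) {
  const out = {};
  for (const [field, type] of Object.entries(schema)) {
    if (!(field in region)) throw new Error(`missing field: ${field}`);
    out[field] = COERCE[type](region[field]);
  }
  return out;
}
```

A missing or unparseable field raises an error instead of producing a guess, which is the property that separates this from asking a model what is on the page.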

Primitive 3: Interaction

Interaction is the hardest primitive to get right. Clicking a button is simple. Filling out a multi-step form with date pickers, dropdowns, sliders, autocomplete fields, and conditional inputs is not. Every UI control has its own interaction model, and getting it wrong silently produces incorrect results.

Consider booking a flight. The agent needs to type a city into an autocomplete field and wait for suggestions. It needs to open a date picker and navigate to the right month. It needs to increment a passenger counter. Each step has timing dependencies and ordering constraints that a generic “click this element” approach can't handle.

Knowledge graphs encode the functional type of every interactive element—whether it's a text input, date picker, dropdown, slider, or submit button—along with the relationships between them. The graph knows that field B depends on field A, that a dropdown must be opened before an option can be selected, and that a form must be filled before it can be submitted.

With this information, interaction becomes deterministic. The LLM parses the user's task into structured parameters. The executor maps those parameters to graph nodes and runs them in the correct order. No per-step reasoning, no DOM interpretation, no hallucination risk.
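
Ordering actions by their dependencies is a topological sort. The sketch below uses a made-up flight search form; element ids, types, and the dependency lists are assumptions for illustration.

```javascript
// Interactive elements for a flight search form. Names and types are hypothetical.
const ELEMENTS = [
  { id: "submit", type: "submit", dependsOn: ["origin", "destination", "departDate", "passengers"] },
  { id: "destination", type: "autocomplete", dependsOn: ["origin"] },
  { id: "departDate", type: "datepicker", dependsOn: ["destination"] },
  { id: "origin", type: "autocomplete", dependsOn: [] },
  { id: "passengers", type: "counter", dependsOn: [] },
];

// Depth-first topological sort: every element runs after the elements it depends on.
function executionOrder(elements) {
  const byId = new Map(elements.map((el) => [el.id, el]));
  const state = new Map(); // id -> "visiting" | "done"
  const order = [];
  function visit(id) {
    if (state.get(id) === "done") return;
    if (state.get(id) === "visiting") throw new Error(`dependency cycle at ${id}`);
    const el = byId.get(id);
    if (!el) throw new Error(`unknown element: ${id}`);
    state.set(id, "visiting");
    for (const dep of el.dependsOn) visit(dep);
    state.set(id, "done");
    order.push(id);
  }
  for (const el of elements) visit(el.id);
  return order;
}
```

The executor would then dispatch on each element's `type` to pick the right interaction routine, for example waiting for suggestions on an autocomplete before moving on.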

Graph construction

StableBrowse builds knowledge graphs through automated site discovery. We visit the target site, capture its semantic structure, and map it into a typed graph of pages, regions, elements, and data nodes connected by verified transitions.

The graph is built once per site and reused for every subsequent task. Structural fingerprinting detects when a site has changed, triggering automatic re-discovery and graph updates. Nodes that no longer exist are marked stale. Nodes that still work retain their history.
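
One plausible way to detect structural change, shown here as a sketch and not as a description of our implementation, is to hash a page's skeleton while ignoring text and volatile attributes. The node shape (`tag`, `role`, `children`) is an assumption.

```javascript
const nodeCrypto = require("node:crypto");

// Reduces a page tree to its skeleton: tag, role, and child structure only.
// Text content and volatile attributes (class names, ids) are deliberately ignored.
function skeleton(node) {
  const children = (node.children || []).map(skeleton).join(",");
  return `${node.tag}[${node.role || ""}](${children})`;
}

// SHA-256 of the skeleton string, used as a compact identity for the layout.
function structuralFingerprint(root) {
  return nodeCrypto.createHash("sha256").update(skeleton(root)).digest("hex");
}

// True when the current page no longer matches the fingerprint stored with the graph.
function hasDrifted(storedFingerprint, currentRoot) {
  return structuralFingerprint(currentRoot) !== storedFingerprint;
}
```

A copy change or a restyle leaves the fingerprint alone; adding, removing, or reordering structural elements changes it and would trigger re-discovery.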

Self-healing

Websites change constantly. A static graph would break just as fast as static selectors. StableBrowse's graphs adapt through empirical reliability tracking—every node carries a reliability score updated on each interaction. Nodes that consistently work rise in confidence. Nodes that start failing are deprioritized and eventually excluded from action plans.
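
A reliability score of this kind can be kept with an exponential moving average. That choice, and both constants below, are assumptions for this sketch; the post does not state the actual update rule or thresholds.

```javascript
// Assumed parameters for this sketch.
const ALPHA = 0.2;           // weight of the newest observation
const MIN_RELIABILITY = 0.5; // nodes below this are left out of action plans

// Blends the latest outcome (1 for success, 0 for failure) into the running score.
function recordOutcome(node, succeeded) {
  const observation = succeeded ? 1 : 0;
  node.reliability = (1 - ALPHA) * node.reliability + ALPHA * observation;
  return node;
}

// Nodes eligible for planning, most reliable first.
function usableNodes(nodes) {
  return nodes
    .filter((n) => n.reliability >= MIN_RELIABILITY)
    .sort((a, b) => b.reliability - a.reliability);
}
```

With these numbers, a node that starts at 0.9 survives two consecutive failures and drops out of plans after a third, while a single later success only partly restores it.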

When the graph detects that a site's structure has drifted, it re-discovers the changed portions and merges them with the existing graph. The agent always has an up-to-date map, and the transition from old structure to new structure is seamless.

Why this matters

The fundamental insight is that websites are not random. They have structure, and that structure is far more stable than the surface-level HTML. A knowledge graph captures that structure and lets agents operate on it directly.

The three primitives—navigation, extraction, interaction—are the minimal set of capabilities an agent needs to operate on any website. Each one maps to a different part of the graph: navigation uses edges to traverse between page states, extraction uses schemas attached to data nodes, and interaction uses functional types and dependency edges to execute actions deterministically.

Agents that treat the web as an undifferentiated stream of HTML will always be slow, expensive, and fragile. Agents that see the web as a graph can navigate it like a map.

Build once, run forever. A knowledge graph is built once per site. Every subsequent task runs at dramatically lower cost and higher reliability. The graph pays for itself on the second task.


03 / 05

The three layers of anti-bot detection

If you’ve ever built a scraper or an AI agent that touches the open web, you know the shape of the failure. Everything works locally. The code was probably vibecoded, but it sort of works (maybe). Then the real site gives you a 403, a blank page, a redirect loop, or some cheerful “just checking your browser” screen that never clears.

The usual first thought is CAPTCHA. Maybe you need a solver. Maybe you need to click the challenge. Maybe you need to patch navigator.webdriver. Sometimes, sure. But that mental model is too small.

Modern anti-bot systems don’t sit in one place. They stack checks across the connection, the browser environment, and the user’s behavior. A lot of tools only deal with the middle layer, which is why they feel like they work on toy examples and then fall apart on real sites.

┌─────────────────────────────────────────────────────────┐
│  Layer 3 — Behavioral                                   │
│  Mouse movement · Typing cadence · Scroll patterns      │
│  Session history · IP reputation · Return patterns      │
├─────────────────────────────────────────────────────────┤
│  Layer 2 — Browser Environment                          │
│  Canvas · WebGL · AudioContext · Plugins · Fonts         │
│  navigator.webdriver · Automation artifacts              │
├─────────────────────────────────────────────────────────┤
│  Layer 1 — Network / TLS                                │
│  JA3/JA4 fingerprint · HTTP/2 SETTINGS · Cipher order   │
│  GREASE values · ALPS · Header ordering                 │
└─────────────────────────────────────────────────────────┘
  ▲ Detection happens bottom-up: network → browser → behavior
Fig 1. Anti-bot detection is a three-layer stack. Failing any single layer triggers a block.

Layer 1: The request is suspicious before HTML exists

The first decision often happens before the server sends you a single byte of page content. Cloudflare, Akamai, AWS WAF, and similar systems inspect the TLS handshake itself. When a browser connects over HTTPS, it sends a ClientHello: cipher suites, extensions, ordering, supported features, all the little negotiation details. Chrome has a recognizable shape here. So does Safari. So does Firefox.

Most HTTP clients do not.

Signal                 | Chrome (real)                        | reqwest + rustls
JA3/JA4 fingerprint    | Chrome-specific hash                 | Generic Rust library hash
GREASE values          | Present, randomized                  | Missing
ALPS extension         | Present                              | Absent
H2 initial window size | 6 MB                                 | Library default
Pseudo-header order    | :method, :authority, :scheme, :path  | May differ

This is why a plain HTTP client can get blocked before the DOM exists. No canvas. No WebGL. No automation flag. Just a network stack that does not match the identity it is claiming.

For Rust, the practical answer right now is wreq, a reqwest fork built around BoringSSL, the same TLS library Chrome uses. With the right profile, it can emit a Chrome-shaped handshake instead of a Rust-library-shaped one.

Layer 2: The browser has to survive the challenge

If the network layer passes, the site may still serve a JavaScript challenge instead of the actual page. These challenges run in the browser, probe the environment, compute a fingerprint, and set a pass cookie if the result looks plausible.

The checks are not mysterious once you read the scripts. They ask questions like:

  • Can the canvas draw like a real machine? The script creates an invisible canvas, draws text or shapes, and reads pixels back with getImageData(). Empty stubs or all-zero buffers stand out.
  • Does WebGL match the claimed device? A Mac-shaped User-Agent with a Linux software renderer is a bad story.
  • Does AudioContext behave like a browser with an actual audio stack? Some challenges create an oscillator, pass it through a compressor, and inspect the resulting data.
  • Is automation obvious? navigator.webdriver = true is still the loudest possible signal.

The patch trap. Patching navigator.webdriver, faking a WebGL string, and returning a canned canvas value works until the challenge starts checking whether your patches look like native browser code. It can inspect function source strings, property descriptors, prototype chains, and timing behavior.
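
Two of those checks, function source strings and property descriptors, fit in a short sketch. This is an illustration of the idea and not any vendor's code; the helper names are made up, and real scripts combine many more signals.

```javascript
// True when a function's source text looks like engine-provided code.
function looksNative(fn) {
  return /\{\s*\[native code\]\s*\}\s*$/.test(Function.prototype.toString.call(fn));
}

// In a real browser, `webdriver` is an accessor on Navigator.prototype.
// An own property on the instance is a sign that something redefined it.
function hasOwnOverride(obj, prop) {
  return Object.prototype.hasOwnProperty.call(obj, prop);
}

// Returns the getter for `prop` found anywhere on the prototype chain, or null.
function findGetter(obj, prop) {
  for (let o = obj; o; o = Object.getPrototypeOf(o)) {
    const desc = Object.getOwnPropertyDescriptor(o, prop);
    if (desc) return desc.get || null;
  }
  return null;
}

// Collects both signals for one property of a navigator-like object.
function patchSignals(nav, prop) {
  const getter = findGetter(nav, prop);
  return {
    ownOverride: hasOwnOverride(nav, prop),
    nonNativeGetter: getter !== null && !looksNative(getter),
  };
}
```

A patch written in page JavaScript has to defeat both checks at once, and every additional workaround (for example, overriding `toString` too) creates another surface that can be inspected the same way.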

Layer 3: The session has to behave like a person

Suppose you pass the network fingerprint and the JavaScript challenge. You are still not done. Systems like DataDome, HUMAN/PerimeterX, and reCAPTCHA v3 look at behavior.

Mouse movement is one example. Humans do not move pointers in perfect straight lines at constant speed. They accelerate, overshoot, correct, pause, and click slightly off-center. Typing has the same texture: uneven delays, occasional hesitation, key events in the right order.
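
The "texture" of timing can be quantified. One simple measure, used here as an illustrative sketch, is the coefficient of variation of the gaps between events; the 0.1 threshold is an assumption and not a value from any vendor.

```javascript
// Coefficient of variation (stddev / mean) of the gaps between event timestamps.
function intervalVariation(timestampsMs) {
  const gaps = [];
  for (let i = 1; i < timestampsMs.length; i++) gaps.push(timestampsMs[i] - timestampsMs[i - 1]);
  if (gaps.length < 2) return null; // not enough data to judge
  const mean = gaps.reduce((a, b) => a + b, 0) / gaps.length;
  if (mean === 0) return 0; // everything fired at once
  const variance = gaps.reduce((a, g) => a + (g - mean) ** 2, 0) / gaps.length;
  return Math.sqrt(variance) / mean;
}

// Flags sequences whose gaps are almost perfectly regular.
function cadenceLooksSynthetic(timestampsMs, minVariation = 0.1) {
  const cv = intervalVariation(timestampsMs);
  return cv !== null && cv < minVariation;
}
```

Keystrokes fired every 50 ms, or all in the same millisecond, have a variation of zero; human typing with uneven gaps lands well above the threshold.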

reCAPTCHA v3 score model (approximate):

  0.0 ─────────────── 0.5 ─────────────── 1.0
  │                    │                    │
  Bot-like             Ambiguous            Human-like
  │                    │                    │
  ├─ DC IP             ├─ Residential IP    ├─ Residential IP
  ├─ No history        ├─ Some cookies      ├─ Returning visitor
  ├─ Instant typing    ├─ Basic mouse       ├─ Natural behavior
  └─ Fresh profile     └─ Partial session   └─ Full session history
Fig 2. Behavioral scoring collapses dozens of signals into a single trust score. Coherence across all layers is what pushes toward 1.0.

Individually, these signals are small. Together, they are hard to ignore. A brand-new automated session from a data center IP might score 0.1. Getting closer to 0.9 usually means residential IPs, coherent browser identity, realistic interaction timing, and a session history that does not reset every run.

The part most tools miss

The common failure mode is solving the visible layer and ignoring the other two. You patch navigator.webdriver. You return something from canvas. Meanwhile, the TLS handshake already said “not Chrome” before JavaScript ran, and the behavioral model is watching a machine type an entire form in zero milliseconds.

The durable systems either cover all three layers or avoid some of the detection surface entirely. Not “does this bypass Cloudflare today,” but “does the whole session tell one consistent story?”

Consistency is the game. If the network says Chrome, the JavaScript environment says Chrome, the GPU says the same machine, the timezone matches the IP, and the user behavior looks like a low-volume human—blocking becomes more expensive and less certain.


04 / 05

Reading anti-bot challenge scripts

At some point, guessing stops being useful. You can keep adding compatibility patches forever. Adjust navigator.webdriver. Tune a WebGL renderer. Return something plausible from canvas. Run it again. Get blocked again. Repeat until you are mostly debugging superstition.

The more useful move is to understand the thing that is judging the browser. Anti-bot challenges are just JavaScript. Heavily obfuscated JavaScript, yes, but still JavaScript. So we built a deobfuscation pipeline.

What the obfuscation looks like

Open AWS WAF’s challenge.js raw and it is not exactly welcoming:

// Raw obfuscated challenge code
navigator[_0xd558ce(0x5)]
_0x270af7[0x2](document[_0x270af7[0x3]])
a0_0x52cc(0x1a4)
Fig 1. Typical obfuscated challenge code — property names, API calls, and string constants are all replaced with function calls or array lookups.

Property names, API calls, string constants, comparison values: almost everything has been replaced with a function call or an array lookup. The useful strings live in a large encoded table near the top of the file. AWS WAF’s 1.37 MB challenge had roughly 5,300 of these references.

The basic strategy

The important observation is simple: the decode function is deterministic. If a0_0x52cc(0x5) returns "navigator" once, it will keep returning "navigator". The pipeline ended up being three passes:

Deobfuscation pipeline:

  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
  │   Pass 1     │    │   Pass 2     │    │   Pass 3     │
  │  Instrument  │───▶│  Enumerate   │───▶│  Substitute  │
  │  & Execute   │    │  String Table│    │  Back        │
  └──────────────┘    └──────────────┘    └──────────────┘
        │                    │                    │
  Wrap decode calls    Call decode fn       Replace all refs:
  with recorders.      for range 0..2000.  - direct decode calls
  Execute in sandbox   Build complete       - runtime mappings
  with browser stubs.  string map.          - local array lookups
                                            - hex-escaped strings

  Result: 98.7% of ~5,300 obfuscated references resolved (AWS WAF)
          100% resolved (Cloudflare 29 KB challenge)
Fig 2. Three-pass deobfuscation pipeline. Each pass builds on the previous, progressively resolving obfuscated references.
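
Passes 2 and 3 can be shown on a toy input. The sketch below stands in a tiny string table and decoder for the real thing; the table contents, the index offset, and the helper names are invented for illustration, and pass 1 (instrumented execution in a sandbox) is omitted.

```javascript
// A toy stand-in for an obfuscated challenge: a string table plus a decode function.
// The offset and table contents are invented for this sketch.
const TABLE = ["webdriver", "navigator", "userAgent", "platform"];
function a0_0x52cc(index) {
  return TABLE[index - 0x1a0];
}

// Pass 2: call the decoder over a range of indices and keep every string result.
function enumerateStrings(decode, from, to) {
  const map = new Map();
  for (let i = from; i < to; i++) {
    const value = decode(i);
    if (typeof value === "string") map.set(i, value);
  }
  return map;
}

// Pass 3: replace decode calls with string literals, then turn
// obj["name"] into obj.name where the name is a plain identifier.
function substitute(source, decoderName, strings) {
  const call = new RegExp(`${decoderName}\\((0x[0-9a-f]+|\\d+)\\)`, "gi");
  return source
    .replace(call, (whole, literal) => {
      const value = strings.get(Number(literal));
      return value === undefined ? whole : JSON.stringify(value);
    })
    .replace(/(?<=[\w$\])])\[\s*"([A-Za-z_$][\w$]*)"\s*\]/g, ".$1");
}
```

Calls whose index is not in the map are left untouched, so unresolved references stay visible in the output instead of being silently mangled.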

What was inside

Once decoded, the checks were not exotic. They were just thorough. The challenge probed the usual automation and fingerprinting surfaces:

API Surface              | What It Checks
navigator.webdriver      | Automation flag
navigator.userAgent      | Browser identity
navigator.platform       | OS claim
canvas.getImageData()    | Rendered pixel signature
WebGL UNMASKED_RENDERER  | GPU identity
AudioContext sampleRate  | Audio stack behavior
crypto.subtle.digest()   | Proof-of-work / token hashing
performance.now()        | Timing behavior

The interesting part is how much the challenge cares about consistency. It is not enough to say “I am Chrome on macOS.” If navigator.platform says "MacIntel" but WebGL reports llvmpipe, Mesa's software renderer on Linux, that is not a subtle mismatch.

Why agents leak more than they think

Normal page JavaScript can also look for automation framework residue. Playwright and Puppeteer talk to the browser through the Chrome DevTools Protocol. That gives page scripts several angles of attack:

// Automation artifacts detectable by page scripts:

window.__playwright__binding__     // Playwright global
window.__pwInitScripts             // Playwright init scripts

Runtime.enable                     // CDP command: enabling the Runtime domain
                                   // changes browser behavior in detectable ways

getEventListeners()                // Exposed when evaluation runs
                                   // with includeCommandLineAPI: true

Error().stack                      // Stack trace reveals CDP
                                   // evaluation context frames
Fig 3. Automation frameworks leave residue that challenge scripts and page JavaScript can detect through globals, CDP behavior, and stack trace analysis.
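
The globals in the listing are the easiest of these to test for. A minimal sketch, with a hypothetical helper name and only the two globals named above:

```javascript
// Globals named in the listing above. A real check would cover more than these two.
const AUTOMATION_GLOBALS = ["__playwright__binding__", "__pwInitScripts"];

// `win` is any window-like object; pass `globalThis` when running in a page.
function findAutomationGlobals(win) {
  return AUTOMATION_GLOBALS.filter((name) => name in win);
}
```

Any page script can run a check like this, not only a dedicated challenge.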

Akamai was a different beast

Cloudflare and AWS WAF looked like variants of the same general obfuscator: string table, decode function, local arrays. Annoying, but tractable. Akamai did something else.

Its challenge used JSFuck-style primitives and arithmetic chains to build constants:

// JSFuck-style constant construction
[+!+[]]+[+[]]    // evaluates to "10"

// After constant propagation & hex decode:
// 847 integer constants resolved
// 18,815 substitutions produced
// 784 hex strings decoded
// (from Zara's 542 KB Akamai challenge)
Fig 4. Akamai's obfuscation uses arithmetic expression chains instead of string tables, requiring iterative constant propagation to resolve.
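
A single folding step for JSFuck-style constants can be sketched with Node's `vm` module: find runs made only of the six characters, evaluate each one, and substitute the result. This is an illustration under stated assumptions, not our pipeline; the minimum run length of 5 is arbitrary, and `vm` is not a security boundary, so untrusted scripts need real isolation.

```javascript
const vm = require("node:vm");

// Matches runs built only from the six JSFuck characters: [ ] ( ) + !
const JSFUCK_RUN = /[\[\]()+!]{5,}/g;

// Evaluates each run in a fresh context and replaces it with a literal when the
// result is a usable primitive. Anything that fails to parse or times out is left alone.
function foldConstants(source) {
  return source.replace(JSFUCK_RUN, (expr) => {
    try {
      const value = vm.runInNewContext(expr, {}, { timeout: 50 });
      if (typeof value === "number" && !Number.isFinite(value)) return expr;
      const isPrimitive = ["string", "number", "boolean"].includes(typeof value);
      return isPrimitive ? JSON.stringify(value) : expr;
    } catch {
      return expr;
    }
  });
}
```

Running a step like this repeatedly, together with arithmetic folding, is what "iterative constant propagation" amounts to: each pass exposes new constant expressions for the next.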

The real payoff. Deobfuscation turned the problem from folklore into an engineering model. After decoding, you know what the script reads, what it compares, which APIs are grouped together, and where internal consistency matters. Perfect stealth is the wrong mental model. A coherent, internally consistent browser story is the thing to measure.


05 / 05

The browser fingerprinting toolkit

Browser fingerprinting is no longer one problem. It is a stack. There is the network fingerprint: TLS, HTTP/2 settings, header order, proxy geography. There is the browser fingerprint: canvas, WebGL, AudioContext, plugins, fonts, screen state, timing APIs. There is the automation fingerprint: CDP artifacts, Playwright globals, navigator.webdriver, command-line APIs. And then there is the session fingerprint: behavior, history, IP reputation, and whether the whole thing looks coherent over time.

The useful way to think about the tooling landscape is not “which one wins?” It is “which layer does this make more coherent?”

Fingerprinting problem space → Tooling map:

  ┌────────────────────────┐
  │   Machine Native       │  ← Coherence layer (our bet)
  │   Browser              │    Session protocol · Identity graph
  │                        │    Routing logic · Verification harness
  ├────────────────────────┤
  │   Browser APIs         │  ← Binary-patched browsers
  │   Canvas · WebGL       │    Native code paths, not JS shims
  │   Audio · Plugins      │    Profile systems for consistency
  ├────────────────────────┤
  │   Network Identity     │  ← wreq (Rust + BoringSSL)
  │   TLS · HTTP/2         │    100+ browser-emulation profiles
  │   JA3/JA4              │    Chrome-shaped handshakes
  ├────────────────────────┤
  │   Session & Behavior   │  ← Action models · IP reputation
  │   Mouse · Typing       │    Session history · Return patterns
  │   Scroll · History     │    Proxy geography alignment
  └────────────────────────┘
Fig 1. Each layer of the fingerprinting stack maps to a different class of tooling. No single tool covers the full stack.

Network identity: wreq

For headless HTTP clients—scrapers, agents, test runners—wreq addresses the TLS fingerprinting problem. It is a Rust HTTP library that replaces rustls with BoringSSL and ships 100+ browser-emulation profiles.

With the right profile, wreq can produce a Chrome 137-shaped handshake: correct cipher suite ordering, GREASE values, Chrome-specific extensions such as ALPS and compressed certificates, and matching HTTP/2 SETTINGS.

rustls won’t help here. The maintainers of rustls explicitly closed the issue requesting browser-fingerprint support in January 2026, saying they do not intend to support that use case. For Rust HTTP clients that need browser-shaped TLS, wreq is currently the practical path.

Browser API consistency

JavaScript overrides are fragile. If navigator.webdriver or WebGLRenderingContext.getParameter is changed by injected code, that injection can itself be detected. CreepJS has a whole “lies detector” module that checks function .toString() output, descriptor configurability, prototype chain mutations, and other signs of page-side patching.

Binary-level browser work moves the answer from the page layer into the engine layer. Canvas, WebGL, AudioContext, fonts, GPU strings, screen properties, WebRTC, and network timing can behave like browser internals instead of values painted over after page load.

The machine native browser

The hard part is not just picking the right browser engine. It is making the network identity, browser profile, session state, actions, and observations agree across the whole workflow.

Machine Native Browser — Control Plane Architecture:

  ┌─────────────────────────────────────────────────┐
  │              Session Protocol                    │
  │  Identity graph · Routing logic · Verification  │
  └──────────┬──────────────┬───────────────┬───────┘
             │              │               │
     ┌───────▼──────┐ ┌────▼────────┐ ┌────▼────────┐
     │  wreq        │ │  Patched    │ │  Profile    │
     │  (HTTP)      │ │  Browser    │ │  Context    │
     │              │ │             │ │             │
     │  When full   │ │  When JS    │ │  When       │
     │  browser     │ │  challenges │ │  identity   │
     │  not needed  │ │  require    │ │  coherence  │
     │              │ │  real engine│ │  is critical│
     └──────────────┘ └─────────────┘ └─────────────┘

  The control plane decides which backend to use per request.
  All backends share one identity and one session model.
Fig 2. The machine native browser selects execution backends based on what the session requires, while maintaining a single coherent identity.
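
The per-request decision in the figure can be sketched as a small routing function. Backend names mirror the figure; the request fields (`identityCritical`, `needsJsChallenge`), the precedence between them, and the session shape are hypothetical.

```javascript
// Backend names mirror the figure above; the request fields are hypothetical.
function chooseBackend(request) {
  if (request.identityCritical) return "profile-context";
  if (request.needsJsChallenge) return "patched-browser";
  return "wreq"; // plain HTTP is enough when no full browser is needed
}

// Every request in a session carries the same identity, whichever backend runs it.
function routeSession(session, requests) {
  return requests.map((request) => ({
    url: request.url,
    backend: chooseBackend(request),
    identityId: session.identityId,
  }));
}
```

The point of the second function is the invariant, not the routing: the backend can change from request to request while the identity does not.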

What still doesn’t work

There are still hard gaps:

Gap                          | Why it’s hard
Canvas fidelity              | Hard-coded 1×1 PNGs are detectable. Real fix requires tracking draw-call state.
Long sessions                | WebGL fingerprint spoofing degrades over time. Detection systems maintain expected output hashes per GPU.
IP reputation                | Perfect browser surface on a high-risk DC IP is still a high-risk session.
Behavioral modeling at scale | One plausible session is achievable. A population of realistic sessions is much harder.

How we handle the problem

Our bet is that the machine native browser needs a coherence layer. The underlying engines matter, but the product is the system around them: identity management, session continuity, backend selection, action semantics, failure diagnosis, and verification.

Our approach is to make coherence a first-class system property:

  • Network requests should match the browser identity before HTML is served.
  • Browser identity should be consistent across User-Agent, platform, GPU, fonts, timezone, locale, screen, WebRTC, and storage.
  • JavaScript-visible APIs should come from the right backend layer, not brittle page-level patches.
  • Agent actions should be emitted with realistic timing and recoverable state transitions.
  • Observations should be structured enough for agents to reason over, not just pixels and screenshots.
  • Failures should be diagnosable by layer: network, challenge, browser API, behavior, or site logic.

One control plane, multiple backends. Existing tools prove the market has pain at every layer. Our work is turning those scattered lessons into a machine native browser: one control plane, multiple execution backends, and a single coherent session model. That is the credibility gap we are trying to close.
