StableBrowse Engineering

Reading anti-bot challenge scripts

We built a deobfuscation pipeline to turn anti-bot challenge code from folklore into an engineering model. Here is what we found inside.

At some point, guessing stops being useful.

You can keep adding compatibility patches forever. Adjust navigator.webdriver. Tune a WebGL renderer. Return something plausible from canvas. Tweak the User-Agent. Tweak the timezone. Run it again. Get blocked again. Repeat until you are mostly debugging superstition.

The more useful move is to understand the thing that is judging the browser.

Anti-bot challenges are just JavaScript. Heavily obfuscated JavaScript, yes, but still JavaScript. They run in the browser, inspect the environment, compute some kind of token or fingerprint, and decide whether the session gets a pass cookie.

So we built a deobfuscation pipeline. Not because it is clever for its own sake, but because the challenge code is the closest thing you get to a concrete map of which browser surfaces matter.

What the obfuscation looks like

Open AWS WAF’s challenge.js raw and it is not exactly welcoming:

// Raw obfuscated challenge code
navigator[_0xd558ce(0x5)]
_0x270af7[0x2](document[_0x270af7[0x3]])
a0_0x52cc(0x1a4)
Fig 1. Typical obfuscated challenge code — property names, API calls, and string constants replaced with function calls or array lookups.

That is the whole trick, repeated thousands of times. Property names, API calls, string constants, comparison values: almost everything has been replaced with a function call or an array lookup. The useful strings live in a large encoded table near the top of the file. A decode function maps numeric indices back to strings at runtime.

There is usually a second layer too. The script decodes a batch of strings into a local array, then reads from that array later. So _0x270af7[0x2] might really mean "userAgent", but you need to resolve both the decode function and the local array before that becomes obvious.
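To make the two layers concrete, here is a minimal toy sketch of the pattern. The names (TABLE, OFFSET, decode, local), the base64 encoding, and the index offset are all assumptions chosen for illustration; they are not the scheme any particular vendor uses.

```javascript
// Toy illustration only: the encoding and the offset are hypothetical.
// Layer 1: an encoded string table plus a decode function.
const TABLE = ["dXNlckFnZW50", "dmlzaWJpbGl0eVN0YXRl", "d2ViZHJpdmVy"];
const OFFSET = 0x1a0; // hypothetical index offset applied by the decoder

function decode(index) {
  return Buffer.from(TABLE[index - OFFSET], "base64").toString("utf8");
}

// Layer 2: a batch of decoded strings cached in a local array.
const local = [decode(0x1a0), decode(0x1a1), decode(0x1a2)];

// What the obfuscated script would then write instead of `.userAgent`:
const fakeNavigator = { userAgent: "ExampleAgent/1.0", webdriver: false };
const value = fakeNavigator[local[0x0]];
console.log(value);
```

Resolving local[0x0] by hand means first finding where local was filled and then running the decoder, which is exactly the two-step lookup the pipeline has to automate.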

AWS WAF’s 1.37 MB challenge had roughly 5,000 of these references. Reading it manually would be pointless. The job is not to be patient; the job is to make the machine do the boring part.

The basic strategy

The important observation is simple: the decode function is deterministic. If a0_0x52cc(0x5) returns "navigator" once, it will keep returning "navigator". If you can get the function initialized, you can call it for every index you care about and build a map.

The pipeline ended up being three passes.

Deobfuscation pipeline:

  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
  │   Pass 1     │    │   Pass 2     │    │   Pass 3     │
  │  Instrument  │───▶│  Enumerate   │───▶│  Substitute  │
  │  & Execute   │    │  String Table│    │  Back        │
  └──────────────┘    └──────────────┘    └──────────────┘
        │                    │                    │
  Wrap decode calls    Call decode fn       Replace all refs:
  with recorders.      for range 0..2000.  - direct decode calls
  Execute in sandbox   Build complete       - runtime mappings
  with browser stubs.  string map.          - local array lookups
                                            - hex-escaped strings
Fig 2. Three-pass deobfuscation pipeline. Each pass builds on the previous, progressively resolving obfuscated references.

Pass 1: We instrumented the file. Every suspicious decode call got wrapped with a recorder. Then we executed the challenge inside a JavaScript engine with very forgiving browser stubs. The stubs were not trying to be accurate. They were there to keep the script alive. If the challenge asks for document.createElement('canvas').getContext('2d').getImageData(...), the stub chain returns another “magic” object instead of throwing.
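One way to build that kind of forgiving stub is a recursive Proxy. The sketch below is an assumption about how such a stub could look, not the stubs we actually shipped; makeMagic and fakeDocument are hypothetical names.

```javascript
"use strict";

// Sketch of a "magic" stub: any property read, call, or construction
// yields another magic object, so deep browser API chains never throw.
function makeMagic(path, log) {
  const target = function () {}; // function target keeps the proxy callable
  return new Proxy(target, {
    get(_target, prop) {
      if (prop === Symbol.toPrimitive) return () => `[magic ${path}]`;
      if (typeof prop === "symbol") return undefined;
      log.push(`${path}.${prop}`); // record which surfaces were touched
      return makeMagic(`${path}.${prop}`, log);
    },
    apply() {
      return makeMagic(`${path}()`, log);
    },
    construct() {
      return makeMagic(`new ${path}`, log);
    },
  });
}

const touched = [];
const fakeDocument = makeMagic("fakeDocument", touched);
const pixels = fakeDocument
  .createElement("canvas")
  .getContext("2d")
  .getImageData(0, 0, 1, 1);
console.log(touched);
```

Logging each access path is a design choice worth keeping: even before any decoding, the list of touched surfaces is a rough first map of what the script cares about.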

Pass 2: After initialization, we called the decode function directly across a range of indices. For AWS WAF, enumerating roughly 0..2000 was enough to recover the useful string table.
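The enumeration step can be sketched in a few lines. Here decodeFn is a hypothetical stand-in for the real, already-initialized decode function, and its four-entry table is invented for the example.

```javascript
// Hypothetical stand-in for an initialized decode function.
function decodeFn(index) {
  const table = ["navigator", "userAgent", "webdriver", "platform"];
  if (index < 0 || index >= table.length) throw new RangeError("out of table");
  return table[index];
}

// Call the decode function for every index in a range and keep what succeeds.
function enumerateStrings(decode, start, end) {
  const map = new Map();
  for (let i = start; i < end; i++) {
    try {
      const value = decode(i);
      if (typeof value === "string") map.set(i, value);
    } catch {
      // Out-of-range or unsupported indices are expected; skip them.
    }
  }
  return map;
}

const stringMap = enumerateStrings(decodeFn, 0, 2000);
console.log(stringMap.size, stringMap.get(0x1));
```

Swallowing errors per index keeps the sweep going when the range is wider than the table, which is the normal case when the table size is not known up front.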

Pass 3: We substituted everything back into the source — direct decode calls, recorded runtime mappings, local array lookups, and hex-escaped strings.

// Before deobfuscation:
_0x270af7[0x2](document[_0x270af7[0x3]])

// After deobfuscation:
getUserAgent(document["visibilityState"])

// Resolution stats:
// AWS WAF:    98.7% of ~5,300 obfuscated references resolved
// Cloudflare: 100% of 29 KB challenge decoded cleanly
Fig 3. Before and after deobfuscation. The file stops looking like a wall of nonsense and starts looking like ordinary browser fingerprinting code.
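A minimal sketch of the substitution pass, assuming the decode map and the local array contents were already recovered. It uses regular expressions for brevity; an AST-based transform would be more robust. The function name substitute and the sample mappings (0x5 to "webdriver", 0x3 to "visibilityState") are assumptions for illustration.

```javascript
// Replace direct decode calls and local array lookups with string literals.
function substitute(source, decodeName, decodeMap, arrayName, arrayValues) {
  const esc = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const callRe = new RegExp(`\\b${esc(decodeName)}\\((0x[0-9a-f]+|\\d+)\\)`, "gi");
  const indexRe = new RegExp(`\\b${esc(arrayName)}\\[(0x[0-9a-f]+|\\d+)\\]`, "gi");
  const lookup = (table, match, num) => {
    const value = table.get(Number(num));
    // Leave unresolved references untouched so they stay visible.
    return value === undefined ? match : JSON.stringify(value);
  };
  return source
    .replace(callRe, (m, n) => lookup(decodeMap, m, n))
    .replace(indexRe, (m, n) => lookup(arrayValues, m, n));
}

// Hypothetical mappings, invented for the example.
const decodeMap = new Map([[0x5, "webdriver"]]);
const arrayValues = new Map([[0x3, "visibilityState"]]);
const input = "navigator[_0xd558ce(0x5)]; document[_0x270af7[0x3]];";
const output = substitute(input, "_0xd558ce", decodeMap, "_0x270af7", arrayValues);
console.log(output);
```

Leaving unknown indices as they were, rather than guessing, is what makes a resolution percentage meaningful: whatever still looks obfuscated afterwards is the unresolved remainder.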

What was inside

Once decoded, the checks were not exotic. They were just thorough.

API surface                         What it checks
----------------------------------  ------------------------------
navigator.webdriver                 Automation flag
navigator.userAgent                 Browser identity
navigator.platform                  OS claim
navigator.plugins                   Plugin list
navigator.hardwareConcurrency       CPU thread count
navigator.deviceMemory              RAM estimate
screen.width / screen.height        Display dimensions
canvas 2D context getImageData()    Rendered pixel signature
WebGL UNMASKED_RENDERER_WEBGL       GPU identity
AudioContext sampleRate             Audio stack behavior
crypto.subtle.digest()              Proof-of-work / token hashing
performance.now()                   Timing behavior

None of those is surprising in isolation. The interesting part is how much the challenge cares about consistency. It is not enough to say “I am Chrome on macOS.” The rest of the environment has to agree. If navigator.platform says "MacIntel" but WebGL reports llvmpipe, Mesa's software renderer that usually means a GPU-less Linux machine, that is not a subtle mismatch.

Consistency over perfection. Individual values matter less than the coherence of the whole profile. If the User-Agent says desktop Chrome but the plugin list, screen dimensions, timing behavior, and GPU strings all feel like a headless Linux box, the story falls apart.
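The kind of cross-check described above can be sketched as a small rule set. The rules, the profile shape, and the findInconsistencies name are assumptions for illustration; a real challenge may combine or weigh these signals differently.

```javascript
// Hypothetical coherence rules over a collected browser profile.
const SOFTWARE_RENDERERS = /llvmpipe|swiftshader/i;

function findInconsistencies(profile) {
  const issues = [];
  const claimsMac = profile.platform === "MacIntel";
  if (claimsMac && SOFTWARE_RENDERERS.test(profile.webglRenderer)) {
    issues.push("platform says macOS but WebGL reports a software renderer");
  }
  if (claimsMac !== /Mac OS X/.test(profile.userAgent)) {
    issues.push("navigator.platform and User-Agent disagree about the OS");
  }
  return issues;
}

const macUserAgent =
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

const patchedHeadless = {
  platform: "MacIntel",
  userAgent: macUserAgent,
  webglRenderer: "llvmpipe (LLVM 15.0.7, 256 bits)",
};
console.log(findInconsistencies(patchedHeadless));
```

Note that the profile above passes any single-value check: the platform and the User-Agent agree with each other. It only fails once the GPU string is compared against them.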

Why agents leak more than they think

The challenge script is only part of the picture. Normal page JavaScript can also look for automation framework residue, and agent browsers can leave plenty of it without realizing.

Playwright and Puppeteer talk to the browser through the Chrome DevTools Protocol. That gives page scripts several angles of attack. Some are obvious globals:

// Automation artifacts detectable by page scripts:

window.__playwright__binding__     // Playwright global
window.__pwInitScripts             // Playwright init scripts

Runtime.enable                     // CDP command (Runtime domain); enabling
                                   // it changes browser behavior in detectable ways

getEventListeners()                // Exposed when evaluation runs
                                   // with includeCommandLineAPI: true

Error().stack                      // Stack trace reveals CDP
                                   // evaluation context frames
Fig 4. Automation frameworks leave residue that challenge scripts and page JavaScript can detect through globals, CDP behavior, and stack trace analysis.

Others are more indirect. Runtime.enable is a CDP command, part of the Runtime domain, that Playwright relies on to discover JavaScript execution contexts. The problem is that enabling it changes browser behavior in ways detection scripts can notice. In 2026, it is one of the louder CDP signals.
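The obvious-globals half of this is easy to picture from the page's side. The sketch below only covers the global names listed in Fig 4 plus navigator.webdriver; probeAutomation is a hypothetical name, and it does not attempt the CDP side effects or stack trace analysis.

```javascript
// Sketch of a page-side probe for automation residue.
// In a browser, pass `window`; here a plain object stands in for it.
function probeAutomation(win) {
  const findings = [];
  for (const name of ["__playwright__binding__", "__pwInitScripts"]) {
    if (name in win) findings.push(`global:${name}`);
  }
  if (win.navigator && win.navigator.webdriver === true) {
    findings.push("navigator.webdriver");
  }
  return findings;
}

const automatedWindow = {
  __pwInitScripts: {},
  navigator: { webdriver: true },
};
const ordinaryWindow = { navigator: { webdriver: false } };

console.log(probeAutomation(automatedWindow));
console.log(probeAutomation(ordinaryWindow));
```

Running the same probe against your own agent browser is a cheap way to find residue before a detection script does.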

And then there are the boring inconsistencies that still affect real systems: server timezone is UTC, proxy exit IP is somewhere else, WebRTC leaks a raw host candidate, and the browser fingerprint claims a third location entirely.

Akamai was a different beast

Cloudflare and AWS WAF looked like variants of the same general obfuscator: string table, decode function, local arrays. Annoying, but tractable. Akamai did something else.

Its challenge used JSFuck-style primitives and arithmetic chains to build constants:

// JSFuck-style constant construction
[+!+[]]+[+[]]    // evaluates to "10"

// Akamai deobfuscation required a different pass:
// 1. Find variables with exactly one assignment
// 2. Evaluate the right-hand side when safe
// 3. Substitute the resolved value everywhere
// 4. Repeat until the file stops changing
// 5. Decode remaining hex escapes

// Results on Zara's 542 KB Akamai challenge:
//   847 integer constants resolved
//   18,815 substitutions produced
//   784 hex strings decoded (e.g. '\x65\x6e\x74\x72\x79' → 'entry')
Fig 5. Akamai's obfuscation uses arithmetic expression chains instead of string tables, requiring iterative constant propagation.
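The five steps in Fig 5 can be sketched on a toy input as follows. This is a line-based simplification under stated assumptions: every declaration has the form `var NAME = EXPR;` on its own line, only integer results are kept, and the SAFE_EXPR allowlist is our own invention. A real pass should work on an AST, since the word-boundary substitution here would also rewrite property names and string contents.

```javascript
// Allowlist: digits, arithmetic, and JSFuck punctuation; no identifiers.
// This is NOT a security boundary (JSFuck can express arbitrary code),
// so untrusted input should still be evaluated inside a sandbox.
const SAFE_EXPR = /^[\d\s+\-*\/%()[\]!]+$/;

function tryEvaluate(expr) {
  if (!SAFE_EXPR.test(expr)) return undefined;
  try {
    const value = Function(`"use strict"; return (${expr});`)();
    return Number.isInteger(value) ? value : undefined;
  } catch {
    return undefined;
  }
}

function propagateConstants(source) {
  let lines = source.split("\n");
  const declRe = /^\s*var\s+([A-Za-z_]\w*)\s*=\s*(.+);\s*$/;
  let changed = true;
  while (changed) { // Step 4: repeat until the file stops changing.
    changed = false;
    for (let i = 0; i < lines.length; i++) {
      const decl = declRe.exec(lines[i]);
      if (!decl) continue;
      const [, name, rhs] = decl;
      // Step 1: only variables with exactly one assignment.
      const assignRe = new RegExp(`\\b${name}\\s*=(?!=)`);
      if (lines.filter((l) => assignRe.test(l)).length !== 1) continue;
      // Step 2: evaluate the right-hand side when safe.
      const value = tryEvaluate(rhs);
      if (value === undefined) continue;
      // Step 3: substitute the resolved value into every other line.
      const useRe = new RegExp(`\\b${name}\\b`, "g");
      const literal = value < 0 ? `(${value})` : String(value);
      lines = lines.map((l, j) => {
        if (j === i) return l;
        const replaced = l.replace(useRe, literal);
        if (replaced !== l) changed = true;
        return replaced;
      });
    }
  }
  return lines.join("\n");
}

// Step 5: decode remaining \xNN escapes in the source text.
function decodeHexEscapes(source) {
  return source.replace(/\\x([0-9a-fA-F]{2})/g, (_m, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
}

const sample = [
  "var a = +!+[];",
  "var b = +([+!+[]]+[+[]]);",
  "var c = a + b;",
  String.raw`use(c, '\x65\x6e\x74\x72\x79');`,
].join("\n");
const resolved = decodeHexEscapes(propagateConstants(sample));
console.log(resolved);
```

On this sample the call resolves to use(11, 'entry'), but only because c depends on b, which depends on the JSFuck-style literal. That chain is why a single substitution round is not enough and the loop has to run to a fixed point.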

The decoded targets were exactly the sort of things you would expect from a serious fingerprinting script: media device enumeration, service worker support, platform claims, placeholders, password fields, and form behavior.

Different obfuscator, same lesson: the script is checking whether the browser behaves like a coherent real browser, not just whether one or two famous flags have been handled.

The real payoff

The value of deobfuscation was not that it produced a magic bypass. It did something more useful: it turned the problem from folklore into an engineering model.

Before decoding         After decoding
----------------------  --------------------------------------------------------------------------
Maybe canvas matters    Know exactly which canvas calls are read and how they’re hashed
Maybe WebGL matters     Know which parameters are cross-checked for GPU consistency
Maybe plugins matter    Know the exact plugin list comparisons and expected lengths
Maybe timing matters    Know which timing measurements are taken and what thresholds trigger flags

It also makes the economics clearer. These challenges are not designed to be impossible. They are designed to make abuse expensive and normal browser traffic cheap to classify. A determined team with enough time can usually make one session look plausible. The vendor’s goal is to make doing that reliably, at scale, cost more than the traffic is worth.

Engineering over guessing. Perfect stealth is the wrong mental model. A coherent, internally consistent browser story is the thing to measure. Next: the toolkit — binary patching, TLS impersonation, machine browsers, and what still does not work.