Glossary

Single source of truth for Adler-specific terminology. Definitions here override any informal usage you might see in commit messages, PR descriptions, or chat — when the docs disagree with chat, the docs are right.

A site-level declaration of what the probe path needs: a country, an IP type (datacenter / residential / mobile / tor), and / or a named session. Encoded as access.geo, access.ip_type, and access.session in sites.json. Unconstrained policies (the common case) route through the default egress; constrained ones go through the egress pool.

A registry tag (bot-protected) declaring that a site serves a JavaScript login wall or a Cloudflare challenge to plain HTTP requests, so its response is identical for an existing account and a missing one. Bot-protected sites are routed through the browser backend when one is configured; without it they always return Uncertain.

A real headless Chrome (--browser-backend local) or a Browserbase cloud session (--browser-backend browserbase) that runs JS, accepts cookies, and returns the final post-render DOM. The same detection signals then apply, so bot-protected sites become verdict-able. Bounded by browser budget.

Per-scan cap on browser-routed fetches (--browser-budget N, default 50). Independent of escalation budget: a pre-tagged bot-protected site consumes browser budget; a non-pre-tagged site that escalates from HTTP to browser consumes one of each.
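The interaction between the two budgets can be sketched as a pair of counters. This is a minimal illustration, not Adler's implementation: `ScanBudgets`, `BrowserFetch`, and `try_reserve` are hypothetical names, and only the defaults (50 and 30) come from the documented flags.

```rust
/// Hypothetical per-scan counters; names are illustrative, not Adler's API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ScanBudgets {
    pub browser_left: u32,
    pub escalation_left: u32,
}

impl Default for ScanBudgets {
    fn default() -> Self {
        // Documented defaults: --browser-budget 50, --escalation-budget 30.
        Self { browser_left: 50, escalation_left: 30 }
    }
}

/// Why a browser fetch is being requested.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BrowserFetch {
    /// The site is pre-tagged bot-protected and goes straight to the browser.
    PreTagged,
    /// The site started on HTTP and is being escalated to the browser.
    Escalated,
}

impl ScanBudgets {
    /// Reserve budget for one browser fetch.
    /// Returns false and changes nothing if any required budget is exhausted.
    pub fn try_reserve(&mut self, fetch: BrowserFetch) -> bool {
        let needs_escalation = fetch == BrowserFetch::Escalated;
        if self.browser_left == 0 || (needs_escalation && self.escalation_left == 0) {
            return false;
        }
        self.browser_left -= 1;
        if needs_escalation {
            // An escalated fetch consumes one unit of each budget.
            self.escalation_left -= 1;
        }
        true
    }
}
```

Checking both counters before decrementing either keeps a refused escalation from leaking browser budget.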

The HTTP path’s “no policy, no --proxy-pool match” exit — either direct or via a global --proxy <url>. Sites without an access policy always use it.

Built-in registry health check (adler --doctor). Probes each site’s known-present user (must resolve to Found) and a random nonsense user (must not), reporting any site whose detection signal no longer holds. --doctor --fix diffs the present / absent responses and proposes a corrected signature.

A network exit point — typically a proxy. Each egress spec carries match metadata (country, kind, optional name) and the proxy URL.

Operator-supplied collection of egresses loaded via --proxy-pool <file.toml>. Sites whose access policy declares a geo / IP-type requirement match against the pool; the rest use the default egress. A constrained policy with no matching egress yields Uncertain(geo_unavailable), not NotFound.

One TOML [[egress]] block: a url, optional country (ISO-3166-1 alpha-2, lowercase), optional kind (datacenter / residential / mobile / tor — defaults to datacenter), and optional name (needed for per-scan subset selection in --web).
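To make the matching rules concrete, here is a rough sketch of how a policy's geo / IP-type requirement might be resolved against a pool. The type and function names (`EgressSpec`, `Requirement`, `Selection`, `select_egress`) are hypothetical, and the first-match-wins strategy is an assumption; the glossary does not say how Adler picks among several matching egresses.

```rust
/// Egress kinds listed in the glossary; datacenter is the documented default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Kind {
    #[default]
    Datacenter,
    Residential,
    Mobile,
    Tor,
}

/// Hypothetical in-memory form of one [[egress]] block.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EgressSpec {
    pub url: String,
    pub country: Option<String>, // ISO-3166-1 alpha-2, lowercase
    pub kind: Kind,
    pub name: Option<String>,
}

/// The geo / IP-type part of a site's access policy (hypothetical shape).
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct Requirement {
    pub geo: Option<String>,
    pub ip_type: Option<Kind>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Selection<'a> {
    /// No geo / IP-type requirement: use the default egress.
    DefaultEgress,
    /// A pool entry satisfies the requirement.
    Pool(&'a EgressSpec),
    /// Constrained policy, nothing matches: Uncertain(geo_unavailable).
    GeoUnavailable,
}

/// Assumption: the first matching pool entry wins.
pub fn select_egress<'a>(req: &Requirement, pool: &'a [EgressSpec]) -> Selection<'a> {
    if req.geo.is_none() && req.ip_type.is_none() {
        return Selection::DefaultEgress;
    }
    let hit = pool.iter().find(|e| {
        let geo_ok = req.geo.as_deref().map_or(true, |g| e.country.as_deref() == Some(g));
        let kind_ok = req.ip_type.map_or(true, |k| e.kind == k);
        geo_ok && kind_ok
    });
    match hit {
        Some(e) => Selection::Pool(e),
        None => Selection::GeoUnavailable,
    }
}
```

The unmatched case is a distinct variant rather than a fallback to the default egress, which mirrors the rule that a constrained policy with no matching egress yields Uncertain(geo_unavailable) instead of a misleading NotFound.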

Per-scan filter of the loaded egress pool by name. Selected via egress_names: Vec<String> on POST /api/scan from the SPA’s Advanced filters modal (since v0.11). Sites whose access policy can’t be satisfied by the chosen subset land in Uncertain(geo_unavailable).

Automatic retry of an Uncertain outcome through a heavier transport (typically browser) when the cheap path hit cloudflare_challenge or rate_limited (since v0.10). Triggered only on those two reasons; operator-policy Uncertains (robots_disallowed, session_required, geo_unavailable, username_not_allowed, deadline / scheduler / captcha) are kept as-is so escalation doesn’t waste budget on hopeless cases.
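The trigger rule reduces to a small predicate. The sketch below uses a simplified, hypothetical subset of the reason set (the variant names are guesses at Rust spellings of the snake_case reasons, and detail-carrying variants are omitted); `should_escalate` is an illustrative name, not a function from adler-core.

```rust
/// Simplified stand-in for the reason set; variant names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum UncertainReason {
    RateLimited,
    CloudflareChallenge,
    Captcha,
    RobotsDisallowed,
    Deadline,
    SchedulerClosed,
    SessionRequired,
    GeoUnavailable,
    UsernameNotAllowed,
}

/// Escalate only when escalation is enabled (no --no-escalation) and the
/// cheap path failed for one of the two retryable reasons.
pub fn should_escalate(reason: &UncertainReason, escalation_enabled: bool) -> bool {
    escalation_enabled
        && matches!(
            reason,
            UncertainReason::RateLimited | UncertainReason::CloudflareChallenge
        )
}
```

Budget checks are deliberately left out here; in this sketch they would be a separate step after the predicate says yes.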

Per-scan cap on automatic escalations (--escalation-budget N, default 30). Independent of the browser budget. --no-escalation disables escalation entirely.

In-process TLS-fingerprint-emulating HTTP transport that performs a real BoringSSL handshake matching Chrome 134’s JA3 / JA4 fingerprint (since v0.10). Built via the impersonate Cargo feature; routes TLS-fingerprint-tagged sites through wreq instead of the heavier browser backend.

Optional known_absent field on a Site: a username known to not exist on that site. Used by --doctor to assert the detection signal fires NotFound (or Uncertain) correctly on a guaranteed-absent input.

Required known_present field on a Site: either a single username string or a KnownPresent::Multiple(Vec<String>) of usernames known to exist. --doctor passes the site if any declared username resolves to Found.
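The "passes if any declared username resolves to Found" rule can be sketched as follows. Only `KnownPresent::Multiple(Vec<String>)` is named in this glossary; the `Single` variant name, the local `Verdict` enum, and `doctor_passes_present` are assumptions made for illustration, and the probe is passed in as a closure so the example stays free of network code.

```rust
/// Hypothetical mirror of the known_present field.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum KnownPresent {
    /// Variant name is an assumption; the glossary only says "a single username string".
    Single(String),
    Multiple(Vec<String>),
}

/// Local stand-in for the three-state verdict (reason omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Found,
    NotFound,
    Uncertain,
}

impl KnownPresent {
    /// View both shapes as a slice of usernames.
    pub fn usernames(&self) -> &[String] {
        match self {
            KnownPresent::Single(u) => std::slice::from_ref(u),
            KnownPresent::Multiple(us) => us.as_slice(),
        }
    }
}

/// The present-user half of the doctor check: pass if at least one
/// declared username resolves to Found.
pub fn doctor_passes_present(known: &KnownPresent, probe: impl Fn(&str) -> Verdict) -> bool {
    known
        .usernames()
        .iter()
        .any(|u| probe(u.as_str()) == Verdict::Found)
}
```

Declaring several usernames makes the check tolerant of one account being deleted or renamed, since a single surviving account is enough to pass.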

Adler’s detection model: the HTTP status, body markers, and redirect behaviour are combined into one verdict, Found / NotFound / Uncertain(reason), rather than relying on a single status check. The signals are combined via negative-priority aggregation.

How signals vote: any NotFound vote wins over Found; no votes → Uncertain. Optimised for fewer false positives on sites that return 200 for every username (the common Sherlock failure mode).
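A minimal sketch of the voting rule, assuming each signal either abstains (`None`) or casts one vote. `Vote`, `Verdict`, and `aggregate` are illustrative names, and the Uncertain reason is omitted for brevity.

```rust
/// One signal's vote; a signal that does not fire contributes None.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Vote {
    Found,
    NotFound,
}

/// Local stand-in for the three-state verdict (reason omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Found,
    NotFound,
    Uncertain,
}

/// Negative-priority aggregation: any NotFound vote wins over Found,
/// and no votes at all yields Uncertain.
pub fn aggregate(votes: &[Option<Vote>]) -> Verdict {
    let mut saw_found = false;
    for vote in votes.iter().flatten() {
        match vote {
            // A single negative vote decides the outcome immediately.
            Vote::NotFound => return Verdict::NotFound,
            Vote::Found => saw_found = true,
        }
    }
    if saw_found {
        Verdict::Found
    } else {
        Verdict::Uncertain
    }
}
```

For a site that answers 200 for every username, a status signal would vote Found while a body marker votes NotFound, and the negative vote prevails.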

A registry-level protection declaration that names the specific mechanism a site uses to block bots: tls-fingerprint, cloudflare, captcha, or user-auth. The router infers a default transport from this list: pure tls-fingerprint → impersonate, anything with cloudflare or mixed → browser, user-auth → needs a session. Mixed protections (e.g. tls-fingerprint + cloudflare) stay on the browser path.
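The inference rule might look roughly like the sketch below. Several points are assumptions rather than documented behaviour: an empty list maps to plain http, a captcha-only site is sent to the browser, and user-auth is treated as an orthogonal "needs a session" flag instead of a transport. All type and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Protection {
    TlsFingerprint,
    Cloudflare,
    Captcha,
    UserAuth,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transport {
    Http,
    Impersonate,
    Browser,
}

/// Hypothetical routing decision derived from a site's protection list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RoutePlan {
    pub transport: Transport,
    pub needs_session: bool,
}

pub fn infer_route(protections: &[Protection]) -> RoutePlan {
    // Assumption: user-auth only adds a session requirement.
    let needs_session = protections.contains(&Protection::UserAuth);
    let blocking: Vec<&Protection> = protections
        .iter()
        .filter(|p| **p != Protection::UserAuth)
        .collect();
    let transport = if blocking.is_empty() {
        // Assumption: nothing blocking means the plain HTTP path.
        Transport::Http
    } else if blocking.iter().all(|p| **p == Protection::TlsFingerprint) {
        // Pure tls-fingerprint goes to the impersonate transport.
        Transport::Impersonate
    } else {
        // Cloudflare or any mix stays on the browser path.
        // Assumption: captcha-only is treated the same way.
        Transport::Browser
    };
    RoutePlan { transport, needs_session }
}
```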

Operator-supplied authenticated HTTP headers (typically Cookie, sometimes Authorization / CSRF tokens) applied to probes for sites whose access policy names them (since v0.10). Loaded from a TOML file via --sessions <file>; values are redacted from logs and never written to scan output. A named-but-missing session yields Uncertain(session_required).

One detection rule on a Site: StatusFound { codes }, StatusNotFound { codes }, BodyContains, BodyAbsent, RedirectLocation, etc. A site declares one or more signals; the verdict is the negative-priority aggregation of their votes.

Which underlying transport produced an outcome: http, impersonate, or browser. Stamped on every verdict as CheckOutcome.transport (since v0.10) so downstream tools (doctor, bench harness, SPA’s transport chip, JSON consumers) can tell which path produced each verdict.

The third verdict alongside Found / NotFound. Carries an UncertainReason so the operator can tell why the probe couldn’t reach a binary answer. Adler’s “honest verdicts” identity: never silently degrade an Uncertain to NotFound just because a CDN edge blocked the probe.

The closed set of reasons attached to an Uncertain verdict: rate_limited, cloudflare_challenge, captcha, robots_disallowed, deadline, scheduler_closed, network(detail), body_read(detail), browser_budget, username_not_allowed, browser_failed(detail), geo_unavailable (since v0.9), session_required (since v0.10), other(detail).

The three-state outcome of a probe: Found (account confirmed present), NotFound (account confirmed absent on a working response), or Uncertain. Stored as MatchKind in adler-core.


For embedders: every probe returns a CheckOutcome carrying site, url, kind (the verdict), reason (only when Uncertain), elapsed_ms, transport, escalations, optional enrichment and evidence. Full Rust API on docs.rs/adler-core.