Glossary
Single source of truth for Adler-specific terminology. Definitions here override any informal usage you might see in commit messages, PR descriptions, or chat: when the docs disagree with chat, the docs are right.
Access policy
A site-level declaration of what the probe path needs: a country, an IP
type (datacenter / residential / mobile / tor), and / or a named
session. Encoded as access.geo, access.ip_type, and access.session
in sites.json. Unconstrained policies (the common case) route through
the default egress; constrained ones go through the
egress pool.
Bot-protected
A registry tag (bot-protected) declaring that a site serves a JavaScript
login wall or a Cloudflare challenge to plain HTTP requests, so its
response is identical for an existing account and a missing one.
Bot-protected sites are routed through the browser
backend when one is configured;
without it they always return Uncertain.
Browser backend
A real headless Chrome (--browser-backend local) or a Browserbase
cloud session (--browser-backend browserbase) that runs JS, accepts
cookies, and returns the final post-render DOM. The same detection
signals then apply, so bot-protected sites can receive a
verdict. Browser-routed fetches are bounded by the browser budget.
Browser budget
Per-scan cap on browser-routed fetches (--browser-budget N, default
50). Independent of escalation budget: a
pre-tagged bot-protected site consumes browser
budget; a non-pre-tagged site that escalates from HTTP to browser
consumes one of each.
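The accounting rule above can be sketched as follows. This is a minimal illustration, not Adler's implementation: `Budgets`, `Route`, and `try_spend` are hypothetical names, and refusing a fetch once a counter reaches zero is an assumption about how the cap is enforced.

```rust
/// Remaining per-scan budgets (hypothetical type for illustration).
struct Budgets {
    browser: u32,
    escalation: u32,
}

/// How a site ended up on the browser path.
enum Route {
    /// The site was pre-tagged bot-protected in the registry.
    PreTaggedBrowser,
    /// The site started on HTTP and escalated to the browser.
    EscalatedToBrowser,
}

impl Budgets {
    /// Returns true and decrements the relevant counters if the fetch is allowed.
    fn try_spend(&mut self, route: Route) -> bool {
        match route {
            // A pre-tagged site consumes browser budget only.
            Route::PreTaggedBrowser => {
                if self.browser == 0 {
                    return false;
                }
                self.browser -= 1;
                true
            }
            // An escalated site consumes one unit of each budget.
            Route::EscalatedToBrowser => {
                if self.browser == 0 || self.escalation == 0 {
                    return false;
                }
                self.browser -= 1;
                self.escalation -= 1;
                true
            }
        }
    }
}
```

Starting from the documented defaults (`browser: 50`, `escalation: 30`), one pre-tagged fetch followed by one escalated fetch leaves 48 browser and 29 escalation units.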
Default egress
The HTTP path’s exit when a site has no access policy and matches no
--proxy-pool entry: either direct or via a global --proxy <url>. Sites
without an access policy always use it.
Doctor
Built-in registry health check (adler --doctor). Probes each site’s
known-present user (must resolve to
Found) and a random nonsense user (must not), reporting any
site whose detection signal no longer holds. --doctor --fix diffs the
present / absent responses and proposes a corrected signature.
Egress
A network exit point, typically a proxy. Each egress spec carries match metadata (country, kind, optional name) and the proxy URL.
Egress pool
Operator-supplied collection of egresses loaded via
--proxy-pool <file.toml>. Sites whose access policy
declares a geo / IP-type requirement match against the pool; the rest
use the default egress. A constrained policy with
no matching egress yields Uncertain(geo_unavailable),
not NotFound.
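A rough sketch of the matching described above, under stated assumptions: all type and function names here (`Egress`, `AccessPolicy`, `Routing`, `route`) are hypothetical, the session part of a policy is ignored, and picking the first matching egress is an assumption since the glossary does not say how ties are broken.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Datacenter,
    Residential,
    Mobile,
    Tor,
}

/// One pool entry (simplified; the optional name is omitted).
struct Egress {
    url: String,
    country: Option<String>,
    kind: Kind,
}

/// The geo / IP-type part of a site's access policy.
struct AccessPolicy {
    geo: Option<String>,
    ip_type: Option<Kind>,
}

/// Where a probe is sent, or why it cannot be sent.
enum Routing<'a> {
    DefaultEgress,
    Pool(&'a Egress),
    /// Surfaces as Uncertain(geo_unavailable), never NotFound.
    GeoUnavailable,
}

fn route<'a>(policy: &AccessPolicy, pool: &'a [Egress]) -> Routing<'a> {
    // Unconstrained policies (the common case) use the default egress.
    if policy.geo.is_none() && policy.ip_type.is_none() {
        return Routing::DefaultEgress;
    }
    // A constrained policy needs an egress satisfying every declared constraint.
    let hit = pool.iter().find(|e| {
        let geo_ok = policy
            .geo
            .as_deref()
            .map_or(true, |g| e.country.as_deref() == Some(g));
        let kind_ok = policy.ip_type.map_or(true, |k| e.kind == k);
        geo_ok && kind_ok
    });
    match hit {
        Some(e) => Routing::Pool(e),
        None => Routing::GeoUnavailable,
    }
}
```

The point of the third variant is that a missing egress is reported as an inability to probe, not as evidence that the account is absent.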
Egress spec
One TOML [[egress]] block: a url, optional country (ISO-3166-1
alpha-2, lowercase), optional kind
(datacenter / residential / mobile / tor — defaults to
datacenter), and optional name (needed for per-scan subset
selection in --web).
Egress subset
Per-scan filter of the loaded egress pool by name.
Selected via egress_names: Vec<String> on POST /api/scan from the
SPA’s Advanced filters modal (since v0.11). Sites whose access policy can’t be
satisfied by the chosen subset land in
Uncertain(geo_unavailable).
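As an illustration of filtering by name, here is a small sketch. The `Egress` struct and `subset` function are hypothetical; treating unnamed egresses as unselectable is an assumption, consistent with the note that a name is needed for subset selection.

```rust
/// One pool entry, reduced to the fields this sketch needs.
struct Egress {
    name: Option<String>,
    url: String,
}

/// Keeps only the egresses whose name appears in `egress_names`.
/// Unnamed egresses can never be selected this way.
fn subset<'a>(pool: &'a [Egress], egress_names: &[String]) -> Vec<&'a Egress> {
    pool.iter()
        .filter(|e| e.name.as_ref().map_or(false, |n| egress_names.contains(n)))
        .collect()
}
```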
Escalation
Automatic retry of an Uncertain outcome through a
heavier transport (typically browser) when the
cheap path hit cloudflare_challenge or rate_limited (since v0.10). Only
those two reasons trigger it; operator-policy Uncertains
(robots_disallowed, session_required, geo_unavailable,
username_not_allowed, deadline / scheduler / captcha) are kept as-is
so escalation doesn’t waste budget on hopeless cases.
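The trigger rule can be written as a single predicate. This is a sketch with hypothetical names (`Reason`, `should_escalate`); the enum lists only the reasons named in this entry, and gating on a remaining-budget counter and an enabled flag is an assumption based on the --escalation-budget and --no-escalation options.

```rust
/// Uncertain reasons mentioned in the Escalation entry (illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Reason {
    CloudflareChallenge,
    RateLimited,
    RobotsDisallowed,
    SessionRequired,
    GeoUnavailable,
    UsernameNotAllowed,
    Deadline,
    SchedulerClosed,
    Captcha,
}

/// True only for the two reasons a heavier transport might overcome,
/// and only while escalation is enabled and budget remains.
fn should_escalate(reason: Reason, escalation_enabled: bool, budget_left: u32) -> bool {
    escalation_enabled
        && budget_left > 0
        && matches!(reason, Reason::CloudflareChallenge | Reason::RateLimited)
}
```

Using an allow-list of two reasons, instead of a deny-list of the rest, means any reason added later stays non-escalating by default.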
Escalation budget
Per-scan cap on automatic escalations
(--escalation-budget N, default 30). Independent of the browser
budget. --no-escalation disables escalation
entirely.
Impersonate
In-process TLS-fingerprint-emulating
HTTP transport that performs a real BoringSSL handshake matching
Chrome 134’s JA3 / JA4 fingerprint (since v0.10). Built via the impersonate
Cargo feature; routes TLS-fingerprint-tagged sites
through wreq instead of the heavier browser
backend.
Known-absent
Optional known_absent field on a Site: a username known not to
exist on that site. Used by --doctor to assert the
detection signal fires NotFound (or Uncertain)
correctly on a guaranteed-absent input.
Known-present
Required known_present field on a Site: either a single username
string or a KnownPresent::Multiple(Vec<String>) of usernames known to
exist. --doctor passes the site if any declared
username resolves to Found.
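Combined with the Doctor entry, the pass condition for one site can be sketched like this. The names `Verdict` (simplified, without a reason) and `doctor_passes` are hypothetical, and the function takes already-computed verdicts instead of performing probes.

```rust
/// Simplified three-state verdict; the real Uncertain carries a reason.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Found,
    NotFound,
    Uncertain,
}

/// A site passes when at least one known-present username resolves to Found
/// and the absent (or random nonsense) username does not.
fn doctor_passes(present: &[Verdict], absent: Verdict) -> bool {
    let any_found = present.iter().any(|v| *v == Verdict::Found);
    any_found && absent != Verdict::Found
}
```

For example, `doctor_passes(&[Verdict::Uncertain, Verdict::Found], Verdict::NotFound)` is true, while a site that reports Found for the nonsense username fails regardless of its known-present results.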
Multi-signal detection
Adler’s detection model: the HTTP status, body markers, and redirect
behaviour are combined into one verdict (Found / NotFound /
Uncertain(reason)) rather than relying on a single
status check. Signals are combined via negative-priority
aggregation.
Negative-priority aggregation
How signals vote: any NotFound vote wins over Found; no
votes → Uncertain. Optimised for fewer false positives
on sites that return 200 for every username (the common Sherlock
failure mode).
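A minimal sketch of this voting rule, assuming each signal either abstains or casts a Found / NotFound vote. `Vote`, `Verdict`, and `aggregate` are hypothetical names for illustration, not the adler-core API, and the reason attached to a real Uncertain is omitted.

```rust
/// One signal's vote (hypothetical type; abstaining signals contribute nothing).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Vote {
    Found,
    NotFound,
}

/// Simplified three-state verdict; the real Uncertain carries a reason.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Found,
    NotFound,
    Uncertain,
}

/// Negative-priority aggregation: any NotFound vote beats Found,
/// and an empty set of votes yields Uncertain.
fn aggregate(votes: &[Vote]) -> Verdict {
    if votes.contains(&Vote::NotFound) {
        Verdict::NotFound
    } else if votes.contains(&Vote::Found) {
        Verdict::Found
    } else {
        Verdict::Uncertain
    }
}
```

On a site that answers 200 for every username, a status signal may vote Found while a body marker votes NotFound; under this rule the verdict is NotFound, which is how the false positive is avoided.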
Protection tag
A registry-level protection declaration that names the specific
mechanism a site uses to block bots: tls-fingerprint, cloudflare,
captcha, or user-auth. The router infers a default transport from
this list: pure tls-fingerprint → impersonate,
anything with cloudflare or mixed → browser,
user-auth → needs a session. Mixed protections
(e.g. tls-fingerprint + cloudflare) stay on the browser path.
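The inference rules stated above could be sketched as follows. Everything here is illustrative: `Protection`, `Transport`, `Routing`, and `infer` are hypothetical names, the tag list is assumed to contain no duplicates, and combinations the glossary does not cover (for example captcha alone) return no transport here instead of guessing.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Protection {
    TlsFingerprint,
    Cloudflare,
    Captcha,
    UserAuth,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Transport {
    Impersonate,
    Browser,
}

/// Result of the inference: a default transport (if the rules name one)
/// and whether the site needs an operator-supplied session.
#[derive(Debug, PartialEq, Eq)]
struct Routing {
    transport: Option<Transport>,
    needs_session: bool,
}

fn infer(tags: &[Protection]) -> Routing {
    // user-auth means a session is needed, whatever the transport.
    let needs_session = tags.contains(&Protection::UserAuth);
    let transport = if tags.contains(&Protection::Cloudflare) || tags.len() > 1 {
        // Anything with cloudflare, or any mix of protections, goes to the browser.
        Some(Transport::Browser)
    } else if matches!(tags, [Protection::TlsFingerprint]) {
        // Pure tls-fingerprint goes to the lighter impersonate transport.
        Some(Transport::Impersonate)
    } else {
        // Not specified by the rules above.
        None
    };
    Routing { transport, needs_session }
}
```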
Session
Operator-supplied authenticated HTTP headers (typically Cookie,
sometimes Authorization / CSRF tokens) applied to probes for sites
whose access policy names them (since v0.10). Loaded from a TOML file
via --sessions <file>; values are redacted from logs and never
written to scan output. A named-but-missing session yields
Uncertain(session_required).
Signal
One detection rule on a Site: StatusFound { codes },
StatusNotFound { codes }, BodyContains, BodyAbsent,
RedirectLocation, etc. A site declares one or more signals; the
verdict is the negative-priority
aggregation of their votes.
Transport tier
Which underlying transport produced an outcome: http, impersonate,
or browser. Stamped on every verdict as
CheckOutcome.transport (since v0.10) so downstream tools (doctor,
bench harness, SPA’s transport chip, JSON consumers) can tell which
path produced each verdict.
Uncertain
The third verdict alongside Found / NotFound. Carries an
UncertainReason so the operator can tell why
the probe couldn’t reach a binary answer. Adler’s “honest verdicts”
identity: never silently degrade an Uncertain to NotFound just because
a CDN edge blocked the probe.
Uncertain reasons
The closed set of reasons attached to an Uncertain verdict:
rate_limited, cloudflare_challenge, captcha, robots_disallowed,
deadline, scheduler_closed, network(detail), body_read(detail),
browser_budget, username_not_allowed, browser_failed(detail),
geo_unavailable (since v0.9), session_required (since v0.10), other(detail).
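For readers consuming verdicts from Rust, the closed set maps naturally onto an enum. The sketch below is an assumption about shape, not the adler-core definition: the variant names are derived from the snake_case labels above, the `detail` payloads are assumed to be strings, and `label` is a hypothetical helper.

```rust
/// Mirror of the reasons listed above (illustrative, not the real type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum UncertainReason {
    RateLimited,
    CloudflareChallenge,
    Captcha,
    RobotsDisallowed,
    Deadline,
    SchedulerClosed,
    Network(String),
    BodyRead(String),
    BrowserBudget,
    UsernameNotAllowed,
    BrowserFailed(String),
    GeoUnavailable,
    SessionRequired,
    Other(String),
}

impl UncertainReason {
    /// Returns the snake_case label used in this glossary, without any detail.
    fn label(&self) -> &'static str {
        match self {
            Self::RateLimited => "rate_limited",
            Self::CloudflareChallenge => "cloudflare_challenge",
            Self::Captcha => "captcha",
            Self::RobotsDisallowed => "robots_disallowed",
            Self::Deadline => "deadline",
            Self::SchedulerClosed => "scheduler_closed",
            Self::Network(_) => "network",
            Self::BodyRead(_) => "body_read",
            Self::BrowserBudget => "browser_budget",
            Self::UsernameNotAllowed => "username_not_allowed",
            Self::BrowserFailed(_) => "browser_failed",
            Self::GeoUnavailable => "geo_unavailable",
            Self::SessionRequired => "session_required",
            Self::Other(_) => "other",
        }
    }
}
```

Because the match has no wildcard arm, adding a variant without a label is a compile error, which is one way to keep a closed set closed.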
Verdict
The three-state outcome of a probe: Found (account confirmed
present), NotFound (account confirmed absent on a working
response), or Uncertain. Stored as
MatchKind in adler-core.
CheckOutcome fields
For embedders: every probe returns a CheckOutcome carrying site,
url, kind (the verdict), reason (only when
Uncertain), elapsed_ms, transport, escalations, optional
enrichment and evidence. Full Rust API on
docs.rs/adler-core.