About
What this is
A research observatory for the passive fingerprints a server can compute about connecting clients — the same signal families commercial bot-detection and device-intelligence products use. Fingerprints are computed at the network edge from raw packets and protocol behaviour, never client-side from anything the page asks your browser to compute.
Signal families
| Signal family | Description |
|---|---|
| TCP SYN (p0f) | p0f v3 raw_sig from the TCP handshake: TTL, MSS, window, option layout, quirks. TCP only; QUIC connections have none. |
| TCP SYN (JA4T) | JA4T fingerprint rendered at the edge from the raw TCP SYN record: TCP window, ordered option kinds, MSS and window scale. |
| QUIC transport params | The set of QUIC transport-parameter IDs the client offers (sorted, GREASE-folded) — the QUIC transport-layer analogue of the TCP SYN signature. QUIC only. |
| TLS ClientHello (JA4) | JA4 fingerprint of the TLS ClientHello with the raw ja4_r variant (full cipher / extension / signature-algorithm lists). The transport that carried it is the fingerprint's own first character: t for TCP, q for QUIC. |
| HTTP/2 frames | Akamai-style fingerprint of initial SETTINGS order, WINDOW_UPDATE, PRIORITY and pseudo-header order. Only present on HTTP/2-over-TCP connections; availability is reported in coverage stats. |
| HTTP request (JA4H) | Method, version, header count, cookie and language attributes of the request, with public ja4h_r variants preserving ordered header names and sorted cookie names where available. In stored and displayed population data, cookie values remain represented by the cookie-value hash. |
| User-Agent | The self-declared identity — interesting mostly when it disagrees with the layers below it. |
| Origin (country / ASN) | Not a wire fingerprint: coarse IP metadata derived at ingest, browsable like any other family. |
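The "sorted, GREASE-folded" treatment in the QUIC transport params row can be sketched roughly as follows. RFC 9000 reserves transport-parameter IDs of the form 31 * N + 27 for greasing; the placeholder token, the hex rendering, the separator and the function names are assumptions made for illustration, not this site's actual output format.

```python
GREASE_TOKEN = "grease"  # hypothetical placeholder for any reserved (GREASE) ID


def is_quic_grease(param_id: int) -> bool:
    """True for IDs of the form 31 * N + 27 (reserved by RFC 9000, section 18.1)."""
    return param_id >= 27 and (param_id - 27) % 31 == 0


def fold_transport_params(param_ids) -> str:
    """Render offered transport-parameter IDs as a sorted, GREASE-folded string."""
    # Real parameters: de-duplicated and sorted, so wire order does not matter.
    real = sorted({p for p in param_ids if not is_quic_grease(p)})
    parts = [format(p, "x") for p in real]
    # All GREASE values collapse into one token, so randomised IDs compare equal.
    if any(is_quic_grease(p) for p in param_ids):
        parts.append(GREASE_TOKEN)
    return "-".join(parts)
```

Two clients that offer the same parameters but pick different random GREASE IDs then produce the same string, which is the point of folding.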
How counting works
Observations are recorded in three ways: passively on page views (deduplicated per client+stack within a short window, approximating visits), actively by a browser beacon that probes both transports with a shared correlation id, and via an authenticated ingest API for other collection points. The site's own monitoring and operator traffic is excluded. Hourly per-fingerprint counts and pairwise co-occurrence counts are rolled up continuously; all observation data is pseudonymised before it is stored.
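A minimal sketch of the counting described above, assuming an in-memory store. The window length, the class and field names, and the shape of a "stack" (a mapping from signal family to fingerprint) are all hypothetical; the source only says the window is "short".

```python
from collections import Counter
from itertools import combinations

DEDUP_WINDOW_S = 300  # assumption: the real window length is not stated


class Rollup:
    def __init__(self):
        self.last_counted = {}   # (client, stack) -> timestamp of last counted visit
        self.hourly = Counter()  # (hour, (family, fingerprint)) -> count
        self.pairs = Counter()   # (hour, item_a, item_b) -> co-occurrence count

    def record(self, client_key, stack, ts):
        """Count one observation; return False if it was deduplicated."""
        stack_key = tuple(sorted(stack.items()))
        dedup_key = (client_key, stack_key)
        prev = self.last_counted.get(dedup_key)
        if prev is not None and ts - prev < DEDUP_WINDOW_S:
            return False  # same client and stack inside the window: same visit
        self.last_counted[dedup_key] = ts
        hour = int(ts // 3600)
        for item in stack_key:
            self.hourly[(hour, item)] += 1
        # Every unordered pair of fingerprints seen together in this stack.
        for a, b in combinations(stack_key, 2):
            self.pairs[(hour, a, b)] += 1
        return True
```

Keying deduplication on the stack as well as the client means a client that reconnects with a different fingerprint combination is still counted.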
Bot claim checks
When a User-Agent claims a known crawler or user-triggered fetcher, Thumbprint compares the source IP with the ranges that operator publishes. The current checks cover Googlebot, Bingbot, GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, PerplexityBot, CCBot, Applebot, DuckDuckBot and AhrefsBot.
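The comparison can be sketched with Python's standard `ipaddress` module. The ranges below are documentation prefixes standing in for the lists operators publish, and the substring match on the User-Agent, the function names and the result labels are assumptions rather than Thumbprint's actual logic.

```python
import ipaddress

# Hypothetical stand-ins: documentation prefixes, NOT real operator ranges.
PUBLISHED_RANGES = {
    "Googlebot": ["192.0.2.0/24", "2001:db8:1::/48"],
    "GPTBot": ["198.51.100.0/24"],
}


def claimed_bot(user_agent):
    """Return the first known bot name the User-Agent claims, or None."""
    ua = user_agent.lower()
    for name in PUBLISHED_RANGES:
        if name.lower() in ua:
            return name
    return None


def check_claim(user_agent, source_ip):
    """Classify a request as 'no-claim', 'verified' or 'unverified'."""
    name = claimed_bot(user_agent)
    if name is None:
        return "no-claim"
    addr = ipaddress.ip_address(source_ip)
    for cidr in PUBLISHED_RANGES[name]:
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return "verified"
    return "unverified"
```

An "unverified" result only says the source address is outside the published ranges for the claimed operator; it does not say who actually sent the request.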
Privacy
- No raw IP addresses are stored — only a truncated keyed hash and the address family.
- Coarse IP metadata (country code, ASN and network name) may be derived at ingest and stored in place of the address; the lookup never leaves the server. IP address data is powered by IPinfo (IPinfo Lite, CC BY-SA 4.0).
- A network may be annotated as listed on the Spamhaus DROP / ASN-DROP lists (networks the Spamhaus Project identifies as wholly hijacked or criminal-operated) — shown as provenance, not a verdict, and never stored against your address. Data © The Spamhaus Project.
- A network's character — hosting/datacenter, commercial VPN, Tor exit, iCloud Private Relay, CDN/WAF edge, measured anycast, public-blocklist appearance, operator geolocation, and RIR registration — may be derived at lookup time from a local enrichment overlay and shown with its provenance, as a lead rather than a verdict; it is never stored against your address. Sources include the Tor Project, Apple, the cloud providers, cdncheck, bgp.tools, ipsum, X4BNet, the self-published VPN server APIs, and the RIR delegation files.
- No cookies, no client-side fingerprinting scripts; the only JavaScript is the transparent cross-transport probe on the home page, and the site works without it. The probe caches its results in tab-scoped sessionStorage (cleared when the tab closes) so revisiting the page does not re-probe; nothing persistent is stored client-side.
- User-Agent strings and protocol-level fingerprints are stored verbatim for research.
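As an illustration of the first point in the list above, a "truncated keyed hash plus address family" could look like the sketch below. HMAC-SHA256, the truncation length, the field names and the function name are assumptions; the source does not say which keyed hash or length is used.

```python
import hashlib
import hmac
import ipaddress


def pseudonymise_ip(ip: str, key: bytes, hex_chars: int = 16) -> dict:
    """Return a truncated keyed hash and the address family; the raw IP is not kept."""
    addr = ipaddress.ip_address(ip)
    # Keyed hash over the packed address: without the key, the small IPv4
    # space cannot simply be enumerated to reverse the hash.
    digest = hmac.new(key, addr.packed, hashlib.sha256).hexdigest()
    # Truncation discards part of the digest before anything is stored.
    return {"ip_hash": digest[:hex_chars], "family": f"ipv{addr.version}"}
```

The same address under the same key always maps to the same value, which is what lets visits be deduplicated without keeping the address itself.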
Fingerprint labels
Where shown, the application a JA4 is "identified as" comes from the community JA4+ database by FoxIO, used here under the BSD-3-Clause licence covering JA4 TLS client fingerprinting. JA4 fingerprints a client's TLS stack, not its identity, so different applications built on the same stack (notably the many Chromium-based browsers) legitimately share one JA4 — labels are therefore shown as a distribution, not a verdict. Labels are sourced only for the JA4 family; the snapshot is updated manually.
Tag colour marks provenance throughout the site: violet tags are JA4+ database inferences, green tags are this site's own measured catalog data, and blue tags are values derived directly from the wire.
Controlled-client catalog
The catalog upgrades labels from inference to measurement: known clients (browsers, CLI tools, HTTP libraries, automation engines) are driven through this site's own edge in pre-registered sessions, so each entry records exactly which client produced which multi-signal stack. Sessions are registered through an authenticated API before the run and captured via a single-use URL, so nobody can label their own traffic after the fact. Catalog traffic never enters the population statistics. Where a fingerprint page shows "matches controlled captures", that is this dataset speaking. The catalog is self-generated and exportable (JSON, CSV); Linux container captures share the capture host's TCP signature and network, and automation-bundled engines (e.g. Playwright's) are recorded as distinct from branded builds because that gap is itself research signal.
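The register-then-capture flow can be sketched as below. This is a toy in-memory model: the class, the `/capture/` path and the record shape are hypothetical, and the real API is authenticated, which the sketch leaves out.

```python
import secrets


class SessionRegistry:
    def __init__(self):
        self._pending = {}  # token -> client label declared before the run

    def register(self, client_label: str) -> str:
        """Declare the client up front and get back a single-use capture URL."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = client_label
        return f"/capture/{token}"  # hypothetical path

    def capture(self, token: str, observed_stack: dict):
        """Bind the observed stack to the pre-registered label, at most once."""
        label = self._pending.pop(token, None)
        if label is None:
            return None  # unknown or already-used token: nothing is labelled
        return {"client": label, "stack": observed_stack}
```

Because the label is fixed at registration and the token is consumed on first use, traffic that has already been observed cannot be labelled retroactively.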
Probe origins
Browser catalog sessions use named probe origins so one declared client can be measured across transports with the same correlation id.
- h2.thumbprint.me: the TCP/HTTP/2 mirror.
- h3.thumbprint.me: the QUIC/HTTP/3 mirror.
- r.thumbprint.me: the catalog-only TCP/TLS resumption probe.
- rq.thumbprint.me: the catalog-only QUIC/TLS resumption probe.
The resumption probes deliberately issue tickets and close connections so a second request can expose resumed ClientHellos. Ordinary observatory hosts keep automatic TLS 1.3 tickets disabled, so those server-induced pre_shared_key variants stay isolated to the resumption probes.
Prior art
This project is inspired by tlsfingerprint.io (Frolov & Wustrow, "The use of TLS in Censorship Circumvention", NDSS 2019), generalized from TLS ClientHellos to the full transport/application stack, with JA4-suite fingerprint formats by FoxIO.