How certstream works.
The path a certificate takes, from a Certificate Transparency log to your client.
End-to-end data flow
From CT logs to WebSocket and SSE clients.
Zero-copy, lock-free pipeline
Every certificate is serialized exactly once using simd-json (SIMD-accelerated, enabled
by default) into three pre-built Bytes payloads (full, lite,
domains_only), then wrapped in an Arc<PreSerializedMessage>. Each
subscriber receives a pointer clone and a zero-copy Utf8Bytes text frame — no
re-serialization, no per-client JSON re-encoding. When no clients are connected the serialize step
is skipped entirely via a receiver_count() == 0 guard. Shared state lives in
DashMap maps throughout (internally sharded, so operations on different keys rarely contend); there are no global read/write mutexes in the hot path. Under load
(100 WebSocket clients, 10-minute plateau), RSS holds steady at roughly 118 MiB with a ±5 MiB swing.
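The serialize-once idea and the empty-audience guard can be sketched with the standard library alone. This is an illustration under stated assumptions, not the server's code: Hub, subscribe, and publish are hypothetical names, std::sync::mpsc stands in for the real broadcast channel, and Vec<u8> stands in for Bytes.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;

/// Three payloads built once per certificate (Vec<u8> stands in for `Bytes`).
pub struct PreSerializedMessage {
    pub full: Vec<u8>,
    pub lite: Vec<u8>,
    pub domains_only: Vec<u8>,
}

/// Hypothetical fan-out hub; the server's actual broadcast type is not shown here.
#[derive(Default)]
pub struct Hub {
    subscribers: Vec<Sender<Arc<PreSerializedMessage>>>,
}

impl Hub {
    pub fn subscribe(&mut self) -> Receiver<Arc<PreSerializedMessage>> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    pub fn receiver_count(&self) -> usize {
        self.subscribers.len()
    }

    /// Runs `serialize` at most once per certificate, and not at all when
    /// nobody is listening. Returns true if a message was built and sent.
    pub fn publish<F>(&mut self, serialize: F) -> bool
    where
        F: FnOnce() -> PreSerializedMessage,
    {
        if self.receiver_count() == 0 {
            return false; // no clients: skip the serialize step entirely
        }
        let msg = Arc::new(serialize());
        // Each subscriber receives a clone of the Arc pointer, not of the bytes.
        // Subscribers whose receiving end was dropped are removed.
        self.subscribers.retain(|tx| tx.send(Arc::clone(&msg)).is_ok());
        true
    }
}
```

Two receivers that read the same broadcast end up holding pointers to the same allocation, which Arc::ptr_eq can confirm.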
CT log polling
How certificates are fetched from classic Certificate Transparency logs.
-
Fetch the log lists
On startup the server fetches the Chrome-trusted log list from Google and the Apple log list in parallel, filters out rejected and retired logs, dedupes by log ID, and merges in any custom logs from config.
-
Spawn watchers
Each CT log gets its own async task, spawned with tokio::spawn, and the tasks run independently of one another. If a state file exists, each watcher resumes from its saved position.
-
Poll tree size
The watcher calls /ct/v1/get-sth to read the current tree size and compares it with the tracked position to find new entries.
-
Fetch entries
Entries are fetched in batches via /ct/v1/get-entries?start=X&end=Y. Batch size is configurable (default: 256), and the index only advances by the number of entries actually returned.
-
Track health
Consecutive failures move a log through Healthy → Degraded → Unhealthy. Unhealthy logs pause behind a circuit breaker with exponential backoff.
-
Save state
After each batch the position is recorded so the server can resume cleanly after a restart.
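The arithmetic behind the polling steps above can be sketched as a few pure functions. The function names and the health thresholds (DEGRADED_AFTER, UNHEALTHY_AFTER) are assumptions made for illustration; the document does not state the real thresholds or backoff parameters. The one protocol fact relied on is that get-entries takes an inclusive, zero-based start and end.

```rust
/// Inclusive (start, end) range for the next get-entries call,
/// or None when the watcher has caught up with the tree.
pub fn next_batch(current_index: u64, tree_size: u64, batch_size: u64) -> Option<(u64, u64)> {
    if batch_size == 0 || current_index >= tree_size {
        return None;
    }
    let end = current_index.saturating_add(batch_size).min(tree_size) - 1;
    Some((current_index, end))
}

/// Advance only by what the log actually returned; a log may return
/// fewer entries than were requested.
pub fn advance(current_index: u64, returned: usize) -> u64 {
    current_index + returned as u64
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Healthy,
    Degraded,
    Unhealthy,
}

// Hypothetical thresholds, chosen only for this sketch.
pub const DEGRADED_AFTER: u32 = 3;
pub const UNHEALTHY_AFTER: u32 = 10;

/// Maps a count of consecutive failures to a health state.
pub fn health_for(consecutive_failures: u32) -> Health {
    if consecutive_failures >= UNHEALTHY_AFTER {
        Health::Unhealthy
    } else if consecutive_failures >= DEGRADED_AFTER {
        Health::Degraded
    } else {
        Health::Healthy
    }
}

/// Exponential backoff: base * 2^attempt, capped at max_secs.
pub fn backoff_secs(attempt: u32, base_secs: u64, max_secs: u64) -> u64 {
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    base_secs.saturating_mul(factor).min(max_secs)
}
```

With the default batch size of 256 and a tree of 1,000 entries, a watcher at index 900 would request entries 900 through 999 and then move forward by however many entries came back.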
Static CT protocol
Checkpoint + tile-based fetching for next-generation CT logs.
Why static CT?
Let's Encrypt has retired its RFC 6962 logs in favor of static, tile-based logs, where the tree is
served as immutable tiles instead of dynamic get-entries calls. Chrome has accepted
static-ct-api logs since April 2025, and the format is now where most new logs are headed.
-
Fetch checkpoint
Polls /checkpoint for the current tree size. Checkpoints are signed text files carrying origin, tree size, and root hash.
-
Calculate tile range
Each tile holds 256 entries. The watcher compares its current index with the tree size to work out which tiles to fetch, and validates the width of the last tile when it is only partially filled.
-
Fetch tile data
Tiles are downloaded from /tile/data/<path> using hierarchical path encoding (e.g. x001/234 for tile 1234). Tiles may be gzip-compressed, with a hard cap on decompressed size.
-
Parse binary entries
A binary parser extracts the timestamp, entry type (x509 / precert), DER certificate, and chain fingerprints from each entry in the tile.
-
Fetch issuer certificates
Chain certificates are referenced by SHA-256 fingerprint, fetched from /issuer/<hex> and stored in a single shared, DashMap-based issuer cache.
-
Dedup and broadcast
Certificates pass through the cross-log dedup filter before being serialized once and broadcast to all clients.
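The tile-range and path-encoding steps above can be sketched as follows. The function names are hypothetical. The .p/<width> suffix for partial tiles follows the tiled-log path convention as understood here, and should be treated as an assumption about what the watcher actually requests.

```rust
pub const TILE_WIDTH: u64 = 256;

/// Encodes a tile index as groups of three decimal digits, with every group
/// except the last prefixed by `x` (tile 1234 becomes "x001/234").
pub fn tile_path(mut index: u64) -> String {
    let mut segments = vec![format!("{:03}", index % 1000)];
    index /= 1000;
    while index > 0 {
        segments.push(format!("x{:03}", index % 1000));
        index /= 1000;
    }
    segments.reverse();
    segments.join("/")
}

/// Tiles that still hold unread entries, as (tile_index, width).
/// Width is 256 for a full tile and 1..=255 for a partial last tile.
pub fn tiles_to_fetch(current_index: u64, tree_size: u64) -> Vec<(u64, u64)> {
    let mut tiles = Vec::new();
    if current_index >= tree_size {
        return tiles;
    }
    let first = current_index / TILE_WIDTH;
    let last = (tree_size - 1) / TILE_WIDTH;
    for tile in first..=last {
        let entries_before = tile * TILE_WIDTH;
        let width = (tree_size - entries_before).min(TILE_WIDTH);
        tiles.push((tile, width));
    }
    tiles
}

/// URL path of a data tile; a partial tile carries its width as a suffix.
pub fn data_tile_url_path(tile: u64, width: u64) -> String {
    if width == TILE_WIDTH {
        format!("/tile/data/{}", tile_path(tile))
    } else {
        format!("/tile/data/{}.p/{}", tile_path(tile), width)
    }
}
```

For a tree of 600 entries read from index 0, this yields two full tiles and one partial tile of width 88, since 600 - 2 × 256 = 88.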
Cross-log deduplication
One certificate, many logs — collapsed to a single broadcast.
How it works
The same certificate often appears in multiple CT logs at once. The dedup filter uses a
DashMap<[u8; 32], Instant> keyed by the raw 32-byte SHA-256 digest stored directly
in LeafCert::sha256_raw — a fixed-size, stack-allocated key that avoids one heap
allocation per lookup compared with a String key. The first occurrence passes through;
duplicates within the TTL window are discarded. The default window is 900 seconds (15 minutes) with
a default capacity of 200K entries, both tunable. A background task prunes expired entries every 60
seconds rather than wiping the cache on overflow.
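A minimal sketch of the TTL filter described above, using only the standard library. A Mutex<HashMap> stands in for the DashMap so the example has no dependencies; DedupFilter, is_new, and prune are hypothetical names, the capacity limit is left out, and the current time is passed in explicitly so the behavior is easy to test.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

pub struct DedupFilter {
    // Keyed by the raw 32-byte SHA-256 digest: a fixed-size array, not a String.
    seen: Mutex<HashMap<[u8; 32], Instant>>,
    ttl: Duration,
}

impl DedupFilter {
    pub fn new(ttl: Duration) -> Self {
        Self { seen: Mutex::new(HashMap::new()), ttl }
    }

    /// Returns true when the certificate should be broadcast, i.e. it has not
    /// been seen within the TTL window ending at `now`.
    pub fn is_new(&self, sha256_raw: [u8; 32], now: Instant) -> bool {
        let mut seen = self.seen.lock().unwrap();
        let is_duplicate = seen
            .get(&sha256_raw)
            .map_or(false, |&first| now.saturating_duration_since(first) < self.ttl);
        if is_duplicate {
            return false;
        }
        seen.insert(sha256_raw, now);
        true
    }

    /// Removes expired entries and returns how many were dropped.
    /// The background task would call this on a timer (every 60 s in the text above).
    pub fn prune(&self, now: Instant) -> usize {
        let mut seen = self.seen.lock().unwrap();
        let before = seen.len();
        let ttl = self.ttl;
        seen.retain(|_, first| now.saturating_duration_since(*first) < ttl);
        before - seen.len()
    }
}
```

Pruning on a timer keeps recent digests in place, whereas clearing the whole map on overflow would briefly let duplicates through.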
Certificate parsing
X.509 decoding, from CT entry to extracted fields.
MerkleTreeLeaf structure (RFC 6962)
Byte 0 Version ·
Byte 1 LeafType ·
Bytes 2–9 Timestamp ·
Bytes 10–11 EntryType (0 = X509, 1 = Precert) ·
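Bytes 12–14 Cert length (X509 entries) ·
Byte 15+ DER certificate (X509 entries; a precert leaf carries a 32-byte issuer key hash followed by a length-prefixed TBSCertificate here instead)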
Precert extra_data (RFC 6962)
3 bytes pre-certificate length ·
variable pre-certificate (X509 with CT poison extension) ·
3 bytes chain length ·
variable certificate chain
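The two layouts above translate into a small bounds-checked parser. This is a sketch only: the type and function names are hypothetical, and precert leaves are recognized but their TBSCertificate is not decoded.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntryType {
    X509,
    Precert,
}

#[derive(Debug, PartialEq, Eq)]
pub struct ParsedLeaf<'a> {
    /// Milliseconds since the Unix epoch.
    pub timestamp_ms: u64,
    pub entry_type: EntryType,
    /// DER certificate for X509 entries; None for precert entries.
    pub x509_der: Option<&'a [u8]>,
}

/// Reads a 3-byte big-endian length at `offset`, or None if out of bounds.
fn read_u24(buf: &[u8], offset: usize) -> Option<usize> {
    let b = buf.get(offset..offset + 3)?;
    Some(((b[0] as usize) << 16) | ((b[1] as usize) << 8) | (b[2] as usize))
}

pub fn parse_leaf(leaf: &[u8]) -> Option<ParsedLeaf<'_>> {
    // Byte 0: version (0 = v1). Byte 1: leaf type (0 = timestamped entry).
    if leaf.len() < 12 || leaf[0] != 0 || leaf[1] != 0 {
        return None;
    }
    // Bytes 2..10: big-endian timestamp.
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&leaf[2..10]);
    let timestamp_ms = u64::from_be_bytes(ts);
    // Bytes 10..12: entry type.
    let entry_type = match u16::from_be_bytes([leaf[10], leaf[11]]) {
        0 => EntryType::X509,
        1 => EntryType::Precert,
        _ => return None,
    };
    let x509_der = match entry_type {
        EntryType::X509 => {
            // Bytes 12..15: certificate length, then the DER bytes.
            let len = read_u24(leaf, 12)?;
            Some(leaf.get(15..15 + len)?)
        }
        // The precert itself is taken from extra_data instead.
        EntryType::Precert => None,
    };
    Some(ParsedLeaf { timestamp_ms, entry_type, x509_der })
}

/// Extracts the pre-certificate (first field) from a precert entry's extra_data.
pub fn precert_from_extra_data(extra: &[u8]) -> Option<&[u8]> {
    let len = read_u24(extra, 0)?;
    extra.get(3..3 + len)
}
```

Every slice access goes through get, so a truncated or malformed entry produces None rather than a panic.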
Extracted fields
Subject / Issuer: CN, O, C, L, ST, OU, Email
Hashes: SHA1, SHA256, fingerprint
Validity: not_before, not_after
Extensions: SubjectAltName, KeyUsage, BasicConstraints
Domains: collected from CN + SAN DNS entries
Pre-serialization
Serialize once, broadcast to everyone.
Serialize-once, broadcast-many
Instead of serializing each message per connected client, certstream serializes once per
certificate into three pre-built byte payloads (full, lite,
domains_only), wraps them in an Arc<PreSerializedMessage>, and clones
only the Arc pointer to every subscriber — zero re-serialization, zero extra heap allocation per
client. With 10,000 clients that is one serialization instead of 10,000. Slow consumers cannot
stall the broadcast channel: lagging clients skip messages rather than blocking others, and a client
that falls too far behind is disconnected.
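The lag policy can be illustrated with bounded per-client queues from the standard library. This sketch is an assumption about the shape of the behavior, not the server's mechanism: Fanout, MAX_LAG, and the per-client queue design are hypothetical, and Arc<[u8]> stands in for one pre-serialized payload.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::sync::Arc;

/// Hypothetical limit: consecutive messages a client may miss before it is dropped.
pub const MAX_LAG: u32 = 3;

struct Client {
    tx: SyncSender<Arc<[u8]>>,
    lag: u32,
}

#[derive(Default)]
pub struct Fanout {
    clients: Vec<Client>,
}

impl Fanout {
    /// Registers a client with a bounded queue of `queue_depth` messages.
    pub fn subscribe(&mut self, queue_depth: usize) -> Receiver<Arc<[u8]>> {
        let (tx, rx) = sync_channel(queue_depth);
        self.clients.push(Client { tx, lag: 0 });
        rx
    }

    pub fn client_count(&self) -> usize {
        self.clients.len()
    }

    /// Never blocks. A full queue means the client skips this message; a client
    /// that misses more than MAX_LAG messages in a row is disconnected.
    pub fn broadcast(&mut self, payload: Arc<[u8]>) {
        self.clients.retain_mut(|c| match c.tx.try_send(Arc::clone(&payload)) {
            Ok(()) => {
                c.lag = 0;
                true
            }
            Err(TrySendError::Full(_)) => {
                c.lag += 1;
                c.lag <= MAX_LAG
            }
            Err(TrySendError::Disconnected(_)) => false,
        });
    }
}
```

Because try_send returns immediately, one stalled client costs the sender a failed enqueue and nothing more, while every other client keeps receiving the same shared payload.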
Stream formats
Three payload shapes for three use cases.
full
Complete certificate data including chain and DER-encoded certificate.
lite
Certificate metadata without chain or DER data. Best for most use cases.
domains_only
Just the domain names array. Minimal bandwidth.
State persistence
Resume from the last position after a restart.
State structure
Each CT log's state tracks current_index (last processed entry),
tree_size (known tree size), and last_success (timestamp for health
tracking). State is persisted periodically (every 30 s) and on shutdown to a JSON file, enabling
zero-loss restarts. Both RFC 6962 and static CT positions are tracked.
Atomic dirty flag
The dirty flag uses an AtomicBool instead of a lock, so state updates are never
silently dropped. State is flushed on graceful shutdown (SIGINT / SIGTERM) and when the periodic
save task is cancelled. Persistence is on by default with
state_file: "certstream_state.json".
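One way an atomic dirty flag can avoid losing updates is to clear it with a single swap before writing, so a change that arrives during the write sets the flag again for the next tick. The sketch below shows that pattern under assumptions: DirtyFlag, mark, and flush_if_dirty are hypothetical names, and the actual JSON write is represented by a closure.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Default)]
pub struct DirtyFlag(AtomicBool);

impl DirtyFlag {
    /// Called by a watcher after it advances a log position.
    pub fn mark(&self) {
        self.0.store(true, Ordering::Release);
    }

    /// Called by the periodic save task and on shutdown.
    /// Atomically clears the flag and runs `write` only if something changed.
    /// If `write` fails, the flag is set again so the next tick retries.
    /// Returns Ok(true) when a write happened, Ok(false) when nothing was dirty.
    pub fn flush_if_dirty<E>(
        &self,
        write: impl FnOnce() -> Result<(), E>,
    ) -> Result<bool, E> {
        // swap is one atomic step: a concurrent mark() lands either before it
        // (and is covered by this write) or after it (and stays set for the next flush).
        if !self.0.swap(false, Ordering::AcqRel) {
            return Ok(false);
        }
        if let Err(e) = write() {
            self.0.store(true, Ordering::Release);
            return Err(e);
        }
        Ok(true)
    }
}
```

A separate load followed by a store could overwrite a mark() that arrived in between, which is the failure the single swap is meant to rule out.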