This document describes how to mirror a transparency log, and how to obtain signatures asserting that a mirror has done so.
The base64 encoding used throughout is the standard Base 64 encoding specified in RFC 4648, Section 4.
U+ followed by four hexadecimal characters denotes a Unicode codepoint, to be
encoded in UTF-8. 0x followed by two hexadecimal characters denotes a byte
value in the 0-255 range.
[start, end), where start <= end, denotes the half-open interval containing
integers x such that start <= x < end.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 RFC 2119 RFC 8174 when, and only when, they appear in all capitals, as shown here.
A mirror is a cosigner that stores a copy of a log. A mirror's cosignature makes the additional statement that the mirror has durably logged the contents of the checkpoint and made them accessible.
A mirror is defined by a name, a public key, and by two URL prefixes: the submission prefix for write APIs and the monitoring prefix for read APIs. A mirror MAY use the same value for both the submission prefix and the monitoring prefix.
For each supported origin log, the mirror is configured with:
- The log's public key
- The log's URL prefix
- A minimum index to start mirroring (see below for how this is configured)
- An optional list of monitoring prefixes for other mirrors for the log
The mirror maintains a copy of each origin log and serves it publicly via the
tiled transparency log interface. It uses a URL prefix of
<monitoring prefix>/<encoded origin>, where encoded origin is the log's
origin, percent-encoded. The checkpoint served from this prefix MUST include
a cosignature from the mirror.
The mirror update process is designed to be safely interruptible, while avoding large atomic operations. For each origin log, a mirror maintains the following:
-
A copy of the log, pruned to its minimum index. The latest checkpoint of this log copy is known as the mirror checkpoint.
-
A pending checkpoint, which is at or ahead of the mirror checkpoint. If ahead of the mirror checkpoint, the pending checkpoint describes entries that have yet to be incorporated into the mirror.
-
A list of pending entries that have yet to be incorporated into the mirror checkpoint. The mirror's next entry is the log index of the first entry, greater or equal to the minimum index, that is not in either the log or pending entry list.
The update process ensures that all current and past pending checkpoints are consistent, and all pending entries are contained in the current pending checkpoint. Thus mirrors MAY commit pending entries to the log, serving them as entry bundles, as soon as they are added. That is, a mirror MAY use the same underlying storage for entry bundles and pending entries, without distinguishing between them. It is expected that most tiled log implementations will do this.
Mirrors update in three stages:
-
A mirror client updates the pending checkpoint with a signed checkpoint and a consistency proof.
-
A mirror client uploads new entries to the pending entry list, up to the pending checkpoint.
-
The mirror commits the pending entries to the log and updates the mirror checkpoint.
The next sections describe the HTTP endpoints used by mirror clients to update the log.
The mirror implements a witness's add-checkpoint endpoint to update its
pending checkpoint for a log:
POST <submission prefix>/add-checkpoint
The request is handled identically to that of a witness, updating the pending checkpoint (but not the mirror checkpoint), with the exception that it does not need to generate and respond with any cosignatures. The mirror MAY handle the request by internally updating the pending checkpoint and responding with an empty response body. The mirror MUST retain the log's signature in the pending checkpoint.
The mirror cosigner MUST NOT sign the checkpoint in this process. It MAY respond with witness cosignatures if the mirror operator wishes to additionally provide a separate witness service using its pending checkpoint. If so, this witness service MUST be a distinct cosigner from the mirror cosigner, with a distinct name. The mirror's signature is computed later, as described below.
The mirror implements an add-entries endpoint to upload entries for a supported
log:
POST <submission prefix>/add-entries
The request body MUST have Content-Type of application/octet-stream and
contain the following values, concatenated.
- 2 bytes, encoding a big-endian uint16:
log_origin_size log_origin_sizebytes, containing the log origin:log_origin- 8 bytes, encoding a big-endian uint64:
upload_start - 8 bytes, encoding a big-endian uint64:
upload_end - 2 bytes, encoding a big-endian uint16:
ticket_size ticket_sizebytes, containing an opaqueticketvalue, described below- A sequence of entry packages, described below
upload_end MUST be equal to the pending checkpoint's tree size, or that of a
previously valid pending checkpoint. ticket is an opaque value from the
mirror, or the empty string, to help the mirror recover past pending
checkpoints.
upload_start MUST be less or equal to upload_end. It MUST also be less or
equal to the mirror's next entry for the origin.
The request body uploads the log entries whose indices are in
[upload_start, upload_end). Entries are grouped into bounded-size entry
packages. Each package has a subtree consistency proof that allows the
mirror to verify and commit the entries in the package without buffering the
entire request body.
Each entry package is determined by a half-open interval [start, end) of log
indices. The request MUST contain entry packages whose intervals' disjoint
union, in order, is the interval [upload_start, upload_end). The overall
interval MUST be divided into packages at multiples of 256, to align with the corresponding entry bundles.
More precisely, if upload_start is equal to upload_end, there are no
packages. Otherwise, let rounded_start be upload_start rounded down to a
multiple of 256, and let rounded_end be upload_end rounded up to a
multiple of 256. The request MUST contain
num_packages = (rounded_end - rounded_start) / 256 packages. Entry
package i, for 0 <= i < num_packages, MUST be computed from the interval
[start, end) where:
start = max(upload_start, rounded_start + i * 256)
end = min(upload_end, rounded_start + (i + 1) * 256)
The package MUST contain the following values, concatenated.
- The log entries in
[start, end), each with a big-endian uint16 length prefix - 1 byte, encoding an 8-bit unsigned integer,
num_hashes, which MUST be at most 63 num_hashessubtree consistency proof hash values
The subtree consistency proof is computed from the subtree defined by
[rounded_start + i * 256, end), and the log checkpoint with tree size
upload_end.
TODO: This is currently citing an individual IETF draft for subtrees, though it is versioned and immutable. Should we, for now, copy and paste that text somewhere here? (Subtrees are also slightly more general than needed here. Every subtree we consider is directly contained in the target tree size.)
The request body has unbounded size, so the client and mirror SHOULD stream it.
The mirror processes the request as follows:
First, the mirror reads log_origin, upload_start, upload_end, and
ticket. If log_origin is not a known log, the mirror MUST respond with a
"404 Not Found" HTTP status code. If upload_end is not the tree size of a
known pending checkpoint value, the mirror MUST respond with a "409 Conflict"
HTTP status code. If upload_start is greater than the mirror's next entry, or
too far below the mirror's next entry, the mirror MUST also respond with a
"409 Conflict" HTTP status code.
The mirror SHOULD send these error responses without waiting for the entire request body to be available. Conversely, the client SHOULD be prepared to receive an error response before the request body is fully sent.
When sending a 409 response, the response body MUST have a Content-Type of
text/x.tlog.mirror-info and consist of three lines, each followed by a
newline (U+000A):
- The tree size of a valid pending checkpoint, in decimal
- The next entry, in decimal
- An opaque, possibly zero length, ticket value, encoded in base64
If the client's upload_end value was valid, the first line SHOULD contain
upload_end. This allows the client to resume an interrupted upload without
recomputing subtree consistency proofs. Otherwise, the first line SHOULD be the
tree size of the current pending checkpoint.
After receiving a 409 Conflict, the client SHOULD retry setting upload_end to
the tree size, upload_start to the advertised next entry value, and the
ticket to the received ticket. If a client doesn't have information on the
mirror, it MAY initially make a request with upload_start and upload_end set
to zero to obtain it.
To reduce the chance of retry failures as the mirror state changes, mirrors
SHOULD accept any of the last several pending checkpoint values as upload_end.
This MAY be implemented with extra state, or by storing the signed checkpoint in
the ticket. The mirror MUST authenticate any information it derives from a
ticket. For example, the ticket MAY be encrypted with a symmetric secret known
only to the mirror.
If upload_end and upload_start are valid, the mirror proceeds to read and
process each entry package. For each entry package, it MUST authenticate the
entries by verifying the subtree consistency proof: First, it reconstructs the
subtree hash based on the received entries and entries already in the log. It
then verifies the subtree consistency proof using this hash and the checkpoint
at upload_end.
If this verification process fails, it MUST respond with a "422 Unprocessable Entity" HTTP status code and end processing. Otherwise, it saves the entries as pending entries. If some entry has already been written to the log or the pending entry list, the mirror MUST skip saving that entry.
Once all expected entry packages are successfully validated and committed, the
next entry will be greater or equal to upload_end. The mirror then finishes
committing entries up to upload_end to the log. For example, a mirror that
stores individual tiles might compute new tiles and start serving them.
Finally, the mirror performs the following steps atomically:
- Check if the mirror checkpoint's tree size is less than
upload_end - If so, sign the pending checkpoint of size
upload_endand update the mirror checkpoint to the newly-signed checkpoint.
Unlike the add-checkpoint endpoint, the add-entries endpoint is not
processed as a single atomic transaction. A mirror SHOULD permit multiple
clients to concurrently send requests to the endpoint. This avoids a
denial-of-service attack if one client begins an add-entries stream but pauses
it partway through. Additionally, clients and mirrors MUST continue operating
correctly if an add-entries stream is interrupted. The API is designed to
support this with minimal synchronization.
When checking the upload_start and upload_end values, the mirror MUST act on
some valid copy of its pending checkpoint and next index state. However, it
MAY act on stale data without impacting correctness of the protocol. That is, it
is not necessary to globally synchronize this check with add-checkpoint
handling, or other instances of add-entries.
When committing authenticated entries to the pending entries list, it is
possible that, due to a concurrent instance of add-entries, some entries have
already been added to the pending entries list or the mirror. Mirrors MUST
correctly handle this case and continue operating correctly. This may require
synchronization of individual log resources. In doing so, the mirror MAY assume
that the two copies of the entries are identical. They will both be proven
consistent with the pending checkpoints.
When updating the mirror checkpoint to upload_end, it is possible that some
concurrent instance of add-entries has already updated the mirror checkpoint to upload_end
or past it. In this case, the mirror MUST NOT rewind the checkpoint and MUST
instead skip the update.
Entry packages are divided at multiples of 256 to align with the entry bundle representation in the tiled log interface. It is expected that most implementations will compute exactly one entry bundle from each entry package and commit it directly to log storage when the package is authenticated.
A mirror that uses a different representation MAY buffer entry packages and defer committing them. For example, if the mirror internally stores entry bundles of size 512, it might commit entry packages two at a time.
A mirror MAY process an entry package without waiting for the previous entry package to be durably committed to storage. However, the mirror MUST NOT sign or update its mirror checkpoint until all entries are durably committed. If the mirror commits entries out of order, it MUST correctly compute the next entry to be the first missing entry, even if some subsequent entries have been committed. Mirror clients will then reupload the subsequent entries.
A mirror MAY additionally implement other update processes, provided it continues
to correctly operate add-entries and never violates its cosigner requirements
on mirror checkpoints.
As with a tiled transparency log, a mirror maintains a minimum index value to support pruning. The discussion for origin logs also applies to mirrors. In particular, without a retention policy for when pruning is permitted, mirrors MUST NOT be pruned. That is, the minimum index value MUST be set to zero.
Given a retention policy, minimum indices for mirrors MAY be maintained independently from that of the origin log. A retention policy MAY require that mirrors retain data for longer than the origin log, in which case the mirror's minimum index will be kept at a lower value than the origin log's. However, for a new mirror to be initialized at a lower minimum index than the origin log, the missing entries MUST be available from some other source, e.g. another mirror.
Conversely, a retention policy MAY permit mirrors to retain less data than the origin log and set a higher minimum index. For example, a log ecosystem MAY permit new mirrors to start from some recent index, to reduce the initial bandwidth cost of bootstrapping the mirror. However, in this case, log clients MUST NOT treat the mirror as contributing to a quorum for entries below its minimum index.
A mirror that is already mirroring a log at some minimum index MAY lower its minimum index, provided the newly-mirrored entries can be obtained, e.g., via another mirror. Before committing these entries, the mirror MUST authenticate them against its mirror or pending checkpoint by checking a subtree consistency proof.