Commit 1b9a954

authored
Merge pull request #402 from ipip-402
IPIP-402: Partial CAR Support on Trustless Gateways
2 parents 17e46c6 + 917efb9 commit 1b9a954

4 files changed

Lines changed: 595 additions & 23 deletions

File tree

ipip-template.md

Lines changed: 7 additions & 7 deletions
@@ -36,13 +36,6 @@ interoperable implementations.
3636
When modifying an existing specification file, this section should provide a
3737
summary of changes. When adding new specification files, list all of them.
3838

39-
## Test fixtures
40-
41-
List relevant CIDs. Describe how implementations can use them to determine
42-
specification compliance. This section can be skipped if IPIP does not deal
43-
with the way IPFS handles content-addressed data, or the modified specification
44-
file already includes this information.
45-
4639
## Design rationale
4740

4841
The rationale fleshes out the specification by describing what motivated
@@ -67,6 +60,13 @@ Explain the security implications/considerations relevant to the proposed change
6760

6861
Describe alternate designs that were considered and related work.
6962

63+
## Test fixtures
64+
65+
List relevant CIDs. Describe how implementations can use them to determine
66+
specification compliance. This section can be skipped if IPIP does not deal
67+
with the way IPFS handles content-addressed data, or the modified specification
68+
file already includes this information.
69+
7070
### Copyright
7171

7272
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

src/http-gateways/path-gateway.md

Lines changed: 13 additions & 6 deletions
@@ -21,6 +21,7 @@ editors:
2121
url: https://hacdias.com/
2222
xref:
2323
- url
24+
- trustless-gateway
2425
tags: ['httpGateways', 'lowLevelHttpGateways']
2526
order: 0
2627
---
@@ -214,11 +215,13 @@ These are the equivalents:
214215
- `format=cbor` → `Accept: application/cbor`
215216
- `format=ipns-record` → `Accept: application/vnd.ipfs.ipns-record`
216217

217-
<!-- TODO Planned: https://github.com/ipfs/go-ipfs/issues/8769
218-
- `selector=<cid>` can be used for passing a CID with [IPLD selector](https://ipld.io/specs/selectors)
219-
- Selector should be in dag-json or dag-cbor format
220-
- This is a powerful primitive that allows for fetching subsets of data in specific order, either as raw bytes, or a CAR stream. Think “HTTP range requests”, but for IPLD, and more powerful.
221-
-->
218+
### `dag-scope` (request query parameter)
219+
220+
Only used on CAR requests, same as :ref[dag-scope] from :cite[trustless-gateway].
221+
222+
### `entity-bytes` (request query parameter)
223+
224+
Only used on CAR requests, same as :ref[entity-bytes] from :cite[trustless-gateway].
222225

223226
# HTTP Response
224227

@@ -592,7 +595,11 @@ The following response types require an explicit opt-in, can only be requested w
592595
- Raw Block (`?format=raw`)
593596
- Opaque bytes, see [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw).
594597
- CAR (`?format=car`)
595-
- Arbitrary DAG as a verifiable CAR file or a stream, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car).
598+
- A CAR file or a stream that contains all blocks required to trustlessly verify the requested content path query, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) and :cite[trustless-gateway].
599+
- **Note:** by default, block order in CAR response is not deterministic,
600+
blocks can be returned in different order, depending on implementation
601+
choices (traversal, speed at which blocks arrive from the network, etc).
602+
Opt-in ordered CAR responses MAY be introduced in a future IPIP.
596603
- TAR (`?format=tar`)
597604
- Deserialized UnixFS files and directories as a TAR file or a stream, see :cite[ipip-0288].
598605
- IPNS Record

src/http-gateways/trustless-gateway.md

Lines changed: 180 additions & 10 deletions
@@ -4,7 +4,7 @@ description: >
44
Trustless Gateways are a minimal subset of Path Gateways that allow light IPFS
55
clients to retrieve data behind a CID and verify its integrity without delegating any
66
trust to the gateway itself.
7-
date: 2023-03-30
7+
date: 2023-06-20
88
maturity: reliable
99
editors:
1010
- name: Marcin Rataj
@@ -17,25 +17,33 @@ tags: ['httpGateways', 'lowLevelHttpGateways']
1717
order: 1
1818
---
1919

20-
Trustless Gateway is a minimal _subset_ of :cite[path-gateway]
20+
Trustless Gateway is a _subset_ of :cite[path-gateway]
2121
that allows light IPFS clients to retrieve data behind a CID and verify its
2222
integrity without delegating any trust to the gateway itself.
2323

2424
The minimal implementation means:
2525

26-
- data is requested by CID, only supported path is `/ipfs/{cid}`
27-
- no path traversal or recursive resolution, no UnixFS/IPLD decoding server-side
2826
- response type is always fully verifiable: client can decide between a raw block or a CAR stream
27+
- no UnixFS/IPLD deserialization
28+
- for CAR files:
29+
- the behavior is identical to :cite[path-gateway]
30+
- for raw blocks:
31+
- data is requested by CID, only supported path is `/ipfs/{cid}`
32+
- no path traversal or recursive resolution
2933

3034
# HTTP API
3135

3236
A subset of "HTTP API" of :cite[path-gateway].
3337

34-
## `GET /ipfs/{cid}[?{params}]`
38+
## `GET /ipfs/{cid}[/{path}][?{params}]`
3539

36-
Downloads data at specified CID.
40+
Downloads verifiable data for the specified **immutable** content path.
3741

38-
## `HEAD /ipfs/{cid}[?{params}]`
42+
Optional `path` is permitted for requests that specify CAR format (`application/vnd.ipld.car`).
43+
44+
For RAW requests, only `GET /ipfs/{cid}[?{params}]` is supported.
45+
46+
## `HEAD /ipfs/{cid}[/{path}][?{params}]`
3947

4048
Same as GET, but does not return any payload.
4149

@@ -45,13 +53,13 @@ Downloads data at specified IPNS Key. Verifiable :cite[ipns-record] can be reque
4553

4654
## `HEAD /ipns/{key}[?{params}]`
4755

48-
same as GET, but does not return any payload.
56+
Same as GET, but does not return any payload.
4957

5058
# HTTP Request
5159

5260
Same as in :cite[path-gateway], but with a limited number of supported response types.
5361

54-
## HTTP Request Headers
62+
## Request Headers
5563

5664
### `Accept` (request header)
5765

@@ -67,12 +75,174 @@ Below response types SHOULD to be supported:
6775
Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless
6876
mode (no deserialized responses) and `Accept` header is missing.
6977

78+
## Request Query Parameters
79+
80+
### :dfn[dag-scope] (request query parameter)
81+
82+
Optional, `dag-scope=(block|entity|all)` with default value `all`, only available for CAR requests.
83+
84+
Describes the shape of the DAG fetched at the terminus of the specified path whose blocks
85+
are included in the returned CAR file after the blocks required to traverse
86+
path segments.
87+
88+
- `block` - Only the root block at the end of the path is returned after blocks
89+
required to verify the specified path segments.
90+
91+
- `entity` - For queries that traverse UnixFS data, `entity` roughly means return
92+
blocks needed to verify the terminating element of the requested content path.
93+
For UnixFS, this means all the blocks needed to read an entire UnixFS file, or enumerate a UnixFS directory.
94+
For all queries that reference non-UnixFS data, `entity` is equivalent to `block`
95+
96+
- `all` - Transmit the entire contiguous DAG that begins at the end of the path
97+
query, after blocks required to verify path segments
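
As an illustration of the three scopes above, the following minimal sketch builds a CAR request URL and `Accept` header for a given `dag-scope`. The helper name `build_car_request`, the gateway host, and the CID string are hypothetical placeholders, not part of the specification.

```python
from urllib.parse import quote, urlencode

CAR_MEDIA_TYPE = "application/vnd.ipld.car"
DAG_SCOPES = ("block", "entity", "all")  # values listed in the spec text


def build_car_request(gateway, cid, path="", dag_scope="all"):
    """Return (url, headers) for a CAR request with an explicit dag-scope.

    Hypothetical helper: the spec only defines the parameter, not a client API.
    """
    if dag_scope not in DAG_SCOPES:
        raise ValueError(f"unsupported dag-scope: {dag_scope!r}")
    content_path = f"/ipfs/{cid}"
    if path:
        # Percent-encode each segment but keep the "/" separators.
        content_path += "/" + quote(path.strip("/"), safe="/")
    query = urlencode({"dag-scope": dag_scope})
    url = f"{gateway.rstrip('/')}{content_path}?{query}"
    # dag-scope is only meaningful for CAR requests, so ask for CAR explicitly.
    return url, {"Accept": CAR_MEDIA_TYPE}
```

A client that only needs to list a directory would pass `dag_scope="entity"`, while `dag_scope="block"` fetches just the terminal block plus the blocks needed to verify the path.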
98+
99+
When present, the returned `Etag` must include a unique prefix based on the passed scope type.
100+
101+
### :dfn[entity-bytes] (request query parameter)
102+
103+
The optional `entity-bytes=from:to` parameter is available only for CAR
104+
requests.
105+
106+
It implies `dag-scope=entity` and serves as a trustless equivalent of an HTTP
107+
Range Request.
108+
109+
When the terminating entity at the end of the specified content path:
110+
111+
- can be interpreted as a continuous array of bytes (such as a UnixFS file), a
112+
Gateway MUST return only the minimal set of blocks necessary to verify the
113+
specified byte range of that entity.
114+
115+
- When dealing with a sharded UnixFS file (`dag-pb`, `0x70`) and a non-zero
116+
`from` value, the UnixFS data and `blocksizes` determine the
117+
corresponding starting block for a given `from` offset.
118+
119+
- cannot be interpreted as a continuous array of bytes (such as a DAG-CBOR/JSON
120+
map or UnixFS directory), the parameter MUST be ignored, and the request is
121+
equivalent to `dag-scope=entity`.
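
For the sharded UnixFS case above, the lookup from a byte offset to a child block can be sketched as follows. This assumes a single-level file DAG whose root lists leaf sizes in `blocksizes`; deeper DAGs would repeat the same step per level. The function name `blocks_for_range` is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def blocks_for_range(blocksizes, start, end):
    """Return indexes of child blocks overlapping the inclusive range [start, end].

    `blocksizes` is the list of child sizes from the root UnixFS node
    (assumed here to describe leaf blocks directly).
    """
    # ends[i] is the offset one past the last byte covered by child i.
    ends = list(accumulate(blocksizes))
    total = ends[-1] if ends else 0
    if start < 0 or start > end or start >= total:
        return []
    end = min(end, total - 1)  # clamp to the last byte of the file
    first = bisect_right(ends, start)
    last = bisect_right(ends, end)
    return list(range(first, last + 1))
```

Only the children at the returned indexes (plus the root block) need to be included to verify the requested bytes.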
122+
123+
Allowed values for `from` and `to` follow a subset of section 14.1.2 from
124+
:cite[rfc9110], where they are defined as offset integers that limit the
125+
returned blocks to only those necessary to satisfy the range `[from,to]`:
126+
127+
- `from` value gives the byte-offset of the first byte in a range.
128+
- `to` value gives the byte-offset of the last byte in the range;
129+
that is, the byte positions specified are inclusive.
130+
131+
The following additional values are supported:
132+
133+
- `*` can be substituted for end-of-file
134+
- `entity-bytes=0:*` is the entire file (a verifiable version of HTTP request for `Range: 0-`)
135+
- Negative numbers can be used for referring to bytes from the end of a file
136+
- `entity-bytes=-1024:*` is the last 1024 bytes of a file
137+
(verifiable version of HTTP request for `Range: -1024`)
138+
- It is also permissible (unlike with HTTP Range Requests) to ask for the
139+
range of 500 bytes from the beginning of the file to 1000 bytes from the
140+
end: `entity-bytes=499:-1000`
141+
142+
A Gateway MUST augment the returned `Etag` based on the passed `entity-bytes`.
143+
144+
A Gateway SHOULD return an HTTP 400 Bad Request error when the requested range
145+
cannot be parsed as valid offset positions.
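
A minimal parsing sketch for the `from:to` syntax described above. The grammar is an assumption inferred from the examples in this section (integer `from`, integer or `*` for `to`); a gateway would translate the `ValueError` into HTTP 400. The name `parse_entity_bytes` is hypothetical.

```python
import re

# Assumed grammar: optional minus sign and ASCII digits, ":", then "*" or an integer.
_ENTITY_BYTES = re.compile(r"(-?[0-9]+):(\*|-?[0-9]+)")


def parse_entity_bytes(value):
    """Parse 'from:to' into (start, end); end is None when 'to' is '*'.

    Raises ValueError when the value is not a valid pair of offsets.
    """
    match = _ENTITY_BYTES.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid entity-bytes value: {value!r}")
    start = int(match.group(1))
    end = None if match.group(2) == "*" else int(match.group(2))
    return start, end
```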
146+
147+
In more nuanced error scenarios, a Gateway MUST return a valid CAR response
148+
that includes enough blocks for the client to understand why the requested
149+
`entity-bytes` was incorrect or why only a part of the requested byte range was
150+
returned:
151+
152+
- If the requested `entity-bytes` resolves to a range that partially falls
153+
outside of the entity's byte range, the response MUST include the subset of
154+
blocks within the entity's bytes.
155+
- This allows clients to request valid ranges of the entity without needing
156+
to know its total size beforehand, and it does not require the Gateway to
157+
buffer the entire entity before returning the response.
158+
159+
- If the requested `entity-bytes` resolves to a zero-length range or falls
160+
fully outside of the entity's bytes, the response is equivalent to
161+
`dag-scope=block`.
162+
- This allows clients to produce a meaningful error (e.g., in the case of UnixFS,
163+
leverage `Data.blocksizes` information present in the root `dag-pb` block).
164+
165+
- In streaming scenarios, if a Gateway is capable of returning the root block
166+
but lacks prior knowledge of the final component of the requested content
167+
path being invalid or absent in the DAG, a Gateway SHOULD respond with HTTP 200.
168+
- This behavior is a consequence of HTTP streaming limitations: blocks are
169+
not buffered, so by the time a related parent block is being parsed and
170+
returned to the client, the HTTP status code has already been sent to the
171+
client.
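
The first two scenarios above can be sketched as a range-resolution step. This is an illustration under stated assumptions: a negative offset `n` is taken to mean `size + n`, and `None` stands for `*`; the spec text does not pin down these details, and `resolve_entity_bytes` is a hypothetical name.

```python
def resolve_entity_bytes(start, end, size):
    """Map (start, end) onto absolute inclusive offsets within an entity.

    Returns (first, last), or None when the range is zero-length or falls
    fully outside the entity (the case treated like dag-scope=block).
    """
    # Assumption: negative values count back from the end of the entity.
    first = size + start if start < 0 else start
    if end is None:  # "*" means end-of-file
        last = size - 1
    else:
        last = size + end if end < 0 else end
    # A range partially outside the entity is trimmed to the bytes that exist.
    first = max(first, 0)
    last = min(last, size - 1)
    if first > last:
        return None
    return first, last
```

Because trimming happens on the gateway side, a client can request `0:*` or an over-long range without knowing the entity size in advance.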
172+
70173
# HTTP Response
71174

72175
Below MUST be implemented **in addition** to "HTTP Response" of :cite[path-gateway].
73176

74-
## HTTP Response Headers
177+
## Response Headers
178+
179+
### `Content-Type` (response header)
180+
181+
MUST be returned and include additional format-specific parameters when possible.
182+
183+
If a CAR stream was requested, the response MUST include the parameter specifying CAR version.
184+
For example: `Content-Type: application/vnd.ipld.car; version=1`
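
On the client side, the `version` parameter can be read with the Python standard library, as in this small sketch. The function name `car_version_from_content_type` is hypothetical; a non-numeric `version` value would raise `ValueError` here.

```python
from email.message import Message


def car_version_from_content_type(value):
    """Return the CAR version from a Content-Type header value, or None.

    None is returned when the media type is not CAR or no version is given.
    """
    msg = Message()
    msg["Content-Type"] = value
    if msg.get_content_type() != "application/vnd.ipld.car":
        return None
    version = msg.get_param("version")
    return int(version) if version is not None else None
```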
75185

76186
### `Content-Disposition` (response header)
77187

78188
MUST be returned and set to `attachment` to ensure requested bytes are not rendered by a web browser.
189+
190+
## Response Payload
191+
192+
### Block Response
193+
194+
Opaque bytes matching the requested block CID
195+
([application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw)).
196+
197+
The Body hash MUST match the Multihash from the requested CID.
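
A rough sketch of that check, using only the standard library. It deliberately handles just one common case, a base32 CIDv1 with a sha2-256 multihash, and raises for anything else; real clients would normally rely on a CID/multihash library. The function names are hypothetical.

```python
import base64
import hashlib


def _read_varint(data, pos):
    """Decode an unsigned varint starting at pos; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def verify_raw_block(cid, body):
    """Return True if body hashes to the sha2-256 digest inside a base32 CIDv1."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 (multibase 'b') CIDs")
    encoded = cid[1:].upper()
    encoded += "=" * (-len(encoded) % 8)  # restore padding omitted by multibase
    raw = base64.b32decode(encoded)
    version, pos = _read_varint(raw, 0)
    _codec, pos = _read_varint(raw, pos)      # e.g. 0x55 for raw; not checked here
    hash_code, pos = _read_varint(raw, pos)   # 0x12 is sha2-256
    length, pos = _read_varint(raw, pos)
    digest = raw[pos:pos + length]
    if version != 1 or hash_code != 0x12 or len(digest) != length:
        raise ValueError("this sketch only handles CIDv1 with sha2-256")
    return hashlib.sha256(body).digest() == digest
```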
198+
199+
### CAR Response
200+
201+
A CAR stream for the requested
202+
[application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)
203+
content type, path and optional `dag-scope` and `entity-bytes` URL parameters.
204+
205+
#### CAR version
206+
207+
Value returned in
208+
[`CarV1Header.version`](https://ipld.io/specs/transport/car/carv1/#header)
209+
field MUST match the `version` parameter returned in `Content-Type` header.
210+
211+
#### CAR roots
212+
213+
The behavior associated with the
214+
[`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field
215+
is not currently specified.
216+
217+
Clients MAY ignore it.
218+
219+
:::issue
220+
221+
As of 2023-06-20, the behavior of the `roots` CAR field remains an [unresolved item within the CARv1 specification](https://web.archive.org/web/20230328013837/https://ipld.io/specs/transport/car/carv1/#unresolved-items).
222+
223+
:::
224+
225+
#### CAR determinism
226+
227+
The default CAR header and block order in a CAR response are not specified and are non-deterministic.
228+
229+
Clients MUST NOT assume that CAR responses are deterministic (byte-for-byte identical) across different gateways.
230+
231+
Clients MUST NOT assume that CAR includes CIDs and their blocks in the same order across different gateways.
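
One way a client can honor these rules is to compare responses by content rather than by bytes. The sketch below assumes the CAR payloads were already decoded and verified by some CAR parser into `(cid, block_bytes)` pairs; both that input shape and the name `same_block_set` are hypothetical.

```python
def same_block_set(blocks_a, blocks_b):
    """Compare two decoded CAR payloads, ignoring block order and duplicates.

    Each argument is an iterable of (cid, block_bytes) pairs. Since blocks are
    assumed verified, a repeated CID always carries the same bytes.
    """
    return dict(blocks_a) == dict(blocks_b)
```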
232+
233+
:::issue
234+
235+
In controlled environments, clients MAY choose to rely on undocumented CAR determinism,
236+
subject to the agreement of the following conditions between the client and the
237+
gateway:
238+
- CAR version
239+
- content of [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field
240+
- order of blocks
241+
- status of duplicate blocks
242+
243+
In the future, a convention may be introduced to indicate aspects
244+
of determinism in CAR responses. Please refer to
245+
[IPIP-412](https://github.com/ipfs/specs/pull/412) for potential developments
246+
in this area.
247+
248+
:::
