Commit faf4a0b

ipip-0402: car-scope=file → dag-scope=entity & bytes → entity-bytes
This change incorporates feedback from Adin, Rod and Juan:
- bytes: #402 (review)
- car-scope: #402 (comment)

I really hope these names will be good enough, but I am running on artisan, recycled electrons so can do this all day :-)
1 parent 278a277 commit faf4a0b

3 files changed

Lines changed: 103 additions & 54 deletions

File tree

src/http-gateways/path-gateway.md

Lines changed: 4 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -214,35 +214,13 @@ These are the equivalents:
214214
- `format=cbor` → `Accept: application/cbor`
215215
- `format=ipns-record` → `Accept: application/vnd.ipfs.ipns-record`
216216

217-
## Query Parameters for CAR Requests
217+
### `dag-scope` (request query parameter)
218218

219-
The following query parameters are only available for requests made with either a `format=car` query parameter or an `Accept: application/vnd.ipld.car` request header. These parameters modify shape of the IPLD graph returned within the car file.
219+
Only used on CAR requests. Same as [dag-scope](/http-gateways/trustless-gateway/#dag-scope-request-query-parameter) from :cite[trustless-gateway].
220220

221-
### `car-scope` (request query parameter)
221+
### `entity-bytes` (request query parameter)
222222

223-
Optional, `car-scope=(block|file|all)` with default value 'all', describes the shape of the dag fetched the terminus of the specified path whose blocks are included in the returned CAR file after the blocks required to traverse path segments.
224-
225-
`block` - Only the root block at the end of the path is returned After blocks required to verify the specified path segments.
226-
227-
`file` - For queries that traverse UnixFS data, `file` roughly means return blocks needed to verify the end of the path as a filesystem entity. In other words, all the blocks needed to 'cat' a UnixFS file at the end of the specified path, or to 'ls' a UnixFS directory at the end of the specified path. For all queries that do not reference non-UnixFS data, `file` is equivalent to `block`
228-
229-
`all` - Transmit the entire contiguous DAG that begins at the end of the path query, after blocks required to verify path segments
230-
231-
### `bytes` (request query parameter)
232-
233-
Optional, `bytes=x:y` with default value `0:*`. When the entity at the end of the specified path can be intepreted as a contingous array of bytes (such as a UnixFS file), returns only the blocks required to verify the specified byte range of said entity. Put another way, the `bytes` parameters can serve as a trustless form of an HTTP range request. If the entity at the end of the path cannot be interpreted as a continguous array of bytes (such as a CBOR/JSON map), this parameter has no effect. Allowed values for `x` and `y` are positive integers where y >= x, which limit the return blocks to needed to satify the range [x, y]. In addition the following additional values are permitted:
234-
235-
- `*` can be substituted for end-of-file
236-
- `?bytes=0:*` is the entire file (i.e. to fulfill HTTP Range Request `x-` requests)
237-
- Negative numbers can be used for referring to bytes from the end of a file
238-
- `?bytes=-1024:*` is the last 1024 bytes of a file (i.e. to fulfill HTTP Range Request `-y` requests)
239-
- It is also permissible (unlike with HTTP Range Requests) to ask for the range of 500 bytes from the beginning of the file to 1000 bytes from the end by `?bytes=499:-1000`
240-
241-
<!-- TODO Planned: https://github.com/ipfs/go-ipfs/issues/8769
242-
- `selector=<cid>` can be used for passing a CID with [IPLD selector](https://ipld.io/specs/selectors)
243-
- Selector should be in dag-json or dag-cbor format
244-
- This is a powerful primitive that allows for fetching subsets of data in specific order, either as raw bytes, or a CAR stream. Think “HTTP range requests”, but for IPLD, and more powerful.
245-
-->
223+
Only used on CAR requests. Same as [entity-bytes](/http-gateways/trustless-gateway/#entity-bytes-request-query-parameter) from :cite[trustless-gateway].
246224

247225
# HTTP Response
248226

src/http-gateways/trustless-gateway.md

Lines changed: 58 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -59,7 +59,7 @@ Same as GET, but does not return any payload.
5959

6060
Same as in :cite[path-gateway], but with limited number of supported response types.
6161

62-
## HTTP Request Headers
62+
## Request Headers
6363

6464
### `Accept` (request header)
6565

@@ -75,6 +75,63 @@ Below response types SHOULD to be supported:
7575
Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless
7676
mode (no deserialized responses) and `Accept` header is missing.
7777

78+
## Request Query Parameters
79+
80+
### :dfn[dag-scope] (request query parameter)
81+
82+
Optional, `dag-scope=(block|entity|all)` with default value `all`, only available for CAR requests.
83+
84+
Describes the shape of the DAG fetched at the terminus of the specified path whose blocks
85+
are included in the returned CAR file after the blocks required to traverse
86+
path segments.
87+
88+
- `block` - Only the root block at the end of the path is returned after blocks
89+
required to verify the specified path segments.
90+
91+
- `entity` - For queries that traverse UnixFS data, `entity` roughly means return
92+
blocks needed to verify the terminating element of the requested content path.
93+
For UnixFS, this means all the blocks needed to read an entire UnixFS file or to enumerate a UnixFS directory.
94+
For all queries that reference non-UnixFS data, `entity` is equivalent to `block`.
95+
96+
- `all` - Transmit the entire contiguous DAG that begins at the end of the path
97+
query, after blocks required to verify path segments
98+
99+
When present, the returned `Etag` must include a unique prefix based on the passed scope type.
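As an illustration of the `dag-scope` parameter described above, here is a minimal Python sketch of a client-side helper that builds a CAR request URL. The function name `build_car_url`, the `gateway.example` host, and the choice to reject unknown scope values on the client side are assumptions made for this example and are not part of the specification.

```python
from urllib.parse import urlencode

# The three scope values listed in the section above.
DAG_SCOPES = ("block", "entity", "all")


def build_car_url(gateway: str, content_path: str, dag_scope: str = "all") -> str:
    """Return a CAR request URL for content_path with the given dag-scope.

    Hypothetical helper: rejects values outside block|entity|all before
    any request is sent.
    """
    if dag_scope not in DAG_SCOPES:
        raise ValueError(f"unsupported dag-scope: {dag_scope!r}")
    query = urlencode({"format": "car", "dag-scope": dag_scope})
    return f"{gateway.rstrip('/')}{content_path}?{query}"


url = build_car_url("https://gateway.example", "/ipfs/cid/wiki/", "entity")
```

The default of `all` mirrors the default stated in the parameter definition, so omitting the argument yields the same scope as omitting the query parameter.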
100+
101+
### :dfn[entity-bytes] (request query parameter)
102+
103+
Optional, `entity-bytes=from:to` with the default value `0:*`, only available for CAR requests.
104+
Serves as a trustless form of an HTTP Range Request.
105+
106+
When the terminating entity at the end of the specified content path can be
107+
interpreted as a continuous array of bytes (such as a UnixFS file), returns
108+
only the minimal set of blocks required to verify the specified byte range of
109+
said entity.
110+
111+
Allowed values for `from` and `to` are non-negative integers where `to` >= `from`, which
112+
limit the returned blocks to those needed to satisfy the range `[from,to]`:
113+
114+
- `from` value gives the byte-offset of the first byte in a range.
115+
- `to` value gives the byte-offset of the last byte in the range; that is,
116+
the byte positions specified are inclusive. Byte offsets start at zero.
117+
118+
If the entity at the end of the path cannot be interpreted as a continuous
119+
array of bytes (such as a DAG-CBOR/JSON map, or UnixFS directory), this
120+
parameter has no effect.
121+
122+
The following additional values are supported:
123+
124+
- `*` can be substituted for end-of-file
125+
- `entity-bytes=0:*` is the entire file (a verifiable version of HTTP request for `Range: bytes=0-`)
126+
- Negative numbers can be used for referring to bytes from the end of a file
127+
- `entity-bytes=-1024:*` is the last 1024 bytes of a file
128+
(verifiable version of HTTP request for `Range: bytes=-1024`)
129+
- It is also permissible (unlike with HTTP Range Requests) to ask for the
130+
range of 500 bytes from the beginning of the file to 1000 bytes from the
131+
end: `entity-bytes=499:-1000`
132+
133+
When present, the returned `Etag` must include a unique prefix based on the passed range.
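To make the `from:to` semantics above concrete, the following Python sketch resolves an `entity-bytes` value against an entity of known size into inclusive absolute byte offsets. The function name `resolve_entity_bytes` is hypothetical. Two behaviors are assumptions of this sketch that the text above does not spell out: a negative `to` is resolved as `size + to`, and out-of-range offsets are clamped to the entity boundaries.

```python
def resolve_entity_bytes(value: str, size: int) -> tuple[int, int]:
    """Resolve 'from:to' against an entity of `size` bytes.

    Returns inclusive (first, last) byte offsets, counted from zero.
    """
    raw_from, sep, raw_to = value.partition(":")
    if not sep:
        raise ValueError("expected entity-bytes in the form from:to")

    first = int(raw_from)
    if first < 0:
        # Negative `from` counts back from the end of the entity.
        # Assumption: clamp to the start when it overshoots.
        first = max(size + first, 0)

    if raw_to == "*":
        # `*` stands for end-of-file.
        last = size - 1
    else:
        last = int(raw_to)
        if last < 0:
            # Assumption: negative `to` is resolved as size + to.
            last = size + last

    # Assumption: a `to` past the end is clamped to the last byte.
    last = min(last, size - 1)
    if last < first:
        raise ValueError("resolved range is empty: `to` must be >= `from`")
    return first, last


whole = resolve_entity_bytes("0:*", 10)
tail = resolve_entity_bytes("-1024:*", 4096)
```

Under these assumptions `0:*` covers every byte of the entity and `-1024:*` covers exactly the last 1024 bytes, matching the two examples in the list above.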
134+
78135
# HTTP Response
79136

80137
Below MUST be implemented **in addition** to "HTTP Response" of :cite[path-gateway].

src/ipips/ipip-0402.md

Lines changed: 41 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,10 @@ ipip: proposal
55
editors:
66
- name: Hannah Howard
77
github: hannahhoward
8+
- name: Adin Schmahmann
9+
github: aschmahmann
10+
- name: Rod Vagg
11+
github: rvagg
812
- name: Marcin Rataj
913
github: lidel
1014
url: https://lidel.org/
@@ -39,11 +43,15 @@ Save round-trips, allow more efficient resume and parallel downloads.
3943

4044
The solution is to allow the :cite[trustless-gateway] to support partial
4145
responses by:
46+
4247
- allowing for requesting sub-paths within a DAG, and getting blocks necessary
4348
for traversing all path segments for end-to-end verification
44-
- opt-in `car-scope` parameter that allows for narrowing down returned blocks
45-
to a `block`, `file` (aka logical IPLD entity), or `all` (default)
46-
- opt-in `bytes` parameter that allows for returning only a subset of blocks
49+
50+
- opt-in `dag-scope` parameter that allows for narrowing down returned blocks
51+
to a `block`, `entity` (a logical IPLD entity, such as a file, directory,
52+
or CBOR document), or `all` (default)
53+
54+
- opt-in `entity-bytes` parameter that allows for returning only a subset of blocks
4755
within a logical IPLD entity
4856

4957
Details are in :cite[trustless-gateway].
@@ -66,14 +74,15 @@ Terse rationale for each feature:
6674
- The ability to narrow down CAR response based on logical scope or specific byte
6775
range within an entity comes directly from the types of requests existing
6876
path gateways need to handle.
69-
- `car-scope=block` allows for resolving content paths to the final CID, and
77+
- `dag-scope=block` allows for resolving content paths to the final CID, and
7078
learning its type (unixfs file/directory, or a custom codec)
71-
- `car-scope=file` covers the majority of website hosting needs (returning a
72-
file, or enumerating directory contents)
73-
- `car-scope=all` returns all blocks in a DAG: was the existing behavior and
79+
- `dag-scope=entity` covers the majority of website hosting needs (returning a
80+
file, enumerating directory contents, or any other IPLD entity)
81+
- `dag-scope=all` returns all blocks in a DAG: this was the existing behavior and
7482
remains the implicit default
75-
- `bytes=from:to` enables efficient, verifiable analog to HTTP Range Requests
83+
- `entity-bytes=from:to` enables efficient, verifiable analog to HTTP Range Requests
7684
(resuming downloads or seeking within bigger files, such as videos)
85+
- `from` and `to` match the behavior of HTTP Range Requests.
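The correspondence between HTTP Range Requests and `entity-bytes` mentioned above can be sketched in Python as follows. The function name `range_to_entity_bytes` is hypothetical, and the sketch assumes a single `bytes=` range; multi-range headers are rejected, since nothing in this proposal describes how they would map.

```python
import re

# Matches a single byte range such as "bytes=0-", "bytes=100-199" or "bytes=-1024".
_SINGLE_RANGE = re.compile(r"^bytes=(\d*)-(\d*)$")


def range_to_entity_bytes(header_value: str) -> str:
    """Translate a single-range HTTP Range header value into from:to form."""
    match = _SINGLE_RANGE.match(header_value.strip())
    if match is None:
        raise ValueError(f"unsupported Range value: {header_value!r}")
    first, last = match.groups()
    if not first and not last:
        raise ValueError("Range value has neither a start nor an end")
    if not first:
        # Suffix range: the last N bytes of the entity.
        return f"-{last}:*"
    # Open-ended ranges use `*` for end-of-file.
    return f"{first}:{last or '*'}"


open_ended = range_to_entity_bytes("bytes=0-")
suffix = range_to_entity_bytes("bytes=-1024")
```

A gateway frontend could use a translation like this to turn an incoming range request into a verifiable CAR request against a trustless backend.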
7786

7887
### User benefit
7988

@@ -121,7 +130,7 @@ introduce additional blocks required for verifying.
121130
As long as the client was written in a trustless manner, and follows ring and was discarding
122131
unexpected blocks, this will be a backward-compatible change.
123132

124-
#### CAR format with `bytes` and `car-scope` parameters
133+
#### CAR format with `entity-bytes` and `dag-scope` parameters
125134

126135
These parameters are opt-in, which means no breaking changes.
127136

@@ -159,7 +168,7 @@ risks, and weak value proposition, as [discussed during IPFS Thing 2022](https:/
159168
#### Additional "Web" Scope
160169

161170
A request for
162-
`/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wiki/?format=car&car-scope=file`
171+
`/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wiki/?format=car&dag-scope=entity`
163172
returns all blocks required for enumeration of the big HAMT `/wiki` directory,
164173
and then an additional request for `index.html` needs to be issued.
165174

@@ -181,7 +190,7 @@ It is impossible to know if some entity on a sub-path is a file or a directory,
181190
without sending a probe for the root block, which introduces one round-trip overhead
182191
per entity.
183192

184-
This problem is not present in the case of `car-scope=file`, which shifts the
193+
This problem is not present in the case of `dag-scope=entity`, which shifts the
185194
decision to the server, and allows for fetching an unknown UnixFS entity with a
186195
single request.
187196

@@ -197,7 +206,7 @@ The main utility of this scope is saving round-trips when retrieving a specific
197206
entity as a member of a bigger DAG.
198207

199208
To test, request a small file that fits in a single block from a sub-path. The
200-
returned CAR MUST include both the block with the `file` data and blocks
209+
returned CAR MUST include both the block with the file data and all blocks
201210
necessary for traversing from the root CID to the terminating element (all
202211
parents, root CID and a subdirectory below it).
203212

@@ -213,7 +222,7 @@ Fixtures:
213222

214223
:::
215224

216-
### Testing `car-scope=block`
225+
### Testing `dag-scope=block`
217226

218227
The main utility of this scope is resolving content paths. This means a CAR
219228
response with blocks related to path traversal, and the root block of the
@@ -227,13 +236,13 @@ Fixtures:
227236

228237
:::example
229238

230-
- TODO(gateway-conformance): `/ipfs/cid/parent/directory?format=car&car-scope=block` (UnixFS directory on a path)
239+
- TODO(gateway-conformance): `/ipfs/cid/parent/directory?format=car&dag-scope=block` (UnixFS directory on a path)
231240

232-
- TODO(gateway-conformance): `/ipfs/cid/parent1/parent2/file?format=car&car-scope=block` (UnixFS file on a path)
241+
- TODO(gateway-conformance): `/ipfs/cid/parent1/parent2/file?format=car&dag-scope=block` (UnixFS file on a path)
233242

234243
:::
235244

236-
### Testing `car-scope=file`
245+
### Testing `dag-scope=entity`
237246

238247
The main utility of this scope is retrieving all blocks related to a meaningful
239248
IPLD entity. Currently, the most popular entity types are:
@@ -252,48 +261,48 @@ Fixtures:
252261

253262
:::example
254263

255-
- TODO(gateway-conformance): `/ipfs/cid/chunked-dag-pb-file?format=car&car-scope=file`
264+
- TODO(gateway-conformance): `/ipfs/cid/chunked-dag-pb-file?format=car&dag-scope=entity`
256265
- Request a `chunked-dag-pb-file` (UnixFS file encoded with `dag-pb` with
257266
more than one chunk). Returned blocks MUST be enough to deserialize the file.
258267

259-
- TODO(gateway-conformance): `/ipfs/cid/dag-cbor-with-link?format=car&car-scope=file`
268+
- TODO(gateway-conformance): `/ipfs/cid/dag-cbor-with-link?format=car&dag-scope=entity`
260269
- Request a `dag-cbor-with-link` (DAG-CBOR document with CBOR Tag 42 pointing
261270
at a third-party CID). The response MUST include the terminating entity (DAG-CBOR)
262271
and MUST NOT include the CID from the Tag 42 (IPLD Link).
263272

264-
- TODO(gateway-conformance): `/ipfs/cid/flat-directory/file?format=car&car-scope=file`
273+
- TODO(gateway-conformance): `/ipfs/cid/flat-directory/file?format=car&dag-scope=entity`
265274
- Request UnixFS `flat-directory`. The response MUST include the minimal set of
266275
blocks required for enumeration of directory contents, and no blocks that
267276
belong to child entities.
268277

269-
- TODO(gateway-conformance): `/ipfs/cid/hamt-directory/file?format=car&car-scope=file`
278+
- TODO(gateway-conformance): `/ipfs/cid/hamt-directory/file?format=car&dag-scope=entity`
270279
- Request UnixFS `hamt-directory`. The response MUST include the minimal set of
271280
blocks required for enumeration of directory contents, and no blocks that
272281
belong to child entities.
273282

274283
:::
275284

276-
### Testing `car-scope=all`
285+
### Testing `dag-scope=all`
277286

278-
This is the implicit default used when `car-scope` is not present,
287+
This is the implicit default used when `dag-scope` is not present,
279288
and explicitly used in the context of proxy gateway supporting :cite[ipip-0288].
280289

281290
Fixtures:
282291

283292
:::example
284293

285-
- TODO(gateway-conformance): `/ipfs/cid-of-a-directory?format=car&car-scope=all`
294+
- TODO(gateway-conformance): `/ipfs/cid-of-a-directory?format=car&dag-scope=all`
286295
- Request a CID of UnixFS `directory` which contains two files. The response MUST
287296
contain all blocks that can be accessed by recursively traversing all IPLD
288297
Links from the root CID.
289298

290-
- TODO(gateway-conformance): `/ipfs/cid/chunked-dag-pb-file?format=car&car-scope=all`
299+
- TODO(gateway-conformance): `/ipfs/cid/chunked-dag-pb-file?format=car&dag-scope=all`
291300
- Request a CID of UnixFS `file` encoded with `dag-pb` codec and more than
292301
one chunk. The response MUST contain blocks for all `file` chunks.
293302

294303
:::
295304

296-
### Testing `bytes=from:to`
305+
### Testing `entity-bytes=from:to`
297306

298307
This type of CAR response is used for facilitating HTTP Range Requests and
299308
byte seek within bigger entities.
@@ -302,20 +311,25 @@ byte seek within bigger entities.
302311

303312
Properly testing this type of response requires a synthetic DAG that is only
304313
partially retrievable. This ensures systems that perform internal caching
305-
won't pass the test due to the entire DAG being cached.
314+
won't pass the test due to the entire DAG being precached, or fetched in full.
306315

307316
:::
308317

309318
Use of the below fixture is highly recommended:
310319

311320
:::example
312321

313-
- TODO(gateway-conformance): `/ipfs/dag-pb-file?format=car&bytes=40000000000-40000000002`
322+
- TODO(gateway-conformance): `/ipfs/dag-pb-file?format=car&entity-bytes=40000000000:40000000002`
314323

315324
- Request a byte range from the middle of a big UnixFS `file`. The response MUST
316325
contain only the minimal set of blocks necessary for fulfilling the range
317326
request.
318327

328+
- TODO(gateway-conformance): `/ipfs/10-bytes-cid?format=car&entity-bytes=4:-2`
329+
330+
- Request a byte range from the middle of a small file, to -2 bytes from the end.
331+
- (TODO confirm we want to keep this -- added since it was explicitly stated as a supported thing in path-gateway.md)
332+
319333
:::
320334

321335
### Copyright
