Commit 63ea4ff

IPIP-412: Signaling Block Order in CARs on Gateways
First draft based on various prior art and recent discussions cited in the header front matter.
1 parent b07b1bc commit 63ea4ff

2 files changed

Lines changed: 241 additions & 1 deletion

src/http-gateways/trustless-gateway.md

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ mode and `Accept` header is missing
 Below response types MUST to be supported:

 - [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) – requests a single, verifiable raw block to be returned
-- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned
+- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned, implementations MAY support optional parameters (:cite[ipip-0412])
 - [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record) – requests a verifiable :cite[ipns-record] (multicodec `0x0300`).

 # HTTP Response

src/ipips/ipip-0412.md

Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
---
title: "IPIP-0412: Signaling Block Order in CARs on HTTP Gateways"
date: 2023-05-15
ipip: proposal
editors:
  - name: Marcin Rataj
    github: lidel
    url: https://lidel.org/
  - name: Jorropo
    github: Jorropo
relatedIssues:
  - https://github.com/ipfs/specs/issues/348
  - https://github.com/ipfs/specs/pull/330
  - https://github.com/ipfs/specs/pull/402
  - https://github.com/ipfs/specs/pull/412
order: 412
tags: ['ipips']
---

## Summary

Adds support for additional, optional content type parameters that allow the
client and server to signal or negotiate a specific block order in the returned
CAR.

## Motivation

We want to make it easier to build light clients for IPFS. We want them to have
low memory footprints on arbitrarily sized files. The main pain point preventing
this is the fact that CAR ordering isn't specified.

This requires keeping some kind of reference, either on disk or in memory, to
previously seen blocks, for two reasons:

1. Blocks can arrive out of order, meaning the moment a block is received might
   not match the moment it is consumed (data is read and returned to the
   consumer).
2. Blocks can be reused multiple times. This is handy when you plan to cache
   on disk, but not at all when you want to process a stream with a
   use-and-forget policy.

What we really want is for the gateway to help us a bit, and give us blocks in
a useful order.

The existing Trustless Gateway specification does not provide a mechanism for
negotiating the order of blocks in CAR responses.

This IPIP aims to improve the status quo.

## Detailed design

The CAR content type
([`application/vnd.ipld.car`](https://www.iana.org/assignments/media-types/application/vnd.ipld.car))
already supports the `version` parameter, which allows a gateway to indicate
which CAR flavour is returned with the response.

The proposed solution introduces two new parameters for the content type headers
in HTTP requests and responses: `order` and `dups`.

The `order` parameter allows the client to indicate its preference for a
specific block order in the CAR response, and the `dups` parameter specifies
whether duplicate blocks are allowed in the response.

### Signaling in Request

Content type negotiation is based on section 12.5.1 of :cite[rfc9110].

Clients MAY indicate their preferred block order by sending an `Accept` header in
the HTTP request. The `Accept` header format is as follows:

```
Accept: application/vnd.ipld.car; version=1; order=dfs; dups=y
```

In the future, when more orders or parameters exist, clients will be able to
specify a list of preferences, for example:

```
Accept: application/vnd.ipld.car;order=foo, application/vnd.ipld.car;order=dfs;dups=y;q=0.5
```

The above example is a list of preferences: the client would prefer the
hypothetical `order=foo`, but if this isn't available it would accept
`order=dfs` with `dups=y` instead (lower priority indicated via the `q`
parameter, as noted in :cite[rfc9110]).

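To illustrate how a gateway might rank such a preference list, here is a minimal Python sketch. It is not part of this proposal: the helper names `parse_accept` and `choose_car_params` and the `supported_orders` argument are hypothetical, and the parser is deliberately simplified (it does not handle quoted strings that contain `,` or `;`).

```python
from __future__ import annotations

CAR_TYPE = "application/vnd.ipld.car"


def parse_accept(header: str) -> list[tuple[str, dict[str, str], float]]:
    """Split an Accept header value into (media_type, params, q), best first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        params: dict[str, str] = {}
        q = 1.0  # RFC 9110: a missing q parameter means weight 1
        for part in parts[1:]:
            name, _, value = part.partition("=")
            name = name.strip().lower()
            value = value.strip().strip('"')
            if name == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed weight as "not acceptable"
            elif name:
                params[name] = value
        prefs.append((media_type, params, q))
    # sorted() is stable, so entries with equal weight keep their header order
    return sorted(prefs, key=lambda pref: pref[2], reverse=True)


def choose_car_params(header: str, supported_orders=frozenset({"dfs"})):
    """Return the params of the best CAR preference this gateway can satisfy.

    Returns None when nothing matches, in which case the gateway would fall
    back to its own implicit default.
    """
    for media_type, params, q in parse_accept(header):
        if media_type != CAR_TYPE or q == 0:
            continue
        # A missing order means unknown order ("rnd"), which any gateway can serve
        if params.get("order", "rnd") in supported_orders | {"rnd"}:
            return params
    return None
```

With the second `Accept` example above, a gateway that only supports `dfs` would skip the hypothetical `order=foo` entry and settle on `order=dfs` with `dups=y`.
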
#### `order` CAR content type parameter

The `order` parameter accepts the following values:

- `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search)
  order, allows for streaming responses with minimal memory usage
- `rnd`: Unknown (random) order, the implicit default when `order` parameter is missing.

#### `dups` CAR content type parameter

The `dups` parameter specifies whether duplicate blocks (the same block
occurring multiple times in the requested DAG) will be present in the CAR
response.

It accepts two values:

- `y`: duplicate blocks are allowed
- `n`: duplicates are not allowed

When allowed (`y`), light clients are able to discard blocks after
reading them, removing the need for caching in-memory or on-disk.

<!-- TODO: do we need a parameter for inclusion of identity CIDs?
It seems to be only relevant in Filecoin due to legacy hiccup:
https://github.com/ipfs/specs/pull/330#issuecomment-1274106892 -->

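The effect of combining `order=dfs` with `dups=y` can be sketched with a toy model in Python. Everything here is an assumption made for illustration: blocks are JSON payloads, a "CID" is just the SHA-256 hex digest of the payload, and `make_block` and `stream_dfs` are hypothetical helpers. Real CAR and IPLD decoding is more involved.

```python
import hashlib
import json
from typing import Iterable, Iterator, Sequence, Tuple

Block = Tuple[str, bytes]  # (toy CID, payload)


def toy_cid(payload: bytes) -> str:
    """Stand-in for a real CID: the SHA-256 hex digest of the payload."""
    return hashlib.sha256(payload).hexdigest()


def make_block(data: str, links: Sequence[str] = ()) -> Block:
    """Build a toy block holding some data and links to child blocks."""
    payload = json.dumps({"data": data, "links": list(links)}).encode()
    return toy_cid(payload), payload


def stream_dfs(root: str, blocks: Iterable[Block]) -> Iterator[str]:
    """Consume blocks sent in DFS order with duplicates, keeping no block cache.

    The only state is a stack of CIDs still expected, so each block can be
    verified, used, and forgotten as soon as it arrives.
    """
    expected = [root]
    for cid, payload in blocks:
        if not expected:
            raise ValueError("unexpected extra block")
        want = expected.pop()
        if cid != want or toy_cid(payload) != want:
            raise ValueError("block does not match the next expected CID")
        node = json.loads(payload)
        # Push children in reverse so that the first link is visited first
        expected.extend(reversed(node["links"]))
        if node["data"]:
            yield node["data"]
    if expected:
        raise ValueError("stream ended before all expected blocks arrived")


# Usage: a root that links to leaf "a" twice; with duplicates the stream
# repeats that block, so the consumer never has to remember it.
leaf_a = make_block("ab")
leaf_b = make_block("-")
root = make_block("", [leaf_a[0], leaf_b[0], leaf_a[0]])
print("".join(stream_dfs(root[0], [root, leaf_a, leaf_b, leaf_a])))
```

If the same DAG were sent without duplicates (`dups=n`), the repeated leaf would appear only once, and this cache-free consumer would fail with an error because it no longer holds the block it needs a second time.
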
### Signaling in Response

The Trustless Gateway MUST always respond with a `Content-Type` header that includes
information about all supported/known parameters, even if the client did not
specify them in the request.

The `Content-Type` header format is as follows:

```
Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=y
```

Gateway implementations are free to decide on the implicit default ordering or
other parameters, and to use them in responses when the client did not
explicitly specify them, or requested an unsupported or unknown parameter.

Implementations MAY choose to implement only some of the parameters.

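On the client side, the returned `Content-Type` can be inspected before deciding how to consume the CAR. The following Python sketch uses the standard library `email.message.Message` parser for the header parameters; the function names `car_response_params` and `can_stream_without_cache` are hypothetical, and treating a missing `dups` as unknown is an assumption, since this document only defines the implicit default for `order`.

```python
from email.message import Message

CAR_TYPE = "application/vnd.ipld.car"


def car_response_params(content_type: str) -> dict:
    """Extract the CAR parameters from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != CAR_TYPE:
        raise ValueError("response is not a CAR")
    return {
        "version": msg.get_param("version"),       # None when absent
        "order": msg.get_param("order", "rnd"),    # missing order means unknown
        "dups": msg.get_param("dups"),             # assumption: None = unknown
    }


def can_stream_without_cache(content_type: str) -> bool:
    """True only when the gateway explicitly signals DFS order with duplicates."""
    params = car_response_params(content_type)
    return params["order"] == "dfs" and params["dups"] == "y"
```

A client could call `can_stream_without_cache` on the response header and fall back to buffering or caching blocks whenever it returns `False`, for example when talking to a gateway that returns a bare `application/vnd.ipld.car`.
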
## Design rationale

The proposed specification change aims to address the limitations of the
existing Trustless Gateway specification by introducing a mechanism for
negotiating the block order in CAR responses.

By allowing clients to indicate their preferred block order, Trustless Gateways
can cache CAR responses for popular content, resulting in improved performance
and reduced network load. Clients benefit from more efficient data handling by
deserializing blocks as they arrive.

We reuse existing HTTP content type negotiation, and the CAR content type, which
already had the optional `version` parameter.

### User benefit

The proposed specification change brings several benefits to end users:

1. Improved Performance: Gateways can decide on their implicit default ordering
   and cache CAR responses for popular content. In turn, clients can benefit
   from strong `Etag` in ordered (deterministic) responses. This reduces the
   response time for subsequent requests, resulting in faster content retrieval
   for users.

2. Reduced Memory Usage: Clients no longer need to buffer the entire CAR
   response in memory until the deserialization of the requested entity is
   finished. With the ability to deserialize blocks as they arrive, users can
   conserve memory resources, especially when dealing with large CAR responses.

3. Efficient Data Handling: By discarding blocks as soon as the CID is
   validated and data is deserialized, clients can efficiently process the data
   in real-time. This is particularly useful for light clients, IoT devices,
   mobile web browsers, and other streaming applications where immediate access
   to the data is required.

4. Customizable Ordering: Clients can indicate their preferred block order in the
   `Accept` header, allowing them to prioritize specific ordering strategies that
   align with their use cases. This flexibility enhances the user experience
   and empowers users to optimize content retrieval according to their needs.

### Compatibility

The proposed specification change is backward compatible with existing client
and server implementations.

Trustless Gateways that do not support the negotiation of block order in CAR
responses will continue to function as before, providing their existing default
behavior, and clients will be able to detect this by inspecting the
`Content-Type` header present in the HTTP response.

Clients that do not send the `Accept` header or do not recognize the `order`
and `dups` parameters in the `Content-Type` header will receive and process CAR
responses as they did before: buffering/caching all blocks until done with the
final deserialization.

Existing implementations can choose to adopt the new specification and
implement support for the negotiation of block order incrementally. This allows
for a smooth transition and ensures compatibility with both new and old
clients.

### Security

The proposed specification change does not introduce any negative security
implications beyond those already present in the existing Trustless Gateway
specification. It focuses on enhancing performance and data handling without
affecting the underlying security model of IPFS.

Light clients with support for the `order` and `dups` CAR content type
parameters will be able to detect a malicious response faster, reducing the
risk of memory-based DoS attacks from malicious gateways.

### Alternatives

Several alternative approaches were considered before arriving at the proposed solution:

1. Implicit Server-Side Configuration: Instead of negotiating the block order
   in the CAR response, the Trustless Gateway could have a server-side
   configuration that specifies the default order. However, this approach would
   limit the flexibility for clients, requiring them to have prior knowledge
   about the order supported by each gateway.

2. Fixed Block Order: Another option was to enforce a fixed block order in the
   CAR responses. However, this approach would not cater to the varying needs
   and preferences of different clients and use cases, and is not backward
   compatible with the existing Trustless Gateways which return CAR responses
   with weak `Etag` and unspecified block order.

3. Separate `X-` HTTP Header: Introduction of a separate HTTP header was
   rejected because we try to use HTTP semantics where possible, and gateways
   already use HTTP content type negotiation for CAR `version` and reusing it
   saves a few bytes in each round-trip. Also, :cite[rfc6648] advises against
   use of `X-` and similar constructs in new protocols.

The proposed solution of negotiating the block order through headers is
future-proof and allows for flexibility, interoperability, and customization
while maintaining compatibility with existing implementations.

## Test fixtures

Implementation compliance can be determined by testing the negotiation process
between clients and Trustless Gateways using various combinations of `order` and
`dups` parameters.

TODO:

1. a CAR with blocks for a small file in DFS order
2. a CAR with blocks for a small file with one block appearing twice

### Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
