Opened at 2021-08-16T19:33:34Z
Last modified at 2021-08-18T13:41:50Z
#3766 new enhancement
Protocol is potentially high-latency and high bandwidth overhead for small files
Reported by: | itamarst | Owned by: | exarkun
---|---|---|---
Priority: | normal | Milestone: | HTTP Storage Protocol v2
Component: | unknown | Version: | n/a
Keywords: | | Cc: |
Launchpad Bug: | | |
Description
Imagine uploading a new, small file. As I understand it, this will require:
- Create a storage index.
- Upload each of the shares, e.g. 10 HTTP queries if there are 10 shares.
One can't do _all_ queries in parallel, only the share uploads, because the uploads race against the storage index being created. So even a clever, async client implementation will still require two HTTP roundtrips for each upload.
In addition to the doubled latency (or 11× latency for a naive, fully serial client, which maybe we don't care about), there's also per-request HTTP protocol overhead for uploading a file.
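As a minimal sketch of that sequencing, here's roughly what an async client has to do (asyncio + httpx; the endpoint paths and payload shapes are made up for illustration, not taken from the protocol spec):

```python
import asyncio
import httpx

async def upload_small_file(base_url: str, storage_index: str, shares: dict[int, bytes]) -> None:
    async with httpx.AsyncClient() as client:
        # Roundtrip 1: the storage index must exist before any share can be
        # uploaded, so this request cannot overlap with the uploads below.
        await client.post(f"{base_url}/storage/{storage_index}")

        # Roundtrip 2: only now can the individual share uploads run in
        # parallel, one HTTP request per share.
        await asyncio.gather(*(
            client.put(f"{base_url}/storage/{storage_index}/{share_num}", content=share_data)
            for share_num, share_data in shares.items()
        ))
```

However the share uploads are scheduled, this shape of API can't be collapsed below two roundtrips, because the second step depends on the first.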
One can imagine an optimized variant of the API that, for smaller files, combines storage index creation and share upload in a single HTTP API call. This is, however, an optimization, and probably needn't exist in the first version.
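To make the idea concrete, here is a hedged sketch of what such a combined call might look like; the endpoint, the `create` query parameter, and the multipart encoding are entirely hypothetical and only illustrate folding both operations into one roundtrip:

```python
import httpx

def upload_small_file_combined(base_url: str, storage_index: str, shares: dict[int, bytes]) -> None:
    # Hypothetical single-call variant: one POST both creates the storage
    # index and delivers every share, so a small file costs one roundtrip.
    httpx.post(
        f"{base_url}/storage/{storage_index}?create=true",
        files={f"share-{num}": data for num, data in shares.items()},
    )
```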
Change History (2)
comment:1 Changed at 2021-08-18T13:38:38Z by exarkun
comment:2 Changed at 2021-08-18T13:41:50Z by exarkun
- Milestone changed from HTTP Storage Protocol to HTTP Storage Protocol v2
Just to the point of such a naive client specifically: there are other motivations not to be this naive. Primarily, all shares are produced at the same time, as the cleartext is processed. If you upload only one of them at a time, you have to store all the rest locally until you're ready to upload them. If you upload in parallel (which the current Tahoe-LAFS does using the Foolscap protocol), then you never have to store any of them locally; you can stream them all up as they're generated.
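A rough sketch of that streaming argument, with assumed names (`encoder.encode`, `uploaders[i].send`) standing in for the real erasure-coding and per-server upload machinery, which of course looks nothing like this in Tahoe-LAFS itself:

```python
import asyncio

async def stream_shares(segments, encoder, uploaders) -> None:
    # For each cleartext segment, zfec-style encoding yields one block per
    # share. Sending every block to its server immediately means only one
    # segment's worth of blocks is ever held in memory; nothing is spooled
    # to local disk waiting for its turn to upload.
    for segment in segments:
        blocks = encoder.encode(segment)
        await asyncio.gather(*(
            uploaders[i].send(block) for i, block in enumerate(blocks)
        ))
```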
For small files, who cares. But for large files this is likely to be pretty crummy - especially given ZFEC expansion, which means you might end up storing 2x or 3x or more (technically the maximum is 255x I think, but that's not a very likely client configuration).
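For concreteness, the expansion factor is just total shares over needed shares (N/k):

```python
# Illustrative arithmetic only, not Tahoe-LAFS code: stored bytes are
# roughly filesize * N / k for k-of-N encoding.
k, n = 3, 10      # the default 3-of-10 encoding
print(n / k)      # ~3.33x expansion
print(255 / 1)    # the ~255x ceiling mentioned above, i.e. 1-of-255
```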