#252 closed enhancement (fixed)
smaller segments would facilitate better client progress indicators and better alacrity
Reported by: | zooko | Owned by: | warner |
---|---|---|---|
Priority: | major | Milestone: | 1.1.0 |
Component: | code-frontend-web | Version: | 0.7.0 |
Keywords: | wui progress meter | Cc: | |
Launchpad Bug: | | | |
Description
The Firefox download progress meter, and the upload progress meter as well, are either absent entirely or confused, according to a report from user "cryptomail".
I suspect this is because the 1 MiB segment size is too large for Firefox's progress estimator on typical links.
Reducing it to 256 KiB or 128 KiB would increase overhead a bit, but not, if I recall correctly, by a lot.
Someone should try that (i.e., change this constant) and see whether Firefox (or another HTTP client) shows a better progress meter.
Change History (9)
comment:1 Changed at 2008-01-05T05:55:48Z by warner
What was the environment like? Specifically, were they using a local Tahoe node, or a remote one? If the bottleneck is between the node and the storage servers, then yeah, the browser would see nothing for a while, then suddenly a whole segment, then nothing, then a segment. But if it's using a remote node that is fairly close to the storage servers, I'd expect to see smoother downloading.
We should also verify that we're providing a Content-Length header properly. Without that, the download progress meter doesn't have a chance (and would probably revert to the "spinner" behavior, where it indicates that progress is being made but doesn't try to imply percentage completion at all).
comment:2 Changed at 2008-01-06T05:39:49Z by zooko
I believe cryptomail was using http://nooxie.zooko.com:8123, which is pretty close to the storage servers.
I just now tested download of this text file:
time curl --dump-header wheefoo 'http://127.0.0.1:8123/uri/URI%3ACHK%3Aq3hkqnogpff1psg4fga3cdnpyw%3Axzem1ws4fbkmqmsa4n3sj6pqhgixxrtbb37qkpyj4ug8yzj8ppky%3A3%3A12%3A683?filename=The+Kleptones+-+A+Night+At+The+Hip-Hopera.txt'
There is a Content-Length header:
HTTP/1.1 200 OK
Date: Sun, 06 Jan 2008 05:30:20 GMT
Content-length: 683
Content-type: text/plain
Server: TwistedWeb/2.5.0
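For what it's worth, a quick way to check for the header from a script rather than by reading curl output would be something like the sketch below; the URL is just a placeholder for any file URL served by a node's WUI.

```python
import urllib.request

# Placeholder URL; substitute any file URL served by the local node's WUI.
url = "http://127.0.0.1:8123/uri/URI%3ACHK%3A.../some-file.txt"

with urllib.request.urlopen(url) as resp:
    # Response headers are available before the body is read, so this does
    # not download the whole file.
    length = resp.headers.get("Content-Length")
    if length is None:
        print("no Content-Length header: browsers can only show a spinner")
    else:
        print("Content-Length: %s bytes" % length)
```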
comment:3 Changed at 2008-02-20T20:06:15Z by zooko
The constant in question has moved here: src/allmydata/client.py@2169#L48
http://allmydata.org/trac/tahoe/browser/src/allmydata/client.py#L48
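For readers following along, the change under discussion amounts to lowering the default maximum segment size used when encoding immutable files. A rough sketch of what that looks like; the constant and dictionary names here are illustrative, not necessarily the exact ones at the link above.

```python
KiB = 1024
MiB = 1024 * KiB

# Illustrative default encoding parameters for immutable files. Dropping the
# maximum segment size from 1 MiB to 128 KiB gives HTTP clients roughly 8x as
# many progress updates per MiB downloaded, at the cost of somewhat more
# hash-tree overhead per file.
DEFAULT_ENCODING_PARAMETERS = {
    "k": 3,                         # shares needed to reconstruct a file
    "n": 10,                        # total shares produced
    "max_segment_size": 128 * KiB,  # proposed; previously 1 * MiB
}
```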
comment:4 Changed at 2008-03-03T22:27:03Z by zooko
- Summary changed from smaller segments would enable HTTP client progress indicators to smaller segments would facilitate better client progress indicators and better alacrity
comment:5 Changed at 2008-03-08T01:27:46Z by warner
Using 128 KiB makes Firefox's progress meter much smoother. Our work DSL line gives us about 1 Mbps downstream, and since 128 KiB is roughly 1 Mbit, that works out to about one segment per second, which is probably a good goal.
According to 'python misc/sizes.py -m gamma' (patched to use 3-of-10, set MAX_SEGSIZE to either 1MiB or 128KiB, account for the fact that we keep three sets of hashes, and fix some basic confusion):
- 1 MiB segsize gives us something like 0.12% overhead
- 256 KiB gives us about 0.45% overhead
- 128 KiB gives us something like 0.93% overhead
I uploaded a 2 MB file (2035089 bytes, to be exact) at a 128 KiB segsize, and each of the 10 resulting shares was 682633 bytes, so 6.83 MB total. Discounting the 3.3x expansion factor, this is a net overhead of 0.63%.
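That 0.63% figure checks out against the numbers quoted above (simple arithmetic, not Tahoe code):

```python
# Figures from the 128 KiB segsize measurement above.
file_size  = 2035089        # bytes in the uploaded file
share_size = 682633         # bytes per share
num_shares = 10
k          = 3              # shares needed (3-of-10 encoding)

stored = share_size * num_shares        # 6826330 bytes actually stored
ideal  = file_size * num_shares / k     # pure 10/3 expansion, no hashes

print("net overhead: %.2f%%" % (100 * (stored - ideal) / ideal))  # ~0.63%
```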
So I'm ok with changing the default segsize for immutable files to be 128KiB.
Can we do this safely right now? I think so. CHK keys will change (so already-uploaded files will be uploaded again), but we hash the encoding parameters into the CHK key, so this should not cause multiple encodings for the same SI or anything too weird.
Mutable files need to remain at 1 MB, since the current design only allows one segment, and 128 KiB would impose too small a size limit on our directory nodes.
I'll do this change today.
comment:6 Changed at 2008-03-10T19:04:28Z by warner
Uh oh, it appears that lowering the max-seg-size to 128 KiB also dropped our in-colo upload and download speeds drastically. DSL speeds were unaffected, which suggests to me that the extra round-trip times are the problem: we're not taking full advantage of the link capacity, which is a bigger deal in-colo because the links are so much faster.
max-seg-size | upload (100 MB) | download (100 MB) | download (10 MB) | download (1 MB) |
---|---|---|---|---|
1 MiB | 2 MBps | 4.63 MBps | 4.55 MBps | 4.69 MBps |
512 KiB | 1.5/1.8/1.84 MBps | 4.18/4.34/4.34 MBps | 4.39/3.30/4.37 MBps | 4.79/4.48/4.49 MBps |
256 KiB | 1.7 MBps | 3.65 MBps | 4.0 MBps | 4.23 MBps |
128 KiB | 1.4 MBps | 2.0 MBps | 3.44 MBps | 3.74 MBps |
At a 1 MiB max-seg-size, speeds were mostly constant across file sizes (1 MB, 10 MB, 100 MB). Upload varied by about 5% (e.g. 1 MB at 2.01 MBps and 100 MB at 1.86 MBps), and download by about 4% (1 MB at 4.80 MBps, 10 MB at 4.50 MBps, and 100 MB at 4.61 MBps).
At a 128 KiB max-seg-size, upload speeds were mostly constant, but download speeds were significantly faster for smaller files: 1 MB ran at 3.74 MBps and 10 MB at 3.44 MBps, while 100 MB ran at 2.07 MBps.
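A crude stop-and-wait model illustrates why the slowdown bites in-colo but not over DSL: if each segment costs one idle round trip on top of its transfer time, effective throughput is roughly segsize / (segsize/bandwidth + RTT), and the RTT term only matters when the transfer time is tiny. The bandwidth and RTT numbers below are invented for illustration, not measurements:

```python
KiB, MiB = 1024, 1024 * 1024

def effective_throughput(segsize, link_bw, rtt):
    """Stop-and-wait model: each segment takes (transfer time + one RTT)."""
    return segsize / (segsize / link_bw + rtt)

# Invented example links: a fast in-colo path vs. a ~1 Mbps DSL line.
for name, bw, rtt in [("in-colo", 10 * MiB, 0.005), ("DSL", 125 * KiB, 0.030)]:
    for segsize in (1 * MiB, 128 * KiB):
        rate = effective_throughput(segsize, bw, rtt) / MiB
        print("%-7s segsize=%4d KiB -> %.2f MiB/s" % (name, segsize // KiB, rate))
```

With these made-up numbers, the fast link loses roughly a quarter of its throughput at 128 KiB while the DSL line barely notices, which matches the shape (if not the exact magnitude) of the table above; pipelining segments would hide most of that idle round-trip time.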
comment:7 Changed at 2008-03-10T19:09:41Z by warner
- Owner set to warner
- Status changed from new to assigned
Zooko pointed out that pipelining our block sends could probably help here; we'd just have to limit the maximum number of outstanding segments to keep our overall memory consumption reasonable. With 128 KiB segments, allowing two to be outstanding at once has a good chance of keeping the pipeline full, while never letting more than 128 KiB * expansion * 2 sit in the transport buffers at any one time, which is still much smaller than the 1 MiB segsize case.
The code needs to be a bit more complex, however.
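The real change would live in the Twisted upload/download loops, but purely as an illustration of the "bounded number of outstanding segments" idea, here is a self-contained sketch; the names, the sleep, and the asyncio usage are all invented for the example and are not Tahoe code.

```python
import asyncio

MAX_OUTSTANDING = 2   # segments allowed in flight at once, to bound memory use

async def send_segment(segnum, blocks):
    """Stand-in for pushing one encoded segment's blocks to the servers."""
    await asyncio.sleep(0.01)          # pretend this is the network round trip
    return segnum

async def upload(segments):
    sem = asyncio.Semaphore(MAX_OUTSTANDING)

    async def send_bounded(segnum, blocks):
        async with sem:                # at most MAX_OUTSTANDING sends at once
            return await send_segment(segnum, blocks)

    # All sends are scheduled up front, but the semaphore keeps only two in
    # flight, so buffered data stays under roughly 2 * segsize * expansion.
    tasks = [asyncio.create_task(send_bounded(i, seg))
             for i, seg in enumerate(segments)]
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    fake_segments = [b"x" * (128 * 1024) for _ in range(8)]  # eight 128 KiB segments
    print(asyncio.run(upload(fake_segments)))
```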
I'm not sure how to prioritize this. For the benefit of end-users, I'd like the smaller segsize. For the benefit of our own servers, I'd like faster in-colo upload/download speeds. It might be possible to accomplish both, later, by adding pipelining code. But if that doesn't work, we might be stuck with the slower approach.
I guess I'll leave it at 128KiB for now and hope to implement pipelining within the next two weeks.
comment:8 Changed at 2008-04-23T18:58:02Z by warner
- Milestone changed from undecided to 1.0.1
- Resolution set to fixed
- Status changed from assigned to closed
Closing this ticket, since we're now using 128 KiB segments. Created ticket #392 to track pipelining multiple segments, to regain the speed lost by switching to 128 KiB segments.
comment:9 Changed at 2008-05-05T21:08:36Z by zooko
- Milestone changed from 1.0.1 to 1.1.0
Milestone 1.0.1 deleted