[tahoe-dev] [tahoe-lafs] #392: pipeline upload segments to make upload faster

tahoe-lafs trac at allmydata.org
Wed Apr 15 13:10:20 PDT 2009


#392: pipeline upload segments to make upload faster
------------------------------+---------------------------------------------
 Reporter:  warner            |           Owner:  warner    
     Type:  enhancement       |          Status:  new       
 Priority:  major             |       Milestone:  eventually
Component:  code-performance  |         Version:  1.0.0     
 Keywords:  speed             |   Launchpad_bug:            
------------------------------+---------------------------------------------

Comment(by warner):

 So, using the attached patch, I added pipelined writes to the immutable
 upload operation. The {{{Pipeline}}} class allows up to 50KB of
 unacknowledged data in the pipe before it starts blocking the sender:
 calls to {{{WriteBucketProxy._write}}} return {{{defer.succeed}}} until
 more than 50KB is outstanding, after which they return regular Deferreds
 that only fire as earlier writes are retired. A terminal {{{flush()}}}
 call makes the Upload wait for the pipeline to drain before the upload is
 considered complete.
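
 For reference, here is a minimal sketch of the idea (not the attached
 patch itself): writes are sent immediately, callers get an already-fired
 Deferred while fewer than {{{capacity}}} bytes are unacknowledged, and
 {{{flush()}}} fires once everything has been retired. The names and the
 (absent) error handling here are simplified assumptions.

{{{
from twisted.internet import defer

class PipelineSketch:
    def __init__(self, capacity=50000):
        self.capacity = capacity   # bytes allowed in flight
        self.gauge = 0             # bytes currently unacknowledged
        self.waiting = []          # Deferreds handed to blocked senders
        self.flushers = []         # Deferreds handed to flush() callers

    def add(self, size, send_fn, *args, **kwargs):
        """Send immediately; return an already-fired Deferred if we are
        under capacity, otherwise one that fires when writes retire."""
        self.gauge += size
        d = send_fn(*args, **kwargs)   # must return a Deferred
        d.addCallback(self._retire, size)
        if self.gauge <= self.capacity:
            return defer.succeed(None)
        blocked = defer.Deferred()
        self.waiting.append(blocked)
        return blocked

    def _retire(self, result, size):
        self.gauge -= size
        if self.gauge <= self.capacity:
            while self.waiting:        # room again: unblock senders
                self.waiting.pop(0).callback(None)
        if self.gauge == 0:
            while self.flushers:       # pipeline fully drained
                self.flushers.pop(0).callback(None)
        return result

    def flush(self):
        """Return a Deferred that fires once all writes are retired."""
        if self.gauge == 0:
            return defer.succeed(None)
        d = defer.Deferred()
        self.flushers.append(d)
        return d
}}}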

 A quick performance test (in the same environments we use for the
 buildbot performance tests: my home DSL line and tahoecs2 in colo) showed
 a significant improvement in the DSL per-file overhead, but only about a
 10% improvement in the overall upload rate (for both DSL and colo).

 Basically, the 7 writes used to write a small file (header, segment 0,
 crypttext_hashtree, block_hashtree, share_hashtree, uri_extension, close)
 are all put on the wire together, so they take bandwidth plus 1 RTT
 instead of bandwidth plus 7 RTT. Eliminating those 6 round trips saves
 about 1.8 seconds per file over my DSL line. (My ping time to the servers
 is about 11ms, but there is kernel/python/twisted/foolscap/tahoe overhead
 on top of that.)
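
 A quick back-of-the-envelope check of those numbers (a hedged sketch
 using only the figures quoted above):

{{{
# 1.8 seconds saved by removing 6 sequential waits implies an effective
# per-write turnaround of about 300ms, far more than the raw 11ms ping:
saved_seconds = 1.8        # observed per-file improvement over DSL
raw_ping = 0.011           # ~11ms round trip to the servers
effective_rtt = saved_seconds / 6
print("effective per-write turnaround: %.0f ms" % (effective_rtt * 1000))
print("overhead beyond the raw ping:   %.0f ms" % ((effective_rtt - raw_ping) * 1000))
# most of the per-write cost is kernel/python/twisted/foolscap/tahoe
# overhead rather than wire latency
}}}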

 For a larger file, pipelining might increase the utilization of the wire,
 particularly if you have a "long fat" pipe (high bandwidth but high
 latency). However, with 10 shares going out at the same time, the wire is
 probably pretty full already. The ratio of interest is the time to send
 one segment's worth of blocks, (segsize*N/k)/BW, divided by the RTT. You
 send N blocks for a single segment at once, then you wait for all the
 replies to come back, then generate the next blocks. If the time it takes
 to send a single block is greater than the server's turnaround time, then
 N-1 responses will be received before the last block is finished sending,
 so you've only got one RTT of idle time (while you wait for the last
 server to respond). Pipelining will fill this last RTT, but my guess is
 that this isn't much of a help, and that something else is needed to
 explain the performance hit we saw in colo when we moved to larger
 segments.
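
 To make that ratio concrete, here is a rough illustration. The encoding
 parameters and upstream bandwidth below are assumptions chosen for the
 sake of the example, not measurements from these runs:

{{{
# Ratio of interest: time to put one segment's worth of blocks on the
# wire, (segsize*N/k)/BW, divided by the round-trip time.
def send_time_over_rtt(segsize, n, k, upstream_Bps, rtt):
    block_bytes = segsize / float(k)          # one block per server
    send_time = (block_bytes * n) / upstream_Bps
    return send_time / rtt

segsize = 128 * 1024     # assumed segment size
n, k = 10, 3             # assumed 3-of-10 encoding
dsl_upstream = 80e3      # assumed ~80 kB/s of raw upstream (DSL-ish)
dsl_rtt = 0.011          # ~11 ms ping, from the DSL tests above
print("DSL ratio: %.0f" % send_time_over_rtt(segsize, n, k, dsl_upstream, dsl_rtt))
# A ratio far above 1 means the wire is busy for almost the whole
# segment, so pipelining can reclaim at most ~1 RTT of idle time per
# segment -- consistent with the modest (~10%) upload-rate improvement.
}}}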

 DSL no pipelining:

 {{{
 TIME (startup): 2.36461615562 up, 0.719145059586 down
 TIME (1x 200B): 2.38471603394 up, 0.734190940857 down
 TIME (10x 200B): 21.7909920216 up, 8.98366594315 down
 TIME (1MB): 45.8974239826 up, 5.21775698662 down
 TIME (10MB): 449.196600914 up, 34.1318571568 down
 upload per-file time: 2.179s
 upload speed (1MB): 22.87kBps
 upload speed (10MB): 22.37kBps
 }}}

 DSL with pipelining:

 {{{
 TIME (startup): 0.437352895737 up, 0.185742139816 down
 TIME (1x 200B): 0.493880987167 up, 0.202013969421 down
 TIME (10x 200B): 5.15211510658 up, 2.04516386986 down
 TIME (1MB): 43.141931057 up, 2.09753513336 down
 TIME (10MB): 416.777194977 up, 19.6058299541 down
 upload per-file time: 0.515s
 upload speed (1MB): 23.46kBps
 upload speed (10MB): 24.02kBps
 }}}

 The in-colo tests showed roughly the same improvement to upload speed,
 but very little change to the per-file time. The RTT there is shorter
 (ping time is about 120us), which might explain the difference, but I
 think the slowdown lies elsewhere. Pipelining shaves about 30ms off each
 file, and increases the overall upload speed by about 10%.


 colo no pipelining:

 {{{
 TIME (startup): 0.29696393013 up, 0.0784759521484 down
 TIME (1x 200B): 0.285771131516 up, 0.0790619850159 down
 TIME (10x 200B): 3.23165798187 up, 0.849181175232 down
 TIME (100x 200B): 31.7827451229 up, 8.95765590668 down
 TIME (1MB): 1.00738477707 up, 0.347244977951 down
 TIME (10MB): 7.12743496895 up, 2.9827849865 down
 TIME (100MB): 70.9683670998 up, 25.6454920769 down
 upload per-file time: 0.318s
 upload per-file times-avg-RTT: 83.833386
 upload per-file times-total-RTT: 20.958347
 upload speed (1MB): 1.45MBps
 upload speed (10MB): 1.47MBps
 upload speed (100MB): 1.42MBps
 }}}

 colo with pipelining:

 {{{
 TIME (startup): 0.262734889984 up, 0.0758249759674 down
 TIME (1x 200B): 0.271718025208 up, 0.0812950134277 down
 TIME (10x 200B): 2.80361104012 up, 0.838641881943 down
 TIME (100x 200B): 28.4790999889 up, 9.36092710495 down
 TIME (1MB): 0.853738069534 up, 0.337486028671 down
 TIME (10MB): 6.6658270359 up, 2.67381596565 down
 TIME (100MB): 64.6233050823 up, 26.5593090057 down
 upload per-file time: 0.285s
 upload per-file times-avg-RTT: 77.205647
 upload per-file times-total-RTT: 19.301412
 upload speed (1MB): 1.76MBps
 upload speed (10MB): 1.57MBps
 upload speed (100MB): 1.55MBps
 }}}

 I want to run some more tests before landing this patch, to make sure
 it's really doing what I thought it should be doing. I'd also like to
 improve the automated speed-test to do a simple TCP transfer to measure
 the available upstream bandwidth, so we can compare tahoe's upload speed
 against the actual wire.
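
 For the raw-bandwidth measurement, something along these lines would
 probably do. This is only a sketch: the sink host/port and the protocol
 (the receiver reading everything and then closing) are assumptions, not
 part of the existing speed-test:

{{{
import socket, time

def measure_upstream(host, port, nbytes=10 * 1024 * 1024, chunk=64 * 1024):
    """Push nbytes at a cooperating TCP sink and report bytes/second."""
    payload = "\x00" * chunk
    sent = 0
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((host, port))
    start = time.time()
    try:
        while sent < nbytes:
            s.sendall(payload)
            sent += len(payload)
        s.shutdown(socket.SHUT_WR)
        s.recv(1)       # wait for the sink to finish reading/close, so
                        # locally-buffered bytes are actually on the wire
    finally:
        s.close()
    return sent / (time.time() - start)

# example: print "raw upstream: %.1f kBps" % (measure_upstream("tahoecs2", 9999) / 1000)
}}}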

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/392#comment:3>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

