[tahoe-dev] effect of packet size on erasure coding speed
zooko
zooko at zooko.com
Sun Sep 14 10:20:13 PDT 2008
Dear people of zfec-dev and tahoe-dev:
A researcher named Jim Plank and a few collaborators including myself
have written a paper on performance of open source implementations of
various erasure coding schemes. Along the way, I ran a few
measurements of zfec performance using different "stride lengths". A
"stride length" is an internal tuning parameter of the zfec
library; it is not exposed in the API and has no compatibility
implications -- it is entirely invisible to the user except in its
effect on performance.
(Note for users of zfec or Tahoe who don't care about these details:
you can safely ignore all of this. Future releases of zfec will
probably be faster, and taking advantage of the more efficient
encoding won't require any change on your part.)
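Roughly speaking, the stride determines how the encoder walks the
input: it processes one stride-sized chunk at a time, computing every
check block's contribution for that chunk before moving on, so that
the chunk can stay in cache. Here is a toy sketch of the idea in
Python -- not zfec's actual code (which is C), and with plain XOR
standing in for the real GF(256) Reed-Solomon arithmetic:

    # Toy sketch of striding -- NOT zfec's actual code, and plain XOR
    # stands in for the real GF(256) arithmetic. (With XOR every check
    # block comes out identical; the point here is the memory access
    # pattern, not the math.)

    def encode_unstrided(inputs, num_checks):
        # For each check block, scan ALL of the input end to end.
        # If the inputs are bigger than the cache, every pass has to
        # re-fetch them from main RAM.
        size = len(inputs[0])
        checks = [bytearray(size) for _ in range(num_checks)]
        for check in checks:
            for buf in inputs:
                for i in range(size):
                    check[i] ^= buf[i]
        return checks

    def encode_strided(inputs, num_checks, stride=592):
        # Walk the input one stride-sized chunk at a time, computing
        # every check block's contribution for that chunk before
        # moving on, so the chunk stays hot in the cache.
        size = len(inputs[0])
        checks = [bytearray(size) for _ in range(num_checks)]
        for start in range(0, size, stride):
            end = min(start + stride, size)
            for check in checks:
                for buf in inputs:
                    for i in range(start, end):
                        check[i] ^= buf[i]
        return checks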
This graph of the performance on a PowerPC G4 867 MHz is interesting:
[Attachment: benchresults-from-draco.eps (application/postscript, 93357 bytes):
http://allmydata.org/pipermail/tahoe-dev/attachments/20080914/4bb5417f/attachment-0002.eps]
The x axis is the stride length on a logarithmic scale from 1 byte to
about 20,000 bytes, and the y axis is encoding speed in bytes per
second, from about 10,500,000 to about 12,000,000.
This shows that the performance of zfec on a cache-constrained 32-bit
CPU could be improved by setting the stride length to around 592
bytes instead of its current setting of 8192 bytes (in zfec-1.4.0),
and greatly improved over its old setting (in zfec-1.1.0) of
"whatever size the input buffer is".
Here's the same experiment (with a different distribution of stride
lengths tested) on a 64-bit Intel Core 2 Duo:
[Attachment: benchresults-from-ootles.eps (application/postscript, 96649 bytes):
http://allmydata.org/pipermail/tahoe-dev/attachments/20080914/4bb5417f/attachment-0003.eps]
The x axis is the stride length on a logarithmic scale from 1 byte to
about 20,000 bytes, and the y axis is encoding speed in bytes per
second, from about 60,000,000 to about 77,000,000.
This shows that the larger the stride, the faster zfec runs on such a
modern, cache-rich, high-memory-bandwidth machine. I ran other
experiments, not shown here, to see if the curve would turn down
again, and it doesn't -- it maxes out at about 77,000,000 bytes per
second no matter how large the stride. Basically, the high memory
bandwidth and the CPU's clever prefetching from main RAM exceed the
speed of the zfec Reed-Solomon computation.
Now in theory, if you have a modern, cache-rich CPU like this and you
load it down with lots of concurrent processes, then the shape of the
graph should resemble the first graph, not the second one! The
theory is that contention among multiple processes for the cache and
RAM should make larger strides more expensive, in the same way that
the PowerPC G4's small cache and low memory bandwidth make exceeding
the cache expensive. If this is true, then using a limited stride
should let you serve significantly more erasure-coding clients with
the same server.
However, I haven't figured out a way to test this hypothesis --
perhaps that could be future work for Jim Plank. :-)
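The crudest starting point would be something like the following
(hypothetical and untested, using the same assumed zfec.Encoder API
as the benchmark sketch above), though measuring cache contention
cleanly is harder than just spawning processes: run several encoder
processes at once and watch whether per-process throughput falls off
faster than the core count alone would explain.

    import os, time
    from multiprocessing import Pool
    import zfec

    K, M = 3, 10
    BLOCKSIZE = 1024 * 1024
    REPS = 50

    def worker(unused):
        # Each process runs its own encoding loop and reports its
        # own throughput in bytes per second.
        enc = zfec.Encoder(K, M)
        primaries = [os.urandom(BLOCKSIZE) for _ in range(K)]
        start = time.time()
        for _ in range(REPS):
            enc.encode(primaries)
        return K * BLOCKSIZE * REPS / (time.time() - start)

    if __name__ == '__main__':
        for nprocs in (1, 2, 4, 8):
            pool = Pool(nprocs)
            rates = pool.map(worker, range(nprocs))
            pool.close()
            # If cache contention matters, per-process throughput
            # should fall as nprocs grows past the core count.
            print("%d procs: %.0f bytes/sec each, %.0f aggregate"
                  % (nprocs, sum(rates) / nprocs, sum(rates)))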
Personally, I'm going to be working on pycryptopp [1, 2], and
immutable file checking and repair for the next little while.
Regards,
Zooko
[1] http://allmydata.org/trac/pycryptopp/report/1
[2] http://allmydata.org/trac/tahoe/ticket/331
---
http://allmydata.org -- Tahoe, the Least-Authority Filesystem
http://allmydata.com -- back up all your files for $5/month