wiki:Performance

Version 40 (modified by brimstone, at 2013-11-15T14:47:23Z)

Updated links to point to proper subdomain

(See also copious notes and data about performance of older versions of Tahoe-LAFS, archived at Performance/Old.)

In September 2011 Brian Warner ran thorough benchmarks on a four-machine grid of dedicated physical hardware that was doing nothing other than running the benchmarks: see here.

Here are two tickets about building automation to measure and display performance: #1406, #1530

In late 2010 Kyle Markley did some benchmarking of what were then the release candidates for Tahoe-LAFS v1.8.0. This helped us catch two major performance regressions in Brian's New Downloader and helped make Tahoe-LAFS v1.8.0 into an excellent new release (see epic ticket #1170 for mind-numbing details). Kyle also contributed his benchmarking scripts (written in Perl), but to my knowledge nobody has yet tried to reuse them.

We also experimented with different segment sizes and immutable uploader pipeline depths, and the results tentatively confirmed that the current segment size (128 KiB) and immutable uploader pipeline depth (50,000 B) were better on both of Kyle's networks than any of the alternatives that Kyle tried.

Along the way Terrell Russell did some benchmarking and contributed a bash script which I used several times during the process:

At about the same time Nathan Eisenberg of Atlas Networks did a couple of manual benchmarks:

François Deppierraz has also run a few benchmarks, but I can't find a link to his results.

Jeff Darcy benchmarked Tahoe-LAFS vs. his new CloudFS (based on Gluster) vs. encfs vs. ecryptfs vs. Gluster, using iozone: https://lists.fedorahosted.org/pipermail/cloudfs-devel/2011-June/000097.html https://lists.fedorahosted.org/pipermail/cloudfs-devel/2011-June/000099.html

Ticket #932 (benchmark Tahoe-LAFS compared to nosql dbs) proposes running the standard "YCSB" benchmarks for NoSQL databases against Tahoe-LAFS.

What we really want, of course, is automated benchmarks that run at regularly scheduled intervals, or whenever a new patch is committed to revision control, or both. Ideally they would run on dedicated hardware, or at least on virtualized hardware with a fairly consistent load from other tenants, so that other people's activity does not add too much noise to the measurements. You can see on Performance/Old that we used to have such an automated setup, including graphs of the resulting performance.
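As a rough illustration of what one step of such automation might look like, here is a minimal Python sketch that times an operation several times, reduces the noisy samples to a median, and tags the result with a revision identifier so it could later be graphed. Everything in it is an assumption made for illustration: the function names, the "example-revision" label, and the placeholder workload are hypothetical, and nothing here calls Tahoe-LAFS itself or reflects the setup described on Performance/Old.

```python
import statistics
import time


def time_operation(operation, repeats=5):
    """Run `operation` `repeats` times and return the elapsed seconds of each run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce noisy samples to a median and a spread (max minus min)."""
    return {
        "median_s": statistics.median(samples),
        "spread_s": max(samples) - min(samples),
        "runs": len(samples),
    }


def record(revision, name, samples, history):
    """Append one summarized measurement, tagged with its revision, to `history`."""
    row = {"revision": revision, "benchmark": name}
    row.update(summarize(samples))
    history.append(row)
    return row


def placeholder_workload():
    # Hypothetical stand-in: a real run would upload or download a file
    # through a storage grid here instead of doing arithmetic.
    sum(range(100_000))


if __name__ == "__main__":
    history = []
    samples = time_operation(placeholder_workload, repeats=5)
    print(record("example-revision", "placeholder", samples, history))
```

The median is used rather than the mean because a single slow run caused by an unrelated tenant distorts it less. A scheduler or a commit hook would be expected to supply the revision label and persist `history` somewhere durable.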
