[tahoe-dev] [tahoe-lafs] #678: converge same file, same K, different M
tahoe-lafs
trac at allmydata.org
Tue Aug 25 01:37:44 PDT 2009
#678: converge same file, same K, different M
---------------------------+------------------------------------------------
Reporter: zooko | Owner:
Type: enhancement | Status: new
Priority: major | Milestone: undecided
Component: code-encoding | Version: 1.3.0
Keywords: | Launchpad_bug:
---------------------------+------------------------------------------------
Comment(by warner):
One significant barrier to this is the share hash tree. When a
file is uploaded, we take the hash of all shares (well, the block
hash tree root for all shares) and put them into a list, then we
build another merkle tree over that list, and store the root of
that tree in the UEB (where it gets folded into the
integrity-checking part of the filecap).
This list is of length "N", the number of shares that were
created, padded up to a power-of-two boundary (for the merkle
tree). This makes it impossible to add more shares to the same
hash tree.
So if a file has already been uploaded with 3-of-10, and now you
decide you'd like to encode it with 3-of-16, then shares 0..9
will be the same, but the merkle tree that was distributed with
the first upload won't contain the hashes for shares 10..15, and
the UEB will be different for the two uploads, so the filecaps
will be different (to be specific, the readkey [and therefore the
storage-index] part will be the same, but the UEB hash that
provides all the integrity information will be different).
Let's analyze the compatibility grid. One one axis is the person
doing the download: either Sally holding the "short" 3-of-10
filecap, or Lucy holding the "long" 3-of-16 filecap. The other is
the class of share being used:
* A: the original 10 shares from the 3-of-10 upload
* B: shares 0..9 from the new 3-of-16 upload
* C: shares 10..15 from the new 3-of-16 upload
(in ideal circumstances, the 3-of-16 upload would skip the "B"
shares, but if that uploader didn't happen to see the "A" shares
out there already, they might generate "B" shares)
Sally handles "A" shares just fine, as usual. When she sees "B"
shares, the current code will reject them because the UEB is
different than the filecap. The shares don't contain a full share
hash tree, but rather just the minimal chain necessary to
validate that one share. If she has enough "A" shares to know the
expected value of the "B" share in question (i.e. that leaf of
the short tree), then she can use the "B" share with confidence
(she'd use the share, but ignore the [long] hash chain that comes
with it). This basically means she must get that share's sibling:
if she has A0, she can validate B1; if she has A3, she can
validate B2, etc.
Should she see "C" shares (say that 5 times fast!), she's stuck.
There is no hash chain that will get her from the C share's hash
to her filecap's UEB hash. She could download the C share anyways
and throw it into FEC, and then test the resulting ciphertext
segment against the ciphertext hash tree: if it matches, great,
the share was good. If it doesn't, then she won't even know which
share had a problem.
Now, what about Lucy? When she sees an "A" share, she'll be in
the same boat as Sally looking at a "B" share. If Lucy manages to
find a sibling "B" share for each potential "A" share, then she
can validate the share hash, and use the share. But if she can't
(such as if there are *no* "B" shares, because the uploader
didn't create any because it saw the "A" shares out there), then
she won't be able to validate the "A" shares, and then she'll be
in the Sally-vs-C-shares boat: try FEC and catch problems with
the ciphertext hash.
So, what could be done to improve this?
* include a full copy of the share hash tree in each share, not
just the minimal chain: this would let Sally use arbitrary "B"
shares and Lucy use arbitrary "A" shares, without forcing them
to locate a sibling share. It still wouldn't let Sally use "C"
shares with confidence.
* allow the Downloader to speculate a bit: keep track of which
shares went into FEC, if the ciphertext hash fails then
iterate through the various combinations of shares, use some
sort of constraint-based logic to figure out which
combinations are still worth trying. The combinatorics will be
smaller with fewer non-validated shares, so the peer-selection
code should prioritize validatable shares.
* if you plan ahead, you can pretend you're encoding to 3-of-16
or 3-of-32 or whatever, but then only actually upload the
first 10 shares. You still have to do all the encoding work,
however, because you have to compute the correct merkle tree
to put in those 10 shares, but you can save the bandwidth and
storage space for the shares that you'll create later. It's
not clear where the break-even point would be.
I'm not sure what else could be done.
For mutable files, it's a slightly different story. The encoding
scheme is basically the same, but if you're willing to modify all
the existing shares, you can increase the size of the share hash
tree later on (build a bigger tree, sign the root, then update
all shares with the new tree). Unfortunately I believe that the
share format puts the share hash tree before the share data, so
if you increase its size, you must also move the data (and the
remote API has no insert() method), so you'll basically have to
re-upload everything, including the old share data. So it's
possible, but expensive.
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/678#comment:3>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid
More information about the tahoe-dev
mailing list