As before, I'd like to keep using "storage index" for what you're calling
the "file identifier", but yes, let's split out a "server selector" or
"peer-selection index" (or some similar term) for the value that determines
which servers you're going to be talking to. One way of describing this would
be: "we used to use the storage-index as the peer-selection index, but these
days we put two separate values in the filecap".
I am also starting to think of these as separate concepts, but remember that
we've yet to actually implement such a split.
Sylvan's concern was about availability: he considered a backup system to be
broken if its design has a built-in probability of file unrecoverability.
It's easier to see the problem if we set the encryption-key and hash lengths
to infinity, but restrict the storage index to, say, four bits. Then upload
two files and try to download one of them: you've got a 1/16 chance of a
download failure, because your two files had the same storage-index, you
downloaded the wrong bits, and now they won't pass the integrity check.
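
To make the four-bit example concrete, here is a minimal sketch that computes
the exact chance that at least two files share a storage index. It assumes
storage indexes are uniformly random and independent; the function name
collision_probability is made up for illustration and is not part of any
existing code.

```python
from fractions import Fraction


def collision_probability(si_bits: int, num_files: int) -> Fraction:
    """Exact probability that at least two of num_files files share a
    storage index, assuming each index is uniform over 2**si_bits values."""
    space = 2 ** si_bits
    p_all_distinct = Fraction(1)
    for i in range(num_files):
        # The i-th file must avoid the i indexes already taken.
        p_all_distinct *= Fraction(max(space - i, 0), space)
    return 1 - p_all_distinct


if __name__ == "__main__":
    # Two files, four-bit storage index: the 1/16 case described above.
    print(collision_probability(4, 2))
```

With more than two files the chance climbs quickly, which is why the storage
index needs to be much longer than the file count alone would suggest.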
Also, when we talk about this, we should be careful to distinguish between
the failure modes of mutable files and those of immutable files: they're very
different. And collisions at different levels have very different
consequences: if the storage index is too small, we'll get availability
failures; if the immutable encryption key or mutable writekey is too small,
we'll get confidentiality failures. I've been assuming that we'll keep the
security parameters sufficiently large. This ticket was specifically about
the availability concerns resulting from a too-small storage index.
If we compress the filecap by deriving the storage-index from the writekey,
then clearly we're limited by min(len(writekey),len(storage-index)).
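
A rough sketch of why the min() applies. The derivation below (a tagged
SHA-256, truncated) is purely hypothetical and is not the hash Tahoe actually
uses; the tag string and both function names are made up for illustration.
The point is only that a derived storage index can never take more distinct
values than the writekey it came from, nor more than its own length allows.

```python
import hashlib


def derive_storage_index(writekey: bytes, si_bytes: int) -> bytes:
    """Hypothetical derivation: tagged SHA-256 of the writekey, truncated
    to si_bytes. Stands in for whatever the real cap format would use."""
    digest = hashlib.sha256(b"storage-index-sketch:" + writekey).digest()
    return digest[:si_bytes]


def distinct_storage_indexes(writekey_bits: int, si_bytes: int) -> int:
    """Enumerate every possible writekey of the given bit length and count
    how many distinct storage indexes come out."""
    key_len = (writekey_bits + 7) // 8
    seen = {
        derive_storage_index(k.to_bytes(key_len, "big"), si_bytes)
        for k in range(2 ** writekey_bits)
    }
    return len(seen)


if __name__ == "__main__":
    # 8-bit writekey, 128-bit storage index: at most 2**8 distinct values.
    print(distinct_storage_indexes(8, 16))
    # 16-bit writekey, 8-bit storage index: still at most 2**8 distinct values.
    print(distinct_storage_indexes(16, 1))
```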
Mostly I ticketed this issue because it's something I want to keep in mind as
we design the next revision of the filecap format. If we don't already have a
wiki page for it, I'll add one to organize our ideas. I think they're
currently spread across half a dozen tickets.
I updated the table in the description: I think 192-bit caps will let us have
an effectively-infinite number of files (2^64) with an effectively-zero
chance of collision (2^-65).
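
For reference, those two numbers are consistent with the usual birthday
approximation, p ~= n^2 / 2^(b+1) for n files and a b-bit index. The sketch
below works in log2 to avoid huge integers. It assumes all 192 bits
effectively act as the storage index and that indexes are uniformly random;
the function name is made up for illustration.

```python
def log2_collision_bound(log2_files: int, si_bits: int) -> int:
    """log2 of the birthday approximation n**2 / 2**(si_bits + 1),
    where n = 2**log2_files. Only meaningful when the result is well
    below zero (i.e. the probability is small)."""
    return 2 * log2_files - (si_bits + 1)


if __name__ == "__main__":
    # 2^64 files with a 192-bit index: 2*64 - 193 = -65, i.e. about 2^-65.
    print(log2_collision_bound(64, 192))
```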
[edited to fix trac markup]