[tahoe-dev] [tahoe-lafs] #778: "shares of happiness" is the wrong measure; "servers of happiness" is better
tahoe-lafs
trac at allmydata.org
Sat Oct 10 15:23:54 PDT 2009
#778: "shares of happiness" is the wrong measure; "servers of happiness" is
better
--------------------------------+-------------------------------------------
Reporter: zooko | Owner: kevan
Type: defect | Status: new
Priority: critical | Milestone: 1.5.1
Component: code-peerselection | Version: 1.4.1
Keywords: reliability | Launchpad_bug:
--------------------------------+-------------------------------------------
Comment(by kevan):
Hm. That scenario would be a problem, and I don't really see an obvious
solution to it.
We could alter the logic at
[source:src/allmydata/immutable/upload.py@4045#L225] so that it does not
simply give up when it finds that there are no homeless shares left but
too few distinct servers hold shares for the upload to count as a success.
We could, for example, figure out how many more servers need to have
shares on them for the upload to work ( {{{n = servers_of_happiness -
servers_with_shares}}}). We could then unallocate {{{n}}} shares from
servers that have more than one share allocated, stick them back in
{{{self.homeless_shares}}}, and then let the selection process continue as
normal. We'd need a way to prevent it from looping, though -- maybe it
should only do this if there are uncontacted peers. Would we want to
remove shares from servers that happen to already have them if we're not
counting them in the upload? If so, is there a way to do that?
Does that idea make sense?
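A rough sketch of the reallocation step described above, in Python. This
is not the code in upload.py: the function name, the {{{allocations}}}
dict (server id -> set of share numbers) and the {{{uncontacted_peers}}}
list are hypothetical names used only for illustration. It only shows
computing {{{n}}}, refusing to act when there are no uncontacted peers (the
loop guard suggested above), and moving shares off servers that hold more
than one.

```python
def reclaim_shares_for_happiness(allocations, homeless_shares,
                                 servers_of_happiness, uncontacted_peers):
    """Move up to n shares from multiply-loaded servers back into
    homeless_shares, where n = servers_of_happiness - servers_with_shares.

    allocations: dict mapping server id -> set of share numbers (assumed shape)
    homeless_shares: list of share numbers awaiting placement
    Returns the number of shares moved.
    """
    servers_with_shares = sum(1 for shares in allocations.values() if shares)
    n = servers_of_happiness - servers_with_shares

    # Guard against looping: only reclaim if another server could take a share.
    if n <= 0 or not uncontacted_peers:
        return 0

    moved = 0
    # Visit the most heavily loaded servers first.
    by_load = sorted(allocations, key=lambda s: len(allocations[s]), reverse=True)
    for server in by_load:
        # Never strip a server of its last share; it still counts as a server.
        while moved < n and len(allocations[server]) > 1:
            share = max(allocations[server])
            allocations[server].remove(share)
            homeless_shares.append(share)
            moved += 1
        if moved == n:
            break
    return moved


allocations = {"A": {0, 1, 2}, "B": {3}}
homeless = []
moved = reclaim_shares_for_happiness(allocations, homeless, 3, ["C"])
```

In this example two servers hold shares and three are required, so one
share is taken from server A and handed back to the normal selection
process. The sketch does not answer the open question about shares that
already exist on a server from an earlier upload.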
Regarding holding up this patch versus committing now and making it a
separate issue:
* We'd probably want to write tests for this behavior. Do the test tools
in Tahoe include a way to configure a grid so that it looks like the one
in your example? I spent a while looking for such tools last weekend when
I was trying to implement a test for your first example, but couldn't find
them. If they don't exist, we'd probably need to write them.
* We'd probably want to define the algorithm sketched in the paragraph
above more precisely (assuming that it is agreeable to everyone).
I have school and work to keep me busy, so I'd be able to dedicate maybe
an afternoon or two a week to keep working on this issue. I'm happy to do
that -- I'd like to finish it -- but it would probably be a little while
before we ended up committing a fix if we waited for that to be done (if
someone with more time on their hands wanted to take over, that issue
would be solved, I guess). So I guess that's one argument for making it a
separate issue. On the other hand, it'd be nice to eliminate edge cases
before committing. So there's that. I'm not sure which way I lean.
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/778#comment:55>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid