[tahoe-dev] on backward-compatibility and new features
Brian Warner
warner at lothar.com
Thu Nov 19 21:01:30 PST 2009
Zooko Wilcox-O'Hearn wrote:
> Folks:
>
> This is a general lesson that I've learned over the years, from
> following the darcs project among other things.
...
> So concretely, we should make sure that in Tahoe-LAFS v1.6
> directories are created mutable by default, just like in Tahoe-LAFS
> v1.5. This is easy because the way that directories were created in
> Tahoe-LAFS v1.5 was to create a new empty mutable directory, and
> there is no point in creating a new empty immutable directory. :-)
Indeed, immutable directories are an entirely new creature, with very
different semantics, so there are no instances of writeable directory
creation that would be appropriate to replace with immutable
directories.
(by way of foreshadowing, note that "tahoe backup" has always created
non-writeable directories).
> It would probably be wise to make the "tahoe backup" command continue
> to create mutable directories by default, because there might be
> people who use "tahoe backup" to create directories which they then
> share with their friends. I know that immutable directories in
> "tahoe backup" are sweet (#606), and I want people to use them, but
> for Tahoe-LAFS v1.6 the way to get people to use them should be to
> announce them in the NEWS and explain them in the docs and tell
> people to turn them on in their configuration.
Hm, I disagree. I appreciate your point in general, but in my mind the
tradeoffs in this particular case are weighted more heavily towards the
new-feature side:
* "tahoe backup" is dramatically faster with immutable directories (no
RSA key generation per directory), possibly an order of magnitude for
lots of directories and small files. I expect that many users will
finally start to saturate their upstream links instead of their CPU.
* the directories it creates are repairable, unlike the read-only
mutable directories that "tahoe backup" used to create. Big win.
* The only directory that has DIR2-CHK objects placed into it is the
top-level target of the "tahoe backup" command, i.e. the "BAR" when
you run "tahoe backup ~/SOURCE alias:BAR", and that directory is
specified to be owned by the backup command: nothing else goes in
there. This limits the impact on v1.4 clients (which throw an
exception when they encounter unknown objects).
* I don't believe that "tahoe backup" is the tool of choice for sharing
files: when sharing, I always create a new (mutable) directory, add
some files to it, then share the readcap. So I think it's highly
unlikely that anyone will be inconvenienced in either direction
(being unable to read backups their friend has uploaded, or being
afraid to upgrade because their friends won't be able to read their
backups).
So yeah, if we were talking about changing "tahoe mkdir" to create new
caps, then I'd agree with a conservative approach and add a config knob
to slowly introduce the new feature over time. But for "tahoe backup", I
think the benefits outweigh the downsides.
> I would like people to get used to the idea that they can always
> upgrade to the new version of Tahoe-LAFS and continue sharing with
> their more conservative friends without having to even think about
> these issues.
Ok, but then how is it possible to ever introduce new features? (please
note, I'm not disagreeing with you, I'm just uncertain how to accommodate
both this noble provide-for-the-users goal and the general forward march
of progress)
A good concrete example would be Elk Point: assuming we ever manage to
implement it, it will fall squarely into this compatibility concern. As
a format with basically the same semantics as our current formats, it's
plausible that Tahoe can eventually create Elk Point files by default
(i.e. it's not an entirely new class of file, like immutable dirnodes).
To provide for the I-always-upgrade-but-my-friends-might-not use case,
we'd add a tahoe.cfg knob that says whether to create new files or old
files, and have it default to "old", while adding code to *read* both
types (the usual "be liberal in what you accept, conservative in what
you send" approach).
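As a rough sketch of such a knob (the section name "client", the option name "new-file-format", and the "old"/"new" labels are all made up for illustration and are not real tahoe.cfg settings), the writer side could consult the config and default to the old format, while the reader side accepts both:

```python
import configparser

# Hypothetical format labels; a real node would name actual cap formats.
KNOWN_FORMATS = {"old", "new"}


def format_to_write(cfg_text):
    """Return the format to use when creating files (conservative in what we send)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Assumed section/option names; a missing section or option falls back to "old".
    value = parser.get("client", "new-file-format", fallback="old")
    if value not in KNOWN_FORMATS:
        raise ValueError("unrecognized format setting: %r" % (value,))
    return value


def can_read(fmt):
    """Readers accept every known format (liberal in what we accept)."""
    return fmt in KNOWN_FORMATS
```

With an empty config this returns "old", so a user who never touches the knob keeps producing files that older clients can read.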
But then what would prompt a user to turn that knob "on"? I guess a user
who actually reads the NEWS file might see the knob's announcement, read
enthusiastically about the tangible benefits, enumerate who they share
files with, talk to them all and agree on an upgrade schedule, and then
all excitedly flip the switch at the same time. Or they don't bother
with NEWS, never touch the knob, and keep generating the old format
forever.
And what would prompt us to change the default to "on"? Maybe just the
passage of time. We establish some sort of policy about how many past
versions we're going to support (perhaps in the form of "if you're
running something older than X, we'll ask you to upgrade before
responding to your questions"). Then we don't turn the default to "on"
until all of the versions that we're willing to "support" are capable of
reading the new format. This could mean that some snazzy new features
might be implemented and shipped but go effectively unused for a year or
more, depending upon the release schedule and our overall patience with
old versions.
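One way to picture that policy, purely as an illustration (the function names and the dotted-version comparison are assumptions, not anything the project has agreed on): the default flips only once the oldest release we still support is at least as new as the first release able to read the new format.

```python
def parse_version(text):
    """Turn a dotted version such as "1.10" into a comparable tuple (1, 10)."""
    return tuple(int(part) for part in text.split("."))


def default_write_format(oldest_supported, first_reader_version):
    """Return "new" only if every supported release can read the new format.

    oldest_supported: oldest release the project still supports (hypothetical input).
    first_reader_version: first release that could read the new format.
    """
    if parse_version(oldest_supported) >= parse_version(first_reader_version):
        return "new"
    return "old"
```

For example, if reading support first shipped in 1.7 and we still support 1.5, the default stays "old"; once support for everything before 1.7 is dropped, it becomes "new".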
Hm. Not sure what the best answer is. It clearly depends upon how many
users we have, and how they're distributed along the reads-NEWS-or-not
casual-to-power-user spectrum.
> This is also why "forward-compatibility" features are so important to
> me. They can sometimes ease these transitions.
Yes! I'm very glad you pushed for #708 (tolerate unknown nodes), because
having it implemented in v1.5 made DIR2-CHK much easier to deploy in
v1.6, and makes me feel more confident about things like Elk Point.
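To illustrate the difference this makes, here is a minimal sketch (not Tahoe's actual code; the function names and the placeholder cap strings are hypothetical) contrasting a strict client, which raises on an unrecognized cap type the way v1.4 does, with a tolerant one that simply reports the child as unknown:

```python
# Cap prefixes an older client might recognize; DIR2-CHK is deliberately absent.
KNOWN_PREFIXES = {
    "URI:CHK:": "immutable file",
    "URI:DIR2:": "mutable directory",
    "URI:DIR2-RO:": "read-only mutable directory",
}


def classify(cap, strict=False):
    """Return a description of the cap, or "unknown" if the type is unrecognized."""
    for prefix, kind in KNOWN_PREFIXES.items():
        if cap.startswith(prefix):
            return kind
    if strict:
        # Strict behaviour: one unknown child makes the whole listing fail.
        raise ValueError("unknown cap type: %r" % (cap,))
    return "unknown"


def list_children(children, strict=False):
    """Map child names to descriptions; children is a dict of name -> cap string."""
    return {name: classify(cap, strict) for name, cap in children.items()}
```

In tolerant mode a directory containing one unfamiliar child still lists its other children normally, which is what lets a new object type be introduced without breaking existing readers.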
cheers,
-Brian