[tahoe-dev] [tahoe-lafs] #731: what to do with filenames that are illegal on some systems
tahoe-lafs
trac at allmydata.org
Sun Jun 14 09:49:38 PDT 2009
#731: what to do with filenames that are illegal on some systems
-----------------------------------+----------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: major | Milestone: 1.5.0
Component: code-dirnodes | Version: 1.4.1
Keywords: forward-compatibility | Launchpad_bug:
-----------------------------------+----------------------------------------
Comment(by swillden):
Replying to [comment:3 bewst]:
> It seems to me that tahoe probably has enough flexibility to store
> ''any'' filename, and many people will only be using it to store and
> retrieve files to/from the same system, so it should "just work" for
> that use case.
This is my thought as well, at least for backup use cases. Tahoe in
general has a broader usage model, and so solutions appropriate for backup
may not be adequate for those other use cases, but for backups, I think
the top priority is ensuring that backups succeed reliably and don't lose
any data -- including file name data.
That's why the approach I've chosen for GridBackup (which, BTW, is finally
starting to write to a grid, Yay!) is to make sure that:
1. ALL names can be backed up, regardless of whether or not they make any
sense on any filesystem in existence.
2. When restoring to a system that uses the same encoding as the backup
source, all names are restored byte-for-byte identically to what was read
from the file system during backup.
3. When restoring to a system that uses a different encoding, I try to
transcode the names and error out if that fails. Eventually I plan to
give the user a list of the paths that failed and let them decide what to
name each one, with suggestions based on attempts to decode the name with
all Python-supported codecs.
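The three rules above can be sketched roughly as follows. This is only an illustration in modern Python 3, not GridBackup's actual code: the names `restore_name`, `suggest_names` and `NameRestoreError` are hypothetical, and it assumes the backup stored each name as raw bytes together with the source system's encoding.

```python
import codecs
import encodings.aliases


class NameRestoreError(Exception):
    """Hypothetical error: a stored name cannot be transcoded for the target."""


def restore_name(raw, source_encoding, target_encoding):
    """Return the bytes to use as the file name on the restore target."""
    # Rule 2: same encoding on both sides, so hand back the stored bytes
    # untouched. codecs.lookup() normalizes aliases such as "latin-1" and
    # "ISO-8859-1" to one canonical codec name.
    if codecs.lookup(source_encoding).name == codecs.lookup(target_encoding).name:
        return raw
    # Rule 3: different encoding, so try to transcode and error out on failure.
    try:
        return raw.decode(source_encoding).encode(target_encoding)
    except (UnicodeDecodeError, UnicodeEncodeError) as exc:
        raise NameRestoreError("cannot transcode %r" % (raw,)) from exc


def suggest_names(raw):
    """Map each distinct decoding of raw to the codecs that produce it."""
    suggestions = {}
    for codec_name in sorted(set(encodings.aliases.aliases.values())):
        try:
            text = raw.decode(codec_name)
        except (UnicodeError, LookupError):
            # Either the bytes are invalid in this codec, or the codec is
            # not a text encoding or is unavailable on this platform.
            continue
        suggestions.setdefault(text, []).append(codec_name)
    return suggestions
```

Rule 1 is not shown because it amounts to storing `raw` without ever decoding it during backup. A restore tool could catch `NameRestoreError`, collect the failing paths, and present the keys of `suggest_names(raw)` as candidate names for the user to pick from.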
During a restore, there's room for human intervention to address naming
problems, but during backup, I just want to get the data. I'm taking a
similar approach to other metadata. Extended attributes, ACLs, resource
forks, even POSIX permissions -- there are destination systems to which
none of these things will make sense, but that's okay. The backup will
grab everything and we can deal with how to make use of the data, if
possible, during restore.
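As a rough illustration of the "grab everything at backup time" idea, here is a minimal Python 3 sketch. It is an assumption of how such capture might look, not GridBackup's implementation: `capture_metadata` and the record layout are hypothetical, and it covers only POSIX permissions, ownership, timestamps and (where the platform exposes them through `os.listxattr`) extended attributes. ACLs and resource forks would need platform-specific code that is not shown.

```python
import os
import stat


def capture_metadata(path):
    """Collect whatever metadata the source system offers for path."""
    st = os.lstat(path)  # lstat: describe a symlink itself, not its target
    record = {
        "mode": stat.S_IMODE(st.st_mode),  # POSIX permission bits
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mtime_ns": st.st_mtime_ns,
        "xattrs": {},
    }
    # Extended attributes are only exposed by the os module on some
    # platforms (e.g. Linux), so probe for the function first.
    if hasattr(os, "listxattr"):
        try:
            for name in os.listxattr(path, follow_symlinks=False):
                record["xattrs"][name] = os.getxattr(
                    path, name, follow_symlinks=False
                )
        except OSError:
            # The filesystem may not support xattrs; keep what we have.
            pass
    return record
```

The record is stored as-is during backup. Deciding which fields are meaningful on the destination, and what to do with the rest, is left to restore time.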
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/731#comment:4>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid