[tahoe-dev] [tahoe-lafs] #628: "mtime" and "ctime": I don't think that word means what you think it means.
tahoe-lafs
trac at allmydata.org
Tue Apr 7 19:42:26 PDT 2009
#628: "mtime" and "ctime": I don't think that word means what you think it means.
---------------------------+------------------------------------------------
Reporter: zooko | Owner: zooko
Type: defect | Status: assigned
Priority: major | Milestone: 1.4.0
Component: code-dirnodes | Version: 1.3.0
Keywords: | Launchpad_bug:
---------------------------+------------------------------------------------
Comment(by zooko):
Okay, so I don't want people building on top of Tahoe and starting to rely
on the current timestamp semantics, and I don't want to keep populating
Tahoe filesystems with "ctime" values for which we will never be able to
tell whether they were creation times or unix ctimes. Therefore, I've
prepared a patch which attempts to clarify things. Unfortunately this
patch grew and grew as I worked on it. It explicitly reverts some
behavior that Brian deliberately put into "tahoe backup", I don't
understand how reverting that behavior will affect "tahoe backup", and it
causes the "tahoe backup" unit tests to fail in ways I don't understand.
However, I still wonder if some subset of this patch (such as the part
that populates the metadata dict with {{{lcrtime}}} and {{{lmotime}}},
without the part that ''uses'' those values) might be appropriate for
Tahoe-1.4.0, so please read this.
Here's what I've got:
1. To the "metadata" dict associated with each link, add "link creation
time" -- spelled {{{lcrtime}}} -- and "link modify time" -- spelled
{{{lmotime}}}. Link creation time is set whenever a link is added to the
directory when there was not previously a link by that name. Link modify
time is set whenever an extant link is overwritten by a new link (even if
it points to the same file). These are the semantics that Tahoe 1.3.0
assigned to {{{ctime}}} and {{{mtime}}} respectively.
2. Leave {{{ctime}}} and {{{mtime}}} unchanged for now for backwards
compatibility.
3. Since this expands the size of a directory entry, raise the size of
the initial read chunk from 2000 bytes to 4000 bytes, so that we still get
the first few directory entries in the initial read.
4. If you are overwriting a link, and the old link does not have a
{{{lcrtime}}} but does have a {{{ctime}}}, then copy its {{{ctime}}} into
{{{lcrtime}}}.
5. In {{{tahoe ls}}}, look for {{{lcrtime}}} first and, if that is not
present, then look for {{{ctime}}}; likewise look for {{{lmotime}}} first
and, if that is not present, then look for {{{mtime}}}. Note that
{{{tahoe ls}}} is then going to present those values in the format in
which GNU ls users expect to find unix ctime and mtime. Note further that
GNU ls users
apparently think that unix ctime means something that it doesn't mean in
unix, but does mean in Tahoe. I don't know how to untangle this.
6. Likewise in directory listings in the WUI, if {{{lcrtime}}} exists
then list it (as "lcr:"), else list {{{ctime}}} (as "c:"), and if
{{{lmotime}}} exists then list it (as "lmo:"), else list {{{mtime}}} (as
"m:").
7. In {{{tahoe backup}}}, don't set any of these bits of metadata --
{{{ctime}}}, {{{mtime}}}, {{{lcrtime}}}, {{{lmotime}}}. Within the
context of Tahoe, all of these are properties of the links in the Tahoe
filesystem, not of the files in a local filesystem, and it doesn't make
sense to copy them. Probably what we ought to do is copy the timestamps
such as {{{mtime}}} out of the local filesystem into a *different* field
of the metadata, perhaps named {{{local_mtime}}} or something, so as not
to confuse it with the Tahoe {{{lmotime}}} (née {{{mtime}}}), which is
something different. If we are going to use the local filesystem's
timestamps for any reason, then if the platform is Windows set
{{{local_crtime=s.st_ctime}}}, else set {{{local_ctime=s.st_ctime}}}.
This means that on Windows the {{{local_ctime}}} field doesn't get set, and
on non-Windows platforms the {{{local_crtime}}} field doesn't get set.
Anyway, the attached patch just removes the code which reads timestamps
from the local filesystem. It also changes the behavior of dirnodes to
always update the timestamps even if the caller explicitly passed in
timestamps in its metadata. Basically, the {{{ctime}}}, {{{mtime}}},
{{{lcrtime}}}, and {{{lmotime}}} are now considered to be special and to
be controlled only by source:src/allmydata/dirnode.py and not by external
callers. (Perhaps they should be moved to a different namespace -- there
should be a "caller-usable metadata" and a "tahoe filesystem metadata".)
This is the part that explicitly reverses Brian's design and that breaks
things in ways that I don't understand.
8. Open http://bugs.python.org/issue5720 (ctime: I don't think that word
means what you think it means.) to track my proposal to disambiguate and
extend the Python stat API.
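As an illustration only (this is not the attached patch, and it does not
use the real dirnode API), items 1, 4, 5, 6 and 7 could be sketched roughly
as follows. The function names are hypothetical, {{{local_mtime}}} is only
the name floated in item 7, and the sketch reads item 1 literally: a
brand-new link gets {{{lcrtime}}} only, which is an assumption. The legacy
{{{ctime}}} and {{{mtime}}} keys are simply carried over; how the real code
keeps maintaining them (item 2) is not modelled.

```python
import sys
import time


def update_link_metadata(old_metadata, now=None):
    """Return new metadata for a link that is being added or overwritten.

    old_metadata is the metadata dict of the link being replaced, or None
    when there was previously no link by that name. (Hypothetical helper.)
    """
    if now is None:
        now = time.time()
    if old_metadata is None:
        # Item 1: a link is added where no link by that name existed.
        return {"lcrtime": now}
    metadata = dict(old_metadata)  # do not mutate the caller's dict
    if "lcrtime" not in metadata and "ctime" in metadata:
        # Item 4: an old-style link donates its ctime as the creation time.
        metadata["lcrtime"] = metadata["ctime"]
    # Item 1: overwriting an extant link always sets the link modify time.
    metadata["lmotime"] = now
    return metadata


def listing_fields(metadata):
    """Items 5 and 6: prefer the new keys, fall back to the legacy ones."""
    fields = []
    for new_key, new_label, old_key, old_label in (
        ("lcrtime", "lcr:", "ctime", "c:"),
        ("lmotime", "lmo:", "mtime", "m:"),
    ):
        if new_key in metadata:
            fields.append((new_label, metadata[new_key]))
        elif old_key in metadata:
            fields.append((old_label, metadata[old_key]))
    return fields


def local_timestamp_fields(s, platform=sys.platform):
    """Item 7: copy local timestamps into separately named fields.

    s is an os.stat() result (or anything with st_mtime and st_ctime).
    """
    fields = {"local_mtime": s.st_mtime}
    if platform == "win32":
        # On Windows, st_ctime is the creation time.
        fields["local_crtime"] = s.st_ctime
    else:
        # Elsewhere, st_ctime is the unix inode-change time.
        fields["local_ctime"] = s.st_ctime
    return fields
```

For example, {{{listing_fields(update_link_metadata({"ctime": 5.0}))}}}
would list the old {{{ctime}}} under "lcr:" and the current time under
"lmo:", while a link that has only the legacy keys would still be listed
with "c:" and "m:".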
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/628#comment:7>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid