[tahoe-dev] [tahoe-lafs] #683: So-called URIs aren't
tahoe-lafs
trac at allmydata.org
Sun Apr 19 11:02:39 PDT 2009
#683: So-called URIs aren't
---------------------+------------------------------------------------------
Reporter: kpreid | Owner: nobody
Type: defect | Status: new
Priority: major | Milestone: undecided
Component: unknown | Version: 1.3.0
Keywords: | Launchpad_bug:
---------------------+------------------------------------------------------
Tahoe has things it calls URIs which identify files. For example:
URI:CHK:twpnflhnjeubo2tluuglxrbvdu:oan4set42mwkwxonqmq4xlull6ggnl2f2zggjmp6fgji7uv7py2a:3:10:34295
However, they are not URIs as that term is defined by RFC 3986. In
particular, URIs have the syntax <scheme>:<scheme-specific-part>, where
the possible values for <scheme> are registered with IANA under
procedures set by the IETF:
http://www.iana.org/assignments/uri-schemes.html
Since Tahoe "URIs" do have the properties a URI should, I believe the
appropriate fix for this is to register a {{{tahoe:}}} URI scheme. As far
as I know, the "URI:" part of a Tahoe URI is always the same, so it
conveys no information and can be replaced with this for only a two-
character addition:
tahoe:CHK:twpnflhnjeubo2tluuglxrbvdu:oan4set42mwkwxonqmq4xlull6ggnl2f2zggjmp6fgji7uv7py2a:3:10:34295
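As a rough illustration of the prefix swap described above, here is a
minimal Python sketch. The helper names (to_tahoe_uri, to_legacy_cap) are
hypothetical and are not part of Tahoe's code base; the {{{tahoe:}}}
scheme is only the one proposed in this ticket and is not registered.

```python
import re

# Scheme grammar from RFC 3986: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")

LEGACY_PREFIX = "URI:"       # constant prefix on existing Tahoe strings
PROPOSED_SCHEME = "tahoe"    # assumption: the scheme proposed in this ticket


def to_tahoe_uri(cap):
    """Rewrite a legacy 'URI:...' string as 'tahoe:...' (hypothetical helper)."""
    if not cap.startswith(LEGACY_PREFIX):
        raise ValueError("not a legacy Tahoe string: %r" % (cap,))
    return PROPOSED_SCHEME + ":" + cap[len(LEGACY_PREFIX):]


def to_legacy_cap(uri):
    """Inverse of to_tahoe_uri: turn 'tahoe:...' back into 'URI:...'."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != PROPOSED_SCHEME:
        raise ValueError("not a tahoe: URI: %r" % (uri,))
    return LEGACY_PREFIX + rest


cap = ("URI:CHK:twpnflhnjeubo2tluuglxrbvdu:"
       "oan4set42mwkwxonqmq4xlull6ggnl2f2zggjmp6fgji7uv7py2a:3:10:34295")
uri = to_tahoe_uri(cap)
print(uri)
print(len(uri) - len(cap))  # difference in length between the two forms
```

The conversion is lossless in both directions, since the "URI:" prefix
carries no information.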
--- The remainder of this text is not a matter of correctness but
additional functionality ---
Furthermore, so that these URIs are also URLs (readily usable for
contacting the resource with no local context), I would recommend
including in the syntax of the scheme-specific part a provision for an
OPTIONAL location hint for the grid, i.e. some host that can be contacted
by some protocol that can put the client in communication with appropriate
storage servers. This is essentially the same provision as in CapTP URIs;
borrowing their syntax, it would be like:
tahoe://example.net:1234,192.168.33.91:1234/CHK:twpnflhnjeubo2tluuglxrbvdu:oan4set42mwkwxonqmq4xlull6ggnl2f2zggjmp6fgji7uv7py2a:3:10:34295
That is: {{{tahoe://}}}, then a comma-separated list of hosts, then
{{{/}}}, then the current Tahoe-URI components (without the leading
"URI:").
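To make the proposed syntax concrete, here is a minimal Python sketch of a
parser for it. Everything in it is an assumption made for illustration:
the names TahoeLocator and parse_tahoe_uri are hypothetical, each hint is
assumed to be host:port with a mandatory numeric port, and IPv6 literals
are not handled. It accepts both the hint-less form and the form with
location hints.

```python
from typing import List, NamedTuple, Tuple


class TahoeLocator(NamedTuple):
    hints: List[Tuple[str, int]]   # optional (host, port) location hints
    cap: str                       # e.g. "CHK:...:3:10:34295"


def parse_tahoe_uri(uri):
    """Parse 'tahoe:CAP' or 'tahoe://host:port,host:port/CAP' (sketch only)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "tahoe":
        raise ValueError("not a tahoe: URI: %r" % (uri,))
    if not rest.startswith("//"):
        # No location hints: everything after the scheme is the capability.
        return TahoeLocator([], rest)
    authority, slash, cap = rest[2:].partition("/")
    if not slash or not cap:
        raise ValueError("missing capability after host list: %r" % (uri,))
    hints = []
    for item in authority.split(","):
        host, colon, port = item.rpartition(":")
        if not colon or not host or not port.isdigit():
            raise ValueError("bad location hint: %r" % (item,))
        hints.append((host, int(port)))
    return TahoeLocator(hints, cap)


loc = parse_tahoe_uri(
    "tahoe://example.net:1234,192.168.33.91:1234/"
    "CHK:twpnflhnjeubo2tluuglxrbvdu:"
    "oan4set42mwkwxonqmq4xlull6ggnl2f2zggjmp6fgji7uv7py2a:3:10:34295")
print(loc.hints)
print(loc.cap)
```

Because the hints are optional, a client that already knows its grid can
ignore them and use only the capability part.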
----
Besides correctness in terminology, another advantage of having registered
Tahoe URI syntax is that Tahoe files can participate as first-class
entities in URI-based systems, and vice-versa.
For example, if Tahoe directories could store arbitrary URIs, of which
Tahoe URIs were a special case, then they could include references not
just to other things in the same Tahoe grid, but any other URLs-as-
capabilities system, including other Tahoe grids or Waterken servers or
...whatever. You could use the {{{data:}}} URI scheme to store
sufficiently small files directly in directories. (I vaguely recall that
Tahoe might already have that capability.)
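As a sketch of the idea, a directory that maps child names to arbitrary
URIs could hold a small file inline as a {{{data:}}} URI (RFC 2397). The
dictionary below is a plain Python stand-in, not Tahoe's actual directory
format, and the helper name to_data_uri is hypothetical.

```python
import base64


def to_data_uri(content, mediatype="application/octet-stream"):
    """Encode bytes as a base64 data: URI (RFC 2397)."""
    payload = base64.b64encode(content).decode("ascii")
    return "data:%s;base64,%s" % (mediatype, payload)


# Hypothetical directory: child name -> URI of any scheme.
directory = {
    "big-file": ("tahoe:CHK:twpnflhnjeubo2tluuglxrbvdu:"
                 "oan4set42mwkwxonqmq4xlull6ggnl2f2zggjmp6fgji7uv7py2a"
                 ":3:10:34295"),
    "note.txt": to_data_uri(b"hello", "text/plain"),
}

for name, uri in sorted(directory.items()):
    scheme = uri.partition(":")[0]
    print(name, scheme)
```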
If there is a registered Tahoe scheme, then systems which work exclusively
with URLs, but are extensible to handle additional URL schemes, can be
extended to support Tahoe, rather than necessarily going through a Tahoe
web gateway, thus providing useful information (e.g. 'this is immutable'),
perhaps more efficient downloading, etc.
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/683>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid