.. -*- coding: utf-8-with-signature -*-

==============================
Specification Document Outline
==============================

While we do not yet have a clear set of specification documents for Tahoe
(explaining the file formats, so that others can write interoperable
implementations), this document is intended to lay out an outline for what
these specs ought to contain. Think of this as the ISO 7-Layer Model for
Tahoe.

We currently imagine 4 documents.

1. `#1: Share Format, Encoding Algorithm`_
2. `#2: Share Exchange Protocol`_
3. `#3: Server Selection Algorithm, filecap format`_
4. `#4: Directory Format`_

#1: Share Format, Encoding Algorithm
====================================

This document will describe the way that files are encrypted and encoded into
shares. It will include a specification of the share format, and explain both
the encoding and decoding algorithms. It will cover both mutable and
immutable files.

The immutable encoding algorithm, as described by this document, will start
with a plaintext series of bytes, encoding parameters "k" and "N", and either
an encryption key or a mechanism for deterministically deriving the key from
the plaintext (the CHK specification). The algorithm will end with a set of N
shares, and a set of values that must be included in the filecap to provide
confidentiality (the encryption key) and integrity (the UEB hash).

The immutable decoding algorithm will start with the filecap values (key and
UEB hash) and "k" shares. It will explain how to validate the shares against
the integrity information, how to reverse the erasure-coding, and how to
decrypt the resulting ciphertext. It will result in the original plaintext
bytes (or some subrange thereof).
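
As a concrete illustration of the data flow described above, here is a
minimal Python sketch of the immutable encode/decode round trip. It reuses
the same building blocks as Tahoe (AES-CTR via the ``cryptography`` package,
Reed-Solomon erasure coding via ``zfec``), but the hash tags, key
derivation, and share layout are simplified stand-ins, not the real share
format that this document will specify::

    import hashlib
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from zfec import easyfec

    def encode_immutable(plaintext, k, n):
        # CHK-style convergent key, derived deterministically from the
        # plaintext. (Tahoe uses a tagged SHA-256d hash; this tag is a
        # placeholder.)
        key = hashlib.sha256(b"chk-key:" + plaintext).digest()[:16]
        enc = Cipher(algorithms.AES(key), modes.CTR(b"\x00" * 16)).encryptor()
        ciphertext = enc.update(plaintext) + enc.finalize()
        # Erasure-code the ciphertext into N shares, any k of which suffice.
        shares = easyfec.Encoder(k, n).encode(ciphertext)
        # Stand-in for the UEB hash: a single integrity value carried in
        # the filecap. (The real UEB anchors per-share hash trees so that
        # each share can be validated individually.)
        ueb_hash = hashlib.sha256(b"".join(shares)).digest()
        return key, ueb_hash, shares

    def decode_immutable(key, ueb_hash, numbered_shares, k, n, ciphertext_len):
        # numbered_shares: any k of the N (sharenum, share_bytes) pairs.
        # A real decoder first validates each share against the integrity
        # information; that step is omitted in this sketch.
        sharenums = [num for (num, _) in numbered_shares]
        blocks = [blk for (_, blk) in numbered_shares]
        blocksize = (ciphertext_len + k - 1) // k
        padlen = blocksize * k - ciphertext_len
        ciphertext = easyfec.Decoder(k, n).decode(blocks, sharenums, padlen)
        dec = Cipher(algorithms.AES(key), modes.CTR(b"\x00" * 16)).decryptor()
        return dec.update(ciphertext) + dec.finalize()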

The sections on mutable files will contain similar information.

This document is *not* responsible for explaining the filecap format, since
full filecaps may need to contain additional information as described in
document #3. Likewise it is not responsible for explaining where to put the
generated shares or where to find them again later.

It is also not responsible for explaining the access control mechanisms
surrounding share upload, download, or modification ("Accounting" is the
business of controlling share upload to conserve space, and mutable file
shares require some sort of access control to prevent non-writecap holders
from destroying shares). We don't yet have a document dedicated to explaining
these, but let's call it "Access Control" for now.


#2: Share Exchange Protocol
===========================

This document explains the wire protocol used to upload, download, and modify
shares on the various storage servers.

Given the N shares created by the algorithm described in document #1, and a
set of servers who are willing to accept those shares, the protocols in this
document will be sufficient to get the shares onto the servers. Likewise,
given a set of servers who hold at least k shares, these protocols will be
enough to retrieve the shares necessary to begin the decoding process
described in document #1. The notion of a "storage index" is used to
reference a particular share: the storage index is generated by the encoding
process described in document #1.

This document does *not* describe how to identify or choose those servers;
rather, it explains what to do once they have been selected (by the
mechanisms in document #3).

This document also explains the protocols that a client uses to ask a server
whether or not it is willing to accept an uploaded share, and whether it has
a share available for download. These protocols will be used by the
mechanisms in document #3 to help decide where the shares should be placed.
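
Taken together, the upload, download, and query operations described above
suggest a small per-server interface. The Python sketch below is
hypothetical: the method names are loosely modeled on Tahoe's Foolscap
``RIStorageServer`` interface, but the signatures are illustrative, not the
actual wire protocol::

    from abc import ABC, abstractmethod

    class StorageServerProtocol(ABC):
        """Hypothetical client-side view of a single storage server."""

        @abstractmethod
        def allocate_buckets(self, storage_index, sharenums, allocated_size):
            """Ask whether the server will accept these shares; returns
            the subset of sharenums it is willing to hold. Document #3
            uses this answer to decide where shares should be placed."""

        @abstractmethod
        def put_share(self, storage_index, sharenum, data):
            """Upload one share under the given storage index."""

        @abstractmethod
        def get_share_numbers(self, storage_index):
            """Ask which shares (if any) the server holds for this index."""

        @abstractmethod
        def get_share(self, storage_index, sharenum, offset, length):
            """Download a byte range of one share."""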

Where cryptographic mechanisms are necessary to implement access-control
policy, this document will explain those mechanisms.

In the future, Tahoe will be able to use multiple protocols to speak to
storage servers. There will be alternative forms of this document, one for
each protocol. The first one to be written will describe the Foolscap-based
protocol that Tahoe currently uses, but we anticipate a subsequent one to
describe a more HTTP-based protocol.

#3: Server Selection Algorithm, filecap format
==============================================

This document has two interrelated purposes. With a deeper understanding of
the issues, we may be able to separate these more cleanly in the future.

The first purpose is to explain the server selection algorithm. Given a set
of N shares, where should those shares be uploaded? Given some information
stored about a previously-uploaded file, how should a downloader locate and
recover at least k shares? Given a previously-uploaded mutable file, how
should a modifier locate all (or most of) the shares with a reasonable amount
of work?

This question implies many things, all of which should be explained in this
document:

* the notion of a "grid", nominally a set of servers who could potentially
  hold shares, which might change over time
* a way to configure which grid should be used
* a way to discover which servers are a part of that grid
* a way to decide which servers are reliable enough to be worth sending
  shares to
* an algorithm to handle servers which refuse shares
* a way for a downloader to locate which servers have shares
* a way to choose which shares should be used for download

The server-selection algorithm has several obviously competing goals:

* minimize the amount of work that must be done during upload
* minimize the total storage resources used
* avoid "hot spots": balance load among multiple servers
* maximize the chance that enough shares will be downloadable later, by
  uploading lots of shares, and by placing them on reliable servers
* minimize the work that the future downloader must do
* tolerate temporary server failures, permanent server departure, and new
  server insertions
* minimize the amount of information that must be added to the filecap

The server-selection algorithm is defined in some context: some set of
expectations about the servers or grid with which it is expected to operate.
Different algorithms are appropriate for different situations, so there will
be multiple alternative versions of this document.

The first version of this document will describe the algorithm that the
current (1.3.0) release uses, which is heavily weighted towards the two main
use case scenarios for which Tahoe has been designed: the small, stable
friendnet, and the allmydata.com managed grid. In both cases, we assume that
the storage servers are online most of the time, that they are uniformly
highly reliable, and that the set of servers does not change very rapidly.
The server-selection algorithm for this environment uses a permuted server
list to achieve load-balancing, uses all servers identically, and derives the
permutation key from the storage index to avoid adding a new field to the
filecap.
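
The permuted-list idea is compact enough to sketch here. Assuming each
server is known by an opaque server id, both the uploader and any later
downloader can reconstruct the identical per-file ordering from nothing but
the storage index (the exact hash construction in Tahoe is tagged and
slightly different; this shows the shape of the idea)::

    import hashlib

    def permuted_servers(storage_index, server_ids):
        # Rank every server by a hash of (storage_index, server_id). Each
        # file therefore sees its own pseudo-random ordering of the same
        # server list, which balances load across files without adding a
        # new field to the filecap.
        return sorted(server_ids,
                      key=lambda sid: hashlib.sha256(storage_index + sid).digest())

The uploader walks this list, offering shares to each server in turn; the
downloader recomputes the same list and queries servers in the same order,
so shares are usually found near the front of the list.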

An alternative algorithm could give clients more precise control over share
placement, for example for a user who wishes to make sure that k+1 shares are
located in each datacenter (to allow downloads to take place using only local
bandwidth). This algorithm could skip the permuted list and use other
mechanisms to accomplish load-balancing (or ignore the issue altogether). It
could add additional information to the filecap (like a list of which servers
received the shares) in lieu of performing a search at download time, perhaps
at the expense of the repairer's ability to move shares to a new server after
the initial upload. It might make up for this by storing "location hints"
next to each share, to indicate where other shares are likely to be found,
and obligating the repairer to update these hints.

The second purpose of this document is to explain the format of the file
capability string (or "filecap" for short). There are multiple kinds of
capabilities (read-write, read-only, verify-only, repaircap, lease-renewal
cap, traverse-only, etc). There are multiple ways to represent the filecap
(compressed binary, human-readable, clickable-HTTP-URL, "tahoe:" URL, etc),
but they must all contain enough information to reliably retrieve a file
(given some context, of course). It must at least contain the confidentiality
and integrity information from document #1 (i.e. the encryption key and the
UEB hash). It must also contain whatever additional information the
upload-time server-selection algorithm generated that will be required by the
downloader.

For some server-selection algorithms, the additional information will be
minimal. For example, the 1.3.0 release uses the hash of the encryption key
as a storage index, uses the storage index to permute the server list, and
uses an Introducer to learn the current list of servers. This allows a
"close-enough" list of servers to be compressed into a filecap field that is
already required anyway (the encryption key). It also adds k and N to the
filecap, to speed up the downloader's search (the downloader knows how many
shares it needs, so it can send out multiple queries in parallel).
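
To make that concrete: a 1.3.0-era immutable readcap has the general shape
``URI:CHK:<key>:<UEB hash>:<k>:<N>:<size>``. The sketch below shows that
shape and the key-to-storage-index derivation; the field layout follows the
real format, but the hash tag is a hypothetical placeholder::

    import hashlib
    from base64 import b32encode

    def base32(data):
        # Tahoe writes cap fields as lower-case, unpadded base32.
        return b32encode(data).decode("ascii").rstrip("=").lower()

    def chk_filecap(key, ueb_hash, k, n, size):
        return "URI:CHK:%s:%s:%d:%d:%d" % (
            base32(key), base32(ueb_hash), k, n, size)

    def storage_index_from_key(key):
        # The storage index is a hash of the (already required) encryption
        # key, truncated to 128 bits. (Tahoe uses a tagged SHA-256d here.)
        return hashlib.sha256(b"storage-index:" + key).digest()[:16]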

But other server-selection algorithms might require more information. Each
variant of this document will explain how to encode that additional
information into the filecap, and how to extract and use that information at
download time.

These two purposes are interrelated. A filecap that is interpreted in the
context of the allmydata.com commercial grid, which uses tahoe-1.3.0, implies
a specific peer-selection algorithm, a specific Introducer, and therefore a
fairly specific set of servers to query for shares. A filecap which is meant
to be interpreted on a different sort of grid would need different
information.

Some filecap formats can be designed to contain more information (and depend
less upon context), in the same way that an HTTP URL implies the existence of
a single global DNS system. Ideally a Tahoe filecap should be able to specify
which "grid" it lives in, with enough information to allow a compatible
implementation of Tahoe to locate that grid and retrieve the file (regardless
of which server-selection algorithm was used for upload).

This more-universal format might come at the expense of reliability, however.
Tahoe-1.3.0 filecaps do not contain hostnames, because the failure of DNS or
an individual host might then impact file availability (the Introducer,
however, does contain DNS names or IP addresses).

#4: Directory Format
====================

Tahoe directories are a special way of interpreting and managing the contents
of a file (either mutable or immutable). These "dirnode" files are basically
serialized tables that map child name to filecap/dircap. This document
describes the format of these files.

Tahoe-1.3.0 directories are "transitively readonly", which is accomplished by
applying an additional layer of encryption to the list of child writecaps.
The key for this encryption is derived from the containing file's writecap.
This document must explain how to derive this key and apply it to the
appropriate portion of the table.
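
A hedged sketch of that derivation (the real format uses tagged SHA-256d
hashes and a per-entry salt, so the tag string and layout here are
placeholders; AES-CTR comes from the ``cryptography`` package)::

    import hashlib
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def encrypt_child_writecap(container_writecap, entry_salt, child_writecap):
        # The key is derived from the *container's* writecap, so a holder
        # of only the readcap can list children but cannot recover their
        # writecaps: the directory is "transitively readonly".
        key = hashlib.sha256(
            b"rwcap-key:" + entry_salt + container_writecap).digest()[:16]
        enc = Cipher(algorithms.AES(key), modes.CTR(b"\x00" * 16)).encryptor()
        return enc.update(child_writecap) + enc.finalize()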

Future versions of the directory format are expected to contain
"deep-traversal caps", which allow verification/repair of files without
exposing their plaintext to the repair agent. This document will be
responsible for explaining traversal caps too.

Future versions of the directory format will probably contain an index and
more advanced data structures (for efficiency and fast lookups), instead of a
simple flat list of (childname, childcap). This document will also need to
describe metadata formats, including what access-control policies are defined
for the metadata.
---|