Ticket #393: 393status14.dpatch

File 393status14.dpatch, 359.7 KB (added by kevan at 2010-07-02T23:38:26Z)
Line 
1Thu Jun 24 16:46:37 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
2  * Misc. changes to support the work I'm doing
3 
4      - Add a notion of file version number to interfaces.py
5      - Alter mutable file node interfaces to have a notion of version,
6        though this may be changed later.
7      - Alter mutable/filenode.py to conform to these changes.
8      - Add a salt hasher to util/hashutil.py
9
10Thu Jun 24 16:48:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
11  * nodemaker.py: create MDMF files when asked to
12
13Thu Jun 24 16:49:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
14  * storage/server.py: minor code cleanup
15
16Thu Jun 24 16:49:24 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
17  * test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
18
19Fri Jun 25 17:35:20 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
20  * test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, change users of corrupt to use the new definition
21
22Sat Jun 26 16:41:18 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
23  * Alter the ServermapUpdater to find MDMF files
24 
25  The ServermapUpdater should find MDMF files on a grid in the same way
26  that it finds SDMF files. This patch makes it do that.
27
28Sat Jun 26 16:42:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
29  * Make a segmented mutable uploader
30 
31  The mutable file uploader should be able to publish files with one
32  segment and files with multiple segments. This patch makes it do that.
33  This is still incomplete, and rather ugly -- I need to flesh out error
34  handling, I need to write tests, and I need to remove some of the uglier
35  kludges in the process before I can call this done.
36
37Sat Jun 26 16:43:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
38  * Write a segmented mutable downloader
39 
40  The segmented mutable downloader can deal with MDMF files (files with
41  one or more segments in MDMF format) and SDMF files (files with one
42  segment in SDMF format). It is backwards compatible with the old
43  file format.
44 
45  This patch also contains tests for the segmented mutable downloader.
46
47Mon Jun 28 15:50:48 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
48  * mutable/checker.py: check MDMF files
49 
50  This patch adapts the mutable file checker and verifier to check and
51  verify MDMF files. It does this by using the new segmented downloader,
52  which is trained to perform verification operations on request. This
53  removes some code duplication.
54
55Mon Jun 28 15:52:01 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
56  * mutable/retrieve.py: learn how to verify mutable files
57
58Wed Jun 30 11:33:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
59  * interfaces.py: add IMutableSlotWriter
60
61Thu Jul  1 16:28:06 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
62  * test/test_mutable.py: temporarily disable two tests that are now irrelevant
63
64Fri Jul  2 15:55:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
65  * Add MDMF reader and writer, and SDMF writer
66 
67  The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
68  object proxies that exist for immutable files. They abstract away
69  details of connection, state, and caching from their callers (in this
70  case, the download, servermap updater, and uploader), and expose methods
71  to get and set information on the remote server.
72 
73  MDMFSlotReadProxy reads a mutable file from the server, doing the right
74  thing (in most cases) regardless of whether the file is MDMF or SDMF. It
75  allows callers to tell it how to batch and flush reads.
76 
77  MDMFSlotWriteProxy writes an MDMF mutable file to a server.
78 
79  SDMFSlotWriteProxy writes an SDMF mutable file to a server.
80 
81  This patch also includes tests for MDMFSlotReadProxy,
82  SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
83
84Fri Jul  2 15:55:54 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
85  * mutable/publish.py: cleanup + simplification
86
87Fri Jul  2 15:57:10 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
88  * test/test_mutable.py: remove tests that are no longer relevant
89
90New patches:
91
92[Misc. changes to support the work I'm doing
93Kevan Carstensen <kevan@isnotajoke.com>**20100624234637
94 Ignore-this: fdd18fa8cc05f4b4b15ff53ee24a1819
95 
96     - Add a notion of file version number to interfaces.py
97     - Alter mutable file node interfaces to have a notion of version,
98       though this may be changed later.
99     - Alter mutable/filenode.py to conform to these changes.
100     - Add a salt hasher to util/hashutil.py
101] {
102hunk ./src/allmydata/interfaces.py 7
103      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
104 
105 HASH_SIZE=32
106+SALT_SIZE=16
107+
108+SDMF_VERSION=0
109+MDMF_VERSION=1
110 
111 Hash = StringConstraint(maxLength=HASH_SIZE,
112                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
113hunk ./src/allmydata/interfaces.py 811
114         writer-visible data using this writekey.
115         """
116 
117+    def set_version(version):
118+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
119+        we upload in SDMF for reasons of compatibility. If you want to
120+        change this, set_version will let you do that.
121+
122+        To say that this file should be uploaded in SDMF, pass in a 0. To
123+        say that the file should be uploaded as MDMF, pass in a 1.
124+        """
125+
126+    def get_version():
127+        """Returns the mutable file protocol version."""
128+
129 class NotEnoughSharesError(Exception):
130     """Download was unable to get enough shares"""
131 
132hunk ./src/allmydata/mutable/filenode.py 8
133 from twisted.internet import defer, reactor
134 from foolscap.api import eventually
135 from allmydata.interfaces import IMutableFileNode, \
136-     ICheckable, ICheckResults, NotEnoughSharesError
137+     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
138 from allmydata.util import hashutil, log
139 from allmydata.util.assertutil import precondition
140 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
141hunk ./src/allmydata/mutable/filenode.py 67
142         self._sharemap = {} # known shares, shnum-to-[nodeids]
143         self._cache = ResponseCache()
144         self._most_recent_size = None
145+        # filled in after __init__ if we're being created for the first time;
146+        # filled in by the servermap updater before publishing, otherwise.
147+        # set to this default value in case neither of those things happen,
148+        # or in case the servermap can't find any shares to tell us what
149+        # to publish as.
150+        # TODO: Set this back to None, and find out why the tests fail
151+        #       with it set to None.
152+        self._protocol_version = SDMF_VERSION
153 
154         # all users of this MutableFileNode go through the serializer. This
155         # takes advantage of the fact that Deferreds discard the callbacks
156hunk ./src/allmydata/mutable/filenode.py 472
157     def _did_upload(self, res, size):
158         self._most_recent_size = size
159         return res
160+
161+
162+    def set_version(self, version):
163+        # I can be set in two ways:
164+        #  1. When the node is created.
165+        #  2. (for an existing share) when the Servermap is updated
166+        #     before I am read.
167+        assert version in (MDMF_VERSION, SDMF_VERSION)
168+        self._protocol_version = version
169+
170+
171+    def get_version(self):
172+        return self._protocol_version
173hunk ./src/allmydata/util/hashutil.py 90
174 MUTABLE_READKEY_TAG = "allmydata_mutable_writekey_to_readkey_v1"
175 MUTABLE_DATAKEY_TAG = "allmydata_mutable_readkey_to_datakey_v1"
176 MUTABLE_STORAGEINDEX_TAG = "allmydata_mutable_readkey_to_storage_index_v1"
177+MUTABLE_SALT_TAG = "allmydata_mutable_segment_salt_v1"
178 
179 # dirnodes
180 DIRNODE_CHILD_WRITECAP_TAG = "allmydata_mutable_writekey_and_salt_to_dirnode_child_capkey_v1"
181hunk ./src/allmydata/util/hashutil.py 134
182 def plaintext_segment_hasher():
183     return tagged_hasher(PLAINTEXT_SEGMENT_TAG)
184 
185+def mutable_salt_hash(data):
186+    return tagged_hash(MUTABLE_SALT_TAG, data)
187+def mutable_salt_hasher():
188+    return tagged_hasher(MUTABLE_SALT_TAG)
189+
190 KEYLEN = 16
191 IVLEN = 16
192 
193}
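
A quick illustration of the version plumbing added above: the two new constants act as a tiny enum, and the node simply records which protocol it should publish with. The class below is a stand-in for MutableFileNode (only the version-related behaviour from this patch is reproduced), so this is a sketch rather than the real implementation.

    SDMF_VERSION = 0   # single-segment mutable format, the compatible default
    MDMF_VERSION = 1   # multi-segment mutable format

    class ExampleMutableNode(object):
        # stand-in for MutableFileNode: only the version plumbing is shown
        def __init__(self):
            # defaults to SDMF until the creator or the servermap updater
            # says otherwise, mirroring the filenode change above
            self._protocol_version = SDMF_VERSION

        def set_version(self, version):
            assert version in (SDMF_VERSION, MDMF_VERSION)
            self._protocol_version = version

        def get_version(self):
            return self._protocol_version

    node = ExampleMutableNode()
    node.set_version(MDMF_VERSION)
    print(node.get_version())   # 1
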
194[nodemaker.py: create MDMF files when asked to
195Kevan Carstensen <kevan@isnotajoke.com>**20100624234833
196 Ignore-this: 26c16aaca9ddab7a7ce37a4530bc970
197] {
198hunk ./src/allmydata/nodemaker.py 3
199 import weakref
200 from zope.interface import implements
201-from allmydata.interfaces import INodeMaker
202+from allmydata.util.assertutil import precondition
203+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
204+                                 SDMF_VERSION, MDMF_VERSION
205 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
206 from allmydata.immutable.upload import Data
207 from allmydata.mutable.filenode import MutableFileNode
208hunk ./src/allmydata/nodemaker.py 92
209             return self._create_dirnode(filenode)
210         return None
211 
212-    def create_mutable_file(self, contents=None, keysize=None):
213+    def create_mutable_file(self, contents=None, keysize=None,
214+                            version=SDMF_VERSION):
215         n = MutableFileNode(self.storage_broker, self.secret_holder,
216                             self.default_encoding_parameters, self.history)
217hunk ./src/allmydata/nodemaker.py 96
218+        n.set_version(version)
219         d = self.key_generator.generate(keysize)
220         d.addCallback(n.create_with_keys, contents)
221         d.addCallback(lambda res: n)
222hunk ./src/allmydata/nodemaker.py 102
223         return d
224 
225-    def create_new_mutable_directory(self, initial_children={}):
226+    def create_new_mutable_directory(self, initial_children={},
227+                                     version=SDMF_VERSION):
228+        # initial_children must have metadata (i.e. {} instead of None)
229+        for (name, (node, metadata)) in initial_children.iteritems():
230+            precondition(isinstance(metadata, dict),
231+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
232+            node.raise_error()
233         d = self.create_mutable_file(lambda n:
234hunk ./src/allmydata/nodemaker.py 110
235-                                     pack_children(n, initial_children))
236+                                     pack_children(n, initial_children),
237+                                     version)
238         d.addCallback(self._create_dirnode)
239         return d
240 
241}
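
With the new version keyword, a caller can opt into MDMF when a file or directory is created; omitting the argument keeps the old SDMF behaviour. The snippet below is a hypothetical caller sketch (it assumes an already-configured NodeMaker instance, which this patch does not construct) and uses only the names the hunks above introduce; each call returns a Deferred that fires with the new node.

    from allmydata.interfaces import SDMF_VERSION, MDMF_VERSION

    def create_examples(nodemaker, small_data, large_data):
        # default: still SDMF, exactly as before this patch
        d1 = nodemaker.create_mutable_file(small_data)
        # explicitly request the multi-segment format
        d2 = nodemaker.create_mutable_file(large_data, version=MDMF_VERSION)
        # mutable directories can be created the same way
        d3 = nodemaker.create_new_mutable_directory({}, version=MDMF_VERSION)
        return d1, d2, d3
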
242[storage/server.py: minor code cleanup
243Kevan Carstensen <kevan@isnotajoke.com>**20100624234905
244 Ignore-this: 2358c531c39e48d3c8e56b62b5768228
245] {
246hunk ./src/allmydata/storage/server.py 569
247                                          self)
248         return share
249 
250-    def remote_slot_readv(self, storage_index, shares, readv):
251+    def remote_slot_readv(self, storage_index, shares, readvs):
252         start = time.time()
253         self.count("readv")
254         si_s = si_b2a(storage_index)
255hunk ./src/allmydata/storage/server.py 590
256             if sharenum in shares or not shares:
257                 filename = os.path.join(bucketdir, sharenum_s)
258                 msf = MutableShareFile(filename, self)
259-                datavs[sharenum] = msf.readv(readv)
260+                datavs[sharenum] = msf.readv(readvs)
261         log.msg("returning shares %s" % (datavs.keys(),),
262                 facility="tahoe.storage", level=log.NOISY, parent=lp)
263         self.add_latency("readv", time.time() - start)
264}
265[test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
266Kevan Carstensen <kevan@isnotajoke.com>**20100624234924
267 Ignore-this: afb86ec1fbdbfe1a5ef6f46f350273c0
268] {
269hunk ./src/allmydata/test/test_mutable.py 151
270             chr(ord(original[byte_offset]) ^ 0x01) +
271             original[byte_offset+1:])
272 
273+def add_two(original, byte_offset):
274+    # It isn't enough to simply flip the bit for the version number,
275+    # because 1 is a valid version number. So we add two instead.
276+    return (original[:byte_offset] +
277+            chr(ord(original[byte_offset]) ^ 0x02) +
278+            original[byte_offset+1:])
279+
280 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
281     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
282     # list of shnums to corrupt.
283hunk ./src/allmydata/test/test_mutable.py 187
284                 real_offset = offset1
285             real_offset = int(real_offset) + offset2 + offset_offset
286             assert isinstance(real_offset, int), offset
287-            shares[shnum] = flip_bit(data, real_offset)
288+            if offset1 == 0: # verbyte
289+                f = add_two
290+            else:
291+                f = flip_bit
292+            shares[shnum] = f(data, real_offset)
293     return res
294 
295 def make_storagebroker(s=None, num_peers=10):
296hunk ./src/allmydata/test/test_mutable.py 423
297         d.addCallback(_created)
298         return d
299 
300+
301     def test_modify_backoffer(self):
302         def _modifier(old_contents, servermap, first_time):
303             return old_contents + "line2"
304hunk ./src/allmydata/test/test_mutable.py 658
305         d.addCallback(_created)
306         return d
307 
308+
309     def _copy_shares(self, ignored, index):
310         shares = self._storage._peers
311         # we need a deep copy
312}
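
One note on the add_two helper introduced above: despite the name, it XORs the version byte with 0x02, which for the two legal version values (0 and 1) is equivalent to adding two, and unlike flipping the low bit it always produces an invalid version number. A worked check of that reasoning:

    for verbyte in (0, 1):              # SDMF_VERSION, MDMF_VERSION
        flipped = verbyte ^ 0x01        # 0 -> 1, 1 -> 0: both still valid
        bumped  = verbyte ^ 0x02        # 0 -> 2, 1 -> 3: never valid
        print("%d %d %d" % (verbyte, flipped, bumped))
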
313[test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, change users of corrupt to use the new definition
314Kevan Carstensen <kevan@isnotajoke.com>**20100626003520
315 Ignore-this: 836e59e2fde0535f6b4bea3468dc8244
316] {
317hunk ./src/allmydata/test/test_mutable.py 168
318                 and shnum not in shnums_to_corrupt):
319                 continue
320             data = shares[shnum]
321-            (version,
322-             seqnum,
323-             root_hash,
324-             IV,
325-             k, N, segsize, datalen,
326-             o) = unpack_header(data)
327-            if isinstance(offset, tuple):
328-                offset1, offset2 = offset
329-            else:
330-                offset1 = offset
331-                offset2 = 0
332-            if offset1 == "pubkey":
333-                real_offset = 107
334-            elif offset1 in o:
335-                real_offset = o[offset1]
336-            else:
337-                real_offset = offset1
338-            real_offset = int(real_offset) + offset2 + offset_offset
339-            assert isinstance(real_offset, int), offset
340-            if offset1 == 0: # verbyte
341-                f = add_two
342-            else:
343-                f = flip_bit
344-            shares[shnum] = f(data, real_offset)
345-    return res
346+            # We're feeding the reader all of the share data, so it
347+            # won't need to use the rref that we didn't provide, nor the
348+            # storage index that we didn't provide. We do this because
349+            # the reader will work for both MDMF and SDMF.
350+            reader = MDMFSlotReadProxy(None, None, shnum, data)
351+            # We need to get the offsets for the next part.
352+            d = reader.get_verinfo()
353+            def _do_corruption(verinfo, data, shnum):
354+                (seqnum,
355+                 root_hash,
356+                 IV,
357+                 segsize,
358+                 datalen,
359+                 k, n, prefix, o) = verinfo
360+                if isinstance(offset, tuple):
361+                    offset1, offset2 = offset
362+                else:
363+                    offset1 = offset
364+                    offset2 = 0
365+                if offset1 == "pubkey":
366+                    real_offset = 107
367+                elif offset1 in o:
368+                    real_offset = o[offset1]
369+                else:
370+                    real_offset = offset1
371+                real_offset = int(real_offset) + offset2 + offset_offset
372+                assert isinstance(real_offset, int), offset
373+                if offset1 == 0: # verbyte
374+                    f = add_two
375+                else:
376+                    f = flip_bit
377+                shares[shnum] = f(data, real_offset)
378+            d.addCallback(_do_corruption, data, shnum)
379+            ds.append(d)
380+    dl = defer.DeferredList(ds)
381+    dl.addCallback(lambda ignored: res)
382+    return dl
383 
384 def make_storagebroker(s=None, num_peers=10):
385     if not s:
386hunk ./src/allmydata/test/test_mutable.py 1177
387         return d
388 
389     def test_download_fails(self):
390-        corrupt(None, self._storage, "signature")
391-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
392+        d = corrupt(None, self._storage, "signature")
393+        d.addCallback(lambda ignored:
394+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
395                             "no recoverable versions",
396                             self._fn.download_best_version)
397         return d
398hunk ./src/allmydata/test/test_mutable.py 1232
399         return d
400 
401     def test_check_all_bad_sig(self):
402-        corrupt(None, self._storage, 1) # bad sig
403-        d = self._fn.check(Monitor())
404+        d = corrupt(None, self._storage, 1) # bad sig
405+        d.addCallback(lambda ignored:
406+            self._fn.check(Monitor()))
407         d.addCallback(self.check_bad, "test_check_all_bad_sig")
408         return d
409 
410hunk ./src/allmydata/test/test_mutable.py 1239
411     def test_check_all_bad_blocks(self):
412-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
413+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
414         # the Checker won't notice this.. it doesn't look at actual data
415hunk ./src/allmydata/test/test_mutable.py 1241
416-        d = self._fn.check(Monitor())
417+        d.addCallback(lambda ignored:
418+            self._fn.check(Monitor()))
419         d.addCallback(self.check_good, "test_check_all_bad_blocks")
420         return d
421 
422hunk ./src/allmydata/test/test_mutable.py 1252
423         return d
424 
425     def test_verify_all_bad_sig(self):
426-        corrupt(None, self._storage, 1) # bad sig
427-        d = self._fn.check(Monitor(), verify=True)
428+        d = corrupt(None, self._storage, 1) # bad sig
429+        d.addCallback(lambda ignored:
430+            self._fn.check(Monitor(), verify=True))
431         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
432         return d
433 
434hunk ./src/allmydata/test/test_mutable.py 1259
435     def test_verify_one_bad_sig(self):
436-        corrupt(None, self._storage, 1, [9]) # bad sig
437-        d = self._fn.check(Monitor(), verify=True)
438+        d = corrupt(None, self._storage, 1, [9]) # bad sig
439+        d.addCallback(lambda ignored:
440+            self._fn.check(Monitor(), verify=True))
441         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
442         return d
443 
444hunk ./src/allmydata/test/test_mutable.py 1266
445     def test_verify_one_bad_block(self):
446-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
447+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
448         # the Verifier *will* notice this, since it examines every byte
449hunk ./src/allmydata/test/test_mutable.py 1268
450-        d = self._fn.check(Monitor(), verify=True)
451+        d.addCallback(lambda ignored:
452+            self._fn.check(Monitor(), verify=True))
453         d.addCallback(self.check_bad, "test_verify_one_bad_block")
454         d.addCallback(self.check_expected_failure,
455                       CorruptShareError, "block hash tree failure",
456hunk ./src/allmydata/test/test_mutable.py 1277
457         return d
458 
459     def test_verify_one_bad_sharehash(self):
460-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
461-        d = self._fn.check(Monitor(), verify=True)
462+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
463+        d.addCallback(lambda ignored:
464+            self._fn.check(Monitor(), verify=True))
465         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
466         d.addCallback(self.check_expected_failure,
467                       CorruptShareError, "corrupt hashes",
468hunk ./src/allmydata/test/test_mutable.py 1287
469         return d
470 
471     def test_verify_one_bad_encprivkey(self):
472-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
473-        d = self._fn.check(Monitor(), verify=True)
474+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
475+        d.addCallback(lambda ignored:
476+            self._fn.check(Monitor(), verify=True))
477         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
478         d.addCallback(self.check_expected_failure,
479                       CorruptShareError, "invalid privkey",
480hunk ./src/allmydata/test/test_mutable.py 1297
481         return d
482 
483     def test_verify_one_bad_encprivkey_uncheckable(self):
484-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
485+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
486         readonly_fn = self._fn.get_readonly()
487         # a read-only node has no way to validate the privkey
488hunk ./src/allmydata/test/test_mutable.py 1300
489-        d = readonly_fn.check(Monitor(), verify=True)
490+        d.addCallback(lambda ignored:
491+            readonly_fn.check(Monitor(), verify=True))
492         d.addCallback(self.check_good,
493                       "test_verify_one_bad_encprivkey_uncheckable")
494         return d
495}
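
The reason every test above changes shape is that corrupt() now has to ask an MDMFSlotReadProxy for the share's offset table before it can flip any bytes, so it returns a Deferred instead of finishing synchronously. A minimal sketch of the calling-convention change (corrupt and check below are stand-ins, not the real test helpers):

    from twisted.internet import defer

    def corrupt(res, storage, offset):
        # stand-in: the real helper reads offsets via the read proxy,
        # corrupts the stored shares, and then fires with res
        return defer.succeed(res)

    def check(ignored, label):
        return label + " checked"

    def run_test(storage):
        # old style (no longer correct): corrupt(None, storage, 1); check(...)
        # new style: chain the check behind the corruption
        d = corrupt(None, storage, 1)          # offset 1, i.e. a bad signature
        d.addCallback(check, "download")
        return d
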
496[Alter the ServermapUpdater to find MDMF files
497Kevan Carstensen <kevan@isnotajoke.com>**20100626234118
498 Ignore-this: 25f6278209c2983ba8f307cfe0fde0
499 
500 The ServermapUpdater should find MDMF files on a grid in the same way
501 that it finds SDMF files. This patch makes it do that.
502] {
503hunk ./src/allmydata/mutable/servermap.py 7
504 from itertools import count
505 from twisted.internet import defer
506 from twisted.python import failure
507-from foolscap.api import DeadReferenceError, RemoteException, eventually
508+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
509+                         fireEventually
510 from allmydata.util import base32, hashutil, idlib, log
511 from allmydata.storage.server import si_b2a
512 from allmydata.interfaces import IServermapUpdaterStatus
513hunk ./src/allmydata/mutable/servermap.py 17
514 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
515      DictOfSets, CorruptShareError, NeedMoreDataError
516 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
517-     SIGNED_PREFIX_LENGTH
518+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
519 
520 class UpdateStatus:
521     implements(IServermapUpdaterStatus)
522hunk ./src/allmydata/mutable/servermap.py 254
523         """Return a set of versionids, one for each version that is currently
524         recoverable."""
525         versionmap = self.make_versionmap()
526-
527         recoverable_versions = set()
528         for (verinfo, shares) in versionmap.items():
529             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
530hunk ./src/allmydata/mutable/servermap.py 366
531         self._servers_responded = set()
532 
533         # how much data should we read?
534+        # SDMF:
535         #  * if we only need the checkstring, then [0:75]
536         #  * if we need to validate the checkstring sig, then [543ish:799ish]
537         #  * if we need the verification key, then [107:436ish]
538hunk ./src/allmydata/mutable/servermap.py 374
539         #  * if we need the encrypted private key, we want [-1216ish:]
540         #   * but we can't read from negative offsets
541         #   * the offset table tells us the 'ish', also the positive offset
542-        # A future version of the SMDF slot format should consider using
543-        # fixed-size slots so we can retrieve less data. For now, we'll just
544-        # read 2000 bytes, which also happens to read enough actual data to
545-        # pre-fetch a 9-entry dirnode.
546+        # MDMF:
547+        #  * Checkstring? [0:72]
548+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
549+        #    the offset table will tell us for sure.
550+        #  * If we need the verification key, we have to consult the offset
551+        #    table as well.
552+        # At this point, we don't know which we are. Our filenode can
553+        # tell us, but it might be lying -- in some cases, we're
554+        # responsible for telling it which kind of file it is.
555         self._read_size = 4000
556         if mode == MODE_CHECK:
557             # we use unpack_prefix_and_signature, so we need 1k
558hunk ./src/allmydata/mutable/servermap.py 432
559         self._queries_completed = 0
560 
561         sb = self._storage_broker
562+        # All of the peers, permuted by the storage index, as usual.
563         full_peerlist = sb.get_servers_for_index(self._storage_index)
564         self.full_peerlist = full_peerlist # for use later, immutable
565         self.extra_peers = full_peerlist[:] # peers are removed as we use them
566hunk ./src/allmydata/mutable/servermap.py 439
567         self._good_peers = set() # peers who had some shares
568         self._empty_peers = set() # peers who don't have any shares
569         self._bad_peers = set() # peers to whom our queries failed
570+        self._readers = {} # peerid -> dict(sharewriters), filled in
571+                           # after responses come in.
572 
573         k = self._node.get_required_shares()
574hunk ./src/allmydata/mutable/servermap.py 443
575+        # For what cases can these conditions work?
576         if k is None:
577             # make a guess
578             k = 3
579hunk ./src/allmydata/mutable/servermap.py 456
580         self.num_peers_to_query = k + self.EPSILON
581 
582         if self.mode == MODE_CHECK:
583+            # We want to query all of the peers.
584             initial_peers_to_query = dict(full_peerlist)
585             must_query = set(initial_peers_to_query.keys())
586             self.extra_peers = []
587hunk ./src/allmydata/mutable/servermap.py 464
588             # we're planning to replace all the shares, so we want a good
589             # chance of finding them all. We will keep searching until we've
590             # seen epsilon that don't have a share.
591+            # We don't query all of the peers because that could take a while.
592             self.num_peers_to_query = N + self.EPSILON
593             initial_peers_to_query, must_query = self._build_initial_querylist()
594             self.required_num_empty_peers = self.EPSILON
595hunk ./src/allmydata/mutable/servermap.py 474
596             # might also avoid the round trip required to read the encrypted
597             # private key.
598 
599-        else:
600+        else: # MODE_READ, MODE_ANYTHING
601+            # 2k peers is good enough.
602             initial_peers_to_query, must_query = self._build_initial_querylist()
603 
604         # this is a set of peers that we are required to get responses from:
605hunk ./src/allmydata/mutable/servermap.py 490
606         # before we can consider ourselves finished, and self.extra_peers
607         # contains the overflow (peers that we should tap if we don't get
608         # enough responses)
609+        # I guess that self._must_query is a subset of
610+        # initial_peers_to_query?
611+        assert set(must_query).issubset(set(initial_peers_to_query))
612 
613         self._send_initial_requests(initial_peers_to_query)
614         self._status.timings["initial_queries"] = time.time() - self._started
615hunk ./src/allmydata/mutable/servermap.py 549
616         # errors that aren't handled by _query_failed (and errors caused by
617         # _query_failed) get logged, but we still want to check for doneness.
618         d.addErrback(log.err)
619-        d.addBoth(self._check_for_done)
620         d.addErrback(self._fatal_error)
621hunk ./src/allmydata/mutable/servermap.py 550
622+        d.addCallback(self._check_for_done)
623         return d
624 
625     def _do_read(self, ss, peerid, storage_index, shnums, readv):
626hunk ./src/allmydata/mutable/servermap.py 569
627         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
628         return d
629 
630+
631+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
632+        """
633+        I am called when a remote server returns a corrupt share in
634+        response to one of our queries. By corrupt, I mean a share
635+        without a valid signature. I then record the failure, notify the
636+        server of the corruption, and record the share as bad.
637+        """
638+        f = failure.Failure(e)
639+        self.log(format="bad share: %(f_value)s", f_value=str(f),
640+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
641+        # Notify the server that its share is corrupt.
642+        self.notify_server_corruption(peerid, shnum, str(e))
643+        # By flagging this as a bad peer, we won't count any of
644+        # the other shares on that peer as valid, though if we
645+        # happen to find a valid version string amongst those
646+        # shares, we'll keep track of it so that we don't need
647+        # to validate the signature on those again.
648+        self._bad_peers.add(peerid)
649+        self._last_failure = f
650+        # XXX: Use the reader for this?
651+        checkstring = data[:SIGNED_PREFIX_LENGTH]
652+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
653+        self._servermap.problems.append(f)
654+
655+
656+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
657+        """
658+        If one of my queries returns successfully (which means that we
659+        were able to and successfully did validate the signature), I
660+        cache the data that we initially fetched from the storage
661+        server. This will help reduce the number of roundtrips that need
662+        to occur when the file is downloaded, or when the file is
663+        updated.
664+        """
665+        self._node._add_to_cache(verinfo, shnum, 0, data, now)
666+
667+
668     def _got_results(self, datavs, peerid, readsize, stuff, started):
669         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
670                       peerid=idlib.shortnodeid_b2a(peerid),
671hunk ./src/allmydata/mutable/servermap.py 630
672         else:
673             self._empty_peers.add(peerid)
674 
675-        last_verinfo = None
676-        last_shnum = None
677+        ss, storage_index = stuff
678+        ds = []
679+
680         for shnum,datav in datavs.items():
681             data = datav[0]
682hunk ./src/allmydata/mutable/servermap.py 635
683-            try:
684-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
685-                last_verinfo = verinfo
686-                last_shnum = shnum
687-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
688-            except CorruptShareError, e:
689-                # log it and give the other shares a chance to be processed
690-                f = failure.Failure()
691-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
692-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
693-                self.notify_server_corruption(peerid, shnum, str(e))
694-                self._bad_peers.add(peerid)
695-                self._last_failure = f
696-                checkstring = data[:SIGNED_PREFIX_LENGTH]
697-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
698-                self._servermap.problems.append(f)
699-                pass
700-
701-        self._status.timings["cumulative_verify"] += (time.time() - now)
702+            reader = MDMFSlotReadProxy(ss,
703+                                       storage_index,
704+                                       shnum,
705+                                       data)
706+            self._readers.setdefault(peerid, dict())[shnum] = reader
707+            # our goal, with each response, is to validate the version
708+            # information and share data as best we can at this point --
709+            # we do this by validating the signature. To do this, we
710+            # need to do the following:
711+            #   - If we don't already have the public key, fetch the
712+            #     public key. We use this to validate the signature.
713+            if not self._node.get_pubkey():
714+                # fetch and set the public key.
715+                d = reader.get_verification_key()
716+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
717+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
718+                # XXX: Make self._pubkey_query_failed?
719+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
720+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
721+            else:
722+                # we already have the public key.
723+                d = defer.succeed(None)
724+            # Neither of these two branches return anything of
725+            # consequence, so the first entry in our deferredlist will
726+            # be None.
727 
728hunk ./src/allmydata/mutable/servermap.py 661
729-        if self._need_privkey and last_verinfo:
730-            # send them a request for the privkey. We send one request per
731-            # server.
732-            lp2 = self.log("sending privkey request",
733-                           parent=lp, level=log.NOISY)
734-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
735-             offsets_tuple) = last_verinfo
736-            o = dict(offsets_tuple)
737+            # - Next, we need the version information. We almost
738+            #   certainly got this by reading the first thousand or so
739+            #   bytes of the share on the storage server, so we
740+            #   shouldn't need to fetch anything at this step.
741+            d2 = reader.get_verinfo()
742+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
743+                self._got_corrupt_share(error, shnum, peerid, data, lp))
744+            # - Next, we need the signature. For an SDMF share, it is
745+            #   likely that we fetched this when doing our initial fetch
746+            #   to get the version information. In MDMF, this lives at
747+            #   the end of the share, so unless the file is quite small,
748+            #   we'll need to do a remote fetch to get it.
749+            d3 = reader.get_signature()
750+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
751+                self._got_corrupt_share(error, shnum, peerid, data, lp))
752+            #  Once we have all three of these responses, we can move on
753+            #  to validating the signature
754 
755hunk ./src/allmydata/mutable/servermap.py 679
756-            self._queries_outstanding.add(peerid)
757-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
758-            ss = self._servermap.connections[peerid]
759-            privkey_started = time.time()
760-            d = self._do_read(ss, peerid, self._storage_index,
761-                              [last_shnum], readv)
762-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
763-                          privkey_started, lp2)
764-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
765-            d.addErrback(log.err)
766-            d.addCallback(self._check_for_done)
767-            d.addErrback(self._fatal_error)
768+            # Does the node already have a privkey? If not, we'll try to
769+            # fetch it here.
770+            if self._need_privkey:
771+                d4 = reader.get_encprivkey()
772+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
773+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
774+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
775+                    self._privkey_query_failed(error, shnum, data, lp))
776+            else:
777+                d4 = defer.succeed(None)
778 
779hunk ./src/allmydata/mutable/servermap.py 690
780+            dl = defer.DeferredList([d, d2, d3, d4])
781+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
782+                self._got_signature_one_share(results, shnum, peerid, lp))
783+            dl.addErrback(lambda error, shnum=shnum, data=data:
784+               self._got_corrupt_share(error, shnum, peerid, data, lp))
785+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
786+                self._cache_good_sharedata(verinfo, shnum, now, data))
787+            ds.append(dl)
788+        # dl is a deferred list that will fire when all of the shares
789+        # that we found on this peer are done processing. When dl fires,
790+        # we know that processing is done, so we can decrement the
791+        # semaphore-like thing that we incremented earlier.
792+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
793+        # Are we done? Done means that there are no more queries to
794+        # send, that there are no outstanding queries, and that we
795+        # haven't received any queries that are still processing. If we
796+        # are done, self._check_for_done will cause the done deferred
797+        # that we returned to our caller to fire, which tells them that
798+        # they have a complete servermap, and that we won't be touching
799+        # the servermap anymore.
800+        dl.addCallback(self._check_for_done)
801+        dl.addErrback(self._fatal_error)
802         # all done!
803         self.log("_got_results done", parent=lp, level=log.NOISY)
804hunk ./src/allmydata/mutable/servermap.py 714
805+        return dl
806+
807+
808+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
809+        if self._node.get_pubkey():
810+            return # don't go through this again if we don't have to
811+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
812+        assert len(fingerprint) == 32
813+        if fingerprint != self._node.get_fingerprint():
814+            raise CorruptShareError(peerid, shnum,
815+                                "pubkey doesn't match fingerprint")
816+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
817+        assert self._node.get_pubkey()
818+
819 
820     def notify_server_corruption(self, peerid, shnum, reason):
821         ss = self._servermap.connections[peerid]
822hunk ./src/allmydata/mutable/servermap.py 734
823         ss.callRemoteOnly("advise_corrupt_share",
824                           "mutable", self._storage_index, shnum, reason)
825 
826-    def _got_results_one_share(self, shnum, data, peerid, lp):
827+
828+    def _got_signature_one_share(self, results, shnum, peerid, lp):
829+        # It is our job to give versioninfo to our caller. We need to
830+        # raise CorruptShareError if the share is corrupt for any
831+        # reason, something that our caller will handle.
832         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
833                  shnum=shnum,
834                  peerid=idlib.shortnodeid_b2a(peerid),
835hunk ./src/allmydata/mutable/servermap.py 744
836                  level=log.NOISY,
837                  parent=lp)
838-
839-        # this might raise NeedMoreDataError, if the pubkey and signature
840-        # live at some weird offset. That shouldn't happen, so I'm going to
841-        # treat it as a bad share.
842-        (seqnum, root_hash, IV, k, N, segsize, datalength,
843-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
844-
845-        if not self._node.get_pubkey():
846-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
847-            assert len(fingerprint) == 32
848-            if fingerprint != self._node.get_fingerprint():
849-                raise CorruptShareError(peerid, shnum,
850-                                        "pubkey doesn't match fingerprint")
851-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
852-
853-        if self._need_privkey:
854-            self._try_to_extract_privkey(data, peerid, shnum, lp)
855-
856-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
857-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
858+        _, verinfo, signature, __ = results
859+        (seqnum,
860+         root_hash,
861+         saltish,
862+         segsize,
863+         datalen,
864+         k,
865+         n,
866+         prefix,
867+         offsets) = verinfo[1]
868         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
869 
870hunk ./src/allmydata/mutable/servermap.py 756
871-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
872+        # XXX: This should be done for us in the method, so
873+        # presumably you can go in there and fix it.
874+        verinfo = (seqnum,
875+                   root_hash,
876+                   saltish,
877+                   segsize,
878+                   datalen,
879+                   k,
880+                   n,
881+                   prefix,
882                    offsets_tuple)
883hunk ./src/allmydata/mutable/servermap.py 767
884+        # This tuple uniquely identifies a share on the grid; we use it
885+        # to keep track of the ones that we've already seen.
886 
887         if verinfo not in self._valid_versions:
888hunk ./src/allmydata/mutable/servermap.py 771
889-            # it's a new pair. Verify the signature.
890-            valid = self._node.get_pubkey().verify(prefix, signature)
891+            # This is a new version tuple, and we need to validate it
892+            # against the public key before keeping track of it.
893+            assert self._node.get_pubkey()
894+            valid = self._node.get_pubkey().verify(prefix, signature[1])
895             if not valid:
896hunk ./src/allmydata/mutable/servermap.py 776
897-                raise CorruptShareError(peerid, shnum, "signature is invalid")
898+                raise CorruptShareError(peerid, shnum,
899+                                        "signature is invalid")
900 
901hunk ./src/allmydata/mutable/servermap.py 779
902-            # ok, it's a valid verinfo. Add it to the list of validated
903-            # versions.
904-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
905-                     % (seqnum, base32.b2a(root_hash)[:4],
906-                        idlib.shortnodeid_b2a(peerid), shnum,
907-                        k, N, segsize, datalength),
908-                     parent=lp)
909-            self._valid_versions.add(verinfo)
910-        # We now know that this is a valid candidate verinfo.
911+        # ok, it's a valid verinfo. Add it to the list of validated
912+        # versions.
913+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
914+                 % (seqnum, base32.b2a(root_hash)[:4],
915+                    idlib.shortnodeid_b2a(peerid), shnum,
916+                    k, n, segsize, datalen),
917+                    parent=lp)
918+        self._valid_versions.add(verinfo)
919+        # We now know that this is a valid candidate verinfo. Whether or
920+        # not this instance of it is valid is a matter for the next
921+        # statement; at this point, we just know that if we see this
922+        # version info again, that its signature checks out and that
923+        # we're okay to skip the signature-checking step.
924 
925hunk ./src/allmydata/mutable/servermap.py 793
926+        # (peerid, shnum) are bound in the method invocation.
927         if (peerid, shnum) in self._servermap.bad_shares:
928             # we've been told that the rest of the data in this share is
929             # unusable, so don't add it to the servermap.
930hunk ./src/allmydata/mutable/servermap.py 808
931         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
932         return verinfo
933 
934+
935     def _deserialize_pubkey(self, pubkey_s):
936         verifier = rsa.create_verifying_key_from_string(pubkey_s)
937         return verifier
938hunk ./src/allmydata/mutable/servermap.py 813
939 
940-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
941-        try:
942-            r = unpack_share(data)
943-        except NeedMoreDataError, e:
944-            # this share won't help us. oh well.
945-            offset = e.encprivkey_offset
946-            length = e.encprivkey_length
947-            self.log("shnum %d on peerid %s: share was too short (%dB) "
948-                     "to get the encprivkey; [%d:%d] ought to hold it" %
949-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
950-                      offset, offset+length),
951-                     parent=lp)
952-            # NOTE: if uncoordinated writes are taking place, someone might
953-            # change the share (and most probably move the encprivkey) before
954-            # we get a chance to do one of these reads and fetch it. This
955-            # will cause us to see a NotEnoughSharesError(unable to fetch
956-            # privkey) instead of an UncoordinatedWriteError . This is a
957-            # nuisance, but it will go away when we move to DSA-based mutable
958-            # files (since the privkey will be small enough to fit in the
959-            # write cap).
960-
961-            return
962-
963-        (seqnum, root_hash, IV, k, N, segsize, datalen,
964-         pubkey, signature, share_hash_chain, block_hash_tree,
965-         share_data, enc_privkey) = r
966-
967-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
968 
969     def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
970hunk ./src/allmydata/mutable/servermap.py 815
971-
972+        """
973+        Given a writekey from a remote server, I validate it against the
974+        writekey stored in my node. If it is valid, then I set the
975+        privkey and encprivkey properties of the node.
976+        """
977         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
978         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
979         if alleged_writekey != self._node.get_writekey():
980hunk ./src/allmydata/mutable/servermap.py 892
981         self._queries_completed += 1
982         self._last_failure = f
983 
984-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
985-        now = time.time()
986-        elapsed = now - started
987-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
988-        self._queries_outstanding.discard(peerid)
989-        if not self._need_privkey:
990-            return
991-        if shnum not in datavs:
992-            self.log("privkey wasn't there when we asked it",
993-                     level=log.WEIRD, umid="VA9uDQ")
994-            return
995-        datav = datavs[shnum]
996-        enc_privkey = datav[0]
997-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
998 
999     def _privkey_query_failed(self, f, peerid, shnum, lp):
1000         self._queries_outstanding.discard(peerid)
1001hunk ./src/allmydata/mutable/servermap.py 906
1002         self._servermap.problems.append(f)
1003         self._last_failure = f
1004 
1005+
1006     def _check_for_done(self, res):
1007         # exit paths:
1008         #  return self._send_more_queries(outstanding) : send some more queries
1009hunk ./src/allmydata/mutable/servermap.py 912
1010         #  return self._done() : all done
1011         #  return : keep waiting, no new queries
1012-
1013         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
1014                               "%(outstanding)d queries outstanding, "
1015                               "%(extra)d extra peers available, "
1016hunk ./src/allmydata/mutable/servermap.py 1117
1017         self._servermap.last_update_time = self._started
1018         # the servermap will not be touched after this
1019         self.log("servermap: %s" % self._servermap.summarize_versions())
1020+
1021         eventually(self._done_deferred.callback, self._servermap)
1022 
1023     def _fatal_error(self, f):
1024hunk ./src/allmydata/test/test_mutable.py 637
1025         d.addCallback(_created)
1026         return d
1027 
1028-    def publish_multiple(self):
1029+    def publish_mdmf(self):
1030+        # like publish_one, except that the result is guaranteed to be
1031+        # an MDMF file.
1032+        # self.CONTENTS should have more than one segment.
1033+        self.CONTENTS = "This is an MDMF file" * 100000
1034+        self._storage = FakeStorage()
1035+        self._nodemaker = make_nodemaker(self._storage)
1036+        self._storage_broker = self._nodemaker.storage_broker
1037+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
1038+        def _created(node):
1039+            self._fn = node
1040+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1041+        d.addCallback(_created)
1042+        return d
1043+
1044+
1045+    def publish_sdmf(self):
1046+        # like publish_one, except that the result is guaranteed to be
1047+        # an SDMF file
1048+        self.CONTENTS = "This is an SDMF file" * 1000
1049+        self._storage = FakeStorage()
1050+        self._nodemaker = make_nodemaker(self._storage)
1051+        self._storage_broker = self._nodemaker.storage_broker
1052+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
1053+        def _created(node):
1054+            self._fn = node
1055+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1056+        d.addCallback(_created)
1057+        return d
1058+
1059+
1060+    def publish_multiple(self, version=0):
1061         self.CONTENTS = ["Contents 0",
1062                          "Contents 1",
1063                          "Contents 2",
1064hunk ./src/allmydata/test/test_mutable.py 677
1065         self._copied_shares = {}
1066         self._storage = FakeStorage()
1067         self._nodemaker = make_nodemaker(self._storage)
1068-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
1069+        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
1070         def _created(node):
1071             self._fn = node
1072             # now create multiple versions of the same file, and accumulate
1073hunk ./src/allmydata/test/test_mutable.py 906
1074         return d
1075 
1076 
1077+    def test_servermapupdater_finds_mdmf_files(self):
1078+        # setUp already published an MDMF file for us. We just need to
1079+        # make sure that when we run the ServermapUpdater, the file is
1080+        # reported to have one recoverable version.
1081+        d = defer.succeed(None)
1082+        d.addCallback(lambda ignored:
1083+            self.publish_mdmf())
1084+        d.addCallback(lambda ignored:
1085+            self.make_servermap(mode=MODE_CHECK))
1086+        # Calling make_servermap also updates the servermap in the mode
1087+        # that we specify, so we just need to see what it says.
1088+        def _check_servermap(sm):
1089+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
1090+        d.addCallback(_check_servermap)
1091+        return d
1092+
1093+
1094+    def test_servermapupdater_finds_sdmf_files(self):
1095+        d = defer.succeed(None)
1096+        d.addCallback(lambda ignored:
1097+            self.publish_sdmf())
1098+        d.addCallback(lambda ignored:
1099+            self.make_servermap(mode=MODE_CHECK))
1100+        d.addCallback(lambda servermap:
1101+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
1102+        return d
1103+
1104 
1105 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
1106     def setUp(self):
1107hunk ./src/allmydata/test/test_mutable.py 1050
1108         return d
1109     test_no_servers_download.timeout = 15
1110 
1111+
1112     def _test_corrupt_all(self, offset, substring,
1113                           should_succeed=False, corrupt_early=True,
1114                           failure_checker=None):
1115}
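
To summarize the new control flow in _got_results: each share now gets its own read proxy, the public key, version info, signature, and (when needed) encrypted private key are fetched in parallel, and only after all of them arrive is the signature validated. The sketch below captures that shape with a fake reader standing in for MDMFSlotReadProxy; the getter names come from the patch, everything else is illustrative.

    from twisted.internet import defer

    class FakeReader(object):
        # stands in for MDMFSlotReadProxy; each getter fires immediately
        def get_verification_key(self): return defer.succeed("pubkey")
        def get_verinfo(self):          return defer.succeed("verinfo")
        def get_signature(self):        return defer.succeed("signature")
        def get_encprivkey(self):       return defer.succeed("encprivkey")

    def validate(results):
        # stand-in for _got_signature_one_share: each entry is (success, value)
        return [value for (success, value) in results]

    def process_share(reader, have_pubkey, need_privkey):
        d1 = defer.succeed(None) if have_pubkey else reader.get_verification_key()
        d2 = reader.get_verinfo()
        d3 = reader.get_signature()
        d4 = reader.get_encprivkey() if need_privkey else defer.succeed(None)
        dl = defer.DeferredList([d1, d2, d3, d4])
        dl.addCallback(validate)   # signature checking happens only here
        return dl

    d = process_share(FakeReader(), have_pubkey=False, need_privkey=True)
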
1116[Make a segmented mutable uploader
1117Kevan Carstensen <kevan@isnotajoke.com>**20100626234204
1118 Ignore-this: d199af8ab0bc64d8ed2bc19c5437bfba
1119 
1120 The mutable file uploader should be able to publish files with one
1121 segment and files with multiple segments. This patch makes it do that.
1122 This is still incomplete, and rather ugly -- I need to flesh out error
1123 handling, I need to write tests, and I need to remove some of the uglier
1124 kludges in the process before I can call this done.
1125] {
1126hunk ./src/allmydata/mutable/publish.py 8
1127 from zope.interface import implements
1128 from twisted.internet import defer
1129 from twisted.python import failure
1130-from allmydata.interfaces import IPublishStatus
1131+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
1132 from allmydata.util import base32, hashutil, mathutil, idlib, log
1133 from allmydata import hashtree, codec
1134 from allmydata.storage.server import si_b2a
1135hunk ./src/allmydata/mutable/publish.py 19
1136      UncoordinatedWriteError, NotEnoughServersError
1137 from allmydata.mutable.servermap import ServerMap
1138 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
1139-     unpack_checkstring, SIGNED_PREFIX
1140+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
1141+
1142+KiB = 1024
1143+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
1144 
1145 class PublishStatus:
1146     implements(IPublishStatus)
1147hunk ./src/allmydata/mutable/publish.py 112
1148         self._status.set_helper(False)
1149         self._status.set_progress(0.0)
1150         self._status.set_active(True)
1151+        # We use this to control how the file is written.
1152+        version = self._node.get_version()
1153+        assert version in (SDMF_VERSION, MDMF_VERSION)
1154+        self._version = version
1155 
1156     def get_status(self):
1157         return self._status
1158hunk ./src/allmydata/mutable/publish.py 134
1159         simultaneous write.
1160         """
1161 
1162-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
1163-        # 2: perform peer selection, get candidate servers
1164-        #  2a: send queries to n+epsilon servers, to determine current shares
1165-        #  2b: based upon responses, create target map
1166-        # 3: send slot_testv_and_readv_and_writev messages
1167-        # 4: as responses return, update share-dispatch table
1168-        # 4a: may need to run recovery algorithm
1169-        # 5: when enough responses are back, we're done
1170+        # 0. Setup encoding parameters, encoder, and other such things.
1171+        # 1. Encrypt, encode, and publish segments.
1172 
1173         self.log("starting publish, datalen is %s" % len(newdata))
1174         self._status.set_size(len(newdata))
1175hunk ./src/allmydata/mutable/publish.py 187
1176         self.bad_peers = set() # peerids who have errbacked/refused requests
1177 
1178         self.newdata = newdata
1179-        self.salt = os.urandom(16)
1180 
1181hunk ./src/allmydata/mutable/publish.py 188
1182+        # This will set self.segment_size, self.num_segments, and
1183+        # self.fec.
1184         self.setup_encoding_parameters()
1185 
1186         # if we experience any surprises (writes which were rejected because
1187hunk ./src/allmydata/mutable/publish.py 238
1188             self.bad_share_checkstrings[key] = old_checkstring
1189             self.connections[peerid] = self._servermap.connections[peerid]
1190 
1191-        # create the shares. We'll discard these as they are delivered. SDMF:
1192-        # we're allowed to hold everything in memory.
1193+        # Now the process diverges: if this is an MDMF file, we need
1194+        # to write an MDMF file. Otherwise, we need to write an SDMF
1195+        # file.
1196+        if self._version == MDMF_VERSION:
1197+            return self._publish_mdmf()
1198+        else:
1199+            return self._publish_sdmf()
1200+        #return self.done_deferred
1201+
1202+    def _publish_mdmf(self):
1203+        # Next, we find homes for all of the shares that we don't have
1204+        # homes for yet.
1205+        # TODO: Make this part do peer selection.
1206+        self.update_goal()
1207+        self.writers = {}
1208+        # For each (peerid, shnum) in self.goal, we make an
1209+        # MDMFSlotWriteProxy for that peer. We'll use this to write
1210+        # shares to the peer.
1211+        for key in self.goal:
1212+            peerid, shnum = key
1213+            write_enabler = self._node.get_write_enabler(peerid)
1214+            renew_secret = self._node.get_renewal_secret(peerid)
1215+            cancel_secret = self._node.get_cancel_secret(peerid)
1216+            secrets = (write_enabler, renew_secret, cancel_secret)
1217+
1218+            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
1219+                                                      self.connections[peerid],
1220+                                                      self._storage_index,
1221+                                                      secrets,
1222+                                                      self._new_seqnum,
1223+                                                      self.required_shares,
1224+                                                      self.total_shares,
1225+                                                      self.segment_size,
1226+                                                      len(self.newdata))
1227+            if (peerid, shnum) in self._servermap.servermap:
1228+                old_versionid, old_timestamp = self._servermap.servermap[key]
1229+                (old_seqnum, old_root_hash, old_salt, old_segsize,
1230+                 old_datalength, old_k, old_N, old_prefix,
1231+                 old_offsets_tuple) = old_versionid
1232+                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
1233+
1234+        # Now, we start pushing shares.
1235+        self._status.timings["setup"] = time.time() - self._started
1236+        def _start_pushing(res):
1237+            self._started_pushing = time.time()
1238+            return res
1239+
1240+        # First, we encrypt, encode, and publish the shares that we need
1241+        # to encrypt, encode, and publish.
1242+
1243+        # This will eventually hold the block hash chain for each share
1244+        # that we publish. We define it this way so that empty publishes
1245+        # will still have something to write to the remote slot.
1246+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
1247+        self.sharehash_leaves = None # eventually [sharehashes]
1248+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
1249+                              # validate the share]
1250 
1251hunk ./src/allmydata/mutable/publish.py 296
1252+        d = defer.succeed(None)
1253+        self.log("Starting push")
1254+        for i in xrange(self.num_segments - 1):
1255+            d.addCallback(lambda ignored, i=i:
1256+                self.push_segment(i))
1257+            d.addCallback(self._turn_barrier)
1258+        # If we have at least one segment, we will have a tail segment
1259+        if self.num_segments > 0:
1260+            d.addCallback(lambda ignored:
1261+                self.push_tail_segment())
1262+
1263+        d.addCallback(lambda ignored:
1264+            self.push_encprivkey())
1265+        d.addCallback(lambda ignored:
1266+            self.push_blockhashes())
1267+        d.addCallback(lambda ignored:
1268+            self.push_sharehashes())
1269+        d.addCallback(lambda ignored:
1270+            self.push_toplevel_hashes_and_signature())
1271+        d.addCallback(lambda ignored:
1272+            self.finish_publishing())
1273+        return d
1274+
1275+
1276+    def _publish_sdmf(self):
1277         self._status.timings["setup"] = time.time() - self._started
1278hunk ./src/allmydata/mutable/publish.py 322
1279+        self.salt = os.urandom(16)
1280+
1281         d = self._encrypt_and_encode()
1282         d.addCallback(self._generate_shares)
1283         def _start_pushing(res):
1284hunk ./src/allmydata/mutable/publish.py 335
1285 
1286         return self.done_deferred
1287 
1288+
1289     def setup_encoding_parameters(self):
1290hunk ./src/allmydata/mutable/publish.py 337
1291-        segment_size = len(self.newdata)
1292+        if self._version == MDMF_VERSION:
1293+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
1294+        else:
1295+            segment_size = len(self.newdata) # SDMF is only one segment
1296         # this must be a multiple of self.required_shares
1297         segment_size = mathutil.next_multiple(segment_size,
1298                                               self.required_shares)
1299hunk ./src/allmydata/mutable/publish.py 350
1300                                                   segment_size)
1301         else:
1302             self.num_segments = 0
1303-        assert self.num_segments in [0, 1,] # SDMF restrictions
1304+        if self._version == SDMF_VERSION:
1305+            assert self.num_segments in (0, 1) # SDMF
1306+            return
1307+        # calculate the tail segment size.
1308+        self.tail_segment_size = len(self.newdata) % segment_size
1309+
1310+        if self.tail_segment_size == 0:
1311+            # The tail segment is the same size as the other segments.
1312+            self.tail_segment_size = segment_size
1313+
1314+        # We'll make an encoder ahead of time for the normal-sized
1315+        # segments (that is, any segment of exactly segment_size bytes);
1316+        # the part of the code that pushes the tail segment will make
1317+        # its own encoder for that segment.
1318+        fec = codec.CRSEncoder()
1319+        fec.set_params(self.segment_size,
1320+                       self.required_shares, self.total_shares)
1321+        self.piece_size = fec.get_block_size()
1322+        self.fec = fec
1323+
1324+
1325+    def push_segment(self, segnum):
1326+        started = time.time()
1327+        segsize = self.segment_size
1328+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
1329+        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
1330+        assert len(data) == segsize
1331+
1332+        salt = os.urandom(16)
1333+
1334+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1335+        enc = AES(key)
1336+        crypttext = enc.process(data)
1337+        assert len(crypttext) == len(data)
1338+
1339+        now = time.time()
1340+        self._status.timings["encrypt"] = now - started
1341+        started = now
1342+
1343+        # now apply FEC
1344+
1345+        self._status.set_status("Encoding")
1346+        crypttext_pieces = [None] * self.required_shares
1347+        piece_size = self.piece_size
1348+        for i in range(len(crypttext_pieces)):
1349+            offset = i * piece_size
1350+            piece = crypttext[offset:offset+piece_size]
1351+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1352+            crypttext_pieces[i] = piece
1353+            assert len(piece) == piece_size
1354+        d = self.fec.encode(crypttext_pieces)
1355+        def _done_encoding(res):
1356+            elapsed = time.time() - started
1357+            self._status.timings["encode"] = elapsed
1358+            return res
1359+        d.addCallback(_done_encoding)
1360+
1361+        def _push_shares_and_salt(results):
1362+            shares, shareids = results
1363+            dl = []
1364+            for i in xrange(len(shares)):
1365+                sharedata = shares[i]
1366+                shareid = shareids[i]
1367+                block_hash = hashutil.block_hash(salt + sharedata)
1368+                self.blockhashes[shareid].append(block_hash)
1369+
1370+                # find the writer for this share
1371+                d = self.writers[shareid].put_block(sharedata, segnum, salt)
1372+                dl.append(d)
1373+            # TODO: Naturally, we need to check on the results of these.
1374+            return defer.DeferredList(dl)
1375+        d.addCallback(_push_shares_and_salt)
1376+        return d
1377+
1378+
1379+    def push_tail_segment(self):
1380+        # This is essentially the same as push_segment, except that we
1381+        # don't use the cached encoder that we use elsewhere.
1382+        self.log("Pushing tail segment")
1383+        started = time.time()
1384+        segsize = self.segment_size
1385+        data = self.newdata[segsize * (self.num_segments-1):]
1386+        assert len(data) == self.tail_segment_size
1387+        salt = os.urandom(16)
1388+
1389+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1390+        enc = AES(key)
1391+        crypttext = enc.process(data)
1392+        assert len(crypttext) == len(data)
1393+
1394+        now = time.time()
1395+        self._status.timings['encrypt'] = now - started
1396+        started = now
1397+
1398+        self._status.set_status("Encoding")
1399+        tail_fec = codec.CRSEncoder()
1400+        tail_fec.set_params(self.tail_segment_size,
1401+                            self.required_shares,
1402+                            self.total_shares)
1403+
1404+        crypttext_pieces = [None] * self.required_shares
1405+        piece_size = tail_fec.get_block_size()
1406+        for i in range(len(crypttext_pieces)):
1407+            offset = i * piece_size
1408+            piece = crypttext[offset:offset+piece_size]
1409+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1410+            crypttext_pieces[i] = piece
1411+            assert len(piece) == piece_size
1412+        d = tail_fec.encode(crypttext_pieces)
1413+        def _push_shares_and_salt(results):
1414+            shares, shareids = results
1415+            dl = []
1416+            for i in xrange(len(shares)):
1417+                sharedata = shares[i]
1418+                shareid = shareids[i]
1419+                block_hash = hashutil.block_hash(salt + sharedata)
1420+                self.blockhashes[shareid].append(block_hash)
1421+                # find the writer for this share
1422+                d = self.writers[shareid].put_block(sharedata,
1423+                                                    self.num_segments - 1,
1424+                                                    salt)
1425+                dl.append(d)
1426+            # TODO: Naturally, we need to check on the results of these.
1427+            return defer.DeferredList(dl)
1428+        d.addCallback(_push_shares_and_salt)
1429+        return d
1430+
1431+
1432+    def push_encprivkey(self):
1433+        started = time.time()
1434+        encprivkey = self._encprivkey
1435+        dl = []
1436+        def _spy_on_writer(results):
1437+            print results
1438+            return results
1439+        for shnum, writer in self.writers.iteritems():
1440+            d = writer.put_encprivkey(encprivkey)
1441+            dl.append(d)
1442+        d = defer.DeferredList(dl)
1443+        return d
1444+
1445+
1446+    def push_blockhashes(self):
1447+        started = time.time()
1448+        dl = []
1449+        def _spy_on_results(results):
1450+            print results
1451+            return results
1452+        self.sharehash_leaves = [None] * len(self.blockhashes)
1453+        for shnum, blockhashes in self.blockhashes.iteritems():
1454+            t = hashtree.HashTree(blockhashes)
1455+            self.blockhashes[shnum] = list(t)
1456+            # set the leaf for future use.
1457+            self.sharehash_leaves[shnum] = t[0]
1458+            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
1459+            dl.append(d)
1460+        d = defer.DeferredList(dl)
1461+        return d
1462+
1463+
1464+    def push_sharehashes(self):
1465+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
1466+        share_hash_chain = {}
1467+        ds = []
1468+        def _spy_on_results(results):
1469+            print results
1470+            return results
1471+        for shnum in xrange(len(self.sharehash_leaves)):
1472+            needed_indices = share_hash_tree.needed_hashes(shnum)
1473+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
1474+                                             for i in needed_indices] )
1475+            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
1476+            ds.append(d)
1477+        self.root_hash = share_hash_tree[0]
1478+        d = defer.DeferredList(ds)
1479+        return d
1480+
1481+
1482+    def push_toplevel_hashes_and_signature(self):
1483+        # We need to do three things here:
1484+        #   - Push the root hash and salt hash
1485+        #   - Get the checkstring of the resulting layout; sign that.
1486+        #   - Push the signature
1487+        ds = []
1488+        def _spy_on_results(results):
1489+            print results
1490+            return results
1491+        for shnum in xrange(self.total_shares):
1492+            d = self.writers[shnum].put_root_hash(self.root_hash)
1493+            ds.append(d)
1494+        d = defer.DeferredList(ds)
1495+        def _make_and_place_signature(ignored):
1496+            signable = self.writers[0].get_signable()
1497+            self.signature = self._privkey.sign(signable)
1498+
1499+            ds = []
1500+            for (shnum, writer) in self.writers.iteritems():
1501+                d = writer.put_signature(self.signature)
1502+                ds.append(d)
1503+            return defer.DeferredList(ds)
1504+        d.addCallback(_make_and_place_signature)
1505+        return d
1506+
1507+
1508+    def finish_publishing(self):
1509+        # We're almost done -- we just need to put the verification key
1510+        # and the offsets
1511+        ds = []
1512+        verification_key = self._pubkey.serialize()
1513+
1514+        def _spy_on_results(results):
1515+            print results
1516+            return results
1517+        for (shnum, writer) in self.writers.iteritems():
1518+            d = writer.put_verification_key(verification_key)
1519+            d.addCallback(lambda ignored, writer=writer:
1520+                writer.finish_publishing())
1521+            ds.append(d)
1522+        return defer.DeferredList(ds)
1523+
1524+
1525+    def _turn_barrier(self, res):
1526+        # putting this method in a Deferred chain imposes a guaranteed
1527+        # reactor turn between the pre- and post- portions of that chain.
1528+        # This can be useful to limit memory consumption: since Deferreds do
1529+        # not do tail recursion, code which uses defer.succeed(result) for
1530+        # consistency will cause objects to live for longer than you might
1531+        # normally expect.
1532+        return fireEventually(res)
1533+
1534 
1535     def _fatal_error(self, f):
1536         self.log("error during loop", failure=f, level=log.UNUSUAL)
1537hunk ./src/allmydata/mutable/publish.py 716
1538             self.log_goal(self.goal, "after update: ")
1539 
1540 
1541-
1542     def _encrypt_and_encode(self):
1543         # this returns a Deferred that fires with a list of (sharedata,
1544         # sharenum) tuples. TODO: cache the ciphertext, only produce the
1545hunk ./src/allmydata/mutable/publish.py 757
1546         d.addCallback(_done_encoding)
1547         return d
1548 
1549+
1550     def _generate_shares(self, shares_and_shareids):
1551         # this sets self.shares and self.root_hash
1552         self.log("_generate_shares")
1553hunk ./src/allmydata/mutable/publish.py 1145
1554             self._status.set_progress(1.0)
1555         eventually(self.done_deferred.callback, res)
1556 
1557-
1558hunk ./src/allmydata/test/test_mutable.py 248
1559         d.addCallback(_created)
1560         return d
1561 
1562+
1563+    def test_create_mdmf(self):
1564+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
1565+        def _created(n):
1566+            self.failUnless(isinstance(n, MutableFileNode))
1567+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
1568+            sb = self.nodemaker.storage_broker
1569+            peer0 = sorted(sb.get_all_serverids())[0]
1570+            shnums = self._storage._peers[peer0].keys()
1571+            self.failUnlessEqual(len(shnums), 1)
1572+        d.addCallback(_created)
1573+        return d
1574+
1575+
1576     def test_serialize(self):
1577         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
1578         calls = []
1579hunk ./src/allmydata/test/test_mutable.py 334
1580         d.addCallback(_created)
1581         return d
1582 
1583+
1584+    def test_create_mdmf_with_initial_contents(self):
1585+        initial_contents = "foobarbaz" * 131072 # about 1.1 MiB
1586+        d = self.nodemaker.create_mutable_file(initial_contents,
1587+                                               version=MDMF_VERSION)
1588+        def _created(n):
1589+            d = n.download_best_version()
1590+            d.addCallback(lambda data:
1591+                self.failUnlessEqual(data, initial_contents))
1592+            d.addCallback(lambda ignored:
1593+                n.overwrite(initial_contents + "foobarbaz"))
1594+            d.addCallback(lambda ignored:
1595+                n.download_best_version())
1596+            d.addCallback(lambda data:
1597+                self.failUnlessEqual(data, initial_contents +
1598+                                           "foobarbaz"))
1599+            return d
1600+        d.addCallback(_created)
1601+        return d
1602+
1603+
1604     def test_create_with_initial_contents_function(self):
1605         data = "initial contents"
1606         def _make_contents(n):
1607hunk ./src/allmydata/test/test_mutable.py 370
1608         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
1609         return d
1610 
1611+
1612+    def test_create_mdmf_with_initial_contents_function(self):
1613+        data = "initial contents" * 100000
1614+        def _make_contents(n):
1615+            self.failUnless(isinstance(n, MutableFileNode))
1616+            key = n.get_writekey()
1617+            self.failUnless(isinstance(key, str), key)
1618+            self.failUnlessEqual(len(key), 16)
1619+            return data
1620+        d = self.nodemaker.create_mutable_file(_make_contents,
1621+                                               version=MDMF_VERSION)
1622+        d.addCallback(lambda n:
1623+            n.download_best_version())
1624+        d.addCallback(lambda data2:
1625+            self.failUnlessEqual(data2, data))
1626+        return d
1627+
1628+
1629     def test_create_with_too_large_contents(self):
1630         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
1631         d = self.nodemaker.create_mutable_file(BIG)
1632}
1633[Write a segmented mutable downloader
1634Kevan Carstensen <kevan@isnotajoke.com>**20100626234314
1635 Ignore-this: d2bef531cde1b5c38f2eb28afdd4b17c
1636 
1637 The segmented mutable downloader can deal with MDMF files (files with
1638 one or more segments in MDMF format) and SDMF files (files with one
1639 segment in SDMF format). It is backwards compatible with the old
1640 file format.
1641 
1642 This patch also contains tests for the segmented mutable downloader.
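
 A minimal sketch, not part of the patch itself, of the segment
 bookkeeping that _setup_encoding_parameters performs below, using the
 same mathutil helpers the patch uses; the name plan_segments is invented
 for illustration.

     from allmydata.util import mathutil

     def plan_segments(datalength, segsize, required_shares):
         """Return (num_segments, tail_data_size, tail_segment_size) for
         a mutable file of datalength bytes in segsize-byte segments."""
         if datalength and segsize:
             num_segments = mathutil.div_ceil(datalength, segsize)
             tail_data_size = datalength % segsize
         else:
             num_segments = 0
             tail_data_size = 0
         if not tail_data_size:
             # An exact multiple: the tail segment is a full segment.
             tail_data_size = segsize
         # Pad the tail up to a multiple of k so FEC can split it evenly.
         tail_segment_size = mathutil.next_multiple(tail_data_size,
                                                    required_shares)
         return num_segments, tail_data_size, tail_segment_size

 For example, a 1000-byte file with 400-byte segments and k=3 yields 3
 segments, a 200-byte tail, and a 201-byte padded tail segment. An SDMF
 file always has segsize equal to datalength, so it comes out as a single
 segment; MDMF files can have many.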
1643] {
1644hunk ./src/allmydata/mutable/retrieve.py 8
1645 from twisted.internet import defer
1646 from twisted.python import failure
1647 from foolscap.api import DeadReferenceError, eventually, fireEventually
1648-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
1649-from allmydata.util import hashutil, idlib, log
1650+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
1651+                                 MDMF_VERSION, SDMF_VERSION
1652+from allmydata.util import hashutil, idlib, log, mathutil
1653 from allmydata import hashtree, codec
1654 from allmydata.storage.server import si_b2a
1655 from pycryptopp.cipher.aes import AES
1656hunk ./src/allmydata/mutable/retrieve.py 17
1657 from pycryptopp.publickey import rsa
1658 
1659 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
1660-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
1661+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
1662+                                     MDMFSlotReadProxy
1663 
1664 class RetrieveStatus:
1665     implements(IRetrieveStatus)
1666hunk ./src/allmydata/mutable/retrieve.py 104
1667         self.verinfo = verinfo
1668         # during repair, we may be called upon to grab the private key, since
1669         # it wasn't picked up during a verify=False checker run, and we'll
1670-        # need it for repair to generate the a new version.
1671+        # need it for repair to generate a new version.
1672         self._need_privkey = fetch_privkey
1673         if self._node.get_privkey():
1674             self._need_privkey = False
1675hunk ./src/allmydata/mutable/retrieve.py 109
1676 
1677+        if self._need_privkey:
1678+            # TODO: Evaluate the need for this. We'll use it if we want
1679+            # to limit how many queries are on the wire for the privkey
1680+            # at once.
1681+            self._privkey_query_markers = [] # one Marker for each time we've
1682+                                             # tried to get the privkey.
1683+
1684         self._status = RetrieveStatus()
1685         self._status.set_storage_index(self._storage_index)
1686         self._status.set_helper(False)
1687hunk ./src/allmydata/mutable/retrieve.py 125
1688          offsets_tuple) = self.verinfo
1689         self._status.set_size(datalength)
1690         self._status.set_encoding(k, N)
1691+        self.readers = {}
1692 
1693     def get_status(self):
1694         return self._status
1695hunk ./src/allmydata/mutable/retrieve.py 149
1696         self.remaining_sharemap = DictOfSets()
1697         for (shnum, peerid, timestamp) in shares:
1698             self.remaining_sharemap.add(shnum, peerid)
1699+            # If the servermap update fetched anything, it fetched at least 1
1700+            # KiB, so we ask for that much.
1701+            # TODO: Change the cache methods to allow us to fetch all of the
1702+            # data that they have, then change this method to do that.
1703+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
1704+                                                               shnum,
1705+                                                               0,
1706+                                                               1000)
1707+            ss = self.servermap.connections[peerid]
1708+            reader = MDMFSlotReadProxy(ss,
1709+                                       self._storage_index,
1710+                                       shnum,
1711+                                       any_cache)
1712+            reader.peerid = peerid
1713+            self.readers[shnum] = reader
1714+
1715 
1716         self.shares = {} # maps shnum to validated blocks
1717hunk ./src/allmydata/mutable/retrieve.py 167
1718+        self._active_readers = [] # list of active readers for this dl.
1719+        self._validated_readers = set() # set of readers that we have
1720+                                        # validated the prefix of
1721+        self._block_hash_trees = {} # shnum => hashtree
1722+        # TODO: Make this into a file-backed consumer or something to
1723+        # conserve memory.
1724+        self._plaintext = ""
1725 
1726         # how many shares do we need?
1727hunk ./src/allmydata/mutable/retrieve.py 176
1728-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1729+        (seqnum,
1730+         root_hash,
1731+         IV,
1732+         segsize,
1733+         datalength,
1734+         k,
1735+         N,
1736+         prefix,
1737          offsets_tuple) = self.verinfo
1738hunk ./src/allmydata/mutable/retrieve.py 185
1739-        assert len(self.remaining_sharemap) >= k
1740-        # we start with the lowest shnums we have available, since FEC is
1741-        # faster if we're using "primary shares"
1742-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
1743-        for shnum in self.active_shnums:
1744-            # we use an arbitrary peer who has the share. If shares are
1745-            # doubled up (more than one share per peer), we could make this
1746-            # run faster by spreading the load among multiple peers. But the
1747-            # algorithm to do that is more complicated than I want to write
1748-            # right now, and a well-provisioned grid shouldn't have multiple
1749-            # shares per peer.
1750-            peerid = list(self.remaining_sharemap[shnum])[0]
1751-            self.get_data(shnum, peerid)
1752 
1753hunk ./src/allmydata/mutable/retrieve.py 186
1754-        # control flow beyond this point: state machine. Receiving responses
1755-        # from queries is the input. We might send out more queries, or we
1756-        # might produce a result.
1757 
1758hunk ./src/allmydata/mutable/retrieve.py 187
1759+        # We need one share hash tree for the entire file; its leaves
1760+        # are the roots of the block hash trees for the shares that
1761+        # comprise it, and its root is in the verinfo.
1762+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
1763+        self.share_hash_tree.set_hashes({0: root_hash})
1764+
1765+        # This will set up both the segment decoder and the tail segment
1766+        # decoder, as well as a variety of other instance variables that
1767+        # the download process will use.
1768+        self._setup_encoding_parameters()
1769+        assert len(self.remaining_sharemap) >= k
1770+
1771+        self.log("starting download")
1772+        self._add_active_peers()
1773+        # The download process beyond this is a state machine.
1774+        # _add_active_peers will select the peers that we want to use
1775+        # for the download, and then attempt to start downloading. After
1776+        # each segment, it will check for doneness, reacting to broken
1777+        # peers and corrupt shares as necessary. If it runs out of good
1778+        # peers before downloading all of the segments, _done_deferred
1779+        # will errback.  Otherwise, it will eventually callback with the
1780+        # contents of the mutable file.
1781         return self._done_deferred
1782 
1783hunk ./src/allmydata/mutable/retrieve.py 211
1784-    def get_data(self, shnum, peerid):
1785-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
1786-                 shnum=shnum,
1787-                 peerid=idlib.shortnodeid_b2a(peerid),
1788-                 level=log.NOISY)
1789-        ss = self.servermap.connections[peerid]
1790-        started = time.time()
1791-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1792+
1793+    def _setup_encoding_parameters(self):
1794+        """
1795+        I set up the encoding parameters, including k, n, the number
1796+        of segments associated with this file, and the segment decoder.
1797+        """
1798+        (seqnum,
1799+         root_hash,
1800+         IV,
1801+         segsize,
1802+         datalength,
1803+         k,
1804+         n,
1805+         known_prefix,
1806          offsets_tuple) = self.verinfo
1807hunk ./src/allmydata/mutable/retrieve.py 226
1808-        offsets = dict(offsets_tuple)
1809+        self._required_shares = k
1810+        self._total_shares = n
1811+        self._segment_size = segsize
1812+        self._data_length = datalength
1813+
1814+        if not IV:
1815+            self._version = MDMF_VERSION
1816+        else:
1817+            self._version = SDMF_VERSION
1818+
1819+        if datalength and segsize:
1820+            self._num_segments = mathutil.div_ceil(datalength, segsize)
1821+            self._tail_data_size = datalength % segsize
1822+        else:
1823+            self._num_segments = 0
1824+            self._tail_data_size = 0
1825 
1826hunk ./src/allmydata/mutable/retrieve.py 243
1827-        # we read the checkstring, to make sure that the data we grab is from
1828-        # the right version.
1829-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
1830+        self._segment_decoder = codec.CRSDecoder()
1831+        self._segment_decoder.set_params(segsize, k, n)
1832+        self._current_segment = 0
1833 
1834hunk ./src/allmydata/mutable/retrieve.py 247
1835-        # We also read the data, and the hashes necessary to validate them
1836-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
1837-        # signature or the pubkey, since that was handled during the
1838-        # servermap phase, and we'll be comparing the share hash chain
1839-        # against the roothash that was validated back then.
1840+        if not self._tail_data_size:
1841+            self._tail_data_size = segsize
1842 
1843hunk ./src/allmydata/mutable/retrieve.py 250
1844-        readv.append( (offsets['share_hash_chain'],
1845-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
1846+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
1847+                                                         self._required_shares)
1848+        if self._tail_segment_size == self._segment_size:
1849+            self._tail_decoder = self._segment_decoder
1850+        else:
1851+            self._tail_decoder = codec.CRSDecoder()
1852+            self._tail_decoder.set_params(self._tail_segment_size,
1853+                                          self._required_shares,
1854+                                          self._total_shares)
1855 
1856hunk ./src/allmydata/mutable/retrieve.py 260
1857-        # if we need the private key (for repair), we also fetch that
1858-        if self._need_privkey:
1859-            readv.append( (offsets['enc_privkey'],
1860-                           offsets['EOF'] - offsets['enc_privkey']) )
1861+        self.log("got encoding parameters: "
1862+                 "k: %d "
1863+                 "n: %d "
1864+                 "%d segments of %d bytes each (%d byte tail segment)" % \
1865+                 (k, n, self._num_segments, self._segment_size,
1866+                  self._tail_segment_size))
1867 
1868hunk ./src/allmydata/mutable/retrieve.py 267
1869-        m = Marker()
1870-        self._outstanding_queries[m] = (peerid, shnum, started)
1871+        for i in xrange(self._total_shares):
1872+            # So we don't have to do this later.
1873+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
1874 
1875hunk ./src/allmydata/mutable/retrieve.py 271
1876-        # ask the cache first
1877-        got_from_cache = False
1878-        datavs = []
1879-        for (offset, length) in readv:
1880-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
1881-                                                            offset, length)
1882-            if data is not None:
1883-                datavs.append(data)
1884-        if len(datavs) == len(readv):
1885-            self.log("got data from cache")
1886-            got_from_cache = True
1887-            d = fireEventually({shnum: datavs})
1888-            # datavs is a dict mapping shnum to a pair of strings
1889-        else:
1890-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1891-        self.remaining_sharemap.discard(shnum, peerid)
1892+        # If we have more than one segment, we are an MDMF file, which
1893+        # means that we need to validate the salts as we receive them.
1894+        self._salt_hash_tree = hashtree.IncompleteHashTree(self._num_segments)
1895+        self._salt_hash_tree[0] = IV # from the prefix.
1896 
1897hunk ./src/allmydata/mutable/retrieve.py 276
1898-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
1899-        d.addErrback(self._query_failed, m, peerid)
1900-        # errors that aren't handled by _query_failed (and errors caused by
1901-        # _query_failed) get logged, but we still want to check for doneness.
1902-        def _oops(f):
1903-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
1904-                     shnum=shnum,
1905-                     peerid=idlib.shortnodeid_b2a(peerid),
1906-                     failure=f,
1907-                     level=log.WEIRD, umid="W0xnQA")
1908-        d.addErrback(_oops)
1909-        d.addBoth(self._check_for_done)
1910-        # any error during _check_for_done means the download fails. If the
1911-        # download is successful, _check_for_done will fire _done by itself.
1912-        d.addErrback(self._done)
1913-        d.addErrback(log.err)
1914-        return d # purely for testing convenience
1915 
1916hunk ./src/allmydata/mutable/retrieve.py 277
1917-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1918-        # isolate the callRemote to a separate method, so tests can subclass
1919-        # Publish and override it
1920-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1921-        return d
1922+    def _add_active_peers(self):
1923+        """
1924+        I populate self._active_readers with enough active readers to
1925+        retrieve the contents of this mutable file. I am called before
1926+        downloading starts, and (eventually) after each validation
1927+        error, connection error, or other problem in the download.
1928+        """
1929+        # TODO: It would be cool to investigate other heuristics for
1930+        # reader selection. For instance, the cost (in time the user
1931+        # spends waiting for their file) of selecting a really slow peer
1932+        # that happens to have a primary share is probably more than
1933+        # selecting a really fast peer that doesn't have a primary
1934+        # share. Maybe the servermap could be extended to provide this
1935+        # information; it could keep track of latency information while
1936+        # it gathers more important data, and then this routine could
1937+        # use that to select active readers.
1938+        #
1939+        # (these and other questions would be easier to answer with a
1940+        #  robust, configurable tahoe-lafs simulator, which modeled node
1941+        #  failures, differences in node speed, and other characteristics
1942+        #  that we expect storage servers to have.  You could have
1943+        #  presets for really stable grids (like allmydata.com),
1944+        #  friendnets, make it easy to configure your own settings, and
1945+        #  then simulate the effect of big changes on these use cases
1946+        #  instead of just reasoning about what the effect might be. Out
1947+        #  of scope for MDMF, though.)
1948 
1949hunk ./src/allmydata/mutable/retrieve.py 304
1950-    def remove_peer(self, peerid):
1951-        for shnum in list(self.remaining_sharemap.keys()):
1952-            self.remaining_sharemap.discard(shnum, peerid)
1953+        # We need at least self._required_shares readers to download a
1954+        # segment.
1955+        needed = self._required_shares - len(self._active_readers)
1956+        # XXX: Why don't format= log messages work here?
1957+        self.log("adding %d peers to the active peers list" % needed)
1958 
1959hunk ./src/allmydata/mutable/retrieve.py 310
1960-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
1961-        now = time.time()
1962-        elapsed = now - started
1963-        if not got_from_cache:
1964-            self._status.add_fetch_timing(peerid, elapsed)
1965-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
1966-                 shares=len(datavs),
1967-                 peerid=idlib.shortnodeid_b2a(peerid),
1968-                 level=log.NOISY)
1969-        self._outstanding_queries.pop(marker, None)
1970-        if not self._running:
1971-            return
1972+        # We favor lower numbered shares, since FEC is faster with
1973+        # primary shares than with other shares, and lower-numbered
1974+        # shares are more likely to be primary than higher numbered
1975+        # shares.
1976+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
1977+        # We shouldn't consider adding shares that we already have; this
1978+        # will cause problems later.
1979+        active_shnums -= set([reader.shnum for reader in self._active_readers])
1980+        active_shnums = sorted(active_shnums)[:needed]
1981+        if len(active_shnums) < needed:
1982+            # We don't have enough readers to retrieve the file; fail.
1983+            return self._failed()
1984 
1985hunk ./src/allmydata/mutable/retrieve.py 323
1986-        # note that we only ask for a single share per query, so we only
1987-        # expect a single share back. On the other hand, we use the extra
1988-        # shares if we get them.. seems better than an assert().
1989+        for shnum in active_shnums:
1990+            self._active_readers.append(self.readers[shnum])
1991+            self.log("added reader for share %d" % shnum)
1992+        assert len(self._active_readers) == self._required_shares
1993+        # Conceptually, this is part of the _add_active_peers step. It
1994+        # validates the prefixes of newly added readers to make sure
1995+        # that they match what we are expecting for self.verinfo. If
1996+        # validation is successful, _validate_active_prefixes will call
1997+        # _download_current_segment for us. If validation is
1998+        # unsuccessful, then _validate_prefixes will remove the peer and
1999+        # call _add_active_peers again, where we will attempt to rectify
2000+        # the problem by choosing another peer.
2001+        return self._validate_active_prefixes()
2002 
2003hunk ./src/allmydata/mutable/retrieve.py 337
2004-        for shnum,datav in datavs.items():
2005-            (prefix, hash_and_data) = datav[:2]
2006-            try:
2007-                self._got_results_one_share(shnum, peerid,
2008-                                            prefix, hash_and_data)
2009-            except CorruptShareError, e:
2010-                # log it and give the other shares a chance to be processed
2011-                f = failure.Failure()
2012-                self.log(format="bad share: %(f_value)s",
2013-                         f_value=str(f.value), failure=f,
2014-                         level=log.WEIRD, umid="7fzWZw")
2015-                self.notify_server_corruption(peerid, shnum, str(e))
2016-                self.remove_peer(peerid)
2017-                self.servermap.mark_bad_share(peerid, shnum, prefix)
2018-                self._bad_shares.add( (peerid, shnum) )
2019-                self._status.problems[peerid] = f
2020-                self._last_failure = f
2021-                pass
2022-            if self._need_privkey and len(datav) > 2:
2023-                lp = None
2024-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
2025-        # all done!
2026 
2027hunk ./src/allmydata/mutable/retrieve.py 338
2028-    def notify_server_corruption(self, peerid, shnum, reason):
2029-        ss = self.servermap.connections[peerid]
2030-        ss.callRemoteOnly("advise_corrupt_share",
2031-                          "mutable", self._storage_index, shnum, reason)
2032+    def _validate_active_prefixes(self):
2033+        """
2034+        I check to make sure that the prefixes on the peers that I am
2035+        currently reading from match the prefix that we want to see, as
2036+        said in self.verinfo.
2037 
2038hunk ./src/allmydata/mutable/retrieve.py 344
2039-    def _got_results_one_share(self, shnum, peerid,
2040-                               got_prefix, got_hash_and_data):
2041-        self.log("_got_results: got shnum #%d from peerid %s"
2042-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
2043-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2044+        If I find that all of the active peers have acceptable prefixes,
2045+        I pass control to _download_current_segment, which will use
2046+        those peers to do cool things. If I find that some of the active
2047+        peers have unacceptable prefixes, I will remove them from active
2048+        peers (and from further consideration) and call
2049+        _add_active_peers to attempt to rectify the situation. I keep
2050+        track of which peers I have already validated so that I don't
2051+        need to do so again.
2052+        """
2053+        assert self._active_readers, "No more active readers"
2054+
2055+        ds = []
2056+        new_readers = set(self._active_readers) - self._validated_readers
2057+        self.log('validating %d newly-added active readers' % len(new_readers))
2058+
2059+        for reader in new_readers:
2060+            # We force a remote read here -- otherwise, we are relying
2061+            # on cached data that we already verified as valid, and we
2062+            # won't detect an uncoordinated write that has occurred
2063+            # since the last servermap update.
2064+            d = reader.get_prefix(force_remote=True)
2065+            d.addCallback(self._try_to_validate_prefix, reader)
2066+            ds.append(d)
2067+        dl = defer.DeferredList(ds, consumeErrors=True)
2068+        def _check_results(results):
2069+            # Each result in results will be of the form (success, msg).
2070+            # We don't care about msg, but success will tell us whether
2071+            # or not the checkstring validated. If it didn't, we need to
2072+            # remove the offending (peer,share) from our active readers,
2073+            # and ensure that active readers is again populated.
2074+            bad_readers = []
2075+            for i, result in enumerate(results):
2076+                if not result[0]:
2077+                    reader = self._active_readers[i]
2078+                    f = result[1]
2079+                    assert isinstance(f, failure.Failure)
2080+
2081+                    self.log("The reader %s failed to "
2082+                             "properly validate: %s" % \
2083+                             (reader, str(f.value)))
2084+                    bad_readers.append((reader, f))
2085+                else:
2086+                    reader = self._active_readers[i]
2087+                    self.log("the reader %s checks out, so we'll use it" % \
2088+                             reader)
2089+                    self._validated_readers.add(reader)
2090+                    # Each time we validate a reader, we check to see if
2091+                    # we need the private key. If we do, we politely ask
2092+                    # for it and then continue computing. If we find
2093+                    # that we haven't gotten it at the end of
2094+                    # segment decoding, then we'll take more drastic
2095+                    # measures.
2096+                    if self._need_privkey:
2097+                        d = reader.get_encprivkey()
2098+                        d.addCallback(self._try_to_validate_privkey, reader)
2099+            if bad_readers:
2100+                # We do them all at once, or else we screw up list indexing.
2101+                for (reader, f) in bad_readers:
2102+                    self._mark_bad_share(reader, f)
2103+                return self._add_active_peers()
2104+            else:
2105+                return self._download_current_segment()
2106+            # The next step will assert that it has enough active
2107+            # readers to fetch shares; we just need to remove it.
2108+        dl.addCallback(_check_results)
2109+        return dl
2110+
2111+
2112+    def _try_to_validate_prefix(self, prefix, reader):
2113+        """
2114+        I check that the prefix returned by a candidate server for
2115+        retrieval matches the prefix that the servermap knows about
2116+        (and, hence, the prefix that was validated earlier). If it does,
2117+        I return without complaint, which means that I approve of the
2118+        candidate server for segment retrieval. If it doesn't, I raise
2119+        UncoordinatedWriteError, so that another server will be chosen.
2120+        """
2121+        (seqnum,
2122+         root_hash,
2123+         IV,
2124+         segsize,
2125+         datalength,
2126+         k,
2127+         N,
2128+         known_prefix,
2129          offsets_tuple) = self.verinfo
2130hunk ./src/allmydata/mutable/retrieve.py 430
2131-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
2132-        if got_prefix != prefix:
2133-            msg = "someone wrote to the data since we read the servermap: prefix changed"
2134-            raise UncoordinatedWriteError(msg)
2135-        (share_hash_chain, block_hash_tree,
2136-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
2137+        if known_prefix != prefix:
2138+            self.log("prefix from share %d doesn't match" % reader.shnum)
2139+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
2140+                                          "indicate an uncoordinated write")
2141+        # Otherwise, we're okay -- no issues.
2142 
2143hunk ./src/allmydata/mutable/retrieve.py 436
2144-        assert isinstance(share_data, str)
2145-        # build the block hash tree. SDMF has only one leaf.
2146-        leaves = [hashutil.block_hash(share_data)]
2147-        t = hashtree.HashTree(leaves)
2148-        if list(t) != block_hash_tree:
2149-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
2150-        share_hash_leaf = t[0]
2151-        t2 = hashtree.IncompleteHashTree(N)
2152-        # root_hash was checked by the signature
2153-        t2.set_hashes({0: root_hash})
2154-        try:
2155-            t2.set_hashes(hashes=share_hash_chain,
2156-                          leaves={shnum: share_hash_leaf})
2157-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
2158-                IndexError), e:
2159-            msg = "corrupt hashes: %s" % (e,)
2160-            raise CorruptShareError(peerid, shnum, msg)
2161-        self.log(" data valid! len=%d" % len(share_data))
2162-        # each query comes down to this: placing validated share data into
2163-        # self.shares
2164-        self.shares[shnum] = share_data
2165 
2166hunk ./src/allmydata/mutable/retrieve.py 437
2167-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
2168+    def _remove_reader(self, reader):
2169+        """
2170+        At various points, we will wish to remove a peer from
2171+        consideration and/or use. These include, but are not necessarily
2172+        limited to:
2173 
2174hunk ./src/allmydata/mutable/retrieve.py 443
2175-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2176-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2177-        if alleged_writekey != self._node.get_writekey():
2178-            self.log("invalid privkey from %s shnum %d" %
2179-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
2180-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
2181-            return
2182+            - A connection error.
2183+            - A mismatched prefix (that is, a prefix that does not match
2184+              our conception of the version information string).
2185+            - A failing block hash, salt hash, or share hash, which can
2186+              indicate disk failure/bit flips, or network trouble.
2187 
2188hunk ./src/allmydata/mutable/retrieve.py 449
2189-        # it's good
2190-        self.log("got valid privkey from shnum %d on peerid %s" %
2191-                 (shnum, idlib.shortnodeid_b2a(peerid)),
2192-                 parent=lp)
2193-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2194-        self._node._populate_encprivkey(enc_privkey)
2195-        self._node._populate_privkey(privkey)
2196-        self._need_privkey = False
2197+        This method will do that. I will make sure that the
2198+        (shnum,reader) combination represented by my reader argument is
2199+        not used for anything else during this download. I will not
2200+        advise the reader of any corruption, something that my callers
2201+        may wish to do on their own.
2202+        """
2203+        # TODO: When you're done writing this, see if this is ever
2204+        # actually used for something that _mark_bad_share isn't. I have
2205+        # a feeling that they will be used for very similar things, and
2206+        # that having them both here is just going to be an epic amount
2207+        # of code duplication.
2208+        #
2209+        # (well, okay, not epic, but meaningful)
2210+        self.log("removing reader %s" % reader)
2211+        # Remove the reader from _active_readers
2212+        self._active_readers.remove(reader)
2213+        # TODO: self.readers.remove(reader)?
2214+        for shnum in list(self.remaining_sharemap.keys()):
2215+            self.remaining_sharemap.discard(shnum, reader.peerid)
2216 
2217hunk ./src/allmydata/mutable/retrieve.py 469
2218-    def _query_failed(self, f, marker, peerid):
2219-        self.log(format="query to [%(peerid)s] failed",
2220-                 peerid=idlib.shortnodeid_b2a(peerid),
2221-                 level=log.NOISY)
2222-        self._status.problems[peerid] = f
2223-        self._outstanding_queries.pop(marker, None)
2224-        if not self._running:
2225-            return
2226-        self._last_failure = f
2227-        self.remove_peer(peerid)
2228-        level = log.WEIRD
2229-        if f.check(DeadReferenceError):
2230-            level = log.UNUSUAL
2231-        self.log(format="error during query: %(f_value)s",
2232-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
2233 
2234hunk ./src/allmydata/mutable/retrieve.py 470
2235-    def _check_for_done(self, res):
2236-        # exit paths:
2237-        #  return : keep waiting, no new queries
2238-        #  return self._send_more_queries(outstanding) : send some more queries
2239-        #  fire self._done(plaintext) : download successful
2240-        #  raise exception : download fails
2241+    def _mark_bad_share(self, reader, f):
2242+        """
2243+        I mark the (peerid, shnum) encapsulated by my reader argument as
2244+        a bad share, which means that it will not be used anywhere else.
2245 
2246hunk ./src/allmydata/mutable/retrieve.py 475
2247-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
2248-                 running=self._running, decoding=self._decoding,
2249-                 level=log.NOISY)
2250-        if not self._running:
2251-            return
2252-        if self._decoding:
2253-            return
2254-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2255-         offsets_tuple) = self.verinfo
2256+        There are several reasons to want to mark something as a bad
2257+        share. These include:
2258 
2259hunk ./src/allmydata/mutable/retrieve.py 478
2260-        if len(self.shares) < k:
2261-            # we don't have enough shares yet
2262-            return self._maybe_send_more_queries(k)
2263-        if self._need_privkey:
2264-            # we got k shares, but none of them had a valid privkey. TODO:
2265-            # look further. Adding code to do this is a bit complicated, and
2266-            # I want to avoid that complication, and this should be pretty
2267-            # rare (k shares with bitflips in the enc_privkey but not in the
2268-            # data blocks). If we actually do get here, the subsequent repair
2269-            # will fail for lack of a privkey.
2270-            self.log("got k shares but still need_privkey, bummer",
2271-                     level=log.WEIRD, umid="MdRHPA")
2272+            - A connection error to the peer.
2273+            - A mismatched prefix (that is, a prefix that does not match
2274+              our local conception of the version information string).
2275+            - A failing block hash, salt hash, share hash, or other
2276+              integrity check.
2277 
2278hunk ./src/allmydata/mutable/retrieve.py 484
2279-        # we have enough to finish. All the shares have had their hashes
2280-        # checked, so if something fails at this point, we don't know how
2281-        # to fix it, so the download will fail.
2282+        This method will ensure that readers that we wish to mark bad
2283+        (for these reasons or other reasons) are not used for the rest
2284+        of the download. Additionally, it will attempt to tell the
2285+        remote peer (with no guarantee of success) that its share is
2286+        corrupt.
2287+        """
2288+        self.log("marking share %d on server %s as bad" % \
2289+                 (reader.shnum, reader))
2290+        self._remove_reader(reader)
2291+        self._bad_shares.add((reader.peerid, reader.shnum))
2292+        self._status.problems[reader.peerid] = f
2293+        self._last_failure = f
2294+        self.notify_server_corruption(reader.peerid, reader.shnum,
2295+                                      str(f.value))
2296 
2297hunk ./src/allmydata/mutable/retrieve.py 499
2298-        self._decoding = True # avoid reentrancy
2299-        self._status.set_status("decoding")
2300-        now = time.time()
2301-        elapsed = now - self._started
2302-        self._status.timings["fetch"] = elapsed
2303 
2304hunk ./src/allmydata/mutable/retrieve.py 500
2305-        d = defer.maybeDeferred(self._decode)
2306-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
2307-        d.addBoth(self._done)
2308-        return d # purely for test convenience
2309+    def _download_current_segment(self):
2310+        """
2311+        I download, validate, decode, decrypt, and assemble the segment
2312+        that this Retrieve is currently responsible for downloading.
2313+        """
2314+        assert len(self._active_readers) >= self._required_shares
2315+        if self._current_segment < self._num_segments:
2316+            d = self._process_segment(self._current_segment)
2317+        else:
2318+            d = defer.succeed(None)
2319+        d.addCallback(self._check_for_done)
2320+        return d
2321 
2322hunk ./src/allmydata/mutable/retrieve.py 513
2323-    def _maybe_send_more_queries(self, k):
2324-        # we don't have enough shares yet. Should we send out more queries?
2325-        # There are some number of queries outstanding, each for a single
2326-        # share. If we can generate 'needed_shares' additional queries, we do
2327-        # so. If we can't, then we know this file is a goner, and we raise
2328-        # NotEnoughSharesError.
2329-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
2330-                         "outstanding=%(outstanding)d"),
2331-                 have=len(self.shares), k=k,
2332-                 outstanding=len(self._outstanding_queries),
2333-                 level=log.NOISY)
2334 
2335hunk ./src/allmydata/mutable/retrieve.py 514
2336-        remaining_shares = k - len(self.shares)
2337-        needed = remaining_shares - len(self._outstanding_queries)
2338-        if not needed:
2339-            # we have enough queries in flight already
2340+    def _process_segment(self, segnum):
2341+        """
2342+        I download, validate, decode, and decrypt one segment of the
2343+        file that this Retrieve is retrieving. This means coordinating
2344+        the process of getting k blocks of that segment, validating
2345+        them, assembling them into one segment with the decoder, and
2346+        then decrypting the result.
2347+        """
2348+        self.log("processing segment %d" % segnum)
2349 
2350hunk ./src/allmydata/mutable/retrieve.py 524
2351-            # TODO: but if they've been in flight for a long time, and we
2352-            # have reason to believe that new queries might respond faster
2353-            # (i.e. we've seen other queries come back faster, then consider
2354-            # sending out new queries. This could help with peers which have
2355-            # silently gone away since the servermap was updated, for which
2356-            # we're still waiting for the 15-minute TCP disconnect to happen.
2357-            self.log("enough queries are in flight, no more are needed",
2358-                     level=log.NOISY)
2359-            return
2360+        # TODO: The old code uses a marker. Should this code do that
2361+        # too? What did the Marker do?
2362+        assert len(self._active_readers) >= self._required_shares
2363+
2364+        # We need to ask each of our active readers for its block and
2365+        # salt. We will then validate those. If validation is
2366+        # successful, we will assemble the results into plaintext.
2367+        ds = []
2368+        for reader in self._active_readers:
2369+            d = reader.get_block_and_salt(segnum, queue=True)
2370+            d2 = self._get_needed_hashes(reader, segnum)
2371+            dl = defer.DeferredList([d, d2], consumeErrors=True)
2372+            dl.addCallback(self._validate_block, segnum, reader)
2373+            dl.addErrback(self._validation_or_decoding_failed, [reader])
2374+            ds.append(dl)
2375+            reader.flush()
2376+        dl = defer.DeferredList(ds)
2377+        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
2378+        return dl
2379 
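
For reference, the fan-out that _process_segment performs can be reduced to the standalone sketch below. fetch_block_and_salt and fetch_needed_hashes are hypothetical stand-ins for reader.get_block_and_salt() and _get_needed_hashes(); the real code also batches both reads into one remote query via queue=True and reader.flush().

    from twisted.internet import defer

    def fetch_block_and_salt(reader, segnum):
        # hypothetical stand-in for reader.get_block_and_salt(segnum, queue=True)
        return defer.succeed(("block bytes", "per-segment salt"))

    def fetch_needed_hashes(reader, segnum):
        # hypothetical stand-in for Retrieve._get_needed_hashes(reader, segnum)
        return defer.succeed(({}, {}))

    def process_segment(active_readers, segnum):
        # one (block+salt, hashes) pair of fetches per active reader;
        # consumeErrors=True lets a broken reader show up later as a
        # (False, Failure) entry without leaving an unhandled error behind.
        ds = []
        for reader in active_readers:
            d = fetch_block_and_salt(reader, segnum)
            d2 = fetch_needed_hashes(reader, segnum)
            dl = defer.DeferredList([d, d2], consumeErrors=True)
            ds.append(dl)
        return defer.DeferredList(ds)
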
2380hunk ./src/allmydata/mutable/retrieve.py 544
2381-        outstanding_shnums = set([shnum
2382-                                  for (peerid, shnum, started)
2383-                                  in self._outstanding_queries.values()])
2384-        # prefer low-numbered shares, they are more likely to be primary
2385-        available_shnums = sorted(self.remaining_sharemap.keys())
2386-        for shnum in available_shnums:
2387-            if shnum in outstanding_shnums:
2388-                # skip ones that are already in transit
2389-                continue
2390-            if shnum not in self.remaining_sharemap:
2391-                # no servers for that shnum. note that DictOfSets removes
2392-                # empty sets from the dict for us.
2393-                continue
2394-            peerid = list(self.remaining_sharemap[shnum])[0]
2395-            # get_data will remove that peerid from the sharemap, and add the
2396-            # query to self._outstanding_queries
2397-            self._status.set_status("Retrieving More Shares")
2398-            self.get_data(shnum, peerid)
2399-            needed -= 1
2400-            if not needed:
2401+
2402+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
2403+        """
2404+        I take the results of fetching and validating the blocks from a
2405+        callback chain in another method. If those results show that
2406+        fetching and validation succeeded without incident, I proceed
2407+        with decoding and decryption. Otherwise, I do nothing.
2409+        """
2410+        self.log("trying to decode and decrypt segment %d" % segnum)
2411+        failures = False
2412+        for block_and_salt in blocks_and_salts:
2413+            if not block_and_salt[0] or block_and_salt[1] is None:
2414+                self.log("some validation operations failed; not proceeding")
2415+                failures = True
2416                 break
2417hunk ./src/allmydata/mutable/retrieve.py 560
2418+        if not failures:
2419+            self.log("everything looks ok, building segment %d" % segnum)
2420+            d = self._decode_blocks(blocks_and_salts, segnum)
2421+            d.addCallback(self._decrypt_segment)
2422+            d.addErrback(self._validation_or_decoding_failed,
2423+                         self._active_readers)
2424+            d.addCallback(self._set_segment)
2425+            return d
2426+        else:
2427+            return defer.succeed(None)
2428+
2429+
2430+    def _set_segment(self, segment):
2431+        """
2432+        Given a plaintext segment, I append it to the plaintext
2433+        accumulated so far and advance to the next segment.
2434+        """
2435+        self.log("got plaintext for segment %d" % self._current_segment)
2436+        self._plaintext += segment
2437+        self._current_segment += 1
2438 
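
A note on the shapes being tested above: each element of blocks_and_salts is a (success, result) pair as produced by defer.DeferredList. A tiny illustration with made-up values:

    def all_fetches_usable(blocks_and_salts):
        # Each entry is a (success, result) pair from defer.DeferredList;
        # any failed or empty fetch means we skip decoding on this pass
        # rather than hand bad data to the decoder.
        for success, result in blocks_and_salts:
            if not success or result is None:
                return False
        return True

    # e.g. all_fetches_usable([(True, ("block", "salt")), (False, None)]) is False
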
2439hunk ./src/allmydata/mutable/retrieve.py 581
2440-        # at this point, we have as many outstanding queries as we can. If
2441-        # needed!=0 then we might not have enough to recover the file.
2442-        if needed:
2443-            format = ("ran out of peers: "
2444-                      "have %(have)d shares (k=%(k)d), "
2445-                      "%(outstanding)d queries in flight, "
2446-                      "need %(need)d more, "
2447-                      "found %(bad)d bad shares")
2448-            args = {"have": len(self.shares),
2449-                    "k": k,
2450-                    "outstanding": len(self._outstanding_queries),
2451-                    "need": needed,
2452-                    "bad": len(self._bad_shares),
2453-                    }
2454-            self.log(format=format,
2455-                     level=log.WEIRD, umid="ezTfjw", **args)
2456-            err = NotEnoughSharesError("%s, last failure: %s" %
2457-                                      (format % args, self._last_failure))
2458-            if self._bad_shares:
2459-                self.log("We found some bad shares this pass. You should "
2460-                         "update the servermap and try again to check "
2461-                         "more peers",
2462-                         level=log.WEIRD, umid="EFkOlA")
2463-                err.servermap = self.servermap
2464-            raise err
2465 
2466hunk ./src/allmydata/mutable/retrieve.py 582
2467+    def _validation_or_decoding_failed(self, f, readers):
2468+        """
2469+        I am called when a block or a salt fails to correctly validate, or when
2470+        the decryption or decoding operation fails for some reason.  I react to
2471+        this failure by notifying the remote server of corruption, and then
2472+        removing the remote peer from further activity.
2473+        """
2474+        assert isinstance(readers, list)
2475+        bad_shnums = [reader.shnum for reader in readers]
2476+
2477+        self.log("validation or decoding failed on share(s) %s, peer(s) %s, "
2478+                 "segment %d: %s" % \
2479+                 (bad_shnums, readers, self._current_segment, str(f)))
2480+        for reader in readers:
2481+            self._mark_bad_share(reader, f)
2482         return
2483 
2484hunk ./src/allmydata/mutable/retrieve.py 599
2485-    def _decode(self):
2486-        started = time.time()
2487-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2488-         offsets_tuple) = self.verinfo
2489 
2490hunk ./src/allmydata/mutable/retrieve.py 600
2491-        # shares_dict is a dict mapping shnum to share data, but the codec
2492-        # wants two lists.
2493-        shareids = []; shares = []
2494-        for shareid, share in self.shares.items():
2495+    def _validate_block(self, results, segnum, reader):
2496+        """
2497+        I validate a block from one share on a remote server.
2498+        """
2499+        # Grab the part of the block hash tree that is necessary to
2500+        # validate this block, then generate the block hash root.
2501+        self.log("validating share %d for segment %d" % (reader.shnum,
2502+                                                             segnum))
2503+        # Did we fail to fetch either of the things that we were
2504+        # supposed to? Fail if so.
2505+        if not results[0][0] or not results[1][0]:
2506+            # These all get batched into one query, so the resulting
2507+            # failure should be the same for both; report whichever
2508+            # fetch failed. The CorruptShareError raised here is
2509+            # handled by the errback added in _process_segment.
2510+            if not results[0][0]:
2511+                f = results[0][1]
2512+            else:
2513+                f = results[1][1]
2514+            assert isinstance(f, failure.Failure)
2515+            raise CorruptShareError(reader.peerid, reader.shnum,
2516+                                    "Connection error: %s" % str(f))
2517+
2518+        block_and_salt, block_and_sharehashes = results
2519+        block, salt = block_and_salt[1]
2520+        blockhashes, sharehashes = block_and_sharehashes[1]
2521+
2522+        blockhashes = dict(enumerate(blockhashes[1]))
2523+        self.log("the reader gave me the following blockhashes: %s" % \
2524+                 blockhashes.keys())
2525+        self.log("the reader gave me the following sharehashes: %s" % \
2526+                 sharehashes[1].keys())
2527+        bht = self._block_hash_trees[reader.shnum]
2528+
2529+        if bht.needed_hashes(segnum, include_leaf=True):
2530+            try:
2531+                bht.set_hashes(blockhashes)
2532+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2533+                    IndexError), e:
2534+                raise CorruptShareError(reader.peerid,
2535+                                        reader.shnum,
2536+                                        "block hash tree failure: %s" % e)
2537+
2538+        if self._version == MDMF_VERSION:
2539+            blockhash = hashutil.block_hash(salt + block)
2540+        else:
2541+            blockhash = hashutil.block_hash(block)
2542+        # If this works without an error, then validation is
2543+        # successful.
2544+        try:
2545+            bht.set_hashes(leaves={segnum: blockhash})
2546+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2547+                IndexError), e:
2548+            raise CorruptShareError(reader.peerid,
2549+                                    reader.shnum,
2550+                                    "block hash tree failure: %s" % e)
2551+
2552+        # Reaching this point means that we know that this segment
2553+        # is correct. Now we need to check to see whether the share
2554+        # hash chain is also correct.
2555+        # SDMF wrote share hash chains that didn't contain the
2556+        # leaves, which would be produced from the block hash tree.
2557+        # So we need to validate the block hash tree first. If
2558+        # successful, then bht[0] will contain the root for the
2559+        # shnum, which will be a leaf in the share hash tree, which
2560+        # will allow us to validate the rest of the tree.
2561+        if self.share_hash_tree.needed_hashes(reader.shnum,
2562+                                               include_leaf=True):
2563+            try:
2564+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
2565+                                            leaves={reader.shnum: bht[0]})
2566+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2567+                    IndexError), e:
2568+                raise CorruptShareError(reader.peerid,
2569+                                        reader.shnum,
2570+                                        "corrupt hashes: %s" % e)
2571+
2572+        # TODO: Validate the salt, too.
2573+        self.log('share %d is valid for segment %d' % (reader.shnum,
2574+                                                       segnum))
2575+        return {reader.shnum: (block, salt)}
2576+
2577+
2578+    def _get_needed_hashes(self, reader, segnum):
2579+        """
2580+        I get the hashes needed to validate segnum from the reader, then return
2581+        to my caller when this is done.
2582+        """
2583+        bht = self._block_hash_trees[reader.shnum]
2584+        needed = bht.needed_hashes(segnum, include_leaf=True)
2585+        # The root of the block hash tree is also a leaf in the share
2586+        # hash tree, so we don't need to fetch it from the remote
2587+        # server. For files with one segment, this means we won't fetch
2588+        # any block hashes at all: the single block hash is itself the
2589+        # root of that share's block hash tree, and appears as a leaf
2590+        # in the share hash tree. This is fine, since any share
2591+        # corruption will still be detected by the share hash tree.
2593+        #needed.discard(0)
2594+        self.log("getting blockhashes for segment %d, share %d: %s" % \
2595+                 (segnum, reader.shnum, str(needed)))
2596+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
2597+        if self.share_hash_tree.needed_hashes(reader.shnum):
2598+            need = self.share_hash_tree.needed_hashes(reader.shnum)
2599+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
2600+                                                                 str(need)))
2601+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
2602+        else:
2603+            d2 = defer.succeed({}) # the logic in the next method
2604+                                   # expects a dict
2605+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
2606+        return dl
2607+
2608+
2609+    def _decode_blocks(self, blocks_and_salts, segnum):
2610+        """
2611+        I take a list of k blocks and salts, and decode that into a
2612+        single encrypted segment.
2613+        """
2614+        d = {}
2615+        # We want to merge our dictionaries into the form
2616+        # {shnum: (block, salt)}.
2617+        #
2618+        # The dictionaries come from _validate_block in that form, so
2619+        # we just need to merge them.
2620+        for block_and_salt in blocks_and_salts:
2621+            d.update(block_and_salt[1])
2622+
2623+        # All of these blocks should have the same salt; in SDMF, it is
2624+        # the file-wide IV, while in MDMF it is the per-segment salt. In
2625+        # either case, we just need to get one of them and use it.
2626+        #
2627+        # d.items()[0] is like (shnum, (block, salt))
2628+        # d.items()[0][1] is like (block, salt)
2629+        # d.items()[0][1][1] is the salt.
2630+        salt = d.items()[0][1][1]
2631+        # Next, extract just the blocks from the dict. We'll use the
2632+        # salt in the next step.
2633+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
2634+        d2 = dict(share_and_shareids)
2635+        shareids = []
2636+        shares = []
2637+        for shareid, share in d2.items():
2638             shareids.append(shareid)
2639             shares.append(share)
2640 
2641hunk ./src/allmydata/mutable/retrieve.py 746
2642-        assert len(shareids) >= k, len(shareids)
2643+        assert len(shareids) >= self._required_shares, len(shareids)
2644         # zfec really doesn't want extra shares
2645hunk ./src/allmydata/mutable/retrieve.py 748
2646-        shareids = shareids[:k]
2647-        shares = shares[:k]
2648-
2649-        fec = codec.CRSDecoder()
2650-        fec.set_params(segsize, k, N)
2651-
2652-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
2653-        self.log("about to decode, shareids=%s" % (shareids,))
2654-        d = defer.maybeDeferred(fec.decode, shares, shareids)
2655-        def _done(buffers):
2656-            self._status.timings["decode"] = time.time() - started
2657-            self.log(" decode done, %d buffers" % len(buffers))
2658+        shareids = shareids[:self._required_shares]
2659+        shares = shares[:self._required_shares]
2660+        self.log("decoding segment %d" % segnum)
2661+        if segnum == self._num_segments - 1:
2662+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
2663+        else:
2664+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
2665+        def _process(buffers):
2666             segment = "".join(buffers)
2667hunk ./src/allmydata/mutable/retrieve.py 757
2668+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
2669+                     segnum=segnum,
2670+                     numsegs=self._num_segments,
2671+                     level=log.NOISY)
2672             self.log(" joined length %d, datalength %d" %
2673hunk ./src/allmydata/mutable/retrieve.py 762
2674-                     (len(segment), datalength))
2675-            segment = segment[:datalength]
2676+                     (len(segment), self._data_length))
2677+            if segnum == self._num_segments - 1:
2678+                size_to_use = self._tail_data_size
2679+            else:
2680+                size_to_use = self._segment_size
2681+            segment = segment[:size_to_use]
2682             self.log(" segment len=%d" % len(segment))
2683hunk ./src/allmydata/mutable/retrieve.py 769
2684-            return segment
2685-        def _err(f):
2686-            self.log(" decode failed: %s" % f)
2687-            return f
2688-        d.addCallback(_done)
2689-        d.addErrback(_err)
2690+            return segment, salt
2691+        d.addCallback(_process)
2692         return d
2693 
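
The reshaping done by _decode_blocks before the data reaches zfec can be shown on its own: merge the per-reader {shnum: (block, salt)} dicts, pull out one salt, and split the rest into the parallel shareids/shares lists the decoder wants. This is an illustrative sketch; the names are not the patch's:

    def split_blocks_for_decoder(validated_results, required_shares):
        # validated_results: [(success, {shnum: (block, salt)}), ...], one
        # entry per reader, as produced by _validate_block via DeferredList
        merged = {}
        for success, mapping in validated_results:
            merged.update(mapping)
        shareids = sorted(merged.keys())[:required_shares]
        shares = [merged[shnum][0] for shnum in shareids]
        # every block of one segment carries the same salt (the file-wide
        # IV for SDMF, the per-segment salt for MDMF), so any entry will do
        salt = merged[shareids[0]][1]
        return shareids, shares, salt

    # e.g. split_blocks_for_decoder([(True, {0: ("b0", "s")}),
    #                                (True, {3: ("b3", "s")})], 2)
    # returns ([0, 3], ["b0", "b3"], "s")
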
2694hunk ./src/allmydata/mutable/retrieve.py 773
2695-    def _decrypt(self, crypttext, IV, readkey):
2696+
2697+    def _decrypt_segment(self, segment_and_salt):
2698+        """
2699+        I take a single encrypted segment and its salt, decrypt it,
2700+        and return the resulting plaintext.
2701+        """
2702+        segment, salt = segment_and_salt
2703         self._status.set_status("decrypting")
2704hunk ./src/allmydata/mutable/retrieve.py 781
2705+        self.log("decrypting segment %d" % self._current_segment)
2706         started = time.time()
2707hunk ./src/allmydata/mutable/retrieve.py 783
2708-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
2709+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
2710         decryptor = AES(key)
2711hunk ./src/allmydata/mutable/retrieve.py 785
2712-        plaintext = decryptor.process(crypttext)
2713+        plaintext = decryptor.process(segment)
2714         self._status.timings["decrypt"] = time.time() - started
2715         return plaintext
2716 
2717hunk ./src/allmydata/mutable/retrieve.py 789
2718-    def _done(self, res):
2719-        if not self._running:
2720+
2721+    def notify_server_corruption(self, peerid, shnum, reason):
2722+        ss = self.servermap.connections[peerid]
2723+        ss.callRemoteOnly("advise_corrupt_share",
2724+                          "mutable", self._storage_index, shnum, reason)
2725+
2726+
2727+    def _try_to_validate_privkey(self, enc_privkey, reader):
2728+
2729+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2730+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2731+        if alleged_writekey != self._node.get_writekey():
2732+            self.log("invalid privkey from %s shnum %d" %
2733+                     (reader, reader.shnum),
2734+                     level=log.WEIRD, umid="YIw4tA")
2735             return
2736hunk ./src/allmydata/mutable/retrieve.py 805
2737-        self._running = False
2738-        self._status.set_active(False)
2739-        self._status.timings["total"] = time.time() - self._started
2740-        # res is either the new contents, or a Failure
2741-        if isinstance(res, failure.Failure):
2742-            self.log("Retrieve done, with failure", failure=res,
2743-                     level=log.UNUSUAL)
2744-            self._status.set_status("Failed")
2745-        else:
2746-            self.log("Retrieve done, success!")
2747-            self._status.set_status("Finished")
2748-            self._status.set_progress(1.0)
2749-            # remember the encoding parameters, use them again next time
2750-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2751-             offsets_tuple) = self.verinfo
2752-            self._node._populate_required_shares(k)
2753-            self._node._populate_total_shares(N)
2754-        eventually(self._done_deferred.callback, res)
2755 
2756hunk ./src/allmydata/mutable/retrieve.py 806
2757+        # it's good
2758+        self.log("got valid privkey from shnum %d on reader %s" %
2759+                 (reader.shnum, reader))
2760+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2761+        self._node._populate_encprivkey(enc_privkey)
2762+        self._node._populate_privkey(privkey)
2763+        self._need_privkey = False
2764+
2765+
2766+    def _check_for_done(self, res):
2767+        """
2768+        I check to see if this Retrieve object has successfully finished
2769+        its work.
2770+
2771+        I can exit in the following ways:
2772+            - If there are no more segments to download, then I exit by
2773+              causing self._done_deferred to fire with the plaintext
2774+              content requested by the caller.
2775+            - If there are still segments to be downloaded, and there
2776+              are enough active readers (readers which have not broken
2777+              and have not given us corrupt data) to continue
2778+              downloading, I send control back to
2779+              _download_current_segment.
2780+            - If there are still segments to be downloaded but there are
2781+              not enough active peers to download them, I ask
2782+              _add_active_peers to add more peers. If it is successful,
2783+              it will call _download_current_segment. If there are not
2784+              enough peers to retrieve the file, then that will cause
2785+              _done_deferred to errback.
2786+        """
2787+        self.log("checking for doneness")
2788+        if self._current_segment == self._num_segments:
2789+            # No more segments to download, we're done.
2790+            self.log("got plaintext, done")
2791+            return self._done()
2792+
2793+        if len(self._active_readers) >= self._required_shares:
2794+            # More segments to download, but we have enough good peers
2795+            # in self._active_readers that we can do that without issue,
2796+            # so go nab the next segment.
2797+            self.log("not done yet: on segment %d of %d" % \
2798+                     (self._current_segment + 1, self._num_segments))
2799+            return self._download_current_segment()
2800+
2801+        self.log("not done yet: on segment %d of %d, need to add peers" % \
2802+                 (self._current_segment + 1, self._num_segments))
2803+        return self._add_active_peers()
2804+
2805+
2806+    def _done(self):
2807+        """
2808+        I am called by _check_for_done when the download process has
2809+        finished successfully. I return the decrypted contents to the
2810+        owner of this Retrieve object by firing self._done_deferred.
2812+        """
2813+        eventually(self._done_deferred.callback, self._plaintext)
2814+
2815+
2816+    def _failed(self):
2817+        """
2818+        I am called by _add_active_peers when there are not enough
2819+        active peers left to complete the download. I fire
2820+        self._done_deferred with a Failure describing the problem,
2821+        which is delivered to the caller of this Retrieve object.
2823+        """
2824+        format = ("ran out of peers: "
2825+                  "have %(have)d of %(total)d segments "
2826+                  "found %(bad)d bad shares "
2827+                  "encoding %(k)d-of-%(n)d")
2828+        args = {"have": self._current_segment,
2829+                "total": self._num_segments,
2830+                "k": self._required_shares,
2831+                "n": self._total_shares,
2832+                "bad": len(self._bad_shares)}
2833+        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
2834+                                                        str(self._last_failure)))
2835+        f = failure.Failure(e)
2836+        eventually(self._done_deferred.callback, f)
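
The three exits described in _check_for_done's docstring reduce to a small decision function; this is only a summary of the control flow, with made-up argument names:

    def next_action(current_segment, num_segments, num_active_readers, k):
        # mirrors the three ways _check_for_done can proceed
        if current_segment == num_segments:
            return "done"                      # fire _done_deferred with plaintext
        if num_active_readers >= k:
            return "download_current_segment"  # enough good readers to continue
        return "add_active_peers"              # recruit more readers, or fail
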
2837hunk ./src/allmydata/test/test_mutable.py 12
2838 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
2839      ssk_pubkey_fingerprint_hash
2840 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
2841-     NotEnoughSharesError
2842+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
2843 from allmydata.monitor import Monitor
2844 from allmydata.test.common import ShouldFailMixin
2845 from allmydata.test.no_network import GridTestMixin
2846hunk ./src/allmydata/test/test_mutable.py 28
2847 from allmydata.mutable.retrieve import Retrieve
2848 from allmydata.mutable.publish import Publish
2849 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
2850-from allmydata.mutable.layout import unpack_header, unpack_share
2851+from allmydata.mutable.layout import unpack_header, unpack_share, \
2852+                                     MDMFSlotReadProxy
2853 from allmydata.mutable.repairer import MustForceRepairError
2854 
2855 import allmydata.test.common_util as testutil
2856hunk ./src/allmydata/test/test_mutable.py 104
2857         d = fireEventually()
2858         d.addCallback(lambda res: _call())
2859         return d
2860+
2861     def callRemoteOnly(self, methname, *args, **kwargs):
2862         d = self.callRemote(methname, *args, **kwargs)
2863         d.addBoth(lambda ignore: None)
2864hunk ./src/allmydata/test/test_mutable.py 163
2865 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
2866     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
2867     # list of shnums to corrupt.
2868+    ds = []
2869     for peerid in s._peers:
2870         shares = s._peers[peerid]
2871         for shnum in shares:
2872hunk ./src/allmydata/test/test_mutable.py 190
2873                 else:
2874                     offset1 = offset
2875                     offset2 = 0
2876-                if offset1 == "pubkey":
2877+                if offset1 == "pubkey" and IV:
2878                     real_offset = 107
2879hunk ./src/allmydata/test/test_mutable.py 192
2880+                elif offset1 == "share_data" and not IV:
2881+                    real_offset = 104
2882                 elif offset1 in o:
2883                     real_offset = o[offset1]
2884                 else:
2885hunk ./src/allmydata/test/test_mutable.py 327
2886         d.addCallback(_created)
2887         return d
2888 
2889+
2890+    def test_upload_and_download_mdmf(self):
2891+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
2892+        def _created(n):
2893+            d = defer.succeed(None)
2894+            d.addCallback(lambda ignored:
2895+                n.get_servermap(MODE_READ))
2896+            def _then(servermap):
2897+                dumped = servermap.dump(StringIO())
2898+                self.failUnlessIn("3-of-10", dumped.getvalue())
2899+            d.addCallback(_then)
2900+            # Now overwrite the contents with some new contents. We want
2901+            # to make them big enough to force the file to be uploaded
2902+            # in more than one segment.
2903+            big_contents = "contents1" * 100000 # about 900 KiB
2904+            d.addCallback(lambda ignored:
2905+                n.overwrite(big_contents))
2906+            d.addCallback(lambda ignored:
2907+                n.download_best_version())
2908+            d.addCallback(lambda data:
2909+                self.failUnlessEqual(data, big_contents))
2910+            # Overwrite the contents again with some new contents. As
2911+            # before, they need to be big enough to force multiple
2912+            # segments, so that we make the downloader deal with
2913+            # multiple segments.
2914+            bigger_contents = "contents2" * 1000000 # about 9MiB
2915+            d.addCallback(lambda ignored:
2916+                n.overwrite(bigger_contents))
2917+            d.addCallback(lambda ignored:
2918+                n.download_best_version())
2919+            d.addCallback(lambda data:
2920+                self.failUnlessEqual(data, bigger_contents))
2921+            return d
2922+        d.addCallback(_created)
2923+        return d
2924+
2925+
2926     def test_create_with_initial_contents(self):
2927         d = self.nodemaker.create_mutable_file("contents 1")
2928         def _created(n):
2929hunk ./src/allmydata/test/test_mutable.py 1147
2930 
2931 
2932     def _test_corrupt_all(self, offset, substring,
2933-                          should_succeed=False, corrupt_early=True,
2934-                          failure_checker=None):
2935+                          should_succeed=False,
2936+                          corrupt_early=True,
2937+                          failure_checker=None,
2938+                          fetch_privkey=False):
2939         d = defer.succeed(None)
2940         if corrupt_early:
2941             d.addCallback(corrupt, self._storage, offset)
2942hunk ./src/allmydata/test/test_mutable.py 1167
2943                     self.failUnlessIn(substring, "".join(allproblems))
2944                 return servermap
2945             if should_succeed:
2946-                d1 = self._fn.download_version(servermap, ver)
2947+                d1 = self._fn.download_version(servermap, ver,
2948+                                               fetch_privkey)
2949                 d1.addCallback(lambda new_contents:
2950                                self.failUnlessEqual(new_contents, self.CONTENTS))
2951             else:
2952hunk ./src/allmydata/test/test_mutable.py 1175
2953                 d1 = self.shouldFail(NotEnoughSharesError,
2954                                      "_corrupt_all(offset=%s)" % (offset,),
2955                                      substring,
2956-                                     self._fn.download_version, servermap, ver)
2957+                                     self._fn.download_version, servermap,
2958+                                                                ver,
2959+                                                                fetch_privkey)
2960             if failure_checker:
2961                 d1.addCallback(failure_checker)
2962             d1.addCallback(lambda res: servermap)
2963hunk ./src/allmydata/test/test_mutable.py 1186
2964         return d
2965 
2966     def test_corrupt_all_verbyte(self):
2967-        # when the version byte is not 0, we hit an UnknownVersionError error
2968-        # in unpack_share().
2969+        # when the version byte is not 0 or 1, we hit an
2970+        # UnknownVersionError in unpack_share().
2971         d = self._test_corrupt_all(0, "UnknownVersionError")
2972         def _check_servermap(servermap):
2973             # and the dump should mention the problems
2974hunk ./src/allmydata/test/test_mutable.py 1193
2975             s = StringIO()
2976             dump = servermap.dump(s).getvalue()
2977-            self.failUnless("10 PROBLEMS" in dump, dump)
2978+            self.failUnless("30 PROBLEMS" in dump, dump)
2979         d.addCallback(_check_servermap)
2980         return d
2981 
2982hunk ./src/allmydata/test/test_mutable.py 1263
2983         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
2984 
2985 
2986+    def test_corrupt_all_encprivkey_late(self):
2987+        # this should work for the same reason as above, but we corrupt
2988+        # after the servermap update to exercise the error handling
2989+        # code.
2990+        # We need to remove the privkey from the node, or the retrieve
2991+        # process won't know to update it.
2992+        self._fn._privkey = None
2993+        return self._test_corrupt_all("enc_privkey",
2994+                                      None, # this shouldn't fail
2995+                                      should_succeed=True,
2996+                                      corrupt_early=False,
2997+                                      fetch_privkey=True)
2998+
2999+
3000     def test_corrupt_all_seqnum_late(self):
3001         # corrupting the seqnum between mapupdate and retrieve should result
3002         # in NotEnoughSharesError, since each share will look invalid
3003hunk ./src/allmydata/test/test_mutable.py 1283
3004         def _check(res):
3005             f = res[0]
3006             self.failUnless(f.check(NotEnoughSharesError))
3007-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
3008+            self.failUnless("uncoordinated write" in str(f))
3009         return self._test_corrupt_all(1, "ran out of peers",
3010                                       corrupt_early=False,
3011                                       failure_checker=_check)
3012hunk ./src/allmydata/test/test_mutable.py 1333
3013                       self.failUnlessEqual(new_contents, self.CONTENTS))
3014         return d
3015 
3016-    def test_corrupt_some(self):
3017-        # corrupt the data of first five shares (so the servermap thinks
3018-        # they're good but retrieve marks them as bad), so that the
3019-        # MODE_READ set of 6 will be insufficient, forcing node.download to
3020-        # retry with more servers.
3021-        corrupt(None, self._storage, "share_data", range(5))
3022-        d = self.make_servermap()
3023+
3024+    def _test_corrupt_some(self, offset, mdmf=False):
3025+        if mdmf:
3026+            d = self.publish_mdmf()
3027+        else:
3028+            d = defer.succeed(None)
3029+        d.addCallback(lambda ignored:
3030+            corrupt(None, self._storage, offset, range(5)))
3031+        d.addCallback(lambda ignored:
3032+            self.make_servermap())
3033         def _do_retrieve(servermap):
3034             ver = servermap.best_recoverable_version()
3035             self.failUnless(ver)
3036hunk ./src/allmydata/test/test_mutable.py 1349
3037             return self._fn.download_best_version()
3038         d.addCallback(_do_retrieve)
3039         d.addCallback(lambda new_contents:
3040-                      self.failUnlessEqual(new_contents, self.CONTENTS))
3041+            self.failUnlessEqual(new_contents, self.CONTENTS))
3042         return d
3043 
3044hunk ./src/allmydata/test/test_mutable.py 1352
3045+
3046+    def test_corrupt_some(self):
3047+        # corrupt the data of first five shares (so the servermap thinks
3048+        # they're good but retrieve marks them as bad), so that the
3049+        # MODE_READ set of 6 will be insufficient, forcing node.download to
3050+        # retry with more servers.
3051+        return self._test_corrupt_some("share_data")
3052+
3053+
3054     def test_download_fails(self):
3055         d = corrupt(None, self._storage, "signature")
3056         d.addCallback(lambda ignored:
3057hunk ./src/allmydata/test/test_mutable.py 1366
3058             self.shouldFail(UnrecoverableFileError, "test_download_anyway",
3059                             "no recoverable versions",
3060-                            self._fn.download_best_version)
3061+                            self._fn.download_best_version))
3062         return d
3063 
3064 
3065hunk ./src/allmydata/test/test_mutable.py 1370
3066+
3067+    def test_corrupt_mdmf_block_hash_tree(self):
3068+        d = self.publish_mdmf()
3069+        d.addCallback(lambda ignored:
3070+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3071+                                   "block hash tree failure",
3072+                                   corrupt_early=True,
3073+                                   should_succeed=False))
3074+        return d
3075+
3076+
3077+    def test_corrupt_mdmf_block_hash_tree_late(self):
3078+        d = self.publish_mdmf()
3079+        d.addCallback(lambda ignored:
3080+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3081+                                   "block hash tree failure",
3082+                                   corrupt_early=False,
3083+                                   should_succeed=False))
3084+        return d
3085+
3086+
3087+    def test_corrupt_mdmf_share_data(self):
3088+        d = self.publish_mdmf()
3089+        d.addCallback(lambda ignored:
3090+            # TODO: Find out what the block size is and corrupt a
3091+            # specific block, rather than just guessing.
3092+            self._test_corrupt_all(("share_data", 12 * 40),
3093+                                    "block hash tree failure",
3094+                                    corrupt_early=True,
3095+                                    should_succeed=False))
3096+        return d
3097+
3098+
3099+    def test_corrupt_some_mdmf(self):
3100+        return self._test_corrupt_some(("share_data", 12 * 40),
3101+                                       mdmf=True)
3102+
3103+
3104 class CheckerMixin:
3105     def check_good(self, r, where):
3106         self.failUnless(r.is_healthy(), where)
3107hunk ./src/allmydata/test/test_mutable.py 2116
3108             d.addCallback(lambda res:
3109                           self.shouldFail(NotEnoughSharesError,
3110                                           "test_retrieve_surprise",
3111-                                          "ran out of peers: have 0 shares (k=3)",
3112+                                          "ran out of peers: have 0 of 1",
3113                                           n.download_version,
3114                                           self.old_map,
3115                                           self.old_map.best_recoverable_version(),
3116hunk ./src/allmydata/test/test_mutable.py 2125
3117         d.addCallback(_created)
3118         return d
3119 
3120+
3121     def test_unexpected_shares(self):
3122         # upload the file, take a servermap, shut down one of the servers,
3123         # upload it again (causing shares to appear on a new server), then
3124hunk ./src/allmydata/test/test_mutable.py 2329
3125         self.basedir = "mutable/Problems/test_privkey_query_missing"
3126         self.set_up_grid(num_servers=20)
3127         nm = self.g.clients[0].nodemaker
3128-        LARGE = "These are Larger contents" * 2000 # about 50KB
3129+        LARGE = "These are Larger contents" * 2000 # about 50KiB
3130         nm._node_cache = DevNullDictionary() # disable the nodecache
3131 
3132         d = nm.create_mutable_file(LARGE)
3133hunk ./src/allmydata/test/test_mutable.py 2342
3134         d.addCallback(_created)
3135         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
3136         return d
3137+
3138+
3139+    def test_block_and_hash_query_error(self):
3140+        # This tests for what happens when a query to a remote server
3141+        # fails in either the hash validation step or the block getting
3142+        # step (because of batching, this is the same actual query).
3143+        # We need to have the storage server persist up until the point
3144+        # that its prefix is validated, then suddenly die. This
3145+        # exercises some exception handling code in Retrieve.
3146+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
3147+        self.set_up_grid(num_servers=20)
3148+        nm = self.g.clients[0].nodemaker
3149+        CONTENTS = "contents" * 2000
3150+        d = nm.create_mutable_file(CONTENTS)
3151+        def _created(node):
3152+            self._node = node
3153+        d.addCallback(_created)
3154+        d.addCallback(lambda ignored:
3155+            self._node.get_servermap(MODE_READ))
3156+        def _then(servermap):
3157+            # we have our servermap. Now we set up the servers like the
3158+            # tests above -- the first one that gets a read call should
3159+            # start throwing errors, but only after returning its prefix
3160+            # for validation. Since we'll download without fetching the
3161+            # private key, the next query to the remote server will be
3162+            # for either a block and salt or for hashes, either of which
3163+            # will exercise the error handling code.
3164+            killer = FirstServerGetsKilled()
3165+            for (serverid, ss) in nm.storage_broker.get_all_servers():
3166+                ss.post_call_notifier = killer.notify
3167+            ver = servermap.best_recoverable_version()
3168+            assert ver
3169+            return self._node.download_version(servermap, ver)
3170+        d.addCallback(_then)
3171+        d.addCallback(lambda data:
3172+            self.failUnlessEqual(data, CONTENTS))
3173+        return d
3174}
3175[mutable/checker.py: check MDMF files
3176Kevan Carstensen <kevan@isnotajoke.com>**20100628225048
3177 Ignore-this: fb697b36285d60552df6ca5ac6a37629
3178 
3179 This patch adapts the mutable file checker and verifier to check and
3180 verify MDMF files. It does this by using the new segmented downloader,
3181 which is trained to perform verification operations on request. This
3182 removes some code duplication.
3183] {
3184hunk ./src/allmydata/mutable/checker.py 12
3185 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
3186 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
3187 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
3188+from allmydata.mutable.retrieve import Retrieve # for verifying
3189 
3190 class MutableChecker:
3191 
3192hunk ./src/allmydata/mutable/checker.py 29
3193 
3194     def check(self, verify=False, add_lease=False):
3195         servermap = ServerMap()
3196+        # Updating the servermap in MODE_CHECK will stand a good chance
3197+        # of finding all of the shares, and getting a good idea of
3198+        # recoverability, etc, without verifying.
3199         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
3200                              servermap, MODE_CHECK, add_lease=add_lease)
3201         if self._history:
3202hunk ./src/allmydata/mutable/checker.py 55
3203         if num_recoverable:
3204             self.best_version = servermap.best_recoverable_version()
3205 
3206+        # The file is unhealthy and needs to be repaired if:
3207+        # - There are unrecoverable versions.
3208         if servermap.unrecoverable_versions():
3209             self.need_repair = True
3210hunk ./src/allmydata/mutable/checker.py 59
3211+        # - There isn't exactly one recoverable version.
3212         if num_recoverable != 1:
3213             self.need_repair = True
3214hunk ./src/allmydata/mutable/checker.py 62
3215+        # - The best recoverable version is missing some shares.
3216         if self.best_version:
3217             available_shares = servermap.shares_available()
3218             (num_distinct_shares, k, N) = available_shares[self.best_version]
3219hunk ./src/allmydata/mutable/checker.py 73
3220 
3221     def _verify_all_shares(self, servermap):
3222         # read every byte of each share
3223+        #
3224+        # This logic is going to be very nearly the same as the
3225+        # downloader. I bet we could pass the downloader a flag that
3226+        # makes it do this, and piggyback onto that instead of
3227+        # duplicating a bunch of code.
3228+        #
3229+        # Like:
3230+        #  r = Retrieve(blah, blah, blah, verify=True)
3231+        #  d = r.download()
3232+        #  (wait, wait, wait, d.callback)
3233+        # 
3234+        #  Then, when it has finished, we can check the servermap (which
3235+        #  we provided to Retrieve) to figure out which shares are bad,
3236+        #  since the Retrieve process will have updated the servermap as
3237+        #  it went along.
3238+        #
3239+        #  By passing the verify=True flag to the constructor, we are
3240+        #  telling the downloader a few things.
3241+        #
3242+        #  1. It needs to download all N shares, not just K shares.
3243+        #  2. It doesn't need to decrypt or decode the shares, only
3244+        #     verify them.
3245         if not self.best_version:
3246             return
3247hunk ./src/allmydata/mutable/checker.py 97
3248-        versionmap = servermap.make_versionmap()
3249-        shares = versionmap[self.best_version]
3250-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3251-         offsets_tuple) = self.best_version
3252-        offsets = dict(offsets_tuple)
3253-        readv = [ (0, offsets["EOF"]) ]
3254-        dl = []
3255-        for (shnum, peerid, timestamp) in shares:
3256-            ss = servermap.connections[peerid]
3257-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
3258-            d.addCallback(self._got_answer, peerid, servermap)
3259-            dl.append(d)
3260-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
3261 
3262hunk ./src/allmydata/mutable/checker.py 98
3263-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
3264-        # isolate the callRemote to a separate method, so tests can subclass
3265-        # Publish and override it
3266-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
3267+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
3268+        d = r.download()
3269+        d.addCallback(self._process_bad_shares)
3270         return d
3271 
3272hunk ./src/allmydata/mutable/checker.py 103
3273-    def _got_answer(self, datavs, peerid, servermap):
3274-        for shnum,datav in datavs.items():
3275-            data = datav[0]
3276-            try:
3277-                self._got_results_one_share(shnum, peerid, data)
3278-            except CorruptShareError:
3279-                f = failure.Failure()
3280-                self.need_repair = True
3281-                self.bad_shares.append( (peerid, shnum, f) )
3282-                prefix = data[:SIGNED_PREFIX_LENGTH]
3283-                servermap.mark_bad_share(peerid, shnum, prefix)
3284-                ss = servermap.connections[peerid]
3285-                self.notify_server_corruption(ss, shnum, str(f.value))
3286-
3287-    def check_prefix(self, peerid, shnum, data):
3288-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3289-         offsets_tuple) = self.best_version
3290-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
3291-        if got_prefix != prefix:
3292-            raise CorruptShareError(peerid, shnum,
3293-                                    "prefix mismatch: share changed while we were reading it")
3294-
3295-    def _got_results_one_share(self, shnum, peerid, data):
3296-        self.check_prefix(peerid, shnum, data)
3297-
3298-        # the [seqnum:signature] pieces are validated by _compare_prefix,
3299-        # which checks their signature against the pubkey known to be
3300-        # associated with this file.
3301 
3302hunk ./src/allmydata/mutable/checker.py 104
3303-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
3304-         share_hash_chain, block_hash_tree, share_data,
3305-         enc_privkey) = unpack_share(data)
3306-
3307-        # validate [share_hash_chain,block_hash_tree,share_data]
3308-
3309-        leaves = [hashutil.block_hash(share_data)]
3310-        t = hashtree.HashTree(leaves)
3311-        if list(t) != block_hash_tree:
3312-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
3313-        share_hash_leaf = t[0]
3314-        t2 = hashtree.IncompleteHashTree(N)
3315-        # root_hash was checked by the signature
3316-        t2.set_hashes({0: root_hash})
3317-        try:
3318-            t2.set_hashes(hashes=share_hash_chain,
3319-                          leaves={shnum: share_hash_leaf})
3320-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
3321-                IndexError), e:
3322-            msg = "corrupt hashes: %s" % (e,)
3323-            raise CorruptShareError(peerid, shnum, msg)
3324-
3325-        # validate enc_privkey: only possible if we have a write-cap
3326-        if not self._node.is_readonly():
3327-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3328-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3329-            if alleged_writekey != self._node.get_writekey():
3330-                raise CorruptShareError(peerid, shnum, "invalid privkey")
3331+    def _process_bad_shares(self, bad_shares):
3332+        if bad_shares:
3333+            self.need_repair = True
3334+        self.bad_shares = bad_shares
3335 
3336hunk ./src/allmydata/mutable/checker.py 109
3337-    def notify_server_corruption(self, ss, shnum, reason):
3338-        ss.callRemoteOnly("advise_corrupt_share",
3339-                          "mutable", self._storage_index, shnum, reason)
3340 
3341     def _count_shares(self, smap, version):
3342         available_shares = smap.shares_available()
3343hunk ./src/allmydata/test/test_mutable.py 193
3344                 if offset1 == "pubkey" and IV:
3345                     real_offset = 107
3346                 elif offset1 == "share_data" and not IV:
3347-                    real_offset = 104
3348+                    real_offset = 107
3349                 elif offset1 in o:
3350                     real_offset = o[offset1]
3351                 else:
3352hunk ./src/allmydata/test/test_mutable.py 395
3353             return d
3354         d.addCallback(_created)
3355         return d
3356+    test_create_mdmf_with_initial_contents.timeout = 20
3357 
3358 
3359     def test_create_with_initial_contents_function(self):
3360hunk ./src/allmydata/test/test_mutable.py 700
3361                                            k, N, segsize, datalen)
3362                 self.failUnless(p._pubkey.verify(sig_material, signature))
3363                 #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
3364-                self.failUnless(isinstance(share_hash_chain, dict))
3365-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3366+                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3367                 for shnum,share_hash in share_hash_chain.items():
3368                     self.failUnless(isinstance(shnum, int))
3369                     self.failUnless(isinstance(share_hash, str))
3370hunk ./src/allmydata/test/test_mutable.py 820
3371                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
3372 
3373 
3374+
3375+
3376 class Servermap(unittest.TestCase, PublishMixin):
3377     def setUp(self):
3378         return self.publish_one()
3379hunk ./src/allmydata/test/test_mutable.py 951
3380         self._storage._peers = {} # delete all shares
3381         ms = self.make_servermap
3382         d = defer.succeed(None)
3383-
3384+#
3385         d.addCallback(lambda res: ms(mode=MODE_CHECK))
3386         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
3387 
3388hunk ./src/allmydata/test/test_mutable.py 1440
3389         d.addCallback(self.check_good, "test_check_good")
3390         return d
3391 
3392+    def test_check_mdmf_good(self):
3393+        d = self.publish_mdmf()
3394+        d.addCallback(lambda ignored:
3395+            self._fn.check(Monitor()))
3396+        d.addCallback(self.check_good, "test_check_mdmf_good")
3397+        return d
3398+
3399     def test_check_no_shares(self):
3400         for shares in self._storage._peers.values():
3401             shares.clear()
3402hunk ./src/allmydata/test/test_mutable.py 1454
3403         d.addCallback(self.check_bad, "test_check_no_shares")
3404         return d
3405 
3406+    def test_check_mdmf_no_shares(self):
3407+        d = self.publish_mdmf()
3408+        def _then(ignored):
3409+            for share in self._storage._peers.values():
3410+                share.clear()
3411+        d.addCallback(_then)
3412+        d.addCallback(lambda ignored:
3413+            self._fn.check(Monitor()))
3414+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
3415+        return d
3416+
3417     def test_check_not_enough_shares(self):
3418         for shares in self._storage._peers.values():
3419             for shnum in shares.keys():
3420hunk ./src/allmydata/test/test_mutable.py 1474
3421         d.addCallback(self.check_bad, "test_check_not_enough_shares")
3422         return d
3423 
3424+    def test_check_mdmf_not_enough_shares(self):
3425+        d = self.publish_mdmf()
3426+        def _then(ignored):
3427+            for shares in self._storage._peers.values():
3428+                for shnum in shares.keys():
3429+                    if shnum > 0:
3430+                        del shares[shnum]
3431+        d.addCallback(_then)
3432+        d.addCallback(lambda ignored:
3433+            self._fn.check(Monitor()))
3434+        d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares")
3435+        return d
3436+
3437+
3438     def test_check_all_bad_sig(self):
3439         d = corrupt(None, self._storage, 1) # bad sig
3440         d.addCallback(lambda ignored:
3441hunk ./src/allmydata/test/test_mutable.py 1495
3442         d.addCallback(self.check_bad, "test_check_all_bad_sig")
3443         return d
3444 
3445+    def test_check_mdmf_all_bad_sig(self):
3446+        d = self.publish_mdmf()
3447+        d.addCallback(lambda ignored:
3448+            corrupt(None, self._storage, 1))
3449+        d.addCallback(lambda ignored:
3450+            self._fn.check(Monitor()))
3451+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
3452+        return d
3453+
3454     def test_check_all_bad_blocks(self):
3455         d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
3456         # the Checker won't notice this.. it doesn't look at actual data
3457hunk ./src/allmydata/test/test_mutable.py 1512
3458         d.addCallback(self.check_good, "test_check_all_bad_blocks")
3459         return d
3460 
3461+
3462+    def test_check_mdmf_all_bad_blocks(self):
3463+        d = self.publish_mdmf()
3464+        d.addCallback(lambda ignored:
3465+            corrupt(None, self._storage, "share_data"))
3466+        d.addCallback(lambda ignored:
3467+            self._fn.check(Monitor()))
3468+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
3469+        return d
3470+
3471     def test_verify_good(self):
3472         d = self._fn.check(Monitor(), verify=True)
3473         d.addCallback(self.check_good, "test_verify_good")
3474hunk ./src/allmydata/test/test_mutable.py 1582
3475                       "test_verify_one_bad_encprivkey_uncheckable")
3476         return d
3477 
3478+
3479+    def test_verify_mdmf_good(self):
3480+        d = self.publish_mdmf()
3481+        d.addCallback(lambda ignored:
3482+            self._fn.check(Monitor(), verify=True))
3483+        d.addCallback(self.check_good, "test_verify_mdmf_good")
3484+        return d
3485+
3486+
3487+    def test_verify_mdmf_one_bad_block(self):
3488+        d = self.publish_mdmf()
3489+        d.addCallback(lambda ignored:
3490+            corrupt(None, self._storage, "share_data", [1]))
3491+        d.addCallback(lambda ignored:
3492+            self._fn.check(Monitor(), verify=True))
3493+        # We should find one bad block here
3494+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
3495+        d.addCallback(self.check_expected_failure,
3496+                      CorruptShareError, "block hash tree failure",
3497+                      "test_verify_mdmf_one_bad_block")
3498+        return d
3499+
3500+
3501+    def test_verify_mdmf_bad_encprivkey(self):
3502+        d = self.publish_mdmf()
3503+        d.addCallback(lambda ignored:
3504+            corrupt(None, self._storage, "enc_privkey", [1]))
3505+        d.addCallback(lambda ignored:
3506+            self._fn.check(Monitor(), verify=True))
3507+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
3508+        d.addCallback(self.check_expected_failure,
3509+                      CorruptShareError, "privkey",
3510+                      "test_verify_mdmf_bad_encprivkey")
3511+        return d
3512+
3513+
3514+    def test_verify_mdmf_bad_sig(self):
3515+        d = self.publish_mdmf()
3516+        d.addCallback(lambda ignored:
3517+            corrupt(None, self._storage, 1, [1]))
3518+        d.addCallback(lambda ignored:
3519+            self._fn.check(Monitor(), verify=True))
3520+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
3521+        return d
3522+
3523+
3524+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
3525+        d = self.publish_mdmf()
3526+        d.addCallback(lambda ignored:
3527+            corrupt(None, self._storage, "enc_privkey", [1]))
3528+        d.addCallback(lambda ignored:
3529+            self._fn.get_readonly())
3530+        d.addCallback(lambda fn:
3531+            fn.check(Monitor(), verify=True))
3532+        d.addCallback(self.check_good,
3533+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
3534+        return d
3535+
3536+
3537 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
3538 
3539     def get_shares(self, s):
3540hunk ./src/allmydata/test/test_mutable.py 1706
3541         current_shares = self.old_shares[-1]
3542         self.failUnlessEqual(old_shares, current_shares)
3543 
3544+
3545     def test_unrepairable_0shares(self):
3546         d = self.publish_one()
3547         def _delete_all_shares(ign):
3548hunk ./src/allmydata/test/test_mutable.py 1721
3549         d.addCallback(_check)
3550         return d
3551 
3552+    def test_mdmf_unrepairable_0shares(self):
3553+        d = self.publish_mdmf()
3554+        def _delete_all_shares(ign):
3555+            shares = self._storage._peers
3556+            for peerid in shares:
3557+                shares[peerid] = {}
3558+        d.addCallback(_delete_all_shares)
3559+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3560+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3561+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
3562+        return d
3563+
3564+
3565     def test_unrepairable_1share(self):
3566         d = self.publish_one()
3567         def _delete_all_shares(ign):
3568hunk ./src/allmydata/test/test_mutable.py 1750
3569         d.addCallback(_check)
3570         return d
3571 
3572+    def test_mdmf_unrepairable_1share(self):
3573+        d = self.publish_mdmf()
3574+        def _delete_all_shares(ign):
3575+            shares = self._storage._peers
3576+            for peerid in shares:
3577+                for shnum in list(shares[peerid]):
3578+                    if shnum > 0:
3579+                        del shares[peerid][shnum]
3580+        d.addCallback(_delete_all_shares)
3581+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3582+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3583+        def _check(crr):
3584+            self.failUnlessEqual(crr.get_successful(), False)
3585+        d.addCallback(_check)
3586+        return d
3587+
3588+    def test_repairable_5shares(self):
3589+        d = self.publish_mdmf()
3590+        def _delete_all_shares(ign):
3591+            shares = self._storage._peers
3592+            for peerid in shares:
3593+                for shnum in list(shares[peerid]):
3594+                    if shnum > 4:
3595+                        del shares[peerid][shnum]
3596+        d.addCallback(_delete_all_shares)
3597+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3598+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3599+        def _check(crr):
3600+            self.failUnlessEqual(crr.get_successful(), True)
3601+        d.addCallback(_check)
3602+        return d
3603+
3604+    def test_mdmf_repairable_5shares(self):
3605+        d = self.publish_mdmf()
3606+        def _delete_all_shares(ign):
3607+            shares = self._storage._peers
3608+            for peerid in shares:
3609+                for shnum in list(shares[peerid]):
3610+                    if shnum > 5:
3611+                        del shares[peerid][shnum]
3612+        d.addCallback(_delete_all_shares)
3613+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3614+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3615+        def _check(crr):
3616+            self.failUnlessEqual(crr.get_successful(), True)
3617+        d.addCallback(_check)
3618+        return d
3619+
3620+
3621     def test_merge(self):
3622         self.old_shares = []
3623         d = self.publish_multiple()
3624}
3625[mutable/retrieve.py: learn how to verify mutable files
3626Kevan Carstensen <kevan@isnotajoke.com>**20100628225201
3627 Ignore-this: 989af7800c47589620918461ec989483
3628] {
3629hunk ./src/allmydata/mutable/retrieve.py 86
3630     # Retrieve object will remain tied to a specific version of the file, and
3631     # will use a single ServerMap instance.
3632 
3633-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
3634+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
3635+                 verify=False):
3636         self._node = filenode
3637         assert self._node.get_pubkey()
3638         self._storage_index = filenode.get_storage_index()
3639hunk ./src/allmydata/mutable/retrieve.py 106
3640         # during repair, we may be called upon to grab the private key, since
3641         # it wasn't picked up during a verify=False checker run, and we'll
3642         # need it for repair to generate a new version.
3643-        self._need_privkey = fetch_privkey
3644-        if self._node.get_privkey():
3645+        self._need_privkey = fetch_privkey or verify
3646+        if self._node.get_privkey() and not verify:
3647             self._need_privkey = False
3648 
3649         if self._need_privkey:
3650hunk ./src/allmydata/mutable/retrieve.py 117
3651             self._privkey_query_markers = [] # one Marker for each time we've
3652                                              # tried to get the privkey.
3653 
3654+        # verify means that we are using the downloader logic to verify all
3655+        # of our shares. This tells the downloader a few things.
3656+        #
3657+        # 1. We need to download all of the shares.
3658+        # 2. We don't need to decode or decrypt the shares, since our
3659+        #    caller doesn't care about the plaintext, only the
3660+        #    information about which shares are or are not valid.
3661+        # 3. When we are validating readers, we need to validate the
3662+        #    signature on the prefix -- though this may be redundant,
3663+        #    since the servermap update already does this.
3664+        #
3665+        # (just work on 1 and 2 for now, I guess)
3666+        self._verify = False
3667+        if verify:
3668+            self._verify = True
3669+
3670         self._status = RetrieveStatus()
3671         self._status.set_storage_index(self._storage_index)
3672         self._status.set_helper(False)
3673hunk ./src/allmydata/mutable/retrieve.py 323
3674 
3675         # We need at least self._required_shares readers to download a
3676         # segment.
3677-        needed = self._required_shares - len(self._active_readers)
3678+        if self._verify:
3679+            needed = self._total_shares
3680+        else:
3681+            needed = self._required_shares - len(self._active_readers)
3682         # XXX: Why don't format= log messages work here?
3683         self.log("adding %d peers to the active peers list" % needed)
3684 
3685hunk ./src/allmydata/mutable/retrieve.py 339
3686         # will cause problems later.
3687         active_shnums -= set([reader.shnum for reader in self._active_readers])
3688         active_shnums = list(active_shnums)[:needed]
3689-        if len(active_shnums) < needed:
3690+        if len(active_shnums) < needed and not self._verify:
3691             # We don't have enough readers to retrieve the file; fail.
3692             return self._failed()
3693 
3694hunk ./src/allmydata/mutable/retrieve.py 346
3695         for shnum in active_shnums:
3696             self._active_readers.append(self.readers[shnum])
3697             self.log("added reader for share %d" % shnum)
3698-        assert len(self._active_readers) == self._required_shares
3699+        assert len(self._active_readers) >= self._required_shares
3700         # Conceptually, this is part of the _add_active_peers step. It
3701         # validates the prefixes of newly added readers to make sure
3702         # that they match what we are expecting for self.verinfo. If
3703hunk ./src/allmydata/mutable/retrieve.py 416
3704                     # that we haven't gotten it at the end of
3705                     # segment decoding, then we'll take more drastic
3706                     # measures.
3707-                    if self._need_privkey:
3708+                    if self._need_privkey and not self._node.is_readonly():
3709                         d = reader.get_encprivkey()
3710                         d.addCallback(self._try_to_validate_privkey, reader)
3711             if bad_readers:
3712hunk ./src/allmydata/mutable/retrieve.py 423
3713                 # We do them all at once, or else we screw up list indexing.
3714                 for (reader, f) in bad_readers:
3715                     self._mark_bad_share(reader, f)
3716-                return self._add_active_peers()
3717+                if self._verify:
3718+                    if len(self._active_readers) >= self._required_shares:
3719+                        return self._download_current_segment()
3720+                    else:
3721+                        return self._failed()
3722+                else:
3723+                    return self._add_active_peers()
3724             else:
3725                 return self._download_current_segment()
3726             # The next step will assert that it has enough active
3727hunk ./src/allmydata/mutable/retrieve.py 518
3728         """
3729         self.log("marking share %d on server %s as bad" % \
3730                  (reader.shnum, reader))
3731+        prefix = self.verinfo[-2]
3732+        self.servermap.mark_bad_share(reader.peerid,
3733+                                      reader.shnum,
3734+                                      prefix)
3735         self._remove_reader(reader)
3736hunk ./src/allmydata/mutable/retrieve.py 523
3737-        self._bad_shares.add((reader.peerid, reader.shnum))
3738+        self._bad_shares.add((reader.peerid, reader.shnum, f))
3739         self._status.problems[reader.peerid] = f
3740         self._last_failure = f
3741         self.notify_server_corruption(reader.peerid, reader.shnum,
3742hunk ./src/allmydata/mutable/retrieve.py 571
3743             ds.append(dl)
3744             reader.flush()
3745         dl = defer.DeferredList(ds)
3746-        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3747+        if self._verify:
3748+            dl.addCallback(lambda ignored: "")
3749+            dl.addCallback(self._set_segment)
3750+        else:
3751+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3752         return dl
3753 
3754 
3755hunk ./src/allmydata/mutable/retrieve.py 701
3756         # shnum, which will be a leaf in the share hash tree, which
3757         # will allow us to validate the rest of the tree.
3758         if self.share_hash_tree.needed_hashes(reader.shnum,
3759-                                               include_leaf=True):
3760+                                              include_leaf=True) or \
3761+                                              self._verify:
3762             try:
3763                 self.share_hash_tree.set_hashes(hashes=sharehashes[1],
3764                                             leaves={reader.shnum: bht[0]})
3765hunk ./src/allmydata/mutable/retrieve.py 832
3766 
3767 
3768     def _try_to_validate_privkey(self, enc_privkey, reader):
3769-
3770         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3771         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3772         if alleged_writekey != self._node.get_writekey():
3773hunk ./src/allmydata/mutable/retrieve.py 838
3774             self.log("invalid privkey from %s shnum %d" %
3775                      (reader, reader.shnum),
3776                      level=log.WEIRD, umid="YIw4tA")
3777+            if self._verify:
3778+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
3779+                                              self.verinfo[-2])
3780+                e = CorruptShareError(reader.peerid,
3781+                                      reader.shnum,
3782+                                      "invalid privkey")
3783+                f = failure.Failure(e)
3784+                self._bad_shares.add((reader.peerid, reader.shnum, f))
3785             return
3786 
3787         # it's good
3788hunk ./src/allmydata/mutable/retrieve.py 904
3789         statements, I return the decrypted contents to the owner of this
3790         Retrieve object through self._done_deferred.
3791         """
3792-        eventually(self._done_deferred.callback, self._plaintext)
3793+        if self._verify:
3794+            ret = list(self._bad_shares)
3795+            self.log("done verifying, found %d bad shares" % len(ret))
3796+        else:
3797+            ret = self._plaintext
3798+        eventually(self._done_deferred.callback, ret)
3799 
3800 
3801     def _failed(self):
3802hunk ./src/allmydata/mutable/retrieve.py 920
3803         to the caller of this Retrieve object through
3804         self._done_deferred.
3805         """
3806-        format = ("ran out of peers: "
3807-                  "have %(have)d of %(total)d segments "
3808-                  "found %(bad)d bad shares "
3809-                  "encoding %(k)d-of-%(n)d")
3810-        args = {"have": self._current_segment,
3811-                "total": self._num_segments,
3812-                "k": self._required_shares,
3813-                "n": self._total_shares,
3814-                "bad": len(self._bad_shares)}
3815-        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
3816-                                                        str(self._last_failure)))
3817-        f = failure.Failure(e)
3818-        eventually(self._done_deferred.callback, f)
3819+        if self._verify:
3820+            ret = list(self._bad_shares)
3821+        else:
3822+            format = ("ran out of peers: "
3823+                      "have %(have)d of %(total)d segments "
3824+                      "found %(bad)d bad shares "
3825+                      "encoding %(k)d-of-%(n)d")
3826+            args = {"have": self._current_segment,
3827+                    "total": self._num_segments,
3828+                    "k": self._required_shares,
3829+                    "n": self._total_shares,
3830+                    "bad": len(self._bad_shares)}
3831+            e = NotEnoughSharesError("%s, last failure: %s" % \
3832+                                     (format % args, str(self._last_failure)))
3833+            f = failure.Failure(e)
3834+            ret = f
3835+        eventually(self._done_deferred.callback, ret)
3836}
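A minimal sketch of how a checker might drive the new verify mode, assuming Retrieve.download() remains the entry point (as it is for ordinary SDMF retrieval) and that node, servermap, and verinfo come from an earlier servermap update:

    from allmydata.mutable.retrieve import Retrieve

    r = Retrieve(node, servermap, verinfo, verify=True)
    d = r.download()
    def _verified(bad_shares):
        # In verify mode the Deferred fires with a list of
        # (peerid, shnum, failure) tuples instead of plaintext.
        for (peerid, shnum, f) in bad_shares:
            print "bad share %d on %s: %s" % (shnum, peerid, f)
    d.addCallback(_verified)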
3837[interfaces.py: add IMutableSlotWriter
3838Kevan Carstensen <kevan@isnotajoke.com>**20100630183305
3839 Ignore-this: ff9dca96ef1a009ae85485682f81ea5
3840] hunk ./src/allmydata/interfaces.py 418
3841         """
3842 
3843 
3844+class IMutableSlotWriter(Interface):
3845+    """
3846+    The interface for a writer around a mutable slot on a remote server.
3847+    """
3848+    def set_checkstring(checkstring, *args):
3849+        """
3850+        Set the checkstring that I will pass to the remote server when
3851+        writing.
3852+
3853+            @param checkstring A packed checkstring to use.
3854+
3855+        Note that implementations can differ in which semantics they
3856+        wish to support for set_checkstring -- they can, for example,
3857+        build the checkstring themselves from its constituents, or
3858+        some other thing.
3859+        """
3860+
3861+    def get_checkstring():
3862+        """
3863+        Get the checkstring that I think currently exists on the remote
3864+        server.
3865+        """
3866+
3867+    def put_block(data, segnum, salt):
3868+        """
3869+        Add a block and salt to the share.
3870+        """
3871+
3872+    def put_encprivkey(encprivkey):
3873+        """
3874+        Add the encrypted private key to the share.
3875+        """
3876+
3877+    def put_blockhashes(blockhashes=list):
3878+        """
3879+        Add the block hash tree to the share.
3880+        """
3881+
3882+    def put_sharehashes(sharehashes=dict):
3883+        """
3884+        Add the share hash chain to the share.
3885+        """
3886+
3887+    def get_signable():
3888+        """
3889+        Return the part of the share that needs to be signed.
3890+        """
3891+
3892+    def put_signature(signature):
3893+        """
3894+        Add the signature to the share.
3895+        """
3896+
3897+    def put_verification_key(verification_key):
3898+        """
3899+        Add the verification key to the share.
3900+        """
3901+
3902+    def finish_publishing():
3903+        """
3904+        Do anything necessary to finish writing the share to a remote
3905+        server. I require that no further publishing needs to take place
3906+        after this method has been called.
3907+        """
3908+
3909+
3910 class IURI(Interface):
3911     def init_from_string(uri):
3912         """Accept a string (as created by my to_string() method) and populate
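A minimal sketch of the call order that IMutableSlotWriter implies, assuming writer is some implementation and that block, salt, encprivkey, block_hash_tree, share_hash_chain, verification_key, and sign() are supplied by the uploader:

    d = writer.put_block(block, 0, salt)            # one call per segment
    d.addCallback(lambda ign: writer.put_encprivkey(encprivkey))
    d.addCallback(lambda ign: writer.put_blockhashes(block_hash_tree))   # list of hashes
    d.addCallback(lambda ign: writer.put_sharehashes(share_hash_chain))  # dict: shnum -> hash
    d.addCallback(lambda ign: writer.put_signature(sign(writer.get_signable())))
    d.addCallback(lambda ign: writer.put_verification_key(verification_key))
    d.addCallback(lambda ign: writer.finish_publishing())

Each method returns a Deferred, so a real caller chains the calls as above rather than assuming the writes are synchronous.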
3913[test/test_mutable.py: temporarily disable two tests that are now irrelevant
3914Kevan Carstensen <kevan@isnotajoke.com>**20100701232806
3915 Ignore-this: 701e143567f3954812ca6960af1d6ac7
3916] {
3917hunk ./src/allmydata/test/test_mutable.py 651
3918             self.failUnlessEqual(len(share_ids), 10)
3919         d.addCallback(_done)
3920         return d
3921+    test_encrypt.todo = "Write an equivalent of this for the new uploader"
3922 
3923     def test_generate(self):
3924         nm = make_nodemaker()
3925hunk ./src/allmydata/test/test_mutable.py 713
3926                 self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
3927         d.addCallback(_generated)
3928         return d
3929+    test_generate.todo = "Write an equivalent of this for the new uploader"
3930 
3931     # TODO: when we publish to 20 peers, we should get one share per peer on 10
3932     # when we publish to 3 peers, we should get either 3 or 4 shares per peer
3933}
3934[Add MDMF reader and writer, and SDMF writer
3935Kevan Carstensen <kevan@isnotajoke.com>**20100702225531
3936 Ignore-this: bf6276a91d27dcb4e779b0eb82ea1843
3937 
3938 The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
3939 object proxies that exist for immutable files. They abstract away
3940 details of connection, state, and caching from their callers (in this
3941 case, the download, servermap updater, and uploader), and expose methods
3942 to get and set information on the remote server.
3943 
3944 MDMFSlotReadProxy reads a mutable file from the server, doing the right
3945 thing (in most cases) regardless of whether the file is MDMF or SDMF. It
3946 allows callers to tell it how to batch and flush reads.
3947 
3948 MDMFSlotWriteProxy writes an MDMF mutable file to a server.
3949 
3950 SDMFSlotWriteProxy writes an SDMF mutable file to a server.
3951 
3952 This patch also includes tests for MDMFSlotReadProxy,
3953 SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
3954] {
3955hunk ./src/allmydata/mutable/layout.py 4
3956 
3957 import struct
3958 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
3959+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
3960+                                 MDMF_VERSION, IMutableSlotWriter
3961+from allmydata.util import mathutil, observer
3962+from twisted.python import failure
3963+from twisted.internet import defer
3964+from zope.interface import implements
3965+
3966+
3967+# These strings describe the format of the packed structs they help process
3968+# Here's what they mean:
3969+#
3970+#  PREFIX:
3971+#    >: Big-endian byte order; the most significant byte is first (leftmost).
3972+#    B: The version information; an 8 bit version identifier. Stored as
3973+#       an unsigned char. This is currently 00 00 00 00; our modifications
3974+#       will turn it into 00 00 00 01.
3975+#    Q: The sequence number; this is sort of like a revision history for
3976+#       mutable files; they start at 1 and increase as they are changed after
3977+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
3978+#       length.
3979+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
3980+#       characters = 32 bytes to store the value.
3981+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
3982+#       16 characters.
3983+#
3984+#  SIGNED_PREFIX additions, things that are covered by the signature:
3985+#    B: The "k" encoding parameter. We store this as an 8-bit character,
3986+#       which is convenient because our erasure coding scheme cannot
3987+#       encode if you ask for more than 255 pieces.
3988+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
3989+#       same reasons as above.
3990+#    Q: The segment size of the uploaded file. This will essentially be the
3991+#       length of the file in SDMF. An unsigned long long, so we can store
3992+#       files of quite large size.
3993+#    Q: The data length of the uploaded file. Modulo padding, this will be
3994+#       the same as the segment size field. Like the segment size field, it is
3995+#       an unsigned long long and can be quite large.
3996+#
3997+#   HEADER additions:
3998+#     L: The offset of the signature of this. An unsigned long.
3999+#     L: The offset of the share hash chain. An unsigned long.
4000+#     L: The offset of the block hash tree. An unsigned long.
4001+#     L: The offset of the share data. An unsigned long.
4002+#     Q: The offset of the encrypted private key. An unsigned long long, to
4003+#        account for the possibility of a lot of share data.
4004+#     Q: The offset of the EOF. An unsigned long long, to account for the
4005+#        possibility of a lot of share data.
4006+#
4007+#  After all of these, we have the following:
4008+#    - The verification key: Occupies the space between the end of the header
4009+#      and the start of the signature (i.e. data[HEADER_LENGTH:o['signature']]).
4010+#    - The signature, which goes from the signature offset to the share hash
4011+#      chain offset.
4012+#    - The share hash chain, which goes from the share hash chain offset to
4013+#      the block hash tree offset.
4014+#    - The share data, which goes from the share data offset to the encrypted
4015+#      private key offset.
4016+#    - The encrypted private key, which goes from its offset to the end of the file.
4017+#
4018+#  The block hash tree in this encoding has only one leaf, so the offset of
4019+#  the share data will be 32 bytes more than the offset of the block hash tree.
4020+#  Given this, we may need to check to see how many bytes a reasonably sized
4021+#  block hash tree will take up.
4022 
4023 PREFIX = ">BQ32s16s" # each version has a different prefix
4024 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
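A quick check of the format strings above (HEADER is defined a few lines further down); the sizes match the prose description of the SDMF layout:

    import struct

    PREFIX = ">BQ32s16s"              # version, seqnum, root hash, IV/salt
    SIGNED_PREFIX = ">BQ32s16s BBQQ"  # + k, N, segment size, data length
    HEADER = ">BQ32s16s BBQQ LLLLQQ"  # + the six offsets

    assert struct.calcsize(PREFIX) == 57
    assert struct.calcsize(SIGNED_PREFIX) == 75
    assert struct.calcsize(HEADER) == 107

    # A checkstring for a brand-new SDMF share: version 0, seqnum 1,
    # with placeholder root hash and IV.
    checkstring = struct.pack(PREFIX, 0, 1, "\x00" * 32, "\x00" * 16)
    assert len(checkstring) == 57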
4025hunk ./src/allmydata/mutable/layout.py 73
4026 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
4027 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
4028 HEADER_LENGTH = struct.calcsize(HEADER)
4029+OFFSETS = ">LLLLQQ"
4030+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
4031 
4032 def unpack_header(data):
4033     o = {}
4034hunk ./src/allmydata/mutable/layout.py 194
4035     return (share_hash_chain, block_hash_tree, share_data)
4036 
4037 
4038-def pack_checkstring(seqnum, root_hash, IV):
4039+def pack_checkstring(seqnum, root_hash, IV, version=0):
4040     return struct.pack(PREFIX,
4041hunk ./src/allmydata/mutable/layout.py 196
4042-                       0, # version,
4043+                       version,
4044                        seqnum,
4045                        root_hash,
4046                        IV)
4047hunk ./src/allmydata/mutable/layout.py 269
4048                            encprivkey])
4049     return final_share
4050 
4051+def pack_prefix(seqnum, root_hash, IV,
4052+                required_shares, total_shares,
4053+                segment_size, data_length):
4054+    prefix = struct.pack(SIGNED_PREFIX,
4055+                         0, # version,
4056+                         seqnum,
4057+                         root_hash,
4058+                         IV,
4059+                         required_shares,
4060+                         total_shares,
4061+                         segment_size,
4062+                         data_length,
4063+                         )
4064+    return prefix
4065+
4066+
4067+class SDMFSlotWriteProxy:
4068+    implements(IMutableSlotWriter)
4069+    """
4070+    I represent a remote write slot for an SDMF mutable file. I build a
4071+    share in memory, and then write it in one piece to the remote
4072+    server. This mimics how SDMF shares were built before MDMF (and the
4073+    new MDMF uploader), but provides that functionality in a way that
4074+    allows the MDMF uploader to be built without much special-casing for
4075+    file format, which makes the uploader code more readable.
4076+    """
4077+    def __init__(self,
4078+                 shnum,
4079+                 rref, # a remote reference to a storage server
4080+                 storage_index,
4081+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4082+                 seqnum, # the sequence number of the mutable file
4083+                 required_shares,
4084+                 total_shares,
4085+                 segment_size,
4086+                 data_length): # the length of the original file
4087+        self.shnum = shnum
4088+        self._rref = rref
4089+        self._storage_index = storage_index
4090+        self._secrets = secrets
4091+        self._seqnum = seqnum
4092+        self._required_shares = required_shares
4093+        self._total_shares = total_shares
4094+        self._segment_size = segment_size
4095+        self._data_length = data_length
4096+
4097+        # This is an SDMF file, so it should have only one segment, so,
4098+        # modulo padding of the data length, the segment size and the
4099+        # data length should be the same.
4100+        expected_segment_size = mathutil.next_multiple(data_length,
4101+                                                       self._required_shares)
4102+        assert expected_segment_size == segment_size
4103+
4104+        self._block_size = self._segment_size / self._required_shares
4105+
4106+        # This is meant to mimic how SDMF files were built before MDMF
4107+        # entered the picture: we generate each share in its entirety,
4108+        # then push it off to the storage server in one write. When
4109+        # callers call set_*, they are just populating this dict.
4110+        # finish_publishing will stitch these pieces together into a
4111+        # coherent share, and then write the coherent share to the
4112+        # storage server.
4113+        self._share_pieces = {}
4114+
4115+        # This tells the write logic what checkstring to use when
4116+        # writing remote shares.
4117+        self._testvs = []
4118+
4119+        self._readvs = [(0, struct.calcsize(PREFIX))]
4120+
4121+
4122+    def set_checkstring(self, checkstring_or_seqnum,
4123+                              root_hash=None,
4124+                              salt=None):
4125+        """
4126+        Set the checkstring that I will pass to the remote server when
4127+        writing.
4128+
4129+            @param checkstring_or_seqnum: A packed checkstring to use,
4130+                   or a sequence number (if root_hash and salt are also given).
4131+
4132+        Note that implementations can differ in which semantics they
4133+        wish to support for set_checkstring -- they can, for example,
4134+        build the checkstring themselves from its constituents, or
4135+        some other thing.
4136+        """
4137+        if root_hash and salt:
4138+            checkstring = struct.pack(PREFIX,
4139+                                      0,
4140+                                      checkstring_or_seqnum,
4141+                                      root_hash,
4142+                                      salt)
4143+        else:
4144+            checkstring = checkstring_or_seqnum
4145+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
4146+
4147+
4148+    def get_checkstring(self):
4149+        """
4150+        Get the checkstring that I think currently exists on the remote
4151+        server.
4152+        """
4153+        if self._testvs:
4154+            return self._testvs[0][3]
4155+        return ""
4156+
4157+
4158+    def put_block(self, data, segnum, salt):
4159+        """
4160+        Add a block and salt to the share.
4161+        """
4162+        # SDMF files have only one segment
4163+        assert segnum == 0
4164+        assert len(data) == self._block_size
4165+        assert len(salt) == SALT_SIZE
4166+
4167+        self._share_pieces['sharedata'] = data
4168+        self._share_pieces['salt'] = salt
4169+
4170+        # TODO: Figure out something intelligent to return.
4171+        return defer.succeed(None)
4172+
4173+
4174+    def put_encprivkey(self, encprivkey):
4175+        """
4176+        Add the encrypted private key to the share.
4177+        """
4178+        self._share_pieces['encprivkey'] = encprivkey
4179+
4180+        return defer.succeed(None)
4181+
4182+
4183+    def put_blockhashes(self, blockhashes):
4184+        """
4185+        Add the block hash tree to the share.
4186+        """
4187+        assert isinstance(blockhashes, list)
4188+        for h in blockhashes:
4189+            assert len(h) == HASH_SIZE
4190+
4191+        # serialize the blockhashes, then set them.
4192+        blockhashes_s = "".join(blockhashes)
4193+        self._share_pieces['block_hash_tree'] = blockhashes_s
4194+
4195+        return defer.succeed(None)
4196+
4197+
4198+    def put_sharehashes(self, sharehashes):
4199+        """
4200+        Add the share hash chain to the share.
4201+        """
4202+        assert isinstance(sharehashes, dict)
4203+        for h in sharehashes.itervalues():
4204+            assert len(h) == HASH_SIZE
4205+
4206+        # serialize the sharehashes, then set them.
4207+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4208+                                 for i in sorted(sharehashes.keys())])
4209+        self._share_pieces['share_hash_chain'] = sharehashes_s
4210+
4211+        return defer.succeed(None)
4212+
4213+
4214+    def put_root_hash(self, root_hash):
4215+        """
4216+        Add the root hash to the share.
4217+        """
4218+        assert len(root_hash) == HASH_SIZE
4219+
4220+        self._share_pieces['root_hash'] = root_hash
4221+
4222+        return defer.succeed(None)
4223+
4224+
4225+    def put_salt(self, salt):
4226+        """
4227+        Add a salt to an empty SDMF file.
4228+        """
4229+        assert len(salt) == SALT_SIZE
4230+
4231+        self._share_pieces['salt'] = salt
4232+        self._share_pieces['sharedata'] = ""
4233+
4234+
4235+    def get_signable(self):
4236+        """
4237+        Return the part of the share that needs to be signed.
4238+
4239+        SDMF writers need to sign the packed representation of the
4240+        first eight fields of the remote share, that is:
4241+            - version number (0)
4242+            - sequence number
4243+            - root of the share hash tree
4244+            - salt
4245+            - k
4246+            - n
4247+            - segsize
4248+            - datalen
4249+
4250+        This method is responsible for returning that to callers.
4251+        """
4252+        return struct.pack(SIGNED_PREFIX,
4253+                           0,
4254+                           self._seqnum,
4255+                           self._share_pieces['root_hash'],
4256+                           self._share_pieces['salt'],
4257+                           self._required_shares,
4258+                           self._total_shares,
4259+                           self._segment_size,
4260+                           self._data_length)
4261+
4262+
4263+    def put_signature(self, signature):
4264+        """
4265+        Add the signature to the share.
4266+        """
4267+        self._share_pieces['signature'] = signature
4268+
4269+        return defer.succeed(None)
4270+
4271+
4272+    def put_verification_key(self, verification_key):
4273+        """
4274+        Add the verification key to the share.
4275+        """
4276+        self._share_pieces['verification_key'] = verification_key
4277+
4278+        return defer.succeed(None)
4279+
4280+
4281+    def get_verinfo(self):
4282+        """
4283+        I return my verinfo tuple. This is used by the ServermapUpdater
4284+        to keep track of versions of mutable files.
4285+
4286+        The verinfo tuple for MDMF files contains:
4287+            - seqnum
4288+            - root hash
4289+            - a blank (nothing)
4290+            - segsize
4291+            - datalen
4292+            - k
4293+            - n
4294+            - prefix (the thing that you sign)
4295+            - a tuple of offsets
4296+
4297+        We include the nonce in MDMF to simplify processing of version
4298+        information tuples.
4299+
4300+        The verinfo tuple for SDMF files is the same, but contains a
4301+        16-byte IV instead of a hash of salts.
4302+        """
4303+        return (self._seqnum,
4304+                self._share_pieces['root_hash'],
4305+                self._share_pieces['salt'],
4306+                self._segment_size,
4307+                self._data_length,
4308+                self._required_shares,
4309+                self._total_shares,
4310+                self.get_signable(),
4311+                self._get_offsets_tuple())
4312+
4313+    def _get_offsets_dict(self):
4314+        post_offset = HEADER_LENGTH
4315+        offsets = {}
4316+
4317+        verification_key_length = len(self._share_pieces['verification_key'])
4318+        o1 = offsets['signature'] = post_offset + verification_key_length
4319+
4320+        signature_length = len(self._share_pieces['signature'])
4321+        o2 = offsets['share_hash_chain'] = o1 + signature_length
4322+
4323+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
4324+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
4325+
4326+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
4327+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
4328+
4329+        share_data_length = len(self._share_pieces['sharedata'])
4330+        o5 = offsets['enc_privkey'] = o4 + share_data_length
4331+
4332+        encprivkey_length = len(self._share_pieces['encprivkey'])
4333+        offsets['EOF'] = o5 + encprivkey_length
4334+        return offsets
4335+
4336+
4337+    def _get_offsets_tuple(self):
4338+        offsets = self._get_offsets_dict()
4339+        return tuple([(key, value) for key, value in offsets.items()])
4340+
4341+
4342+    def _pack_offsets(self):
4343+        offsets = self._get_offsets_dict()
4344+        return struct.pack(">LLLLQQ",
4345+                           offsets['signature'],
4346+                           offsets['share_hash_chain'],
4347+                           offsets['block_hash_tree'],
4348+                           offsets['share_data'],
4349+                           offsets['enc_privkey'],
4350+                           offsets['EOF'])
4351+
4352+
4353+    def finish_publishing(self):
4354+        """
4355+        Do anything necessary to finish writing the share to a remote
4356+        server. I require that no further publishing needs to take place
4357+        after this method has been called.
4358+        """
4359+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
4360+                  "share_hash_chain", "block_hash_tree"]:
4361+            assert k in self._share_pieces
4362+        # This is the only method that actually writes something to the
4363+        # remote server.
4364+        # First, we need to pack the share into data that we can write
4365+        # to the remote server in one write.
4366+        offsets = self._pack_offsets()
4367+        prefix = self.get_signable()
4368+        final_share = "".join([prefix,
4369+                               offsets,
4370+                               self._share_pieces['verification_key'],
4371+                               self._share_pieces['signature'],
4372+                               self._share_pieces['share_hash_chain'],
4373+                               self._share_pieces['block_hash_tree'],
4374+                               self._share_pieces['sharedata'],
4375+                               self._share_pieces['encprivkey']])
4376+
4377+        # Our only data vector is going to be writing the final share,
4378+        # in its entirety.
4379+        datavs = [(0, final_share)]
4380+
4381+        if not self._testvs:
4382+            # Our caller has not provided us with another checkstring
4383+            # yet, so we assume that we are writing a new share, and set
4384+            # a test vector that will allow a new share to be written.
4385+            self._testvs = []
4386+            self._testvs.append(tuple([0, 1, "eq", ""]))
4387+            new_share = True
4388+
4389+        tw_vectors = {}
4390+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
4391+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
4392+                                     self._storage_index,
4393+                                     self._secrets,
4394+                                     tw_vectors,
4395+                                     # TODO is it useful to read something?
4396+                                     self._readvs)
4397+
4398+
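A minimal sketch of driving SDMFSlotWriteProxy, assuming the encoded data, secrets, and a sign() helper come from the publisher: every put_* call only records a piece in memory, and nothing reaches the storage server until finish_publishing() issues its single test-and-write.

    w = SDMFSlotWriteProxy(shnum, rref, storage_index, secrets,
                           seqnum, k, N, segment_size, data_length)
    w.put_block(share_data, 0, salt)        # SDMF has exactly one segment
    w.put_encprivkey(encprivkey)
    w.put_blockhashes(block_hash_tree)
    w.put_sharehashes(share_hash_chain)
    w.put_root_hash(root_hash)              # needed by get_signable()
    w.put_signature(sign(w.get_signable()))
    w.put_verification_key(verification_key)
    d = w.finish_publishing()               # the only remote write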
4399+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
4400+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
4401+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
4402+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4403+MDMFCHECKSTRING = ">BQ32s"
4404+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
4405+MDMFOFFSETS = ">QQQQQQ"
4406+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
4407+
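The header sizes implied by these format strings line up with the offsets table in the class comment below:

    import struct

    assert struct.calcsize(">BQ32s") == 41              # MDMFCHECKSTRING
    assert struct.calcsize(">BQ32sBBQQ") == 59          # signed part; offsets begin at 59
    assert struct.calcsize(">QQQQQQ") == 48             # six 8-byte offsets
    assert struct.calcsize(">BQ32sBBQQ QQQQQQ") == 107  # full MDMF header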
4408+class MDMFSlotWriteProxy:
4409+    implements(IMutableSlotWriter)
4410+
4411+    """
4412+    I represent a remote write slot for an MDMF mutable file.
4413+
4414+    I abstract away from my caller the details of block and salt
4415+    management, and the implementation of the on-disk format for MDMF
4416+    shares.
4417+    """
4418+    # Expected layout, MDMF:
4419+    # offset:     size:       name:
4420+    #-- signed part --
4421+    # 0           1           version number (01)
4422+    # 1           8           sequence number
4423+    # 9           32          share tree root hash
4424+    # 41          1           The "k" encoding parameter
4425+    # 42          1           The "N" encoding parameter
4426+    # 43          8           The segment size of the uploaded file
4427+    # 51          8           The data length of the original plaintext
4428+    #-- end signed part --
4429+    # 59          8           The offset of the encrypted private key
4430+    # 67          8           The offset of the block hash tree
4431+    # 75          8           The offset of the share hash chain
4432+    # 83          8           The offset of the signature
4433+    # 91          8           The offset of the verification key
4434+    # 99          8           The offset of the EOF
4435+    #
4436+    # followed by salts and share data, the encrypted private key, the
4437+    # block hash tree, the share hash chain, a
4438+    # signature over the first seven fields, and a verification key.
4439+    #
4440+    # The checkstring is the first three fields -- the version number,
4441+    # sequence number, and root hash. This is consistent in meaning
4442+    # with what we have for SDMF files, except that instead of the
4443+    # literal salt we use the root of the share hash tree (which, via
4444+    # the block hash trees, covers all of the salts).
4445+    #
4446+    # The salt is stored before the block for each segment. The block
4447+    # hash tree is computed over the combination of block and salt for
4448+    # each segment. In this way, we get integrity checking for both
4449+    # block and salt with the current block hash tree arrangement.
4450+    #
4451+    # The ordering of the offsets is different to reflect the dependencies
4452+    # that we'll run into with an MDMF file. The expected write flow is
4453+    # something like this:
4454+    #
4455+    #   0: Initialize with the sequence number, encoding parameters and
4456+    #      data length. From this, we can deduce the number of segments,
4457+    #      and where they should go.. We can also figure out where the
4458+    #      encrypted private key should go, because we can figure out how
4459+    #      big the share data will be.
4460+    #
4461+    #   1: Encrypt, encode, and upload the file in chunks. Do something
4462+    #      like
4463+    #
4464+    #       put_block(data, segnum, salt)
4465+    #
4466+    #      to write a block and a salt to the disk. We can do both of
4467+    #      these operations now because we have enough of the offsets to
4468+    #      know where to put them.
4469+    #
4470+    #   2: Put the encrypted private key. Use:
4471+    #
4472+    #        put_encprivkey(encprivkey)
4473+    #
4474+    #      Now that we know the length of the private key, we can fill
4475+    #      in the offset for the block hash tree.
4476+    #
4477+    #   3: We're now in a position to upload the block hash tree for
4478+    #      a share. Put that using something like:
4479+    #       
4480+    #        put_blockhashes(block_hash_tree)
4481+    #
4482+    #      Note that block_hash_tree is a list of hashes -- we'll take
4483+    #      care of the details of serializing that appropriately. When
4484+    #      we get the block hash tree, we are also in a position to
4485+    #      calculate the offset for the share hash chain, and fill that
4486+    #      into the offsets table.
4487+    #
4488+    #   4: At this point there is nothing extra to upload for the
4489+    #      salts. In this layout the salts do not get a Merkle tree
4490+    #      of their own; each salt was already written next to its
4491+    #      block in step 1, and the block hash tree from step 3 is
4492+    #      computed over each (block, salt) pair, so both the block
4493+    #      and its salt can be validated at download time. (An
4494+    #      earlier draft of this format used a separate salt hash
4495+    #      tree and a put_salthashes() call; that tree has been
4496+    #      folded into the block hash tree, so there is no such
4497+    #      call here.)
4498+    #
4499+    #   5: We're now in a position to upload the share hash chain for
4500+    #      a share. Do that with something like:
4501+    #     
4502+    #        put_sharehashes(share_hash_chain)
4503+    #
4504+    #      share_hash_chain should be a dictionary mapping shnums to
4505+    #      32-byte hashes -- the wrapper handles serialization.
4506+    #      We'll know where to put the signature at this point, also.
4507+    #      The root of this tree will be put explicitly in the next
4508+    #      step.
4509+    #
4510+    #      TODO: Why? Why not just include it in the tree here?
4511+    #
4512+    #   6: Before putting the signature, we must first put the
4513+    #      root_hash. Do this with:
4514+    #
4515+    #        put_root_hash(root_hash).
4516+    #     
4517+    #      In terms of knowing where to put this value, it was always
4518+    #      possible to place it, but it makes sense semantically to
4519+    #      place it after the share hash tree, so that's why you do it
4520+    #      in this order.
4521+    #
4522+    #   7: With the root hash put, we can now sign the header. Use:
4523+    #
4524+    #        get_signable()
4525+    #
4526+    #      to get the part of the header that you want to sign, and use:
4527+    #       
4528+    #        put_signature(signature)
4529+    #
4530+    #      to write your signature to the remote server.
4531+    #
4532+    #   8: Add the verification key, and finish. Do:
4533+    #
4534+    #        put_verification_key(key)
4535+    #
4536+    #      and
4537+    #
4538+    #        finish_publishing()
4539+    #
4540+    # Checkstring management:
4541+    #
4542+    # To write to a mutable slot, we have to provide test vectors to ensure
4543+    # that we are writing to the same data that we think we are. These
4544+    # vectors allow us to detect uncoordinated writes; that is, writes
4545+    # where both we and some other shareholder are writing to the
4546+    # mutable slot, and to report those back to the parts of the program
4547+    # doing the writing.
4548+    #
4549+    # With SDMF, this was easy -- all of the share data was written in
4550+    # one go, so it was easy to detect uncoordinated writes, and we only
4551+    # had to do it once. With MDMF, not all of the file is written at
4552+    # once.
4553+    #
4554+    # If a share is new, we write out as much of the header as we can
4555+    # before writing out anything else. This gives other writers a
4556+    # canary that they can use to detect uncoordinated writes, and, if
4557+    # they do the same thing, gives us the same canary. We then update
4558+    # the share. We won't be able to write out two fields of the header
4559+    # -- the share tree hash and the salt hash -- until we finish
4560+    # writing out the share. We only require the writer to provide the
4561+    # initial checkstring, and keep track of what it should be after
4562+    # updates ourselves.
4563+    #
4564+    # If we haven't written anything yet, then on the first write (which
4565+    # will probably be a block + salt of a share), we'll also write out
4566+    # the header. On subsequent passes, we'll expect to see the header.
4567+    # This changes in two places:
4568+    #
4569+    #   - When we write out the salt hash
4570+    #   - When we write out the root of the share hash tree
4571+    #
4572+    # since these values will change the header. It is possible that we
4573+    # can just make those be written in one operation to minimize
4574+    # disruption.
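A minimal sketch of that flow as concrete calls against this class, assuming rref, secrets, and the encoded data come from the publisher; each put_* call returns a Deferred, and a real caller waits for each one before issuing the next:

    w = MDMFSlotWriteProxy(shnum, rref, storage_index, secrets,
                           seqnum, k, N, segment_size, data_length)
    for segnum in range(num_segments):
        w.put_block(blocks[segnum], segnum, salts[segnum])
    w.put_encprivkey(encprivkey)
    w.put_blockhashes(block_hash_tree)       # list of 32-byte hashes
    w.put_sharehashes(share_hash_chain)      # dict: shnum -> 32-byte hash
    w.put_root_hash(root_hash)
    w.put_signature(sign(w.get_signable()))
    w.put_verification_key(verification_key)
    d = w.finish_publishing()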
4575+    def __init__(self,
4576+                 shnum,
4577+                 rref, # a remote reference to a storage server
4578+                 storage_index,
4579+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4580+                 seqnum, # the sequence number of the mutable file
4581+                 required_shares,
4582+                 total_shares,
4583+                 segment_size,
4584+                 data_length): # the length of the original file
4585+        self.shnum = shnum
4586+        self._rref = rref
4587+        self._storage_index = storage_index
4588+        self._seqnum = seqnum
4589+        self._required_shares = required_shares
4590+        assert self.shnum >= 0 and self.shnum < total_shares
4591+        self._total_shares = total_shares
4592+        # We build up the offset table as we write things. It is the
4593+        # last thing we write to the remote server.
4594+        self._offsets = {}
4595+        self._testvs = []
4596+        self._secrets = secrets
4597+        # The segment size needs to be a multiple of the k parameter --
4598+        # any padding should have been carried out by the publisher
4599+        # already.
4600+        assert segment_size % required_shares == 0
4601+        self._segment_size = segment_size
4602+        self._data_length = data_length
4603+
4604+        # These are set later -- we define them here so that we can
4605+        # check for their existence easily
4606+
4607+        # This is the root of the share hash tree -- the Merkle tree
4608+        # over the roots of the block hash trees computed for shares in
4609+        # this upload.
4610+        self._root_hash = None
4611+
4612+        # We haven't yet written anything to the remote bucket. By
4613+        # setting this, we tell the _write method as much. The write
4614+        # method will then know that it also needs to add a write vector
4615+        # for the checkstring (or what we have of it) to the first write
4616+        # request. We'll then record that value for future use.  If
4617+        # we're expecting something to be there already, we need to call
4618+        # set_checkstring before we write anything to tell the first
4619+        # write about that.
4620+        self._written = False
4621+
4622+        # When writing data to the storage servers, we get a read vector
4623+        # for free. We'll read the checkstring, which will help us
4624+        # figure out what's gone wrong if a write fails.
4625+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
4626+
4627+        # We calculate the number of segments because it tells us
4628+        # where the salt part of the file ends/share segment begins,
4629+        # and also because it provides a useful amount of bounds checking.
4630+        self._num_segments = mathutil.div_ceil(self._data_length,
4631+                                               self._segment_size)
4632+        self._block_size = self._segment_size / self._required_shares
4633+        # We also calculate the share size, to help us with block
4634+        # constraints later.
4635+        tail_size = self._data_length % self._segment_size
4636+        if not tail_size:
4637+            self._tail_block_size = self._block_size
4638+        else:
4639+            self._tail_block_size = mathutil.next_multiple(tail_size,
4640+                                                           self._required_shares)
4641+            self._tail_block_size /= self._required_shares
4642+
4643+        # We already know where the sharedata starts; right after the end
4644+        # of the header (which is defined as the signable part + the offsets).
4645+        # We can also calculate where the encrypted private key begins
4646+        # from what we now know.
4647+        self._actual_block_size = self._block_size + SALT_SIZE
4648+        data_size = self._actual_block_size * (self._num_segments - 1)
4649+        data_size += self._tail_block_size
4650+        data_size += SALT_SIZE
4651+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
4652+        self._offsets['enc_privkey'] += data_size
4653+        # We'll wait for the rest. Callers can now call my "put_block" and
4654+        # "set_checkstring" methods.
4655+
4656+
4657+    def set_checkstring(self,
4658+                        seqnum_or_checkstring,
4659+                        root_hash=None,
4660+                        salt=None):
4661+        """
4662+        Set the checkstring for the given shnum.
4663+
4664+        This can be invoked in one of two ways.
4665+
4666+        With one argument, I assume that you are giving me a literal
4667+        checkstring -- e.g., the output of get_checkstring. I will then
4668+        set that checkstring as it is. This form is used by unit tests.
4669+
4670+        With two arguments, I assume that you are giving me a sequence
4671+        number and root hash to make a checkstring from. In that case, I
4672+        will build a checkstring and set it for you. This form is used
4673+        by the publisher.
4674+
4675+        By default, I assume that I am writing new shares to the grid.
4676+        If you don't explicitly set your own checkstring, I will use
4677+        one that requires that the remote share not exist. You will want
4678+        to use this method if you are updating a share in-place;
4679+        otherwise, writes will fail.
4680+        """
4681+        # You're allowed to overwrite checkstrings with this method;
4682+        # I assume that users know what they are doing when they call
4683+        # it.
4684+        if root_hash:
4685+            checkstring = struct.pack(MDMFCHECKSTRING,
4686+                                      1,
4687+                                      seqnum_or_checkstring,
4688+                                      root_hash)
4689+        else:
4690+            checkstring = seqnum_or_checkstring
4691+
4692+        if checkstring == "":
4693+            # We special-case this, since len("") = 0, but we need
4694+            # length of 1 for the case of an empty share to work on the
4695+            # storage server, which is what a checkstring that is the
4696+            # empty string means.
4697+            self._testvs = []
4698+        else:
4699+            self._testvs = []
4700+            self._testvs.append((0, len(checkstring), "eq", checkstring))
4701+
4702+
4703+    def __repr__(self):
4704+        return "MDMFSlotWriteProxy for share %d" % self.shnum
4705+
4706+
4707+    def get_checkstring(self):
4708+        """
4709+        Given a share number, I return a representation of what the
4710+        checkstring for that share on the server will look like.
4711+
4712+        I am mostly used for tests.
4713+        """
4714+        if self._root_hash:
4715+            roothash = self._root_hash
4716+        else:
4717+            roothash = "\x00" * 32
4718+        return struct.pack(MDMFCHECKSTRING,
4719+                           1,
4720+                           self._seqnum,
4721+                           roothash)
4722+
4723+
4724+    def put_block(self, data, segnum, salt):
4725+        """
4726+        Put the encrypted-and-encoded data segment in the slot, along
4727+        with the salt.
4728+        """
4729+        if segnum >= self._num_segments:
4730+            raise LayoutInvalid("I won't overwrite the private key")
4731+        if len(salt) != SALT_SIZE:
4732+            raise LayoutInvalid("I was given a salt of size %d, but "
4733+                                "I wanted a salt of size %d" % (len(salt), SALT_SIZE))
4734+        if segnum + 1 == self._num_segments:
4735+            if len(data) != self._tail_block_size:
4736+                raise LayoutInvalid("I was given the wrong size block to write")
4737+        elif len(data) != self._block_size:
4738+            raise LayoutInvalid("I was given the wrong size block to write")
4739+
4740+        # We want to write at MDMFHEADERSIZE + segnum * self._actual_block_size.
4741+
4742+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
4743+        data = salt + data
4744+
4745+        datavs = [tuple([offset, data])]
4746+        return self._write(datavs)
4747+
4748+
4749+    def put_encprivkey(self, encprivkey):
4750+        """
4751+        Put the encrypted private key in the remote slot.
4752+        """
4753+        assert self._offsets
4754+        assert self._offsets['enc_privkey']
4755+        # You shouldn't re-write the encprivkey after the block hash
4756+        # tree is written, since that could cause the private key to run
4757+        # into the block hash tree. Before it writes the block hash
4758+        # tree, the block hash tree writing method writes the offset of
4759+        # the share hash chain. So that's a good indicator of whether or
4760+        # not the block hash tree has been written.
4761+        if "share_hash_chain" in self._offsets:
4762+            raise LayoutInvalid("You must write this before the block hash tree")
4763+
4764+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + len(encprivkey)
4765+        datavs = [(tuple([self._offsets['enc_privkey'], encprivkey]))]
4766+        def _on_failure():
4767+            del(self._offsets['block_hash_tree'])
4768+        return self._write(datavs, on_failure=_on_failure)
4769+
4770+
4771+    def put_blockhashes(self, blockhashes):
4772+        """
4773+        Put the block hash tree in the remote slot.
4774+
4775+        The encrypted private key must be put before the block hash
4776+        tree, since we need to know how large it is to know where the
4777+        block hash tree should go. The block hash tree must be put
4778+        before the salt hash tree, since its size determines the
4779+        before the share hash chain, since its size determines the
4780+        """
4781+        assert self._offsets
4782+        assert isinstance(blockhashes, list)
4783+        if "block_hash_tree" not in self._offsets:
4784+            raise LayoutInvalid("You must put the encrypted private key "
4785+                                "before you put the block hash tree")
4786+        # If written, the share hash chain causes the signature offset
4787+        # to be defined.
4788+        if "signature" in self._offsets:
4789+            raise LayoutInvalid("You must put the block hash tree before "
4790+                                "you put the share hash chain")
4791+        blockhashes_s = "".join(blockhashes)
4792+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
4793+        datavs = []
4794+        datavs.append(tuple([self._offsets['block_hash_tree'], blockhashes_s]))
4795+        def _on_failure():
4796+            del(self._offsets['share_hash_chain'])
4797+        return self._write(datavs, on_failure=_on_failure)
4798+
4799+
4800+    def put_sharehashes(self, sharehashes):
4801+        """
4802+        Put the share hash chain in the remote slot.
4803+
4804+        The block hash tree must be put before the share hash chain,
4805+        since we need to know where the block hash tree ends before we
4806+        can know where the share hash chain starts. The share hash chain
4807+        must be put before the signature, since the length of the packed
4808+        share hash chain determines the offset of the signature. Also,
4809+        semantically, you must know what the root of the share hash tree
4810+        is before you can generate a valid signature.
4811+        """
4812+        assert isinstance(sharehashes, dict)
4813+        if "share_hash_chain" not in self._offsets:
4814+            raise LayoutInvalid("You need to put the block hash tree before "
4815+                                "you can put the share hash chain")
4816+        # The signature comes after the share hash chain. If the
4817+        # signature has already been written, we must not write another
4818+        # share hash chain. The signature writes the verification key
4819+        # offset when it gets sent to the remote server, so we look for
4820+        # that.
4821+        if "verification_key" in self._offsets:
4822+            raise LayoutInvalid("You must write the share hash chain "
4823+                                "before you write the signature")
4824+        datavs = []
4825+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4826+                                  for i in sorted(sharehashes.keys())])
4827+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
4828+        datavs.append(tuple([self._offsets['share_hash_chain'], sharehashes_s]))
4829+        def _on_failure():
4830+            del(self._offsets['signature'])
4831+        return self._write(datavs, on_failure=_on_failure)
4832+
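The packed share hash chain above is just a concatenation of (share number, hash) pairs in share-number order, each packed as ">H32s". A round-trip sketch of that encoding (the helper names are hypothetical):

    import struct

    HASH_SIZE = 32

    def pack_share_hash_chain(sharehashes):
        return "".join([struct.pack(">H32s", i, sharehashes[i])
                        for i in sorted(sharehashes.keys())])

    def unpack_share_hash_chain(chain):
        chunks = [chain[i:i + HASH_SIZE + 2]
                  for i in range(0, len(chain), HASH_SIZE + 2)]
        return dict([struct.unpack(">H32s", chunk) for chunk in chunks])

    chain = {1: "a" * HASH_SIZE, 4: "b" * HASH_SIZE}
    assert unpack_share_hash_chain(pack_share_hash_chain(chain)) == chain
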
4833+
4834+    def put_root_hash(self, roothash):
4835+        """
4836+        Put the root hash (the root of the share hash tree) in the
4837+        remote slot.
4838+        """
4839+        # It does not make sense to be able to put the root
4840+        # hash without first putting the share hashes, since you need
4841+        # the share hashes to generate the root hash.
4842+        #
4843+        # Signature is defined by the routine that places the share hash
4844+        # chain, so it's a good thing to look for in finding out whether
4845+        # or not the share hash chain exists on the remote server.
4846+        if "signature" not in self._offsets:
4847+            raise LayoutInvalid("You need to put the share hash chain "
4848+                                "before you can put the root share hash")
4849+        if len(roothash) != HASH_SIZE:
4850+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
4851+                                 % HASH_SIZE)
4852+        datavs = []
4853+        self._root_hash = roothash
4854+        # To write both of these values, we update the checkstring on
4855+        # the remote server, which includes them
4856+        checkstring = self.get_checkstring()
4857+        datavs.append(tuple([0, checkstring]))
4858+        # This write, if successful, changes the checkstring, so we need
4859+        # to update our internal checkstring to be consistent with the
4860+        # one on the server.
4861+        def _on_success():
4862+            self._testvs = [(0, len(checkstring), "eq", checkstring)]
4863+        def _on_failure():
4864+            self._root_hash = None
4865+        return self._write(datavs,
4866+                           on_success=_on_success,
4867+                           on_failure=_on_failure)
4868+
4869+
4870+    def get_signable(self):
4871+        """
4872+        Get the first seven fields of the mutable file; the parts that
4873+        are signed.
4874+        """
4875+        if not self._root_hash:
4876+            raise LayoutInvalid("You need to set the root hash "
4877+                                "before getting something to "
4878+                                "sign")
4879+        return struct.pack(MDMFSIGNABLEHEADER,
4880+                           1,
4881+                           self._seqnum,
4882+                           self._root_hash,
4883+                           self._required_shares,
4884+                           self._total_shares,
4885+                           self._segment_size,
4886+                           self._data_length)
4887+
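The signable prefix above is the fixed-size header minus the offset table. Based on the seven fields packed here and the parallel ">BQ32s" + ">BBQQ" packing used by the tests below, it appears to be a ">BQ32sBBQQ" layout; this is a sketch under that assumption, with a hypothetical helper name:

    import struct

    MDMFSIGNABLEHEADER = ">BQ32sBBQQ"  # assumed: version, seqnum, root hash,
                                       # k, n, segment size, data length

    def make_signable(seqnum, root_hash, k, n, segsize, datalen):
        return struct.pack(MDMFSIGNABLEHEADER, 1, seqnum, root_hash,
                           k, n, segsize, datalen)

    signable = make_signable(0, "\x00" * 32, 3, 10, 6, 36)
    assert len(signable) == struct.calcsize(MDMFSIGNABLEHEADER) == 59
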
4888+
4889+    def put_signature(self, signature):
4890+        """
4891+        Put the signature field to the remote slot.
4892+
4893+        I require that the root hash and share hash chain have been put
4894+        to the grid before I will write the signature to the grid.
4895+        """
4896+        if "signature" not in self._offsets:
4897+            raise LayoutInvalid("You must put the share hash chain "
4898+        # It does not make sense to put a signature without first
4899+        # putting the root hash and the salt hash (since otherwise
4900+        # the signature would be incomplete), so we don't allow that.
4901+                       "before putting the signature")
4902+        if not self._root_hash:
4903+            raise LayoutInvalid("You must complete the signed prefix "
4904+                                "before computing a signature")
4905+        # If we put the signature after we put the verification key, we
4906+        # could end up running into the verification key, and will
4907+        # probably screw up the offsets as well. So we don't allow that.
4908+        # The method that writes the verification key defines the EOF
4909+        # offset before writing the verification key, so look for that.
4910+        if "EOF" in self._offsets:
4911+            raise LayoutInvalid("You must write the signature before the verification key")
4912+
4913+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
4914+        datavs = []
4915+        datavs.append(tuple([self._offsets['signature'], signature]))
4916+        def _on_failure():
4917+            del(self._offsets['verification_key'])
4918+        return self._write(datavs, on_failure=_on_failure)
4919+
4920+
4921+    def put_verification_key(self, verification_key):
4922+        """
4923+        Put the verification key into the remote slot.
4924+
4925+        I require that the signature have been written to the storage
4926+        server before I allow the verification key to be written to the
4927+        remote server.
4928+        """
4929+        if "verification_key" not in self._offsets:
4930+            raise LayoutInvalid("You must put the signature before you "
4931+                                "can put the verification key")
4932+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
4933+        datavs = []
4934+        datavs.append(tuple([self._offsets['verification_key'], verification_key]))
4935+        def _on_failure():
4936+            del(self._offsets['EOF'])
4937+        return self._write(datavs, on_failure=_on_failure)
4938+
4939+    def _get_offsets_tuple(self):
4940+        return tuple([(key, value) for key, value in self._offsets.items()])
4941+
4942+    def get_verinfo(self):
4943+        return (self._seqnum,
4944+                self._root_hash,
4945+                self._required_shares,
4946+                self._total_shares,
4947+                self._segment_size,
4948+                self._data_length,
4949+                self.get_signable(),
4950+                self._get_offsets_tuple())
4951+
4952+
4953+    def finish_publishing(self):
4954+        """
4955+        Write the offset table and encoding parameters to the remote
4956+        slot, since that's the only thing we have yet to publish at this
4957+        point.
4958+        """
4959+        if "EOF" not in self._offsets:
4960+            raise LayoutInvalid("You must put the verification key before "
4961+                                "you can publish the offsets")
4962+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4963+        offsets = struct.pack(MDMFOFFSETS,
4964+                              self._offsets['enc_privkey'],
4965+                              self._offsets['block_hash_tree'],
4966+                              self._offsets['share_hash_chain'],
4967+                              self._offsets['signature'],
4968+                              self._offsets['verification_key'],
4969+                              self._offsets['EOF'])
4970+        datavs = []
4971+        datavs.append(tuple([offsets_offset, offsets]))
4972+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
4973+        params = struct.pack(">BBQQ",
4974+                             self._required_shares,
4975+                             self._total_shares,
4976+                             self._segment_size,
4977+                             self._data_length)
4978+        datavs.append(tuple([encoding_parameters_offset, params]))
4979+        return self._write(datavs)
4980+
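Taken together, the put_* methods lay the share out back-to-back: header (checkstring, encoding parameters, offset table), then the salted blocks, encrypted private key, block hash tree, share hash chain, signature, and verification key. A sketch of how the offset table falls out of the field lengths, with header_size standing in for MDMFHEADERSIZE and a hypothetical function name:

    def compute_mdmf_offsets(header_size, sharedata_len, encprivkey_len,
                             blockhashes_len, sharehashes_len,
                             signature_len, verification_key_len):
        # Each offset is simply where the previous field ends.
        offsets = {}
        offsets['enc_privkey'] = header_size + sharedata_len
        offsets['block_hash_tree'] = offsets['enc_privkey'] + encprivkey_len
        offsets['share_hash_chain'] = offsets['block_hash_tree'] + blockhashes_len
        offsets['signature'] = offsets['share_hash_chain'] + sharehashes_len
        offsets['verification_key'] = offsets['signature'] + signature_len
        offsets['EOF'] = offsets['verification_key'] + verification_key_len
        return offsets
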
4981+
4982+    def _write(self, datavs, on_failure=None, on_success=None):
4983+        """I write the data vectors in datavs to the remote slot."""
4984+        tw_vectors = {}
4985+        new_share = False
4986+        if not self._testvs:
4987+            self._testvs = []
4988+            self._testvs.append(tuple([0, 1, "eq", ""]))
4989+            new_share = True
4990+        if not self._written:
4991+            # Write a new checkstring to the share when we write it, so
4992+            # that we have something to check later.
4993+            new_checkstring = self.get_checkstring()
4994+            datavs.append((0, new_checkstring))
4995+            def _first_write():
4996+                self._written = True
4997+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
4998+            on_success = _first_write
4999+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5000+        datalength = sum([len(x[1]) for x in datavs])
5001+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5002+                                  self._storage_index,
5003+                                  self._secrets,
5004+                                  tw_vectors,
5005+                                  self._readv)
5006+        def _result(results):
5007+            if isinstance(results, failure.Failure) or not results[0]:
5008+                # Do nothing; the write was unsuccessful.
5009+                if on_failure: on_failure()
5010+            else:
5011+                if on_success: on_success()
5012+            return results
5013+        d.addCallback(_result)
5014+        return d
5015+
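The shape of the arguments that _write() hands to slot_testv_and_readv_and_writev can be summarized in a small sketch. It shows the brand-new-share case that the code above builds when self._testvs is empty; the helper name is hypothetical:

    def build_initial_write(shnum, checkstring, writes):
        # Test vector: offset 0, length 1, operator "eq", expected "" --
        # this only passes if nothing has been written to the share yet.
        testvs = [(0, 1, "eq", "")]
        # Data vectors: the new checkstring goes at offset 0, followed by
        # whatever (offset, data) pairs the caller queued up.
        datavs = [(0, checkstring)] + writes
        # The final None means "don't truncate or extend the share".
        return {shnum: (testvs, datavs, None)}

    tw_vectors = build_initial_write(0, "\x00" * 41, [(41, "encoding params")])
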
5016+
5017+class MDMFSlotReadProxy:
5018+    """
5019+    I read from a mutable slot filled with data written in the MDMF data
5020+    format (which is described above).
5021+
5022+    I can be initialized with some amount of data, which I will use (if
5023+    it is valid) to eliminate some of the need to fetch it from servers.
5024+    """
5025+    def __init__(self,
5026+                 rref,
5027+                 storage_index,
5028+                 shnum,
5029+                 data=""):
5030+        # Start the initialization process.
5031+        self._rref = rref
5032+        self._storage_index = storage_index
5033+        self.shnum = shnum
5034+
5035+        # Before doing anything, the reader is probably going to want to
5036+        # verify that the signature is correct. To do that, they'll need
5037+        # the verification key, and the signature. To get those, we'll
5038+        # need the offset table. So fetch the offset table on the
5039+        # assumption that that will be the first thing that a reader is
5040+        # going to do.
5041+
5042+        # The fact that these encoding parameters are None tells us
5043+        # that we haven't yet fetched them from the remote share, so we
5044+        # should. We could just not set them, but the checks will be
5045+        # easier to read if we don't have to use hasattr.
5046+        self._version_number = None
5047+        self._sequence_number = None
5048+        self._root_hash = None
5049+        # Filled in if we're dealing with an SDMF file. Unused
5050+        # otherwise.
5051+        self._salt = None
5052+        self._required_shares = None
5053+        self._total_shares = None
5054+        self._segment_size = None
5055+        self._data_length = None
5056+        self._offsets = None
5057+
5058+        # If the user has chosen to initialize us with some data, we'll
5059+        # try to satisfy subsequent data requests with that data before
5060+        # asking the storage server for whatever we can't satisfy locally.
5061+        self._data = data
5062+        # The way callers interact with the cache in the filenode returns
5063+        # None if there isn't any cached data, but the way we index the
5064+        # cached data requires a string, so convert None to "".
5065+        if self._data is None:
5066+            self._data = ""
5067+
5068+        self._queue_observers = observer.ObserverList()
5069+        self._queue_errbacks = observer.ObserverList()
5070+        self._readvs = []
5071+
5072+
5073+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
5074+        """
5075+        I fetch the offset table and the header from the remote slot if
5076+        I don't already have them. If I do have them, I do nothing and
5077+        return an empty Deferred.
5078+        """
5079+        if self._offsets:
5080+            return defer.succeed(None)
5081+        # At this point, we may be either SDMF or MDMF. Fetching 107
5082+        # bytes will be enough to get header and offsets for both SDMF and
5083+        # MDMF, though we'll be left with 4 more bytes than we
5084+        # need if this ends up being MDMF. This is probably less
5085+        # expensive than the cost of a second roundtrip.
5086+        readvs = [(0, 107)]
5087+        d = self._read(readvs, force_remote)
5088+        d.addCallback(self._process_encoding_parameters)
5089+        d.addCallback(self._process_offsets)
5090+        return d
5091+
5092+
5093+    def _process_encoding_parameters(self, encoding_parameters):
5094+        assert self.shnum in encoding_parameters
5095+        encoding_parameters = encoding_parameters[self.shnum][0]
5096+        # The first byte is the version number. It will tell us what
5097+        # to do next.
5098+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
5099+        if verno == MDMF_VERSION:
5100+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
5101+            (verno,
5102+             seqnum,
5103+             root_hash,
5104+             k,
5105+             n,
5106+             segsize,
5107+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
5108+                                      encoding_parameters[:read_size])
5109+            if segsize == 0 and datalen == 0:
5110+                # Empty file, no segments.
5111+                self._num_segments = 0
5112+            else:
5113+                self._num_segments = mathutil.div_ceil(datalen, segsize)
5114+
5115+        elif verno == SDMF_VERSION:
5116+            read_size = SIGNED_PREFIX_LENGTH
5117+            (verno,
5118+             seqnum,
5119+             root_hash,
5120+             salt,
5121+             k,
5122+             n,
5123+             segsize,
5124+             datalen) = struct.unpack(">BQ32s16s BBQQ",
5125+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
5126+            self._salt = salt
5127+            if segsize == 0 and datalen == 0:
5128+                # empty file
5129+                self._num_segments = 0
5130+            else:
5131+                # non-empty SDMF files have one segment.
5132+                self._num_segments = 1
5133+        else:
5134+            raise UnknownVersionError("You asked me to read mutable file "
5135+                                      "version %d, but I only understand "
5136+                                      "%d and %d" % (verno, SDMF_VERSION,
5137+                                                     MDMF_VERSION))
5138+
5139+        self._version_number = verno
5140+        self._sequence_number = seqnum
5141+        self._root_hash = root_hash
5142+        self._required_shares = k
5143+        self._total_shares = n
5144+        self._segment_size = segsize
5145+        self._data_length = datalen
5146+
5147+        self._block_size = self._segment_size / self._required_shares
5148+        # We can upload empty files, and need to account for this fact
5149+        # so as to avoid zero-division and zero-modulo errors.
5150+        if datalen > 0:
5151+            tail_size = self._data_length % self._segment_size
5152+        else:
5153+            tail_size = 0
5154+        if not tail_size:
5155+            self._tail_block_size = self._block_size
5156+        else:
5157+            self._tail_block_size = mathutil.next_multiple(tail_size,
5158+                                                    self._required_shares)
5159+            self._tail_block_size /= self._required_shares
5160+
5161+        return encoding_parameters
5162+
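The segment arithmetic above can be checked in isolation. This sketch re-implements div_ceil and next_multiple locally (they come from allmydata.util.mathutil in the real code) and mirrors the calculation of segment count and per-share block sizes; the function name is hypothetical:

    def div_ceil(n, d):
        return (n + d - 1) // d

    def next_multiple(n, k):
        return div_ceil(n, k) * k

    def segment_geometry(datalen, segsize, k):
        if segsize == 0 and datalen == 0:
            return (0, 0, 0)           # empty file: no segments at all
        num_segments = div_ceil(datalen, segsize)
        block_size = segsize // k
        tail = datalen % segsize
        if tail == 0:
            tail_block_size = block_size
        else:
            tail_block_size = next_multiple(tail, k) // k
        return (num_segments, block_size, tail_block_size)

    # With the k=3, segsize=6, datalen=33 encoding used by the tail-segment
    # test below: six segments, 2-byte blocks, and a 1-byte tail block.
    assert segment_geometry(33, 6, 3) == (6, 2, 1)
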
5163+
5164+    def _process_offsets(self, offsets):
5165+        if self._version_number == 0:
5166+            read_size = OFFSETS_LENGTH
5167+            read_offset = SIGNED_PREFIX_LENGTH
5168+            end = read_size + read_offset
5169+            (signature,
5170+             share_hash_chain,
5171+             block_hash_tree,
5172+             share_data,
5173+             enc_privkey,
5174+             EOF) = struct.unpack(">LLLLQQ",
5175+                                  offsets[read_offset:end])
5176+            self._offsets = {}
5177+            self._offsets['signature'] = signature
5178+            self._offsets['share_data'] = share_data
5179+            self._offsets['block_hash_tree'] = block_hash_tree
5180+            self._offsets['share_hash_chain'] = share_hash_chain
5181+            self._offsets['enc_privkey'] = enc_privkey
5182+            self._offsets['EOF'] = EOF
5183+
5184+        elif self._version_number == 1:
5185+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
5186+            read_length = MDMFOFFSETS_LENGTH
5187+            end = read_offset + read_length
5188+            (encprivkey,
5189+             blockhashes,
5190+             sharehashes,
5191+             signature,
5192+             verification_key,
5193+             eof) = struct.unpack(MDMFOFFSETS,
5194+                                  offsets[read_offset:end])
5195+            self._offsets = {}
5196+            self._offsets['enc_privkey'] = encprivkey
5197+            self._offsets['block_hash_tree'] = blockhashes
5198+            self._offsets['share_hash_chain'] = sharehashes
5199+            self._offsets['signature'] = signature
5200+            self._offsets['verification_key'] = verification_key
5201+            self._offsets['EOF'] = eof
5202+
5203+
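For the SDMF (version 0) branch, the offset table is the ">LLLLQQ" structure unpacked above. A standalone sketch of that decoding, with a hypothetical helper name and illustrative offset values:

    import struct

    SDMF_OFFSETS = ">LLLLQQ"   # signature, share hash chain, block hash tree,
                               # share data, enc_privkey, EOF

    def unpack_sdmf_offsets(offset_table):
        (signature, share_hash_chain, block_hash_tree,
         share_data, enc_privkey, eof) = struct.unpack(SDMF_OFFSETS,
                                                       offset_table)
        return {'signature': signature,
                'share_hash_chain': share_hash_chain,
                'block_hash_tree': block_hash_tree,
                'share_data': share_data,
                'enc_privkey': enc_privkey,
                'EOF': eof}

    packed = struct.pack(SDMF_OFFSETS, 107, 141, 243, 435, 447, 454)
    assert unpack_sdmf_offsets(packed)['EOF'] == 454
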
5204+    def get_block_and_salt(self, segnum, queue=False):
5205+        """
5206+        I return (block, salt), where block is the block data and
5207+        salt is the salt used to encrypt that segment.
5208+        """
5209+        d = self._maybe_fetch_offsets_and_header()
5210+        def _then(ignored):
5211+            if self._version_number == 1:
5212+                base_share_offset = MDMFHEADERSIZE
5213+            else:
5214+                base_share_offset = self._offsets['share_data']
5215+
5216+            if segnum + 1 > self._num_segments:
5217+                raise LayoutInvalid("Not a valid segment number")
5218+
5219+            if self._version_number == 0:
5220+                share_offset = base_share_offset + self._block_size * segnum
5221+            else:
5222+                share_offset = base_share_offset + (self._block_size + \
5223+                                                    SALT_SIZE) * segnum
5224+            if segnum + 1 == self._num_segments:
5225+                data = self._tail_block_size
5226+            else:
5227+                data = self._block_size
5228+
5229+            if self._version_number == 1:
5230+                data += SALT_SIZE
5231+
5232+            readvs = [(share_offset, data)]
5233+            return readvs
5234+        d.addCallback(_then)
5235+        d.addCallback(lambda readvs:
5236+            self._read(readvs, queue=queue))
5237+        def _process_results(results):
5238+            assert self.shnum in results
5239+            if self._version_number == 0:
5240+                # We only read the share data, but we know the salt from
5241+                # when we fetched the header
5242+                data = results[self.shnum]
5243+                if not data:
5244+                    data = ""
5245+                else:
5246+                    assert len(data) == 1
5247+                    data = data[0]
5248+                salt = self._salt
5249+            else:
5250+                data = results[self.shnum]
5251+                if not data:
5252+                    salt = data = ""
5253+                else:
5254+                    salt_and_data = results[self.shnum][0]
5255+                    salt = salt_and_data[:SALT_SIZE]
5256+                    data = salt_and_data[SALT_SIZE:]
5257+            return data, salt
5258+        d.addCallback(_process_results)
5259+        return d
5260+
5261+
5262+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
5263+        """
5264+        I return the block hash tree
5265+
5266+        I take an optional argument, needed, which is a set of indices
5267+        corresponding to hashes that I should fetch. If this argument is
5268+        missing, I will fetch the entire block hash tree; otherwise, I
5269+        may attempt to fetch fewer hashes, based on what needed says
5270+        that I should do. Note that I may fetch as many hashes as I
5271+        want, so long as the set of hashes that I do fetch is a superset
5272+        of the ones that I am asked for, so callers should be prepared
5273+        to tolerate additional hashes.
5274+        """
5275+        # TODO: Return only the parts of the block hash tree necessary
5276+        # to validate the blocknum provided?
5277+        # This is a good idea, but it is hard to implement correctly. It
5278+        # is bad to fetch any one block hash more than once, so we
5279+        # probably just want to fetch the whole thing at once and then
5280+        # serve it.
5281+        if needed == set([]):
5282+            return defer.succeed([])
5283+        d = self._maybe_fetch_offsets_and_header()
5284+        def _then(ignored):
5285+            blockhashes_offset = self._offsets['block_hash_tree']
5286+            if self._version_number == 1:
5287+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
5288+            else:
5289+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
5290+            readvs = [(blockhashes_offset, blockhashes_length)]
5291+            return readvs
5292+        d.addCallback(_then)
5293+        d.addCallback(lambda readvs:
5294+            self._read(readvs, queue=queue, force_remote=force_remote))
5295+        def _build_block_hash_tree(results):
5296+            assert self.shnum in results
5297+
5298+            rawhashes = results[self.shnum][0]
5299+            results = [rawhashes[i:i+HASH_SIZE]
5300+                       for i in range(0, len(rawhashes), HASH_SIZE)]
5301+            return results
5302+        d.addCallback(_build_block_hash_tree)
5303+        return d
5304+
5305+
5306+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
5307+        """
5308+        I return the part of the share hash chain needed to validate
5309+        this share.
5310+
5311+        I take an optional argument, needed. Needed is a set of indices
5312+        that correspond to the hashes that I should fetch. If needed is
5313+        not present, I will fetch and return the entire share hash
5314+        chain. Otherwise, I may fetch and return any part of the share
5315+        hash chain that is a superset of the part that I am asked to
5316+        fetch. Callers should be prepared to deal with more hashes than
5317+        they've asked for.
5318+        """
5319+        if needed == set([]):
5320+            return defer.succeed([])
5321+        d = self._maybe_fetch_offsets_and_header()
5322+
5323+        def _make_readvs(ignored):
5324+            sharehashes_offset = self._offsets['share_hash_chain']
5325+            if self._version_number == 0:
5326+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
5327+            else:
5328+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
5329+            readvs = [(sharehashes_offset, sharehashes_length)]
5330+            return readvs
5331+        d.addCallback(_make_readvs)
5332+        d.addCallback(lambda readvs:
5333+            self._read(readvs, queue=queue, force_remote=force_remote))
5334+        def _build_share_hash_chain(results):
5335+            assert self.shnum in results
5336+
5337+            sharehashes = results[self.shnum][0]
5338+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
5339+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
5340+            results = dict([struct.unpack(">H32s", data)
5341+                            for data in results])
5342+            return results
5343+        d.addCallback(_build_share_hash_chain)
5344+        return d
5345+
5346+
5347+    def get_encprivkey(self, queue=False):
5348+        """
5349+        I return the encrypted private key.
5350+        """
5351+        d = self._maybe_fetch_offsets_and_header()
5352+
5353+        def _make_readvs(ignored):
5354+            privkey_offset = self._offsets['enc_privkey']
5355+            if self._version_number == 0:
5356+                privkey_length = self._offsets['EOF'] - privkey_offset
5357+            else:
5358+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
5359+            readvs = [(privkey_offset, privkey_length)]
5360+            return readvs
5361+        d.addCallback(_make_readvs)
5362+        d.addCallback(lambda readvs:
5363+            self._read(readvs, queue=queue))
5364+        def _process_results(results):
5365+            assert self.shnum in results
5366+            privkey = results[self.shnum][0]
5367+            return privkey
5368+        d.addCallback(_process_results)
5369+        return d
5370+
5371+
5372+    def get_signature(self, queue=False):
5373+        """
5374+        I return the signature of my share.
5375+        """
5376+        d = self._maybe_fetch_offsets_and_header()
5377+
5378+        def _make_readvs(ignored):
5379+            signature_offset = self._offsets['signature']
5380+            if self._version_number == 1:
5381+                signature_length = self._offsets['verification_key'] - signature_offset
5382+            else:
5383+                signature_length = self._offsets['share_hash_chain'] - signature_offset
5384+            readvs = [(signature_offset, signature_length)]
5385+            return readvs
5386+        d.addCallback(_make_readvs)
5387+        d.addCallback(lambda readvs:
5388+            self._read(readvs, queue=queue))
5389+        def _process_results(results):
5390+            assert self.shnum in results
5391+            signature = results[self.shnum][0]
5392+            return signature
5393+        d.addCallback(_process_results)
5394+        return d
5395+
5396+
5397+    def get_verification_key(self, queue=False):
5398+        """
5399+        I return the verification key.
5400+        """
5401+        d = self._maybe_fetch_offsets_and_header()
5402+
5403+        def _make_readvs(ignored):
5404+            if self._version_number == 1:
5405+                vk_offset = self._offsets['verification_key']
5406+                vk_length = self._offsets['EOF'] - vk_offset
5407+            else:
5408+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5409+                vk_length = self._offsets['signature'] - vk_offset
5410+            readvs = [(vk_offset, vk_length)]
5411+            return readvs
5412+        d.addCallback(_make_readvs)
5413+        d.addCallback(lambda readvs:
5414+            self._read(readvs, queue=queue))
5415+        def _process_results(results):
5416+            assert self.shnum in results
5417+            verification_key = results[self.shnum][0]
5418+            return verification_key
5419+        d.addCallback(_process_results)
5420+        return d
5421+
5422+
5423+    def get_encoding_parameters(self):
5424+        """
5425+        I return (k, n, segsize, datalen)
5426+        """
5427+        d = self._maybe_fetch_offsets_and_header()
5428+        d.addCallback(lambda ignored:
5429+            (self._required_shares,
5430+             self._total_shares,
5431+             self._segment_size,
5432+             self._data_length))
5433+        return d
5434+
5435+
5436+    def get_seqnum(self):
5437+        """
5438+        I return the sequence number for this share.
5439+        """
5440+        d = self._maybe_fetch_offsets_and_header()
5441+        d.addCallback(lambda ignored:
5442+            self._sequence_number)
5443+        return d
5444+
5445+
5446+    def get_root_hash(self):
5447+        """
5448+        I return the root of the block hash tree
5449+        """
5450+        d = self._maybe_fetch_offsets_and_header()
5451+        d.addCallback(lambda ignored: self._root_hash)
5452+        return d
5453+
5454+
5455+    def get_checkstring(self):
5456+        """
5457+        I return the packed representation of the following:
5458+
5459+            - version number
5460+            - sequence number
5461+            - root hash
5462+            - salt (SDMF shares only)
5463+
5464+        which my users use as a checkstring to detect other writers.
5465+        """
5466+        d = self._maybe_fetch_offsets_and_header()
5467+        def _build_checkstring(ignored):
5468+            if self._salt:
5469+                checkstring = struct.pack(PREFIX,
5470+                                         self._version_number,
5471+                                         self._sequence_number,
5472+                                         self._root_hash,
5473+                                         self._salt)
5474+            else:
5475+                checkstring = struct.pack(MDMFCHECKSTRING,
5476+                                          self._version_number,
5477+                                          self._sequence_number,
5478+                                          self._root_hash)
5479+
5480+            return checkstring
5481+        d.addCallback(_build_checkstring)
5482+        return d
5483+
5484+
5485+    def get_prefix(self, force_remote):
5486+        d = self._maybe_fetch_offsets_and_header(force_remote)
5487+        d.addCallback(lambda ignored:
5488+            self._build_prefix())
5489+        return d
5490+
5491+
5492+    def _build_prefix(self):
5493+        # The prefix is another name for the part of the remote share
5494+        # that gets signed. It consists of everything up to and
5495+        # including the datalength, packed by struct.
5496+        if self._version_number == SDMF_VERSION:
5497+            return struct.pack(SIGNED_PREFIX,
5498+                           self._version_number,
5499+                           self._sequence_number,
5500+                           self._root_hash,
5501+                           self._salt,
5502+                           self._required_shares,
5503+                           self._total_shares,
5504+                           self._segment_size,
5505+                           self._data_length)
5506+
5507+        else:
5508+            return struct.pack(MDMFSIGNABLEHEADER,
5509+                           self._version_number,
5510+                           self._sequence_number,
5511+                           self._root_hash,
5512+                           self._required_shares,
5513+                           self._total_shares,
5514+                           self._segment_size,
5515+                           self._data_length)
5516+
5517+
5518+    def _get_offsets_tuple(self):
5519+        # The offsets tuple is another component of the version
5520+        # information tuple. It is basically our offsets dictionary,
5521+        # itemized and in a tuple.
5522+        return tuple([(key, value) for key, value in self._offsets.items()])
5523+
5524+
5525+    def get_verinfo(self):
5526+        """
5527+        I return my verinfo tuple. This is used by the ServermapUpdater
5528+        to keep track of versions of mutable files.
5529+
5530+        The verinfo tuple for MDMF files contains:
5531+            - seqnum
5532+            - root hash
5533+            - a blank (nothing)
5534+            - segsize
5535+            - datalen
5536+            - k
5537+            - n
5538+            - prefix (the thing that you sign)
5539+            - a tuple of offsets
5540+
5541+        We include the blank entry in MDMF so that the tuple has the
5542+        same shape as the SDMF verinfo tuple, which simplifies processing.
5543+
5544+        The verinfo tuple for SDMF files is the same, but contains the
5545+        file's 16-byte IV in place of the blank entry.
5546+        """
5547+        d = self._maybe_fetch_offsets_and_header()
5548+        def _build_verinfo(ignored):
5549+            if self._version_number == SDMF_VERSION:
5550+                salt_to_use = self._salt
5551+            else:
5552+                salt_to_use = None
5553+            return (self._sequence_number,
5554+                    self._root_hash,
5555+                    salt_to_use,
5556+                    self._segment_size,
5557+                    self._data_length,
5558+                    self._required_shares,
5559+                    self._total_shares,
5560+                    self._build_prefix(),
5561+                    self._get_offsets_tuple())
5562+        d.addCallback(_build_verinfo)
5563+        return d
5564+
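For readability, here is the verinfo tuple above with its fields named; the helper is hypothetical and just unpacks the plain tuple that get_verinfo() returns:

    def describe_verinfo(verinfo):
        (seqnum, root_hash, iv_or_none, segsize, datalen,
         k, n, prefix, offsets) = verinfo
        # iv_or_none is the 16-byte IV for SDMF shares and None for MDMF.
        return {'seqnum': seqnum, 'root_hash': root_hash, 'IV': iv_or_none,
                'segsize': segsize, 'datalen': datalen, 'k': k, 'n': n,
                'prefix': prefix, 'offsets': offsets}
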
5565+
5566+    def flush(self):
5567+        """
5568+        I flush my queue of read vectors.
5569+        """
5570+        d = self._read(self._readvs)
5571+        def _then(results):
5572+            self._readvs = []
5573+            if isinstance(results, failure.Failure):
5574+                self._queue_errbacks.notify(results)
5575+            else:
5576+                self._queue_observers.notify(results)
5577+            self._queue_observers = observer.ObserverList()
5578+            self._queue_errbacks = observer.ObserverList()
5579+        d.addBoth(_then)
5580+
5581+
5582+    def _read(self, readvs, force_remote=False, queue=False):
5583+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
5584+        # TODO: It's entirely possible to tweak this so that it just
5585+        # fulfills the requests that it can, and not demand that all
5586+        # requests are satisfiable before running it.
5587+        if not unsatisfiable and not force_remote:
5588+            results = [self._data[offset:offset+length]
5589+                       for (offset, length) in readvs]
5590+            results = {self.shnum: results}
5591+            return defer.succeed(results)
5592+        else:
5593+            if queue:
5594+                start = len(self._readvs)
5595+                self._readvs += readvs
5596+                end = len(self._readvs)
5597+                def _get_results(results, start, end):
5598+                    if not self.shnum in results:
5599+                        return {self.shnum: [""]}
5600+                    return {self.shnum: results[self.shnum][start:end]}
5601+                d = defer.Deferred()
5602+                d.addCallback(_get_results, start, end)
5603+                self._queue_observers.subscribe(d.callback)
5604+                self._queue_errbacks.subscribe(d.errback)
5605+                return d
5606+            return self._rref.callRemote("slot_readv",
5607+                                         self._storage_index,
5608+                                         [self.shnum],
5609+                                         readvs)
5610+
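The cache-first behaviour of _read() boils down to: if every read vector falls entirely within the locally held data, answer from it; otherwise go to the server. A minimal sketch of that check, with hypothetical names:

    def read_from_cache(data, readvs):
        # Return the requested slices if they are all satisfiable from the
        # cached prefix of the share, otherwise None (meaning "ask the
        # storage server instead").
        for (offset, length) in readvs:
            if offset + length > len(data):
                return None
        return [data[offset:offset + length] for (offset, length) in readvs]

    assert read_from_cache("abcdef", [(0, 2), (2, 3)]) == ["ab", "cde"]
    assert read_from_cache("abcdef", [(4, 5)]) is None
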
5611+
5612+    def is_sdmf(self):
5613+        """I tell my caller whether or not my remote file is SDMF or MDMF
5614+        """
5615+        d = self._maybe_fetch_offsets_and_header()
5616+        d.addCallback(lambda ignored:
5617+            self._version_number == 0)
5618+        return d
5619+
5620+
5621+class LayoutInvalid(Exception):
5622+    """
5623+    This isn't a valid MDMF mutable file
5624+    """
5625hunk ./src/allmydata/test/test_storage.py 2
5626 
5627-import time, os.path, stat, re, simplejson, struct
5628+import time, os.path, stat, re, simplejson, struct, shutil
5629 
5630 from twisted.trial import unittest
5631 
5632hunk ./src/allmydata/test/test_storage.py 22
5633 from allmydata.storage.expirer import LeaseCheckingCrawler
5634 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
5635      ReadBucketProxy
5636-from allmydata.interfaces import BadWriteEnablerError
5637-from allmydata.test.common import LoggingServiceParent
5638+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
5639+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
5640+                                     SIGNED_PREFIX, MDMFHEADER, \
5641+                                     MDMFOFFSETS, SDMFSlotWriteProxy
5642+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
5643+                                 SDMF_VERSION
5644+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
5645 from allmydata.test.common_web import WebRenderingMixin
5646 from allmydata.web.storage import StorageStatus, remove_prefix
5647 
5648hunk ./src/allmydata/test/test_storage.py 106
5649 
5650 class RemoteBucket:
5651 
5652+    def __init__(self):
5653+        self.read_count = 0
5654+        self.write_count = 0
5655+
5656     def callRemote(self, methname, *args, **kwargs):
5657         def _call():
5658             meth = getattr(self.target, "remote_" + methname)
5659hunk ./src/allmydata/test/test_storage.py 114
5660             return meth(*args, **kwargs)
5661+
5662+        if methname == "slot_readv":
5663+            self.read_count += 1
5664+        if "writev" in methname:
5665+            self.write_count += 1
5666+
5667         return defer.maybeDeferred(_call)
5668 
5669hunk ./src/allmydata/test/test_storage.py 122
5670+
5671 class BucketProxy(unittest.TestCase):
5672     def make_bucket(self, name, size):
5673         basedir = os.path.join("storage", "BucketProxy", name)
5674hunk ./src/allmydata/test/test_storage.py 1299
5675         self.failUnless(os.path.exists(prefixdir), prefixdir)
5676         self.failIf(os.path.exists(bucketdir), bucketdir)
5677 
5678+
5679+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
5680+    def setUp(self):
5681+        self.sparent = LoggingServiceParent()
5682+        self._lease_secret = itertools.count()
5683+        self.ss = self.create("MDMFProxies storage test server")
5684+        self.rref = RemoteBucket()
5685+        self.rref.target = self.ss
5686+        self.secrets = (self.write_enabler("we_secret"),
5687+                        self.renew_secret("renew_secret"),
5688+                        self.cancel_secret("cancel_secret"))
5689+        self.segment = "aaaaaa"
5690+        self.block = "aa"
5691+        self.salt = "a" * 16
5692+        self.block_hash = "a" * 32
5693+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
5694+        self.share_hash = self.block_hash
5695+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
5696+        self.signature = "foobarbaz"
5697+        self.verification_key = "vvvvvv"
5698+        self.encprivkey = "private"
5699+        self.root_hash = self.block_hash
5700+        self.salt_hash = self.root_hash
5701+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
5702+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
5703+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
5704+        # blockhashes and salt hashes are serialized in the same way,
5705+        # only we lop off the first element and store that in the
5706+        # header.
5707+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
5708+
5709+
5710+    def tearDown(self):
5711+        self.sparent.stopService()
5712+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
5713+
5714+
5715+    def write_enabler(self, we_tag):
5716+        return hashutil.tagged_hash("we_blah", we_tag)
5717+
5718+
5719+    def renew_secret(self, tag):
5720+        return hashutil.tagged_hash("renew_blah", str(tag))
5721+
5722+
5723+    def cancel_secret(self, tag):
5724+        return hashutil.tagged_hash("cancel_blah", str(tag))
5725+
5726+
5727+    def workdir(self, name):
5728+        basedir = os.path.join("storage", "MutableServer", name)
5729+        return basedir
5730+
5731+
5732+    def create(self, name):
5733+        workdir = self.workdir(name)
5734+        ss = StorageServer(workdir, "\x00" * 20)
5735+        ss.setServiceParent(self.sparent)
5736+        return ss
5737+
5738+
5739+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
5740+        # Start with the checkstring
5741+        data = struct.pack(">BQ32s",
5742+                           1,
5743+                           0,
5744+                           self.root_hash)
5745+        self.checkstring = data
5746+        # Next, the encoding parameters
5747+        if tail_segment:
5748+            data += struct.pack(">BBQQ",
5749+                                3,
5750+                                10,
5751+                                6,
5752+                                33)
5753+        elif empty:
5754+            data += struct.pack(">BBQQ",
5755+                                3,
5756+                                10,
5757+                                0,
5758+                                0)
5759+        else:
5760+            data += struct.pack(">BBQQ",
5761+                                3,
5762+                                10,
5763+                                6,
5764+                                36)
5765+        # Now we'll build the offsets.
5766+        sharedata = ""
5767+        if not tail_segment and not empty:
5768+            for i in xrange(6):
5769+                sharedata += self.salt + self.block
5770+        elif tail_segment:
5771+            for i in xrange(5):
5772+                sharedata += self.salt + self.block
5773+            sharedata += self.salt + "a"
5774+
5775+        # The encrypted private key comes after the shares + salts
5776+        offset_size = struct.calcsize(MDMFOFFSETS)
5777+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
5778+        # The blockhashes come after the private key
5779+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
5780+        # The sharehashes come after the block hashes
5781+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
5782+        # The signature comes after the share hash chain
5783+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
5784+        # The verification key comes after the signature
5785+        verification_offset = signature_offset + len(self.signature)
5786+        # The EOF comes after the verification key
5787+        eof_offset = verification_offset + len(self.verification_key)
5788+        data += struct.pack(MDMFOFFSETS,
5789+                            encrypted_private_key_offset,
5790+                            blockhashes_offset,
5791+                            sharehashes_offset,
5792+                            signature_offset,
5793+                            verification_offset,
5794+                            eof_offset)
5795+        self.offsets = {}
5796+        self.offsets['enc_privkey'] = encrypted_private_key_offset
5797+        self.offsets['block_hash_tree'] = blockhashes_offset
5798+        self.offsets['share_hash_chain'] = sharehashes_offset
5799+        self.offsets['signature'] = signature_offset
5800+        self.offsets['verification_key'] = verification_offset
5801+        self.offsets['EOF'] = eof_offset
5802+        # Next, we'll add in the salts and share data,
5803+        data += sharedata
5804+        # the private key,
5805+        data += self.encprivkey
5806+        # the block hash tree,
5807+        data += self.block_hash_tree_s
5808+        # the share hash chain,
5809+        data += self.share_hash_chain_s
5810+        # the signature,
5811+        data += self.signature
5812+        # and the verification key
5813+        data += self.verification_key
5814+        return data
5815+
5816+
5817+    def write_test_share_to_server(self,
5818+                                   storage_index,
5819+                                   tail_segment=False,
5820+                                   empty=False):
5821+        """
5822+        I write some data for the read tests to read to self.ss
5823+
5824+        If tail_segment=True, then I will write a share that has a
5825+        smaller tail segment than other segments.
5826+        """
5827+        write = self.ss.remote_slot_testv_and_readv_and_writev
5828+        data = self.build_test_mdmf_share(tail_segment, empty)
5829+        # Finally, we write the whole thing to the storage server in one
5830+        # pass.
5831+        testvs = [(0, 1, "eq", "")]
5832+        tws = {}
5833+        tws[0] = (testvs, [(0, data)], None)
5834+        readv = [(0, 1)]
5835+        results = write(storage_index, self.secrets, tws, readv)
5836+        self.failUnless(results[0])
5837+
5838+
5839+    def build_test_sdmf_share(self, empty=False):
5840+        if empty:
5841+            sharedata = ""
5842+        else:
5843+            sharedata = self.segment * 6
5844+        self.sharedata = sharedata
5845+        blocksize = len(sharedata) / 3
5846+        block = sharedata[:blocksize]
5847+        self.blockdata = block
5848+        prefix = struct.pack(">BQ32s16s BBQQ",
5849+                             0, # version,
5850+                             0,
5851+                             self.root_hash,
5852+                             self.salt,
5853+                             3,
5854+                             10,
5855+                             len(sharedata),
5856+                             len(sharedata),
5857+                            )
5858+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5859+        signature_offset = post_offset + len(self.verification_key)
5860+        sharehashes_offset = signature_offset + len(self.signature)
5861+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
5862+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
5863+        encprivkey_offset = sharedata_offset + len(block)
5864+        eof_offset = encprivkey_offset + len(self.encprivkey)
5865+        offsets = struct.pack(">LLLLQQ",
5866+                              signature_offset,
5867+                              sharehashes_offset,
5868+                              blockhashes_offset,
5869+                              sharedata_offset,
5870+                              encprivkey_offset,
5871+                              eof_offset)
5872+        final_share = "".join([prefix,
5873+                           offsets,
5874+                           self.verification_key,
5875+                           self.signature,
5876+                           self.share_hash_chain_s,
5877+                           self.block_hash_tree_s,
5878+                           block,
5879+                           self.encprivkey])
5880+        self.offsets = {}
5881+        self.offsets['signature'] = signature_offset
5882+        self.offsets['share_hash_chain'] = sharehashes_offset
5883+        self.offsets['block_hash_tree'] = blockhashes_offset
5884+        self.offsets['share_data'] = sharedata_offset
5885+        self.offsets['enc_privkey'] = encprivkey_offset
5886+        self.offsets['EOF'] = eof_offset
5887+        return final_share
5888+
5889+
5890+    def write_sdmf_share_to_server(self,
5891+                                   storage_index,
5892+                                   empty=False):
5893+        # Some tests need SDMF shares to verify that we can still read
5894+        # them. This method writes one that resembles a real SDMF share.
5895+        assert self.rref
5896+        write = self.ss.remote_slot_testv_and_readv_and_writev
5897+        share = self.build_test_sdmf_share(empty)
5898+        testvs = [(0, 1, "eq", "")]
5899+        tws = {}
5900+        tws[0] = (testvs, [(0, share)], None)
5901+        readv = []
5902+        results = write(storage_index, self.secrets, tws, readv)
5903+        self.failUnless(results[0])
5904+
5905+
5906+    def test_read(self):
5907+        self.write_test_share_to_server("si1")
5908+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5909+        # Check that every method equals what we expect it to.
5910+        d = defer.succeed(None)
5911+        def _check_block_and_salt((block, salt)):
5912+            self.failUnlessEqual(block, self.block)
5913+            self.failUnlessEqual(salt, self.salt)
5914+
5915+        for i in xrange(6):
5916+            d.addCallback(lambda ignored, i=i:
5917+                mr.get_block_and_salt(i))
5918+            d.addCallback(_check_block_and_salt)
5919+
5920+        d.addCallback(lambda ignored:
5921+            mr.get_encprivkey())
5922+        d.addCallback(lambda encprivkey:
5923+            self.failUnlessEqual(self.encprivkey, encprivkey))
5924+
5925+        d.addCallback(lambda ignored:
5926+            mr.get_blockhashes())
5927+        d.addCallback(lambda blockhashes:
5928+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
5929+
5930+        d.addCallback(lambda ignored:
5931+            mr.get_sharehashes())
5932+        d.addCallback(lambda sharehashes:
5933+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
5934+
5935+        d.addCallback(lambda ignored:
5936+            mr.get_signature())
5937+        d.addCallback(lambda signature:
5938+            self.failUnlessEqual(signature, self.signature))
5939+
5940+        d.addCallback(lambda ignored:
5941+            mr.get_verification_key())
5942+        d.addCallback(lambda verification_key:
5943+            self.failUnlessEqual(verification_key, self.verification_key))
5944+
5945+        d.addCallback(lambda ignored:
5946+            mr.get_seqnum())
5947+        d.addCallback(lambda seqnum:
5948+            self.failUnlessEqual(seqnum, 0))
5949+
5950+        d.addCallback(lambda ignored:
5951+            mr.get_root_hash())
5952+        d.addCallback(lambda root_hash:
5953+            self.failUnlessEqual(self.root_hash, root_hash))
5954+
5955+        d.addCallback(lambda ignored:
5956+            mr.get_seqnum())
5957+        d.addCallback(lambda seqnum:
5958+            self.failUnlessEqual(0, seqnum))
5959+
5960+        d.addCallback(lambda ignored:
5961+            mr.get_encoding_parameters())
5962+        def _check_encoding_parameters((k, n, segsize, datalen)):
5963+            self.failUnlessEqual(k, 3)
5964+            self.failUnlessEqual(n, 10)
5965+            self.failUnlessEqual(segsize, 6)
5966+            self.failUnlessEqual(datalen, 36)
5967+        d.addCallback(_check_encoding_parameters)
5968+
5969+        d.addCallback(lambda ignored:
5970+            mr.get_checkstring())
5971+        d.addCallback(lambda checkstring:
5972+            self.failUnlessEqual(checkstring, self.checkstring))
5973+        return d
5974+
5975+
5976+    def test_read_with_different_tail_segment_size(self):
5977+        self.write_test_share_to_server("si1", tail_segment=True)
5978+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5979+        d = mr.get_block_and_salt(5)
5980+        def _check_tail_segment(results):
5981+            block, salt = results
5982+            self.failUnlessEqual(len(block), 1)
5983+            self.failUnlessEqual(block, "a")
5984+        d.addCallback(_check_tail_segment)
5985+        return d
5986+
5987+
5988+    def test_get_block_with_invalid_segnum(self):
5989+        self.write_test_share_to_server("si1")
5990+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5991+        d = defer.succeed(None)
5992+        d.addCallback(lambda ignored:
5993+            self.shouldFail(LayoutInvalid, "test invalid segnum",
5994+                            None,
5995+                            mr.get_block_and_salt, 7))
5996+        return d
5997+
5998+
5999+    def test_get_encoding_parameters_first(self):
6000+        self.write_test_share_to_server("si1")
6001+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6002+        d = mr.get_encoding_parameters()
6003+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6004+            self.failUnlessEqual(k, 3)
6005+            self.failUnlessEqual(n, 10)
6006+            self.failUnlessEqual(segment_size, 6)
6007+            self.failUnlessEqual(datalen, 36)
6008+        d.addCallback(_check_encoding_parameters)
6009+        return d
6010+
6011+
6012+    def test_get_seqnum_first(self):
6013+        self.write_test_share_to_server("si1")
6014+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6015+        d = mr.get_seqnum()
6016+        d.addCallback(lambda seqnum:
6017+            self.failUnlessEqual(seqnum, 0))
6018+        return d
6019+
6020+
6021+    def test_get_root_hash_first(self):
6022+        self.write_test_share_to_server("si1")
6023+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6024+        d = mr.get_root_hash()
6025+        d.addCallback(lambda root_hash:
6026+            self.failUnlessEqual(root_hash, self.root_hash))
6027+        return d
6028+
6029+
6030+    def test_get_checkstring_first(self):
6031+        self.write_test_share_to_server("si1")
6032+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6033+        d = mr.get_checkstring()
6034+        d.addCallback(lambda checkstring:
6035+            self.failUnlessEqual(checkstring, self.checkstring))
6036+        return d
6037+
6038+
6039+    def test_write_read_vectors(self):
6040+        # When writing for us, the storage server will return to us a
6041+        # read vector, along with its result. If a write fails because
6042+        # the test vectors failed, this read vector can help us to
6043+        # diagnose the problem. This test ensures that the read vector
6044+        # is working appropriately.
6045+        mw = self._make_new_mw("si1", 0)
6046+        d = defer.succeed(None)
6047+
6048+        # Write one share. This should return a checkstring of nothing,
6049+        # since there is no data there.
6050+        d.addCallback(lambda ignored:
6051+            mw.put_block(self.block, 0, self.salt))
6052+        def _check_first_write(results):
6053+            result, readvs = results
6054+            self.failUnless(result)
6055+            self.failIf(readvs)
6056+        d.addCallback(_check_first_write)
6057+        # Now, there should be a different checkstring returned when
6058+        # we write other shares
6059+        d.addCallback(lambda ignored:
6060+            mw.put_block(self.block, 1, self.salt))
6061+        def _check_next_write(results):
6062+            result, readvs = results
6063+            self.failUnless(result)
6064+            self.expected_checkstring = mw.get_checkstring()
6065+            self.failUnlessIn(0, readvs)
6066+            self.failUnlessEqual(readvs[0][0], self.expected_checkstring)
6067+        d.addCallback(_check_next_write)
6068+        # Add the other four shares
6069+        for i in xrange(2, 6):
6070+            d.addCallback(lambda ignored, i=i:
6071+                mw.put_block(self.block, i, self.salt))
6072+            d.addCallback(_check_next_write)
6073+        # Add the encrypted private key
6074+        d.addCallback(lambda ignored:
6075+            mw.put_encprivkey(self.encprivkey))
6076+        d.addCallback(_check_next_write)
6077+        # Add the block hash tree and share hash tree
6078+        d.addCallback(lambda ignored:
6079+            mw.put_blockhashes(self.block_hash_tree))
6080+        d.addCallback(_check_next_write)
6081+        d.addCallback(lambda ignored:
6082+            mw.put_sharehashes(self.share_hash_chain))
6083+        d.addCallback(_check_next_write)
6084+        # Add the root hash. This should change the
6085+        # checkstring, but not in a way that we'll be able to see right
6086+        # now, since the read vectors are applied before the write
6087+        # vectors.
6088+        d.addCallback(lambda ignored:
6089+            mw.put_root_hash(self.root_hash))
6090+        def _check_old_testv_after_new_one_is_written(results):
6091+            result, readvs = results
6092+            self.failUnless(result)
6093+            self.failUnlessIn(0, readvs)
6094+            self.failUnlessEqual(self.expected_checkstring,
6095+                                 readvs[0][0])
6096+            new_checkstring = mw.get_checkstring()
6097+            self.failIfEqual(new_checkstring,
6098+                             readvs[0][0])
6099+        d.addCallback(_check_old_testv_after_new_one_is_written)
6100+        # Now add the signature. This should succeed, meaning that the
6101+        # data gets written and the read vector matches what the writer
6102+        # thinks should be there.
6103+        d.addCallback(lambda ignored:
6104+            mw.put_signature(self.signature))
6105+        d.addCallback(_check_next_write)
6106+        # The checkstring remains the same for the rest of the process.
6107+        return d
6108+
6109+
6110+    def test_blockhashes_after_share_hash_chain(self):
6111+        mw = self._make_new_mw("si1", 0)
6112+        d = defer.succeed(None)
6113+        # Put everything up to and including the share hash chain
6114+        for i in xrange(6):
6115+            d.addCallback(lambda ignored, i=i:
6116+                mw.put_block(self.block, i, self.salt))
6117+        d.addCallback(lambda ignored:
6118+            mw.put_encprivkey(self.encprivkey))
6119+        d.addCallback(lambda ignored:
6120+            mw.put_blockhashes(self.block_hash_tree))
6121+        d.addCallback(lambda ignored:
6122+            mw.put_sharehashes(self.share_hash_chain))
6123+
6124+        # Now try to put the block hash tree again.
6125+        d.addCallback(lambda ignored:
6126+            self.shouldFail(LayoutInvalid, "test repeat salthashes",
6127+                            None,
6128+                            mw.put_blockhashes, self.block_hash_tree))
6129+        return d
6130+
6131+
6132+    def test_encprivkey_after_blockhashes(self):
6133+        mw = self._make_new_mw("si1", 0)
6134+        d = defer.succeed(None)
6135+        # Put everything up to and including the block hash tree
6136+        for i in xrange(6):
6137+            d.addCallback(lambda ignored, i=i:
6138+                mw.put_block(self.block, i, self.salt))
6139+        d.addCallback(lambda ignored:
6140+            mw.put_encprivkey(self.encprivkey))
6141+        d.addCallback(lambda ignored:
6142+            mw.put_blockhashes(self.block_hash_tree))
6143+        d.addCallback(lambda ignored:
6144+            self.shouldFail(LayoutInvalid, "out of order private key",
6145+                            None,
6146+                            mw.put_encprivkey, self.encprivkey))
6147+        return d
6148+
6149+
6150+    def test_share_hash_chain_after_signature(self):
6151+        mw = self._make_new_mw("si1", 0)
6152+        d = defer.succeed(None)
6153+        # Put everything up to and including the signature
6154+        for i in xrange(6):
6155+            d.addCallback(lambda ignored, i=i:
6156+                mw.put_block(self.block, i, self.salt))
6157+        d.addCallback(lambda ignored:
6158+            mw.put_encprivkey(self.encprivkey))
6159+        d.addCallback(lambda ignored:
6160+            mw.put_blockhashes(self.block_hash_tree))
6161+        d.addCallback(lambda ignored:
6162+            mw.put_sharehashes(self.share_hash_chain))
6163+        d.addCallback(lambda ignored:
6164+            mw.put_root_hash(self.root_hash))
6165+        d.addCallback(lambda ignored:
6166+            mw.put_signature(self.signature))
6167+        # Now try to put the share hash chain again. This should fail
6168+        d.addCallback(lambda ignored:
6169+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
6170+                            None,
6171+                            mw.put_sharehashes, self.share_hash_chain))
6172+        return d
6173+
6174+
6175+    def test_signature_after_verification_key(self):
6176+        mw = self._make_new_mw("si1", 0)
6177+        d = defer.succeed(None)
6178+        # Put everything up to and including the verification key.
6179+        for i in xrange(6):
6180+            d.addCallback(lambda ignored, i=i:
6181+                mw.put_block(self.block, i, self.salt))
6182+        d.addCallback(lambda ignored:
6183+            mw.put_encprivkey(self.encprivkey))
6184+        d.addCallback(lambda ignored:
6185+            mw.put_blockhashes(self.block_hash_tree))
6186+        d.addCallback(lambda ignored:
6187+            mw.put_sharehashes(self.share_hash_chain))
6188+        d.addCallback(lambda ignored:
6189+            mw.put_root_hash(self.root_hash))
6190+        d.addCallback(lambda ignored:
6191+            mw.put_signature(self.signature))
6192+        d.addCallback(lambda ignored:
6193+            mw.put_verification_key(self.verification_key))
6194+        # Now try to put the signature again. This should fail
6195+        d.addCallback(lambda ignored:
6196+            self.shouldFail(LayoutInvalid, "signature after verification",
6197+                            None,
6198+                            mw.put_signature, self.signature))
6199+        return d
6200+
6201+
6202+    def test_uncoordinated_write(self):
6203+        # Make two mutable writers, both pointing to the same storage
6204+        # server, both at the same storage index, and try writing to the
6205+        # same share.
6206+        mw1 = self._make_new_mw("si1", 0)
6207+        mw2 = self._make_new_mw("si1", 0)
6208+        d = defer.succeed(None)
6209+        def _check_success(results):
6210+            result, readvs = results
6211+            self.failUnless(result)
6212+
6213+        def _check_failure(results):
6214+            result, readvs = results
6215+            self.failIf(result)
6216+
6217+        d.addCallback(lambda ignored:
6218+            mw1.put_block(self.block, 0, self.salt))
6219+        d.addCallback(_check_success)
6220+        d.addCallback(lambda ignored:
6221+            mw2.put_block(self.block, 0, self.salt))
6222+        d.addCallback(_check_failure)
6223+        return d
6224+
6225+
6226+    def test_invalid_salt_size(self):
6227+        # Salts need to be 16 bytes in size. Writes that attempt to
6228+        # write more or less than this should be rejected.
6229+        mw = self._make_new_mw("si1", 0)
6230+        invalid_salt = "a" * 17 # 17 bytes
6231+        another_invalid_salt = "b" * 15 # 15 bytes
6232+        d = defer.succeed(None)
6233+        d.addCallback(lambda ignored:
6234+            self.shouldFail(LayoutInvalid, "salt too big",
6235+                            None,
6236+                            mw.put_block, self.block, 0, invalid_salt))
6237+        d.addCallback(lambda ignored:
6238+            self.shouldFail(LayoutInvalid, "salt too small",
6239+                            None,
6240+                            mw.put_block, self.block, 0,
6241+                            another_invalid_salt))
6242+        return d
6243+
6244+
6245+    def test_write_test_vectors(self):
6246+        # If we give the write proxy a bogus test vector at
6247+        # any point during the process, it should fail to write.
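+        # Conceptually, the server's test-vector handling behaves like
+        # the following sketch (an illustration of the semantics
+        # exercised here, not the actual storage-server code):
+        #
+        #   if on_disk_data_matches(test_vector):  # e.g. the checkstring
+        #       apply(write_vector)
+        #       return (True, read_data)
+        #   else:
+        #       return (False, read_data)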
6248+        mw = self._make_new_mw("si1", 0)
6249+        mw.set_checkstring("this is a lie")
6250+        # The initial write should be expecting to find the improbable
6251+        # checkstring above in place; finding nothing, it should fail.
6252+        d = defer.succeed(None)
6253+        d.addCallback(lambda ignored:
6254+            mw.put_block(self.block, 0, self.salt))
6255+        def _check_failure(results):
6256+            result, readv = results
6257+            self.failIf(result)
6258+        d.addCallback(_check_failure)
6259+        # Now set the checkstring to the empty string, which
6260+        # indicates that no share is there.
6261+        d.addCallback(lambda ignored:
6262+            mw.set_checkstring(""))
6263+        d.addCallback(lambda ignored:
6264+            mw.put_block(self.block, 0, self.salt))
6265+        def _check_success(results):
6266+            result, readv = results
6267+            self.failUnless(result)
6268+        d.addCallback(_check_success)
6269+        # Now set the checkstring to something wrong
6270+        d.addCallback(lambda ignored:
6271+            mw.set_checkstring("something wrong"))
6272+        # This should fail to do anything
6273+        d.addCallback(lambda ignored:
6274+            mw.put_block(self.block, 1, self.salt))
6275+        d.addCallback(_check_failure)
6276+        # Now set it back to what it should be.
6277+        d.addCallback(lambda ignored:
6278+            mw.set_checkstring(mw.get_checkstring()))
6279+        for i in xrange(1, 6):
6280+            d.addCallback(lambda ignored, i=i:
6281+                mw.put_block(self.block, i, self.salt))
6282+            d.addCallback(_check_success)
6283+        d.addCallback(lambda ignored:
6284+            mw.put_encprivkey(self.encprivkey))
6285+        d.addCallback(_check_success)
6286+        d.addCallback(lambda ignored:
6287+            mw.put_blockhashes(self.block_hash_tree))
6288+        d.addCallback(_check_success)
6289+        d.addCallback(lambda ignored:
6290+            mw.put_sharehashes(self.share_hash_chain))
6291+        d.addCallback(_check_success)
6292+        def _keep_old_checkstring(ignored):
6293+            self.old_checkstring = mw.get_checkstring()
6294+            mw.set_checkstring("foobarbaz")
6295+        d.addCallback(_keep_old_checkstring)
6296+        d.addCallback(lambda ignored:
6297+            mw.put_root_hash(self.root_hash))
6298+        d.addCallback(_check_failure)
6299+        d.addCallback(lambda ignored:
6300+            self.failUnlessEqual(self.old_checkstring, mw.get_checkstring()))
6301+        def _restore_old_checkstring(ignored):
6302+            mw.set_checkstring(self.old_checkstring)
6303+        d.addCallback(_restore_old_checkstring)
6304+        d.addCallback(lambda ignored:
6305+            mw.put_root_hash(self.root_hash))
6306+        d.addCallback(_check_success)
6307+        # The checkstring should have been set appropriately for us on
6308+        # the last write; if we try to change it to something else,
6309+        # that change should cause the signature step to fail.
6310+        d.addCallback(lambda ignored:
6311+            mw.set_checkstring("something else"))
6312+        d.addCallback(lambda ignored:
6313+            mw.put_signature(self.signature))
6314+        d.addCallback(_check_failure)
6315+        d.addCallback(lambda ignored:
6316+            mw.set_checkstring(mw.get_checkstring()))
6317+        d.addCallback(lambda ignored:
6318+            mw.put_signature(self.signature))
6319+        d.addCallback(_check_success)
6320+        d.addCallback(lambda ignored:
6321+            mw.put_verification_key(self.verification_key))
6322+        d.addCallback(_check_success)
6323+        return d
6324+
6325+
6326+    def test_offset_only_set_on_success(self):
6327+        # The write proxy should be smart enough to detect when a write
6328+        # has failed, and should not advance its notion of progress (the
6329+        # next expected write) when that happens.
6330+        mw = self._make_new_mw("si1", 0)
6331+        d = defer.succeed(None)
6332+        for i in xrange(1, 6):
6333+            d.addCallback(lambda ignored, i=i:
6334+                mw.put_block(self.block, i, self.salt))
6335+        def _break_checkstring(ignored):
6336+            self._old_checkstring = mw.get_checkstring()
6337+            mw.set_checkstring("foobarbaz")
6338+
6339+        def _fix_checkstring(ignored):
6340+            mw.set_checkstring(self._old_checkstring)
6341+
6342+        d.addCallback(_break_checkstring)
6343+
6344+        # Setting the encrypted private key shouldn't work now, which is
6345+        # to be expected and is tested elsewhere. We also want to make
6346+        # sure that we can't add the block hash tree after a failed
6347+        # write of this sort.
6348+        d.addCallback(lambda ignored:
6349+            mw.put_encprivkey(self.encprivkey))
6350+        d.addCallback(lambda ignored:
6351+            self.shouldFail(LayoutInvalid, "test out-of-order blockhashes",
6352+                            None,
6353+                            mw.put_blockhashes, self.block_hash_tree))
6354+        d.addCallback(_fix_checkstring)
6355+        d.addCallback(lambda ignored:
6356+            mw.put_encprivkey(self.encprivkey))
6357+        d.addCallback(_break_checkstring)
6358+        d.addCallback(lambda ignored:
6359+            mw.put_blockhashes(self.block_hash_tree))
6360+        d.addCallback(lambda ignored:
6361+            self.shouldFail(LayoutInvalid, "test out-of-order sharehashes",
6362+                            None,
6363+                            mw.put_sharehashes, self.share_hash_chain))
6364+        d.addCallback(_fix_checkstring)
6365+        d.addCallback(lambda ignored:
6366+            mw.put_blockhashes(self.block_hash_tree))
6367+        d.addCallback(_break_checkstring)
6368+        d.addCallback(lambda ignored:
6369+            mw.put_sharehashes(self.share_hash_chain))
6370+        d.addCallback(lambda ignored:
6371+            self.shouldFail(LayoutInvalid, "out-of-order root hash",
6372+                            None,
6373+                            mw.put_root_hash, self.root_hash))
6374+        d.addCallback(_fix_checkstring)
6375+        d.addCallback(lambda ignored:
6376+            mw.put_sharehashes(self.share_hash_chain))
6377+        d.addCallback(_break_checkstring)
6378+        d.addCallback(lambda ignored:
6379+            mw.put_root_hash(self.root_hash))
6380+        d.addCallback(lambda ignored:
6381+            self.shouldFail(LayoutInvalid, "out-of-order signature",
6382+                            None,
6383+                            mw.put_signature, self.signature))
6384+        d.addCallback(_fix_checkstring)
6385+        d.addCallback(lambda ignored:
6386+            mw.put_root_hash(self.root_hash))
6387+        d.addCallback(_break_checkstring)
6388+        d.addCallback(lambda ignored:
6389+            mw.put_signature(self.signature))
6390+        d.addCallback(lambda ignored:
6391+            self.shouldFail(LayoutInvalid, "out-of-order verification key",
6392+                            None,
6393+                            mw.put_verification_key,
6394+                            self.verification_key))
6395+        d.addCallback(_fix_checkstring)
6396+        d.addCallback(lambda ignored:
6397+            mw.put_signature(self.signature))
6398+        d.addCallback(_break_checkstring)
6399+        d.addCallback(lambda ignored:
6400+            mw.put_verification_key(self.verification_key))
6401+        d.addCallback(lambda ignored:
6402+            self.shouldFail(LayoutInvalid, "out-of-order finish",
6403+                            None,
6404+                            mw.finish_publishing))
6405+        return d
6406+
6407+
6408+    def serialize_blockhashes(self, blockhashes):
6409+        return "".join(blockhashes)
6410+
6411+
6412+    def serialize_sharehashes(self, sharehashes):
6413+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
6414+                        for i in sorted(sharehashes.keys())])
6415+        return ret
6416+
6417+
6418+    def test_write(self):
6419+        # This translates to a file with 6 6-byte segments, and with 2-byte
6420+        # blocks.
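+        # Worked out: the data length is 36 bytes and the segment size is
+        # 6, giving 36 / 6 = 6 segments; with k = 3, each block is
+        # 6 / 3 = 2 bytes, stored alongside its 16-byte salt.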
6421+        mw = self._make_new_mw("si1", 0)
6422+        mw2 = self._make_new_mw("si1", 1)
6423+        # Test writing some blocks.
6424+        read = self.ss.remote_slot_readv
6425+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
6426+        written_block_size = 2 + len(self.salt)
6427+        written_block = self.block + self.salt
6428+        def _check_block_write(i, share):
6429+            self.failUnlessEqual(read("si1", [share], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
6430+                                {share: [written_block]})
6431+        d = defer.succeed(None)
6432+        for i in xrange(6):
6433+            d.addCallback(lambda ignored, i=i:
6434+                mw.put_block(self.block, i, self.salt))
6435+            d.addCallback(lambda ignored, i=i:
6436+                _check_block_write(i, 0))
6437+        # Now try the same thing, but with share 1 instead of share 0.
6438+        for i in xrange(6):
6439+            d.addCallback(lambda ignored, i=i:
6440+                mw2.put_block(self.block, i, self.salt))
6441+            d.addCallback(lambda ignored, i=i:
6442+                _check_block_write(i, 1))
6443+
6444+        # Next, we make a fake encrypted private key, and put it onto the
6445+        # storage server.
6446+        d.addCallback(lambda ignored:
6447+            mw.put_encprivkey(self.encprivkey))
6448+        expected_private_key_offset = expected_sharedata_offset + \
6449+                                      len(written_block) * 6
6450+        self.failUnlessEqual(len(self.encprivkey), 7)
6451+        d.addCallback(lambda ignored:
6452+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
6453+                                 {0: [self.encprivkey]}))
6454+
6455+        # Next, we put a fake block hash tree.
6456+        d.addCallback(lambda ignored:
6457+            mw.put_blockhashes(self.block_hash_tree))
6458+        expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
6459+        self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
6460+        d.addCallback(lambda ignored:
6461+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
6462+                                 {0: [self.block_hash_tree_s]}))
6463+
6464+        # Next, put a fake share hash chain
6465+        d.addCallback(lambda ignored:
6466+            mw.put_sharehashes(self.share_hash_chain))
6467+        expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
6468+        d.addCallback(lambda ignored:
6469+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
6470+                                 {0: [self.share_hash_chain_s]}))
6471+
6472+        # Next, we put what is supposed to be the root hash of
6473+        # our share hash tree, but isn't.
6474+        d.addCallback(lambda ignored:
6475+            mw.put_root_hash(self.root_hash))
6476+        # The root hash gets inserted at byte 9 (its position is in the header,
6477+        # and is fixed).
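+        # Worked out from the header checks below: the 1-byte version
+        # number occupies byte 0 and the 8-byte sequence number occupies
+        # bytes 1-8, so the 32-byte root hash starts at byte 9.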
6478+        def _check(ignored):
6479+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
6480+                                 {0: [self.root_hash]})
6481+        d.addCallback(_check)
6482+
6483+        # Next, we put a signature of the header block.
6484+        d.addCallback(lambda ignored:
6485+            mw.put_signature(self.signature))
6486+        expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
6487+        self.failUnlessEqual(len(self.signature), 9)
6488+        d.addCallback(lambda ignored:
6489+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
6490+                                 {0: [self.signature]}))
6491+
6492+        # Next, we put the verification key
6493+        d.addCallback(lambda ignored:
6494+            mw.put_verification_key(self.verification_key))
6495+        expected_verification_key_offset = expected_signature_offset + len(self.signature)
6496+        self.failUnlessEqual(len(self.verification_key), 6)
6497+        d.addCallback(lambda ignored:
6498+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
6499+                                 {0: [self.verification_key]}))
6500+
6501+        def _check_signable(ignored):
6502+            # Make sure that the signable is what we think it should be.
6503+            signable = mw.get_signable()
6504+            verno, seq, roothash, k, n, segsize, datalen = \
6505+                                            struct.unpack(">BQ32sBBQQ",
6506+                                                          signable)
6507+            self.failUnlessEqual(verno, 1)
6508+            self.failUnlessEqual(seq, 0)
6509+            self.failUnlessEqual(roothash, self.root_hash)
6510+            self.failUnlessEqual(k, 3)
6511+            self.failUnlessEqual(n, 10)
6512+            self.failUnlessEqual(segsize, 6)
6513+            self.failUnlessEqual(datalen, 36)
6514+        d.addCallback(_check_signable)
6515+        # Next, we cause the offset table to be published.
6516+        d.addCallback(lambda ignored:
6517+            mw.finish_publishing())
6518+        expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
6519+
6520+        def _check_offsets(ignored):
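+            # For reference, the fixed MDMF header layout exercised by
+            # these checks appears to be (offset: field (size in bytes)):
+            #   0: version number (1), 1: sequence number (8),
+            #   9: root hash (32), 41: k (1), 42: N (1),
+            #   43: segment size (8), 51: data length (8),
+            #   59-106: six 8-byte offsets (encrypted private key, block
+            #   hashes, share hashes, signature, verification key, EOF).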
6521+            # Check the version number to make sure that it is correct.
6522+            expected_version_number = struct.pack(">B", 1)
6523+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
6524+                                 {0: [expected_version_number]})
6525+            # Check the sequence number to make sure that it is correct
6526+            expected_sequence_number = struct.pack(">Q", 0)
6527+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
6528+                                 {0: [expected_sequence_number]})
6529+            # Check that the encoding parameters (k, N, segment size, data
6530+            # length) are what they should be. These are 3, 10, 6, 36.
6531+            expected_k = struct.pack(">B", 3)
6532+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
6533+                                 {0: [expected_k]})
6534+            expected_n = struct.pack(">B", 10)
6535+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
6536+                                 {0: [expected_n]})
6537+            expected_segment_size = struct.pack(">Q", 6)
6538+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
6539+                                 {0: [expected_segment_size]})
6540+            expected_data_length = struct.pack(">Q", 36)
6541+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
6542+                                 {0: [expected_data_length]})
6543+            expected_offset = struct.pack(">Q", expected_private_key_offset)
6544+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
6545+                                 {0: [expected_offset]})
6546+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
6547+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
6548+                                 {0: [expected_offset]})
6549+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
6550+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
6551+                                 {0: [expected_offset]})
6552+            expected_offset = struct.pack(">Q", expected_signature_offset)
6553+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
6554+                                 {0: [expected_offset]})
6555+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
6556+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
6557+                                 {0: [expected_offset]})
6558+            expected_offset = struct.pack(">Q", expected_eof_offset)
6559+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
6560+                                 {0: [expected_offset]})
6561+        d.addCallback(_check_offsets)
6562+        return d
6563+
6564+    def _make_new_mw(self, si, share, datalength=36):
6565+        # This is a file of size 36 bytes. Since it has a segment
6566+        # size of 6, we know that it has 6 byte segments, which will
6567+        # be split into blocks of 2 bytes because our FEC k
6568+        # parameter is 3.
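+        # Judging from the values used in these tests, the positional
+        # arguments appear to be (shnum, rref, storage_index, secrets,
+        # seqnum, required_shares k, total_shares N, segment_size,
+        # data_length) -- an inference from the checks above, not a
+        # statement of the proxy's documented signature.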
6569+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
6570+                                6, datalength)
6571+        return mw
6572+
6573+
6574+    def test_write_rejected_with_too_many_blocks(self):
6575+        mw = self._make_new_mw("si0", 0)
6576+
6577+        # Try writing too many blocks. We should not be able to write
6578+        # more than 6 blocks into each share.
6580+        d = defer.succeed(None)
6581+        for i in xrange(6):
6582+            d.addCallback(lambda ignored, i=i:
6583+                mw.put_block(self.block, i, self.salt))
6584+        d.addCallback(lambda ignored:
6585+            self.shouldFail(LayoutInvalid, "too many blocks",
6586+                            None,
6587+                            mw.put_block, self.block, 7, self.salt))
6588+        return d
6589+
6590+
6591+    def test_write_rejected_with_invalid_salt(self):
6592+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
6593+        # less should cause an error.
6594+        mw = self._make_new_mw("si1", 0)
6595+        bad_salt = "a" * 17 # 17 bytes
6596+        d = defer.succeed(None)
6597+        d.addCallback(lambda ignored:
6598+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
6599+                            None, mw.put_block, self.block, 7, bad_salt))
6600+        return d
6601+
6602+
6603+    def test_write_rejected_with_invalid_root_hash(self):
6604+        # Try writing an invalid root hash. This should be SHA256d, and
6605+        # 32 bytes long as a result.
6606+        mw = self._make_new_mw("si2", 0)
6607+        # 17 bytes != 32 bytes
6608+        invalid_root_hash = "a" * 17
6609+        d = defer.succeed(None)
6610+        # Before this test can work, we need to put some blocks + salts,
6611+        # a block hash tree, and a share hash tree. Otherwise, we'll see
6612+        # failures that match what we are looking for, but are caused by
6613+        # the constraints imposed on operation ordering.
6614+        for i in xrange(6):
6615+            d.addCallback(lambda ignored, i=i:
6616+                mw.put_block(self.block, i, self.salt))
6617+        d.addCallback(lambda ignored:
6618+            mw.put_encprivkey(self.encprivkey))
6619+        d.addCallback(lambda ignored:
6620+            mw.put_blockhashes(self.block_hash_tree))
6621+        d.addCallback(lambda ignored:
6622+            mw.put_sharehashes(self.share_hash_chain))
6623+        d.addCallback(lambda ignored:
6624+            self.shouldFail(LayoutInvalid, "invalid root hash",
6625+                            None, mw.put_root_hash, invalid_root_hash))
6626+        return d
6627+
6628+
6629+    def test_write_rejected_with_invalid_blocksize(self):
6630+        # The blocksize implied by the writer that we get from
6631+        # _make_new_mw is 2 bytes -- any more or any less than this
6632+        # should cause a failure, unless it is the tail segment, in
6633+        # which case it may not.
6634+        invalid_block = "a"
6635+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
6636+                                             # one byte blocks
6637+        # 1 bytes != 2 bytes
6638+        d = defer.succeed(None)
6639+        d.addCallback(lambda ignored, invalid_block=invalid_block:
6640+            self.shouldFail(LayoutInvalid, "test blocksize too small",
6641+                            None, mw.put_block, invalid_block, 0,
6642+                            self.salt))
6643+        invalid_block = invalid_block * 3
6644+        # 3 bytes != 2 bytes
6645+        d.addCallback(lambda ignored:
6646+            self.shouldFail(LayoutInvalid, "test blocksize too large",
6647+                            None,
6648+                            mw.put_block, invalid_block, 0, self.salt))
6649+        for i in xrange(5):
6650+            d.addCallback(lambda ignored, i=i:
6651+                mw.put_block(self.block, i, self.salt))
6652+        # Try to put an invalid tail segment
6653+        d.addCallback(lambda ignored:
6654+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
6655+                            None,
6656+                            mw.put_block, self.block, 5, self.salt))
6657+        valid_block = "a"
6658+        d.addCallback(lambda ignored:
6659+            mw.put_block(valid_block, 5, self.salt))
6660+        return d
6661+
6662+
6663+    def test_write_enforces_order_constraints(self):
6664+        # We require that the MDMFSlotWriteProxy be interacted with in a
6665+        # specific way.
6666+        # That way is:
6667+        # 0: __init__
6668+        # 1: write blocks and salts
6669+        # 2: Write the encrypted private key
6670+        # 3: Write the block hashes
6671+        # 4: Write the share hashes
6672+        # 5: Write the root hash and salt hash
6673+        # 6: Write the signature and verification key
6674+        # 7: Write the file.
6675+        #
6676+        # Some of these can be performed out-of-order, and some can't.
6677+        # The dependencies that I want to test here are:
6678+        #  - Private key before block hashes
6679+        #  - share hashes and block hashes before root hash
6680+        #  - root hash before signature
6681+        #  - signature before verification key
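+        # In other words, the happy path (exercised by test_end_to_end
+        # below) is roughly:
+        #   put_block * 6 -> put_encprivkey -> put_blockhashes ->
+        #   put_sharehashes -> put_root_hash -> put_signature ->
+        #   put_verification_key -> finish_publishing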
6682+        mw0 = self._make_new_mw("si0", 0)
6683+        # Write some shares
6684+        d = defer.succeed(None)
6685+        for i in xrange(6):
6686+            d.addCallback(lambda ignored, i=i:
6687+                mw0.put_block(self.block, i, self.salt))
6688+        # Try to write the block hashes before writing the encrypted
6689+        # private key
6690+        d.addCallback(lambda ignored:
6691+            self.shouldFail(LayoutInvalid, "block hashes before key",
6692+                            None, mw0.put_blockhashes,
6693+                            self.block_hash_tree))
6694+
6695+        # Write the private key.
6696+        d.addCallback(lambda ignored:
6697+            mw0.put_encprivkey(self.encprivkey))
6698+
6699+
6700+        # Try to write the share hash chain without writing the block
6701+        # hash tree
6702+        d.addCallback(lambda ignored:
6703+            self.shouldFail(LayoutInvalid, "share hash chain before "
6704+                                           "salt hash tree",
6705+                            None,
6706+                            mw0.put_sharehashes, self.share_hash_chain))
6707+
6708+        # Try to write the root hash without writing either the
6709+        # block hashes or the share hashes
6710+        d.addCallback(lambda ignored:
6711+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6712+                            None,
6713+                            mw0.put_root_hash, self.root_hash))
6714+
6715+        # Now write the block hashes and try again
6716+        d.addCallback(lambda ignored:
6717+            mw0.put_blockhashes(self.block_hash_tree))
6718+
6719+        d.addCallback(lambda ignored:
6720+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6721+                            None, mw0.put_root_hash, self.root_hash))
6722+
6723+        # We haven't yet put the root hash on the share, so we shouldn't
6724+        # be able to sign it.
6725+        d.addCallback(lambda ignored:
6726+            self.shouldFail(LayoutInvalid, "signature before root hash",
6727+                            None, mw0.put_signature, self.signature))
6728+
6729+        d.addCallback(lambda ignored:
6730+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
6731+
6732+        # ...and, since that fails, we also shouldn't be able to put the
6733+        # verification key.
6734+        d.addCallback(lambda ignored:
6735+            self.shouldFail(LayoutInvalid, "key before signature",
6736+                            None, mw0.put_verification_key,
6737+                            self.verification_key))
6738+
6739+        # Now write the share hashes.
6740+        d.addCallback(lambda ignored:
6741+            mw0.put_sharehashes(self.share_hash_chain))
6742+        # We should be able to write the root hash now too
6743+        d.addCallback(lambda ignored:
6744+            mw0.put_root_hash(self.root_hash))
6745+
6746+        # We should still be unable to put the verification key
6747+        d.addCallback(lambda ignored:
6748+            self.shouldFail(LayoutInvalid, "key before signature",
6749+                            None, mw0.put_verification_key,
6750+                            self.verification_key))
6751+
6752+        d.addCallback(lambda ignored:
6753+            mw0.put_signature(self.signature))
6754+
6755+        # We shouldn't be able to write the offsets to the remote server
6756+        # until the offset table is finished; IOW, until we have written
6757+        # the verification key.
6758+        d.addCallback(lambda ignored:
6759+            self.shouldFail(LayoutInvalid, "offsets before verification key",
6760+                            None,
6761+                            mw0.finish_publishing))
6762+
6763+        d.addCallback(lambda ignored:
6764+            mw0.put_verification_key(self.verification_key))
6765+        return d
6766+
6767+
6768+    def test_end_to_end(self):
6769+        mw = self._make_new_mw("si1", 0)
6770+        # Write a share using the mutable writer, and make sure that the
6771+        # reader knows how to read everything back to us.
6772+        d = defer.succeed(None)
6773+        for i in xrange(6):
6774+            d.addCallback(lambda ignored, i=i:
6775+                mw.put_block(self.block, i, self.salt))
6776+        d.addCallback(lambda ignored:
6777+            mw.put_encprivkey(self.encprivkey))
6778+        d.addCallback(lambda ignored:
6779+            mw.put_blockhashes(self.block_hash_tree))
6780+        d.addCallback(lambda ignored:
6781+            mw.put_sharehashes(self.share_hash_chain))
6782+        d.addCallback(lambda ignored:
6783+            mw.put_root_hash(self.root_hash))
6784+        d.addCallback(lambda ignored:
6785+            mw.put_signature(self.signature))
6786+        d.addCallback(lambda ignored:
6787+            mw.put_verification_key(self.verification_key))
6788+        d.addCallback(lambda ignored:
6789+            mw.finish_publishing())
6790+
6791+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6792+        def _check_block_and_salt((block, salt)):
6793+            self.failUnlessEqual(block, self.block)
6794+            self.failUnlessEqual(salt, self.salt)
6795+
6796+        for i in xrange(6):
6797+            d.addCallback(lambda ignored, i=i:
6798+                mr.get_block_and_salt(i))
6799+            d.addCallback(_check_block_and_salt)
6800+
6801+        d.addCallback(lambda ignored:
6802+            mr.get_encprivkey())
6803+        d.addCallback(lambda encprivkey:
6804+            self.failUnlessEqual(self.encprivkey, encprivkey))
6805+
6806+        d.addCallback(lambda ignored:
6807+            mr.get_blockhashes())
6808+        d.addCallback(lambda blockhashes:
6809+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6810+
6811+        d.addCallback(lambda ignored:
6812+            mr.get_sharehashes())
6813+        d.addCallback(lambda sharehashes:
6814+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6815+
6816+        d.addCallback(lambda ignored:
6817+            mr.get_signature())
6818+        d.addCallback(lambda signature:
6819+            self.failUnlessEqual(signature, self.signature))
6820+
6821+        d.addCallback(lambda ignored:
6822+            mr.get_verification_key())
6823+        d.addCallback(lambda verification_key:
6824+            self.failUnlessEqual(verification_key, self.verification_key))
6825+
6826+        d.addCallback(lambda ignored:
6827+            mr.get_seqnum())
6828+        d.addCallback(lambda seqnum:
6829+            self.failUnlessEqual(seqnum, 0))
6830+
6831+        d.addCallback(lambda ignored:
6832+            mr.get_root_hash())
6833+        d.addCallback(lambda root_hash:
6834+            self.failUnlessEqual(self.root_hash, root_hash))
6835+
6836+        d.addCallback(lambda ignored:
6837+            mr.get_encoding_parameters())
6838+        def _check_encoding_parameters((k, n, segsize, datalen)):
6839+            self.failUnlessEqual(k, 3)
6840+            self.failUnlessEqual(n, 10)
6841+            self.failUnlessEqual(segsize, 6)
6842+            self.failUnlessEqual(datalen, 36)
6843+        d.addCallback(_check_encoding_parameters)
6844+
6845+        d.addCallback(lambda ignored:
6846+            mr.get_checkstring())
6847+        d.addCallback(lambda checkstring:
6848+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
6849+        return d
6850+
6851+
6852+    def test_is_sdmf(self):
6853+        # The MDMFSlotReadProxy should also know how to read SDMF files,
6854+        # since it will encounter them on the grid. Callers use the
6855+        # is_sdmf method to test this.
6856+        self.write_sdmf_share_to_server("si1")
6857+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6858+        d = mr.is_sdmf()
6859+        d.addCallback(lambda issdmf:
6860+            self.failUnless(issdmf))
6861+        return d
6862+
6863+
6864+    def test_reads_sdmf(self):
6865+        # The slot read proxy should, naturally, know how to tell us
6866+        # about data in the SDMF format
6867+        self.write_sdmf_share_to_server("si1")
6868+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6869+        d = defer.succeed(None)
6870+        d.addCallback(lambda ignored:
6871+            mr.is_sdmf())
6872+        d.addCallback(lambda issdmf:
6873+            self.failUnless(issdmf))
6874+
6875+        # What do we need to read?
6876+        #  - The sharedata
6877+        #  - The salt
6878+        d.addCallback(lambda ignored:
6879+            mr.get_block_and_salt(0))
6880+        def _check_block_and_salt(results):
6881+            block, salt = results
6882+            # Our original file is 36 bytes long, so with k = 3 each
6883+            # share is 12 bytes in size. The share is composed entirely
6884+            # of the letter a. self.block contains two a's, so
6885+            # 6 * self.block is what we are looking for.
6886+            self.failUnlessEqual(block, self.block * 6)
6887+            self.failUnlessEqual(salt, self.salt)
6888+        d.addCallback(_check_block_and_salt)
6889+
6890+        #  - The blockhashes
6891+        d.addCallback(lambda ignored:
6892+            mr.get_blockhashes())
6893+        d.addCallback(lambda blockhashes:
6894+            self.failUnlessEqual(self.block_hash_tree,
6895+                                 blockhashes,
6896+                                 blockhashes))
6897+        #  - The sharehashes
6898+        d.addCallback(lambda ignored:
6899+            mr.get_sharehashes())
6900+        d.addCallback(lambda sharehashes:
6901+            self.failUnlessEqual(self.share_hash_chain,
6902+                                 sharehashes))
6903+        #  - The keys
6904+        d.addCallback(lambda ignored:
6905+            mr.get_encprivkey())
6906+        d.addCallback(lambda encprivkey:
6907+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
6908+        d.addCallback(lambda ignored:
6909+            mr.get_verification_key())
6910+        d.addCallback(lambda verification_key:
6911+            self.failUnlessEqual(verification_key,
6912+                                 self.verification_key,
6913+                                 verification_key))
6914+        #  - The signature
6915+        d.addCallback(lambda ignored:
6916+            mr.get_signature())
6917+        d.addCallback(lambda signature:
6918+            self.failUnlessEqual(signature, self.signature, signature))
6919+
6920+        #  - The sequence number
6921+        d.addCallback(lambda ignored:
6922+            mr.get_seqnum())
6923+        d.addCallback(lambda seqnum:
6924+            self.failUnlessEqual(seqnum, 0, seqnum))
6925+
6926+        #  - The root hash
6927+        d.addCallback(lambda ignored:
6928+            mr.get_root_hash())
6929+        d.addCallback(lambda root_hash:
6930+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
6931+        return d
6932+
6933+
6934+    def test_only_reads_one_segment_sdmf(self):
6935+        # SDMF shares have only one segment, so it doesn't make sense to
6936+        # read more segments than that. The reader should know this and
6937+        # complain if we try to do that.
6938+        self.write_sdmf_share_to_server("si1")
6939+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6940+        d = defer.succeed(None)
6941+        d.addCallback(lambda ignored:
6942+            mr.is_sdmf())
6943+        d.addCallback(lambda issdmf:
6944+            self.failUnless(issdmf))
6945+        d.addCallback(lambda ignored:
6946+            self.shouldFail(LayoutInvalid, "test bad segment",
6947+                            None,
6948+                            mr.get_block_and_salt, 1))
6949+        return d
6950+
6951+
6952+    def test_read_with_prefetched_mdmf_data(self):
6953+        # The MDMFSlotReadProxy will prefill certain fields if you pass
6954+        # it data that you have already fetched. This is useful for
6955+        # cases like the Servermap, which prefetches ~2kb of data while
6956+        # finding out which shares are on the remote peer so that it
6957+        # doesn't waste round trips.
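+        # Given the header layout checked in test_write (a 59-byte fixed
+        # header followed by six 8-byte offsets), 107 bytes appears to be
+        # exactly the header plus the offset table, which is why that
+        # length is used below.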
6958+        mdmf_data = self.build_test_mdmf_share()
6959+        self.write_test_share_to_server("si1")
6960+        def _make_mr(ignored, length):
6961+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
6962+            return mr
6963+
6964+        d = defer.succeed(None)
6965+        # This should be enough to fill in both the encoding parameters
6966+        # and the table of offsets, which will complete the version
6967+        # information tuple.
6968+        d.addCallback(_make_mr, 107)
6969+        d.addCallback(lambda mr:
6970+            mr.get_verinfo())
6971+        def _check_verinfo(verinfo):
6972+            self.failUnless(verinfo)
6973+            self.failUnlessEqual(len(verinfo), 9)
6974+            (seqnum,
6975+             root_hash,
6976+             salt_hash,
6977+             segsize,
6978+             datalen,
6979+             k,
6980+             n,
6981+             prefix,
6982+             offsets) = verinfo
6983+            self.failUnlessEqual(seqnum, 0)
6984+            self.failUnlessEqual(root_hash, self.root_hash)
6985+            self.failUnlessEqual(segsize, 6)
6986+            self.failUnlessEqual(datalen, 36)
6987+            self.failUnlessEqual(k, 3)
6988+            self.failUnlessEqual(n, 10)
6989+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
6990+                                          1,
6991+                                          seqnum,
6992+                                          root_hash,
6993+                                          k,
6994+                                          n,
6995+                                          segsize,
6996+                                          datalen)
6997+            self.failUnlessEqual(expected_prefix, prefix)
6998+            self.failUnlessEqual(self.rref.read_count, 0)
6999+        d.addCallback(_check_verinfo)
7000+        # This is not enough data to read a block and its salt, so the
7001+        # wrapper should attempt to read this from the remote server.
7002+        d.addCallback(_make_mr, 107)
7003+        d.addCallback(lambda mr:
7004+            mr.get_block_and_salt(0))
7005+        def _check_block_and_salt((block, salt)):
7006+            self.failUnlessEqual(block, self.block)
7007+            self.failUnlessEqual(salt, self.salt)
7008+            self.failUnlessEqual(self.rref.read_count, 1)
7009+        # This should be enough data to read one block.
7010+        d.addCallback(_make_mr, 249)
7011+        d.addCallback(lambda mr:
7012+            mr.get_block_and_salt(0))
7013+        d.addCallback(_check_block_and_salt)
7014+        return d
7015+
7016+
7017+    def test_read_with_prefetched_sdmf_data(self):
7018+        sdmf_data = self.build_test_sdmf_share()
7019+        self.write_sdmf_share_to_server("si1")
7020+        def _make_mr(ignored, length):
7021+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7022+            return mr
7023+
7024+        d = defer.succeed(None)
7025+        # This should be enough to get us the encoding parameters,
7026+        # offset table, and everything else we need to build a verinfo
7027+        # string.
7028+        d.addCallback(_make_mr, 107)
7029+        d.addCallback(lambda mr:
7030+            mr.get_verinfo())
7031+        def _check_verinfo(verinfo):
7032+            self.failUnless(verinfo)
7033+            self.failUnlessEqual(len(verinfo), 9)
7034+            (seqnum,
7035+             root_hash,
7036+             salt,
7037+             segsize,
7038+             datalen,
7039+             k,
7040+             n,
7041+             prefix,
7042+             offsets) = verinfo
7043+            self.failUnlessEqual(seqnum, 0)
7044+            self.failUnlessEqual(root_hash, self.root_hash)
7045+            self.failUnlessEqual(salt, self.salt)
7046+            self.failUnlessEqual(segsize, 36)
7047+            self.failUnlessEqual(datalen, 36)
7048+            self.failUnlessEqual(k, 3)
7049+            self.failUnlessEqual(n, 10)
7050+            expected_prefix = struct.pack(SIGNED_PREFIX,
7051+                                          0,
7052+                                          seqnum,
7053+                                          root_hash,
7054+                                          salt,
7055+                                          k,
7056+                                          n,
7057+                                          segsize,
7058+                                          datalen)
7059+            self.failUnlessEqual(expected_prefix, prefix)
7060+            self.failUnlessEqual(self.rref.read_count, 0)
7061+        d.addCallback(_check_verinfo)
7062+        # This shouldn't be enough to read any share data.
7063+        d.addCallback(_make_mr, 107)
7064+        d.addCallback(lambda mr:
7065+            mr.get_block_and_salt(0))
7066+        def _check_block_and_salt((block, salt)):
7067+            self.failUnlessEqual(block, self.block * 6)
7068+            self.failUnlessEqual(salt, self.salt)
7069+            # TODO: Fix the read routine so that it reads only the data
7070+            #       that it has cached if it can't read all of it.
7071+            self.failUnlessEqual(self.rref.read_count, 2)
7072+
7073+        # This should be enough to read share data.
7074+        d.addCallback(_make_mr, self.offsets['share_data'])
7075+        d.addCallback(lambda mr:
7076+            mr.get_block_and_salt(0))
7077+        d.addCallback(_check_block_and_salt)
7078+        return d
7079+
7080+
7081+    def test_read_with_empty_mdmf_file(self):
7082+        # Some tests upload a file with no contents to test things
7083+        # unrelated to the actual handling of the content of the file.
7084+        # The reader should behave intelligently in these cases.
7085+        self.write_test_share_to_server("si1", empty=True)
7086+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7087+        # We should be able to get the encoding parameters, and they
7088+        # should be correct.
7089+        d = defer.succeed(None)
7090+        d.addCallback(lambda ignored:
7091+            mr.get_encoding_parameters())
7092+        def _check_encoding_parameters(params):
7093+            self.failUnlessEqual(len(params), 4)
7094+            k, n, segsize, datalen = params
7095+            self.failUnlessEqual(k, 3)
7096+            self.failUnlessEqual(n, 10)
7097+            self.failUnlessEqual(segsize, 0)
7098+            self.failUnlessEqual(datalen, 0)
7099+        d.addCallback(_check_encoding_parameters)
7100+
7101+        # We should not be able to fetch a block, since there are no
7102+        # blocks to fetch
7103+        d.addCallback(lambda ignored:
7104+            self.shouldFail(LayoutInvalid, "get block on empty file",
7105+                            None,
7106+                            mr.get_block_and_salt, 0))
7107+        return d
7108+
7109+
7110+    def test_read_with_empty_sdmf_file(self):
7111+        self.write_sdmf_share_to_server("si1", empty=True)
7112+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7113+        # We should be able to get the encoding parameters, and they
7114+        # should be correct
7115+        d = defer.succeed(None)
7116+        d.addCallback(lambda ignored:
7117+            mr.get_encoding_parameters())
7118+        def _check_encoding_parameters(params):
7119+            self.failUnlessEqual(len(params), 4)
7120+            k, n, segsize, datalen = params
7121+            self.failUnlessEqual(k, 3)
7122+            self.failUnlessEqual(n, 10)
7123+            self.failUnlessEqual(segsize, 0)
7124+            self.failUnlessEqual(datalen, 0)
7125+        d.addCallback(_check_encoding_parameters)
7126+
7127+        # It does not make sense to get a block from an empty file, so
7128+        # we should not be able to.
7129+        d.addCallback(lambda ignored:
7130+            self.shouldFail(LayoutInvalid, "get block on an empty file",
7131+                            None,
7132+                            mr.get_block_and_salt, 0))
7133+        return d
7134+
7135+
7136+    def test_verinfo_with_sdmf_file(self):
7137+        self.write_sdmf_share_to_server("si1")
7138+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7139+        # We should be able to get the version information.
7140+        d = defer.succeed(None)
7141+        d.addCallback(lambda ignored:
7142+            mr.get_verinfo())
7143+        def _check_verinfo(verinfo):
7144+            self.failUnless(verinfo)
7145+            self.failUnlessEqual(len(verinfo), 9)
7146+            (seqnum,
7147+             root_hash,
7148+             salt,
7149+             segsize,
7150+             datalen,
7151+             k,
7152+             n,
7153+             prefix,
7154+             offsets) = verinfo
7155+            self.failUnlessEqual(seqnum, 0)
7156+            self.failUnlessEqual(root_hash, self.root_hash)
7157+            self.failUnlessEqual(salt, self.salt)
7158+            self.failUnlessEqual(segsize, 36)
7159+            self.failUnlessEqual(datalen, 36)
7160+            self.failUnlessEqual(k, 3)
7161+            self.failUnlessEqual(n, 10)
7162+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
7163+                                          0,
7164+                                          seqnum,
7165+                                          root_hash,
7166+                                          salt,
7167+                                          k,
7168+                                          n,
7169+                                          segsize,
7170+                                          datalen)
7171+            self.failUnlessEqual(prefix, expected_prefix)
7172+            self.failUnlessEqual(offsets, self.offsets)
7173+        d.addCallback(_check_verinfo)
7174+        return d
7175+
7176+
7177+    def test_verinfo_with_mdmf_file(self):
7178+        self.write_test_share_to_server("si1")
7179+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7180+        d = defer.succeed(None)
7181+        d.addCallback(lambda ignored:
7182+            mr.get_verinfo())
7183+        def _check_verinfo(verinfo):
7184+            self.failUnless(verinfo)
7185+            self.failUnlessEqual(len(verinfo), 9)
7186+            (seqnum,
7187+             root_hash,
7188+             IV,
7189+             segsize,
7190+             datalen,
7191+             k,
7192+             n,
7193+             prefix,
7194+             offsets) = verinfo
7195+            self.failUnlessEqual(seqnum, 0)
7196+            self.failUnlessEqual(root_hash, self.root_hash)
7197+            self.failIf(IV)
7198+            self.failUnlessEqual(segsize, 6)
7199+            self.failUnlessEqual(datalen, 36)
7200+            self.failUnlessEqual(k, 3)
7201+            self.failUnlessEqual(n, 10)
7202+            expected_prefix = struct.pack(">BQ32s BBQQ",
7203+                                          1,
7204+                                          seqnum,
7205+                                          root_hash,
7206+                                          k,
7207+                                          n,
7208+                                          segsize,
7209+                                          datalen)
7210+            self.failUnlessEqual(prefix, expected_prefix)
7211+            self.failUnlessEqual(offsets, self.offsets)
7212+        d.addCallback(_check_verinfo)
7213+        return d
7214+
7215+
7216+    def test_reader_queue(self):
7217+        self.write_test_share_to_server('si1')
7218+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7219+        d1 = mr.get_block_and_salt(0, queue=True)
7220+        d2 = mr.get_blockhashes(queue=True)
7221+        d3 = mr.get_sharehashes(queue=True)
7222+        d4 = mr.get_signature(queue=True)
7223+        d5 = mr.get_verification_key(queue=True)
7224+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
7225+        mr.flush()
7226+        def _print(results):
7227+            self.failUnlessEqual(len(results), 5)
7228+            # We have one read for version information and offsets, and
7229+            # one for everything else.
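+            # In other words, queue=True appears to batch the requested
+            # reads so that, beyond the initial read for version
+            # information and offsets, a single remote readv issued by
+            # flush() satisfies them all.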
7230+            self.failUnlessEqual(self.rref.read_count, 2)
7231+            block, salt = results[0][1] # each result is a (success,
7232+                                        # value) pair; [0] is the
7233+                                        # boolean, [1] the actual value.
7234+            self.failUnlessEqual(self.block, block)
7235+            self.failUnlessEqual(self.salt, salt)
7236+
7237+            blockhashes = results[1][1]
7238+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
7239+
7240+            sharehashes = results[2][1]
7241+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
7242+
7243+            signature = results[3][1]
7244+            self.failUnlessEqual(self.signature, signature)
7245+
7246+            verification_key = results[4][1]
7247+            self.failUnlessEqual(self.verification_key, verification_key)
7248+        dl.addCallback(_print)
7249+        return dl
7250+
7251+
7252+    def test_sdmf_writer(self):
7253+        # Go through the motions of writing an SDMF share to the storage
7254+        # server. Then read the storage server to see that the share got
7255+        # written in the way that we think it should have.
7256+
7257+        # We do this first so that the necessary instance variables get
7258+        # set the way we want them for the tests below.
7259+        data = self.build_test_sdmf_share()
7260+        sdmfr = SDMFSlotWriteProxy(0,
7261+                                   self.rref,
7262+                                   "si1",
7263+                                   self.secrets,
7264+                                   0, 3, 10, 36, 36)
7265+        # Put the block and salt.
7266+        sdmfr.put_block(self.blockdata, 0, self.salt)
7267+
7268+        # Put the encprivkey
7269+        sdmfr.put_encprivkey(self.encprivkey)
7270+
7271+        # Put the block and share hash chains
7272+        sdmfr.put_blockhashes(self.block_hash_tree)
7273+        sdmfr.put_sharehashes(self.share_hash_chain)
7274+        sdmfr.put_root_hash(self.root_hash)
7275+
7276+        # Put the signature
7277+        sdmfr.put_signature(self.signature)
7278+
7279+        # Put the verification key
7280+        sdmfr.put_verification_key(self.verification_key)
7281+
7282+        # Now check to make sure that nothing has been written yet.
7283+        self.failUnlessEqual(self.rref.write_count, 0)
7284+
7285+        # Now finish publishing
7286+        d = sdmfr.finish_publishing()
7287+        def _then(ignored):
7288+            self.failUnlessEqual(self.rref.write_count, 1)
7289+            read = self.ss.remote_slot_readv
7290+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
7291+                                 {0: [data]})
7292+        d.addCallback(_then)
7293+        return d
7294+
7295+
7296+    def test_sdmf_writer_preexisting_share(self):
7297+        data = self.build_test_sdmf_share()
7298+        self.write_sdmf_share_to_server("si1")
7299+
7300+        # Now there is a share on the storage server. To successfully
7301+        # write, we need to set the checkstring correctly. When we
7302+        # don't, no write should occur.
7303+        sdmfw = SDMFSlotWriteProxy(0,
7304+                                   self.rref,
7305+                                   "si1",
7306+                                   self.secrets,
7307+                                   1, 3, 10, 36, 36)
7308+        sdmfw.put_block(self.blockdata, 0, self.salt)
7309+
7310+        # Put the encprivkey
7311+        sdmfw.put_encprivkey(self.encprivkey)
7312+
7313+        # Put the block and share hash chains
7314+        sdmfw.put_blockhashes(self.block_hash_tree)
7315+        sdmfw.put_sharehashes(self.share_hash_chain)
7316+
7317+        # Put the root hash
7318+        sdmfw.put_root_hash(self.root_hash)
7319+
7320+        # Put the signature
7321+        sdmfw.put_signature(self.signature)
7322+
7323+        # Put the verification key
7324+        sdmfw.put_verification_key(self.verification_key)
7325+
7326+        # We shouldn't have a checkstring yet
7327+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
7328+
7329+        d = sdmfw.finish_publishing()
7330+        def _then(results):
7331+            self.failIf(results[0])
7332+            # this is the correct checkstring
7333+            self._expected_checkstring = results[1][0][0]
7334+            return self._expected_checkstring
7335+
7336+        d.addCallback(_then)
7337+        d.addCallback(sdmfw.set_checkstring)
7338+        d.addCallback(lambda ignored:
7339+            sdmfw.get_checkstring())
7340+        d.addCallback(lambda checkstring:
7341+            self.failUnlessEqual(checkstring, self._expected_checkstring))
7342+        d.addCallback(lambda ignored:
7343+            sdmfw.finish_publishing())
7344+        def _then_again(results):
7345+            self.failUnless(results[0])
7346+            read = self.ss.remote_slot_readv
7347+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7348+                                 {0: [struct.pack(">Q", 1)]})
7349+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
7350+                                 {0: [data[9:]]})
7351+        d.addCallback(_then_again)
7352+        return d
7353+
7354+
7355 class Stats(unittest.TestCase):
7356 
7357     def setUp(self):
7358}
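
A minimal usage sketch of the SDMF write-proxy flow exercised by test_sdmf_writer
and test_sdmf_writer_preexisting_share above. This is illustrative only and not
part of the patch; rref, secrets, and the share fields are assumed to come from a
fixture like the one in those tests.

    # Sketch: drive an SDMFSlotWriteProxy the way test_sdmf_writer does.
    # All put_* calls are buffered locally; the single remote write
    # happens in finish_publishing().
    sdmfw = SDMFSlotWriteProxy(0,        # share number
                               rref,     # remote reference to the storage server
                               "si1",    # storage index
                               secrets,  # (write enabler, renew secret, cancel secret)
                               0,        # sequence number
                               3, 10,    # k (required shares), N (total shares)
                               36, 36)   # segment size, data length
    sdmfw.put_block(blockdata, 0, salt)
    sdmfw.put_encprivkey(encprivkey)
    sdmfw.put_blockhashes(block_hash_tree)
    sdmfw.put_sharehashes(share_hash_chain)
    sdmfw.put_root_hash(root_hash)
    sdmfw.put_signature(signature)
    sdmfw.put_verification_key(verification_key)
    d = sdmfw.finish_publishing()  # Deferred that fires when the writev completes

When a share already exists (the second test above), the first finish_publishing()
fails its test vector; the read data returned with that failure supplies the
existing checkstring, and a retry after set_checkstring() succeeds.
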
7359[mutable/publish.py: cleanup + simplification
7360Kevan Carstensen <kevan@isnotajoke.com>**20100702225554
7361 Ignore-this: 36a58424ceceffb1ddc55cc5934399e2
7362] {
7363hunk ./src/allmydata/mutable/publish.py 19
7364      UncoordinatedWriteError, NotEnoughServersError
7365 from allmydata.mutable.servermap import ServerMap
7366 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
7367-     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
7368+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
7369+     SDMFSlotWriteProxy
7370 
7371 KiB = 1024
7372 DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
7373hunk ./src/allmydata/mutable/publish.py 24
7374+PUSHING_BLOCKS_STATE = 0
7375+PUSHING_EVERYTHING_ELSE_STATE = 1
7376+DONE_STATE = 2
7377 
7378 class PublishStatus:
7379     implements(IPublishStatus)
7380hunk ./src/allmydata/mutable/publish.py 229
7381 
7382         self.bad_share_checkstrings = {}
7383 
7384+        # This is set at the last step of the publishing process.
7385+        self.versioninfo = ""
7386+
7387         # we use the servermap to populate the initial goal: this way we will
7388         # try to update each existing share in place.
7389         for (peerid, shnum) in self._servermap.servermap:
7390hunk ./src/allmydata/mutable/publish.py 245
7391             self.bad_share_checkstrings[key] = old_checkstring
7392             self.connections[peerid] = self._servermap.connections[peerid]
7393 
7394-        # Now, the process dovetails -- if this is an SDMF file, we need
7395-        # to write an SDMF file. Otherwise, we need to write an MDMF
7396-        # file.
7397-        if self._version == MDMF_VERSION:
7398-            return self._publish_mdmf()
7399-        else:
7400-            return self._publish_sdmf()
7401-        #return self.done_deferred
7402-
7403-    def _publish_mdmf(self):
7404-        # Next, we find homes for all of the shares that we don't have
7405-        # homes for yet.
7406         # TODO: Make this part do peer selection.
7407         self.update_goal()
7408         self.writers = {}
7409hunk ./src/allmydata/mutable/publish.py 248
7410-        # For each (peerid, shnum) in self.goal, we make an
7411-        # MDMFSlotWriteProxy for that peer. We'll use this to write
7412+        if self._version == MDMF_VERSION:
7413+            writer_class = MDMFSlotWriteProxy
7414+        else:
7415+            writer_class = SDMFSlotWriteProxy
7416+
7417+        # For each (peerid, shnum) in self.goal, we make a
7418+        # write proxy for that peer. We'll use this to write
7419         # shares to the peer.
7420         for key in self.goal:
7421             peerid, shnum = key
7422hunk ./src/allmydata/mutable/publish.py 263
7423             cancel_secret = self._node.get_cancel_secret(peerid)
7424             secrets = (write_enabler, renew_secret, cancel_secret)
7425 
7426-            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
7427-                                                      self.connections[peerid],
7428-                                                      self._storage_index,
7429-                                                      secrets,
7430-                                                      self._new_seqnum,
7431-                                                      self.required_shares,
7432-                                                      self.total_shares,
7433-                                                      self.segment_size,
7434-                                                      len(self.newdata))
7435+            self.writers[shnum] =  writer_class(shnum,
7436+                                                self.connections[peerid],
7437+                                                self._storage_index,
7438+                                                secrets,
7439+                                                self._new_seqnum,
7440+                                                self.required_shares,
7441+                                                self.total_shares,
7442+                                                self.segment_size,
7443+                                                len(self.newdata))
7444+            self.writers[shnum].peerid = peerid
7445             if (peerid, shnum) in self._servermap.servermap:
7446                 old_versionid, old_timestamp = self._servermap.servermap[key]
7447                 (old_seqnum, old_root_hash, old_salt, old_segsize,
7448hunk ./src/allmydata/mutable/publish.py 278
7449                  old_datalength, old_k, old_N, old_prefix,
7450                  old_offsets_tuple) = old_versionid
7451-                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
7452+                self.writers[shnum].set_checkstring(old_seqnum,
7453+                                                    old_root_hash,
7454+                                                    old_salt)
7455+            elif (peerid, shnum) in self.bad_share_checkstrings:
7456+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
7457+                self.writers[shnum].set_checkstring(old_checkstring)
7458+
7459+        # Our remote shares will not have a complete checkstring until
7460+        # after we are done writing share data and have started to write
7461+        # blocks. In the meantime, we need to know what to look for when
7462+        # writing, so that we can detect UncoordinatedWriteErrors.
7463+        self._checkstring = self.writers.values()[0].get_checkstring()
7464 
7465         # Now, we start pushing shares.
7466         self._status.timings["setup"] = time.time() - self._started
7467hunk ./src/allmydata/mutable/publish.py 293
7468-        def _start_pushing(res):
7469-            self._started_pushing = time.time()
7470-            return res
7471-
7472         # First, we encrypt, encode, and publish the shares that we need
7473         # to encrypt, encode, and publish.
7474 
7475hunk ./src/allmydata/mutable/publish.py 306
7476 
7477         d = defer.succeed(None)
7478         self.log("Starting push")
7479-        for i in xrange(self.num_segments - 1):
7480-            d.addCallback(lambda ignored, i=i:
7481-                self.push_segment(i))
7482-            d.addCallback(self._turn_barrier)
7483-        # We have at least one segment, so we will have a tail segment
7484-        if self.num_segments > 0:
7485-            d.addCallback(lambda ignored:
7486-                self.push_tail_segment())
7487-
7488-        d.addCallback(lambda ignored:
7489-            self.push_encprivkey())
7490-        d.addCallback(lambda ignored:
7491-            self.push_blockhashes())
7492-        d.addCallback(lambda ignored:
7493-            self.push_sharehashes())
7494-        d.addCallback(lambda ignored:
7495-            self.push_toplevel_hashes_and_signature())
7496-        d.addCallback(lambda ignored:
7497-            self.finish_publishing())
7498-        return d
7499-
7500-
7501-    def _publish_sdmf(self):
7502-        self._status.timings["setup"] = time.time() - self._started
7503-        self.salt = os.urandom(16)
7504 
7505hunk ./src/allmydata/mutable/publish.py 307
7506-        d = self._encrypt_and_encode()
7507-        d.addCallback(self._generate_shares)
7508-        def _start_pushing(res):
7509-            self._started_pushing = time.time()
7510-            return res
7511-        d.addCallback(_start_pushing)
7512-        d.addCallback(self.loop) # trigger delivery
7513-        d.addErrback(self._fatal_error)
7514+        self._state = PUSHING_BLOCKS_STATE
7515+        self._push()
7516 
7517         return self.done_deferred
7518 
7519hunk ./src/allmydata/mutable/publish.py 327
7520                                                   segment_size)
7521         else:
7522             self.num_segments = 0
7523+
7524+        self.log("building encoding parameters for file")
7525+        self.log("got segsize %d" % self.segment_size)
7526+        self.log("got %d segments" % self.num_segments)
7527+
7528         if self._version == SDMF_VERSION:
7529             assert self.num_segments in (0, 1) # SDMF
7530hunk ./src/allmydata/mutable/publish.py 334
7531-            return
7532         # calculate the tail segment size.
7533hunk ./src/allmydata/mutable/publish.py 335
7534-        self.tail_segment_size = len(self.newdata) % segment_size
7535 
7536hunk ./src/allmydata/mutable/publish.py 336
7537-        if self.tail_segment_size == 0:
7538+        if segment_size and self.newdata:
7539+            self.tail_segment_size = len(self.newdata) % segment_size
7540+        else:
7541+            self.tail_segment_size = 0
7542+
7543+        if self.tail_segment_size == 0 and segment_size:
7544             # The tail segment is the same size as the other segments.
7545             self.tail_segment_size = segment_size
7546 
7547hunk ./src/allmydata/mutable/publish.py 345
7548-        # We'll make an encoder ahead-of-time for the normal-sized
7549-        # segments (defined as any segment of segment_size size.
7550-        # (the part of the code that puts the tail segment will make its
7551-        #  own encoder for that part)
7552+        # Make FEC encoders
7553         fec = codec.CRSEncoder()
7554         fec.set_params(self.segment_size,
7555                        self.required_shares, self.total_shares)
7556hunk ./src/allmydata/mutable/publish.py 352
7557         self.piece_size = fec.get_block_size()
7558         self.fec = fec
7559 
7560+        if self.tail_segment_size == self.segment_size:
7561+            self.tail_fec = self.fec
7562+        else:
7563+            tail_fec = codec.CRSEncoder()
7564+            tail_fec.set_params(self.tail_segment_size,
7565+                                self.required_shares,
7566+                                self.total_shares)
7567+            self.tail_fec = tail_fec
7568+
7569+        self._current_segment = 0
7570+
7571+
7572+    def _push(self, ignored=None):
7573+        """
7574+        I manage state transitions. In particular, I see that we still
7575+        have a good enough number of writers to complete the upload
7576+        successfully.
7577+        """
7578+        # Can we still successfully publish this file?
7579+        # TODO: Keep track of outstanding queries before aborting the
7580+        #       process.
7581+        if len(self.writers) <= self.required_shares or self.surprised:
7582+            return self._failure()
7583+
7584+        # Figure out what we need to do next. Each of these needs to
7585+        # return a deferred so that we don't block execution when this
7586+        # is first called in the upload method.
7587+        if self._state == PUSHING_BLOCKS_STATE:
7588+            return self.push_segment(self._current_segment)
7589+
7590+        # XXX: Do we want more granularity in states? Is that useful at
7591+        #      all?
7592+        #      Yes -- quicker reaction to UCW.
7593+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
7594+            return self.push_everything_else()
7595+
7596+        # If we make it to this point, we were successful in placing the
7597+        # file.
7598+        return self._done(None)
7599+
7600 
7601     def push_segment(self, segnum):
7602hunk ./src/allmydata/mutable/publish.py 394
7603+        if self.num_segments == 0 and self._version == SDMF_VERSION:
7604+            self._add_dummy_salts()
7605+
7606+        if segnum == self.num_segments:
7607+            # We don't have any more segments to push.
7608+            self._state = PUSHING_EVERYTHING_ELSE_STATE
7609+            return self._push()
7610+
7611+        d = self._encode_segment(segnum)
7612+        d.addCallback(self._push_segment, segnum)
7613+        def _increment_segnum(ign):
7614+            self._current_segment += 1
7615+        # XXX: I don't think we need to do addBoth here -- any errBacks
7616+        # should be handled within push_segment.
7617+        d.addBoth(_increment_segnum)
7618+        d.addBoth(self._push)
7619+
7620+
7621+    def _add_dummy_salts(self):
7622+        """
7623+        SDMF files need a salt even if they're empty, or the signature
7624+        won't make sense. This method adds a dummy salt to each of our
7625+        SDMF writers so that they can write the signature later.
7626+        """
7627+        salt = os.urandom(16)
7628+        assert self._version == SDMF_VERSION
7629+
7630+        for writer in self.writers.itervalues():
7631+            writer.put_salt(salt)
7632+
7633+
7634+    def _encode_segment(self, segnum):
7635+        """
7636+        I encrypt and encode the segment segnum.
7637+        """
7638         started = time.time()
7639hunk ./src/allmydata/mutable/publish.py 430
7640-        segsize = self.segment_size
7641+
7642+        if segnum + 1 == self.num_segments:
7643+            segsize = self.tail_segment_size
7644+        else:
7645+            segsize = self.segment_size
7646+
7647+
7648+        offset = self.segment_size * segnum
7649+        length = segsize + offset
7650         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
7651hunk ./src/allmydata/mutable/publish.py 440
7652-        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
7653+        data = self.newdata[offset:length]
7654         assert len(data) == segsize
7655 
7656         salt = os.urandom(16)
7657hunk ./src/allmydata/mutable/publish.py 455
7658         started = now
7659 
7660         # now apply FEC
7661+        if segnum + 1 == self.num_segments:
7662+            fec = self.tail_fec
7663+        else:
7664+            fec = self.fec
7665 
7666         self._status.set_status("Encoding")
7667         crypttext_pieces = [None] * self.required_shares
7668hunk ./src/allmydata/mutable/publish.py 462
7669-        piece_size = self.piece_size
7670+        piece_size = fec.get_block_size()
7671         for i in range(len(crypttext_pieces)):
7672             offset = i * piece_size
7673             piece = crypttext[offset:offset+piece_size]
7674hunk ./src/allmydata/mutable/publish.py 469
7675             piece = piece + "\x00"*(piece_size - len(piece)) # padding
7676             crypttext_pieces[i] = piece
7677             assert len(piece) == piece_size
7678-        d = self.fec.encode(crypttext_pieces)
7679+        d = fec.encode(crypttext_pieces)
7680         def _done_encoding(res):
7681             elapsed = time.time() - started
7682             self._status.timings["encode"] = elapsed
7683hunk ./src/allmydata/mutable/publish.py 473
7684-            return res
7685+            return (res, salt)
7686         d.addCallback(_done_encoding)
7687hunk ./src/allmydata/mutable/publish.py 475
7688-
7689-        def _push_shares_and_salt(results):
7690-            shares, shareids = results
7691-            dl = []
7692-            for i in xrange(len(shares)):
7693-                sharedata = shares[i]
7694-                shareid = shareids[i]
7695-                block_hash = hashutil.block_hash(salt + sharedata)
7696-                self.blockhashes[shareid].append(block_hash)
7697-
7698-                # find the writer for this share
7699-                d = self.writers[shareid].put_block(sharedata, segnum, salt)
7700-                dl.append(d)
7701-            # TODO: Naturally, we need to check on the results of these.
7702-            return defer.DeferredList(dl)
7703-        d.addCallback(_push_shares_and_salt)
7704         return d
7705 
7706 
7707hunk ./src/allmydata/mutable/publish.py 478
7708-    def push_tail_segment(self):
7709-        # This is essentially the same as push_segment, except that we
7710-        # don't use the cached encoder that we use elsewhere.
7711-        self.log("Pushing tail segment")
7712+    def _push_segment(self, encoded_and_salt, segnum):
7713+        """
7714+        I push (data, salt) as segment number segnum.
7715+        """
7716+        results, salt = encoded_and_salt
7717+        shares, shareids = results
7718         started = time.time()
7719hunk ./src/allmydata/mutable/publish.py 485
7720-        segsize = self.segment_size
7721-        data = self.newdata[segsize * (self.num_segments-1):]
7722-        assert len(data) == self.tail_segment_size
7723-        salt = os.urandom(16)
7724-
7725-        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
7726-        enc = AES(key)
7727-        crypttext = enc.process(data)
7728-        assert len(crypttext) == len(data)
7729+        dl = []
7730+        for i in xrange(len(shares)):
7731+            sharedata = shares[i]
7732+            shareid = shareids[i]
7733+            if self._version == MDMF_VERSION:
7734+                hashed = salt + sharedata
7735+            else:
7736+                hashed = sharedata
7737+            block_hash = hashutil.block_hash(hashed)
7738+            self.blockhashes[shareid].append(block_hash)
7739 
7740hunk ./src/allmydata/mutable/publish.py 496
7741-        now = time.time()
7742-        self._status.timings['encrypt'] = now - started
7743-        started = now
7744+            # find the writer for this share
7745+            writer = self.writers[shareid]
7746+            d = writer.put_block(sharedata, segnum, salt)
7747+            d.addCallback(self._got_write_answer, writer, started)
7748+            d.addErrback(self._connection_problem, writer)
7749+            dl.append(d)
7750+            # TODO: Naturally, we need to check on the results of these.
7751+        return defer.DeferredList(dl)
7752 
7753hunk ./src/allmydata/mutable/publish.py 505
7754-        self._status.set_status("Encoding")
7755-        tail_fec = codec.CRSEncoder()
7756-        tail_fec.set_params(self.tail_segment_size,
7757-                            self.required_shares,
7758-                            self.total_shares)
7759 
7760hunk ./src/allmydata/mutable/publish.py 506
7761-        crypttext_pieces = [None] * self.required_shares
7762-        piece_size = tail_fec.get_block_size()
7763-        for i in range(len(crypttext_pieces)):
7764-            offset = i * piece_size
7765-            piece = crypttext[offset:offset+piece_size]
7766-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
7767-            crypttext_pieces[i] = piece
7768-            assert len(piece) == piece_size
7769-        d = tail_fec.encode(crypttext_pieces)
7770-        def _push_shares_and_salt(results):
7771-            shares, shareids = results
7772-            dl = []
7773-            for i in xrange(len(shares)):
7774-                sharedata = shares[i]
7775-                shareid = shareids[i]
7776-                block_hash = hashutil.block_hash(salt + sharedata)
7777-                self.blockhashes[shareid].append(block_hash)
7778-                # find the writer for this share
7779-                d = self.writers[shareid].put_block(sharedata,
7780-                                                    self.num_segments - 1,
7781-                                                    salt)
7782-                dl.append(d)
7783-            # TODO: Naturally, we need to check on the results of these.
7784-            return defer.DeferredList(dl)
7785-        d.addCallback(_push_shares_and_salt)
7786+    def push_everything_else(self):
7787+        """
7788+        I put everything else associated with a share.
7789+        """
7790+        encprivkey = self._encprivkey
7791+        d = self.push_encprivkey()
7792+        d.addCallback(self.push_blockhashes)
7793+        d.addCallback(self.push_sharehashes)
7794+        d.addCallback(self.push_toplevel_hashes_and_signature)
7795+        d.addCallback(self.finish_publishing)
7796+        def _change_state(ignored):
7797+            self._state = DONE_STATE
7798+        d.addCallback(_change_state)
7799+        d.addCallback(self._push)
7800         return d
7801 
7802 
7803hunk ./src/allmydata/mutable/publish.py 527
7804         started = time.time()
7805         encprivkey = self._encprivkey
7806         dl = []
7807-        def _spy_on_writer(results):
7808-            print results
7809-            return results
7810-        for shnum, writer in self.writers.iteritems():
7811+        for writer in self.writers.itervalues():
7812             d = writer.put_encprivkey(encprivkey)
7813hunk ./src/allmydata/mutable/publish.py 529
7814+            d.addCallback(self._got_write_answer, writer, started)
7815+            d.addErrback(self._connection_problem, writer)
7816             dl.append(d)
7817         d = defer.DeferredList(dl)
7818         return d
7819hunk ./src/allmydata/mutable/publish.py 536
7820 
7821 
7822-    def push_blockhashes(self):
7823+    def push_blockhashes(self, ignored):
7824         started = time.time()
7825         dl = []
7826hunk ./src/allmydata/mutable/publish.py 539
7827-        def _spy_on_results(results):
7828-            print results
7829-            return results
7830         self.sharehash_leaves = [None] * len(self.blockhashes)
7831         for shnum, blockhashes in self.blockhashes.iteritems():
7832             t = hashtree.HashTree(blockhashes)
7833hunk ./src/allmydata/mutable/publish.py 545
7834             self.blockhashes[shnum] = list(t)
7835             # set the leaf for future use.
7836             self.sharehash_leaves[shnum] = t[0]
7837-            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
7838+            writer = self.writers[shnum]
7839+            d = writer.put_blockhashes(self.blockhashes[shnum])
7840+            d.addCallback(self._got_write_answer, writer, started)
7841+            d.addErrback(self._connection_problem, self.writers[shnum])
7842             dl.append(d)
7843         d = defer.DeferredList(dl)
7844         return d
7845hunk ./src/allmydata/mutable/publish.py 554
7846 
7847 
7848-    def push_sharehashes(self):
7849+    def push_sharehashes(self, ignored):
7850+        started = time.time()
7851         share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
7852         share_hash_chain = {}
7853         ds = []
7854hunk ./src/allmydata/mutable/publish.py 559
7855-        def _spy_on_results(results):
7856-            print results
7857-            return results
7858         for shnum in xrange(len(self.sharehash_leaves)):
7859             needed_indices = share_hash_tree.needed_hashes(shnum)
7860             self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
7861hunk ./src/allmydata/mutable/publish.py 563
7862                                              for i in needed_indices] )
7863-            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
7864+            writer = self.writers[shnum]
7865+            d = writer.put_sharehashes(self.sharehashes[shnum])
7866+            d.addCallback(self._got_write_answer, writer, started)
7867+            d.addErrback(self._connection_problem, writer)
7868             ds.append(d)
7869         self.root_hash = share_hash_tree[0]
7870         d = defer.DeferredList(ds)
7871hunk ./src/allmydata/mutable/publish.py 573
7872         return d
7873 
7874 
7875-    def push_toplevel_hashes_and_signature(self):
7876+    def push_toplevel_hashes_and_signature(self, ignored):
7877         # We need to do three things here:
7878         #   - Push the root hash and salt hash
7879         #   - Get the checkstring of the resulting layout; sign that.
7880hunk ./src/allmydata/mutable/publish.py 578
7881         #   - Push the signature
7882+        started = time.time()
7883         ds = []
7884hunk ./src/allmydata/mutable/publish.py 580
7885-        def _spy_on_results(results):
7886-            print results
7887-            return results
7888         for shnum in xrange(self.total_shares):
7889hunk ./src/allmydata/mutable/publish.py 581
7890-            d = self.writers[shnum].put_root_hash(self.root_hash)
7891+            writer = self.writers[shnum]
7892+            d = writer.put_root_hash(self.root_hash)
7893+            d.addCallback(self._got_write_answer, writer, started)
7894             ds.append(d)
7895         d = defer.DeferredList(ds)
7896hunk ./src/allmydata/mutable/publish.py 586
7897-        def _make_and_place_signature(ignored):
7898-            signable = self.writers[0].get_signable()
7899-            self.signature = self._privkey.sign(signable)
7900-
7901-            ds = []
7902-            for (shnum, writer) in self.writers.iteritems():
7903-                d = writer.put_signature(self.signature)
7904-                ds.append(d)
7905-            return defer.DeferredList(ds)
7906-        d.addCallback(_make_and_place_signature)
7907+        d.addCallback(self._update_checkstring)
7908+        d.addCallback(self._make_and_place_signature)
7909         return d
7910 
7911 
7912hunk ./src/allmydata/mutable/publish.py 591
7913-    def finish_publishing(self):
7914+    def _update_checkstring(self, ignored):
7915+        """
7916+        After putting the root hash, MDMF files will have the
7917+        checkstring written to the storage server. This means that we
7918+        can update our copy of the checkstring so we can detect
7919+        uncoordinated writes. SDMF files will have the same checkstring,
7920+        so we need not do anything.
7921+        """
7922+        self._checkstring = self.writers.values()[0].get_checkstring()
7923+
7924+
7925+    def _make_and_place_signature(self, ignored):
7926+        """
7927+        I create and place the signature.
7928+        """
7929+        started = time.time()
7930+        signable = self.writers[0].get_signable()
7931+        self.signature = self._privkey.sign(signable)
7932+
7933+        ds = []
7934+        for (shnum, writer) in self.writers.iteritems():
7935+            d = writer.put_signature(self.signature)
7936+            d.addCallback(self._got_write_answer, writer, started)
7937+            d.addErrback(self._connection_problem, writer)
7938+            ds.append(d)
7939+        return defer.DeferredList(ds)
7940+
7941+
7942+    def finish_publishing(self, ignored):
7943         # We're almost done -- we just need to put the verification key
7944         # and the offsets
7945hunk ./src/allmydata/mutable/publish.py 622
7946+        started = time.time()
7947         ds = []
7948         verification_key = self._pubkey.serialize()
7949 
7950hunk ./src/allmydata/mutable/publish.py 626
7951-        def _spy_on_results(results):
7952-            print results
7953-            return results
7954+
7955+        # TODO: Bad, since we remove from this same dict. We need to
7956+        # make a copy, or just use a non-iterated value.
7957         for (shnum, writer) in self.writers.iteritems():
7958             d = writer.put_verification_key(verification_key)
7959hunk ./src/allmydata/mutable/publish.py 631
7960+            d.addCallback(self._got_write_answer, writer, started)
7961+            d.addCallback(self._record_verinfo)
7962             d.addCallback(lambda ignored, writer=writer:
7963                 writer.finish_publishing())
7964hunk ./src/allmydata/mutable/publish.py 635
7965+            d.addCallback(self._got_write_answer, writer, started)
7966+            d.addErrback(self._connection_problem, writer)
7967             ds.append(d)
7968         return defer.DeferredList(ds)
7969 
7970hunk ./src/allmydata/mutable/publish.py 641
7971 
7972-    def _turn_barrier(self, res):
7973-        # putting this method in a Deferred chain imposes a guaranteed
7974-        # reactor turn between the pre- and post- portions of that chain.
7975-        # This can be useful to limit memory consumption: since Deferreds do
7976-        # not do tail recursion, code which uses defer.succeed(result) for
7977-        # consistency will cause objects to live for longer than you might
7978-        # normally expect.
7979-        return fireEventually(res)
7980+    def _record_verinfo(self, ignored):
7981+        self.versioninfo = self.writers.values()[0].get_verinfo()
7982 
7983 
7984hunk ./src/allmydata/mutable/publish.py 645
7985-    def _fatal_error(self, f):
7986-        self.log("error during loop", failure=f, level=log.UNUSUAL)
7987-        self._done(f)
7988+    def _connection_problem(self, f, writer):
7989+        """
7990+        We ran into a connection problem while working with writer, and
7991+        need to deal with that.
7992+        """
7993+        self.log("found problem: %s" % str(f))
7994+        self._last_failure = f
7995+        del(self.writers[writer.shnum])
7996 
7997hunk ./src/allmydata/mutable/publish.py 654
7998-    def _update_status(self):
7999-        self._status.set_status("Sending Shares: %d placed out of %d, "
8000-                                "%d messages outstanding" %
8001-                                (len(self.placed),
8002-                                 len(self.goal),
8003-                                 len(self.outstanding)))
8004-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
8005 
8006     def loop(self, ignored=None):
8007         self.log("entering loop", level=log.NOISY)
8008hunk ./src/allmydata/mutable/publish.py 778
8009             self.log_goal(self.goal, "after update: ")
8010 
8011 
8012-    def _encrypt_and_encode(self):
8013-        # this returns a Deferred that fires with a list of (sharedata,
8014-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
8015-        # shares that we care about.
8016-        self.log("_encrypt_and_encode")
8017-
8018-        self._status.set_status("Encrypting")
8019-        started = time.time()
8020+    def _got_write_answer(self, answer, writer, started):
8021+        if not answer:
8022+            # SDMF writers only pretend to write when callers set their
8023+            # blocks, salts, and so on -- they actually just write once,
8024+            # at the end of the upload process. In fake writes, they
8025+            # return defer.succeed(None). If we see that, we shouldn't
8026+            # bother checking it.
8027+            return
8028 
8029hunk ./src/allmydata/mutable/publish.py 787
8030-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
8031-        enc = AES(key)
8032-        crypttext = enc.process(self.newdata)
8033-        assert len(crypttext) == len(self.newdata)
8034+        peerid = writer.peerid
8035+        lp = self.log("_got_write_answer from %s, share %d" %
8036+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
8037 
8038         now = time.time()
8039hunk ./src/allmydata/mutable/publish.py 792
8040-        self._status.timings["encrypt"] = now - started
8041-        started = now
8042-
8043-        # now apply FEC
8044-
8045-        self._status.set_status("Encoding")
8046-        fec = codec.CRSEncoder()
8047-        fec.set_params(self.segment_size,
8048-                       self.required_shares, self.total_shares)
8049-        piece_size = fec.get_block_size()
8050-        crypttext_pieces = [None] * self.required_shares
8051-        for i in range(len(crypttext_pieces)):
8052-            offset = i * piece_size
8053-            piece = crypttext[offset:offset+piece_size]
8054-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
8055-            crypttext_pieces[i] = piece
8056-            assert len(piece) == piece_size
8057-
8058-        d = fec.encode(crypttext_pieces)
8059-        def _done_encoding(res):
8060-            elapsed = time.time() - started
8061-            self._status.timings["encode"] = elapsed
8062-            return res
8063-        d.addCallback(_done_encoding)
8064-        return d
8065-
8066-
8067-    def _generate_shares(self, shares_and_shareids):
8068-        # this sets self.shares and self.root_hash
8069-        self.log("_generate_shares")
8070-        self._status.set_status("Generating Shares")
8071-        started = time.time()
8072-
8073-        # we should know these by now
8074-        privkey = self._privkey
8075-        encprivkey = self._encprivkey
8076-        pubkey = self._pubkey
8077-
8078-        (shares, share_ids) = shares_and_shareids
8079-
8080-        assert len(shares) == len(share_ids)
8081-        assert len(shares) == self.total_shares
8082-        all_shares = {}
8083-        block_hash_trees = {}
8084-        share_hash_leaves = [None] * len(shares)
8085-        for i in range(len(shares)):
8086-            share_data = shares[i]
8087-            shnum = share_ids[i]
8088-            all_shares[shnum] = share_data
8089-
8090-            # build the block hash tree. SDMF has only one leaf.
8091-            leaves = [hashutil.block_hash(share_data)]
8092-            t = hashtree.HashTree(leaves)
8093-            block_hash_trees[shnum] = list(t)
8094-            share_hash_leaves[shnum] = t[0]
8095-        for leaf in share_hash_leaves:
8096-            assert leaf is not None
8097-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
8098-        share_hash_chain = {}
8099-        for shnum in range(self.total_shares):
8100-            needed_hashes = share_hash_tree.needed_hashes(shnum)
8101-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
8102-                                              for i in needed_hashes ] )
8103-        root_hash = share_hash_tree[0]
8104-        assert len(root_hash) == 32
8105-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
8106-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
8107-
8108-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
8109-                             self.required_shares, self.total_shares,
8110-                             self.segment_size, len(self.newdata))
8111-
8112-        # now pack the beginning of the share. All shares are the same up
8113-        # to the signature, then they have divergent share hash chains,
8114-        # then completely different block hash trees + salt + share data,
8115-        # then they all share the same encprivkey at the end. The sizes
8116-        # of everything are the same for all shares.
8117-
8118-        sign_started = time.time()
8119-        signature = privkey.sign(prefix)
8120-        self._status.timings["sign"] = time.time() - sign_started
8121-
8122-        verification_key = pubkey.serialize()
8123-
8124-        final_shares = {}
8125-        for shnum in range(self.total_shares):
8126-            final_share = pack_share(prefix,
8127-                                     verification_key,
8128-                                     signature,
8129-                                     share_hash_chain[shnum],
8130-                                     block_hash_trees[shnum],
8131-                                     all_shares[shnum],
8132-                                     encprivkey)
8133-            final_shares[shnum] = final_share
8134-        elapsed = time.time() - started
8135-        self._status.timings["pack"] = elapsed
8136-        self.shares = final_shares
8137-        self.root_hash = root_hash
8138-
8139-        # we also need to build up the version identifier for what we're
8140-        # pushing. Extract the offsets from one of our shares.
8141-        assert final_shares
8142-        offsets = unpack_header(final_shares.values()[0])[-1]
8143-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
8144-        verinfo = (self._new_seqnum, root_hash, self.salt,
8145-                   self.segment_size, len(self.newdata),
8146-                   self.required_shares, self.total_shares,
8147-                   prefix, offsets_tuple)
8148-        self.versioninfo = verinfo
8149-
8150-
8151-
8152-    def _send_shares(self, needed):
8153-        self.log("_send_shares")
8154-
8155-        # we're finally ready to send out our shares. If we encounter any
8156-        # surprises here, it's because somebody else is writing at the same
8157-        # time. (Note: in the future, when we remove the _query_peers() step
8158-        # and instead speculate about [or remember] which shares are where,
8159-        # surprises here are *not* indications of UncoordinatedWriteError,
8160-        # and we'll need to respond to them more gracefully.)
8161-
8162-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
8163-        # organize it by peerid.
8164-
8165-        peermap = DictOfSets()
8166-        for (peerid, shnum) in needed:
8167-            peermap.add(peerid, shnum)
8168-
8169-        # the next thing is to build up a bunch of test vectors. The
8170-        # semantics of Publish are that we perform the operation if the world
8171-        # hasn't changed since the ServerMap was constructed (more or less).
8172-        # For every share we're trying to place, we create a test vector that
8173-        # tests to see if the server*share still corresponds to the
8174-        # map.
8175-
8176-        all_tw_vectors = {} # maps peerid to tw_vectors
8177-        sm = self._servermap.servermap
8178-
8179-        for key in needed:
8180-            (peerid, shnum) = key
8181-
8182-            if key in sm:
8183-                # an old version of that share already exists on the
8184-                # server, according to our servermap. We will create a
8185-                # request that attempts to replace it.
8186-                old_versionid, old_timestamp = sm[key]
8187-                (old_seqnum, old_root_hash, old_salt, old_segsize,
8188-                 old_datalength, old_k, old_N, old_prefix,
8189-                 old_offsets_tuple) = old_versionid
8190-                old_checkstring = pack_checkstring(old_seqnum,
8191-                                                   old_root_hash,
8192-                                                   old_salt)
8193-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8194-
8195-            elif key in self.bad_share_checkstrings:
8196-                old_checkstring = self.bad_share_checkstrings[key]
8197-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8198-
8199-            else:
8200-                # add a testv that requires the share not exist
8201-
8202-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
8203-                # constraints are handled. If the same object is referenced
8204-                # multiple times inside the arguments, foolscap emits a
8205-                # 'reference' token instead of a distinct copy of the
8206-                # argument. The bug is that these 'reference' tokens are not
8207-                # accepted by the inbound constraint code. To work around
8208-                # this, we need to prevent python from interning the
8209-                # (constant) tuple, by creating a new copy of this vector
8210-                # each time.
8211-
8212-                # This bug is fixed in foolscap-0.2.6, and even though this
8213-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
8214-                # supposed to be able to interoperate with older versions of
8215-                # Tahoe which are allowed to use older versions of foolscap,
8216-                # including foolscap-0.2.5 . In addition, I've seen other
8217-                # foolscap problems triggered by 'reference' tokens (see #541
8218-                # for details). So we must keep this workaround in place.
8219-
8220-                #testv = (0, 1, 'eq', "")
8221-                testv = tuple([0, 1, 'eq', ""])
8222-
8223-            testvs = [testv]
8224-            # the write vector is simply the share
8225-            writev = [(0, self.shares[shnum])]
8226-
8227-            if peerid not in all_tw_vectors:
8228-                all_tw_vectors[peerid] = {}
8229-                # maps shnum to (testvs, writevs, new_length)
8230-            assert shnum not in all_tw_vectors[peerid]
8231-
8232-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
8233-
8234-        # we read the checkstring back from each share, however we only use
8235-        # it to detect whether there was a new share that we didn't know
8236-        # about. The success or failure of the write will tell us whether
8237-        # there was a collision or not. If there is a collision, the first
8238-        # thing we'll do is update the servermap, which will find out what
8239-        # happened. We could conceivably reduce a roundtrip by using the
8240-        # readv checkstring to populate the servermap, but really we'd have
8241-        # to read enough data to validate the signatures too, so it wouldn't
8242-        # be an overall win.
8243-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
8244-
8245-        # ok, send the messages!
8246-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
8247-        started = time.time()
8248-        for (peerid, tw_vectors) in all_tw_vectors.items():
8249-
8250-            write_enabler = self._node.get_write_enabler(peerid)
8251-            renew_secret = self._node.get_renewal_secret(peerid)
8252-            cancel_secret = self._node.get_cancel_secret(peerid)
8253-            secrets = (write_enabler, renew_secret, cancel_secret)
8254-            shnums = tw_vectors.keys()
8255-
8256-            for shnum in shnums:
8257-                self.outstanding.add( (peerid, shnum) )
8258-
8259-            d = self._do_testreadwrite(peerid, secrets,
8260-                                       tw_vectors, read_vector)
8261-            d.addCallbacks(self._got_write_answer, self._got_write_error,
8262-                           callbackArgs=(peerid, shnums, started),
8263-                           errbackArgs=(peerid, shnums, started))
8264-            # tolerate immediate errback, like with DeadReferenceError
8265-            d.addBoth(fireEventually)
8266-            d.addCallback(self.loop)
8267-            d.addErrback(self._fatal_error)
8268-
8269-        self._update_status()
8270-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
8271+        elapsed = now - started
8272 
8273hunk ./src/allmydata/mutable/publish.py 794
8274-    def _do_testreadwrite(self, peerid, secrets,
8275-                          tw_vectors, read_vector):
8276-        storage_index = self._storage_index
8277-        ss = self.connections[peerid]
8278+        self._status.add_per_server_time(peerid, elapsed)
8279 
8280hunk ./src/allmydata/mutable/publish.py 796
8281-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
8282-        d = ss.callRemote("slot_testv_and_readv_and_writev",
8283-                          storage_index,
8284-                          secrets,
8285-                          tw_vectors,
8286-                          read_vector)
8287-        return d
8288+        wrote, read_data = answer
8289 
8290hunk ./src/allmydata/mutable/publish.py 798
8291-    def _got_write_answer(self, answer, peerid, shnums, started):
8292-        lp = self.log("_got_write_answer from %s" %
8293-                      idlib.shortnodeid_b2a(peerid))
8294-        for shnum in shnums:
8295-            self.outstanding.discard( (peerid, shnum) )
8296+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
8297 
8298hunk ./src/allmydata/mutable/publish.py 800
8299-        now = time.time()
8300-        elapsed = now - started
8301-        self._status.add_per_server_time(peerid, elapsed)
8302+        # We need to remove from surprise_shares any shares that we are
8303+        # knowingly also writing to that peer from other writers.
8304 
8305hunk ./src/allmydata/mutable/publish.py 803
8306-        wrote, read_data = answer
8307+        # TODO: Precompute this.
8308+        known_shnums = [x.shnum for x in self.writers.values()
8309+                        if x.peerid == peerid]
8310+        surprise_shares -= set(known_shnums)
8311+        self.log("found the following surprise shares: %s" %
8312+                 str(surprise_shares))
8313 
8314hunk ./src/allmydata/mutable/publish.py 810
8315-        surprise_shares = set(read_data.keys()) - set(shnums)
8316+        # Now surprise shares contains all of the shares that we did not
8317+        # expect to be there.
8318 
8319         surprised = False
8320         for shnum in surprise_shares:
8321hunk ./src/allmydata/mutable/publish.py 817
8322             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
8323             checkstring = read_data[shnum][0]
8324-            their_version_info = unpack_checkstring(checkstring)
8325-            if their_version_info == self._new_version_info:
8326+            # What we want to do here is to see if their (seqnum,
8327+            # roothash, salt) is the same as our (seqnum, roothash,
8328+            # salt), or the equivalent for MDMF. The best way to do this
8329+            # is to store a packed representation of our checkstring
8330+            # somewhere, then not bother unpacking the other
8331+            # checkstring.
8332+            if checkstring == self._checkstring:
8333                 # they have the right share, somehow
8334 
8335                 if (peerid,shnum) in self.goal:
8336hunk ./src/allmydata/mutable/publish.py 902
8337             self.log("our testv failed, so the write did not happen",
8338                      parent=lp, level=log.WEIRD, umid="8sc26g")
8339             self.surprised = True
8340-            self.bad_peers.add(peerid) # don't ask them again
8341+            # TODO: This needs to
8342+            self.bad_peers.add(writer) # don't ask them again
8343             # use the checkstring to add information to the log message
8344             for (shnum,readv) in read_data.items():
8345                 checkstring = readv[0]
8346hunk ./src/allmydata/mutable/publish.py 928
8347             # self.loop() will take care of finding new homes
8348             return
8349 
8350-        for shnum in shnums:
8351-            self.placed.add( (peerid, shnum) )
8352-            # and update the servermap
8353-            self._servermap.add_new_share(peerid, shnum,
8354+        # and update the servermap
8355+        # self.versioninfo is set during the last phase of publishing.
8356+        # If we get there, we know that responses correspond to placed
8357+        # shares, and can safely execute these statements.
8358+        if self.versioninfo:
8359+            self.log("wrote successfully: adding new share to servermap")
8360+            self._servermap.add_new_share(peerid, writer.shnum,
8361                                           self.versioninfo, started)
8362hunk ./src/allmydata/mutable/publish.py 936
8363-
8364-        # self.loop() will take care of checking to see if we're done
8365-        return
8366+            self.placed.add( (peerid, writer.shnum) )
8367 
8368hunk ./src/allmydata/mutable/publish.py 938
8369-    def _got_write_error(self, f, peerid, shnums, started):
8370-        for shnum in shnums:
8371-            self.outstanding.discard( (peerid, shnum) )
8372-        self.bad_peers.add(peerid)
8373-        if self._first_write_error is None:
8374-            self._first_write_error = f
8375-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
8376-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
8377-                 failure=f,
8378-                 level=log.UNUSUAL)
8379         # self.loop() will take care of checking to see if we're done
8380         return
8381 
8382hunk ./src/allmydata/mutable/publish.py 949
8383         now = time.time()
8384         self._status.timings["total"] = now - self._started
8385         self._status.set_active(False)
8386-        if isinstance(res, failure.Failure):
8387-            self.log("Publish done, with failure", failure=res,
8388-                     level=log.WEIRD, umid="nRsR9Q")
8389-            self._status.set_status("Failed")
8390-        elif self.surprised:
8391-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
8392-            self._status.set_status("UncoordinatedWriteError")
8393-            # deliver a failure
8394-            res = failure.Failure(UncoordinatedWriteError())
8395-            # TODO: recovery
8396-        else:
8397-            self.log("Publish done, success")
8398-            self._status.set_status("Finished")
8399-            self._status.set_progress(1.0)
8400+        self.log("Publish done, success")
8401+        self._status.set_status("Finished")
8402+        self._status.set_progress(1.0)
8403         eventually(self.done_deferred.callback, res)
8404 
8405hunk ./src/allmydata/mutable/publish.py 954
8406+    def _failure(self):
8407+
8408+        if not self.surprised:
8409+            # We ran out of servers
8410+            self.log("Publish ran out of good servers, "
8411+                     "last failure was: %s" % str(self._last_failure))
8412+            e = NotEnoughServersError("Ran out of non-bad servers, "
8413+                                      "last failure was %s" %
8414+                                      str(self._last_failure))
8415+        else:
8416+            # We ran into shares that we didn't recognize, which means
8417+            # that we need to return an UncoordinatedWriteError.
8418+            self.log("Publish failed with UncoordinatedWriteError")
8419+            e = UncoordinatedWriteError()
8420+        f = failure.Failure(e)
8421+        eventually(self.done_deferred.callback, f)
8422}
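
A condensed sketch of the state machine that now drives publishing, as introduced
by the patch above. This is a restatement for orientation, not the patch's actual
code; error paths, per-server timing, and the surprise/writer-count checks are
omitted.

    PUSHING_BLOCKS_STATE = 0
    PUSHING_EVERYTHING_ELSE_STATE = 1
    DONE_STATE = 2

    def _push(self, ignored=None):
        # Advance the publish state machine one step.
        if self._state == PUSHING_BLOCKS_STATE:
            # push_segment() re-invokes _push() after each segment and
            # flips the state once every segment has been pushed.
            return self.push_segment(self._current_segment)
        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
            # encprivkey, block/share hash trees, root hash, signature,
            # verification key, then the final writev.
            return self.push_everything_else()
        # DONE_STATE: every share was placed successfully.
        return self._done(None)

Every remote write in the real implementation is routed through _got_write_answer
(to detect uncoordinated writes via the packed checkstring) and _connection_problem
(to drop writers whose connections fail).
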
8423[test/test_mutable.py: remove tests that are no longer relevant
8424Kevan Carstensen <kevan@isnotajoke.com>**20100702225710
8425 Ignore-this: 90a26b4cc4b2e190a635474ba7097e21
8426] hunk ./src/allmydata/test/test_mutable.py 627
8427         return d
8428 
8429 
8430-class MakeShares(unittest.TestCase):
8431-    def test_encrypt(self):
8432-        nm = make_nodemaker()
8433-        CONTENTS = "some initial contents"
8434-        d = nm.create_mutable_file(CONTENTS)
8435-        def _created(fn):
8436-            p = Publish(fn, nm.storage_broker, None)
8437-            p.salt = "SALT" * 4
8438-            p.readkey = "\x00" * 16
8439-            p.newdata = CONTENTS
8440-            p.required_shares = 3
8441-            p.total_shares = 10
8442-            p.setup_encoding_parameters()
8443-            return p._encrypt_and_encode()
8444-        d.addCallback(_created)
8445-        def _done(shares_and_shareids):
8446-            (shares, share_ids) = shares_and_shareids
8447-            self.failUnlessEqual(len(shares), 10)
8448-            for sh in shares:
8449-                self.failUnless(isinstance(sh, str))
8450-                self.failUnlessEqual(len(sh), 7)
8451-            self.failUnlessEqual(len(share_ids), 10)
8452-        d.addCallback(_done)
8453-        return d
8454-    test_encrypt.todo = "Write an equivalent of this for the new uploader"
8455-
8456-    def test_generate(self):
8457-        nm = make_nodemaker()
8458-        CONTENTS = "some initial contents"
8459-        d = nm.create_mutable_file(CONTENTS)
8460-        def _created(fn):
8461-            self._fn = fn
8462-            p = Publish(fn, nm.storage_broker, None)
8463-            self._p = p
8464-            p.newdata = CONTENTS
8465-            p.required_shares = 3
8466-            p.total_shares = 10
8467-            p.setup_encoding_parameters()
8468-            p._new_seqnum = 3
8469-            p.salt = "SALT" * 4
8470-            # make some fake shares
8471-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
8472-            p._privkey = fn.get_privkey()
8473-            p._encprivkey = fn.get_encprivkey()
8474-            p._pubkey = fn.get_pubkey()
8475-            return p._generate_shares(shares_and_ids)
8476-        d.addCallback(_created)
8477-        def _generated(res):
8478-            p = self._p
8479-            final_shares = p.shares
8480-            root_hash = p.root_hash
8481-            self.failUnlessEqual(len(root_hash), 32)
8482-            self.failUnless(isinstance(final_shares, dict))
8483-            self.failUnlessEqual(len(final_shares), 10)
8484-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
8485-            for i,sh in final_shares.items():
8486-                self.failUnless(isinstance(sh, str))
8487-                # feed the share through the unpacker as a sanity-check
8488-                pieces = unpack_share(sh)
8489-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
8490-                 pubkey, signature, share_hash_chain, block_hash_tree,
8491-                 share_data, enc_privkey) = pieces
8492-                self.failUnlessEqual(u_seqnum, 3)
8493-                self.failUnlessEqual(u_root_hash, root_hash)
8494-                self.failUnlessEqual(k, 3)
8495-                self.failUnlessEqual(N, 10)
8496-                self.failUnlessEqual(segsize, 21)
8497-                self.failUnlessEqual(datalen, len(CONTENTS))
8498-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
8499-                sig_material = struct.pack(">BQ32s16s BBQQ",
8500-                                           0, p._new_seqnum, root_hash, IV,
8501-                                           k, N, segsize, datalen)
8502-                self.failUnless(p._pubkey.verify(sig_material, signature))
8503-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
8504-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
8505-                for shnum,share_hash in share_hash_chain.items():
8506-                    self.failUnless(isinstance(shnum, int))
8507-                    self.failUnless(isinstance(share_hash, str))
8508-                    self.failUnlessEqual(len(share_hash), 32)
8509-                self.failUnless(isinstance(block_hash_tree, list))
8510-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
8511-                self.failUnlessEqual(IV, "SALT"*4)
8512-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
8513-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
8514-        d.addCallback(_generated)
8515-        return d
8516-    test_generate.todo = "Write an equivalent of this for the new uploader"
8517-
8518-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
8519-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
8520-    # when we publish to zero peers, we should get a NotEnoughSharesError
8521-
8522 class PublishMixin:
8523     def publish_one(self):
8524         # publish a file and create shares, which can then be manipulated
8525
8526Context:
8527
8528[docs/how_to_make_a_tahoe-lafs_release.txt: trivial correction, install.html should now be quickstart.html.
8529david-sarah@jacaranda.org**20100625223929
8530 Ignore-this: 99a5459cac51bd867cc11ad06927ff30
8531] 
8532[setup: in the Makefile, refuse to upload tarballs unless someone has passed the environment variable "BB_BRANCH" with value "trunk"
8533zooko@zooko.com**20100619034928
8534 Ignore-this: 276ddf9b6ad7ec79e27474862e0f7d6
8535] 
8536[trivial: tiny update to in-line comment
8537zooko@zooko.com**20100614045715
8538 Ignore-this: 10851b0ed2abfed542c97749e5d280bc
8539 (I'm actually committing this patch as a test of the new eager-annotation-computation of trac-darcs.)
8540] 
8541[docs: about.html link to home page early on, and be decentralized storage instead of cloud storage this time around
8542zooko@zooko.com**20100619065318
8543 Ignore-this: dc6db03f696e5b6d2848699e754d8053
8544] 
8545[docs: update about.html, especially to have a non-broken link to quickstart.html, and also to comment out the broken links to "for Paranoids" and "for Corporates"
8546zooko@zooko.com**20100619065124
8547 Ignore-this: e292c7f51c337a84ebfeb366fbd24d6c
8548] 
8549[TAG allmydata-tahoe-1.7.0
8550zooko@zooko.com**20100619052631
8551 Ignore-this: d21e27afe6d85e2e3ba6a3292ba2be1
8552] 
8553Patch bundle hash:
85548adad761d37db13217f0a726a45d062fead3b0cb