Ticket #393: 393status36.dpatch

File 393status36.dpatch, 560.3 KB (added by warner, at 2011-02-22T00:13:24Z)

Updated to current trunk; conflicts and test failures fixed.

20 patches for repository http://tahoe-lafs.org/source/tahoe/trunk:

Mon Aug  9 16:32:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * interfaces.py: Add #993 interfaces

Mon Aug  9 16:35:35 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes

Mon Aug  9 17:06:19 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one

Mon Aug  9 17:06:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * immutable/literal.py: implement the same interfaces as other filenodes

Fri Aug 13 16:49:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * scripts: tell 'tahoe put' about MDMF

Sat Aug 14 01:10:12 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * web: Alter the webapi to get along with and take advantage of the MDMF changes
 
  The main benefit that the webapi gets from MDMF, at least initially, is
  the ability to do a streaming download of an MDMF mutable file. It also
  exposes a way (through the PUT verb) to append to or otherwise modify
  (in-place) an MDMF mutable file.

Sat Aug 14 15:57:11 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * client.py: learn how to create different kinds of mutable files

Wed Aug 18 17:32:16 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
 
  The checker and repairer required minimal changes to work with the MDMF
  modifications made elsewhere. The checker duplicated a lot of the code
  that was already in the downloader, so I modified the downloader
  slightly to expose this functionality to the checker and removed the
  duplicated code. The repairer only required a minor change to deal with
  data representation.

Wed Aug 18 17:32:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/filenode.py: add versions and partial-file updates to the mutable file node
 
  One of the goals of MDMF as a GSoC project is to lay the groundwork for
  LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
  multiple versions of a single cap on the grid. In line with this, there
  is now a distinction between an overriding mutable file (which can be
  thought to correspond to the cap/unique identifier for that mutable
  file) and versions of the mutable file (which we can download, update,
  and so on). All download, upload, and modification operations end up
  happening on a particular version of a mutable file, but there are
  shortcut methods on the object representing the overriding mutable file
  that perform these operations on the best version of the mutable file
  (which is what code should be doing until we have LDMF and better
  support for other paradigms).
 
  Another goal of MDMF was to take advantage of segmentation to give
  callers more efficient partial file updates or appends. This patch
  implements methods that do that, too.
 

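  To illustrate the intended shape of the version API (a sketch only:
  append_to_file is a hypothetical helper, and the method names are
  taken from the interfaces and web patches later in this bundle):

    from StringIO import StringIO
    from allmydata.mutable.publish import MutableFileHandle

    def append_to_file(node, more_data):
        # node: an IMutableFileNode obtained from a write cap
        d = node.get_best_mutable_version()
        def _got_version(version):
            d2 = node.get_size_of_best_version()
            # update() writes starting at `offset`; writing at the
            # current end of the file appends (see IWritable.update)
            d2.addCallback(lambda size:
                version.update(MutableFileHandle(StringIO(more_data)), size))
            return d2
        d.addCallback(_got_version)
        return d
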
Wed Aug 18 17:33:42 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/publish.py: Modify the publish process to support MDMF
 
  The inner workings of the publishing process needed to be reworked to a
  large extent to cope with segmented mutable files, and with
  partial-file updates of mutable files. This patch does that. It also
  introduces wrappers for uploadable data, allowing the use of
  filehandle-like objects as data sources, in addition to strings. This
  reduces memory usage when dealing with large files through the
  webapi, and clarifies the update code there.

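  A sketch of how those wrappers get used (the usage pattern is taken
  from the sftpd and webapi hunks later in this bundle; the filename is
  a placeholder):

    from allmydata.mutable.publish import MutableFileHandle

    f = open("big-file.bin", "rb")        # any object with read()/seek()
    uploadable = MutableFileHandle(f)
    d = filenode.overwrite(uploadable)    # filenode: an IMutableFileNode
    d.addCallback(lambda res: filenode.get_uri())
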
Wed Aug 18 17:35:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * nodemaker.py: Make nodemaker expose a way to create MDMF files

Sat Aug 14 15:56:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * docs: update docs to mention MDMF

Wed Aug 18 17:33:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/layout.py and interfaces.py: add MDMF writer and reader
 
  The MDMF writer is responsible for keeping state as plaintext is
  gradually processed into share data by the upload process. When the
  upload finishes, it will write all of its share data to a remote server,
  reporting its status back to the publisher.
 
  The MDMF reader is responsible for abstracting an MDMF file as it sits
  on the grid from the downloader; specifically, by receiving and
  responding to requests for arbitrary data within the MDMF file.
 
  The interfaces.py file has also been modified to contain an interface
  for the writer.

Wed Aug 18 17:34:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/retrieve.py: Modify the retrieval process to support MDMF
 
  The logic behind a mutable file download had to be adapted to work with
  segmented mutable files; this patch performs those adaptations. It also
  exposes some decoding and decrypting functionality to make partial-file
  updates a little easier, and supports efficient random-access downloads
  of parts of an MDMF file.

Wed Aug 18 17:34:39 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/servermap.py: Alter the servermap updater to work with MDMF files
 
  These modifications were mostly aimed at having the servermap
  updater use the unified MDMF + SDMF read interface whenever
  possible -- this reduces the complexity of the code, making it easier to
  read and maintain. To do this, I needed to modify the process of
  updating the servermap a little bit.
 
  To support partial-file updates, I also modified the servermap updater
  to fetch the block hash trees and certain segments of files while it
  performed a servermap update (this can be done without adding any new
  roundtrips, because of the batch-read functionality that the read proxy has).
 

Wed Aug 18 17:35:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * tests:
 
      - A lot of existing tests relied on aspects of the mutable file
        implementation that were changed. This patch updates those tests
        to work with the changes.
      - This patch also adds tests for new features.

Sun Feb 20 15:02:01 PST 2011  "Brian Warner <warner@lothar.com>"
  * resolve conflicts between 393-MDMF patches and trunk as of 1.8.2

Sun Feb 20 17:46:59 PST 2011  "Brian Warner <warner@lothar.com>"
  * mutable/filenode.py: fix create_mutable_file('string')

Sun Feb 20 21:56:00 PST 2011  "Brian Warner <warner@lothar.com>"
  * resolve more conflicts with current trunk

Sun Feb 20 22:10:04 PST 2011  "Brian Warner <warner@lothar.com>"
  * update MDMF code with StorageFarmBroker changes


New patches:

[interfaces.py: Add #993 interfaces
Kevan Carstensen <kevan@isnotajoke.com>**20100809233244
 Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce
] {
hunk ./src/allmydata/interfaces.py 499
 class MustNotBeUnknownRWError(CapConstraintError):
     """Cannot add an unknown child cap specified in a rw_uri field."""
 
+
+class IReadable(Interface):
+    """I represent a readable object -- either an immutable file, or a
+    specific version of a mutable file.
+    """
+
+    def is_readonly():
154+        """Return True if this reference provides mutable access to the given
155+        file or directory (i.e. if you can modify it), or False if not. Note
156+        that even if this reference is read-only, someone else may hold a
157+        read-write reference to it.
158+
159+        For an IReadable returned by get_best_readable_version(), this will
160+        always return True, but for instances of subinterfaces such as
161+        IMutableFileVersion, it may return False."""
162+
163+    def is_mutable():
164+        """Return True if this file or directory is mutable (by *somebody*,
165+        not necessarily you), False if it is is immutable. Note that a file
166+        might be mutable overall, but your reference to it might be
167+        read-only. On the other hand, all references to an immutable file
168+        will be read-only; there are no read-write references to an immutable
169+        file."""
+
+    def get_storage_index():
+        """Return the storage index of the file."""
+
+    def get_size():
+        """Return the length (in bytes) of this readable object."""
+
+    def download_to_data():
+        """Download all of the file contents. I return a Deferred that fires
+        with the contents as a byte string."""
+
+    def read(consumer, offset=0, size=None):
+        """Download a portion (possibly all) of the file's contents, making
+        them available to the given IConsumer. Return a Deferred that fires
+        (with the consumer) when the consumer is unregistered (either because
+        the last byte has been given to it, or because the consumer threw an
+        exception during write(), possibly because it no longer wants to
+        receive data). The portion downloaded will start at 'offset' and
+        contain 'size' bytes (or the remainder of the file if size==None).
+
+        The consumer will be used in non-streaming mode: an IPullProducer
+        will be attached to it.
+
+        The consumer will not receive data right away: several network trips
+        must occur first. The order of events will be::
+
+         consumer.registerProducer(p, streaming)
+          (if streaming == False)::
+           consumer does p.resumeProducing()
+            consumer.write(data)
+           consumer does p.resumeProducing()
+            consumer.write(data).. (repeat until all data is written)
+         consumer.unregisterProducer()
+         deferred.callback(consumer)
+
+        If a download error occurs, or an exception is raised by
+        consumer.registerProducer() or consumer.write(), I will call
+        consumer.unregisterProducer() and then deliver the exception via
+        deferred.errback(). To cancel the download, the consumer should call
+        p.stopProducing(), which will result in an exception being delivered
+        via deferred.errback().
+
+        See src/allmydata/util/consumer.py for an example of a simple
+        download-to-memory consumer.
+        """
+
+
+class IWritable(Interface):
+    """
+    I define methods that callers can use to update SDMF and MDMF
+    mutable files on a Tahoe-LAFS grid.
+    """
+    # XXX: For the moment, we have only this. It is possible that we
+    #      want to move overwrite() and modify() in here too.
+    def update(data, offset):
+        """
+        I write the data from my data argument to the MDMF file,
+        starting at offset. I continue writing data until my data
+        argument is exhausted, appending data to the file as necessary.
+        """
+        # assert IMutableUploadable.providedBy(data)
+        # to append data: offset=node.get_size_of_best_version()
+        # do we want to support compacting MDMF?
+        # for an MDMF file, this can be done with O(data.get_size())
+        # memory. For an SDMF file, any modification takes
+        # O(node.get_size_of_best_version()).
+
+
+class IMutableFileVersion(IReadable):
+    """I provide access to a particular version of a mutable file. The
+    access is read/write if I was obtained from a filenode derived from
+    a write cap, or read-only if the filenode was derived from a read cap.
+    """
+
+    def get_sequence_number():
+        """Return the sequence number of this version."""
+
+    def get_servermap():
+        """Return the IMutableFileServerMap instance that was used to create
+        this object.
+        """
+
+    def get_writekey():
+        """Return this filenode's writekey, or None if the node does not have
+        write-capability. This may be used to assist with data structures
+        that need to make certain data available only to writers, such as the
+        read-write child caps in dirnodes. The recommended process is to have
+        reader-visible data be submitted to the filenode in the clear (where
+        it will be encrypted by the filenode using the readkey), but encrypt
+        writer-visible data using this writekey.
+        """
+
+    # TODO: Can this be overwrite instead of replace?
+    def replace(new_contents):
+        """Replace the contents of the mutable file, provided that no other
+        node has published (or is attempting to publish, concurrently) a
+        newer version of the file than this one.
+
+        I will avoid modifying any share that is different than the version
+        given by get_sequence_number(). However, if another node is writing
+        to the file at the same time as me, I may manage to update some shares
+        while they update others. If I see any evidence of this, I will signal
+        UncoordinatedWriteError, and the file will be left in an inconsistent
+        state (possibly the version you provided, possibly the old version,
+        possibly somebody else's version, and possibly a mix of shares from
+        all of these).
+
+        The recommended response to UncoordinatedWriteError is to either
+        return it to the caller (since they failed to coordinate their
+        writes), or to attempt some sort of recovery. It may be sufficient to
+        wait a random interval (with exponential backoff) and repeat your
+        operation. If I do not signal UncoordinatedWriteError, then I was
+        able to write the new version without incident.
+
+        I return a Deferred that fires (with a PublishStatus object) when the
+        update has completed.
+        """
+
+    def modify(modifier_cb):
+        """Modify the contents of the file, by downloading this version,
+        applying the modifier function (or bound method), then uploading
+        the new version. This will succeed as long as no other node
+        publishes a version between the download and the upload.
+        I return a Deferred that fires (with a PublishStatus object) when
+        the update is complete.
+
+        The modifier callable will be given three arguments: a string (with
+        the old contents), a 'first_time' boolean, and a servermap. As with
+        download_to_data(), the old contents will be from this version,
+        but the modifier can use the servermap to make other decisions
+        (such as refusing to apply the delta if there are multiple parallel
+        versions, or if there is evidence of a newer unrecoverable version).
+        'first_time' will be True the first time the modifier is called,
+        and False on any subsequent calls.
+
+        The callable should return a string with the new contents. The
+        callable must be prepared to be called multiple times, and must
+        examine the input string to see if the change that it wants to make
+        is already present in the old version. If it does not need to make
+        any changes, it can either return None, or return its input string.
+
+        If the modifier raises an exception, it will be returned in the
+        errback.
+        """
+
+
 # The hierarchy looks like this:
 #  IFilesystemNode
 #   IFileNode
hunk ./src/allmydata/interfaces.py 758
     def raise_error():
         """Raise any error associated with this node."""
 
+    # XXX: These may not be appropriate outside the context of an IReadable.
     def get_size():
         """Return the length (in bytes) of the data this node represents. For
         directory nodes, I return the size of the backing store. I return
hunk ./src/allmydata/interfaces.py 775
 class IFileNode(IFilesystemNode):
     """I am a node which represents a file: a sequence of bytes. I am not a
     container, like IDirectoryNode."""
+    def get_best_readable_version():
+        """Return a Deferred that fires with an IReadable for the 'best'
+        available version of the file. The IReadable provides only read
+        access, even if this filenode was derived from a write cap.
 
hunk ./src/allmydata/interfaces.py 780
-class IImmutableFileNode(IFileNode):
-    def read(consumer, offset=0, size=None):
-        """Download a portion (possibly all) of the file's contents, making
-        them available to the given IConsumer. Return a Deferred that fires
-        (with the consumer) when the consumer is unregistered (either because
-        the last byte has been given to it, or because the consumer threw an
-        exception during write(), possibly because it no longer wants to
-        receive data). The portion downloaded will start at 'offset' and
-        contain 'size' bytes (or the remainder of the file if size==None).
-
-        The consumer will be used in non-streaming mode: an IPullProducer
-        will be attached to it.
+        For an immutable file, there is only one version. For a mutable
+        file, the 'best' version is the recoverable version with the
+        highest sequence number. If no uncoordinated writes have occurred,
+        and if enough shares are available, then this will be the most
+        recent version that has been uploaded. If no version is recoverable,
+        the Deferred will errback with an UnrecoverableFileError.
+        """
 
hunk ./src/allmydata/interfaces.py 788
-        The consumer will not receive data right away: several network trips
-        must occur first. The order of events will be::
+    def download_best_version():
+        """Download the contents of the version that would be returned
+        by get_best_readable_version(). This is equivalent to calling
+        download_to_data() on the IReadable given by that method.
 
hunk ./src/allmydata/interfaces.py 793
-         consumer.registerProducer(p, streaming)
-          (if streaming == False)::
-           consumer does p.resumeProducing()
-            consumer.write(data)
-           consumer does p.resumeProducing()
-            consumer.write(data).. (repeat until all data is written)
-         consumer.unregisterProducer()
-         deferred.callback(consumer)
+        I return a Deferred that fires with a byte string when the file
+        has been fully downloaded. To support streaming download, use
+        the 'read' method of IReadable. If no version is recoverable,
+        the Deferred will errback with an UnrecoverableFileError.
+        """
 
hunk ./src/allmydata/interfaces.py 799
-        If a download error occurs, or an exception is raised by
-        consumer.registerProducer() or consumer.write(), I will call
-        consumer.unregisterProducer() and then deliver the exception via
-        deferred.errback(). To cancel the download, the consumer should call
-        p.stopProducing(), which will result in an exception being delivered
-        via deferred.errback().
+    def get_size_of_best_version():
+        """Find the size of the version that would be returned by
+        get_best_readable_version().
 
hunk ./src/allmydata/interfaces.py 803
-        See src/allmydata/util/consumer.py for an example of a simple
-        download-to-memory consumer.
+        I return a Deferred that fires with an integer. If no version
+        is recoverable, the Deferred will errback with an
+        UnrecoverableFileError.
         """
 
hunk ./src/allmydata/interfaces.py 808
+
+class IImmutableFileNode(IFileNode, IReadable):
+    """I am a node representing an immutable file. Immutable files have
+    only one version"""
+
+
 class IMutableFileNode(IFileNode):
     """I provide access to a 'mutable file', which retains its identity
     regardless of what contents are put in it.
hunk ./src/allmydata/interfaces.py 873
     only be retrieved and updated all-at-once, as a single big string. Future
     versions of our mutable files will remove this restriction.
     """
-
-    def download_best_version():
-        """Download the 'best' available version of the file, meaning one of
-        the recoverable versions with the highest sequence number. If no
+    def get_best_mutable_version():
+        """Return a Deferred that fires with an IMutableFileVersion for
+        the 'best' available version of the file. The best version is
+        the recoverable version with the highest sequence number. If no
         uncoordinated writes have occurred, and if enough shares are
hunk ./src/allmydata/interfaces.py 878
-        available, then this will be the most recent version that has been
-        uploaded.
+        available, then this will be the most recent version that has
+        been uploaded.
 
hunk ./src/allmydata/interfaces.py 881
-        I update an internal servermap with MODE_READ, determine which
-        version of the file is indicated by
-        servermap.best_recoverable_version(), and return a Deferred that
-        fires with its contents. If no version is recoverable, the Deferred
-        will errback with UnrecoverableFileError.
-        """
-
-    def get_size_of_best_version():
-        """Find the size of the version that would be downloaded with
-        download_best_version(), without actually downloading the whole file.
-
-        I return a Deferred that fires with an integer.
+        If no version is recoverable, the Deferred will errback with an
+        UnrecoverableFileError.
         """
 
     def overwrite(new_contents):
hunk ./src/allmydata/interfaces.py 921
         errback.
         """
 
-
     def get_servermap(mode):
         """Return a Deferred that fires with an IMutableFileServerMap
         instance, updated using the given mode.
hunk ./src/allmydata/interfaces.py 974
         writer-visible data using this writekey.
         """
 
+    def set_version(version):
+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
+        we upload in SDMF for reasons of compatibility. If you want to
+        change this, set_version will let you do that.
+
+        To say that this file should be uploaded in SDMF, pass in a 0. To
+        say that the file should be uploaded as MDMF, pass in a 1.
+        """
+
+    def get_version():
+        """Returns the mutable file protocol version."""
+
 class NotEnoughSharesError(Exception):
     """Download was unable to get enough shares"""
 
hunk ./src/allmydata/interfaces.py 1822
         """The upload is finished, and whatever filehandle was in use may be
         closed."""
 
+
+class IMutableUploadable(Interface):
+    """
+    I represent content that is due to be uploaded to a mutable filecap.
+    """
+    # This is somewhat simpler than the IUploadable interface above
+    # because mutable files do not need to be concerned with possibly
+    # generating a CHK, nor with per-file keys. It is a subset of the
+    # methods in IUploadable, though, so we could just as well implement
+    # the mutable uploadables as IUploadables that don't happen to use
+    # those methods (with the understanding that the unused methods will
+    # never be called on such objects)
+    def get_size():
+        """
+        Returns a Deferred that fires with the size of the content held
+        by the uploadable.
+        """
+
+    def read(length):
+        """
+        Returns a list of strings which, when concatenated, are the next
+        length bytes of the file, or fewer if there are fewer bytes
+        between the current location and the end of the file.
+        """
+
+    def close():
+        """
+        The process that used the Uploadable is finished using it, so
+        the uploadable may be closed.
+        """
+
 class IUploadResults(Interface):
     """I am returned by upload() methods. I contain a number of public
     attributes which can be read to determine the results of the upload. Some
}
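For reference, a download-to-memory consumer matching the event order
documented in IReadable.read() above might look like this -- a sketch
along the lines of the src/allmydata/util/consumer.py helper the
docstring points to, not a verbatim copy of it:

    from zope.interface import implements
    from twisted.internet.interfaces import IConsumer

    class MemoryConsumer:
        implements(IConsumer)
        def __init__(self):
            self.chunks = []
            self.done = False
        def registerProducer(self, p, streaming):
            self.producer = p
            if not streaming:
                # non-streaming (IPullProducer): keep pulling until
                # unregisterProducer() is called
                while not self.done:
                    p.resumeProducing()
        def write(self, data):
            self.chunks.append(data)
        def unregisterProducer(self):
            self.done = True

    # d = readable.read(MemoryConsumer())   # fires with the consumer
    # d.addCallback(lambda mc: "".join(mc.chunks))
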
[frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
Kevan Carstensen <kevan@isnotajoke.com>**20100809233535
 Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f
] {
hunk ./src/allmydata/frontends/sftpd.py 33
 from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \
      NoSuchChildError, ChildOfWrongTypeError
 from allmydata.mutable.common import NotWriteableError
+from allmydata.mutable.publish import MutableFileHandle
 from allmydata.immutable.upload import FileHandle
 from allmydata.dirnode import update_metadata
 from allmydata.util.fileutil import EncryptedTemporaryFile
hunk ./src/allmydata/frontends/sftpd.py 667
         else:
             assert IFileNode.providedBy(filenode), filenode
 
-            if filenode.is_mutable():
-                self.async.addCallback(lambda ign: filenode.download_best_version())
-                def _downloaded(data):
-                    self.consumer = OverwriteableFileConsumer(len(data), tempfile_maker)
-                    self.consumer.write(data)
-                    self.consumer.finish()
-                    return None
-                self.async.addCallback(_downloaded)
-            else:
-                download_size = filenode.get_size()
-                assert download_size is not None, "download_size is None"
+            self.async.addCallback(lambda ignored: filenode.get_best_readable_version())
+
+            def _read(version):
+                if noisy: self.log("_read", level=NOISY)
+                download_size = version.get_size()
+                assert download_size is not None
+
                 self.consumer = OverwriteableFileConsumer(download_size, tempfile_maker)
hunk ./src/allmydata/frontends/sftpd.py 675
-                def _read(ign):
-                    if noisy: self.log("_read immutable", level=NOISY)
-                    filenode.read(self.consumer, 0, None)
-                self.async.addCallback(_read)
+
+                version.read(self.consumer, 0, None)
+            self.async.addCallback(_read)
 
         eventually(self.async.callback, None)
 
hunk ./src/allmydata/frontends/sftpd.py 821
                     assert parent and childname, (parent, childname, self.metadata)
                     d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata))
 
-                d2.addCallback(lambda ign: self.consumer.get_current_size())
-                d2.addCallback(lambda size: self.consumer.read(0, size))
-                d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents))
+                d2.addCallback(lambda ign: self.filenode.overwrite(MutableFileHandle(self.consumer.get_file())))
             else:
                 def _add_file(ign):
                     self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL)
}
[immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
Kevan Carstensen <kevan@isnotajoke.com>**20100810000619
 Ignore-this: 93e536c0f8efb705310f13ff64621527
] {
hunk ./src/allmydata/immutable/filenode.py 8
 now = time.time
 from zope.interface import implements, Interface
 from twisted.internet import defer
-from twisted.internet.interfaces import IConsumer
 
hunk ./src/allmydata/immutable/filenode.py 9
-from allmydata.interfaces import IImmutableFileNode, IUploadResults
 from allmydata import uri
hunk ./src/allmydata/immutable/filenode.py 10
+from twisted.internet.interfaces import IConsumer
+from twisted.protocols import basic
+from foolscap.api import eventually
+from allmydata.interfaces import IImmutableFileNode, ICheckable, \
+     IDownloadTarget, IUploadResults
+from allmydata.util import dictutil, log, base32, consumer
+from allmydata.immutable.checker import Checker
 from allmydata.check_results import CheckResults, CheckAndRepairResults
 from allmydata.util.dictutil import DictOfSets
 from pycryptopp.cipher.aes import AES
hunk ./src/allmydata/immutable/filenode.py 296
         return self._cnode.check_and_repair(monitor, verify, add_lease)
     def check(self, monitor, verify=False, add_lease=False):
         return self._cnode.check(monitor, verify, add_lease)
+
+    def get_best_readable_version(self):
+        """
+        Return an IReadable of the best version of this file. Since
+        immutable files can have only one version, we just return the
+        current filenode.
+        """
+        return defer.succeed(self)
+
+
+    def download_best_version(self):
+        """
+        Download the best version of this file, returning its contents
+        as a bytestring. Since there is only one version of an immutable
+        file, we download and return the contents of this file.
+        """
+        d = consumer.download_to_data(self)
+        return d
+
+    # for an immutable file, download_to_data (specified in IReadable)
+    # is the same as download_best_version (specified in IFileNode). For
+    # mutable files, the difference is more meaningful, since they can
+    # have multiple versions.
+    download_to_data = download_best_version
+
+
+    # get_size() (IReadable), get_current_size() (IFilesystemNode), and
+    # get_size_of_best_version(IFileNode) are all the same for immutable
+    # files.
+    get_size_of_best_version = get_current_size
}
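The payoff of the aliasing above is that callers no longer need an
is_mutable() branch to download. A hypothetical caller, sketched against
the interfaces in this bundle:

    def dump_contents(filenode):
        # uniform across immutable and mutable filenodes after this patch
        d = filenode.get_best_readable_version()
        d.addCallback(lambda version: version.download_to_data())
        return d   # fires with the contents as a byte string
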
[immutable/literal.py: implement the same interfaces as other filenodes
Kevan Carstensen <kevan@isnotajoke.com>**20100810000633
 Ignore-this: b50dd5df2d34ecd6477b8499a27aef13
] hunk ./src/allmydata/immutable/literal.py 106
         d.addCallback(lambda lastSent: consumer)
         return d
 
+    # IReadable, IFileNode, IFilesystemNode
+    def get_best_readable_version(self):
+        return defer.succeed(self)
+
+
+    def download_best_version(self):
+        return defer.succeed(self.u.data)
+
+
+    download_to_data = download_best_version
+    get_size_of_best_version = get_current_size
+
[scripts: tell 'tahoe put' about MDMF
Kevan Carstensen <kevan@isnotajoke.com>**20100813234957
 Ignore-this: c106b3384fc676bd3c0fb466d2a52b1b
] {
hunk ./src/allmydata/scripts/cli.py 160
     optFlags = [
         ("mutable", "m", "Create a mutable file instead of an immutable one."),
         ]
+    optParameters = [
+        ("mutable-type", None, False, "Create a mutable file in the given format. Valid formats are 'sdmf' for SDMF and 'mdmf' for MDMF"),
+        ]
 
     def parseArgs(self, arg1=None, arg2=None):
         # see Examples below
hunk ./src/allmydata/scripts/tahoe_put.py 21
     from_file = options.from_file
     to_file = options.to_file
     mutable = options['mutable']
+    mutable_type = False
+
+    if mutable:
+        mutable_type = options['mutable-type']
     if options['quiet']:
         verbosity = 0
     else:
hunk ./src/allmydata/scripts/tahoe_put.py 33
     stdout = options.stdout
     stderr = options.stderr
 
+    if mutable_type and mutable_type not in ('sdmf', 'mdmf'):
+        # Don't try to pass unsupported types to the webapi
+        print >>stderr, "error: %s is an invalid format" % mutable_type
+        return 1
+
     if nodeurl[-1] != "/":
         nodeurl += "/"
     if to_file:
hunk ./src/allmydata/scripts/tahoe_put.py 76
         url = nodeurl + "uri"
     if mutable:
         url += "?mutable=true"
+    if mutable_type:
+        assert mutable
+        url += "&mutable-type=%s" % mutable_type
+
     if from_file:
         infileobj = open(os.path.expanduser(from_file), "rb")
     else:
}
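Example invocations after this change (the filename is a placeholder):

    # create a mutable file in the default (SDMF) format
    tahoe put --mutable foo.txt

    # create it in MDMF format instead
    tahoe put --mutable --mutable-type=mdmf foo.txt

    # unsupported formats are rejected before reaching the webapi:
    #   error: rot13 is an invalid format
    tahoe put --mutable --mutable-type=rot13 foo.txt
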
[web: Alter the webapi to get along with and take advantage of the MDMF changes
Kevan Carstensen <kevan@isnotajoke.com>**20100814081012
 Ignore-this: 96c2ed4e4a9f450fb84db5d711d10bd6
 
 The main benefit that the webapi gets from MDMF, at least initially, is
 the ability to do a streaming download of an MDMF mutable file. It also
 exposes a way (through the PUT verb) to append to or otherwise modify
 (in-place) an MDMF mutable file.
] {
hunk ./src/allmydata/web/common.py 12
 from allmydata.interfaces import ExistingChildError, NoSuchChildError, \
      FileTooLargeError, NotEnoughSharesError, NoSharesError, \
      EmptyPathnameComponentError, MustBeDeepImmutableError, \
-     MustBeReadonlyError, MustNotBeUnknownRWError
+     MustBeReadonlyError, MustNotBeUnknownRWError, SDMF_VERSION, MDMF_VERSION
 from allmydata.mutable.common import UnrecoverableFileError
 from allmydata.util import abbreviate
 from allmydata.util.encodingutil import to_str, quote_output
hunk ./src/allmydata/web/common.py 35
     else:
         return boolean_of_arg(replace)
 
+
+def parse_mutable_type_arg(arg):
+    if not arg:
+        return None # interpreted by the caller as "let the nodemaker decide"
+
+    arg = arg.lower()
+    assert arg in ("mdmf", "sdmf")
+
+    if arg == "mdmf":
+        return MDMF_VERSION
+
+    return SDMF_VERSION
+
+
+def parse_offset_arg(offset):
+    # XXX: This will raise a ValueError when invoked on something that
+    # is not an integer. Is that okay? Or do we want a better error
+    # message? Since this call is going to be used by programmers and
+    # their tools rather than users (through the wui), it is not
+    # inconsistent to return that, I guess.
+    offset = int(offset)
+    return offset
+
+
 def get_root(ctx_or_req):
     req = IRequest(ctx_or_req)
     # the addSlash=True gives us one extra (empty) segment
hunk ./src/allmydata/web/directory.py 19
 from allmydata.uri import from_string_dirnode
 from allmydata.interfaces import IDirectoryNode, IFileNode, IFilesystemNode, \
      IImmutableFileNode, IMutableFileNode, ExistingChildError, \
-     NoSuchChildError, EmptyPathnameComponentError
+     NoSuchChildError, EmptyPathnameComponentError, SDMF_VERSION, MDMF_VERSION
 from allmydata.monitor import Monitor, OperationCancelledError
 from allmydata import dirnode
 from allmydata.web.common import text_plain, WebError, \
hunk ./src/allmydata/web/directory.py 153
         if not t:
             # render the directory as HTML, using the docFactory and Nevow's
             # whole templating thing.
-            return DirectoryAsHTML(self.node)
+            return DirectoryAsHTML(self.node,
+                                   self.client.mutable_file_default)
 
         if t == "json":
             return DirectoryJSONMetadata(ctx, self.node)
hunk ./src/allmydata/web/directory.py 556
     docFactory = getxmlfile("directory.xhtml")
     addSlash = True
 
-    def __init__(self, node):
+    def __init__(self, node, default_mutable_format):
         rend.Page.__init__(self)
         self.node = node
 
hunk ./src/allmydata/web/directory.py 560
+        assert default_mutable_format in (MDMF_VERSION, SDMF_VERSION)
+        self.default_mutable_format = default_mutable_format
+
     def beforeRender(self, ctx):
         # attempt to get the dirnode's children, stashing them (or the
         # failure that results) for later use
hunk ./src/allmydata/web/directory.py 780
             ]]
         forms.append(T.div(class_="freeform-form")[mkdir])
 
+        # Build input elements for mutable file type. We do this outside
+        # of the list so we can check the appropriate format, based on
+        # the default configured in the client (which reflects the
+        # default configured in tahoe.cfg)
+        if self.default_mutable_format == MDMF_VERSION:
+            mdmf_input = T.input(type='radio', name='mutable-type',
+                                 id='mutable-type-mdmf', value='mdmf',
+                                 checked='checked')
+        else:
+            mdmf_input = T.input(type='radio', name='mutable-type',
+                                 id='mutable-type-mdmf', value='mdmf')
+
+        if self.default_mutable_format == SDMF_VERSION:
+            sdmf_input = T.input(type='radio', name='mutable-type',
+                                 id='mutable-type-sdmf', value='sdmf',
+                                 checked="checked")
+        else:
+            sdmf_input = T.input(type='radio', name='mutable-type',
+                                 id='mutable-type-sdmf', value='sdmf')
+
         upload = T.form(action=".", method="post",
                         enctype="multipart/form-data")[
             T.fieldset[
hunk ./src/allmydata/web/directory.py 812
             T.input(type="submit", value="Upload"),
             " Mutable?:",
             T.input(type="checkbox", name="mutable"),
+            sdmf_input, T.label(for_="mutable-type-sdmf")["SDMF"],
+            mdmf_input,
+            T.label(for_="mutable-type-mdmf")["MDMF (experimental)"],
             ]]
         forms.append(T.div(class_="freeform-form")[upload])
 
hunk ./src/allmydata/web/directory.py 850
                 kiddata = ("filenode", {'size': childnode.get_size(),
                                         'mutable': childnode.is_mutable(),
                                         })
+                if childnode.is_mutable() and \
+                    childnode.get_version() is not None:
+                    mutable_type = childnode.get_version()
+                    assert mutable_type in (SDMF_VERSION, MDMF_VERSION)
+
+                    if mutable_type == MDMF_VERSION:
+                        mutable_type = "mdmf"
+                    else:
+                        mutable_type = "sdmf"
+                    kiddata[1]['mutable-type'] = mutable_type
+
             elif IDirectoryNode.providedBy(childnode):
                 kiddata = ("dirnode", {'mutable': childnode.is_mutable()})
             else:
hunk ./src/allmydata/web/filenode.py 9
 from nevow import url, rend
 from nevow.inevow import IRequest
 
-from allmydata.interfaces import ExistingChildError
+from allmydata.interfaces import ExistingChildError, SDMF_VERSION, MDMF_VERSION
 from allmydata.monitor import Monitor
 from allmydata.immutable.upload import FileHandle
hunk ./src/allmydata/web/filenode.py 12
+from allmydata.mutable.publish import MutableFileHandle
+from allmydata.mutable.common import MODE_READ
 from allmydata.util import log, base32
 
 from allmydata.web.common import text_plain, WebError, RenderMixin, \
hunk ./src/allmydata/web/filenode.py 18
      boolean_of_arg, get_arg, should_create_intermediate_directories, \
-     MyExceptionHandler, parse_replace_arg
+     MyExceptionHandler, parse_replace_arg, parse_offset_arg, \
+     parse_mutable_type_arg
 from allmydata.web.check_results import CheckResults, \
      CheckAndRepairResults, LiteralCheckResults
 from allmydata.web.info import MoreInfo
hunk ./src/allmydata/web/filenode.py 29
         # a new file is being uploaded in our place.
         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
         if mutable:
-            req.content.seek(0)
-            data = req.content.read()
-            d = client.create_mutable_file(data)
+            mutable_type = parse_mutable_type_arg(get_arg(req,
+                                                          "mutable-type",
+                                                          None))
+            data = MutableFileHandle(req.content)
+            d = client.create_mutable_file(data, version=mutable_type)
             def _uploaded(newnode):
                 d2 = self.parentnode.set_node(self.name, newnode,
                                               overwrite=replace)
hunk ./src/allmydata/web/filenode.py 66
         d.addCallback(lambda res: childnode.get_uri())
         return d
 
-    def _read_data_from_formpost(self, req):
-        # SDMF: files are small, and we can only upload data, so we read
-        # the whole file into memory before uploading.
-        contents = req.fields["file"]
-        contents.file.seek(0)
-        data = contents.file.read()
-        return data
 
     def replace_me_with_a_formpost(self, req, client, replace):
         # create a new file, maybe mutable, maybe immutable
hunk ./src/allmydata/web/filenode.py 71
         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
 
+        # create an immutable file
+        contents = req.fields["file"]
         if mutable:
hunk ./src/allmydata/web/filenode.py 74
-            data = self._read_data_from_formpost(req)
-            d = client.create_mutable_file(data)
+            mutable_type = parse_mutable_type_arg(get_arg(req, "mutable-type",
+                                                          None))
+            uploadable = MutableFileHandle(contents.file)
+            d = client.create_mutable_file(uploadable, version=mutable_type)
             def _uploaded(newnode):
                 d2 = self.parentnode.set_node(self.name, newnode,
                                               overwrite=replace)
hunk ./src/allmydata/web/filenode.py 85
                 return d2
             d.addCallback(_uploaded)
             return d
-        # create an immutable file
-        contents = req.fields["file"]
+
         uploadable = FileHandle(contents.file, convergence=client.convergence)
         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
         d.addCallback(lambda newnode: newnode.get_uri())
hunk ./src/allmydata/web/filenode.py 91
         return d
 
+
 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
     def __init__(self, client, parentnode, name):
         rend.Page.__init__(self)
hunk ./src/allmydata/web/filenode.py 174
             # properly. So we assume that at least the browser will agree
             # with itself, and echo back the same bytes that we were given.
             filename = get_arg(req, "filename", self.name) or "unknown"
-            if self.node.is_mutable():
-                # some day: d = self.node.get_best_version()
-                d = makeMutableDownloadable(self.node)
-            else:
-                d = defer.succeed(self.node)
+            d = self.node.get_best_readable_version()
            d.addCallback(lambda dn: FileDownloader(dn, filename))
             return d
         if t == "json":
hunk ./src/allmydata/web/filenode.py 178
-            if self.parentnode and self.name:
-                d = self.parentnode.get_metadata_for(self.name)
+            # We do this to make sure that fields like size and
+            # mutable-type (which depend on the file on the grid and not
+            # just on the cap) are filled in. The latter gets used in
+            # tests, in particular.
+            #
+            # TODO: Make it so that the servermap knows how to update in
+            # a mode specifically designed to fill in these fields, and
+            # then update it in that mode.
+            if self.node.is_mutable():
+                d = self.node.get_servermap(MODE_READ)
             else:
                 d = defer.succeed(None)
hunk ./src/allmydata/web/filenode.py 190
+            if self.parentnode and self.name:
+                d.addCallback(lambda ignored:
+                    self.parentnode.get_metadata_for(self.name))
+            else:
+                d.addCallback(lambda ignored: None)
             d.addCallback(lambda md: FileJSONMetadata(ctx, self.node, md))
             return d
         if t == "info":
hunk ./src/allmydata/web/filenode.py 211
         if t:
             raise WebError("GET file: bad t=%s" % t)
         filename = get_arg(req, "filename", self.name) or "unknown"
-        if self.node.is_mutable():
-            # some day: d = self.node.get_best_version()
-            d = makeMutableDownloadable(self.node)
-        else:
-            d = defer.succeed(self.node)
+        d = self.node.get_best_readable_version()
         d.addCallback(lambda dn: FileDownloader(dn, filename))
         return d
 
hunk ./src/allmydata/web/filenode.py 219
         req = IRequest(ctx)
         t = get_arg(req, "t", "").strip()
         replace = parse_replace_arg(get_arg(req, "replace", "true"))
+        offset = parse_offset_arg(get_arg(req, "offset", -1))
 
         if not t:
hunk ./src/allmydata/web/filenode.py 222
-            if self.node.is_mutable():
+            if self.node.is_mutable() and offset >= 0:
+                return self.update_my_contents(req, offset)
+
+            elif self.node.is_mutable():
                 return self.replace_my_contents(req)
             if not replace:
                 # this is the early trap: if someone else modifies the
hunk ./src/allmydata/web/filenode.py 232
                 # directory while we're uploading, the add_file(overwrite=)
                 # call in replace_me_with_a_child will do the late trap.
                 raise ExistingChildError()
+            if offset >= 0:
+                raise WebError("PUT to a file: append operation invoked "
+                               "on an immutable cap")
+
+
             assert self.parentnode and self.name
             return self.replace_me_with_a_child(req, self.client, replace)
         if t == "uri":
hunk ./src/allmydata/web/filenode.py 299
 
     def replace_my_contents(self, req):
         req.content.seek(0)
-        new_contents = req.content.read()
+        new_contents = MutableFileHandle(req.content)
         d = self.node.overwrite(new_contents)
         d.addCallback(lambda res: self.node.get_uri())
         return d
hunk ./src/allmydata/web/filenode.py 304
 
+
+    def update_my_contents(self, req, offset):
+        req.content.seek(0)
+        added_contents = MutableFileHandle(req.content)
+
+        d = self.node.get_best_mutable_version()
+        d.addCallback(lambda mv:
+            mv.update(added_contents, offset))
+        d.addCallback(lambda ignored:
+            self.node.get_uri())
+        return d
+
+
     def replace_my_contents_with_a_formpost(self, req):
         # we have a mutable file. Get the data from the formpost, and replace
         # the mutable file's contents with it.
hunk ./src/allmydata/web/filenode.py 320
-        new_contents = self._read_data_from_formpost(req)
+        new_contents = req.fields['file']
+        new_contents = MutableFileHandle(new_contents.file)
+
         d = self.node.overwrite(new_contents)
         d.addCallback(lambda res: self.node.get_uri())
         return d
hunk ./src/allmydata/web/filenode.py 327
 
-class MutableDownloadable:
-    #implements(IDownloadable)
-    def __init__(self, size, node):
-        self.size = size
-        self.node = node
-    def get_size(self):
-        return self.size
-    def is_mutable(self):
-        return True
-    def read(self, consumer, offset=0, size=None):
-        d = self.node.download_best_version()
-        d.addCallback(self._got_data, consumer, offset, size)
-        return d
-    def _got_data(self, contents, consumer, offset, size):
-        start = offset
-        if size is not None:
-            end = offset+size
-        else:
-            end = self.size
-        # SDMF: we can write the whole file in one big chunk
-        consumer.write(contents[start:end])
-        return consumer
-
-def makeMutableDownloadable(n):
-    d = defer.maybeDeferred(n.get_size_of_best_version)
-    d.addCallback(MutableDownloadable, n)
-    return d
 
 class FileDownloader(rend.Page):
     # since we override the rendering process (to let the tahoe Downloader
hunk ./src/allmydata/web/filenode.py 509
     data[1]['mutable'] = filenode.is_mutable()
     if edge_metadata is not None:
         data[1]['metadata'] = edge_metadata
+
+    if filenode.is_mutable() and filenode.get_version() is not None:
+        mutable_type = filenode.get_version()
+        assert mutable_type in (MDMF_VERSION, SDMF_VERSION)
+        if mutable_type == MDMF_VERSION:
+            mutable_type = "mdmf"
+        else:
+            mutable_type = "sdmf"
+        data[1]['mutable-type'] = mutable_type
+
     return text_plain(simplejson.dumps(data, indent=1) + "\n", ctx)
 
 def FileURI(ctx, filenode):
hunk ./src/allmydata/web/root.py 15
 from allmydata import get_package_versions_string
 from allmydata import provisioning
 from allmydata.util import idlib, log
-from allmydata.interfaces import IFileNode
+from allmydata.interfaces import IFileNode, MDMF_VERSION, SDMF_VERSION
 from allmydata.web import filenode, directory, unlinked, status, operations
 from allmydata.web import reliability, storage
 from allmydata.web.common import abbreviate_size, getxmlfile, WebError, \
hunk ./src/allmydata/web/root.py 19
-     get_arg, RenderMixin, boolean_of_arg
+     get_arg, RenderMixin, boolean_of_arg, parse_mutable_type_arg
 
 
 class URIHandler(RenderMixin, rend.Page):
hunk ./src/allmydata/web/root.py 50
         if t == "":
             mutable = boolean_of_arg(get_arg(req, "mutable", "false").strip())
             if mutable:
-                return unlinked.PUTUnlinkedSSK(req, self.client)
+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
+                                                 None))
+                return unlinked.PUTUnlinkedSSK(req, self.client, version)
             else:
                 return unlinked.PUTUnlinkedCHK(req, self.client)
         if t == "mkdir":
hunk ./src/allmydata/web/root.py 70
         if t in ("", "upload"):
             mutable = bool(get_arg(req, "mutable", "").strip())
             if mutable:
-                return unlinked.POSTUnlinkedSSK(req, self.client)
+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
+                                                         None))
+                return unlinked.POSTUnlinkedSSK(req, self.client, version)
             else:
                 return unlinked.POSTUnlinkedCHK(req, self.client)
         if t == "mkdir":
hunk ./src/allmydata/web/root.py 324
 
     def render_upload_form(self, ctx, data):
         # this is a form where users can upload unlinked files
+        #
+        # for mutable files, users can choose the format by selecting
+        # MDMF or SDMF from a radio button. They can also configure a
+        # default format in tahoe.cfg, which they rightly expect us to
+        # obey. we convey to them that we are obeying their choice by
+        # ensuring that the one that they've chosen is selected in the
+        # interface.
+        if self.client.mutable_file_default == MDMF_VERSION:
+            mdmf_input = T.input(type='radio', name='mutable-type',
+                                 value='mdmf', id='mutable-type-mdmf',
+                                 checked='checked')
+        else:
+            mdmf_input = T.input(type='radio', name='mutable-type',
+                                 value='mdmf', id='mutable-type-mdmf')
+
+        if self.client.mutable_file_default == SDMF_VERSION:
+            sdmf_input = T.input(type='radio', name='mutable-type',
+                                 value='sdmf', id='mutable-type-sdmf',
+                                 checked='checked')
+        else:
+            sdmf_input = T.input(type='radio', name='mutable-type',
+                                 value='sdmf', id='mutable-type-sdmf')
+
+
         form = T.form(action="uri", method="post",
                       enctype="multipart/form-data")[
             T.fieldset[
hunk ./src/allmydata/web/root.py 356
                   T.input(type="file", name="file", class_="freeform-input-file")],
             T.input(type="hidden", name="t", value="upload"),
             T.div[T.input(type="checkbox", name="mutable"), T.label(for_="mutable")["Create mutable file"],
+                  sdmf_input, T.label(for_="mutable-type-sdmf")["SDMF"],
+                  mdmf_input,
+                  T.label(for_='mutable-type-mdmf')['MDMF (experimental)'],
                   " ", T.input(type="submit", value="Upload!")],
             ]]
         return T.div[form]
hunk ./src/allmydata/web/unlinked.py 7
 from twisted.internet import defer
 from nevow import rend, url, tags as T
 from allmydata.immutable.upload import FileHandle
+from allmydata.mutable.publish import MutableFileHandle
 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
      convert_children_json, WebError
 from allmydata.web import status
hunk ./src/allmydata/web/unlinked.py 20
     # that fires with the URI of the new file
     return d
 
-def PUTUnlinkedSSK(req, client):
+def PUTUnlinkedSSK(req, client, version):
     # SDMF: files are small, and we can only upload data
     req.content.seek(0)
hunk ./src/allmydata/web/unlinked.py 23
-    data = req.content.read()
-    d = client.create_mutable_file(data)
+    data = MutableFileHandle(req.content)
+    d = client.create_mutable_file(data, version=version)
     d.addCallback(lambda n: n.get_uri())
     return d
 
hunk ./src/allmydata/web/unlinked.py 83
                       ["/uri/" + res.uri])
         return d
 
-def POSTUnlinkedSSK(req, client):
+def POSTUnlinkedSSK(req, client, version):
     # "POST /uri", to create an unlinked file.
     # SDMF: files are small, and we can only upload data
hunk ./src/allmydata/web/unlinked.py 86
-    contents = req.fields["file"]
-    contents.file.seek(0)
-    data = contents.file.read()
-    d = client.create_mutable_file(data)
+    contents = req.fields["file"].file
+    data = MutableFileHandle(contents)
+    d = client.create_mutable_file(data, version=version)
     d.addCallback(lambda n: n.get_uri())
     return d
 
}
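Webapi usage sketches for the changes above (the node URL and $WRITECAP
are hypothetical placeholders):

    # create an unlinked mutable file in MDMF format
    curl -X PUT --data-binary @foo.txt \
         "http://127.0.0.1:3456/uri?mutable=true&mutable-type=mdmf"

    # modify a mutable file in place, starting at byte 1024; an offset
    # equal to the current size appends, and omitting offset (or passing
    # a negative value) replaces the whole file as before
    curl -X PUT --data-binary @more.bin \
         "http://127.0.0.1:3456/uri/$WRITECAP?offset=1024"
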
1204[client.py: learn how to create different kinds of mutable files
1205Kevan Carstensen <kevan@isnotajoke.com>**20100814225711
1206 Ignore-this: 61ff665bc050cba5f58bf2ed779d692b
1207] {
1208hunk ./src/allmydata/client.py 25
1209 from allmydata.util.time_format import parse_duration, parse_date
1210 from allmydata.stats import StatsProvider
1211 from allmydata.history import History
1212-from allmydata.interfaces import IStatsProducer, RIStubClient
1213+from allmydata.interfaces import IStatsProducer, RIStubClient, \
1214+                                 SDMF_VERSION, MDMF_VERSION
1215 from allmydata.nodemaker import NodeMaker
1216 
1217 
1218hunk ./src/allmydata/client.py 357
1219                                    self.terminator,
1220                                    self.get_encoding_parameters(),
1221                                    self._key_generator)
1222+        default = self.get_config("client", "mutable.format", default="sdmf")
1223+        if default == "mdmf":
1224+            self.mutable_file_default = MDMF_VERSION
1225+        else:
1226+            self.mutable_file_default = SDMF_VERSION
1227 
1228     def get_history(self):
1229         return self.history
1230hunk ./src/allmydata/client.py 500
1231     def create_immutable_dirnode(self, children, convergence=None):
1232         return self.nodemaker.create_immutable_directory(children, convergence)
1233 
1234-    def create_mutable_file(self, contents=None, keysize=None):
1235-        return self.nodemaker.create_mutable_file(contents, keysize)
1236+    def create_mutable_file(self, contents=None, keysize=None, version=None):
1237+        if not version:
1238+            version = self.mutable_file_default
1239+        return self.nodemaker.create_mutable_file(contents, keysize,
1240+                                                  version=version)
1241 
1242     def upload(self, uploadable):
1243         uploader = self.getServiceNamed("uploader")
1244}
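
The new default is driven by a `mutable.format` knob in the `[client]` section of tahoe.cfg: per the `get_config` call above, a value of `mdmf` makes newly created mutable files default to MDMF, and any other value (or no value at all) falls back to SDMF. For example:

    [client]
    mutable.format = mdmf
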
1245[mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
1246Kevan Carstensen <kevan@isnotajoke.com>**20100819003216
1247 Ignore-this: d3bd3260742be8964877f0a53543b01b
1248 
1249 The checker and repairer required minimal changes to work with the MDMF
1250 modifications made elsewhere. The checker duplicated a lot of the code
1251 that was already in the downloader, so I modified the downloader
1252 slightly to expose this functionality to the checker and removed the
1253 duplicated code. The repairer only required a minor change to deal with
1254 data representation.
1255] {
1256hunk ./src/allmydata/mutable/checker.py 2
1257 
1258-from twisted.internet import defer
1259-from twisted.python import failure
1260-from allmydata import hashtree
1261 from allmydata.uri import from_string
1262hunk ./src/allmydata/mutable/checker.py 3
1263-from allmydata.util import hashutil, base32, idlib, log
1264+from allmydata.util import base32, idlib, log
1265 from allmydata.check_results import CheckAndRepairResults, CheckResults
1266 
1267 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
1268hunk ./src/allmydata/mutable/checker.py 8
1269 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
1270-from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
1271+from allmydata.mutable.retrieve import Retrieve # for verifying
1272 
1273 class MutableChecker:
1274 
1275hunk ./src/allmydata/mutable/checker.py 25
1276 
1277     def check(self, verify=False, add_lease=False):
1278         servermap = ServerMap()
1279+        # Updating the servermap in MODE_CHECK will stand a good chance
1280+        # of finding all of the shares, and getting a good idea of
1281+        # recoverability, etc, without verifying.
1282         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
1283                              servermap, MODE_CHECK, add_lease=add_lease)
1284         if self._history:
1285hunk ./src/allmydata/mutable/checker.py 51
1286         if num_recoverable:
1287             self.best_version = servermap.best_recoverable_version()
1288 
1289+        # The file is unhealthy and needs to be repaired if:
1290+        # - There are unrecoverable versions.
1291         if servermap.unrecoverable_versions():
1292             self.need_repair = True
1293hunk ./src/allmydata/mutable/checker.py 55
1294+        # - There isn't a recoverable version.
1295         if num_recoverable != 1:
1296             self.need_repair = True
1297hunk ./src/allmydata/mutable/checker.py 58
1298+        # - The best recoverable version is missing some shares.
1299         if self.best_version:
1300             available_shares = servermap.shares_available()
1301             (num_distinct_shares, k, N) = available_shares[self.best_version]
1302hunk ./src/allmydata/mutable/checker.py 69
1303 
1304     def _verify_all_shares(self, servermap):
1305         # read every byte of each share
1306+        #
1307+        # This logic is going to be very nearly the same as the
1308+        # downloader. I bet we could pass the downloader a flag that
1309+        # makes it do this, and piggyback onto that instead of
1310+        # duplicating a bunch of code.
1311+        #
1312+        # Like:
1313+        #  r = Retrieve(blah, blah, blah, verify=True)
1314+        #  d = r.download()
1315+        #  (wait, wait, wait, d.callback)
1316+        # 
1317+        #  Then, when it has finished, we can check the servermap (which
1318+        #  we provided to Retrieve) to figure out which shares are bad,
1319+        #  since the Retrieve process will have updated the servermap as
1320+        #  it went along.
1321+        #
1322+        #  By passing the verify=True flag to the constructor, we are
1323+        #  telling the downloader a few things.
1324+        #
1325+        #  1. It needs to download all N shares, not just K shares.
1326+        #  2. It doesn't need to decrypt or decode the shares, only
1327+        #     verify them.
1328         if not self.best_version:
1329             return
1330hunk ./src/allmydata/mutable/checker.py 93
1331-        versionmap = servermap.make_versionmap()
1332-        shares = versionmap[self.best_version]
1333-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1334-         offsets_tuple) = self.best_version
1335-        offsets = dict(offsets_tuple)
1336-        readv = [ (0, offsets["EOF"]) ]
1337-        dl = []
1338-        for (shnum, peerid, timestamp) in shares:
1339-            ss = servermap.connections[peerid]
1340-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1341-            d.addCallback(self._got_answer, peerid, servermap)
1342-            dl.append(d)
1343-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
1344 
1345hunk ./src/allmydata/mutable/checker.py 94
1346-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1347-        # isolate the callRemote to a separate method, so tests can subclass
1348-        # Publish and override it
1349-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1350+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
1351+        d = r.download()
1352+        d.addCallback(self._process_bad_shares)
1353         return d
1354 
1355hunk ./src/allmydata/mutable/checker.py 99
1356-    def _got_answer(self, datavs, peerid, servermap):
1357-        for shnum,datav in datavs.items():
1358-            data = datav[0]
1359-            try:
1360-                self._got_results_one_share(shnum, peerid, data)
1361-            except CorruptShareError:
1362-                f = failure.Failure()
1363-                self.need_repair = True
1364-                self.bad_shares.append( (peerid, shnum, f) )
1365-                prefix = data[:SIGNED_PREFIX_LENGTH]
1366-                servermap.mark_bad_share(peerid, shnum, prefix)
1367-                ss = servermap.connections[peerid]
1368-                self.notify_server_corruption(ss, shnum, str(f.value))
1369-
1370-    def check_prefix(self, peerid, shnum, data):
1371-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1372-         offsets_tuple) = self.best_version
1373-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
1374-        if got_prefix != prefix:
1375-            raise CorruptShareError(peerid, shnum,
1376-                                    "prefix mismatch: share changed while we were reading it")
1377-
1378-    def _got_results_one_share(self, shnum, peerid, data):
1379-        self.check_prefix(peerid, shnum, data)
1380-
1381-        # the [seqnum:signature] pieces are validated by _compare_prefix,
1382-        # which checks their signature against the pubkey known to be
1383-        # associated with this file.
1384 
1385hunk ./src/allmydata/mutable/checker.py 100
1386-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
1387-         share_hash_chain, block_hash_tree, share_data,
1388-         enc_privkey) = unpack_share(data)
1389-
1390-        # validate [share_hash_chain,block_hash_tree,share_data]
1391-
1392-        leaves = [hashutil.block_hash(share_data)]
1393-        t = hashtree.HashTree(leaves)
1394-        if list(t) != block_hash_tree:
1395-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
1396-        share_hash_leaf = t[0]
1397-        t2 = hashtree.IncompleteHashTree(N)
1398-        # root_hash was checked by the signature
1399-        t2.set_hashes({0: root_hash})
1400-        try:
1401-            t2.set_hashes(hashes=share_hash_chain,
1402-                          leaves={shnum: share_hash_leaf})
1403-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
1404-                IndexError), e:
1405-            msg = "corrupt hashes: %s" % (e,)
1406-            raise CorruptShareError(peerid, shnum, msg)
1407-
1408-        # validate enc_privkey: only possible if we have a write-cap
1409-        if not self._node.is_readonly():
1410-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
1411-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
1412-            if alleged_writekey != self._node.get_writekey():
1413-                raise CorruptShareError(peerid, shnum, "invalid privkey")
1414+    def _process_bad_shares(self, bad_shares):
1415+        if bad_shares:
1416+            self.need_repair = True
1417+        self.bad_shares = bad_shares
1418 
1419hunk ./src/allmydata/mutable/checker.py 105
1420-    def notify_server_corruption(self, ss, shnum, reason):
1421-        ss.callRemoteOnly("advise_corrupt_share",
1422-                          "mutable", self._storage_index, shnum, reason)
1423 
1424     def _count_shares(self, smap, version):
1425         available_shares = smap.shares_available()
1426hunk ./src/allmydata/mutable/repairer.py 5
1427 from zope.interface import implements
1428 from twisted.internet import defer
1429 from allmydata.interfaces import IRepairResults, ICheckResults
1430+from allmydata.mutable.publish import MutableData
1431 
1432 class RepairResults:
1433     implements(IRepairResults)
1434hunk ./src/allmydata/mutable/repairer.py 108
1435             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
1436 
1437         d = self.node.download_version(smap, best_version, fetch_privkey=True)
1438+        d.addCallback(lambda data:
1439+            MutableData(data))
1440         d.addCallback(self.node.upload, smap)
1441         d.addCallback(self.get_results, smap)
1442         return d
1443}
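
In miniature, the verification path that replaces the hand-rolled share reads above now looks like this (a sketch, assuming `node` is the MutableFileNode under check and `servermap` has already been updated in MODE_CHECK):

    from allmydata.mutable.retrieve import Retrieve

    best = servermap.best_recoverable_version()
    r = Retrieve(node, servermap, best, verify=True)
    d = r.download()
    # verify=True makes Retrieve fetch all N shares and check them
    # without decrypting or decoding; it marks bad shares in the
    # servermap as it goes, and fires d with the list of bad shares,
    # which is what MutableChecker._process_bad_shares consumes.
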
1444[mutable/filenode.py: add versions and partial-file updates to the mutable file node
1445Kevan Carstensen <kevan@isnotajoke.com>**20100819003231
1446 Ignore-this: b7b5434201fdb9b48f902d7ab25ef45c
1447 
1448 One of the goals of MDMF as a GSoC project is to lay the groundwork for
1449 LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
1450 multiple versions of a single cap on the grid. In line with this, there
1451 is now a distinction between an overriding mutable file (which can be
1452 thought to correspond to the cap/unique identifier for that mutable
1453 file) and versions of the mutable file (which we can download, update,
1454 and so on). All download, upload, and modification operations end up
1455 happening on a particular version of a mutable file, but there are
1456 shortcut methods on the object representing the overriding mutable file
1457 that perform these operations on the best version of the mutable file
1458 (which is what code should be doing until we have LDMF and better
1459 support for other paradigms).
1460 
1461 Another goal of MDMF was to take advantage of segmentation to give
1462 callers more efficient partial file updates or appends. This patch
1463 implements methods that do that, too.
1464 
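
As an editorial aside before the hunks, the two-level API described above reduces to a few calls (a sketch, assuming `node` is a MutableFileNode with recoverable shares on the grid):

    from allmydata.mutable.publish import MutableData

    # read path: pick the best recoverable version, then download it
    d = node.get_best_readable_version()
    d.addCallback(lambda version: version.download_to_data())

    # write path: the best mutable version supports overwrite/modify
    d2 = node.get_best_mutable_version()
    d2.addCallback(lambda version: version.overwrite(MutableData("new")))

    # modifier functions receive the old contents and return the new
    # contents as a string, or None for "no change"
    def add_suffix(old_contents, servermap, first_time):
        return old_contents + "!"
    d3 = node.modify(add_suffix)
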
1465] {
1466hunk ./src/allmydata/mutable/filenode.py 7
1467 from zope.interface import implements
1468 from twisted.internet import defer, reactor
1469 from foolscap.api import eventually
1470-from allmydata.interfaces import IMutableFileNode, \
1471-     ICheckable, ICheckResults, NotEnoughSharesError
1472-from allmydata.util import hashutil, log
1473+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
1474+     NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION, IMutableUploadable, \
1475+     IMutableFileVersion, IWritable
1476+from allmydata.util import hashutil, log, consumer, deferredutil, mathutil
1477 from allmydata.util.assertutil import precondition
1478 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
1479 from allmydata.monitor import Monitor
1480hunk ./src/allmydata/mutable/filenode.py 16
1481 from pycryptopp.cipher.aes import AES
1482 
1483-from allmydata.mutable.publish import Publish
1484+from allmydata.mutable.publish import Publish, MutableData,\
1485+                                      DEFAULT_MAX_SEGMENT_SIZE, \
1486+                                      TransformingUploadable
1487 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
1488      ResponseCache, UncoordinatedWriteError
1489 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
1490hunk ./src/allmydata/mutable/filenode.py 70
1491         self._sharemap = {} # known shares, shnum-to-[nodeids]
1492         self._cache = ResponseCache()
1493         self._most_recent_size = None
1494+        # filled in after __init__ if we're being created for the first time;
1495+        # filled in by the servermap updater before publishing, otherwise.
1496+        # left as None in case neither of those things happens, or in
1497+        # case the servermap can't find any shares to tell us what to
1498+        # publish as.
1499+        # TODO: this previously defaulted to a concrete protocol
1500+        #       version; find out why some tests failed when it was None.
1501+        self._protocol_version = None
1502 
1503         # all users of this MutableFileNode go through the serializer. This
1504         # takes advantage of the fact that Deferreds discard the callbacks
1505hunk ./src/allmydata/mutable/filenode.py 134
1506         return self._upload(initial_contents, None)
1507 
1508     def _get_initial_contents(self, contents):
1509-        if isinstance(contents, str):
1510-            return contents
1511         if contents is None:
1512hunk ./src/allmydata/mutable/filenode.py 135
1513-            return ""
1514+            return MutableData("")
1515+
1516+        if IMutableUploadable.providedBy(contents):
1517+            return contents
1518+
1519         assert callable(contents), "%s should be callable, not %s" % \
1520                (contents, type(contents))
1521         return contents(self)
1522hunk ./src/allmydata/mutable/filenode.py 209
1523 
1524     def get_size(self):
1525         return self._most_recent_size
1526+
1527     def get_current_size(self):
1528         d = self.get_size_of_best_version()
1529         d.addCallback(self._stash_size)
1530hunk ./src/allmydata/mutable/filenode.py 214
1531         return d
1532+
1533     def _stash_size(self, size):
1534         self._most_recent_size = size
1535         return size
1536hunk ./src/allmydata/mutable/filenode.py 273
1537             return cmp(self.__class__, them.__class__)
1538         return cmp(self._uri, them._uri)
1539 
1540-    def _do_serialized(self, cb, *args, **kwargs):
1541-        # note: to avoid deadlock, this callable is *not* allowed to invoke
1542-        # other serialized methods within this (or any other)
1543-        # MutableFileNode. The callable should be a bound method of this same
1544-        # MFN instance.
1545-        d = defer.Deferred()
1546-        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
1547-        # we need to put off d.callback until this Deferred is finished being
1548-        # processed. Otherwise the caller's subsequent activities (like,
1549-        # doing other things with this node) can cause reentrancy problems in
1550-        # the Deferred code itself
1551-        self._serializer.addBoth(lambda res: eventually(d.callback, res))
1552-        # add a log.err just in case something really weird happens, because
1553-        # self._serializer stays around forever, therefore we won't see the
1554-        # usual Unhandled Error in Deferred that would give us a hint.
1555-        self._serializer.addErrback(log.err)
1556-        return d
1557 
1558     #################################
1559     # ICheckable
1560hunk ./src/allmydata/mutable/filenode.py 298
1561 
1562 
1563     #################################
1564-    # IMutableFileNode
1565+    # IFileNode
1566+
1567+    def get_best_readable_version(self):
1568+        """
1569+        I return a Deferred that fires with a MutableFileVersion
1570+        representing the best readable version of the file that I
1571+        represent.
1572+        """
1573+        return self.get_readable_version()
1574+
1575+
1576+    def get_readable_version(self, servermap=None, version=None):
1577+        """
1578+        I return a Deferred that fires with a MutableFileVersion for my
1579+        version argument, if there is a recoverable file of that version
1580+        on the grid. If there is no recoverable version, I fire with an
1581+        UnrecoverableFileError.
1582+
1583+        If a servermap is provided, I look in there for the requested
1584+        version. If no servermap is provided, I create and update a new
1585+        one.
1586+
1587+        If no version is provided, then I return a MutableFileVersion
1588+        representing the best recoverable version of the file.
1589+        """
1590+        d = self._get_version_from_servermap(MODE_READ, servermap, version)
1591+        def _build_version((servermap, their_version)):
1592+            assert their_version in servermap.recoverable_versions()
1593+            assert their_version in servermap.make_versionmap()
1594+
1595+            mfv = MutableFileVersion(self,
1596+                                     servermap,
1597+                                     their_version,
1598+                                     self._storage_index,
1599+                                     self._storage_broker,
1600+                                     self._readkey,
1601+                                     history=self._history)
1602+            assert mfv.is_readonly()
1603+            # our caller can use this to download the contents of the
1604+            # mutable file.
1605+            return mfv
1606+        return d.addCallback(_build_version)
1607+
1608+
1609+    def _get_version_from_servermap(self,
1610+                                    mode,
1611+                                    servermap=None,
1612+                                    version=None):
1613+        """
1614+        I return a Deferred that fires with (servermap, version).
1615+
1616+        This function performs validation and a servermap update. If it
1617+        returns (servermap, version), the caller can assume that:
1618+            - servermap was last updated in mode.
1619+            - version is recoverable, and corresponds to the servermap.
1620+
1621+        If version and servermap are provided to me, I will validate
1622+        that version exists in the servermap, and that the servermap was
1623+        updated correctly.
1624+
1625+        If version is not provided, but servermap is, I will validate
1626+        the servermap and return the best recoverable version that I can
1627+        find in the servermap.
1628+
1629+        If the version is provided but the servermap isn't, I will
1630+        obtain a servermap that has been updated in the correct mode and
1631+        validate that version is found and recoverable.
1632+
1633+        If neither servermap nor version are provided, I will obtain a
1634+        servermap updated in the correct mode, and return the best
1635+        recoverable version that I can find in there.
1636+        """
1637+        # XXX: wording ^^^^
1638+        if servermap and servermap.last_update_mode == mode:
1639+            d = defer.succeed(servermap)
1640+        else:
1641+            d = self._get_servermap(mode)
1642+
1643+        def _get_version(servermap, v):
1644+            if v and v not in servermap.recoverable_versions():
1645+                v = None
1646+            elif not v:
1647+                v = servermap.best_recoverable_version()
1648+            if not v:
1649+                raise UnrecoverableFileError("no recoverable versions")
1650+
1651+            return (servermap, v)
1652+        return d.addCallback(_get_version, version)
1653+
1654 
1655     def download_best_version(self):
1656hunk ./src/allmydata/mutable/filenode.py 389
1657+        """
1658+        I return a Deferred that fires with the contents of the best
1659+        version of this mutable file.
1660+        """
1661         return self._do_serialized(self._download_best_version)
1662hunk ./src/allmydata/mutable/filenode.py 394
1663+
1664+
1665     def _download_best_version(self):
1666hunk ./src/allmydata/mutable/filenode.py 397
1667-        servermap = ServerMap()
1668-        d = self._try_once_to_download_best_version(servermap, MODE_READ)
1669-        def _maybe_retry(f):
1670-            f.trap(NotEnoughSharesError)
1671-            # the download is worth retrying once. Make sure to use the
1672-            # old servermap, since it is what remembers the bad shares,
1673-            # but use MODE_WRITE to make it look for even more shares.
1674-            # TODO: consider allowing this to retry multiple times.. this
1675-            # approach will let us tolerate about 8 bad shares, I think.
1676-            return self._try_once_to_download_best_version(servermap,
1677-                                                           MODE_WRITE)
1678+        """
1679+        I am the serialized sibling of download_best_version.
1680+        """
1681+        d = self.get_best_readable_version()
1682+        d.addCallback(self._record_size)
1683+        d.addCallback(lambda version: version.download_to_data())
1684+
1685+        # It is possible that the download will fail because there
1686+        # aren't enough shares to be had. If so, we will try again after
1687+        # updating the servermap in MODE_WRITE, which may find more
1688+        # shares than updating in MODE_READ, as we just did. We can do
1689+        # this by getting the best mutable version and downloading from
1690+        # that -- the best mutable version will be a MutableFileVersion
1691+        # with a servermap that was last updated in MODE_WRITE, as we
1692+        # want. If this fails, then we give up.
1693+        def _maybe_retry(failure):
1694+            failure.trap(NotEnoughSharesError)
1695+
1696+            d = self.get_best_mutable_version()
1697+            d.addCallback(self._record_size)
1698+            d.addCallback(lambda version: version.download_to_data())
1699+            return d
1700+
1701         d.addErrback(_maybe_retry)
1702         return d
1703hunk ./src/allmydata/mutable/filenode.py 422
1704-    def _try_once_to_download_best_version(self, servermap, mode):
1705-        d = self._update_servermap(servermap, mode)
1706-        d.addCallback(self._once_updated_download_best_version, servermap)
1707-        return d
1708-    def _once_updated_download_best_version(self, ignored, servermap):
1709-        goal = servermap.best_recoverable_version()
1710-        if not goal:
1711-            raise UnrecoverableFileError("no recoverable versions")
1712-        return self._try_once_to_download_version(servermap, goal)
1713+
1714+
1715+    def _record_size(self, mfv):
1716+        """
1717+        I record the size of a mutable file version.
1718+        """
1719+        self._most_recent_size = mfv.get_size()
1720+        return mfv
1721+
1722 
1723     def get_size_of_best_version(self):
1724hunk ./src/allmydata/mutable/filenode.py 433
1725-        d = self.get_servermap(MODE_READ)
1726-        def _got_servermap(smap):
1727-            ver = smap.best_recoverable_version()
1728-            if not ver:
1729-                raise UnrecoverableFileError("no recoverable version")
1730-            return smap.size_of_version(ver)
1731-        d.addCallback(_got_servermap)
1732-        return d
1733+        """
1734+        I return the size of the best version of this mutable file.
1735 
1736hunk ./src/allmydata/mutable/filenode.py 436
1737+        This is equivalent to calling get_size() on the result of
1738+        get_best_readable_version().
1739+        """
1740+        d = self.get_best_readable_version()
1741+        return d.addCallback(lambda mfv: mfv.get_size())
1742+
1743+
1744+    #################################
1745+    # IMutableFileNode
1746+
1747+    def get_best_mutable_version(self, servermap=None):
1748+        """
1749+        I return a Deferred that fires with a MutableFileVersion
1750+        representing the best readable version of the file that I
1751+        represent. I am like get_best_readable_version, except that I
1752+        will try to make a writable version if I can.
1753+        """
1754+        return self.get_mutable_version(servermap=servermap)
1755+
1756+
1757+    def get_mutable_version(self, servermap=None, version=None):
1758+        """
1759+        I return a version of this mutable file, as a Deferred that
1760+        fires with a MutableFileVersion.
1761+
1762+        If version is provided, the Deferred will fire with a
1763+        MutableFileVersion initialized with that version. Otherwise, it
1764+        will fire with the best version that I can recover.
1765+
1766+        If servermap is provided, I will use that to find versions
1767+        instead of performing my own servermap update.
1768+        """
1769+        if self.is_readonly():
1770+            return self.get_readable_version(servermap=servermap,
1771+                                             version=version)
1772+
1773+        # get_mutable_version => write intent, so we require that the
1774+        # servermap is updated in MODE_WRITE
1775+        d = self._get_version_from_servermap(MODE_WRITE, servermap, version)
1776+        def _build_version((servermap, smap_version)):
1777+            # these should have been set by the servermap update.
1778+            assert self._secret_holder
1779+            assert self._writekey
1780+
1781+            mfv = MutableFileVersion(self,
1782+                                     servermap,
1783+                                     smap_version,
1784+                                     self._storage_index,
1785+                                     self._storage_broker,
1786+                                     self._readkey,
1787+                                     self._writekey,
1788+                                     self._secret_holder,
1789+                                     history=self._history)
1790+            assert not mfv.is_readonly()
1791+            return mfv
1792+
1793+        return d.addCallback(_build_version)
1794+
1795+
1796+    # XXX: I'm uncomfortable with the difference between upload and
1797+    #      overwrite, which, FWICT, is basically that you don't have to
1798+    #      do a servermap update before you overwrite. We split them up
1799+    #      that way anyway, so I guess there's no real difficulty in
1800+    #      offering both ways to callers, but it also makes the
1801+    #      public-facing API cluttery, and makes it hard to discern the
1802+    #      right way of doing things.
1803+
1804+    # In general, we leave it to callers to ensure that they aren't
1805+    # going to cause UncoordinatedWriteErrors when working with
1806+    # MutableFileVersions. We know that the next three operations
1807+    # (upload, overwrite, and modify) will all operate on the same
1808+    # version, so we say that only one of them can be going on at once,
1809+    # and serialize them to ensure that that actually happens, since as
1810+    # the caller in this situation it is our job to do that.
1811     def overwrite(self, new_contents):
1812hunk ./src/allmydata/mutable/filenode.py 511
1813+        """
1814+        I overwrite the contents of the best recoverable version of this
1815+        mutable file with new_contents. This is equivalent to calling
1816+        overwrite on the result of get_best_mutable_version with
1817+        new_contents as an argument. I return a Deferred that eventually
1818+        fires with the results of my replacement process.
1819+        """
1820         return self._do_serialized(self._overwrite, new_contents)
1821hunk ./src/allmydata/mutable/filenode.py 519
1822+
1823+
1824     def _overwrite(self, new_contents):
1825hunk ./src/allmydata/mutable/filenode.py 522
1826+        """
1827+        I am the serialized sibling of overwrite.
1828+        """
1829+        d = self.get_best_mutable_version()
1830+        d.addCallback(lambda mfv: mfv.overwrite(new_contents))
1831+        d.addCallback(self._did_upload, new_contents.get_size())
1832+        return d
1833+
1834+
1835+
1836+    def upload(self, new_contents, servermap):
1837+        """
1838+        I overwrite the contents of the best recoverable version of this
1839+        mutable file with new_contents, using servermap instead of
1840+        creating/updating our own servermap. I return a Deferred that
1841+        fires with the results of my upload.
1842+        """
1843+        return self._do_serialized(self._upload, new_contents, servermap)
1844+
1845+
1846+    def modify(self, modifier, backoffer=None):
1847+        """
1848+        I modify the contents of the best recoverable version of this
1849+        mutable file with the modifier. This is equivalent to calling
1850+        modify on the result of get_best_mutable_version. I return a
1851+        Deferred that eventually fires with an UploadResults instance
1852+        describing this process.
1853+        """
1854+        return self._do_serialized(self._modify, modifier, backoffer)
1855+
1856+
1857+    def _modify(self, modifier, backoffer):
1858+        """
1859+        I am the serialized sibling of modify.
1860+        """
1861+        d = self.get_best_mutable_version()
1862+        d.addCallback(lambda mfv: mfv.modify(modifier, backoffer))
1863+        return d
1864+
1865+
1866+    def download_version(self, servermap, version, fetch_privkey=False):
1867+        """
1868+        Download the specified version of this mutable file. I return a
1869+        Deferred that fires with the contents of the specified version
1870+        as a bytestring, or errbacks if the file is not recoverable.
1871+        """
1872+        d = self.get_readable_version(servermap, version)
1873+        return d.addCallback(lambda mfv: mfv.download_to_data(fetch_privkey))
1874+
1875+
1876+    def get_servermap(self, mode):
1877+        """
1878+        I return a servermap that has been updated in mode.
1879+
1880+        mode should be one of MODE_READ, MODE_WRITE, MODE_CHECK or
1881+        MODE_ANYTHING. See servermap.py for more on what these mean.
1882+        """
1883+        return self._do_serialized(self._get_servermap, mode)
1884+
1885+
1886+    def _get_servermap(self, mode):
1887+        """
1888+        I am a serialized twin to get_servermap.
1889+        """
1890         servermap = ServerMap()
1891hunk ./src/allmydata/mutable/filenode.py 587
1892-        d = self._update_servermap(servermap, mode=MODE_WRITE)
1893-        d.addCallback(lambda ignored: self._upload(new_contents, servermap))
1894+        d = self._update_servermap(servermap, mode)
1895+        # The servermap will tell us about the most recent size of the
1896+        # file, so we may as well set that so that callers might get
1897+        # more data about us.
1898+        if not self._most_recent_size:
1899+            d.addCallback(self._get_size_from_servermap)
1900+        return d
1901+
1902+
1903+    def _get_size_from_servermap(self, servermap):
1904+        """
1905+        I extract the size of the best version of this file and record
1906+        it in self._most_recent_size. I return the servermap that I was
1907+        given.
1908+        """
1909+        if servermap.recoverable_versions():
1910+            v = servermap.best_recoverable_version()
1911+            size = v[4] # verinfo[4] == size
1912+            self._most_recent_size = size
1913+        return servermap
1914+
1915+
1916+    def _update_servermap(self, servermap, mode):
1917+        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
1918+                             mode)
1919+        if self._history:
1920+            self._history.notify_mapupdate(u.get_status())
1921+        return u.update()
1922+
1923+
1924+    def set_version(self, version):
1925+        # I can be set in two ways:
1926+        #  1. When the node is created.
1927+        #  2. (for an existing share) when the Servermap is updated
1928+        #     before I am read.
1929+        assert version in (MDMF_VERSION, SDMF_VERSION)
1930+        self._protocol_version = version
1931+
1932+
1933+    def get_version(self):
1934+        return self._protocol_version
1935+
1936+
1937+    def _do_serialized(self, cb, *args, **kwargs):
1938+        # note: to avoid deadlock, this callable is *not* allowed to invoke
1939+        # other serialized methods within this (or any other)
1940+        # MutableFileNode. The callable should be a bound method of this same
1941+        # MFN instance.
1942+        d = defer.Deferred()
1943+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
1944+        # we need to put off d.callback until this Deferred is finished being
1945+        # processed. Otherwise the caller's subsequent activities (like,
1946+        # doing other things with this node) can cause reentrancy problems in
1947+        # the Deferred code itself
1948+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
1949+        # add a log.err just in case something really weird happens, because
1950+        # self._serializer stays around forever, therefore we won't see the
1951+        # usual Unhandled Error in Deferred that would give us a hint.
1952+        self._serializer.addErrback(log.err)
1953         return d
1954 
1955 
1956hunk ./src/allmydata/mutable/filenode.py 649
1957+    def _upload(self, new_contents, servermap):
1958+        """
1959+        A MutableFileNode still has to have some way of getting
1960+        published initially, which is what I am here for. After that,
1961+        all publishing, updating, modifying and so on happens through
1962+        MutableFileVersions.
1963+        """
1964+        assert self._pubkey, "update_servermap must be called before publish"
1965+
1966+        p = Publish(self, self._storage_broker, servermap)
1967+        if self._history:
1968+            self._history.notify_publish(p.get_status(),
1969+                                         new_contents.get_size())
1970+        d = p.publish(new_contents)
1971+        d.addCallback(self._did_upload, new_contents.get_size())
1972+        return d
1973+
1974+
1975+    def _did_upload(self, res, size):
1976+        self._most_recent_size = size
1977+        return res
1978+
1979+
1980+class MutableFileVersion:
1981+    """
1982+    I represent a specific version (most likely the best version) of a
1983+    mutable file.
1984+
1985+    Since I implement IReadable, instances which hold a
1986+    reference to an instance of me are guaranteed the ability (absent
1987+    connection difficulties or unrecoverable versions) to read the file
1988+    that I represent. Depending on whether I was initialized with a
1989+    write capability or not, I may also provide callers the ability to
1990+    overwrite or modify the contents of the mutable file that I
1991+    reference.
1992+    """
1993+    implements(IMutableFileVersion, IWritable)
1994+
1995+    def __init__(self,
1996+                 node,
1997+                 servermap,
1998+                 version,
1999+                 storage_index,
2000+                 storage_broker,
2001+                 readcap,
2002+                 writekey=None,
2003+                 write_secrets=None,
2004+                 history=None):
2005+
2006+        self._node = node
2007+        self._servermap = servermap
2008+        self._version = version
2009+        self._storage_index = storage_index
2010+        self._write_secrets = write_secrets
2011+        self._history = history
2012+        self._storage_broker = storage_broker
2013+
2014+        #assert isinstance(readcap, IURI)
2015+        self._readcap = readcap
2016+
2017+        self._writekey = writekey
2018+        self._serializer = defer.succeed(None)
2019+
2020+
2021+    def get_sequence_number(self):
2022+        """
2023+        Get the sequence number of the mutable version that I represent.
2024+        """
2025+        return self._version[0] # verinfo[0] == the sequence number
2026+
2027+
2028+    # TODO: Terminology?
2029+    def get_writekey(self):
2030+        """
2031+        I return a writekey or None if I don't have a writekey.
2032+        """
2033+        return self._writekey
2034+
2035+
2036+    def overwrite(self, new_contents):
2037+        """
2038+        I overwrite the contents of this mutable file version with the
2039+        data in new_contents.
2040+        """
2041+        assert not self.is_readonly()
2042+
2043+        return self._do_serialized(self._overwrite, new_contents)
2044+
2045+
2046+    def _overwrite(self, new_contents):
2047+        assert IMutableUploadable.providedBy(new_contents)
2048+        assert self._servermap.last_update_mode == MODE_WRITE
2049+
2050+        return self._upload(new_contents)
2051+
2052+
2053     def modify(self, modifier, backoffer=None):
2054         """I use a modifier callback to apply a change to the mutable file.
2055         I implement the following pseudocode::
2056hunk ./src/allmydata/mutable/filenode.py 785
2057         backoffer should not invoke any methods on this MutableFileNode
2058         instance, and it needs to be highly conscious of deadlock issues.
2059         """
2060+        assert not self.is_readonly()
2061+
2062         return self._do_serialized(self._modify, modifier, backoffer)
2063hunk ./src/allmydata/mutable/filenode.py 788
2064+
2065+
2066     def _modify(self, modifier, backoffer):
2067hunk ./src/allmydata/mutable/filenode.py 791
2068-        servermap = ServerMap()
2069         if backoffer is None:
2070             backoffer = BackoffAgent().delay
2071hunk ./src/allmydata/mutable/filenode.py 793
2072-        return self._modify_and_retry(servermap, modifier, backoffer, True)
2073-    def _modify_and_retry(self, servermap, modifier, backoffer, first_time):
2074-        d = self._modify_once(servermap, modifier, first_time)
2075+        return self._modify_and_retry(modifier, backoffer, True)
2076+
2077+
2078+    def _modify_and_retry(self, modifier, backoffer, first_time):
2079+        """
2080+        I try to apply modifier to the contents of this version of the
2081+        mutable file. If I succeed, I return an UploadResults instance
2082+        describing my success. If I fail, I try again after waiting for
2083+        a little bit.
2084+        """
2085+        log.msg("doing modify")
2086+        d = self._modify_once(modifier, first_time)
2087         def _retry(f):
2088             f.trap(UncoordinatedWriteError)
2089             d2 = defer.maybeDeferred(backoffer, self, f)
2090hunk ./src/allmydata/mutable/filenode.py 809
2091             d2.addCallback(lambda ignored:
2092-                           self._modify_and_retry(servermap, modifier,
2093+                           self._modify_and_retry(modifier,
2094                                                   backoffer, False))
2095             return d2
2096         d.addErrback(_retry)
2097hunk ./src/allmydata/mutable/filenode.py 814
2098         return d
2099-    def _modify_once(self, servermap, modifier, first_time):
2100-        d = self._update_servermap(servermap, MODE_WRITE)
2101-        d.addCallback(self._once_updated_download_best_version, servermap)
2102+
2103+
2104+    def _modify_once(self, modifier, first_time):
2105+        """
2106+        I attempt to apply a modifier to the contents of the mutable
2107+        file.
2108+        """
2109+        # XXX: This is wrong -- we could get more servers if we updated
2110+        # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to
2111+        # assert that the last update wasn't MODE_READ
2112+        assert self._servermap.last_update_mode == MODE_WRITE
2113+
2114+        # download_to_data is serialized, so we have to call this to
2115+        # avoid deadlock.
2116+        d = self._try_to_download_data()
2117         def _apply(old_contents):
2118hunk ./src/allmydata/mutable/filenode.py 830
2119-            new_contents = modifier(old_contents, servermap, first_time)
2120+            new_contents = modifier(old_contents, self._servermap, first_time)
2121+            precondition((isinstance(new_contents, str) or
2122+                          new_contents is None),
2123+                         "Modifier function must return a string "
2124+                         "or None")
2125+
2126             if new_contents is None or new_contents == old_contents:
2127hunk ./src/allmydata/mutable/filenode.py 837
2128+                log.msg("no changes")
2129                 # no changes need to be made
2130                 if first_time:
2131                     return
2132hunk ./src/allmydata/mutable/filenode.py 845
2133                 # recovery when it observes UCWE, we need to do a second
2134                 # publish. See #551 for details. We'll basically loop until
2135                 # we managed an uncontested publish.
2136-                new_contents = old_contents
2137-            precondition(isinstance(new_contents, str),
2138-                         "Modifier function must return a string or None")
2139-            return self._upload(new_contents, servermap)
2140+                old_uploadable = MutableData(old_contents)
2141+                new_contents = old_uploadable
2142+            else:
2143+                new_contents = MutableData(new_contents)
2144+
2145+            return self._upload(new_contents)
2146         d.addCallback(_apply)
2147         return d
2148 
2149hunk ./src/allmydata/mutable/filenode.py 854
2150-    def get_servermap(self, mode):
2151-        return self._do_serialized(self._get_servermap, mode)
2152-    def _get_servermap(self, mode):
2153-        servermap = ServerMap()
2154-        return self._update_servermap(servermap, mode)
2155-    def _update_servermap(self, servermap, mode):
2156-        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
2157-                             mode)
2158-        if self._history:
2159-            self._history.notify_mapupdate(u.get_status())
2160-        return u.update()
2161 
2162hunk ./src/allmydata/mutable/filenode.py 855
2163-    def download_version(self, servermap, version, fetch_privkey=False):
2164-        return self._do_serialized(self._try_once_to_download_version,
2165-                                   servermap, version, fetch_privkey)
2166-    def _try_once_to_download_version(self, servermap, version,
2167-                                      fetch_privkey=False):
2168-        r = Retrieve(self, servermap, version, fetch_privkey)
2169+    def is_readonly(self):
2170+        """
2171+        I return True if this MutableFileVersion provides no write
2172+        access to the file that it encapsulates, and False if it
2173+        provides the ability to modify the file.
2174+        """
2175+        return self._writekey is None
2176+
2177+
2178+    def is_mutable(self):
2179+        """
2180+        I return True, since mutable files are always mutable by
2181+        somebody.
2182+        """
2183+        return True
2184+
2185+
2186+    def get_storage_index(self):
2187+        """
2188+        I return the storage index of the reference that I encapsulate.
2189+        """
2190+        return self._storage_index
2191+
2192+
2193+    def get_size(self):
2194+        """
2195+        I return the length, in bytes, of this readable object.
2196+        """
2197+        return self._servermap.size_of_version(self._version)
2198+
2199+
2200+    def download_to_data(self, fetch_privkey=False):
2201+        """
2202+        I return a Deferred that fires with the contents of this
2203+        readable object as a byte string.
2204+
2205+        """
2206+        c = consumer.MemoryConsumer()
2207+        d = self.read(c, fetch_privkey=fetch_privkey)
2208+        d.addCallback(lambda mc: "".join(mc.chunks))
2209+        return d
2210+
2211+
2212+    def _try_to_download_data(self):
2213+        """
2214+        I am an unserialized cousin of download_to_data; I am called
2215+        from the children of modify() to download the data associated
2216+        with this mutable version.
2217+        """
2218+        c = consumer.MemoryConsumer()
2219+        # modify will almost certainly write, so we need the privkey.
2220+        d = self._read(c, fetch_privkey=True)
2221+        d.addCallback(lambda mc: "".join(mc.chunks))
2222+        return d
2223+
2224+
2225+    def read(self, consumer, offset=0, size=None, fetch_privkey=False):
2226+        """
2227+        I read a portion (possibly all) of the mutable file that I
2228+        reference into consumer.
2229+        """
2230+        return self._do_serialized(self._read, consumer, offset, size,
2231+                                   fetch_privkey)
2232+
2233+
2234+    def _read(self, consumer, offset=0, size=None, fetch_privkey=False):
2235+        """
2236+        I am the serialized companion of read.
2237+        """
2238+        r = Retrieve(self._node, self._servermap, self._version, fetch_privkey)
2239         if self._history:
2240             self._history.notify_retrieve(r.get_status())
2241hunk ./src/allmydata/mutable/filenode.py 927
2242-        d = r.download()
2243-        d.addCallback(self._downloaded_version)
2244+        d = r.download(consumer, offset, size)
2245         return d
2246hunk ./src/allmydata/mutable/filenode.py 929
2247-    def _downloaded_version(self, data):
2248-        self._most_recent_size = len(data)
2249-        return data
2250 
2251hunk ./src/allmydata/mutable/filenode.py 930
2252-    def upload(self, new_contents, servermap):
2253-        return self._do_serialized(self._upload, new_contents, servermap)
2254-    def _upload(self, new_contents, servermap):
2255-        assert self._pubkey, "update_servermap must be called before publish"
2256-        p = Publish(self, self._storage_broker, servermap)
2257+
2258+    def _do_serialized(self, cb, *args, **kwargs):
2259+        # note: to avoid deadlock, this callable is *not* allowed to invoke
2260+        # other serialized methods within this (or any other)
2261+        # MutableFileNode. The callable should be a bound method of this same
2262+        # MFN instance.
2263+        d = defer.Deferred()
2264+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
2265+        # we need to put off d.callback until this Deferred is finished being
2266+        # processed. Otherwise the caller's subsequent activities (like,
2267+        # doing other things with this node) can cause reentrancy problems in
2268+        # the Deferred code itself
2269+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
2270+        # add a log.err just in case something really weird happens, because
2271+        # self._serializer stays around forever, therefore we won't see the
2272+        # usual Unhandled Error in Deferred that would give us a hint.
2273+        self._serializer.addErrback(log.err)
2274+        return d
2275+
2276+
2277+    def _upload(self, new_contents):
2278+        #assert self._pubkey, "update_servermap must be called before publish"
2279+        p = Publish(self._node, self._storage_broker, self._servermap)
2280         if self._history:
2281hunk ./src/allmydata/mutable/filenode.py 954
2282-            self._history.notify_publish(p.get_status(), len(new_contents))
2283+            self._history.notify_publish(p.get_status(),
2284+                                         new_contents.get_size())
2285         d = p.publish(new_contents)
2286hunk ./src/allmydata/mutable/filenode.py 957
2287-        d.addCallback(self._did_upload, len(new_contents))
2288+        d.addCallback(self._did_upload, new_contents.get_size())
2289         return d
2290hunk ./src/allmydata/mutable/filenode.py 959
2291+
2292+
2293     def _did_upload(self, res, size):
2294         self._most_recent_size = size
2295         return res
2296hunk ./src/allmydata/mutable/filenode.py 964
2297+
2298+    def update(self, data, offset):
2299+        """
2300+        Do an update of this mutable file version by inserting data at
2301+        offset within the file. If offset is the EOF, this is an append
2302+        operation. I return a Deferred that fires with the results of
2303+        the update operation when it has completed.
2304+
2305+        In cases where update does not append any data, or where it does
2306+        not append so many blocks that the block count crosses a
2307+        power-of-two boundary, this operation will use roughly
2308+        O(data.get_size()) memory/bandwidth/CPU to perform the update.
2309+        Otherwise, it must download, re-encode, and upload the entire
2310+        file again, which will use O(filesize) resources.
2311+        """
2312+        return self._do_serialized(self._update, data, offset)
2313+
2314+
2315+    def _update(self, data, offset):
2316+        """
2317+        I update the mutable file version represented by this particular
2318+        IMutableVersion by inserting the given data at the given
2319+        offset. I return a Deferred that fires when this has been
2320+        completed.
2321+        """
2322+        # We have two cases here:
2323+        # 1. The new data will add few enough segments so that it does
2324+        #    not cross into the next power-of-two boundary.
2325+        # 2. It doesn't.
2326+        #
2327+        # In the former case, we can modify the file in place. In the
2328+        # latter case, we need to re-encode the file.
2329+        new_size = data.get_size() + offset
2330+        old_size = self.get_size()
2331+        segment_size = self._version[3]
2332+        num_old_segments = mathutil.div_ceil(old_size,
2333+                                             segment_size)
2334+        num_new_segments = mathutil.div_ceil(new_size,
2335+                                             segment_size)
2336+        log.msg("got %d old segments, %d new segments" % \
2337+                        (num_old_segments, num_new_segments))
2338+
2339+        # We also do a whole file re-encode if the file is an SDMF file.
2340+        if self._version[2]: # version[2] == SDMF salt, which MDMF lacks
2341+            log.msg("doing re-encode instead of in-place update")
2342+            return self._do_modify_update(data, offset)
2343+
2344+        log.msg("updating in place")
2345+        d = self._do_update_update(data, offset)
2346+        d.addCallback(self._decode_and_decrypt_segments, data, offset)
2347+        d.addCallback(self._build_uploadable_and_finish, data, offset)
2348+        return d
2349+
2350+
2351+    def _do_modify_update(self, data, offset):
2352+        """
2353+        I perform a file update by modifying the contents of the file
2354+        after downloading it, then reuploading it. I am less efficient
2355+        than _do_update_update, but am necessary for certain updates.
2356+        """
2357+        def m(old, servermap, first_time):
2358+            start = offset
2359+            rest = offset + data.get_size()
2360+            new = old[:start]
2361+            new += "".join(data.read(data.get_size()))
2362+            new += old[rest:]
2363+            return new
2364+        return self._modify(m, None)
2365+
2366+
2367+    def _do_update_update(self, data, offset):
2368+        """
2369+        I start the Servermap update that gets us the data we need to
2370+        continue the update process. I return a Deferred that fires when
2371+        the servermap update is done.
2372+        """
2373+        assert IMutableUploadable.providedBy(data)
2374+        assert self.is_mutable()
2375+        # offset == self.get_size() is valid and means that we are
2376+        # appending data to the file.
2377+        assert offset <= self.get_size()
2378+
2379+        # We'll need the segment that the data starts in, regardless of
2380+        # what we'll do later.
2381+        start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE)
2382+        start_segment -= 1
2383+
2384+        # We only need a separate end segment if the data we write stops
2385+        # short of the current end-of-file, leaving a tail to preserve.
2386+        end_segment = start_segment
2387+        if offset + data.get_size() < self.get_size():
2388+            end_data = offset + data.get_size()
2389+            end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE)
2390+            end_segment -= 1
2391+        self._start_segment = start_segment
2392+        self._end_segment = end_segment
2393+
2394+        # Now ask for the servermap to be updated in MODE_WRITE with
2395+        # this update range.
2396+        u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
2397+                             self._servermap,
2398+                             mode=MODE_WRITE,
2399+                             update_range=(start_segment, end_segment))
2400+        return u.update()
2401+
2402+
2403+    def _decode_and_decrypt_segments(self, ignored, data, offset):
2404+        """
2405+        After the servermap update, I take the encrypted and encoded
2406+        data that the servermap fetched while doing its update and
2407+        transform it into decoded-and-decrypted plaintext that can be
2408+        used by the new uploadable. I return a Deferred that fires with
2409+        the segments.
2410+        """
2411+        r = Retrieve(self._node, self._servermap, self._version)
2412+        # decode: takes in our blocks and salts from the servermap,
2413+        # returns a Deferred that fires with the corresponding plaintext
2414+        # segments. Does not download -- simply takes advantage of
2415+        # existing infrastructure within the Retrieve class to avoid
2416+        # duplicating code.
2417+        sm = self._servermap
2418+        # XXX: If the methods in the servermap don't work as
2419+        # abstractions, you should rewrite them instead of going around
2420+        # them.
2421+        update_data = sm.update_data
2422+        start_segments = {} # shnum -> start segment
2423+        end_segments = {} # shnum -> end segment
2424+        blockhashes = {} # shnum -> blockhash tree
2425+        for (shnum, data) in update_data.iteritems():
2426+            data = [d[1] for d in data if d[0] == self._version]
2427+
2428+            # Every data entry in our list should now be for share shnum of
2429+            # a particular version of the mutable file, so all of the
2430+            # entries should be identical.
2431+            datum = data[0]
2432+            assert filter(lambda x: x != datum, data) == []
2433+
2434+            blockhashes[shnum] = datum[0]
2435+            start_segments[shnum] = datum[1]
2436+            end_segments[shnum] = datum[2]
2437+
2438+        d1 = r.decode(start_segments, self._start_segment)
2439+        d2 = r.decode(end_segments, self._end_segment)
2440+        d3 = defer.succeed(blockhashes)
2441+        return deferredutil.gatherResults([d1, d2, d3])
2442+
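
To illustrate the filtering and consistency check above: update_data maps each shnum to a list of (verinfo, (blockhashes, start-segment data, end-segment data)) entries, one per version the servermap saw. The shapes below are assumptions for illustration only; note that Python 2's filter() returns a list, as the assert above relies on:

    version = "v2"    # stands in for this file's verinfo tuple
    update_data = {
        0: [("v1", ("old-bht", "old-start", "old-end")),
            ("v2", ("bht-0", "start-0", "end-0")),
            ("v2", ("bht-0", "start-0", "end-0"))],
    }

    blockhashes, start_segments, end_segments = {}, {}, {}
    for shnum, entries in update_data.iteritems():
        data = [d[1] for d in entries if d[0] == version]
        datum = data[0]
        # all surviving entries describe the same share at the same
        # version, so they must be identical
        assert filter(lambda x: x != datum, data) == []
        blockhashes[shnum], start_segments[shnum], end_segments[shnum] = datum
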
2443+
2444+    def _build_uploadable_and_finish(self, segments_and_bht, data, offset):
2445+        """
2446+        After the process has the plaintext segments, I build the
2447+        TransformingUploadable that the publisher will eventually
2448+        re-upload to the grid. I then invoke the publisher with that
2449+        uploadable, and return a Deferred that fires when the publish
2450+        operation has completed without issue.
2451+        """
2452+        u = TransformingUploadable(data, offset,
2453+                                   self._version[3],
2454+                                   segments_and_bht[0],
2455+                                   segments_and_bht[1])
2456+        p = Publish(self._node, self._storage_broker, self._servermap)
2457+        return p.update(u, offset, segments_and_bht[2], self._version)
2458}
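
Taken together, the methods above form a three-phase pipeline. A hypothetical driver, assuming the servermap step is the method whose docstring opens this hunk (called _update_servermap here purely for illustration; error handling omitted):

    def update(self, data, offset):
        # 1. refresh the servermap, fetching only the boundary segments
        d = self._update_servermap(data, offset)
        # 2. decode and decrypt the segments we have to preserve
        d.addCallback(self._decode_and_decrypt_segments, data, offset)
        # 3. splice the new data in and republish
        d.addCallback(self._build_uploadable_and_finish, data, offset)
        return d
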
2459[mutable/publish.py: Modify the publish process to support MDMF
2460Kevan Carstensen <kevan@isnotajoke.com>**20100819003342
2461 Ignore-this: 2bb379974927e2e20cff75bae8302d1d
2462 
2463 The inner workings of the publishing process needed to be reworked to a
2464 large extent to cope with segmented mutable files, and to cope with
2465 partial-file updates of mutable files. This patch does that. It also
2466 introduces wrappers for uploadable data, allowing the use of
2467 filehandle-like objects as data sources, in addition to strings. This
2468 reduces memory usage when dealing with large files through the
2469 webapi, and clarifies the update code there.
2470] {
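
The "wrappers for uploadable data" mentioned in the description are introduced elsewhere in this patch; a minimal sketch of the shape such a wrapper takes, inferred from how the publisher calls it below (get_size(), plus a read() that returns a list of strings, hence the "".join() later). The class name and details here are illustrative, not the patch's actual code:

    class StringUploadable:
        """Wrap a string in the uploadable interface Publish expects,
        so strings and filehandle-like sources look the same."""
        def __init__(self, s):
            self._data = s
            self._offset = 0

        def get_size(self):
            return len(self._data)

        def read(self, length):
            # readv-style: return a list of strings
            chunk = self._data[self._offset:self._offset + length]
            self._offset += length
            return [chunk]

        def close(self):
            pass
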
2471hunk ./src/allmydata/mutable/publish.py 3
2472 
2473 
2474-import os, struct, time
2475+import os, time
2476+from StringIO import StringIO
2477 from itertools import count
2478 from zope.interface import implements
2479 from twisted.internet import defer
2480hunk ./src/allmydata/mutable/publish.py 9
2481 from twisted.python import failure
2482-from allmydata.interfaces import IPublishStatus
2483+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
2484+                                 IMutableUploadable
2485 from allmydata.util import base32, hashutil, mathutil, idlib, log
2486 from allmydata.util.dictutil import DictOfSets
2487 from allmydata import hashtree, codec
2488hunk ./src/allmydata/mutable/publish.py 21
2489 from allmydata.mutable.common import MODE_WRITE, MODE_CHECK, \
2490      UncoordinatedWriteError, NotEnoughServersError
2491 from allmydata.mutable.servermap import ServerMap
2492-from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
2493-     unpack_checkstring, SIGNED_PREFIX
2494+from allmydata.mutable.layout import unpack_checkstring, MDMFSlotWriteProxy, \
2495+                                     SDMFSlotWriteProxy
2496+
2497+KiB = 1024
2498+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
2499+PUSHING_BLOCKS_STATE = 0
2500+PUSHING_EVERYTHING_ELSE_STATE = 1
2501+DONE_STATE = 2
2502 
2503 class PublishStatus:
2504     implements(IPublishStatus)
2505hunk ./src/allmydata/mutable/publish.py 118
2506         self._status.set_helper(False)
2507         self._status.set_progress(0.0)
2508         self._status.set_active(True)
2509+        self._version = self._node.get_version()
2510+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
2511+
2512 
2513     def get_status(self):
2514         return self._status
2515hunk ./src/allmydata/mutable/publish.py 132
2516             kwargs["facility"] = "tahoe.mutable.publish"
2517         return log.msg(*args, **kwargs)
2518 
2519+
2520+    def update(self, data, offset, blockhashes, version):
2521+        """
2522+        I replace the contents of this file with the contents of data,
2523+        starting at offset. I return a Deferred that fires with None
2524+        when the replacement has been completed, or with an error if
2525+        something went wrong during the process.
2526+
2527+        Note that this process will not upload new shares. If the file
2528+        being updated is in need of repair, callers will have to repair
2529+        it on their own.
2530+        """
2531+        # How this works:
2532+        # 1. Make peer assignments. We'll assign each share that we know
2533+        # about on the grid to the peer that currently holds that
2534+        # share, and will not place any new shares.
2535+        # 2. Set up encoding parameters. Most of these will stay the same
2536+        # -- datalength will change, as will some of the offsets.
2537+        # 3. Upload the new segments.
2538+        # 4. Be done.
2539+        assert IMutableUploadable.providedBy(data)
2540+
2541+        self.data = data
2542+
2543+        # XXX: Use the MutableFileVersion instead.
2544+        self.datalength = self._node.get_size()
2545+        if data.get_size() > self.datalength:
2546+            self.datalength = data.get_size()
2547+
2548+        self.log("starting update")
2549+        self.log("adding new data of length %d at offset %d" % \
2550+                    (data.get_size(), offset))
2551+        self.log("new data length is %d" % self.datalength)
2552+        self._status.set_size(self.datalength)
2553+        self._status.set_status("Started")
2554+        self._started = time.time()
2555+
2556+        self.done_deferred = defer.Deferred()
2557+
2558+        self._writekey = self._node.get_writekey()
2559+        assert self._writekey, "need write capability to publish"
2560+
2561+        # first, which servers will we publish to? We require that the
2562+        # servermap was updated in MODE_WRITE, so we can depend upon the
2563+        # peerlist computed by that process instead of computing our own.
2564+        assert self._servermap
2565+        assert self._servermap.last_update_mode in (MODE_WRITE, MODE_CHECK)
2566+        # we will push a version that is one larger than anything present
2567+        # in the grid, according to the servermap.
2568+        self._new_seqnum = self._servermap.highest_seqnum() + 1
2569+        self._status.set_servermap(self._servermap)
2570+
2571+        self.log(format="new seqnum will be %(seqnum)d",
2572+                 seqnum=self._new_seqnum, level=log.NOISY)
2573+
2574+        # We're updating an existing file, so all of the following
2575+        # should be available.
2576+        self.readkey = self._node.get_readkey()
2577+        self.required_shares = self._node.get_required_shares()
2578+        assert self.required_shares is not None
2579+        self.total_shares = self._node.get_total_shares()
2580+        assert self.total_shares is not None
2581+        self._status.set_encoding(self.required_shares, self.total_shares)
2582+
2583+        self._pubkey = self._node.get_pubkey()
2584+        assert self._pubkey
2585+        self._privkey = self._node.get_privkey()
2586+        assert self._privkey
2587+        self._encprivkey = self._node.get_encprivkey()
2588+
2589+        sb = self._storage_broker
2590+        full_peerlist = sb.get_servers_for_index(self._storage_index)
2591+        self.full_peerlist = full_peerlist # for use later, immutable
2592+        self.bad_peers = set() # peerids who have errbacked/refused requests
2593+
2594+        # This will set self.segment_size, self.num_segments, and
2595+        # self.fec. TODO: Does it know how to do the offset? Probably
2596+        # not. So do that part next.
2597+        self.setup_encoding_parameters(offset=offset)
2598+
2599+        # if we experience any surprises (writes which were rejected because
2600+        # our test vector did not match, or shares which we didn't expect to
2601+        # see), we set this flag and report an UncoordinatedWriteError at the
2602+        # end of the publish process.
2603+        self.surprised = False
2604+
2605+        # we keep track of three tables. The first is our goal: which share
2606+        # we want to see on which servers. This is initially populated by the
2607+        # existing servermap.
2608+        self.goal = set() # pairs of (peerid, shnum) tuples
2609+
2610+        # the second table is our list of outstanding queries: those which
2611+        # are in flight and may or may not be delivered, accepted, or
2612+        # acknowledged. Items are added to this table when the request is
2613+        # sent, and removed when the response returns (or errbacks).
2614+        self.outstanding = set() # (peerid, shnum) tuples
2615+
2616+        # the third is a table of successes: shares which have actually been
2617+        # placed. These are populated when responses come back with success.
2618+        # When self.placed == self.goal, we're done.
2619+        self.placed = set() # (peerid, shnum) tuples
2620+
2621+        # we also keep a mapping from peerid to RemoteReference. Each time we
2622+        # pull a connection out of the full peerlist, we add it to this for
2623+        # use later.
2624+        self.connections = {}
2625+
2626+        self.bad_share_checkstrings = {}
2627+
2628+        # This is set at the last step of the publishing process.
2629+        self.versioninfo = ""
2630+
2631+        # we use the servermap to populate the initial goal: this way we will
2632+        # try to update each existing share in place. Since we're
2633+        # updating, we ignore damaged and missing shares -- callers must
2634+        # run a repair to recreate these.
2635+        for (peerid, shnum) in self._servermap.servermap:
2636+            self.goal.add( (peerid, shnum) )
2637+            self.connections[peerid] = self._servermap.connections[peerid]
2638+        self.writers = {}
2639+
2640+        # SDMF files are updated differently; in-place updates are MDMF-only.
2641+        self._version = MDMF_VERSION
2642+        writer_class = MDMFSlotWriteProxy
2643+
2644+        # For each (peerid, shnum) in self.goal, we make a
2645+        # write proxy for that peer. We'll use this to write
2646+        # shares to the peer.
2647+        for key in self.goal:
2648+            peerid, shnum = key
2649+            write_enabler = self._node.get_write_enabler(peerid)
2650+            renew_secret = self._node.get_renewal_secret(peerid)
2651+            cancel_secret = self._node.get_cancel_secret(peerid)
2652+            secrets = (write_enabler, renew_secret, cancel_secret)
2653+
2654+            self.writers[shnum] =  writer_class(shnum,
2655+                                                self.connections[peerid],
2656+                                                self._storage_index,
2657+                                                secrets,
2658+                                                self._new_seqnum,
2659+                                                self.required_shares,
2660+                                                self.total_shares,
2661+                                                self.segment_size,
2662+                                                self.datalength)
2663+            self.writers[shnum].peerid = peerid
2664+            assert (peerid, shnum) in self._servermap.servermap
2665+            old_versionid, old_timestamp = self._servermap.servermap[key]
2666+            (old_seqnum, old_root_hash, old_salt, old_segsize,
2667+             old_datalength, old_k, old_N, old_prefix,
2668+             old_offsets_tuple) = old_versionid
2669+            self.writers[shnum].set_checkstring(old_seqnum,
2670+                                                old_root_hash,
2671+                                                old_salt)
2672+
2673+        # Our remote shares will not have a complete checkstring until
2674+        # after we are done writing share data and have started to write
2675+        # blocks. In the meantime, we need to know what to look for when
2676+        # writing, so that we can detect UncoordinatedWriteErrors.
2677+        self._checkstring = self.writers.values()[0].get_checkstring()
2678+
2679+        # Now, we start pushing shares.
2680+        self._status.timings["setup"] = time.time() - self._started
2681+        # First, we encrypt, encode, and publish the shares that we need
2682+        # to encrypt, encode, and publish.
2683+
2684+        # Our update process fetched these for us. We need to update
2685+        # them in place as publishing happens.
2686+        self.blockhashes = {} # shnum -> [blockhashes]
2687+        for (i, bht) in blockhashes.iteritems():
2688+            # We need to extract the leaves from our old hash tree.
2689+            old_segcount = mathutil.div_ceil(version[4],
2690+                                             version[3])
2691+            h = hashtree.IncompleteHashTree(old_segcount)
2692+            bht = dict(enumerate(bht))
2693+            h.set_hashes(bht)
2694+            leaves = h[h.get_leaf_index(0):]
2695+            for j in xrange(self.num_segments - len(leaves)):
2696+                leaves.append(None)
2697+
2698+            assert len(leaves) >= self.num_segments
2699+            self.blockhashes[i] = leaves
2700+            # This list will now be the leaves that were set during the
2701+            # initial upload + enough empty hashes to make it a
2702+            # power-of-two. If we exceed a power of two boundary, we
2703+            # should be encoding the file over again, and should not be
2704+            # here. So, we have
2705+            #assert len(self.blockhashes[i]) == \
2706+            #    hashtree.roundup_pow2(self.num_segments), \
2707+            #        len(self.blockhashes[i])
2708+            # XXX: Except this doesn't work. Figure out why.
2709+
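
A sketch of the leaf-extraction step in the loop above, using the same hashtree calls that appear in this patch (sizes made up; assumes a Tahoe-LAFS tree on the import path):

    from allmydata import hashtree
    from allmydata.util import hashutil, mathutil

    old_datalength, old_segsize, num_segments = 300, 128, 5   # made up

    old_segcount = mathutil.div_ceil(old_datalength, old_segsize)   # 3
    old_leaves = [hashutil.block_hash("segment %d" % i)
                  for i in xrange(old_segcount)]
    flat_bht = list(hashtree.HashTree(old_leaves))  # as fetched by the servermap

    h = hashtree.IncompleteHashTree(old_segcount)
    h.set_hashes(dict(enumerate(flat_bht)))
    leaves = h[h.get_leaf_index(0):]                # just the leaf hashes
    leaves.extend([None] * (num_segments - len(leaves)))
    assert len(leaves) >= num_segments              # slots for the new segments
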
2710+        # These are filled in later, after we've modified the block hash
2711+        # tree suitably.
2712+        self.sharehash_leaves = None # eventually [sharehashes]
2713+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
2714+                              # validate the share]
2715+
2716+        self.log("Starting push")
2717+
2718+        self._state = PUSHING_BLOCKS_STATE
2719+        self._push()
2720+
2721+        return self.done_deferred
2722+
2723+
2724     def publish(self, newdata):
2725         """Publish the filenode's current contents.  Returns a Deferred that
2726         fires (with None) when the publish has done as much work as it's ever
2727hunk ./src/allmydata/mutable/publish.py 344
2728         simultaneous write.
2729         """
2730 
2731-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
2732-        # 2: perform peer selection, get candidate servers
2733-        #  2a: send queries to n+epsilon servers, to determine current shares
2734-        #  2b: based upon responses, create target map
2735-        # 3: send slot_testv_and_readv_and_writev messages
2736-        # 4: as responses return, update share-dispatch table
2737-        # 4a: may need to run recovery algorithm
2738-        # 5: when enough responses are back, we're done
2739+        # 0. Set up encoding parameters, encoder, and other such things.
2740+        # 1. Encrypt, encode, and publish segments.
2741+        assert IMutableUploadable.providedBy(newdata)
2742 
2743hunk ./src/allmydata/mutable/publish.py 348
2744-        self.log("starting publish, datalen is %s" % len(newdata))
2745-        self._status.set_size(len(newdata))
2746+        self.data = newdata
2747+        self.datalength = newdata.get_size()
2748+        #if self.datalength >= DEFAULT_MAX_SEGMENT_SIZE:
2749+        #    self._version = MDMF_VERSION
2750+        #else:
2751+        #    self._version = SDMF_VERSION
2752+
2753+        self.log("starting publish, datalen is %s" % self.datalength)
2754+        self._status.set_size(self.datalength)
2755         self._status.set_status("Started")
2756         self._started = time.time()
2757 
2758hunk ./src/allmydata/mutable/publish.py 405
2759         self.full_peerlist = full_peerlist # for use later, immutable
2760         self.bad_peers = set() # peerids who have errbacked/refused requests
2761 
2762-        self.newdata = newdata
2763-        self.salt = os.urandom(16)
2764-
2765+        # This will set self.segment_size, self.num_segments, and
2766+        # self.fec.
2767         self.setup_encoding_parameters()
2768 
2769         # if we experience any surprises (writes which were rejected because
2770hunk ./src/allmydata/mutable/publish.py 415
2771         # end of the publish process.
2772         self.surprised = False
2773 
2774-        # as a failsafe, refuse to iterate through self.loop more than a
2775-        # thousand times.
2776-        self.looplimit = 1000
2777-
2778         # we keep track of three tables. The first is our goal: which share
2779         # we want to see on which servers. This is initially populated by the
2780         # existing servermap.
2781hunk ./src/allmydata/mutable/publish.py 438
2782 
2783         self.bad_share_checkstrings = {}
2784 
2785+        # This is set at the last step of the publishing process.
2786+        self.versioninfo = ""
2787+
2788         # we use the servermap to populate the initial goal: this way we will
2789         # try to update each existing share in place.
2790         for (peerid, shnum) in self._servermap.servermap:
2791hunk ./src/allmydata/mutable/publish.py 454
2792             self.bad_share_checkstrings[key] = old_checkstring
2793             self.connections[peerid] = self._servermap.connections[peerid]
2794 
2795-        # create the shares. We'll discard these as they are delivered. SDMF:
2796-        # we're allowed to hold everything in memory.
2797+        # TODO: Make this part do peer selection.
2798+        self.update_goal()
2799+        self.writers = {}
2800+        if self._version == MDMF_VERSION:
2801+            writer_class = MDMFSlotWriteProxy
2802+        else:
2803+            writer_class = SDMFSlotWriteProxy
2804 
2805hunk ./src/allmydata/mutable/publish.py 462
2806+        # For each (peerid, shnum) in self.goal, we make a
2807+        # write proxy for that peer. We'll use this to write
2808+        # shares to the peer.
2809+        for key in self.goal:
2810+            peerid, shnum = key
2811+            write_enabler = self._node.get_write_enabler(peerid)
2812+            renew_secret = self._node.get_renewal_secret(peerid)
2813+            cancel_secret = self._node.get_cancel_secret(peerid)
2814+            secrets = (write_enabler, renew_secret, cancel_secret)
2815+
2816+            self.writers[shnum] =  writer_class(shnum,
2817+                                                self.connections[peerid],
2818+                                                self._storage_index,
2819+                                                secrets,
2820+                                                self._new_seqnum,
2821+                                                self.required_shares,
2822+                                                self.total_shares,
2823+                                                self.segment_size,
2824+                                                self.datalength)
2825+            self.writers[shnum].peerid = peerid
2826+            if (peerid, shnum) in self._servermap.servermap:
2827+                old_versionid, old_timestamp = self._servermap.servermap[key]
2828+                (old_seqnum, old_root_hash, old_salt, old_segsize,
2829+                 old_datalength, old_k, old_N, old_prefix,
2830+                 old_offsets_tuple) = old_versionid
2831+                self.writers[shnum].set_checkstring(old_seqnum,
2832+                                                    old_root_hash,
2833+                                                    old_salt)
2834+            elif (peerid, shnum) in self.bad_share_checkstrings:
2835+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
2836+                self.writers[shnum].set_checkstring(old_checkstring)
2837+
2838+        # Our remote shares will not have a complete checkstring until
2839+        # after we are done writing share data and have started to write
2840+        # blocks. In the meantime, we need to know what to look for when
2841+        # writing, so that we can detect UncoordinatedWriteErrors.
2842+        self._checkstring = self.writers.values()[0].get_checkstring()
2843+
2844+        # Now, we start pushing shares.
2845         self._status.timings["setup"] = time.time() - self._started
2846hunk ./src/allmydata/mutable/publish.py 502
2847-        d = self._encrypt_and_encode()
2848-        d.addCallback(self._generate_shares)
2849-        def _start_pushing(res):
2850-            self._started_pushing = time.time()
2851-            return res
2852-        d.addCallback(_start_pushing)
2853-        d.addCallback(self.loop) # trigger delivery
2854-        d.addErrback(self._fatal_error)
2855+        # First, we encrypt, encode, and publish the shares that we need
2856+        # to encrypt, encode, and publish.
2857+
2858+        # This will eventually hold the block hash chain for each share
2859+        # that we publish. We define it this way so that empty publishes
2860+        # will still have something to write to the remote slot.
2861+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
2862+        for i in xrange(self.total_shares):
2863+            blocks = self.blockhashes[i]
2864+            for j in xrange(self.num_segments):
2865+                blocks.append(None)
2866+        self.sharehash_leaves = None # eventually [sharehashes]
2867+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
2868+                              # validate the share]
2869+
2870+        self.log("Starting push")
2871+
2872+        self._state = PUSHING_BLOCKS_STATE
2873+        self._push()
2874 
2875         return self.done_deferred
2876 
2877hunk ./src/allmydata/mutable/publish.py 524
2878-    def setup_encoding_parameters(self):
2879-        segment_size = len(self.newdata)
2880+
2881+    def _update_status(self):
2882+        self._status.set_status("Sending Shares: %d placed out of %d, "
2883+                                "%d messages outstanding" %
2884+                                (len(self.placed),
2885+                                 len(self.goal),
2886+                                 len(self.outstanding)))
2887+        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
2888+
2889+
2890+    def setup_encoding_parameters(self, offset=0):
2891+        if self._version == MDMF_VERSION:
2892+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
2893+        else:
2894+            segment_size = self.datalength # SDMF is only one segment
2895         # this must be a multiple of self.required_shares
2896         segment_size = mathutil.next_multiple(segment_size,
2897                                               self.required_shares)
2898hunk ./src/allmydata/mutable/publish.py 543
2899         self.segment_size = segment_size
2900+
2901+        # Calculate the starting segment for the upload.
2902         if segment_size:
2903hunk ./src/allmydata/mutable/publish.py 546
2904-            self.num_segments = mathutil.div_ceil(len(self.newdata),
2905+            self.num_segments = mathutil.div_ceil(self.datalength,
2906                                                   segment_size)
2907hunk ./src/allmydata/mutable/publish.py 548
2908+            self.starting_segment = mathutil.div_ceil(offset,
2909+                                                      segment_size)
2910+            self.starting_segment -= 1
2911+            if offset == 0:
2912+                self.starting_segment = 0
2913+
2914         else:
2915             self.num_segments = 0
2916hunk ./src/allmydata/mutable/publish.py 556
2917-        assert self.num_segments in [0, 1,] # SDMF restrictions
2918+            self.starting_segment = 0
2919+
2920+
2921+        self.log("building encoding parameters for file")
2922+        self.log("got segsize %d" % self.segment_size)
2923+        self.log("got %d segments" % self.num_segments)
2924+
2925+        if self._version == SDMF_VERSION:
2926+            assert self.num_segments in (0, 1) # SDMF
2927+        # calculate the tail segment size.
2928+
2929+        if segment_size and self.datalength:
2930+            self.tail_segment_size = self.datalength % segment_size
2931+            self.log("got tail segment size %d" % self.tail_segment_size)
2932+        else:
2933+            self.tail_segment_size = 0
2934+
2935+        if self.tail_segment_size == 0 and segment_size:
2936+            # The tail segment is the same size as the other segments.
2937+            self.tail_segment_size = segment_size
2938+
2939+        # Make FEC encoders
2940+        fec = codec.CRSEncoder()
2941+        fec.set_params(self.segment_size,
2942+                       self.required_shares, self.total_shares)
2943+        self.piece_size = fec.get_block_size()
2944+        self.fec = fec
2945+
2946+        if self.tail_segment_size == self.segment_size:
2947+            self.tail_fec = self.fec
2948+        else:
2949+            tail_fec = codec.CRSEncoder()
2950+            tail_fec.set_params(self.tail_segment_size,
2951+                                self.required_shares,
2952+                                self.total_shares)
2953+            self.tail_fec = tail_fec
2954+
2955+        self._current_segment = self.starting_segment
2956+        self.end_segment = self.num_segments - 1
2957+        # Now figure out where the last segment should be.
2958+        if self.data.get_size() != self.datalength:
2959+            end = self.data.get_size()
2960+            self.end_segment = mathutil.div_ceil(end,
2961+                                                 segment_size)
2962+            self.end_segment -= 1
2963+        self.log("got start segment %d" % self.starting_segment)
2964+        self.log("got end segment %d" % self.end_segment)
2965+
2966+
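
The size bookkeeping in setup_encoding_parameters reduces to a few lines of integer arithmetic; a standalone sketch with hypothetical k, N, and sizes:

    from allmydata.util import mathutil

    k, N = 3, 10                      # required / total shares
    datalength = 300 * 1024
    offset = 150 * 1024

    # segment size must be a multiple of k so FEC pieces divide evenly
    segment_size = mathutil.next_multiple(128 * 1024, k)

    num_segments = mathutil.div_ceil(datalength, segment_size)
    starting_segment = mathutil.div_ceil(offset, segment_size) - 1
    if offset == 0:
        starting_segment = 0

    # the tail segment is whatever is left over, or a full segment
    # when datalength happens to be an exact multiple
    tail_segment_size = datalength % segment_size or segment_size
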
2967+    def _push(self, ignored=None):
2968+        """
2969+        I manage state transitions. In particular, I check that we
2970+        still have enough writers to complete the upload
2971+        successfully.
2972+        """
2973+        # Can we still successfully publish this file?
2974+        # TODO: Keep track of outstanding queries before aborting the
2975+        #       process.
2976+        if len(self.writers) <= self.required_shares or self.surprised:
2977+            return self._failure()
2978+
2979+        # Figure out what we need to do next. Each of these needs to
2980+        # return a deferred so that we don't block execution when this
2981+        # is first called in the upload method.
2982+        if self._state == PUSHING_BLOCKS_STATE:
2983+            return self.push_segment(self._current_segment)
2984+
2985+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
2986+            return self.push_everything_else()
2987+
2988+        # If we make it to this point, we were successful in placing the
2989+        # file.
2990+        return self._done(None)
2991+
2992+
2993+    def push_segment(self, segnum):
2994+        if self.num_segments == 0 and self._version == SDMF_VERSION:
2995+            self._add_dummy_salts()
2996 
2997hunk ./src/allmydata/mutable/publish.py 635
2998-    def _fatal_error(self, f):
2999-        self.log("error during loop", failure=f, level=log.UNUSUAL)
3000-        self._done(f)
3001+        if segnum > self.end_segment:
3002+            # We don't have any more segments to push.
3003+            self._state = PUSHING_EVERYTHING_ELSE_STATE
3004+            return self._push()
3005+
3006+        d = self._encode_segment(segnum)
3007+        d.addCallback(self._push_segment, segnum)
3008+        def _increment_segnum(ign):
3009+            self._current_segment += 1
3010+        # XXX: I don't think we need to do addBoth here -- any errbacks
3011+        # should be handled within push_segment.
3012+        d.addBoth(_increment_segnum)
3013+        d.addBoth(self._turn_barrier)
3014+        d.addBoth(self._push)
3015+
3016+
3017+    def _turn_barrier(self, result):
3018+        """
3019+        I help the publish process avoid the recursion limit issues
3020+        described in #237.
3021+        """
3022+        return fireEventually(result)
3023+
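
The barrier matters because each segment extends the Deferred chain, and a long synchronous chain can hit Python's recursion limit; fireEventually re-fires the result on a later reactor turn. A toy use of the same pattern (the foolscap.api import path is an assumption; this module gets fireEventually from foolscap):

    from twisted.internet import defer, reactor
    from foolscap.api import fireEventually    # assumed import path

    def push_all(segments):
        def push_one(res, i):
            if i == len(segments):
                return "done"
            d = defer.succeed(segments[i])     # stand-in for one segment push
            d.addBoth(fireEventually)          # yield a reactor turn (#237)
            d.addBoth(push_one, i + 1)
            return d
        return push_one(None, 0)

    d = push_all(range(5))
    d.addBoth(lambda res: reactor.stop())
    reactor.run()
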
3024+
3025+    def _add_dummy_salts(self):
3026+        """
3027+        SDMF files need a salt even if they're empty, or the signature
3028+        won't make sense. This method adds a dummy salt to each of our
3029+        SDMF writers so that they can write the signature later.
3030+        """
3031+        salt = os.urandom(16)
3032+        assert self._version == SDMF_VERSION
3033+
3034+        for writer in self.writers.itervalues():
3035+            writer.put_salt(salt)
3036+
3037+
3038+    def _encode_segment(self, segnum):
3039+        """
3040+        I encrypt and encode the segment segnum.
3041+        """
3042+        started = time.time()
3043+
3044+        if segnum + 1 == self.num_segments:
3045+            segsize = self.tail_segment_size
3046+        else:
3047+            segsize = self.segment_size
3048+
3049+
3050+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
3051+        data = self.data.read(segsize)
3052+        # XXX: This is dumb. Why return a list?
3053+        data = "".join(data)
3054+
3055+        assert len(data) == segsize, len(data)
3056+
3057+        salt = os.urandom(16)
3058+
3059+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
3060+        self._status.set_status("Encrypting")
3061+        enc = AES(key)
3062+        crypttext = enc.process(data)
3063+        assert len(crypttext) == len(data)
3064+
3065+        now = time.time()
3066+        self._status.timings["encrypt"] = now - started
3067+        started = now
3068+
3069+        # now apply FEC
3070+        if segnum + 1 == self.num_segments:
3071+            fec = self.tail_fec
3072+        else:
3073+            fec = self.fec
3074+
3075+        self._status.set_status("Encoding")
3076+        crypttext_pieces = [None] * self.required_shares
3077+        piece_size = fec.get_block_size()
3078+        for i in range(len(crypttext_pieces)):
3079+            offset = i * piece_size
3080+            piece = crypttext[offset:offset+piece_size]
3081+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
3082+            crypttext_pieces[i] = piece
3083+            assert len(piece) == piece_size
3084+        d = fec.encode(crypttext_pieces)
3085+        def _done_encoding(res):
3086+            elapsed = time.time() - started
3087+            self._status.timings["encode"] = elapsed
3088+            return (res, salt)
3089+        d.addCallback(_done_encoding)
3090+        return d
3091+
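
What _encode_segment does to one segment, in isolation: derive a per-segment key from a fresh salt, encrypt, split the crypttext into k pieces padded to the FEC block size, and encode into N blocks. A sketch using the same primitives (the readkey is fabricated; AES here is pycryptopp's, as used by this module):

    import os
    from pycryptopp.cipher.aes import AES
    from allmydata import codec
    from allmydata.util import hashutil

    k, N = 3, 10
    readkey = os.urandom(16)                # stand-in for the file's readkey
    segment = "A" * 99                      # one plaintext segment, len % k == 0

    salt = os.urandom(16)
    key = hashutil.ssk_readkey_data_hash(salt, readkey)
    crypttext = AES(key).process(segment)

    fec = codec.CRSEncoder()
    fec.set_params(len(segment), k, N)
    piece_size = fec.get_block_size()
    pieces = [crypttext[i*piece_size:(i+1)*piece_size] for i in xrange(k)]
    pieces = [p + "\x00" * (piece_size - len(p)) for p in pieces]  # pad tail

    d = fec.encode(pieces)                  # fires with (shares, shareids)
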
3092+
3093+    def _push_segment(self, encoded_and_salt, segnum):
3094+        """
3095+        I push (data, salt) as segment number segnum.
3096+        """
3097+        results, salt = encoded_and_salt
3098+        shares, shareids = results
3099+        self._status.set_status("Pushing segment")
3100+        for i in xrange(len(shares)):
3101+            sharedata = shares[i]
3102+            shareid = shareids[i]
3103+            if self._version == MDMF_VERSION:
3104+                hashed = salt + sharedata
3105+            else:
3106+                hashed = sharedata
3107+            block_hash = hashutil.block_hash(hashed)
3108+            self.blockhashes[shareid][segnum] = block_hash
3109+            # find the writer for this share
3110+            writer = self.writers[shareid]
3111+            writer.put_block(sharedata, segnum, salt)
3112+
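
The version check above is the whole MDMF/SDMF difference in what a block hash covers; isolated:

    from allmydata.interfaces import MDMF_VERSION
    from allmydata.util import hashutil

    def block_hash_for(version, salt, sharedata):
        # MDMF has one salt per segment, so the salt is hashed into
        # the block hash; SDMF has a single file-wide salt, so only
        # the block data is hashed.
        if version == MDMF_VERSION:
            return hashutil.block_hash(salt + sharedata)
        return hashutil.block_hash(sharedata)
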
3113+
3114+    def push_everything_else(self):
3115+        """
3116+        I put everything else associated with a share.
3117+        """
3118+        self._pack_started = time.time()
3119+        self.push_encprivkey()
3120+        self.push_blockhashes()
3121+        self.push_sharehashes()
3122+        self.push_toplevel_hashes_and_signature()
3123+        d = self.finish_publishing()
3124+        def _change_state(ignored):
3125+            self._state = DONE_STATE
3126+        d.addCallback(_change_state)
3127+        d.addCallback(self._push)
3128+        return d
3129+
3130+
3131+    def push_encprivkey(self):
3132+        encprivkey = self._encprivkey
3133+        self._status.set_status("Pushing encrypted private key")
3134+        for writer in self.writers.itervalues():
3135+            writer.put_encprivkey(encprivkey)
3136+
3137+
3138+    def push_blockhashes(self):
3139+        self.sharehash_leaves = [None] * len(self.blockhashes)
3140+        self._status.set_status("Building and pushing block hash tree")
3141+        for shnum, blockhashes in self.blockhashes.iteritems():
3142+            t = hashtree.HashTree(blockhashes)
3143+            self.blockhashes[shnum] = list(t)
3144+            # set the leaf for future use.
3145+            self.sharehash_leaves[shnum] = t[0]
3146+
3147+            writer = self.writers[shnum]
3148+            writer.put_blockhashes(self.blockhashes[shnum])
3149+
3150+
3151+    def push_sharehashes(self):
3152+        self._status.set_status("Building and pushing share hash chain")
3153+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
3154+        for shnum in xrange(len(self.sharehash_leaves)):
3155+            needed_indices = share_hash_tree.needed_hashes(shnum)
3156+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
3157+                                             for i in needed_indices] )
3158+            writer = self.writers[shnum]
3159+            writer.put_sharehashes(self.sharehashes[shnum])
3160+        self.root_hash = share_hash_tree[0]
3161+
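
How push_blockhashes and push_sharehashes fit together, shown end to end with fabricated values (4 shares, 2 segments):

    from allmydata import hashtree
    from allmydata.util import hashutil

    num_shares, num_segments = 4, 2
    blockhashes = dict((shnum,
                        [hashutil.block_hash("share %d block %d" % (shnum, i))
                         for i in xrange(num_segments)])
                       for shnum in xrange(num_shares))

    sharehash_leaves = [None] * num_shares
    for shnum, leaves in blockhashes.items():
        t = hashtree.HashTree(leaves)     # per-share block hash tree
        sharehash_leaves[shnum] = t[0]    # its root is that share's leaf

    share_hash_tree = hashtree.HashTree(sharehash_leaves)
    root_hash = share_hash_tree[0]        # ends up in the signed prefix
    needed = share_hash_tree.needed_hashes(0)
    chain = dict([(i, share_hash_tree[i]) for i in needed])  # share 0's chain
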
3162+
3163+    def push_toplevel_hashes_and_signature(self):
3164+        # We need to do three things here:
3165+        #   - Push the root hash and salt hash
3166+        #   - Get the checkstring of the resulting layout; sign that.
3167+        #   - Push the signature
3168+        self._status.set_status("Pushing root hashes and signature")
3169+        for shnum in xrange(self.total_shares):
3170+            writer = self.writers[shnum]
3171+            writer.put_root_hash(self.root_hash)
3172+        self._update_checkstring()
3173+        self._make_and_place_signature()
3174+
3175+
3176+    def _update_checkstring(self):
3177+        """
3178+        After putting the root hash, MDMF files will have the
3179+        checkstring written to the storage server. This means that we
3180+        can update our copy of the checkstring so we can detect
3181+        uncoordinated writes. SDMF files will have the same checkstring,
3182+        so we need not do anything.
3183+        """
3184+        self._checkstring = self.writers.values()[0].get_checkstring()
3185+
3186+
3187+    def _make_and_place_signature(self):
3188+        """
3189+        I create and place the signature.
3190+        """
3191+        started = time.time()
3192+        self._status.set_status("Signing prefix")
3193+        signable = self.writers[0].get_signable()
3194+        self.signature = self._privkey.sign(signable)
3195+
3196+        for (shnum, writer) in self.writers.iteritems():
3197+            writer.put_signature(self.signature)
3198+        self._status.timings['sign'] = time.time() - started
3199+
3200+
3201+    def finish_publishing(self):
3202+        # We're almost done -- we just need to put the verification key
3203+        # and the offsets
3204+        started = time.time()
3205+        self._status.set_status("Pushing shares")
3206+        self._started_pushing = started
3207+        ds = []
3208+        verification_key = self._pubkey.serialize()
3209+
3210+
3211+        # TODO: Bad, since we remove from this same dict. We need to
3212+        # make a copy, or just use a non-iterated value.
3213+        for (shnum, writer) in self.writers.iteritems():
3214+            writer.put_verification_key(verification_key)
3215+            d = writer.finish_publishing()
3216+            # Add the (peerid, shnum) tuple to our list of outstanding
3217+            # queries. This gets used to check whether some of our
3218+            # queries failed to place shares.
3219+            self.outstanding.add((writer.peerid, writer.shnum))
3220+            d.addCallback(self._got_write_answer, writer, started)
3221+            d.addErrback(self._connection_problem, writer)
3222+            ds.append(d)
3223+        self._record_verinfo()
3224+        self._status.timings['pack'] = time.time() - started
3225+        return defer.DeferredList(ds)
3226+
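
The TODO above refers to _connection_problem deleting from self.writers while finish_publishing is still iterating it; since errbacks can fire immediately (e.g. DeadReferenceError, as the old code noted), the dict can mutate mid-loop. In Python 2, .items() snapshots the dict and would make the removal safe:

    writers = {0: "w0", 1: "w1", 2: "w2"}

    # .iteritems() is a live view: deleting during the loop raises
    # "RuntimeError: dictionary changed size during iteration".
    # .items() returns a copy, so mutation underneath is harmless:
    for shnum, writer in writers.items():
        if shnum == 1:
            del writers[shnum]
    assert writers == {0: "w0", 2: "w2"}
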
3227+
3228+    def _record_verinfo(self):
3229+        self.versioninfo = self.writers.values()[0].get_verinfo()
3230+
3231+
3232+    def _connection_problem(self, f, writer):
3233+        """
3234+        We ran into a connection problem while working with writer, and
3235+        need to deal with that.
3236+        """
3237+        self.log("found problem: %s" % str(f))
3238+        self._last_failure = f
3239+        del(self.writers[writer.shnum])
3240 
3241hunk ./src/allmydata/mutable/publish.py 875
3242-    def _update_status(self):
3243-        self._status.set_status("Sending Shares: %d placed out of %d, "
3244-                                "%d messages outstanding" %
3245-                                (len(self.placed),
3246-                                 len(self.goal),
3247-                                 len(self.outstanding)))
3248-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
3249 
3250hunk ./src/allmydata/mutable/publish.py 876
3251-    def loop(self, ignored=None):
3252-        self.log("entering loop", level=log.NOISY)
3253-        if not self._running:
3254-            return
3255-
3256-        self.looplimit -= 1
3257-        if self.looplimit <= 0:
3258-            raise LoopLimitExceededError("loop limit exceeded")
3259-
3260-        if self.surprised:
3261-            # don't send out any new shares, just wait for the outstanding
3262-            # ones to be retired.
3263-            self.log("currently surprised, so don't send any new shares",
3264-                     level=log.NOISY)
3265-        else:
3266-            self.update_goal()
3267-            # how far are we from our goal?
3268-            needed = self.goal - self.placed - self.outstanding
3269-            self._update_status()
3270-
3271-            if needed:
3272-                # we need to send out new shares
3273-                self.log(format="need to send %(needed)d new shares",
3274-                         needed=len(needed), level=log.NOISY)
3275-                self._send_shares(needed)
3276-                return
3277-
3278-        if self.outstanding:
3279-            # queries are still pending, keep waiting
3280-            self.log(format="%(outstanding)d queries still outstanding",
3281-                     outstanding=len(self.outstanding),
3282-                     level=log.NOISY)
3283-            return
3284-
3285-        # no queries outstanding, no placements needed: we're done
3286-        self.log("no queries outstanding, no placements needed: done",
3287-                 level=log.OPERATIONAL)
3288-        now = time.time()
3289-        elapsed = now - self._started_pushing
3290-        self._status.timings["push"] = elapsed
3291-        return self._done(None)
3292-
3293     def log_goal(self, goal, message=""):
3294         logmsg = [message]
3295         for (shnum, peerid) in sorted([(s,p) for (p,s) in goal]):
3296hunk ./src/allmydata/mutable/publish.py 957
3297             self.log_goal(self.goal, "after update: ")
3298 
3299 
3300+    def _got_write_answer(self, answer, writer, started):
3301+        if not answer:
3302+            # SDMF writers only pretend to write when callers set their
3303+            # blocks, salts, and so on -- they actually just write once,
3304+            # at the end of the upload process. In fake writes, they
3305+            # return defer.succeed(None). If we see that, we shouldn't
3306+            # bother checking it.
3307+            return
3308 
3309hunk ./src/allmydata/mutable/publish.py 966
3310-    def _encrypt_and_encode(self):
3311-        # this returns a Deferred that fires with a list of (sharedata,
3312-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
3313-        # shares that we care about.
3314-        self.log("_encrypt_and_encode")
3315-
3316-        self._status.set_status("Encrypting")
3317-        started = time.time()
3318-
3319-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
3320-        enc = AES(key)
3321-        crypttext = enc.process(self.newdata)
3322-        assert len(crypttext) == len(self.newdata)
3323+        peerid = writer.peerid
3324+        lp = self.log("_got_write_answer from %s, share %d" %
3325+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
3326 
3327         now = time.time()
3328hunk ./src/allmydata/mutable/publish.py 971
3329-        self._status.timings["encrypt"] = now - started
3330-        started = now
3331-
3332-        # now apply FEC
3333-
3334-        self._status.set_status("Encoding")
3335-        fec = codec.CRSEncoder()
3336-        fec.set_params(self.segment_size,
3337-                       self.required_shares, self.total_shares)
3338-        piece_size = fec.get_block_size()
3339-        crypttext_pieces = [None] * self.required_shares
3340-        for i in range(len(crypttext_pieces)):
3341-            offset = i * piece_size
3342-            piece = crypttext[offset:offset+piece_size]
3343-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
3344-            crypttext_pieces[i] = piece
3345-            assert len(piece) == piece_size
3346-
3347-        d = fec.encode(crypttext_pieces)
3348-        def _done_encoding(res):
3349-            elapsed = time.time() - started
3350-            self._status.timings["encode"] = elapsed
3351-            return res
3352-        d.addCallback(_done_encoding)
3353-        return d
3354-
3355-    def _generate_shares(self, shares_and_shareids):
3356-        # this sets self.shares and self.root_hash
3357-        self.log("_generate_shares")
3358-        self._status.set_status("Generating Shares")
3359-        started = time.time()
3360-
3361-        # we should know these by now
3362-        privkey = self._privkey
3363-        encprivkey = self._encprivkey
3364-        pubkey = self._pubkey
3365-
3366-        (shares, share_ids) = shares_and_shareids
3367-
3368-        assert len(shares) == len(share_ids)
3369-        assert len(shares) == self.total_shares
3370-        all_shares = {}
3371-        block_hash_trees = {}
3372-        share_hash_leaves = [None] * len(shares)
3373-        for i in range(len(shares)):
3374-            share_data = shares[i]
3375-            shnum = share_ids[i]
3376-            all_shares[shnum] = share_data
3377-
3378-            # build the block hash tree. SDMF has only one leaf.
3379-            leaves = [hashutil.block_hash(share_data)]
3380-            t = hashtree.HashTree(leaves)
3381-            block_hash_trees[shnum] = list(t)
3382-            share_hash_leaves[shnum] = t[0]
3383-        for leaf in share_hash_leaves:
3384-            assert leaf is not None
3385-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
3386-        share_hash_chain = {}
3387-        for shnum in range(self.total_shares):
3388-            needed_hashes = share_hash_tree.needed_hashes(shnum)
3389-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
3390-                                              for i in needed_hashes ] )
3391-        root_hash = share_hash_tree[0]
3392-        assert len(root_hash) == 32
3393-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
3394-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
3395-
3396-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
3397-                             self.required_shares, self.total_shares,
3398-                             self.segment_size, len(self.newdata))
3399-
3400-        # now pack the beginning of the share. All shares are the same up
3401-        # to the signature, then they have divergent share hash chains,
3402-        # then completely different block hash trees + salt + share data,
3403-        # then they all share the same encprivkey at the end. The sizes
3404-        # of everything are the same for all shares.
3405-
3406-        sign_started = time.time()
3407-        signature = privkey.sign(prefix)
3408-        self._status.timings["sign"] = time.time() - sign_started
3409-
3410-        verification_key = pubkey.serialize()
3411-
3412-        final_shares = {}
3413-        for shnum in range(self.total_shares):
3414-            final_share = pack_share(prefix,
3415-                                     verification_key,
3416-                                     signature,
3417-                                     share_hash_chain[shnum],
3418-                                     block_hash_trees[shnum],
3419-                                     all_shares[shnum],
3420-                                     encprivkey)
3421-            final_shares[shnum] = final_share
3422-        elapsed = time.time() - started
3423-        self._status.timings["pack"] = elapsed
3424-        self.shares = final_shares
3425-        self.root_hash = root_hash
3426-
3427-        # we also need to build up the version identifier for what we're
3428-        # pushing. Extract the offsets from one of our shares.
3429-        assert final_shares
3430-        offsets = unpack_header(final_shares.values()[0])[-1]
3431-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
3432-        verinfo = (self._new_seqnum, root_hash, self.salt,
3433-                   self.segment_size, len(self.newdata),
3434-                   self.required_shares, self.total_shares,
3435-                   prefix, offsets_tuple)
3436-        self.versioninfo = verinfo
3437-
3438-
3439-
3440-    def _send_shares(self, needed):
3441-        self.log("_send_shares")
3442-
3443-        # we're finally ready to send out our shares. If we encounter any
3444-        # surprises here, it's because somebody else is writing at the same
3445-        # time. (Note: in the future, when we remove the _query_peers() step
3446-        # and instead speculate about [or remember] which shares are where,
3447-        # surprises here are *not* indications of UncoordinatedWriteError,
3448-        # and we'll need to respond to them more gracefully.)
3449-
3450-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
3451-        # organize it by peerid.
3452-
3453-        peermap = DictOfSets()
3454-        for (peerid, shnum) in needed:
3455-            peermap.add(peerid, shnum)
3456-
3457-        # the next thing is to build up a bunch of test vectors. The
3458-        # semantics of Publish are that we perform the operation if the world
3459-        # hasn't changed since the ServerMap was constructed (more or less).
3460-        # For every share we're trying to place, we create a test vector that
3461-        # tests to see if the server*share still corresponds to the
3462-        # map.
3463-
3464-        all_tw_vectors = {} # maps peerid to tw_vectors
3465-        sm = self._servermap.servermap
3466-
3467-        for key in needed:
3468-            (peerid, shnum) = key
3469-
3470-            if key in sm:
3471-                # an old version of that share already exists on the
3472-                # server, according to our servermap. We will create a
3473-                # request that attempts to replace it.
3474-                old_versionid, old_timestamp = sm[key]
3475-                (old_seqnum, old_root_hash, old_salt, old_segsize,
3476-                 old_datalength, old_k, old_N, old_prefix,
3477-                 old_offsets_tuple) = old_versionid
3478-                old_checkstring = pack_checkstring(old_seqnum,
3479-                                                   old_root_hash,
3480-                                                   old_salt)
3481-                testv = (0, len(old_checkstring), "eq", old_checkstring)
3482-
3483-            elif key in self.bad_share_checkstrings:
3484-                old_checkstring = self.bad_share_checkstrings[key]
3485-                testv = (0, len(old_checkstring), "eq", old_checkstring)
3486-
3487-            else:
3488-                # add a testv that requires the share not exist
3489-
3490-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
3491-                # constraints are handled. If the same object is referenced
3492-                # multiple times inside the arguments, foolscap emits a
3493-                # 'reference' token instead of a distinct copy of the
3494-                # argument. The bug is that these 'reference' tokens are not
3495-                # accepted by the inbound constraint code. To work around
3496-                # this, we need to prevent python from interning the
3497-                # (constant) tuple, by creating a new copy of this vector
3498-                # each time.
3499-
3500-                # This bug is fixed in foolscap-0.2.6, and even though this
3501-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
3502-                # supposed to be able to interoperate with older versions of
3503-                # Tahoe which are allowed to use older versions of foolscap,
3504-                # including foolscap-0.2.5 . In addition, I've seen other
3505-                # foolscap problems triggered by 'reference' tokens (see #541
3506-                # for details). So we must keep this workaround in place.
3507-
3508-                #testv = (0, 1, 'eq', "")
3509-                testv = tuple([0, 1, 'eq', ""])
3510-
3511-            testvs = [testv]
3512-            # the write vector is simply the share
3513-            writev = [(0, self.shares[shnum])]
3514-
3515-            if peerid not in all_tw_vectors:
3516-                all_tw_vectors[peerid] = {}
3517-                # maps shnum to (testvs, writevs, new_length)
3518-            assert shnum not in all_tw_vectors[peerid]
3519-
3520-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
3521-
3522-        # we read the checkstring back from each share, however we only use
3523-        # it to detect whether there was a new share that we didn't know
3524-        # about. The success or failure of the write will tell us whether
3525-        # there was a collision or not. If there is a collision, the first
3526-        # thing we'll do is update the servermap, which will find out what
3527-        # happened. We could conceivably reduce a roundtrip by using the
3528-        # readv checkstring to populate the servermap, but really we'd have
3529-        # to read enough data to validate the signatures too, so it wouldn't
3530-        # be an overall win.
3531-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
3532-
3533-        # ok, send the messages!
3534-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
3535-        started = time.time()
3536-        for (peerid, tw_vectors) in all_tw_vectors.items():
3537-
3538-            write_enabler = self._node.get_write_enabler(peerid)
3539-            renew_secret = self._node.get_renewal_secret(peerid)
3540-            cancel_secret = self._node.get_cancel_secret(peerid)
3541-            secrets = (write_enabler, renew_secret, cancel_secret)
3542-            shnums = tw_vectors.keys()
3543-
3544-            for shnum in shnums:
3545-                self.outstanding.add( (peerid, shnum) )
3546+        elapsed = now - started
3547 
3548hunk ./src/allmydata/mutable/publish.py 973
3549-            d = self._do_testreadwrite(peerid, secrets,
3550-                                       tw_vectors, read_vector)
3551-            d.addCallbacks(self._got_write_answer, self._got_write_error,
3552-                           callbackArgs=(peerid, shnums, started),
3553-                           errbackArgs=(peerid, shnums, started))
3554-            # tolerate immediate errback, like with DeadReferenceError
3555-            d.addBoth(fireEventually)
3556-            d.addCallback(self.loop)
3557-            d.addErrback(self._fatal_error)
3558+        self._status.add_per_server_time(peerid, elapsed)
3559 
3560hunk ./src/allmydata/mutable/publish.py 975
3561-        self._update_status()
3562-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
3563+        wrote, read_data = answer
3564 
3565hunk ./src/allmydata/mutable/publish.py 977
3566-    def _do_testreadwrite(self, peerid, secrets,
3567-                          tw_vectors, read_vector):
3568-        storage_index = self._storage_index
3569-        ss = self.connections[peerid]
3570+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
3571 
3572hunk ./src/allmydata/mutable/publish.py 979
3573-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
3574-        d = ss.callRemote("slot_testv_and_readv_and_writev",
3575-                          storage_index,
3576-                          secrets,
3577-                          tw_vectors,
3578-                          read_vector)
3579-        return d
3580+        # We need to remove from surprise_shares any shares that we are
3581+        # knowingly also writing to that peer from other writers.
3582 
3583hunk ./src/allmydata/mutable/publish.py 982
3584-    def _got_write_answer(self, answer, peerid, shnums, started):
3585-        lp = self.log("_got_write_answer from %s" %
3586-                      idlib.shortnodeid_b2a(peerid))
3587-        for shnum in shnums:
3588-            self.outstanding.discard( (peerid, shnum) )
3589+        # TODO: Precompute this.
3590+        known_shnums = [x.shnum for x in self.writers.values()
3591+                        if x.peerid == peerid]
3592+        surprise_shares -= set(known_shnums)
3593+        self.log("found the following surprise shares: %s" %
3594+                 str(surprise_shares))
3595 
3596hunk ./src/allmydata/mutable/publish.py 989
3597-        now = time.time()
3598-        elapsed = now - started
3599-        self._status.add_per_server_time(peerid, elapsed)
3600-
3601-        wrote, read_data = answer
3602-
3603-        surprise_shares = set(read_data.keys()) - set(shnums)
3604+        # Now surprise_shares contains all of the shares that we did not
3605+        # expect to be there.
3606 
3607         surprised = False
3608         for shnum in surprise_shares:
3609hunk ./src/allmydata/mutable/publish.py 996
3610             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
3611             checkstring = read_data[shnum][0]
3612-            their_version_info = unpack_checkstring(checkstring)
3613-            if their_version_info == self._new_version_info:
3614+            # What we want to do here is to see if their (seqnum,
3615+            # roothash, salt) is the same as our (seqnum, roothash,
3616+            # salt), or the equivalent for MDMF. The best way to do this
3617+            # is to store a packed representation of our checkstring
3618+            # somewhere, then not bother unpacking the other
3619+            # checkstring.
3620+            if checkstring == self._checkstring:
3621                 # they have the right share, somehow
3622 
3623                 if (peerid,shnum) in self.goal:
3624hunk ./src/allmydata/mutable/publish.py 1081
3625             self.log("our testv failed, so the write did not happen",
3626                      parent=lp, level=log.WEIRD, umid="8sc26g")
3627             self.surprised = True
3628-            self.bad_peers.add(peerid) # don't ask them again
3629+            self.bad_peers.add(writer) # don't ask them again
3630             # use the checkstring to add information to the log message
3631             for (shnum,readv) in read_data.items():
3632                 checkstring = readv[0]
3633hunk ./src/allmydata/mutable/publish.py 1103
3634                 # if expected_version==None, then we didn't expect to see a
3635                 # share on that peer, and the 'surprise_shares' clause above
3636                 # will have logged it.
3637-            # self.loop() will take care of finding new homes
3638             return
3639 
3640hunk ./src/allmydata/mutable/publish.py 1105
3641-        for shnum in shnums:
3642-            self.placed.add( (peerid, shnum) )
3643-            # and update the servermap
3644-            self._servermap.add_new_share(peerid, shnum,
3645+        # and update the servermap
3646+        # self.versioninfo is set during the last phase of publishing.
3647+        # If we get there, we know that responses correspond to placed
3648+        # shares, and can safely execute these statements.
3649+        if self.versioninfo:
3650+            self.log("wrote successfully: adding new share to servermap")
3651+            self._servermap.add_new_share(peerid, writer.shnum,
3652                                           self.versioninfo, started)
3653hunk ./src/allmydata/mutable/publish.py 1113
3654-
3655-        # self.loop() will take care of checking to see if we're done
3656+            self.placed.add( (peerid, writer.shnum) )
3657+        self._update_status()
3658+        # the next method in the deferred chain will check to see if
3659+        # we're done and successful.
3660         return
3661 
3662hunk ./src/allmydata/mutable/publish.py 1119
3663-    def _got_write_error(self, f, peerid, shnums, started):
3664-        for shnum in shnums:
3665-            self.outstanding.discard( (peerid, shnum) )
3666-        self.bad_peers.add(peerid)
3667-        if self._first_write_error is None:
3668-            self._first_write_error = f
3669-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
3670-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
3671-                 failure=f,
3672-                 level=log.UNUSUAL)
3673-        # self.loop() will take care of checking to see if we're done
3674-        return
3675-
3676 
3677     def _done(self, res):
3678         if not self._running:
3679hunk ./src/allmydata/mutable/publish.py 1126
3680         self._running = False
3681         now = time.time()
3682         self._status.timings["total"] = now - self._started
3683+
3684+        elapsed = now - self._started_pushing
3685+        self._status.timings['push'] = elapsed
3686+
3687         self._status.set_active(False)
3688hunk ./src/allmydata/mutable/publish.py 1131
3689-        if isinstance(res, failure.Failure):
3690-            self.log("Publish done, with failure", failure=res,
3691-                     level=log.WEIRD, umid="nRsR9Q")
3692-            self._status.set_status("Failed")
3693-        elif self.surprised:
3694-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
3695-            self._status.set_status("UncoordinatedWriteError")
3696-            # deliver a failure
3697-            res = failure.Failure(UncoordinatedWriteError())
3698-            # TODO: recovery
3699-        else:
3700-            self.log("Publish done, success")
3701-            self._status.set_status("Finished")
3702-            self._status.set_progress(1.0)
3703+        self.log("Publish done, success")
3704+        self._status.set_status("Finished")
3705+        self._status.set_progress(1.0)
3706         eventually(self.done_deferred.callback, res)
3707 
3708hunk ./src/allmydata/mutable/publish.py 1136
3709+    def _failure(self):
3710+
3711+        if not self.surprised:
3712+            # We ran out of servers
3713+            self.log("Publish ran out of good servers, "
3714+                     "last failure was: %s" % str(self._last_failure))
3715+            e = NotEnoughServersError("Ran out of non-bad servers, "
3716+                                      "last failure was %s" %
3717+                                      str(self._last_failure))
3718+        else:
3719+            # We ran into shares that we didn't recognize, which means
3720+            # that we need to return an UncoordinatedWriteError.
3721+            self.log("Publish failed with UncoordinatedWriteError")
3722+            e = UncoordinatedWriteError()
3723+        f = failure.Failure(e)
3724+        eventually(self.done_deferred.callback, f)
3725+
3726+
3727+class MutableFileHandle:
3728+    """
3729+    I am a mutable uploadable built around a filehandle-like object,
3730+    usually either a StringIO instance or a handle to an actual file.
3731+    """
3732+    implements(IMutableUploadable)
3733+
3734+    def __init__(self, filehandle):
3735+        # The filehandle is defined as a generally file-like object that
3736+        # has these two methods. We don't care beyond that.
3737+        assert hasattr(filehandle, "read")
3738+        assert hasattr(filehandle, "close")
3739+
3740+        self._filehandle = filehandle
3741+        # We must start reading at the beginning of the file, or we risk
3742+        # encountering errors when the data read does not match the size
3743+        # reported to the uploader.
3744+        self._filehandle.seek(0)
3745+
3746+        # We have not yet read anything, so our position is 0.
3747+        self._marker = 0
3748+
3749+
3750+    def get_size(self):
3751+        """
3752+        I return the amount of data in my filehandle.
3753+        """
3754+        if not hasattr(self, "_size"):
3755+            old_position = self._filehandle.tell()
3756+            # Seek to the end of the file by seeking 0 bytes from the
3757+            # file's end
3758+            self._filehandle.seek(0, 2) # 2 == os.SEEK_END in 2.5+
3759+            self._size = self._filehandle.tell()
3760+            # Restore the previous position, in case this was called
3761+            # after a read.
3762+            self._filehandle.seek(old_position)
3763+            assert self._filehandle.tell() == old_position
3764+
3765+        assert hasattr(self, "_size")
3766+        return self._size
3767+
3768+
3769+    def pos(self):
3770+        """
3771+        I return the position of my read marker -- i.e., how much data I
3772+        have already read and returned to callers.
3773+        """
3774+        return self._marker
3775+
3776+
3777+    def read(self, length):
3778+        """
3779+        I return some data (up to length bytes) from my filehandle.
3780+
3781+        In most cases, I return length bytes, but sometimes I won't --
3782+        for example, if I am asked to read beyond the end of a file, or
3783+        an error occurs.
3784+        """
3785+        results = self._filehandle.read(length)
3786+        self._marker += len(results)
3787+        return [results]
3788+
3789+
3790+    def close(self):
3791+        """
3792+        I close the underlying filehandle. Any further operations on the
3793+        filehandle fail at this point.
3794+        """
3795+        self._filehandle.close()
3796+
3797+
3798+class MutableData(MutableFileHandle):
3799+    """
3800+    I am a mutable uploadable built around a string, which I then cast
3801+    into a StringIO and treat as a filehandle.
3802+    """
3803+
3804+    def __init__(self, s):
3805+        # Take a string and return a file-like uploadable.
3806+        assert isinstance(s, str)
3807+
3808+        MutableFileHandle.__init__(self, StringIO(s))
3809+
3810+
3811+class TransformingUploadable:
3812+    """
3813+    I am an IMutableUploadable that wraps another IMutableUploadable,
3814+    and some segments that are already on the grid. When I am called to
3815+    read, I handle merging of boundary segments.
3816+    """
3817+    implements(IMutableUploadable)
3818+
3819+
3820+    def __init__(self, data, offset, segment_size, start, end):
3821+        assert IMutableUploadable.providedBy(data)
3822+
3823+        self._newdata = data
3824+        self._offset = offset
3825+        self._segment_size = segment_size
3826+        self._start = start
3827+        self._end = end
3828+
3829+        self._read_marker = 0
3830+
3831+        self._first_segment_offset = offset % segment_size
3832+
3833+        num = self.log("TransformingUploadable: starting", parent=None)
3834+        self._log_number = num
3835+        self.log("got fso: %d" % self._first_segment_offset)
3836+        self.log("got offset: %d" % self._offset)
3837+
3838+
3839+    def log(self, *args, **kwargs):
3840+        if 'parent' not in kwargs:
3841+            kwargs['parent'] = self._log_number
3842+        if "facility" not in kwargs:
3843+            kwargs["facility"] = "tahoe.mutable.transforminguploadable"
3844+        return log.msg(*args, **kwargs)
3845+
3846+
3847+    def get_size(self):
3848+        return self._offset + self._newdata.get_size()
3849+
3850+
3851+    def read(self, length):
3852+        # We can get data from 3 sources here.
3853+        #   1. The first of the segments provided to us.
3854+        #   2. The data that we're replacing things with.
3855+        #   3. The last of the segments provided to us.
3856+
3857+        # are we in state 0?
3858+        self.log("reading %d bytes" % length)
3859+
3860+        old_start_data = ""
3861+        old_data_length = self._first_segment_offset - self._read_marker
3862+        if old_data_length > 0:
3863+            if old_data_length > length:
3864+                old_data_length = length
3865+            self.log("returning %d bytes of old start data" % old_data_length)
3866+
3867+            old_data_end = old_data_length + self._read_marker
3868+            old_start_data = self._start[self._read_marker:old_data_end]
3869+            length -= old_data_length
3870+        else:
3871+            # otherwise calculations later get screwed up.
3872+            old_data_length = 0
3873+
3874+        # Is there enough new data to satisfy this read? If not, we need
3875+        # to pad the end of the data with data from our last segment.
3876+        old_end_length = length - \
3877+            (self._newdata.get_size() - self._newdata.pos())
3878+        old_end_data = ""
3879+        if old_end_length > 0:
3880+            self.log("reading %d bytes of old end data" % old_end_length)
3881+
3882+            # TODO: We're not explicitly checking for tail segment size
3883+            # here. Is that a problem?
3884+            old_data_offset = (length - old_end_length + \
3885+                               old_data_length) % self._segment_size
3886+            self.log("reading at offset %d" % old_data_offset)
3887+            old_end = old_data_offset + old_end_length
3888+            old_end_data = self._end[old_data_offset:old_end]
3889+            length -= old_end_length
3890+            assert length == self._newdata.get_size() - self._newdata.pos()
3891+
3892+        self.log("reading %d bytes of new data" % length)
3893+        new_data = self._newdata.read(length)
3894+        new_data = "".join(new_data)
3895+
3896+        self._read_marker += len(old_start_data + new_data + old_end_data)
3897+
3898+        return old_start_data + new_data + old_end_data
3899 
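To make the three-source merge above concrete, here is a worked sketch
(all values invented): with segment_size=4 and an existing file
"ABCDEFGH", writing "xy" at offset 3 touches segments 0 and 1, so the
publisher passes those segments in as `start` and `end`:

    u = TransformingUploadable(MutableData("xy"), offset=3,
                               segment_size=4, start="ABCD", end="EFGH")
    u.read(4)   # -> "ABCx": three old leading bytes, then one new byte
    u.read(4)   # -> "yFGH": the last new byte, then three old trailing bytes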
3900hunk ./src/allmydata/mutable/publish.py 1327
3901+    def close(self):
3902+        pass
3903}
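As a usage sketch of the uploadable wrappers added above (Python 2,
contents invented):

    data = MutableData("some bytes")   # wraps the string in a StringIO
    assert data.get_size() == 10       # size is computed via seek()/tell()
    chunk = "".join(data.read(4))      # read() returns a list of strings
    assert data.pos() == 4             # the read marker has advanced
    data.close()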
3904[nodemaker.py: Make nodemaker expose a way to create MDMF files
3905Kevan Carstensen <kevan@isnotajoke.com>**20100819003509
3906 Ignore-this: a6701746d6b992fc07bc0556a2b4a61d
3907] {
3908hunk ./src/allmydata/nodemaker.py 3
3909 import weakref
3910 from zope.interface import implements
3911-from allmydata.interfaces import INodeMaker
3912+from allmydata.util.assertutil import precondition
3913+from allmydata.interfaces import INodeMaker, SDMF_VERSION
3914 from allmydata.immutable.literal import LiteralFileNode
3915 from allmydata.immutable.filenode import ImmutableFileNode, CiphertextFileNode
3916 from allmydata.immutable.upload import Data
3917hunk ./src/allmydata/nodemaker.py 9
3918 from allmydata.mutable.filenode import MutableFileNode
3919+from allmydata.mutable.publish import MutableData
3920 from allmydata.dirnode import DirectoryNode, pack_children
3921 from allmydata.unknown import UnknownNode
3922 from allmydata import uri
3923hunk ./src/allmydata/nodemaker.py 92
3924             return self._create_dirnode(filenode)
3925         return None
3926 
3927-    def create_mutable_file(self, contents=None, keysize=None):
3928+    def create_mutable_file(self, contents=None, keysize=None,
3929+                            version=SDMF_VERSION):
3930         n = MutableFileNode(self.storage_broker, self.secret_holder,
3931                             self.default_encoding_parameters, self.history)
3932hunk ./src/allmydata/nodemaker.py 96
3933+        n.set_version(version)
3934         d = self.key_generator.generate(keysize)
3935         d.addCallback(n.create_with_keys, contents)
3936         d.addCallback(lambda res: n)
3937hunk ./src/allmydata/nodemaker.py 103
3938         return d
3939 
3940     def create_new_mutable_directory(self, initial_children={}):
3941+        # mutable directories will always be SDMF for now, to help
3942+        # compatibility with older clients.
3943+        version = SDMF_VERSION
3944+        # initial_children must have metadata (i.e. {} instead of None)
3945+        for (name, (node, metadata)) in initial_children.iteritems():
3946+            precondition(isinstance(metadata, dict),
3947+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
3948+            node.raise_error()
3949         d = self.create_mutable_file(lambda n:
3950hunk ./src/allmydata/nodemaker.py 112
3951-                                     pack_children(initial_children, n.get_writekey()))
3952+                                     MutableData(pack_children(initial_children,
3953+                                                    n.get_writekey())),
3954+                                     version=version)
3955         d.addCallback(self._create_dirnode)
3956         return d
3957 
3958}
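A sketch of how a caller might use the new argument, assuming `nm` is
an existing NodeMaker instance (MDMF_VERSION comes from the
interfaces.py additions elsewhere in this bundle):

    from allmydata.interfaces import MDMF_VERSION
    from allmydata.mutable.publish import MutableData

    d = nm.create_mutable_file(MutableData("initial contents"),
                               version=MDMF_VERSION)
    d.addCallback(lambda node: node.get_uri())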
3959[docs: update docs to mention MDMF
3960Kevan Carstensen <kevan@isnotajoke.com>**20100814225644
3961 Ignore-this: 1c3caa3cd44831007dcfbef297814308
3962] {
3963merger 0.0 (
3964replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
3965merger 0.0 (
3966hunk ./docs/configuration.rst 383
3967-shares.needed = (int, optional) aka "k", default 3
3968-shares.total = (int, optional) aka "N", N >= k, default 10
3969-shares.happy = (int, optional) 1 <= happy <= N, default 7
3970-
3971- These three values set the default encoding parameters. Each time a new file
3972- is uploaded, erasure-coding is used to break the ciphertext into separate
3973- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
3974- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
3975- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
3976- Setting k to 1 is equivalent to simple replication (uploading N copies of
3977- the file).
3978-
3979- These values control the tradeoff between storage overhead, performance, and
3980- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
3981- backend storage space (the actual value will be a bit more, because of other
3982- forms of overhead). Up to N-k shares can be lost before the file becomes
3983- unrecoverable, so assuming there are at least N servers, up to N-k servers
3984- can be offline without losing the file. So large N/k ratios are more
3985- reliable, and small N/k ratios use less disk space. Clearly, k must never be
3986- smaller than N.
3987-
3988- Large values of N will slow down upload operations slightly, since more
3989- servers must be involved, and will slightly increase storage overhead due to
3990- the hash trees that are created. Large values of k will cause downloads to
3991- be marginally slower, because more servers must be involved. N cannot be
3992- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
3993- uses.
3994-
3995- shares.happy allows you control over the distribution of your immutable file.
3996- For a successful upload, shares are guaranteed to be initially placed on
3997- at least 'shares.happy' distinct servers, the correct functioning of any
3998- k of which is sufficient to guarantee the availability of the uploaded file.
3999- This value should not be larger than the number of servers on your grid.
4000-
4001- A value of shares.happy <= k is allowed, but does not provide any redundancy
4002- if some servers fail or lose shares.
4003-
4004- (Mutable files use a different share placement algorithm that does not
4005-  consider this parameter.)
4006-
4007-
4008-== Storage Server Configuration ==
4009-
4010-[storage]
4011-enabled = (boolean, optional)
4012-
4013- If this is True, the node will run a storage server, offering space to other
4014- clients. If it is False, the node will not run a storage server, meaning
4015- that no shares will be stored on this node. Use False this for clients who
4016- do not wish to provide storage service. The default value is True.
4017-
4018-readonly = (boolean, optional)
4019-
4020- If True, the node will run a storage server but will not accept any shares,
4021- making it effectively read-only. Use this for storage servers which are
4022- being decommissioned: the storage/ directory could be mounted read-only,
4023- while shares are moved to other servers. Note that this currently only
4024- affects immutable shares. Mutable shares (used for directories) will be
4025- written and modified anyway. See ticket #390 for the current status of this
4026- bug. The default value is False.
4027-
4028-reserved_space = (str, optional)
4029-
4030- If provided, this value defines how much disk space is reserved: the storage
4031- server will not accept any share which causes the amount of free disk space
4032- to drop below this value. (The free space is measured by a call to statvfs(2)
4033- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
4034- user account under which the storage server runs.)
4035-
4036- This string contains a number, with an optional case-insensitive scale
4037- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
4038- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
4039- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
4040-
4041-expire.enabled =
4042-expire.mode =
4043-expire.override_lease_duration =
4044-expire.cutoff_date =
4045-expire.immutable =
4046-expire.mutable =
4047-
4048- These settings control garbage-collection, in which the server will delete
4049- shares that no longer have an up-to-date lease on them. Please see the
4050- neighboring "garbage-collection.txt" document for full details.
4051-
4052-
4053-== Running A Helper ==
4054+Running A Helper
4055+================
4056hunk ./docs/configuration.rst 423
4057+mutable.format = sdmf or mdmf
4058+
4059+ This value tells Tahoe-LAFS what the default mutable file format should
4060+ be. If mutable.format=sdmf, then newly created mutable files will be in
4061+ the old SDMF format. This is desirable for clients that operate on
4062+ grids where some peers run older versions of Tahoe-LAFS, as these older
4063+ versions cannot read the new MDMF mutable file format. If
4064+ mutable.format = mdmf, then newly created mutable files will use the
4065+ new MDMF format, which supports efficient in-place modification and
4066+ streaming downloads. You can override this value using a special
4067+ mutable-type parameter in the webapi. If you do not specify a value
4068+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
4069+
4070+ Note that this parameter only applies to mutable files. Mutable
4071+ directories, which are stored as mutable files, are not controlled by
4072+ this parameter and will always use SDMF. We may revisit this decision
4073+ in future versions of Tahoe-LAFS.
4074)
4075)
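For example, a client that should default to MDMF would carry something
like this in tahoe.cfg (assuming the option lives in the [client]
section alongside the other client defaults):

    [client]
    mutable.format = mdmf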
4076hunk ./docs/frontends/webapi.rst 363
4077  writeable mutable file, that file's contents will be overwritten in-place. If
4078  it is a read-cap for a mutable file, an error will occur. If it is an
4079  immutable file, the old file will be discarded, and a new one will be put in
4080- its place.
4081+ its place. If the target file is a writable mutable file, you may also
4082+ specify an "offset" parameter -- a byte offset that determines where in
4083+ the mutable file the data from the HTTP request body is placed. This
4084+ operation is relatively efficient for MDMF mutable files, and is
4085+ relatively inefficient (but still supported) for SDMF mutable files.
4086 
4087  When creating a new file, if "mutable=true" is in the query arguments, the
4088  operation will create a mutable file instead of an immutable one.
4089hunk ./docs/frontends/webapi.rst 388
4090 
4091  If "mutable=true" is in the query arguments, the operation will create a
4092  mutable file, and return its write-cap in the HTTP respose. The default is
4093- to create an immutable file, returning the read-cap as a response.
4094+ to create an immutable file, returning the read-cap as a response. If
4095+ you create a mutable file, you can also use the "mutable-type" query
4096+ parameter. If "mutable-type=sdmf", then the mutable file will be created
4097+ in the old SDMF mutable file format. This is desirable for files that
4098+ need to be read by old clients. If "mutable-type=mdmf", then the file
4099+ will be created in the new MDMF mutable file format. MDMF mutable files
4100+ can be downloaded more efficiently, and modified in-place efficiently,
4101+ but are not compatible with older versions of Tahoe-LAFS. If no
4102+ "mutable-type" argument is given, the file is created in whatever
4103+ format was configured in tahoe.cfg.
4104 
4105 Creating A New Directory
4106 ------------------------
4107hunk ./docs/frontends/webapi.rst 1082
4108  If a "mutable=true" argument is provided, the operation will create a
4109  mutable file, and the response body will contain the write-cap instead of
4110  the upload results page. The default is to create an immutable file,
4111- returning the upload results page as a response.
4112+ returning the upload results page as a response. If you create a
4113+ mutable file, you may choose to specify the format of that mutable file
4114+ with the "mutable-type" parameter. If "mutable-type=mdmf", then the
4115+ file will be created as an MDMF mutable file. If "mutable-type=sdmf",
4116+ then the file will be created as an SDMF mutable file. If no value is
4117+ specified, the file will be created in whatever format is specified in
4118+ tahoe.cfg.
4119 
4120 
4121 ``POST /uri/$DIRCAP/[SUBDIRS../]?t=upload``
4122}
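A sketch of both operations against a local node's webapi (Python 2;
the gateway address/port and the caps are placeholders):

    import httplib, urllib

    conn = httplib.HTTPConnection("127.0.0.1", 3456)
    # create a new MDMF mutable file:
    conn.request("PUT", "/uri?mutable=true&mutable-type=mdmf",
                 "initial contents")
    filecap = conn.getresponse().read().strip()
    conn.close()

    # overwrite two bytes in-place, starting at byte offset 3:
    conn = httplib.HTTPConnection("127.0.0.1", 3456)
    conn.request("PUT", "/uri/%s?offset=3" % urllib.quote(filecap), "xy")
    print conn.getresponse().status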
4123[mutable/layout.py and interfaces.py: add MDMF writer and reader
4124Kevan Carstensen <kevan@isnotajoke.com>**20100819003304
4125 Ignore-this: 44400fec923987b62830da2ed5075fb4
4126 
4127 The MDMF writer is responsible for keeping state as plaintext is
4128 gradually processed into share data by the upload process. When the
4129 upload finishes, it will write all of its share data to a remote server,
4130 reporting its status back to the publisher.
4131 
4132 The MDMF reader is responsible for abstracting an MDMF file as it sits
4133 on the grid from the downloader; specifically, by receiving and
4134 responding to requests for arbitrary data within the MDMF file.
4135 
4136 The interfaces.py file has also been modified to contain an interface
4137 for the writer.
4138] {
4139hunk ./src/allmydata/interfaces.py 7
4140      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
4141 
4142 HASH_SIZE=32
4143+SALT_SIZE=16
4144+
4145+SDMF_VERSION=0
4146+MDMF_VERSION=1
4147 
4148 Hash = StringConstraint(maxLength=HASH_SIZE,
4149                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
4150hunk ./src/allmydata/interfaces.py 424
4151         """
4152 
4153 
4154+class IMutableSlotWriter(Interface):
4155+    """
4156+    The interface for a writer around a mutable slot on a remote server.
4157+    """
4158+    def set_checkstring(checkstring, *args):
4159+        """
4160+        Set the checkstring that I will pass to the remote server when
4161+        writing.
4162+
4163+            @param checkstring: A packed checkstring to use.
4164+
4165+        Note that implementations can differ in which semantics they
4166+        wish to support for set_checkstring -- they can, for example,
4167+        build the checkstring themselves from its constituents, or
4168+        some other thing.
4169+        """
4170+
4171+    def get_checkstring():
4172+        """
4173+        Get the checkstring that I think currently exists on the remote
4174+        server.
4175+        """
4176+
4177+    def put_block(data, segnum, salt):
4178+        """
4179+        Add a block and salt to the share.
4180+        """
4181+
4182+    def put_encprivkey(encprivkey):
4183+        """
4184+        Add the encrypted private key to the share.
4185+        """
4186+
4187+    def put_blockhashes(blockhashes=list):
4188+        """
4189+        Add the block hash tree to the share.
4190+        """
4191+
4192+    def put_sharehashes(sharehashes=dict):
4193+        """
4194+        Add the share hash chain to the share.
4195+        """
4196+
4197+    def get_signable():
4198+        """
4199+        Return the part of the share that needs to be signed.
4200+        """
4201+
4202+    def put_signature(signature):
4203+        """
4204+        Add the signature to the share.
4205+        """
4206+
4207+    def put_verification_key(verification_key):
4208+        """
4209+        Add the verification key to the share.
4210+        """
4211+
4212+    def finish_publishing():
4213+        """
4214+        Do anything necessary to finish writing the share to a remote
4215+        server. I require that no further publishing needs to take place
4216+        after this method has been called.
4217+        """
4218+
4219+
4220 class IURI(Interface):
4221     def init_from_string(uri):
4222         """Accept a string (as created by my to_string() method) and populate
4223hunk ./src/allmydata/mutable/layout.py 4
4224 
4225 import struct
4226 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
4227+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
4228+                                 MDMF_VERSION, IMutableSlotWriter
4229+from allmydata.util import mathutil, observer
4230+from twisted.python import failure
4231+from twisted.internet import defer
4232+from zope.interface import implements
4233+
4234+
4235+# These strings describe the format of the packed structs they help process
4236+# Here's what they mean:
4237+#
4238+#  PREFIX:
4239+#    >: Big-endian byte order; the most significant byte is first (leftmost).
4240+#    B: The version information; an 8 bit version identifier. Stored as
4241+#       an unsigned char. This is currently 0; our modifications
4242+#       will turn it into 1 (for MDMF).
4243+#    Q: The sequence number; this is sort of like a revision history for
4244+#       mutable files; they start at 1 and increase as they are changed after
4245+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
4246+#       length.
4247+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
4248+#       characters = 32 bytes to store the value.
4249+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
4250+#       16 characters.
4251+#
4252+#  SIGNED_PREFIX additions, things that are covered by the signature:
4253+#    B: The "k" encoding parameter. We store this as an 8-bit character,
4254+#       which is convenient because our erasure coding scheme cannot
4255+#       encode if you ask for more than 255 pieces.
4256+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
4257+#       same reasons as above.
4258+#    Q: The segment size of the uploaded file. This will essentially be the
4259+#       length of the file in SDMF. An unsigned long long, so we can store
4260+#       files of quite large size.
4261+#    Q: The data length of the uploaded file. Modulo padding, this will be
4262+#       the same as the segment size field. Like the segment size field, it is
4263+#       an unsigned long long and can be quite large.
4264+#
4265+#   HEADER additions:
4266+#     L: The offset of the signature of this. An unsigned long.
4267+#     L: The offset of the share hash chain. An unsigned long.
4268+#     L: The offset of the block hash tree. An unsigned long.
4269+#     L: The offset of the share data. An unsigned long.
4270+#     Q: The offset of the encrypted private key. An unsigned long long, to
4271+#        account for the possibility of a lot of share data.
4272+#     Q: The offset of the EOF. An unsigned long long, to account for the
4273+#        possibility of a lot of share data.
4274+#
4275+#  After all of these, we have the following:
4276+#    - The verification key: Occupies the space between the end of the header
4277+#      and the start of the signature (i.e. data[HEADER_LENGTH:o['signature']]).
4278+#    - The signature, which goes from the signature offset to the share hash
4279+#      chain offset.
4280+#    - The share hash chain, which goes from the share hash chain offset to
4281+#      the block hash tree offset.
4282+#    - The share data, which goes from the share data offset to the encrypted
4283+#      private key offset.
4284+#    - The encrypted private key, which goes from its offset to the end of the file.
4285+#
4286+#  The block hash tree in this encoding has only one leaf, so the offset of
4287+#  the share data will be 32 bytes more than the offset of the block hash tree.
4288+#  Given this, we may need to check to see how many bytes a reasonably sized
4289+#  block hash tree will take up.
4290 
4291 PREFIX = ">BQ32s16s" # each version has a different prefix
4292 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
4293hunk ./src/allmydata/mutable/layout.py 73
4294 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
4295 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
4296 HEADER_LENGTH = struct.calcsize(HEADER)
4297+OFFSETS = ">LLLLQQ"
4298+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
4299 
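The field sizes described above can be sanity-checked directly:

    import struct
    assert struct.calcsize(PREFIX) == 1 + 8 + 32 + 16            # 57
    assert struct.calcsize(SIGNED_PREFIX) == 57 + 1 + 1 + 8 + 8  # 75
    assert struct.calcsize(OFFSETS) == 4*4 + 2*8                 # 32
    assert HEADER_LENGTH == 75 + 32                              # 107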
4300hunk ./src/allmydata/mutable/layout.py 76
4301+# These are still used for some tests.
4302 def unpack_header(data):
4303     o = {}
4304     (version,
4305hunk ./src/allmydata/mutable/layout.py 92
4306      o['EOF']) = struct.unpack(HEADER, data[:HEADER_LENGTH])
4307     return (version, seqnum, root_hash, IV, k, N, segsize, datalen, o)
4308 
4309-def unpack_prefix_and_signature(data):
4310-    assert len(data) >= HEADER_LENGTH, len(data)
4311-    prefix = data[:SIGNED_PREFIX_LENGTH]
4312-
4313-    (version,
4314-     seqnum,
4315-     root_hash,
4316-     IV,
4317-     k, N, segsize, datalen,
4318-     o) = unpack_header(data)
4319-
4320-    if version != 0:
4321-        raise UnknownVersionError("got mutable share version %d, but I only understand version 0" % version)
4322-
4323-    if len(data) < o['share_hash_chain']:
4324-        raise NeedMoreDataError(o['share_hash_chain'],
4325-                                o['enc_privkey'], o['EOF']-o['enc_privkey'])
4326-
4327-    pubkey_s = data[HEADER_LENGTH:o['signature']]
4328-    signature = data[o['signature']:o['share_hash_chain']]
4329-
4330-    return (seqnum, root_hash, IV, k, N, segsize, datalen,
4331-            pubkey_s, signature, prefix)
4332-
4333 def unpack_share(data):
4334     assert len(data) >= HEADER_LENGTH
4335     o = {}
4336hunk ./src/allmydata/mutable/layout.py 139
4337             pubkey, signature, share_hash_chain, block_hash_tree,
4338             share_data, enc_privkey)
4339 
4340-def unpack_share_data(verinfo, hash_and_data):
4341-    (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, o_t) = verinfo
4342-
4343-    # hash_and_data starts with the share_hash_chain, so figure out what the
4344-    # offsets really are
4345-    o = dict(o_t)
4346-    o_share_hash_chain = 0
4347-    o_block_hash_tree = o['block_hash_tree'] - o['share_hash_chain']
4348-    o_share_data = o['share_data'] - o['share_hash_chain']
4349-    o_enc_privkey = o['enc_privkey'] - o['share_hash_chain']
4350-
4351-    share_hash_chain_s = hash_and_data[o_share_hash_chain:o_block_hash_tree]
4352-    share_hash_format = ">H32s"
4353-    hsize = struct.calcsize(share_hash_format)
4354-    assert len(share_hash_chain_s) % hsize == 0, len(share_hash_chain_s)
4355-    share_hash_chain = []
4356-    for i in range(0, len(share_hash_chain_s), hsize):
4357-        chunk = share_hash_chain_s[i:i+hsize]
4358-        (hid, h) = struct.unpack(share_hash_format, chunk)
4359-        share_hash_chain.append( (hid, h) )
4360-    share_hash_chain = dict(share_hash_chain)
4361-    block_hash_tree_s = hash_and_data[o_block_hash_tree:o_share_data]
4362-    assert len(block_hash_tree_s) % 32 == 0, len(block_hash_tree_s)
4363-    block_hash_tree = []
4364-    for i in range(0, len(block_hash_tree_s), 32):
4365-        block_hash_tree.append(block_hash_tree_s[i:i+32])
4366-
4367-    share_data = hash_and_data[o_share_data:o_enc_privkey]
4368-
4369-    return (share_hash_chain, block_hash_tree, share_data)
4370-
4371-
4372-def pack_checkstring(seqnum, root_hash, IV):
4373-    return struct.pack(PREFIX,
4374-                       0, # version,
4375-                       seqnum,
4376-                       root_hash,
4377-                       IV)
4378-
4379 def unpack_checkstring(checkstring):
4380     cs_len = struct.calcsize(PREFIX)
4381     version, seqnum, root_hash, IV = struct.unpack(PREFIX, checkstring[:cs_len])
4382hunk ./src/allmydata/mutable/layout.py 146
4383         raise UnknownVersionError("got mutable share version %d, but I only understand version 0" % version)
4384     return (seqnum, root_hash, IV)
4385 
4386-def pack_prefix(seqnum, root_hash, IV,
4387-                required_shares, total_shares,
4388-                segment_size, data_length):
4389-    prefix = struct.pack(SIGNED_PREFIX,
4390-                         0, # version,
4391-                         seqnum,
4392-                         root_hash,
4393-                         IV,
4394-
4395-                         required_shares,
4396-                         total_shares,
4397-                         segment_size,
4398-                         data_length,
4399-                         )
4400-    return prefix
4401 
4402 def pack_offsets(verification_key_length, signature_length,
4403                  share_hash_chain_length, block_hash_tree_length,
4404hunk ./src/allmydata/mutable/layout.py 192
4405                            encprivkey])
4406     return final_share
4407 
4408+def pack_prefix(seqnum, root_hash, IV,
4409+                required_shares, total_shares,
4410+                segment_size, data_length):
4411+    prefix = struct.pack(SIGNED_PREFIX,
4412+                         0, # version,
4413+                         seqnum,
4414+                         root_hash,
4415+                         IV,
4416+                         required_shares,
4417+                         total_shares,
4418+                         segment_size,
4419+                         data_length,
4420+                         )
4421+    return prefix
4422+
4423+
4424+class SDMFSlotWriteProxy:
4425+    implements(IMutableSlotWriter)
4426+    """
4427+    I represent a remote write slot for an SDMF mutable file. I build a
4428+    share in memory, and then write it in one piece to the remote
4429+    server. This mimics how SDMF shares were built before MDMF (and the
4430+    new MDMF uploader), but provides that functionality in a way that
4431+    allows the MDMF uploader to be built without much special-casing for
4432+    file format, which makes the uploader code more readable.
4433+    """
4434+    def __init__(self,
4435+                 shnum,
4436+                 rref, # a remote reference to a storage server
4437+                 storage_index,
4438+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4439+                 seqnum, # the sequence number of the mutable file
4440+                 required_shares,
4441+                 total_shares,
4442+                 segment_size,
4443+                 data_length): # the length of the original file
4444+        self.shnum = shnum
4445+        self._rref = rref
4446+        self._storage_index = storage_index
4447+        self._secrets = secrets
4448+        self._seqnum = seqnum
4449+        self._required_shares = required_shares
4450+        self._total_shares = total_shares
4451+        self._segment_size = segment_size
4452+        self._data_length = data_length
4453+
4454+        # This is an SDMF file, so it should have only one segment, so,
4455+        # modulo padding of the data length, the segment size and the
4456+        # data length should be the same.
4457+        expected_segment_size = mathutil.next_multiple(data_length,
4458+                                                       self._required_shares)
4459+        assert expected_segment_size == segment_size
4460+
4461+        self._block_size = self._segment_size / self._required_shares
4462+
4463+        # This is meant to mimic how SDMF files were built before MDMF
4464+        # entered the picture: we generate each share in its entirety,
4465+        # then push it off to the storage server in one write. When
4466+        # callers call set_*, they are just populating this dict.
4467+        # finish_publishing will stitch these pieces together into a
4468+        # coherent share, and then write the coherent share to the
4469+        # storage server.
4470+        self._share_pieces = {}
4471+
4472+        # This tells the write logic what checkstring to use when
4473+        # writing remote shares.
4474+        self._testvs = []
4475+
4476+        self._readvs = [(0, struct.calcsize(PREFIX))]
4477+
4478+
4479+    def set_checkstring(self, checkstring_or_seqnum,
4480+                              root_hash=None,
4481+                              salt=None):
4482+        """
4483+        Set the checkstring that I will pass to the remote server when
4484+        writing.
4485+
4486+            @param checkstring_or_seqnum: A packed checkstring to use,
4487+                   or a sequence number accompanied by a root hash and salt.
4488+
4489+        Note that implementations can differ in which semantics they
4490+        wish to support for set_checkstring -- they can, for example,
4491+        build the checkstring themselves from its constituents, or
4492+        some other thing.
4493+        """
4494+        if root_hash and salt:
4495+            checkstring = struct.pack(PREFIX,
4496+                                      0,
4497+                                      checkstring_or_seqnum,
4498+                                      root_hash,
4499+                                      salt)
4500+        else:
4501+            checkstring = checkstring_or_seqnum
4502+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
4503+
4504+
4505+    def get_checkstring(self):
4506+        """
4507+        Get the checkstring that I think currently exists on the remote
4508+        server.
4509+        """
4510+        if self._testvs:
4511+            return self._testvs[0][3]
4512+        return ""
4513+
4514+
4515+    def put_block(self, data, segnum, salt):
4516+        """
4517+        Add a block and salt to the share.
4518+        """
4519+        # SDMF files have only one segment
4520+        assert segnum == 0
4521+        assert len(data) == self._block_size
4522+        assert len(salt) == SALT_SIZE
4523+
4524+        self._share_pieces['sharedata'] = data
4525+        self._share_pieces['salt'] = salt
4526+
4527+        # TODO: Figure out something intelligent to return.
4528+        return defer.succeed(None)
4529+
4530+
4531+    def put_encprivkey(self, encprivkey):
4532+        """
4533+        Add the encrypted private key to the share.
4534+        """
4535+        self._share_pieces['encprivkey'] = encprivkey
4536+
4537+        return defer.succeed(None)
4538+
4539+
4540+    def put_blockhashes(self, blockhashes):
4541+        """
4542+        Add the block hash tree to the share.
4543+        """
4544+        assert isinstance(blockhashes, list)
4545+        for h in blockhashes:
4546+            assert len(h) == HASH_SIZE
4547+
4548+        # serialize the blockhashes, then set them.
4549+        blockhashes_s = "".join(blockhashes)
4550+        self._share_pieces['block_hash_tree'] = blockhashes_s
4551+
4552+        return defer.succeed(None)
4553+
4554+
4555+    def put_sharehashes(self, sharehashes):
4556+        """
4557+        Add the share hash chain to the share.
4558+        """
4559+        assert isinstance(sharehashes, dict)
4560+        for h in sharehashes.itervalues():
4561+            assert len(h) == HASH_SIZE
4562+
4563+        # serialize the sharehashes, then set them.
4564+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4565+                                 for i in sorted(sharehashes.keys())])
4566+        self._share_pieces['share_hash_chain'] = sharehashes_s
4567+
4568+        return defer.succeed(None)
4569+
4570+
4571+    def put_root_hash(self, root_hash):
4572+        """
4573+        Add the root hash to the share.
4574+        """
4575+        assert len(root_hash) == HASH_SIZE
4576+
4577+        self._share_pieces['root_hash'] = root_hash
4578+
4579+        return defer.succeed(None)
4580+
4581+
4582+    def put_salt(self, salt):
4583+        """
4584+        Add a salt to an empty SDMF file.
4585+        """
4586+        assert len(salt) == SALT_SIZE
4587+
4588+        self._share_pieces['salt'] = salt
4589+        self._share_pieces['sharedata'] = ""
4590+
4591+
4592+    def get_signable(self):
4593+        """
4594+        Return the part of the share that needs to be signed.
4595+
4596+        SDMF writers need to sign the packed representation of the
4597+        first eight fields of the remote share, that is:
4598+            - version number (0)
4599+            - sequence number
4600+            - root of the share hash tree
4601+            - salt
4602+            - k
4603+            - n
4604+            - segsize
4605+            - datalen
4606+
4607+        This method is responsible for returning that to callers.
4608+        """
4609+        return struct.pack(SIGNED_PREFIX,
4610+                           0,
4611+                           self._seqnum,
4612+                           self._share_pieces['root_hash'],
4613+                           self._share_pieces['salt'],
4614+                           self._required_shares,
4615+                           self._total_shares,
4616+                           self._segment_size,
4617+                           self._data_length)
4618+
4619+
4620+    def put_signature(self, signature):
4621+        """
4622+        Add the signature to the share.
4623+        """
4624+        self._share_pieces['signature'] = signature
4625+
4626+        return defer.succeed(None)
4627+
4628+
4629+    def put_verification_key(self, verification_key):
4630+        """
4631+        Add the verification key to the share.
4632+        """
4633+        self._share_pieces['verification_key'] = verification_key
4634+
4635+        return defer.succeed(None)
4636+
4637+
4638+    def get_verinfo(self):
4639+        """
4640+        I return my verinfo tuple. This is used by the ServermapUpdater
4641+        to keep track of versions of mutable files.
4642+
4643+        The verinfo tuple for MDMF files contains:
4644+            - seqnum
4645+            - root hash
4646+            - a blank (nothing)
4647+            - segsize
4648+            - datalen
4649+            - k
4650+            - n
4651+            - prefix (the thing that you sign)
4652+            - a tuple of offsets
4653+
4654+        We include the nonce in MDMF to simplify processing of version
4655+        information tuples.
4656+
4657+        The verinfo tuple for SDMF files is the same, but contains a
4658+        16-byte IV instead of a hash of salts.
4659+        """
4660+        return (self._seqnum,
4661+                self._share_pieces['root_hash'],
4662+                self._share_pieces['salt'],
4663+                self._segment_size,
4664+                self._data_length,
4665+                self._required_shares,
4666+                self._total_shares,
4667+                self.get_signable(),
4668+                self._get_offsets_tuple())
4669+
4670+    def _get_offsets_dict(self):
4671+        post_offset = HEADER_LENGTH
4672+        offsets = {}
4673+
4674+        verification_key_length = len(self._share_pieces['verification_key'])
4675+        o1 = offsets['signature'] = post_offset + verification_key_length
4676+
4677+        signature_length = len(self._share_pieces['signature'])
4678+        o2 = offsets['share_hash_chain'] = o1 + signature_length
4679+
4680+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
4681+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
4682+
4683+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
4684+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
4685+
4686+        share_data_length = len(self._share_pieces['sharedata'])
4687+        o5 = offsets['enc_privkey'] = o4 + share_data_length
4688+
4689+        encprivkey_length = len(self._share_pieces['encprivkey'])
4690+        offsets['EOF'] = o5 + encprivkey_length
4691+        return offsets
4692+
4693+
4694+    def _get_offsets_tuple(self):
4695+        offsets = self._get_offsets_dict()
4696+        return tuple([(key, value) for key, value in offsets.items()])
4697+
4698+
4699+    def _pack_offsets(self):
4700+        offsets = self._get_offsets_dict()
4701+        return struct.pack(">LLLLQQ",
4702+                           offsets['signature'],
4703+                           offsets['share_hash_chain'],
4704+                           offsets['block_hash_tree'],
4705+                           offsets['share_data'],
4706+                           offsets['enc_privkey'],
4707+                           offsets['EOF'])
4708+
4709+
4710+    def finish_publishing(self):
4711+        """
4712+        Do anything necessary to finish writing the share to a remote
4713+        server. I require that no further publishing needs to take place
4714+        after this method has been called.
4715+        """
4716+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
4717+                  "share_hash_chain", "block_hash_tree"]:
4718+            assert k in self._share_pieces
4719+        # This is the only method that actually writes something to the
4720+        # remote server.
4721+        # First, we need to pack the share into data that we can write
4722+        # to the remote server in one write.
4723+        offsets = self._pack_offsets()
4724+        prefix = self.get_signable()
4725+        final_share = "".join([prefix,
4726+                               offsets,
4727+                               self._share_pieces['verification_key'],
4728+                               self._share_pieces['signature'],
4729+                               self._share_pieces['share_hash_chain'],
4730+                               self._share_pieces['block_hash_tree'],
4731+                               self._share_pieces['sharedata'],
4732+                               self._share_pieces['encprivkey']])
4733+
4734+        # Our only data vector is going to be writing the final share,
4735+        # in its entirety.
4736+        datavs = [(0, final_share)]
4737+
4738+        if not self._testvs:
4739+            # Our caller has not provided us with another checkstring
4740+            # yet, so we assume that we are writing a new share, and set
4741+            # a test vector that will allow a new share to be written.
4742+            self._testvs = []
4743+            self._testvs.append(tuple([0, 1, "eq", ""]))
4744+
4745+        tw_vectors = {}
4746+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
4747+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
4748+                                     self._storage_index,
4749+                                     self._secrets,
4750+                                     tw_vectors,
4751+                                     # TODO is it useful to read something?
4752+                                     self._readvs)
4753+
4754+
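Taken together, a publisher drives one of these write proxies roughly
as follows (a sketch: the blocks, hashes and keys are placeholders,
and `sign` stands in for the publisher's RSA signing step):

    w = SDMFSlotWriteProxy(0, rref, storage_index, secrets,
                           seqnum=1, required_shares=3, total_shares=10,
                           segment_size=36, data_length=36)
    w.put_block(block, 0, salt)              # SDMF has exactly one segment
    w.put_encprivkey(encprivkey)
    w.put_blockhashes(block_hash_tree)       # list of 32-byte hashes
    w.put_sharehashes(share_hash_chain)      # dict: shnum -> 32-byte hash
    w.put_root_hash(root_hash)
    w.put_signature(sign(w.get_signable()))  # hypothetical signing helper
    w.put_verification_key(pubkey_s)
    d = w.finish_publishing()                # the single remote write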
4755+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
4756+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
4757+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
4758+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4759+MDMFCHECKSTRING = ">BQ32s"
4760+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
4761+MDMFOFFSETS = ">QQQQQQ"
4762+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
4763+
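The header arithmetic implied by these formats (and by the offset
table in the class comment below) can be checked directly:

    import struct
    assert struct.calcsize(MDMFHEADERWITHOUTOFFSETS) == 59  # signed part
    assert struct.calcsize(MDMFOFFSETS) == 6 * 8            # six Q offsets
    assert MDMFHEADERSIZE == 59 + 48                        # 107 bytes total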
4764+class MDMFSlotWriteProxy:
4765+    implements(IMutableSlotWriter)
4766+
4767+    """
4768+    I represent a remote write slot for an MDMF mutable file.
4769+
4770+    I abstract away from my caller the details of block and salt
4771+    management, and the implementation of the on-disk format for MDMF
4772+    shares.
4773+    """
4774+    # Expected layout, MDMF:
4775+    # offset:     size:       name:
4776+    #-- signed part --
4777+    # 0           1           version number (01)
4778+    # 1           8           sequence number
4779+    # 9           32          share tree root hash
4780+    # 41          1           The "k" encoding parameter
4781+    # 42          1           The "N" encoding parameter
4782+    # 43          8           The segment size of the uploaded file
4783+    # 51          8           The data length of the original plaintext
4784+    #-- end signed part --
4785+    # 59          8           The offset of the encrypted private key
4786+    # 67          8           The offset of the block hash tree
4787+    # 75          8           The offset of the share hash chain
4788+    # 83          8           The offset of the signature
4789+    # 91          8           The offset of the verification key
4790+    # 99          8           The offset of the EOF
4791+    #
4792+    # followed by salts and share data, the encrypted private key, the
4793+    # block hash tree, the salt hash tree, the share hash chain, a
4794+    # signature over the first eight fields, and a verification key.
4795+    #
4796+    # The checkstring is the first three fields -- the version number,
4797+    # sequence number, root hash and root salt hash. This is consistent
4798+    # in meaning to what we have with SDMF files, except now instead of
4799+    # using the literal salt, we use a value derived from all of the
4800+    # salts -- the share hash root.
4801+    #
4802+    # The salt is stored before the block for each segment. The block
4803+    # hash tree is computed over the combination of block and salt for
4804+    # each segment. In this way, we get integrity checking for both
4805+    # block and salt with the current block hash tree arrangement.
4806+    #
4807+    # The ordering of the offsets is different to reflect the dependencies
4808+    # that we'll run into with an MDMF file. The expected write flow is
4809+    # something like this:
4810+    #
4811+    #   0: Initialize with the sequence number, encoding parameters and
4812+    #      data length. From this, we can deduce the number of segments,
4813+    #      and where they should go. We can also figure out where the
4814+    #      encrypted private key should go, because we can figure out how
4815+    #      big the share data will be.
4816+    #
4817+    #   1: Encrypt, encode, and upload the file in chunks. Do something
4818+    #      like
4819+    #
4820+    #       put_block(data, segnum, salt)
4821+    #
4822+    #      to write a block and a salt to the disk. We can do both of
4823+    #      these operations now because we have enough of the offsets to
4824+    #      know where to put them.
4825+    #
4826+    #   2: Put the encrypted private key. Use:
4827+    #
4828+    #        put_encprivkey(encprivkey)
4829+    #
4830+    #      Now that we know the length of the private key, we can fill
4831+    #      in the offset for the block hash tree.
4832+    #
4833+    #   3: We're now in a position to upload the block hash tree for
4834+    #      a share. Put that using something like:
4835+    #       
4836+    #        put_blockhashes(block_hash_tree)
4837+    #
4838+    #      Note that block_hash_tree is a list of hashes -- we'll take
4839+    #      care of the details of serializing that appropriately. When
4840+    #      we get the block hash tree, we are also in a position to
4841+    #      calculate the offset for the share hash chain, and fill that
4842+    #      into the offsets table.
4843+    #
4844+    #   4: At the same time, we're in a position to upload the salt hash
4845+    #      tree. This is a Merkle tree over all of the salts. We use a
4846+    #      Merkle tree so that we can validate each block,salt pair as
4847+    #      we download them later. We do this using
4848+    #
4849+    #        put_salthashes(salt_hash_tree)
4850+    #
4851+    #      When you do this, I automatically put the root of the tree
4852+    #      (the hash at index 0 of the list) in its appropriate slot in
4853+    #      the signed prefix of the share.
4854+    #
4855+    #   5: We're now in a position to upload the share hash chain for
4856+    #      a share. Do that with something like:
4857+    #     
4858+    #        put_sharehashes(share_hash_chain)
4859+    #
4860+    #      share_hash_chain should be a dictionary mapping shnums to
4861+    #      32-byte hashes -- the wrapper handles serialization.
4862+    #      We'll know where to put the signature at this point, also.
4863+    #      The root of this tree will be put explicitly in the next
4864+    #      step.
4865+    #
4866+    #      TODO: Why? Why not just include it in the tree here?
4867+    #
4868+    #   6: Before putting the signature, we must first put the
4869+    #      root_hash. Do this with:
4870+    #
4871+    #        put_root_hash(root_hash).
4872+    #     
4873+    #      In terms of knowing where to put this value, it was always
4874+    #      possible to place it, but it makes sense semantically to
4875+    #      place it after the share hash tree, so that's why you do it
4876+    #      in this order.
4877+    #
4878+    #   7: With the root hash put, we can now sign the header. Use:
4879+    #
4880+    #        get_signable()
4881+    #
4882+    #      to get the part of the header that you want to sign, and use:
4883+    #       
4884+    #        put_signature(signature)
4885+    #
4886+    #      to write your signature to the remote server.
4887+    #
4888+    #   8: Add the verification key, and finish. Do:
4889+    #
4890+    #        put_verification_key(key)
4891+    #
4892+    #      and
4893+    #
4894+    #        finish_publish()
4895+    #
4896+    # Checkstring management:
4897+    #
4898+    # To write to a mutable slot, we have to provide test vectors to ensure
4899+    # that we are writing to the same data that we think we are. These
4900+    # vectors allow us to detect uncoordinated writes; that is, writes
4901+    # where both we and some other shareholder are writing to the
4902+    # mutable slot, and to report those back to the parts of the program
4903+    # doing the writing.
4904+    #
4905+    # With SDMF, this was easy -- all of the share data was written in
4906+    # one go, so it was easy to detect uncoordinated writes, and we only
4907+    # had to do it once. With MDMF, not all of the file is written at
4908+    # once.
4909+    #
4910+    # If a share is new, we write out as much of the header as we can
4911+    # before writing out anything else. This gives other writers a
4912+    # canary that they can use to detect uncoordinated writes, and, if
4913+    # they do the same thing, gives us the same canary. We then update
4914+    # the share. We won't be able to write out two fields of the header
4915+    # -- the share tree hash and the salt hash -- until we finish
4916+    # writing out the share. We only require the writer to provide the
4917+    # initial checkstring, and keep track of what it should be after
4918+    # updates ourselves.
4919+    #
4920+    # If we haven't written anything yet, then on the first write (which
4921+    # will probably be a block + salt of a share), we'll also write out
4922+    # the header. On subsequent passes, we'll expect to see the header.
4923+    # This changes in two places:
4924+    #
4925+    #   - When we write out the salt hash
4926+    #   - When we write out the root of the share hash tree
4927+    #
4928+    # since these values will change the header. It is possible that we
4929+    # can just make those be written in one operation to minimize
4930+    # disruption.
4931+    def __init__(self,
4932+                 shnum,
4933+                 rref, # a remote reference to a storage server
4934+                 storage_index,
4935+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4936+                 seqnum, # the sequence number of the mutable file
4937+                 required_shares,
4938+                 total_shares,
4939+                 segment_size,
4940+                 data_length): # the length of the original file
4941+        self.shnum = shnum
4942+        self._rref = rref
4943+        self._storage_index = storage_index
4944+        self._seqnum = seqnum
4945+        self._required_shares = required_shares
4946+        assert self.shnum >= 0 and self.shnum < total_shares
4947+        self._total_shares = total_shares
4948+        # We build up the offset table as we write things. It is the
4949+        # last thing we write to the remote server.
4950+        self._offsets = {}
4951+        self._testvs = []
4952+        # This is a list of write vectors that will be sent to our
4953+        # remote server once we are directed to write things there.
4954+        self._writevs = []
4955+        self._secrets = secrets
4956+        # The segment size needs to be a multiple of the k parameter --
4957+        # any padding should have been carried out by the publisher
4958+        # already.
4959+        assert segment_size % required_shares == 0
4960+        self._segment_size = segment_size
4961+        self._data_length = data_length
4962+
4963+        # These are set later -- we define them here so that we can
4964+        # check for their existence easily
4965+
4966+        # This is the root of the share hash tree -- the Merkle tree
4967+        # over the roots of the block hash trees computed for shares in
4968+        # this upload.
4969+        self._root_hash = None
4970+
4971+        # We haven't yet written anything to the remote bucket. By
4972+        # setting this, we tell the _write method as much. The write
4973+        # method will then know that it also needs to add a write vector
4974+        # for the checkstring (or what we have of it) to the first write
4975+        # request. We'll then record that value for future use.  If
4976+        # we're expecting something to be there already, we need to call
4977+        # set_checkstring before we write anything to tell the first
4978+        # write about that.
4979+        self._written = False
4980+
4981+        # When writing data to the storage servers, we get a read vector
4982+        # for free. We'll read the checkstring, which will help us
4983+        # figure out what's gone wrong if a write fails.
4984+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
4985+
4986+        # We calculate the number of segments because it tells us
4987+        # where the share data (the salted blocks) ends,
4988+        # and also because it provides a useful amount of bounds checking.
4989+        self._num_segments = mathutil.div_ceil(self._data_length,
4990+                                               self._segment_size)
4991+        self._block_size = self._segment_size / self._required_shares
4992+        # We also calculate the share size, to help us with block
4993+        # constraints later.
4994+        tail_size = self._data_length % self._segment_size
4995+        if not tail_size:
4996+            self._tail_block_size = self._block_size
4997+        else:
4998+            self._tail_block_size = mathutil.next_multiple(tail_size,
4999+                                                           self._required_shares)
5000+            self._tail_block_size /= self._required_shares
5001+
5002+        # We already know where the sharedata starts; right after the end
5003+        # of the header (which is defined as the signable part + the offsets)
5004+        # We can also calculate where the encrypted private key begins
5005+        # from what we now know.
5006+        self._actual_block_size = self._block_size + SALT_SIZE
5007+        data_size = self._actual_block_size * (self._num_segments - 1)
5008+        data_size += self._tail_block_size
5009+        data_size += SALT_SIZE
5010+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
5011+        self._offsets['enc_privkey'] += data_size
5012+        # We'll wait for the rest. Callers can now call my "put_block" and
5013+        # "set_checkstring" methods.
5014+
5015+
5016+    def set_checkstring(self,
5017+                        seqnum_or_checkstring,
5018+                        root_hash=None,
5019+                        salt=None):
5020+        """
5021+        Set the checkstring for the given shnum.
5022+
5023+        This can be invoked in one of two ways.
5024+
5025+        With one argument, I assume that you are giving me a literal
5026+        checkstring -- e.g., the output of get_checkstring. I will then
5027+        set that checkstring as it is. This form is used by unit tests.
5028+
5029+        With two arguments, I assume that you are giving me a sequence
5030+        number and root hash to make a checkstring from. In that case, I
5031+        will build a checkstring and set it for you. This form is used
5032+        by the publisher.
5033+
5034+        By default, I assume that I am writing new shares to the grid.
5035+        If you don't explicitly set your own checkstring, I will use
5036+        one that requires that the remote share not exist. You will want
5037+        to use this method if you are updating a share in-place;
5038+        otherwise, writes will fail.
5039+        """
5040+        # You're allowed to overwrite checkstrings with this method;
5041+        # I assume that users know what they are doing when they call
5042+        # it.
5043+        if root_hash:
5044+            checkstring = struct.pack(MDMFCHECKSTRING,
5045+                                      1,
5046+                                      seqnum_or_checkstring,
5047+                                      root_hash)
5048+        else:
5049+            checkstring = seqnum_or_checkstring
5050+
5051+        if checkstring == "":
5052+            # An empty checkstring means that the share should not
5053+            # yet exist on the storage server. We can't express that
5054+            # with a zero-length test vector, so we clear self._testvs
5055+            # and let _write install its default one-byte test against "".
5056+            self._testvs = []
5057+        else:
5058+            self._testvs = []
5059+            self._testvs.append((0, len(checkstring), "eq", checkstring))
5060+
5061+
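+    # A short usage sketch for set_checkstring (values hypothetical):
+    #
+    #   mw.set_checkstring(checkstring)         # literal form
+    #   mw.set_checkstring(seqnum, root_hash)   # built from parts
+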
5062+    def __repr__(self):
5063+        return "MDMFSlotWriteProxy for share %d" % self.shnum
5064+
5065+
5066+    def get_checkstring(self):
5067+        """
5068+        I return a representation of what the checkstring for my
5069+        share on the server will look like.
5070+
5071+        I am mostly used for tests.
5072+        """
5073+        if self._root_hash:
5074+            roothash = self._root_hash
5075+        else:
5076+            roothash = "\x00" * 32
5077+        return struct.pack(MDMFCHECKSTRING,
5078+                           1,
5079+                           self._seqnum,
5080+                           roothash)
5081+
5082+
5083+    def put_block(self, data, segnum, salt):
5084+        """
5085+        I queue a write vector for the data, salt, and segment number
5086+        provided to me. I return None, as I do not actually cause
5087+        anything to be written yet.
5088+        """
5089+        if segnum >= self._num_segments:
5090+            raise LayoutInvalid("I won't overwrite the private key")
5091+        if len(salt) != SALT_SIZE:
5092+            raise LayoutInvalid("I was given a salt of size %d, but "
5093+                                "I wanted a salt of size %d" % (len(salt), SALT_SIZE))
5094+        if segnum + 1 == self._num_segments:
5095+            if len(data) != self._tail_block_size:
5096+                raise LayoutInvalid("I was given the wrong size block to write")
5097+        elif len(data) != self._block_size:
5098+            raise LayoutInvalid("I was given the wrong size block to write")
5099+
5100+        # We want to write at MDMFHEADERSIZE + segnum * (block_size + SALT_SIZE).
5101+
5102+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
5103+        data = salt + data
5104+
5105+        self._writevs.append(tuple([offset, data]))
5106+
5107+
5108+    def put_encprivkey(self, encprivkey):
5109+        """
5110+        I queue a write vector for the encrypted private key provided to
5111+        me.
5112+        """
5113+        assert self._offsets
5114+        assert self._offsets['enc_privkey']
5115+        # You shouldn't re-write the encprivkey after the block hash
5116+        # tree is written, since that could cause the private key to run
5117+        # into the block hash tree. Before it writes the block hash
5118+        # tree, the block hash tree writing method writes the offset of
5119+        # the share hash chain. So that's a good indicator of whether or
5120+        # not the block hash tree has been written.
5121+        if "share_hash_chain" in self._offsets:
5122+            raise LayoutInvalid("You must write this before the block hash tree")
5123+
5124+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \
5125+            len(encprivkey)
5126+        self._writevs.append(tuple([self._offsets['enc_privkey'], encprivkey]))
5127+
5128+
5129+    def put_blockhashes(self, blockhashes):
5130+        """
5131+        I queue a write vector to put the block hash tree in blockhashes
5132+        onto the remote server.
5133+
5134+        The encrypted private key must be queued before the block hash
5135+        tree, since we need to know how large it is to know where the
5136+        block hash tree should go. The block hash tree must be put
5137+        before the share hash chain, since its size determines the
5138+        offset of the share hash chain.
5139+        """
5140+        assert self._offsets
5141+        assert isinstance(blockhashes, list)
5142+        if "block_hash_tree" not in self._offsets:
5143+            raise LayoutInvalid("You must put the encrypted private key "
5144+                                "before you put the block hash tree")
5145+        # If written, the share hash chain causes the signature offset
5146+        # to be defined.
5147+        if "signature" in self._offsets:
5148+            raise LayoutInvalid("You must put the block hash tree before "
5149+                                "you put the share hash chain")
5150+        blockhashes_s = "".join(blockhashes)
5151+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
5152+
5153+        self._writevs.append(tuple([self._offsets['block_hash_tree'],
5154+                                  blockhashes_s]))
5155+
5156+
5157+    def put_sharehashes(self, sharehashes):
5158+        """
5159+        I queue a write vector to put the share hash chain in my
5160+        argument onto the remote server.
5161+
5162+        The block hash tree must be queued before the share hash chain,
5163+        since we need to know where the block hash tree ends before we
5164+        can know where the share hash chain starts. The share hash chain
5165+        must be put before the signature, since the length of the packed
5166+        share hash chain determines the offset of the signature. Also,
5167+        semantically, you must know what the root of the share hash tree
5168+        is before you can generate a valid signature.
5169+        """
5170+        assert isinstance(sharehashes, dict)
5171+        if "share_hash_chain" not in self._offsets:
5172+            raise LayoutInvalid("You need to put the salt hash tree before "
5173+                                "you can put the share hash chain")
5174+        # The signature comes after the share hash chain. If the
5175+        # signature has already been written, we must not write another
5176+        # share hash chain. The signature writes the verification key
5177+        # offset when it gets sent to the remote server, so we look for
5178+        # that.
5179+        if "verification_key" in self._offsets:
5180+            raise LayoutInvalid("You must write the share hash chain "
5181+                                "before you write the signature")
5182+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
5183+                                  for i in sorted(sharehashes.keys())])
5184+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
5185+        self._writevs.append(tuple([self._offsets['share_hash_chain'],
5186+                            sharehashes_s]))
5187+
5188+
5189+    def put_root_hash(self, roothash):
5190+        """
5191+        Put the root hash (the root of the share hash tree) in the
5192+        remote slot.
5193+        """
5194+        # It does not make sense to be able to put the root
5195+        # hash without first putting the share hashes, since you need
5196+        # the share hashes to generate the root hash.
5197+        #
5198+        # Signature is defined by the routine that places the share hash
5199+        # chain, so it's a good thing to look for in finding out whether
5200+        # or not the share hash chain exists on the remote server.
5201+        if "signature" not in self._offsets:
5202+            raise LayoutInvalid("You need to put the share hash chain "
5203+                                "before you can put the root share hash")
5204+        if len(roothash) != HASH_SIZE:
5205+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
5206+                                 % HASH_SIZE)
5207+        self._root_hash = roothash
5208+        # To write both of these values, we update the checkstring on
5209+        # the remote server, which includes them
5210+        checkstring = self.get_checkstring()
5211+        self._writevs.append(tuple([0, checkstring]))
5212+        # This write, if successful, changes the checkstring. Since
5213+        # get_checkstring derives its value from the self._root_hash we
5214+        # just set, our internal checkstring stays consistent with it.
5215+
5216+
5217+    def get_signable(self):
5218+        """
5219+        Get the first seven fields of the mutable file; the parts that
5220+        are signed.
5221+        """
5222+        if not self._root_hash:
5223+            raise LayoutInvalid("You need to set the root hash "
5224+                                "before getting something to "
5225+                                "sign")
5226+        return struct.pack(MDMFSIGNABLEHEADER,
5227+                           1,
5228+                           self._seqnum,
5229+                           self._root_hash,
5230+                           self._required_shares,
5231+                           self._total_shares,
5232+                           self._segment_size,
5233+                           self._data_length)
5234+
5235+
5236+    def put_signature(self, signature):
5237+        """
5238+        I queue a write vector for the signature of the MDMF share.
5239+
5240+        I require that the root hash and share hash chain have been put
5241+        to the grid before I will write the signature to the grid.
5242+        """
5243+        if "signature" not in self._offsets:
5244+            raise LayoutInvalid("You must put the share hash chain "
5245+        # It does not make sense to put a signature without first
5246+        # putting the root hash and the salt hash (since otherwise
5247+        # the signature would be incomplete), so we don't allow that.
5248+                       "before putting the signature")
5249+        if not self._root_hash:
5250+            raise LayoutInvalid("You must complete the signed prefix "
5251+                                "before computing a signature")
5252+        # If we put the signature after we put the verification key, we
5253+        # could end up running into the verification key, and will
5254+        # probably screw up the offsets as well. So we don't allow that.
5255+        # The method that writes the verification key defines the EOF
5256+        # offset before writing the verification key, so look for that.
5257+        if "EOF" in self._offsets:
5258+            raise LayoutInvalid("You must write the signature before the verification key")
5259+
5260+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
5261+        self._writevs.append(tuple([self._offsets['signature'], signature]))
5262+
5263+
5264+    def put_verification_key(self, verification_key):
5265+        """
5266+        I queue a write vector for the verification key.
5267+
5268+        I require that the signature have been written to the storage
5269+        server before I allow the verification key to be written to the
5270+        remote server.
5271+        """
5272+        if "verification_key" not in self._offsets:
5273+            raise LayoutInvalid("You must put the signature before you "
5274+                                "can put the verification key")
5275+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
5276+        self._writevs.append(tuple([self._offsets['verification_key'],
5277+                            verification_key]))
5278+
5279+
5280+    def _get_offsets_tuple(self):
5281+        return tuple([(key, value) for key, value in self._offsets.items()])
5282+
5283+
5284+    def get_verinfo(self):
5285+        return (self._seqnum,
5286+                self._root_hash,
5287+                self._required_shares,
5288+                self._total_shares,
5289+                self._segment_size,
5290+                self._data_length,
5291+                self.get_signable(),
5292+                self._get_offsets_tuple())
5293+
5294+
5295+    def finish_publishing(self):
5296+        """
5297+        I add a write vector for the offsets table, and then cause all
5298+        of the write vectors that I've dealt with so far to be published
5299+        to the remote server, ending the write process.
5300+        """
5301+        if "EOF" not in self._offsets:
5302+            raise LayoutInvalid("You must put the verification key before "
5303+                                "you can publish the offsets")
5304+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
5305+        offsets = struct.pack(MDMFOFFSETS,
5306+                              self._offsets['enc_privkey'],
5307+                              self._offsets['block_hash_tree'],
5308+                              self._offsets['share_hash_chain'],
5309+                              self._offsets['signature'],
5310+                              self._offsets['verification_key'],
5311+                              self._offsets['EOF'])
5312+        self._writevs.append(tuple([offsets_offset, offsets]))
5313+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
5314+        params = struct.pack(">BBQQ",
5315+                             self._required_shares,
5316+                             self._total_shares,
5317+                             self._segment_size,
5318+                             self._data_length)
5319+        self._writevs.append(tuple([encoding_parameters_offset, params]))
5320+        return self._write(self._writevs)
5321+
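+    # For orientation, the on-disk MDMF share layout produced by the
+    # calls above is (a sketch):
+    #
+    #   checkstring | encoding parameters | offsets table
+    #   | (salt + block) * num_segments | enc_privkey
+    #   | block hash tree | share hash chain | signature
+    #   | verification key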
5322+
5323+    def _write(self, datavs, on_failure=None, on_success=None):
5324+        """I write the data vectors in datavs to the remote slot."""
5325+        tw_vectors = {}
5326+        if not self._testvs:
5327+            self._testvs = []
5328+            self._testvs.append(tuple([0, 1, "eq", ""]))
5329+        if not self._written:
5330+            # Write a new checkstring to the share when we write it, so
5331+            # that we have something to check later.
5332+            new_checkstring = self.get_checkstring()
5333+            datavs.append((0, new_checkstring))
5334+            def _first_write():
5335+                self._written = True
5336+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
5337+            on_success = _first_write
5338+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5339+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5340+                                  self._storage_index,
5341+                                  self._secrets,
5342+                                  tw_vectors,
5343+                                  self._readv)
5344+        def _result(results):
5345+            if isinstance(results, failure.Failure) or not results[0]:
5346+                # Do nothing; the write was unsuccessful.
5347+                if on_failure: on_failure()
5348+            else:
5349+                if on_success: on_success()
5350+            return results
5351+        d.addCallback(_result)
5352+        return d
5353+
5354+
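+# For reference, the structure that _write hands to
+# slot_testv_and_readv_and_writev for each share looks like this
+# (a sketch; the offsets and data are illustrative):
+#
+#   tw_vectors = {shnum: (testvs, writevs, new_length)}
+#     testvs:     [(0, len(checkstring), "eq", checkstring)]
+#     writevs:    [(offset, data), ...]
+#     new_length: None (we never truncate the share)
+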
5355+class MDMFSlotReadProxy:
5356+    """
5357+    I read from a mutable slot filled with data written in the MDMF data
5358+    format (which is described above).
5359+
5360+    I can be initialized with some amount of data, which I will use (if
5361+    it is valid) to eliminate some of the need to fetch it from servers.
5362+    """
5363+    def __init__(self,
5364+                 rref,
5365+                 storage_index,
5366+                 shnum,
5367+                 data=""):
5368+        # Start the initialization process.
5369+        self._rref = rref
5370+        self._storage_index = storage_index
5371+        self.shnum = shnum
5372+
5373+        # Before doing anything, the reader is probably going to want to
5374+        # verify that the signature is correct. To do that, they'll need
5375+        # the verification key, and the signature. To get those, we'll
5376+        # need the offset table. So fetch the offset table on the
5377+        # assumption that that will be the first thing that a reader is
5378+        # going to do.
5379+
5380+        # The fact that these encoding parameters are None tells us
5381+        # that we haven't yet fetched them from the remote share, so we
5382+        # should. We could just not set them, but the checks will be
5383+        # easier to read if we don't have to use hasattr.
5384+        self._version_number = None
5385+        self._sequence_number = None
5386+        self._root_hash = None
5387+        # Filled in if we're dealing with an SDMF file. Unused
5388+        # otherwise.
5389+        self._salt = None
5390+        self._required_shares = None
5391+        self._total_shares = None
5392+        self._segment_size = None
5393+        self._data_length = None
5394+        self._offsets = None
5395+
5396+        # If the user has chosen to initialize us with some data, we'll
5397+        # try to satisfy subsequent data requests with that data before
5398+        # asking the storage server for it.
5399+        self._data = data
5400+        # The filenode's cache hands us None if there isn't any cached
5401+        # data, but the way we index the cached data requires a string,
5402+        # so convert None to "".
5403+        if self._data is None:
5404+            self._data = ""
5405+
5406+        self._queue_observers = observer.ObserverList()
5407+        self._queue_errbacks = observer.ObserverList()
5408+        self._readvs = []
5409+
5410+
5411+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
5412+        """
5413+        I fetch the offset table and the header from the remote slot if
5414+        I don't already have them. If I do have them, I do nothing and
5415+        return a Deferred that has already fired.
5416+        """
5417+        if self._offsets:
5418+            return defer.succeed(None)
5419+        # At this point, we may be either SDMF or MDMF. Fetching 107
5420+        # bytes will be enough to get header and offsets for both SDMF and
5421+        # MDMF, though we'll be left with 4 more bytes than we
5422+        # need if this ends up being MDMF. This is probably less
5423+        # expensive than the cost of a second roundtrip.
5424+        readvs = [(0, 107)]
5425+        d = self._read(readvs, force_remote)
5426+        d.addCallback(self._process_encoding_parameters)
5427+        d.addCallback(self._process_offsets)
5428+        return d
5429+
5430+
5431+    def _process_encoding_parameters(self, encoding_parameters):
5432+        assert self.shnum in encoding_parameters
5433+        encoding_parameters = encoding_parameters[self.shnum][0]
5434+        # The first byte is the version number. It will tell us what
5435+        # to do next.
5436+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
5437+        if verno == MDMF_VERSION:
5438+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
5439+            (verno,
5440+             seqnum,
5441+             root_hash,
5442+             k,
5443+             n,
5444+             segsize,
5445+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
5446+                                      encoding_parameters[:read_size])
5447+            if segsize == 0 and datalen == 0:
5448+                # Empty file, no segments.
5449+                self._num_segments = 0
5450+            else:
5451+                self._num_segments = mathutil.div_ceil(datalen, segsize)
5452+
5453+        elif verno == SDMF_VERSION:
5454+            read_size = SIGNED_PREFIX_LENGTH
5455+            (verno,
5456+             seqnum,
5457+             root_hash,
5458+             salt,
5459+             k,
5460+             n,
5461+             segsize,
5462+             datalen) = struct.unpack(">BQ32s16s BBQQ",
5463+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
5464+            self._salt = salt
5465+            if segsize == 0 and datalen == 0:
5466+                # empty file
5467+                self._num_segments = 0
5468+            else:
5469+                # non-empty SDMF files have one segment.
5470+                self._num_segments = 1
5471+        else:
5472+            raise UnknownVersionError("You asked me to read mutable file "
5473+                                      "version %d, but I only understand "
5474+                                      "%d and %d" % (verno, SDMF_VERSION,
5475+                                                     MDMF_VERSION))
5476+
5477+        self._version_number = verno
5478+        self._sequence_number = seqnum
5479+        self._root_hash = root_hash
5480+        self._required_shares = k
5481+        self._total_shares = n
5482+        self._segment_size = segsize
5483+        self._data_length = datalen
5484+
5485+        self._block_size = self._segment_size / self._required_shares
5486+        # We can upload empty files, and need to account for this fact
5487+        # so as to avoid zero-division and zero-modulo errors.
5488+        if datalen > 0:
5489+            tail_size = self._data_length % self._segment_size
5490+        else:
5491+            tail_size = 0
5492+        if not tail_size:
5493+            self._tail_block_size = self._block_size
5494+        else:
5495+            self._tail_block_size = mathutil.next_multiple(tail_size,
5496+                                                    self._required_shares)
5497+            self._tail_block_size /= self._required_shares
5498+
5499+        return encoding_parameters
5500+
5501+
5502+    def _process_offsets(self, offsets):
5503+        if self._version_number == 0:
5504+            read_size = OFFSETS_LENGTH
5505+            read_offset = SIGNED_PREFIX_LENGTH
5506+            end = read_size + read_offset
5507+            (signature,
5508+             share_hash_chain,
5509+             block_hash_tree,
5510+             share_data,
5511+             enc_privkey,
5512+             EOF) = struct.unpack(">LLLLQQ",
5513+                                  offsets[read_offset:end])
5514+            self._offsets = {}
5515+            self._offsets['signature'] = signature
5516+            self._offsets['share_data'] = share_data
5517+            self._offsets['block_hash_tree'] = block_hash_tree
5518+            self._offsets['share_hash_chain'] = share_hash_chain
5519+            self._offsets['enc_privkey'] = enc_privkey
5520+            self._offsets['EOF'] = EOF
5521+
5522+        elif self._version_number == 1:
5523+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
5524+            read_length = MDMFOFFSETS_LENGTH
5525+            end = read_offset + read_length
5526+            (encprivkey,
5527+             blockhashes,
5528+             sharehashes,
5529+             signature,
5530+             verification_key,
5531+             eof) = struct.unpack(MDMFOFFSETS,
5532+                                  offsets[read_offset:end])
5533+            self._offsets = {}
5534+            self._offsets['enc_privkey'] = encprivkey
5535+            self._offsets['block_hash_tree'] = blockhashes
5536+            self._offsets['share_hash_chain'] = sharehashes
5537+            self._offsets['signature'] = signature
5538+            self._offsets['verification_key'] = verification_key
5539+            self._offsets['EOF'] = eof
5540+
5541+
5542+    def get_block_and_salt(self, segnum, queue=False):
5543+        """
5544+        I return (block, salt), where block is the block data and
5545+        salt is the salt used to encrypt that segment.
5546+        """
5547+        d = self._maybe_fetch_offsets_and_header()
5548+        def _then(ignored):
5549+            if self._version_number == 1:
5550+                base_share_offset = MDMFHEADERSIZE
5551+            else:
5552+                base_share_offset = self._offsets['share_data']
5553+
5554+            if segnum + 1 > self._num_segments:
5555+                raise LayoutInvalid("Not a valid segment number")
5556+
5557+            if self._version_number == 0:
5558+                share_offset = base_share_offset + self._block_size * segnum
5559+            else:
5560+                share_offset = base_share_offset + (self._block_size + \
5561+                                                    SALT_SIZE) * segnum
5562+            if segnum + 1 == self._num_segments:
5563+                data = self._tail_block_size
5564+            else:
5565+                data = self._block_size
5566+
5567+            if self._version_number == 1:
5568+                data += SALT_SIZE
5569+
5570+            readvs = [(share_offset, data)]
5571+            return readvs
5572+        d.addCallback(_then)
5573+        d.addCallback(lambda readvs:
5574+            self._read(readvs, queue=queue))
5575+        def _process_results(results):
5576+            assert self.shnum in results
5577+            if self._version_number == 0:
5578+                # We only read the share data, but we know the salt from
5579+                # when we fetched the header
5580+                data = results[self.shnum]
5581+                if not data:
5582+                    data = ""
5583+                else:
5584+                    assert len(data) == 1
5585+                    data = data[0]
5586+                salt = self._salt
5587+            else:
5588+                data = results[self.shnum]
5589+                if not data:
5590+                    salt = data = ""
5591+                else:
5592+                    salt_and_data = results[self.shnum][0]
5593+                    salt = salt_and_data[:SALT_SIZE]
5594+                    data = salt_and_data[SALT_SIZE:]
5595+            return data, salt
5596+        d.addCallback(_process_results)
5597+        return d
5598+
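+    # A minimal read loop over a share's blocks might look like this
+    # (a sketch; decoding and validation are the caller's job):
+    #
+    #   for segnum in xrange(num_segments):
+    #       d = mr.get_block_and_salt(segnum)
+    #       d.addCallback(lambda (block, salt): decrypt(block, salt))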
5599+
5600+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
5601+        """
5602+        I return the block hash tree
5603+
5604+        I take an optional argument, needed, which is a set of indices
5605+        corresponding to hashes that I should fetch. If this argument is
5606+        missing, I will fetch the entire block hash tree; otherwise, I
5607+        may attempt to fetch fewer hashes, based on what needed says
5608+        that I should do. Note that I may fetch as many hashes as I
5609+        want, so long as the set of hashes that I do fetch is a superset
5610+        of the ones that I am asked for, so callers should be prepared
5611+        to tolerate additional hashes.
5612+        """
5613+        # TODO: Return only the parts of the block hash tree necessary
5614+        # to validate the blocknum provided?
5615+        # This is a good idea, but it is hard to implement correctly. It
5616+        # is bad to fetch any one block hash more than once, so we
5617+        # probably just want to fetch the whole thing at once and then
5618+        # serve it.
5619+        if needed == set([]):
5620+            return defer.succeed([])
5621+        d = self._maybe_fetch_offsets_and_header()
5622+        def _then(ignored):
5623+            blockhashes_offset = self._offsets['block_hash_tree']
5624+            if self._version_number == 1:
5625+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
5626+            else:
5627+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
5628+            readvs = [(blockhashes_offset, blockhashes_length)]
5629+            return readvs
5630+        d.addCallback(_then)
5631+        d.addCallback(lambda readvs:
5632+            self._read(readvs, queue=queue, force_remote=force_remote))
5633+        def _build_block_hash_tree(results):
5634+            assert self.shnum in results
5635+
5636+            rawhashes = results[self.shnum][0]
5637+            results = [rawhashes[i:i+HASH_SIZE]
5638+                       for i in range(0, len(rawhashes), HASH_SIZE)]
5639+            return results
5640+        d.addCallback(_build_block_hash_tree)
5641+        return d
5642+
5643+
5644+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
5645+        """
5646+        I return the part of the share hash chain needed to validate
5647+        this share.
5648+
5649+        I take an optional argument, needed. Needed is a set of indices
5650+        that correspond to the hashes that I should fetch. If needed is
5651+        not present, I will fetch and return the entire share hash
5652+        chain. Otherwise, I may fetch and return any part of the share
5653+        hash chain that is a superset of the part that I am asked to
5654+        fetch. Callers should be prepared to deal with more hashes than
5655+        they've asked for.
5656+        """
5657+        if needed == set([]):
5658+            return defer.succeed([])
5659+        d = self._maybe_fetch_offsets_and_header()
5660+
5661+        def _make_readvs(ignored):
5662+            sharehashes_offset = self._offsets['share_hash_chain']
5663+            if self._version_number == 0:
5664+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
5665+            else:
5666+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
5667+            readvs = [(sharehashes_offset, sharehashes_length)]
5668+            return readvs
5669+        d.addCallback(_make_readvs)
5670+        d.addCallback(lambda readvs:
5671+            self._read(readvs, queue=queue, force_remote=force_remote))
5672+        def _build_share_hash_chain(results):
5673+            assert self.shnum in results
5674+
5675+            sharehashes = results[self.shnum][0]
5676+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
5677+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
5678+            results = dict([struct.unpack(">H32s", data)
5679+                            for data in results])
5680+            return results
5681+        d.addCallback(_build_share_hash_chain)
5682+        return d
5683+
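+    # Each entry in the packed share hash chain is 34 bytes: a 2-byte
+    # big-endian share number followed by a 32-byte hash (">H32s").
+    # _build_share_hash_chain above slices and unpacks the chain on
+    # those 34-byte boundaries.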
5684+
5685+    def get_encprivkey(self, queue=False):
5686+        """
5687+        I return the encrypted private key.
5688+        """
5689+        d = self._maybe_fetch_offsets_and_header()
5690+
5691+        def _make_readvs(ignored):
5692+            privkey_offset = self._offsets['enc_privkey']
5693+            if self._version_number == 0:
5694+                privkey_length = self._offsets['EOF'] - privkey_offset
5695+            else:
5696+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
5697+            readvs = [(privkey_offset, privkey_length)]
5698+            return readvs
5699+        d.addCallback(_make_readvs)
5700+        d.addCallback(lambda readvs:
5701+            self._read(readvs, queue=queue))
5702+        def _process_results(results):
5703+            assert self.shnum in results
5704+            privkey = results[self.shnum][0]
5705+            return privkey
5706+        d.addCallback(_process_results)
5707+        return d
5708+
5709+
5710+    def get_signature(self, queue=False):
5711+        """
5712+        I return the signature of my share.
5713+        """
5714+        d = self._maybe_fetch_offsets_and_header()
5715+
5716+        def _make_readvs(ignored):
5717+            signature_offset = self._offsets['signature']
5718+            if self._version_number == 1:
5719+                signature_length = self._offsets['verification_key'] - signature_offset
5720+            else:
5721+                signature_length = self._offsets['share_hash_chain'] - signature_offset
5722+            readvs = [(signature_offset, signature_length)]
5723+            return readvs
5724+        d.addCallback(_make_readvs)
5725+        d.addCallback(lambda readvs:
5726+            self._read(readvs, queue=queue))
5727+        def _process_results(results):
5728+            assert self.shnum in results
5729+            signature = results[self.shnum][0]
5730+            return signature
5731+        d.addCallback(_process_results)
5732+        return d
5733+
5734+
5735+    def get_verification_key(self, queue=False):
5736+        """
5737+        I return the verification key.
5738+        """
5739+        d = self._maybe_fetch_offsets_and_header()
5740+
5741+        def _make_readvs(ignored):
5742+            if self._version_number == 1:
5743+                vk_offset = self._offsets['verification_key']
5744+                vk_length = self._offsets['EOF'] - vk_offset
5745+            else:
5746+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5747+                vk_length = self._offsets['signature'] - vk_offset
5748+            readvs = [(vk_offset, vk_length)]
5749+            return readvs
5750+        d.addCallback(_make_readvs)
5751+        d.addCallback(lambda readvs:
5752+            self._read(readvs, queue=queue))
5753+        def _process_results(results):
5754+            assert self.shnum in results
5755+            verification_key = results[self.shnum][0]
5756+            return verification_key
5757+        d.addCallback(_process_results)
5758+        return d
5759+
5760+
5761+    def get_encoding_parameters(self):
5762+        """
5763+        I return (k, n, segsize, datalen)
5764+        """
5765+        d = self._maybe_fetch_offsets_and_header()
5766+        d.addCallback(lambda ignored:
5767+            (self._required_shares,
5768+             self._total_shares,
5769+             self._segment_size,
5770+             self._data_length))
5771+        return d
5772+
5773+
5774+    def get_seqnum(self):
5775+        """
5776+        I return the sequence number for this share.
5777+        """
5778+        d = self._maybe_fetch_offsets_and_header()
5779+        d.addCallback(lambda ignored:
5780+            self._sequence_number)
5781+        return d
5782+
5783+
5784+    def get_root_hash(self):
5785+        """
5786+        I return the root of the share hash tree.
5787+        """
5788+        d = self._maybe_fetch_offsets_and_header()
5789+        d.addCallback(lambda ignored: self._root_hash)
5790+        return d
5791+
5792+
5793+    def get_checkstring(self):
5794+        """
5795+        I return the packed representation of the following:
5796+
5797+            - version number
5798+            - sequence number
5799+            - root hash
5800+            - salt (SDMF only)
5801+
5802+        which my users use as a checkstring to detect other writers.
5803+        """
5804+        d = self._maybe_fetch_offsets_and_header()
5805+        def _build_checkstring(ignored):
5806+            if self._salt:
5807+                checkstring = struct.pack(PREFIX,
5808+                                          self._version_number,
5809+                                          self._sequence_number,
5810+                                          self._root_hash,
5811+                                          self._salt)
5812+            else:
5813+                checkstring = struct.pack(MDMFCHECKSTRING,
5814+                                          self._version_number,
5815+                                          self._sequence_number,
5816+                                          self._root_hash)
5817+
5818+            return checkstring
5819+        d.addCallback(_build_checkstring)
5820+        return d
5821+
5822+
5823+    def get_prefix(self, force_remote):
5824+        d = self._maybe_fetch_offsets_and_header(force_remote)
5825+        d.addCallback(lambda ignored:
5826+            self._build_prefix())
5827+        return d
5828+
5829+
5830+    def _build_prefix(self):
5831+        # The prefix is another name for the part of the remote share
5832+        # that gets signed. It consists of everything up to and
5833+        # including the datalength, packed by struct.
5834+        if self._version_number == SDMF_VERSION:
5835+            return struct.pack(SIGNED_PREFIX,
5836+                           self._version_number,
5837+                           self._sequence_number,
5838+                           self._root_hash,
5839+                           self._salt,
5840+                           self._required_shares,
5841+                           self._total_shares,
5842+                           self._segment_size,
5843+                           self._data_length)
5844+
5845+        else:
5846+            return struct.pack(MDMFSIGNABLEHEADER,
5847+                           self._version_number,
5848+                           self._sequence_number,
5849+                           self._root_hash,
5850+                           self._required_shares,
5851+                           self._total_shares,
5852+                           self._segment_size,
5853+                           self._data_length)
5854+
5855+
5856+    def _get_offsets_tuple(self):
5857+        # The offsets tuple is another component of the version
5858+        # information tuple. It is basically our offsets dictionary,
5859+        # itemized and in a tuple.
5860+        return tuple([(key, value) for key, value in self._offsets.items()])
5861+
5862+
5863+    def get_verinfo(self):
5864+        """
5865+        I return my verinfo tuple. This is used by the ServermapUpdater
5866+        to keep track of versions of mutable files.
5867+
5868+        The verinfo tuple for MDMF files contains:
5869+            - seqnum
5870+            - root hash
5871+            - a blank (nothing)
5872+            - segsize
5873+            - datalen
5874+            - k
5875+            - n
5876+            - prefix (the thing that you sign)
5877+            - a tuple of offsets
5878+
5879+        We include the blank entry in MDMF so that the tuple has the
5880+        same shape as SDMF's, which simplifies processing.
5881+
5882+        The verinfo tuple for SDMF files is the same, but carries the
5883+        16-byte IV (salt) in place of the blank entry.
5884+        """
5885+        d = self._maybe_fetch_offsets_and_header()
5886+        def _build_verinfo(ignored):
5887+            if self._version_number == SDMF_VERSION:
5888+                salt_to_use = self._salt
5889+            else:
5890+                salt_to_use = None
5891+            return (self._sequence_number,
5892+                    self._root_hash,
5893+                    salt_to_use,
5894+                    self._segment_size,
5895+                    self._data_length,
5896+                    self._required_shares,
5897+                    self._total_shares,
5898+                    self._build_prefix(),
5899+                    self._get_offsets_tuple())
5900+        d.addCallback(_build_verinfo)
5901+        return d
5902+
5903+
5904+    def flush(self):
5905+        """
5906+        I flush my queue of read vectors.
5907+        """
5908+        d = self._read(self._readvs)
5909+        def _then(results):
5910+            self._readvs = []
5911+            if isinstance(results, failure.Failure):
5912+                self._queue_errbacks.notify(results)
5913+            else:
5914+                self._queue_observers.notify(results)
5915+            self._queue_observers = observer.ObserverList()
5916+            self._queue_errbacks = observer.ObserverList()
5917+        d.addBoth(_then)
5918+
5919+
5920+    def _read(self, readvs, force_remote=False, queue=False):
5921+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
5922+        # TODO: It's entirely possible to tweak this so that it just
5923+        # fulfills the requests that it can, and not demand that all
5924+        # requests are satisfiable before running it.
5925+        if not unsatisfiable and not force_remote:
5926+            results = [self._data[offset:offset+length]
5927+                       for (offset, length) in readvs]
5928+            results = {self.shnum: results}
5929+            return defer.succeed(results)
5930+        else:
5931+            if queue:
5932+                start = len(self._readvs)
5933+                self._readvs += readvs
5934+                end = len(self._readvs)
5935+                def _get_results(results, start, end):
5936+                    if not self.shnum in results:
5937+                        return {self._shnum: [""]}
5938+                    return {self.shnum: results[self.shnum][start:end]}
5939+                d = defer.Deferred()
5940+                d.addCallback(_get_results, start, end)
5941+                self._queue_observers.subscribe(d.callback)
5942+                self._queue_errbacks.subscribe(d.errback)
5943+                return d
5944+            return self._rref.callRemote("slot_readv",
5945+                                         self._storage_index,
5946+                                         [self.shnum],
5947+                                         readvs)
5948+
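+    # Whether served from the local cache or from slot_readv, results
+    # arrive shaped like {shnum: [data_0, data_1, ...]}, with one
+    # string per requested (offset, length) pair.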
5949+
5950+    def is_sdmf(self):
5951+        """I tell my caller whether or not my remote file is SDMF or MDMF
5952+        """
5953+        d = self._maybe_fetch_offsets_and_header()
5954+        d.addCallback(lambda ignored:
5955+            self._version_number == 0)
5956+        return d
5957+
5958+
5959+class LayoutInvalid(Exception):
5960+    """
5961+    This isn't a valid MDMF mutable file
5962+    """
5963merger 0.0 (
5964hunk ./src/allmydata/test/test_storage.py 3
5965-from allmydata.util import log
5966-
5967merger 0.0 (
5968hunk ./src/allmydata/test/test_storage.py 3
5969-import time, os.path, stat, re, simplejson, struct
5970+from allmydata.util import log
5971+
5972+import mock
5973hunk ./src/allmydata/test/test_storage.py 3
5974-import time, os.path, stat, re, simplejson, struct
5975+import time, os.path, stat, re, simplejson, struct, shutil
5976)
5977)
5978hunk ./src/allmydata/test/test_storage.py 23
5979 from allmydata.storage.expirer import LeaseCheckingCrawler
5980 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
5981      ReadBucketProxy
5982-from allmydata.interfaces import BadWriteEnablerError
5983-from allmydata.test.common import LoggingServiceParent
5984+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
5985+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
5986+                                     SIGNED_PREFIX, MDMFHEADER, \
5987+                                     MDMFOFFSETS, SDMFSlotWriteProxy
5988+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
5989+                                 SDMF_VERSION
5990+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
5991 from allmydata.test.common_web import WebRenderingMixin
5992 from allmydata.web.storage import StorageStatus, remove_prefix
5993 
5994hunk ./src/allmydata/test/test_storage.py 107
5995 
5996 class RemoteBucket:
5997 
5998+    def __init__(self):
5999+        self.read_count = 0
6000+        self.write_count = 0
6001+
6002     def callRemote(self, methname, *args, **kwargs):
6003         def _call():
6004             meth = getattr(self.target, "remote_" + methname)
6005hunk ./src/allmydata/test/test_storage.py 115
6006             return meth(*args, **kwargs)
6007+
6008+        if methname == "slot_readv":
6009+            self.read_count += 1
6010+        if "writev" in methname:
6011+            self.write_count += 1
6012+
6013         return defer.maybeDeferred(_call)
6014 
6015hunk ./src/allmydata/test/test_storage.py 123
6016+
6017 class BucketProxy(unittest.TestCase):
6018     def make_bucket(self, name, size):
6019         basedir = os.path.join("storage", "BucketProxy", name)
6020hunk ./src/allmydata/test/test_storage.py 1306
6021         self.failUnless(os.path.exists(prefixdir), prefixdir)
6022         self.failIf(os.path.exists(bucketdir), bucketdir)
6023 
6024+
6025+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
6026+    def setUp(self):
6027+        self.sparent = LoggingServiceParent()
6028+        self._lease_secret = itertools.count()
6029+        self.ss = self.create("MDMFProxies storage test server")
6030+        self.rref = RemoteBucket()
6031+        self.rref.target = self.ss
6032+        self.secrets = (self.write_enabler("we_secret"),
6033+                        self.renew_secret("renew_secret"),
6034+                        self.cancel_secret("cancel_secret"))
6035+        self.segment = "aaaaaa"
6036+        self.block = "aa"
6037+        self.salt = "a" * 16
6038+        self.block_hash = "a" * 32
6039+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
6040+        self.share_hash = self.block_hash
6041+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
6042+        self.signature = "foobarbaz"
6043+        self.verification_key = "vvvvvv"
6044+        self.encprivkey = "private"
6045+        self.root_hash = self.block_hash
6046+        self.salt_hash = self.root_hash
6047+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
6048+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
6049+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
6050+        # blockhashes and salt hashes are serialized in the same way,
6051+        # only we lop off the first element and store that in the
6052+        # header.
6053+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
6054+
6055+
6056+    def tearDown(self):
6057+        self.sparent.stopService()
6058+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
6059+
6060+
6061+    def write_enabler(self, we_tag):
6062+        return hashutil.tagged_hash("we_blah", we_tag)
6063+
6064+
6065+    def renew_secret(self, tag):
6066+        return hashutil.tagged_hash("renew_blah", str(tag))
6067+
6068+
6069+    def cancel_secret(self, tag):
6070+        return hashutil.tagged_hash("cancel_blah", str(tag))
6071+
6072+
6073+    def workdir(self, name):
6074+        basedir = os.path.join("storage", "MutableServer", name)
6075+        return basedir
6076+
6077+
6078+    def create(self, name):
6079+        workdir = self.workdir(name)
6080+        ss = StorageServer(workdir, "\x00" * 20)
6081+        ss.setServiceParent(self.sparent)
6082+        return ss
6083+
6084+
6085+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
6086+        # Start with the checkstring
6087+        data = struct.pack(">BQ32s",
6088+                           1,
6089+                           0,
6090+                           self.root_hash)
6091+        self.checkstring = data
6092+        # Next, the encoding parameters
6093+        if tail_segment:
6094+            data += struct.pack(">BBQQ",
6095+                                3,
6096+                                10,
6097+                                6,
6098+                                33)
6099+        elif empty:
6100+            data += struct.pack(">BBQQ",
6101+                                3,
6102+                                10,
6103+                                0,
6104+                                0)
6105+        else:
6106+            data += struct.pack(">BBQQ",
6107+                                3,
6108+                                10,
6109+                                6,
6110+                                36)
6111+        # Next, the share data (we need its length to compute offsets).
6112+        sharedata = ""
6113+        if not tail_segment and not empty:
6114+            for i in xrange(6):
6115+                sharedata += self.salt + self.block
6116+        elif tail_segment:
6117+            for i in xrange(5):
6118+                sharedata += self.salt + self.block
6119+            sharedata += self.salt + "a"
6120+
6121+        # The encrypted private key comes after the shares + salts
6122+        offset_size = struct.calcsize(MDMFOFFSETS)
6123+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
6124+        # The blockhashes come after the private key
6125+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
6126+        # The sharehashes come after the block hashes
6127+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
6128+        # The signature comes after the share hash chain
6129+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
6130+        # The verification key comes after the signature
6131+        verification_offset = signature_offset + len(self.signature)
6132+        # The EOF comes after the verification key
6133+        eof_offset = verification_offset + len(self.verification_key)
6134+        data += struct.pack(MDMFOFFSETS,
6135+                            encrypted_private_key_offset,
6136+                            blockhashes_offset,
6137+                            sharehashes_offset,
6138+                            signature_offset,
6139+                            verification_offset,
6140+                            eof_offset)
6141+        self.offsets = {}
6142+        self.offsets['enc_privkey'] = encrypted_private_key_offset
6143+        self.offsets['block_hash_tree'] = blockhashes_offset
6144+        self.offsets['share_hash_chain'] = sharehashes_offset
6145+        self.offsets['signature'] = signature_offset
6146+        self.offsets['verification_key'] = verification_offset
6147+        self.offsets['EOF'] = eof_offset
6148+        # Next, we'll add in the salts and share data,
6149+        data += sharedata
6150+        # the private key,
6151+        data += self.encprivkey
6152+        # the block hash tree,
6153+        data += self.block_hash_tree_s
6154+        # the share hash chain,
6155+        data += self.share_hash_chain_s
6156+        # the signature,
6157+        data += self.signature
6158+        # and the verification key
6159+        data += self.verification_key
6160+        return data
6161+
6162+
6163+    def write_test_share_to_server(self,
6164+                                   storage_index,
6165+                                   tail_segment=False,
6166+                                   empty=False):
6167+        """
6168+        I write share data to self.ss for the read tests to read.
6169+
6170+        If tail_segment=True, then I will write a share that has a
6171+        smaller tail segment than other segments.
6172+        """
6173+        write = self.ss.remote_slot_testv_and_readv_and_writev
6174+        data = self.build_test_mdmf_share(tail_segment, empty)
6175+        # Finally, we write the whole thing to the storage server in one
6176+        # pass.
6177+        testvs = [(0, 1, "eq", "")]
6178+        tws = {}
6179+        tws[0] = (testvs, [(0, data)], None)
6180+        readv = [(0, 1)]
6181+        results = write(storage_index, self.secrets, tws, readv)
6182+        self.failUnless(results[0])
6183+
6184+
6185+    def build_test_sdmf_share(self, empty=False):
6186+        if empty:
6187+            sharedata = ""
6188+        else:
6189+            sharedata = self.segment * 6
6190+        self.sharedata = sharedata
6191+        blocksize = len(sharedata) / 3
6192+        block = sharedata[:blocksize]
6193+        self.blockdata = block
6194+        prefix = struct.pack(">BQ32s16s BBQQ",
6195+                             0, # version,
6196+                             0,
6197+                             self.root_hash,
6198+                             self.salt,
6199+                             3,
6200+                             10,
6201+                             len(sharedata),
6202+                             len(sharedata),
6203+                            )
6204+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
6205+        signature_offset = post_offset + len(self.verification_key)
6206+        sharehashes_offset = signature_offset + len(self.signature)
6207+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
6208+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
6209+        encprivkey_offset = sharedata_offset + len(block)
6210+        eof_offset = encprivkey_offset + len(self.encprivkey)
6211+        offsets = struct.pack(">LLLLQQ",
6212+                              signature_offset,
6213+                              sharehashes_offset,
6214+                              blockhashes_offset,
6215+                              sharedata_offset,
6216+                              encprivkey_offset,
6217+                              eof_offset)
6218+        final_share = "".join([prefix,
6219+                           offsets,
6220+                           self.verification_key,
6221+                           self.signature,
6222+                           self.share_hash_chain_s,
6223+                           self.block_hash_tree_s,
6224+                           block,
6225+                           self.encprivkey])
6226+        self.offsets = {}
6227+        self.offsets['signature'] = signature_offset
6228+        self.offsets['share_hash_chain'] = sharehashes_offset
6229+        self.offsets['block_hash_tree'] = blockhashes_offset
6230+        self.offsets['share_data'] = sharedata_offset
6231+        self.offsets['enc_privkey'] = encprivkey_offset
6232+        self.offsets['EOF'] = eof_offset
6233+        return final_share
6234+
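Note that the SDMF body orders its fields almost in reverse of the MDMF share built earlier. Both orders are taken directly from the two build methods above; captured as data for reference:

    # field order after the header and offset table, per the build methods
    SDMF_FIELD_ORDER = ["verification_key", "signature", "share_hash_chain",
                        "block_hash_tree", "share_data", "enc_privkey"]
    MDMF_FIELD_ORDER = ["share_data",       # interleaved (salt, block) pairs
                        "enc_privkey", "block_hash_tree", "share_hash_chain",
                        "signature", "verification_key"]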
6235+
6236+    def write_sdmf_share_to_server(self,
6237+                                   storage_index,
6238+                                   empty=False):
6239+        # Some tests need SDMF shares to verify that we can still read
6240+        # them; this method writes a hand-built approximation of one.
6241+        assert self.rref
6242+        write = self.ss.remote_slot_testv_and_readv_and_writev
6243+        share = self.build_test_sdmf_share(empty)
6244+        testvs = [(0, 1, "eq", "")]
6245+        tws = {}
6246+        tws[0] = (testvs, [(0, share)], None)
6247+        readv = []
6248+        results = write(storage_index, self.secrets, tws, readv)
6249+        self.failUnless(results[0])
6250+
6251+
6252+    def test_read(self):
6253+        self.write_test_share_to_server("si1")
6254+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6255+        # Check that every method equals what we expect it to.
6256+        d = defer.succeed(None)
6257+        def _check_block_and_salt((block, salt)):
6258+            self.failUnlessEqual(block, self.block)
6259+            self.failUnlessEqual(salt, self.salt)
6260+
6261+        for i in xrange(6):
6262+            d.addCallback(lambda ignored, i=i:
6263+                mr.get_block_and_salt(i))
6264+            d.addCallback(_check_block_and_salt)
6265+
6266+        d.addCallback(lambda ignored:
6267+            mr.get_encprivkey())
6268+        d.addCallback(lambda encprivkey:
6269+            self.failUnlessEqual(self.encprivkey, encprivkey))
6270+
6271+        d.addCallback(lambda ignored:
6272+            mr.get_blockhashes())
6273+        d.addCallback(lambda blockhashes:
6274+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6275+
6276+        d.addCallback(lambda ignored:
6277+            mr.get_sharehashes())
6278+        d.addCallback(lambda sharehashes:
6279+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6280+
6281+        d.addCallback(lambda ignored:
6282+            mr.get_signature())
6283+        d.addCallback(lambda signature:
6284+            self.failUnlessEqual(signature, self.signature))
6285+
6286+        d.addCallback(lambda ignored:
6287+            mr.get_verification_key())
6288+        d.addCallback(lambda verification_key:
6289+            self.failUnlessEqual(verification_key, self.verification_key))
6290+
6291+        d.addCallback(lambda ignored:
6292+            mr.get_seqnum())
6293+        d.addCallback(lambda seqnum:
6294+            self.failUnlessEqual(seqnum, 0))
6295+
6296+        d.addCallback(lambda ignored:
6297+            mr.get_root_hash())
6298+        d.addCallback(lambda root_hash:
6299+            self.failUnlessEqual(self.root_hash, root_hash))
6300+
6301+        d.addCallback(lambda ignored:
6302+            mr.get_seqnum())
6303+        d.addCallback(lambda seqnum:
6304+            self.failUnlessEqual(0, seqnum))
6305+
6306+        d.addCallback(lambda ignored:
6307+            mr.get_encoding_parameters())
6308+        def _check_encoding_parameters((k, n, segsize, datalen)):
6309+            self.failUnlessEqual(k, 3)
6310+            self.failUnlessEqual(n, 10)
6311+            self.failUnlessEqual(segsize, 6)
6312+            self.failUnlessEqual(datalen, 36)
6313+        d.addCallback(_check_encoding_parameters)
6314+
6315+        d.addCallback(lambda ignored:
6316+            mr.get_checkstring())
6317+        d.addCallback(lambda checkstring:
6318+            self.failUnlessEqual(checkstring, self.checkstring))
6319+        return d
6320+
6321+
6322+    def test_read_with_different_tail_segment_size(self):
6323+        self.write_test_share_to_server("si1", tail_segment=True)
6324+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6325+        d = mr.get_block_and_salt(5)
6326+        def _check_tail_segment(results):
6327+            block, salt = results
6328+            self.failUnlessEqual(len(block), 1)
6329+            self.failUnlessEqual(block, "a")
6330+        d.addCallback(_check_tail_segment)
6331+        return d
6332+
6333+
6334+    def test_get_block_with_invalid_segnum(self):
6335+        self.write_test_share_to_server("si1")
6336+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6337+        d = defer.succeed(None)
6338+        d.addCallback(lambda ignored:
6339+            self.shouldFail(LayoutInvalid, "test invalid segnum",
6340+                            None,
6341+                            mr.get_block_and_salt, 7))
6342+        return d
6343+
6344+
6345+    def test_get_encoding_parameters_first(self):
6346+        self.write_test_share_to_server("si1")
6347+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6348+        d = mr.get_encoding_parameters()
6349+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6350+            self.failUnlessEqual(k, 3)
6351+            self.failUnlessEqual(n, 10)
6352+            self.failUnlessEqual(segment_size, 6)
6353+            self.failUnlessEqual(datalen, 36)
6354+        d.addCallback(_check_encoding_parameters)
6355+        return d
6356+
6357+
6358+    def test_get_seqnum_first(self):
6359+        self.write_test_share_to_server("si1")
6360+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6361+        d = mr.get_seqnum()
6362+        d.addCallback(lambda seqnum:
6363+            self.failUnlessEqual(seqnum, 0))
6364+        return d
6365+
6366+
6367+    def test_get_root_hash_first(self):
6368+        self.write_test_share_to_server("si1")
6369+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6370+        d = mr.get_root_hash()
6371+        d.addCallback(lambda root_hash:
6372+            self.failUnlessEqual(root_hash, self.root_hash))
6373+        return d
6374+
6375+
6376+    def test_get_checkstring_first(self):
6377+        self.write_test_share_to_server("si1")
6378+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6379+        d = mr.get_checkstring()
6380+        d.addCallback(lambda checkstring:
6381+            self.failUnlessEqual(checkstring, self.checkstring))
6382+        return d
6383+
6384+
6385+    def test_write_read_vectors(self):
6386+        # When writing for us, the storage server will return to us a
6387+        # read vector, along with its result. If a write fails because
6388+        # the test vectors failed, this read vector can help us to
6389+        # diagnose the problem. This test ensures that the read vector
6390+        # is working appropriately.
6391+        mw = self._make_new_mw("si1", 0)
6392+
6393+        for i in xrange(6):
6394+            mw.put_block(self.block, i, self.salt)
6395+        mw.put_encprivkey(self.encprivkey)
6396+        mw.put_blockhashes(self.block_hash_tree)
6397+        mw.put_sharehashes(self.share_hash_chain)
6398+        mw.put_root_hash(self.root_hash)
6399+        mw.put_signature(self.signature)
6400+        mw.put_verification_key(self.verification_key)
6401+        d = mw.finish_publishing()
6402+        def _then(results):
6403+            self.failUnlessEqual(len(results), 2)
6404+            result, readv = results
6405+            self.failUnless(result)
6406+            self.failIf(readv)
6407+            self.old_checkstring = mw.get_checkstring()
6408+            mw.set_checkstring("")
6409+        d.addCallback(_then)
6410+        d.addCallback(lambda ignored:
6411+            mw.finish_publishing())
6412+        def _then_again(results):
6413+            self.failUnlessEqual(len(results), 2)
6414+            result, readvs = results
6415+            self.failIf(result)
6416+            self.failUnlessIn(0, readvs)
6417+            readv = readvs[0][0]
6418+            self.failUnlessEqual(readv, self.old_checkstring)
6419+        d.addCallback(_then_again)
6420+        # The checkstring remains the same for the rest of the process.
6421+        return d
6422+
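A caller can use the returned read vector to recover from exactly this kind of failure: adopt the checkstring the server reports and retry. A sketch using the names from this test (not the publisher's actual retry logic):

    def adopt_server_checkstring(results, mw):
        # finish_publishing() fires with (success, readvs); on failure,
        # readvs maps each share number to [current_checkstring]
        result, readvs = results
        if not result:
            mw.set_checkstring(readvs[0][0])
        return result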
6423+
6424+    def test_blockhashes_after_share_hash_chain(self):
6425+        mw = self._make_new_mw("si1", 0)
6426+        d = defer.succeed(None)
6427+        # Put everything up to and including the share hash chain
6428+        for i in xrange(6):
6429+            d.addCallback(lambda ignored, i=i:
6430+                mw.put_block(self.block, i, self.salt))
6431+        d.addCallback(lambda ignored:
6432+            mw.put_encprivkey(self.encprivkey))
6433+        d.addCallback(lambda ignored:
6434+            mw.put_blockhashes(self.block_hash_tree))
6435+        d.addCallback(lambda ignored:
6436+            mw.put_sharehashes(self.share_hash_chain))
6437+
6438+        # Now try to put the block hash tree again.
6439+        d.addCallback(lambda ignored:
6440+            self.shouldFail(LayoutInvalid, "test repeat salthashes",
6441+                            None,
6442+                            mw.put_blockhashes, self.block_hash_tree))
6443+        return d
6444+
6445+
6446+    def test_encprivkey_after_blockhashes(self):
6447+        mw = self._make_new_mw("si1", 0)
6448+        d = defer.succeed(None)
6449+        # Put everything up to and including the block hash tree
6450+        for i in xrange(6):
6451+            d.addCallback(lambda ignored, i=i:
6452+                mw.put_block(self.block, i, self.salt))
6453+        d.addCallback(lambda ignored:
6454+            mw.put_encprivkey(self.encprivkey))
6455+        d.addCallback(lambda ignored:
6456+            mw.put_blockhashes(self.block_hash_tree))
6457+        d.addCallback(lambda ignored:
6458+            self.shouldFail(LayoutInvalid, "out of order private key",
6459+                            None,
6460+                            mw.put_encprivkey, self.encprivkey))
6461+        return d
6462+
6463+
6464+    def test_share_hash_chain_after_signature(self):
6465+        mw = self._make_new_mw("si1", 0)
6466+        d = defer.succeed(None)
6467+        # Put everything up to and including the signature
6468+        for i in xrange(6):
6469+            d.addCallback(lambda ignored, i=i:
6470+                mw.put_block(self.block, i, self.salt))
6471+        d.addCallback(lambda ignored:
6472+            mw.put_encprivkey(self.encprivkey))
6473+        d.addCallback(lambda ignored:
6474+            mw.put_blockhashes(self.block_hash_tree))
6475+        d.addCallback(lambda ignored:
6476+            mw.put_sharehashes(self.share_hash_chain))
6477+        d.addCallback(lambda ignored:
6478+            mw.put_root_hash(self.root_hash))
6479+        d.addCallback(lambda ignored:
6480+            mw.put_signature(self.signature))
6481+        # Now try to put the share hash chain again. This should fail
6482+        d.addCallback(lambda ignored:
6483+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
6484+                            None,
6485+                            mw.put_sharehashes, self.share_hash_chain))
6486+        return d
6487+
6488+
6489+    def test_signature_after_verification_key(self):
6490+        mw = self._make_new_mw("si1", 0)
6491+        d = defer.succeed(None)
6492+        # Put everything up to and including the verification key.
6493+        for i in xrange(6):
6494+            d.addCallback(lambda ignored, i=i:
6495+                mw.put_block(self.block, i, self.salt))
6496+        d.addCallback(lambda ignored:
6497+            mw.put_encprivkey(self.encprivkey))
6498+        d.addCallback(lambda ignored:
6499+            mw.put_blockhashes(self.block_hash_tree))
6500+        d.addCallback(lambda ignored:
6501+            mw.put_sharehashes(self.share_hash_chain))
6502+        d.addCallback(lambda ignored:
6503+            mw.put_root_hash(self.root_hash))
6504+        d.addCallback(lambda ignored:
6505+            mw.put_signature(self.signature))
6506+        d.addCallback(lambda ignored:
6507+            mw.put_verification_key(self.verification_key))
6508+        # Now try to put the signature again. This should fail
6509+        d.addCallback(lambda ignored:
6510+            self.shouldFail(LayoutInvalid, "signature after verification",
6511+                            None,
6512+                            mw.put_signature, self.signature))
6513+        return d
6514+
6515+
6516+    def test_uncoordinated_write(self):
6517+        # Make two mutable writers, both pointing to the same storage
6518+        # server, both at the same storage index, and try writing to the
6519+        # same share.
6520+        mw1 = self._make_new_mw("si1", 0)
6521+        mw2 = self._make_new_mw("si1", 0)
6522+
6523+        def _check_success(results):
6524+            result, readvs = results
6525+            self.failUnless(result)
6526+
6527+        def _check_failure(results):
6528+            result, readvs = results
6529+            self.failIf(result)
6530+
6531+        def _write_share(mw):
6532+            for i in xrange(6):
6533+                mw.put_block(self.block, i, self.salt)
6534+            mw.put_encprivkey(self.encprivkey)
6535+            mw.put_blockhashes(self.block_hash_tree)
6536+            mw.put_sharehashes(self.share_hash_chain)
6537+            mw.put_root_hash(self.root_hash)
6538+            mw.put_signature(self.signature)
6539+            mw.put_verification_key(self.verification_key)
6540+            return mw.finish_publishing()
6541+        d = _write_share(mw1)
6542+        d.addCallback(_check_success)
6543+        d.addCallback(lambda ignored:
6544+            _write_share(mw2))
6545+        d.addCallback(_check_failure)
6546+        return d
6547+
6548+
6549+    def test_invalid_salt_size(self):
6550+        # Salts need to be 16 bytes in size. Writes that attempt to
6551+        # write more or less than this should be rejected.
6552+        mw = self._make_new_mw("si1", 0)
6553+        invalid_salt = "a" * 17 # 17 bytes
6554+        another_invalid_salt = "b" * 15 # 15 bytes
6555+        d = defer.succeed(None)
6556+        d.addCallback(lambda ignored:
6557+            self.shouldFail(LayoutInvalid, "salt too big",
6558+                            None,
6559+                            mw.put_block, self.block, 0, invalid_salt))
6560+        d.addCallback(lambda ignored:
6561+            self.shouldFail(LayoutInvalid, "salt too small",
6562+                            None,
6563+                            mw.put_block, self.block, 0,
6564+                            another_invalid_salt))
6565+        return d
6566+
6567+
6568+    def test_write_test_vectors(self):
6569+        # If we give the write proxy a bogus test vector at
6570+        # any point during the process, it should fail to write when we
6571+        # tell it to write.
6572+        def _check_failure(results):
6573+            self.failUnlessEqual(len(results), 2)
6574+            res, d = results
6575+            self.failIf(res)
6576+
6577+        def _check_success(results):
6578+            self.failUnlessEqual(len(results), 2)
6579+            res, d = results
6580+            self.failUnless(res)
6581+
6582+        mw = self._make_new_mw("si1", 0)
6583+        mw.set_checkstring("this is a lie")
6584+        for i in xrange(6):
6585+            mw.put_block(self.block, i, self.salt)
6586+        mw.put_encprivkey(self.encprivkey)
6587+        mw.put_blockhashes(self.block_hash_tree)
6588+        mw.put_sharehashes(self.share_hash_chain)
6589+        mw.put_root_hash(self.root_hash)
6590+        mw.put_signature(self.signature)
6591+        mw.put_verification_key(self.verification_key)
6592+        d = mw.finish_publishing()
6593+        d.addCallback(_check_failure)
6594+        d.addCallback(lambda ignored:
6595+            mw.set_checkstring(""))
6596+        d.addCallback(lambda ignored:
6597+            mw.finish_publishing())
6598+        d.addCallback(_check_success)
6599+        return d
6600+
6601+
6602+    def serialize_blockhashes(self, blockhashes):
6603+        return "".join(blockhashes)
6604+
6605+
6606+    def serialize_sharehashes(self, sharehashes):
6607+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
6608+                        for i in sorted(sharehashes.keys())])
6609+        return ret
6610+
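Each share hash chain entry packs to 2 + 32 = 34 bytes, which is why test_write below reads (32 + 2) * 6 bytes for the chain. The inverse of serialize_sharehashes, as a sketch:

    import struct

    def parse_sharehashes(data):
        # split the ">H32s" records back into a {index: hash} dict
        chain = {}
        for off in xrange(0, len(data), 34):
            index, h = struct.unpack(">H32s", data[off:off + 34])
            chain[index] = h
        return chain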
6611+
6612+    def test_write(self):
6613+        # This translates to a file with 6 6-byte segments, and with 2-byte
6614+        # blocks.
6615+        mw = self._make_new_mw("si1", 0)
6616+        # Test writing some blocks.
6617+        read = self.ss.remote_slot_readv
6618+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
6619+        written_block_size = 2 + len(self.salt)
6620+        written_block = self.block + self.salt
6621+        for i in xrange(6):
6622+            mw.put_block(self.block, i, self.salt)
6623+
6624+        mw.put_encprivkey(self.encprivkey)
6625+        mw.put_blockhashes(self.block_hash_tree)
6626+        mw.put_sharehashes(self.share_hash_chain)
6627+        mw.put_root_hash(self.root_hash)
6628+        mw.put_signature(self.signature)
6629+        mw.put_verification_key(self.verification_key)
6630+        d = mw.finish_publishing()
6631+        def _check_publish(results):
6632+            self.failUnlessEqual(len(results), 2)
6633+            result, ign = results
6634+            self.failUnless(result, "publish failed")
6635+            for i in xrange(6):
6636+                self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
6637+                                {0: [written_block]})
6638+
6639+            expected_private_key_offset = expected_sharedata_offset + \
6640+                                      len(written_block) * 6
6641+            self.failUnlessEqual(len(self.encprivkey), 7)
6642+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
6643+                                 {0: [self.encprivkey]})
6644+
6645+            expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
6646+            self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
6647+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
6648+                                 {0: [self.block_hash_tree_s]})
6649+
6650+            expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
6651+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
6652+                                 {0: [self.share_hash_chain_s]})
6653+
6654+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
6655+                                 {0: [self.root_hash]})
6656+            expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
6657+            self.failUnlessEqual(len(self.signature), 9)
6658+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
6659+                                 {0: [self.signature]})
6660+
6661+            expected_verification_key_offset = expected_signature_offset + len(self.signature)
6662+            self.failUnlessEqual(len(self.verification_key), 6)
6663+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
6664+                                 {0: [self.verification_key]})
6665+
6666+            signable = mw.get_signable()
6667+            verno, seq, roothash, k, n, segsize, datalen = \
6668+                                            struct.unpack(">BQ32sBBQQ",
6669+                                                          signable)
6670+            self.failUnlessEqual(verno, 1)
6671+            self.failUnlessEqual(seq, 0)
6672+            self.failUnlessEqual(roothash, self.root_hash)
6673+            self.failUnlessEqual(k, 3)
6674+            self.failUnlessEqual(n, 10)
6675+            self.failUnlessEqual(segsize, 6)
6676+            self.failUnlessEqual(datalen, 36)
6677+            expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
6678+
6679+            # Check the version number to make sure that it is correct.
6680+            expected_version_number = struct.pack(">B", 1)
6681+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
6682+                                 {0: [expected_version_number]})
6683+            # Check the sequence number to make sure that it is correct
6684+            expected_sequence_number = struct.pack(">Q", 0)
6685+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
6686+                                 {0: [expected_sequence_number]})
6687+            # Check that the encoding parameters (k, N, segment size, data
6688+            # length) are what they should be: 3, 10, 6, and 36.
6689+            expected_k = struct.pack(">B", 3)
6690+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
6691+                                 {0: [expected_k]})
6692+            expected_n = struct.pack(">B", 10)
6693+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
6694+                                 {0: [expected_n]})
6695+            expected_segment_size = struct.pack(">Q", 6)
6696+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
6697+                                 {0: [expected_segment_size]})
6698+            expected_data_length = struct.pack(">Q", 36)
6699+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
6700+                                 {0: [expected_data_length]})
6701+            expected_offset = struct.pack(">Q", expected_private_key_offset)
6702+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
6703+                                 {0: [expected_offset]})
6704+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
6705+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
6706+                                 {0: [expected_offset]})
6707+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
6708+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
6709+                                 {0: [expected_offset]})
6710+            expected_offset = struct.pack(">Q", expected_signature_offset)
6711+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
6712+                                 {0: [expected_offset]})
6713+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
6714+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
6715+                                 {0: [expected_offset]})
6716+            expected_offset = struct.pack(">Q", expected_eof_offset)
6717+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
6718+                                 {0: [expected_offset]})
6719+        d.addCallback(_check_publish)
6720+        return d
6721+
6722+    def _make_new_mw(self, si, share, datalength=36):
6723+        # This is a file of size 36 bytes. Since it has a segment
6724+        # size of 6, we know that it has six 6-byte segments, which
6725+        # will be split into blocks of 2 bytes because our FEC k
6726+        # parameter is 3.
6727+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
6728+                                6, datalength)
6729+        return mw
6730+
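The geometry assumed throughout these tests follows directly from the parameters passed to MDMFSlotWriteProxy (seqnum 0, k=3, N=10, segment size 6, data length 36). A quick check of the arithmetic:

    import math

    datalength, segment_size, k = 36, 6, 3
    num_segments = int(math.ceil(datalength / float(segment_size)))  # 6 segments
    block_size = segment_size / k  # each segment FEC-encodes into 2-byte blocks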
6731+
6732+    def test_write_rejected_with_too_many_blocks(self):
6733+        mw = self._make_new_mw("si0", 0)
6734+
6735+        # Try writing too many blocks. We should not be able to
6736+        # write more than 6 blocks into each share.
6738+        d = defer.succeed(None)
6739+        for i in xrange(6):
6740+            d.addCallback(lambda ignored, i=i:
6741+                mw.put_block(self.block, i, self.salt))
6742+        d.addCallback(lambda ignored:
6743+            self.shouldFail(LayoutInvalid, "too many blocks",
6744+                            None,
6745+                            mw.put_block, self.block, 7, self.salt))
6746+        return d
6747+
6748+
6749+    def test_write_rejected_with_invalid_salt(self):
6750+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
6751+        # less should cause an error.
6752+        mw = self._make_new_mw("si1", 0)
6753+        bad_salt = "a" * 17 # 17 bytes
6754+        d = defer.succeed(None)
6755+        d.addCallback(lambda ignored:
6756+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
6757+                            None, mw.put_block, self.block, 0, bad_salt))
6758+        return d
6759+
6760+
6761+    def test_write_rejected_with_invalid_root_hash(self):
6762+        # Try writing an invalid root hash. This should be SHA256d, and
6763+        # 32 bytes long as a result.
6764+        mw = self._make_new_mw("si2", 0)
6765+        # 17 bytes != 32 bytes
6766+        invalid_root_hash = "a" * 17
6767+        d = defer.succeed(None)
6768+        # Before this test can work, we need to put some blocks + salts,
6769+        # a block hash tree, and a share hash tree. Otherwise, we'll see
6770+        # failures that match what we are looking for, but are caused by
6771+        # the constraints imposed on operation ordering.
6772+        for i in xrange(6):
6773+            d.addCallback(lambda ignored, i=i:
6774+                mw.put_block(self.block, i, self.salt))
6775+        d.addCallback(lambda ignored:
6776+            mw.put_encprivkey(self.encprivkey))
6777+        d.addCallback(lambda ignored:
6778+            mw.put_blockhashes(self.block_hash_tree))
6779+        d.addCallback(lambda ignored:
6780+            mw.put_sharehashes(self.share_hash_chain))
6781+        d.addCallback(lambda ignored:
6782+            self.shouldFail(LayoutInvalid, "invalid root hash",
6783+                            None, mw.put_root_hash, invalid_root_hash))
6784+        return d
6785+
6786+
6787+    def test_write_rejected_with_invalid_blocksize(self):
6788+        # The blocksize implied by the writer that we get from
6789+        # _make_new_mw is 2 bytes -- any more or any less than this
6790+        # should cause a failure, unless the block is the tail
6791+        # segment, which is allowed to be shorter.
6792+        invalid_block = "a"
6793+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
6794+                                             # one byte blocks
6795+        # 1 bytes != 2 bytes
6796+        d = defer.succeed(None)
6797+        d.addCallback(lambda ignored, invalid_block=invalid_block:
6798+            self.shouldFail(LayoutInvalid, "test blocksize too small",
6799+                            None, mw.put_block, invalid_block, 0,
6800+                            self.salt))
6801+        invalid_block = invalid_block * 3
6802+        # 3 bytes != 2 bytes
6803+        d.addCallback(lambda ignored:
6804+            self.shouldFail(LayoutInvalid, "test blocksize too large",
6805+                            None,
6806+                            mw.put_block, invalid_block, 0, self.salt))
6807+        for i in xrange(5):
6808+            d.addCallback(lambda ignored, i=i:
6809+                mw.put_block(self.block, i, self.salt))
6810+        # Try to put an invalid tail segment
6811+        d.addCallback(lambda ignored:
6812+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
6813+                            None,
6814+                            mw.put_block, self.block, 5, self.salt))
6815+        valid_block = "a"
6816+        d.addCallback(lambda ignored:
6817+            mw.put_block(valid_block, 5, self.salt))
6818+        return d
6819+
6820+
6821+    def test_write_enforces_order_constraints(self):
6822+        # We require that the MDMFSlotWriteProxy be interacted with in a
6823+        # specific way.
6824+        # That way is:
6825+        # 0: __init__
6826+        # 1: write blocks and salts
6827+        # 2: Write the encrypted private key
6828+        # 3: Write the block hashes
6829+        # 4: Write the share hashes
6830+        # 5: Write the root hash
6831+        # 6: Write the signature and verification key
6832+        # 7: Write the file.
6833+        #
6834+        # Some of these can be performed out-of-order, and some can't.
6835+        # The dependencies that I want to test here are:
6836+        #  - Private key before block hashes
6837+        #  - share hashes and block hashes before root hash
6838+        #  - root hash before signature
6839+        #  - signature before verification key
6840+        mw0 = self._make_new_mw("si0", 0)
6841+        # Write some shares
6842+        d = defer.succeed(None)
6843+        for i in xrange(6):
6844+            d.addCallback(lambda ignored, i=i:
6845+                mw0.put_block(self.block, i, self.salt))
6846+        # Try to write the block hashes before writing the encrypted
6847+        # private key
6848+        d.addCallback(lambda ignored:
6849+            self.shouldFail(LayoutInvalid, "block hashes before key",
6850+                            None, mw0.put_blockhashes,
6851+                            self.block_hash_tree))
6852+
6853+        # Write the private key.
6854+        d.addCallback(lambda ignored:
6855+            mw0.put_encprivkey(self.encprivkey))
6856+
6857+
6858+        # Try to write the share hash chain without writing the block
6859+        # hash tree
6860+        d.addCallback(lambda ignored:
6861+            self.shouldFail(LayoutInvalid, "share hash chain before "
6862+                                           "salt hash tree",
6863+                            None,
6864+                            mw0.put_sharehashes, self.share_hash_chain))
6865+
6866+        # Try to write the root hash without writing either the
6867+        # block hashes or the share hashes
6868+        d.addCallback(lambda ignored:
6869+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6870+                            None,
6871+                            mw0.put_root_hash, self.root_hash))
6872+
6873+        # Now write the block hashes and try again
6874+        d.addCallback(lambda ignored:
6875+            mw0.put_blockhashes(self.block_hash_tree))
6876+
6877+        d.addCallback(lambda ignored:
6878+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6879+                            None, mw0.put_root_hash, self.root_hash))
6880+
6881+        # We haven't yet put the root hash on the share, so we shouldn't
6882+        # be able to sign it.
6883+        d.addCallback(lambda ignored:
6884+            self.shouldFail(LayoutInvalid, "signature before root hash",
6885+                            None, mw0.put_signature, self.signature))
6886+
6887+        d.addCallback(lambda ignored:
6888+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
6889+
6890+        # ...and, since that fails, we also shouldn't be able to put the
6891+        # verification key.
6892+        d.addCallback(lambda ignored:
6893+            self.shouldFail(LayoutInvalid, "key before signature",
6894+                            None, mw0.put_verification_key,
6895+                            self.verification_key))
6896+
6897+        # Now write the share hashes.
6898+        d.addCallback(lambda ignored:
6899+            mw0.put_sharehashes(self.share_hash_chain))
6900+        # We should be able to write the root hash now too
6901+        d.addCallback(lambda ignored:
6902+            mw0.put_root_hash(self.root_hash))
6903+
6904+        # We should still be unable to put the verification key
6905+        d.addCallback(lambda ignored:
6906+            self.shouldFail(LayoutInvalid, "key before signature",
6907+                            None, mw0.put_verification_key,
6908+                            self.verification_key))
6909+
6910+        d.addCallback(lambda ignored:
6911+            mw0.put_signature(self.signature))
6912+
6913+        # We shouldn't be able to write the offsets to the remote server
6914+        # until the offset table is finished; IOW, until we have written
6915+        # the verification key.
6916+        d.addCallback(lambda ignored:
6917+            self.shouldFail(LayoutInvalid, "offsets before verification key",
6918+                            None,
6919+                            mw0.finish_publishing))
6920+
6921+        d.addCallback(lambda ignored:
6922+            mw0.put_verification_key(self.verification_key))
6923+        return d
6924+
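One simple way to enforce ordering like this is a monotonic step counter that each put_* call must neither skip ahead of nor rewind behind. This is only a sketch of the idea (the proxy's real checks are richer), with a stand-in exception class:

    class LayoutInvalid(Exception):
        # stand-in for the LayoutInvalid these tests import
        pass

    class WriteOrder:
        STEPS = ["blocks", "enc_privkey", "block_hash_tree",
                 "share_hash_chain", "root_hash", "signature",
                 "verification_key"]

        def __init__(self):
            self._reached = 0

        def advance(self, step):
            # allow repeating the current step (e.g. several put_block
            # calls) or moving exactly one step forward; reject skips
            # and rewinds
            wanted = self.STEPS.index(step)
            if wanted not in (self._reached, self._reached + 1):
                raise LayoutInvalid("%s written out of order" % step)
            self._reached = wanted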
6925+
6926+    def test_end_to_end(self):
6927+        mw = self._make_new_mw("si1", 0)
6928+        # Write a share using the mutable writer, and make sure that the
6929+        # reader knows how to read everything back to us.
6930+        d = defer.succeed(None)
6931+        for i in xrange(6):
6932+            d.addCallback(lambda ignored, i=i:
6933+                mw.put_block(self.block, i, self.salt))
6934+        d.addCallback(lambda ignored:
6935+            mw.put_encprivkey(self.encprivkey))
6936+        d.addCallback(lambda ignored:
6937+            mw.put_blockhashes(self.block_hash_tree))
6938+        d.addCallback(lambda ignored:
6939+            mw.put_sharehashes(self.share_hash_chain))
6940+        d.addCallback(lambda ignored:
6941+            mw.put_root_hash(self.root_hash))
6942+        d.addCallback(lambda ignored:
6943+            mw.put_signature(self.signature))
6944+        d.addCallback(lambda ignored:
6945+            mw.put_verification_key(self.verification_key))
6946+        d.addCallback(lambda ignored:
6947+            mw.finish_publishing())
6948+
6949+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6950+        def _check_block_and_salt((block, salt)):
6951+            self.failUnlessEqual(block, self.block)
6952+            self.failUnlessEqual(salt, self.salt)
6953+
6954+        for i in xrange(6):
6955+            d.addCallback(lambda ignored, i=i:
6956+                mr.get_block_and_salt(i))
6957+            d.addCallback(_check_block_and_salt)
6958+
6959+        d.addCallback(lambda ignored:
6960+            mr.get_encprivkey())
6961+        d.addCallback(lambda encprivkey:
6962+            self.failUnlessEqual(self.encprivkey, encprivkey))
6963+
6964+        d.addCallback(lambda ignored:
6965+            mr.get_blockhashes())
6966+        d.addCallback(lambda blockhashes:
6967+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6968+
6969+        d.addCallback(lambda ignored:
6970+            mr.get_sharehashes())
6971+        d.addCallback(lambda sharehashes:
6972+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6973+
6974+        d.addCallback(lambda ignored:
6975+            mr.get_signature())
6976+        d.addCallback(lambda signature:
6977+            self.failUnlessEqual(signature, self.signature))
6978+
6979+        d.addCallback(lambda ignored:
6980+            mr.get_verification_key())
6981+        d.addCallback(lambda verification_key:
6982+            self.failUnlessEqual(verification_key, self.verification_key))
6983+
6984+        d.addCallback(lambda ignored:
6985+            mr.get_seqnum())
6986+        d.addCallback(lambda seqnum:
6987+            self.failUnlessEqual(seqnum, 0))
6988+
6989+        d.addCallback(lambda ignored:
6990+            mr.get_root_hash())
6991+        d.addCallback(lambda root_hash:
6992+            self.failUnlessEqual(self.root_hash, root_hash))
6993+
6994+        d.addCallback(lambda ignored:
6995+            mr.get_encoding_parameters())
6996+        def _check_encoding_parameters((k, n, segsize, datalen)):
6997+            self.failUnlessEqual(k, 3)
6998+            self.failUnlessEqual(n, 10)
6999+            self.failUnlessEqual(segsize, 6)
7000+            self.failUnlessEqual(datalen, 36)
7001+        d.addCallback(_check_encoding_parameters)
7002+
7003+        d.addCallback(lambda ignored:
7004+            mr.get_checkstring())
7005+        d.addCallback(lambda checkstring:
7006+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
7007+        return d
7008+
7009+
7010+    def test_is_sdmf(self):
7011+        # The MDMFSlotReadProxy should also know how to read SDMF files,
7012+        # since it will encounter them on the grid. Callers use the
7013+        # is_sdmf method to test this.
7014+        self.write_sdmf_share_to_server("si1")
7015+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7016+        d = mr.is_sdmf()
7017+        d.addCallback(lambda issdmf:
7018+            self.failUnless(issdmf))
7019+        return d
7020+
7021+
7022+    def test_reads_sdmf(self):
7023+        # The slot read proxy should, naturally, know how to tell us
7024+        # about data in the SDMF format
7025+        self.write_sdmf_share_to_server("si1")
7026+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7027+        d = defer.succeed(None)
7028+        d.addCallback(lambda ignored:
7029+            mr.is_sdmf())
7030+        d.addCallback(lambda issdmf:
7031+            self.failUnless(issdmf))
7032+
7033+        # What do we need to read?
7034+        #  - The sharedata
7035+        #  - The salt
7036+        d.addCallback(lambda ignored:
7037+            mr.get_block_and_salt(0))
7038+        def _check_block_and_salt(results):
7039+            block, salt = results
7040+            # Our original file is 36 bytes long, so each share is 12
7041+            # bytes in size. The share is composed entirely of the
7042+            # letter a; self.block contains two a's, so 6 * self.block
7043+            # is what we expect to get back.
7044+            self.failUnlessEqual(block, self.block * 6)
7045+            self.failUnlessEqual(salt, self.salt)
7046+        d.addCallback(_check_block_and_salt)
7047+
7048+        #  - The blockhashes
7049+        d.addCallback(lambda ignored:
7050+            mr.get_blockhashes())
7051+        d.addCallback(lambda blockhashes:
7052+            self.failUnlessEqual(self.block_hash_tree,
7053+                                 blockhashes,
7054+                                 blockhashes))
7055+        #  - The sharehashes
7056+        d.addCallback(lambda ignored:
7057+            mr.get_sharehashes())
7058+        d.addCallback(lambda sharehashes:
7059+            self.failUnlessEqual(self.share_hash_chain,
7060+                                 sharehashes))
7061+        #  - The keys
7062+        d.addCallback(lambda ignored:
7063+            mr.get_encprivkey())
7064+        d.addCallback(lambda encprivkey:
7065+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
7066+        d.addCallback(lambda ignored:
7067+            mr.get_verification_key())
7068+        d.addCallback(lambda verification_key:
7069+            self.failUnlessEqual(verification_key,
7070+                                 self.verification_key,
7071+                                 verification_key))
7072+        #  - The signature
7073+        d.addCallback(lambda ignored:
7074+            mr.get_signature())
7075+        d.addCallback(lambda signature:
7076+            self.failUnlessEqual(signature, self.signature, signature))
7077+
7078+        #  - The sequence number
7079+        d.addCallback(lambda ignored:
7080+            mr.get_seqnum())
7081+        d.addCallback(lambda seqnum:
7082+            self.failUnlessEqual(seqnum, 0, seqnum))
7083+
7084+        #  - The root hash
7085+        d.addCallback(lambda ignored:
7086+            mr.get_root_hash())
7087+        d.addCallback(lambda root_hash:
7088+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
7089+        return d
7090+
7091+
7092+    def test_only_reads_one_segment_sdmf(self):
7093+        # SDMF shares have only one segment, so it doesn't make sense to
7094+        # read more segments than that. The reader should know this and
7095+        # complain if we try to do that.
7096+        self.write_sdmf_share_to_server("si1")
7097+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7098+        d = defer.succeed(None)
7099+        d.addCallback(lambda ignored:
7100+            mr.is_sdmf())
7101+        d.addCallback(lambda issdmf:
7102+            self.failUnless(issdmf))
7103+        d.addCallback(lambda ignored:
7104+            self.shouldFail(LayoutInvalid, "test bad segment",
7105+                            None,
7106+                            mr.get_block_and_salt, 1))
7107+        return d
7108+
7109+
7110+    def test_read_with_prefetched_mdmf_data(self):
7111+        # The MDMFSlotReadProxy will prefill certain fields if you pass
7112+        # it data that you have already fetched. This is useful for
7113+        # cases like the Servermap, which prefetches ~2kb of data while
7114+        # finding out which shares are on the remote peer so that it
7115+        # doesn't waste round trips.
7116+        mdmf_data = self.build_test_mdmf_share()
7117+        self.write_test_share_to_server("si1")
7118+        def _make_mr(ignored, length):
7119+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
7120+            return mr
7121+
7122+        d = defer.succeed(None)
7123+        # This should be enough to fill in both the encoding parameters
7124+        # and the table of offsets, which will complete the version
7125+        # information tuple.
7126+        d.addCallback(_make_mr, 107)
7127+        d.addCallback(lambda mr:
7128+            mr.get_verinfo())
7129+        def _check_verinfo(verinfo):
7130+            self.failUnless(verinfo)
7131+            self.failUnlessEqual(len(verinfo), 9)
7132+            (seqnum,
7133+             root_hash,
7134+             salt_hash,
7135+             segsize,
7136+             datalen,
7137+             k,
7138+             n,
7139+             prefix,
7140+             offsets) = verinfo
7141+            self.failUnlessEqual(seqnum, 0)
7142+            self.failUnlessEqual(root_hash, self.root_hash)
7143+            self.failUnlessEqual(segsize, 6)
7144+            self.failUnlessEqual(datalen, 36)
7145+            self.failUnlessEqual(k, 3)
7146+            self.failUnlessEqual(n, 10)
7147+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
7148+                                          1,
7149+                                          seqnum,
7150+                                          root_hash,
7151+                                          k,
7152+                                          n,
7153+                                          segsize,
7154+                                          datalen)
7155+            self.failUnlessEqual(expected_prefix, prefix)
7156+            self.failUnlessEqual(self.rref.read_count, 0)
7157+        d.addCallback(_check_verinfo)
7158+        # This is not enough data to read a block and a salt, so the
7159+        # wrapper should attempt to read this from the remote server.
7160+        d.addCallback(_make_mr, 107)
7161+        d.addCallback(lambda mr:
7162+            mr.get_block_and_salt(0))
7163+        def _check_block_and_salt((block, salt)):
7164+            self.failUnlessEqual(block, self.block)
7165+            self.failUnlessEqual(salt, self.salt)
7166+            self.failUnlessEqual(self.rref.read_count, 1)
7167+        # This should be enough data to read one block.
7168+        d.addCallback(_make_mr, 249)
7169+        d.addCallback(lambda mr:
7170+            mr.get_block_and_salt(0))
7171+        d.addCallback(_check_block_and_salt)
7172+        return d
7173+
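The prefill behaviour being tested amounts to a read-through cache over a prefix of the share: satisfy a request from the prefetched bytes when it fits entirely, otherwise fall back to a remote read (which bumps read_count). A sketch of that decision, with hypothetical names:

    def read_range(prefix, remote_read, offset, length):
        # prefix: bytes the caller prefetched (may be shorter than needed)
        # remote_read: callable that performs a real server round trip
        if offset + length <= len(prefix):
            return prefix[offset:offset + length]
        return remote_read(offset, length)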
7174+
7175+    def test_read_with_prefetched_sdmf_data(self):
7176+        sdmf_data = self.build_test_sdmf_share()
7177+        self.write_sdmf_share_to_server("si1")
7178+        def _make_mr(ignored, length):
7179+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7180+            return mr
7181+
7182+        d = defer.succeed(None)
7183+        # This should be enough to get us the encoding parameters,
7184+        # offset table, and everything else we need to build a verinfo
7185+        # string.
7186+        d.addCallback(_make_mr, 107)
7187+        d.addCallback(lambda mr:
7188+            mr.get_verinfo())
7189+        def _check_verinfo(verinfo):
7190+            self.failUnless(verinfo)
7191+            self.failUnlessEqual(len(verinfo), 9)
7192+            (seqnum,
7193+             root_hash,
7194+             salt,
7195+             segsize,
7196+             datalen,
7197+             k,
7198+             n,
7199+             prefix,
7200+             offsets) = verinfo
7201+            self.failUnlessEqual(seqnum, 0)
7202+            self.failUnlessEqual(root_hash, self.root_hash)
7203+            self.failUnlessEqual(salt, self.salt)
7204+            self.failUnlessEqual(segsize, 36)
7205+            self.failUnlessEqual(datalen, 36)
7206+            self.failUnlessEqual(k, 3)
7207+            self.failUnlessEqual(n, 10)
7208+            expected_prefix = struct.pack(SIGNED_PREFIX,
7209+                                          0,
7210+                                          seqnum,
7211+                                          root_hash,
7212+                                          salt,
7213+                                          k,
7214+                                          n,
7215+                                          segsize,
7216+                                          datalen)
7217+            self.failUnlessEqual(expected_prefix, prefix)
7218+            self.failUnlessEqual(self.rref.read_count, 0)
7219+        d.addCallback(_check_verinfo)
7220+        # This shouldn't be enough to read any share data.
7221+        d.addCallback(_make_mr, 107)
7222+        d.addCallback(lambda mr:
7223+            mr.get_block_and_salt(0))
7224+        def _check_block_and_salt((block, salt)):
7225+            self.failUnlessEqual(block, self.block * 6)
7226+            self.failUnlessEqual(salt, self.salt)
7227+            # TODO: Fix the read routine so that it reads only the data
7228+            #       that it has cached if it can't read all of it.
7229+            self.failUnlessEqual(self.rref.read_count, 2)
7230+
7231+        # This should be enough to read share data.
7232+        d.addCallback(_make_mr, self.offsets['share_data'])
7233+        d.addCallback(lambda mr:
7234+            mr.get_block_and_salt(0))
7235+        d.addCallback(_check_block_and_salt)
7236+        return d
7237+
7238+
7239+    def test_read_with_empty_mdmf_file(self):
7240+        # Some tests upload a file with no contents to test things
7241+        # unrelated to the actual handling of the content of the file.
7242+        # The reader should behave intelligently in these cases.
7243+        self.write_test_share_to_server("si1", empty=True)
7244+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7245+        # We should be able to get the encoding parameters, and they
7246+        # should be correct.
7247+        d = defer.succeed(None)
7248+        d.addCallback(lambda ignored:
7249+            mr.get_encoding_parameters())
7250+        def _check_encoding_parameters(params):
7251+            self.failUnlessEqual(len(params), 4)
7252+            k, n, segsize, datalen = params
7253+            self.failUnlessEqual(k, 3)
7254+            self.failUnlessEqual(n, 10)
7255+            self.failUnlessEqual(segsize, 0)
7256+            self.failUnlessEqual(datalen, 0)
7257+        d.addCallback(_check_encoding_parameters)
7258+
7259+        # We should not be able to fetch a block, since there are no
7260+        # blocks to fetch
7261+        d.addCallback(lambda ignored:
7262+            self.shouldFail(LayoutInvalid, "get block on empty file",
7263+                            None,
7264+                            mr.get_block_and_salt, 0))
7265+        return d
7266+
7267+
7268+    def test_read_with_empty_sdmf_file(self):
7269+        self.write_sdmf_share_to_server("si1", empty=True)
7270+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7271+        # We should be able to get the encoding parameters, and they
7272+        # should be correct
7273+        d = defer.succeed(None)
7274+        d.addCallback(lambda ignored:
7275+            mr.get_encoding_parameters())
7276+        def _check_encoding_parameters(params):
7277+            self.failUnlessEqual(len(params), 4)
7278+            k, n, segsize, datalen = params
7279+            self.failUnlessEqual(k, 3)
7280+            self.failUnlessEqual(n, 10)
7281+            self.failUnlessEqual(segsize, 0)
7282+            self.failUnlessEqual(datalen, 0)
7283+        d.addCallback(_check_encoding_parameters)
7284+
7285+        # An empty file has no block data, so it does not make sense
7286+        # to get a block, and we should not be able to.
7287+        d.addCallback(lambda ignored:
7288+            self.shouldFail(LayoutInvalid, "get block on an empty file",
7289+                            None,
7290+                            mr.get_block_and_salt, 0))
7291+        return d
7292+
7293+
7294+    def test_verinfo_with_sdmf_file(self):
7295+        self.write_sdmf_share_to_server("si1")
7296+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7297+        # We should be able to get the version information.
7298+        d = defer.succeed(None)
7299+        d.addCallback(lambda ignored:
7300+            mr.get_verinfo())
7301+        def _check_verinfo(verinfo):
7302+            self.failUnless(verinfo)
7303+            self.failUnlessEqual(len(verinfo), 9)
7304+            (seqnum,
7305+             root_hash,
7306+             salt,
7307+             segsize,
7308+             datalen,
7309+             k,
7310+             n,
7311+             prefix,
7312+             offsets) = verinfo
7313+            self.failUnlessEqual(seqnum, 0)
7314+            self.failUnlessEqual(root_hash, self.root_hash)
7315+            self.failUnlessEqual(salt, self.salt)
7316+            self.failUnlessEqual(segsize, 36)
7317+            self.failUnlessEqual(datalen, 36)
7318+            self.failUnlessEqual(k, 3)
7319+            self.failUnlessEqual(n, 10)
7320+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
7321+                                          0,
7322+                                          seqnum,
7323+                                          root_hash,
7324+                                          salt,
7325+                                          k,
7326+                                          n,
7327+                                          segsize,
7328+                                          datalen)
7329+            self.failUnlessEqual(prefix, expected_prefix)
7330+            self.failUnlessEqual(offsets, self.offsets)
7331+        d.addCallback(_check_verinfo)
7332+        return d
7333+
7334+
7335+    def test_verinfo_with_mdmf_file(self):
7336+        self.write_test_share_to_server("si1")
7337+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7338+        d = defer.succeed(None)
7339+        d.addCallback(lambda ignored:
7340+            mr.get_verinfo())
7341+        def _check_verinfo(verinfo):
7342+            self.failUnless(verinfo)
7343+            self.failUnlessEqual(len(verinfo), 9)
7344+            (seqnum,
7345+             root_hash,
7346+             IV,
7347+             segsize,
7348+             datalen,
7349+             k,
7350+             n,
7351+             prefix,
7352+             offsets) = verinfo
7353+            self.failUnlessEqual(seqnum, 0)
7354+            self.failUnlessEqual(root_hash, self.root_hash)
7355+            self.failIf(IV)
7356+            self.failUnlessEqual(segsize, 6)
7357+            self.failUnlessEqual(datalen, 36)
7358+            self.failUnlessEqual(k, 3)
7359+            self.failUnlessEqual(n, 10)
7360+            expected_prefix = struct.pack(">BQ32s BBQQ",
7361+                                          1,
7362+                                          seqnum,
7363+                                          root_hash,
7364+                                          k,
7365+                                          n,
7366+                                          segsize,
7367+                                          datalen)
7368+            self.failUnlessEqual(prefix, expected_prefix)
7369+            self.failUnlessEqual(offsets, self.offsets)
7370+        d.addCallback(_check_verinfo)
7371+        return d
7372+
7373+
7374+    def test_reader_queue(self):
7375+        self.write_test_share_to_server('si1')
7376+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7377+        d1 = mr.get_block_and_salt(0, queue=True)
7378+        d2 = mr.get_blockhashes(queue=True)
7379+        d3 = mr.get_sharehashes(queue=True)
7380+        d4 = mr.get_signature(queue=True)
7381+        d5 = mr.get_verification_key(queue=True)
7382+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
7383+        mr.flush()
7384+        def _print(results):
7385+            self.failUnlessEqual(len(results), 5)
7386+            # We have one read for version information and offsets, and
7387+            # one for everything else.
7388+            self.failUnlessEqual(self.rref.read_count, 2)
7389+            block, salt = results[0][1] # each DeferredList entry is a
7390+                                        # (success, result) pair; [1]
7391+                                        # is the value we want.
7392+            self.failUnlessEqual(self.block, block)
7393+            self.failUnlessEqual(self.salt, salt)
7394+
7395+            blockhashes = results[1][1]
7396+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
7397+
7398+            sharehashes = results[2][1]
7399+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
7400+
7401+            signature = results[3][1]
7402+            self.failUnlessEqual(self.signature, signature)
7403+
7404+            verification_key = results[4][1]
7405+            self.failUnlessEqual(self.verification_key, verification_key)
7406+        dl.addCallback(_print)
7407+        return dl
7408+
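Queued reads like these coalesce several (offset, length) requests into a single remote_slot_readv call, which is why only two round trips are counted above. A sketch of the batching idea (not the proxy's actual queue implementation):

    class ReadBatcher:
        def __init__(self, remote_readv):
            self._remote_readv = remote_readv  # e.g. ss.remote_slot_readv
            self._pending = []                 # queued (offset, length) pairs

        def enqueue(self, offset, length):
            self._pending.append((offset, length))

        def flush(self, storage_index, shnums):
            # one round trip answers every queued request
            readv, self._pending = self._pending, []
            return self._remote_readv(storage_index, shnums, readv)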
7409+
7410+    def test_sdmf_writer(self):
7411+        # Go through the motions of writing an SDMF share to the storage
7412+        # server. Then read the storage server to see that the share got
7413+        # written in the way that we think it should have.
7414+
7415+        # We do this first so that the necessary instance variables get
7416+        # set the way we want them for the tests below.
7417+        data = self.build_test_sdmf_share()
7418+        sdmfr = SDMFSlotWriteProxy(0,
7419+                                   self.rref,
7420+                                   "si1",
7421+                                   self.secrets,
7422+                                   0, 3, 10, 36, 36)
7423+        # Put the block and salt.
7424+        sdmfr.put_block(self.blockdata, 0, self.salt)
7425+
7426+        # Put the encprivkey
7427+        sdmfr.put_encprivkey(self.encprivkey)
7428+
7429+        # Put the block and share hash chains
7430+        sdmfr.put_blockhashes(self.block_hash_tree)
7431+        sdmfr.put_sharehashes(self.share_hash_chain)
7432+        sdmfr.put_root_hash(self.root_hash)
7433+
7434+        # Put the signature
7435+        sdmfr.put_signature(self.signature)
7436+
7437+        # Put the verification key
7438+        sdmfr.put_verification_key(self.verification_key)
7439+
7440+        # Now check to make sure that nothing has been written yet.
7441+        self.failUnlessEqual(self.rref.write_count, 0)
7442+
7443+        # Now finish publishing
7444+        d = sdmfr.finish_publishing()
7445+        def _then(ignored):
7446+            self.failUnlessEqual(self.rref.write_count, 1)
7447+            read = self.ss.remote_slot_readv
7448+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
7449+                                 {0: [data]})
7450+        d.addCallback(_then)
7451+        return d
7452+
7453+
7454+    def test_sdmf_writer_preexisting_share(self):
7455+        data = self.build_test_sdmf_share()
7456+        self.write_sdmf_share_to_server("si1")
7457+
7458+        # Now there is a share on the storage server. To successfully
7459+        # write, we need to set the checkstring correctly. When we
7460+        # don't, no write should occur.
7461+        sdmfw = SDMFSlotWriteProxy(0,
7462+                                   self.rref,
7463+                                   "si1",
7464+                                   self.secrets,
7465+                                   1, 3, 10, 36, 36)
7466+        sdmfw.put_block(self.blockdata, 0, self.salt)
7467+
7468+        # Put the encprivkey
7469+        sdmfw.put_encprivkey(self.encprivkey)
7470+
7471+        # Put the block and share hash chains
7472+        sdmfw.put_blockhashes(self.block_hash_tree)
7473+        sdmfw.put_sharehashes(self.share_hash_chain)
7474+
7475+        # Put the root hash
7476+        sdmfw.put_root_hash(self.root_hash)
7477+
7478+        # Put the signature
7479+        sdmfw.put_signature(self.signature)
7480+
7481+        # Put the verification key
7482+        sdmfw.put_verification_key(self.verification_key)
7483+
7484+        # We shouldn't have a checkstring yet
7485+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
7486+
7487+        d = sdmfw.finish_publishing()
7488+        def _then(results):
7489+            self.failIf(results[0])
7490+            # this is the correct checkstring
7491+            self._expected_checkstring = results[1][0][0]
7492+            return self._expected_checkstring
7493+
7494+        d.addCallback(_then)
7495+        d.addCallback(sdmfw.set_checkstring)
7496+        d.addCallback(lambda ignored:
7497+            sdmfw.get_checkstring())
7498+        d.addCallback(lambda checkstring:
7499+            self.failUnlessEqual(checkstring, self._expected_checkstring))
7500+        d.addCallback(lambda ignored:
7501+            sdmfw.finish_publishing())
7502+        def _then_again(results):
7503+            self.failUnless(results[0])
7504+            read = self.ss.remote_slot_readv
7505+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7506+                                 {0: [struct.pack(">Q", 1)]})
7507+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
7508+                                 {0: [data[9:]]})
7509+        d.addCallback(_then_again)
7510+        return d
7511+
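+    # A rough sketch of why the reads above start at offset 1 (hedged;
+    # the offsets are inferred from the struct.pack(">Q", 1) check): the
+    # SDMF share begins with a one-byte version number (0 for SDMF),
+    # followed by the 8-byte big-endian seqnum that the checkstring
+    # logic keys on, so read("si1", [0], [(1, 8)]) should return the new
+    # seqnum (1) once the second finish_publishing() call succeeds.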
7512+
7513 class Stats(unittest.TestCase):
7514 
7515     def setUp(self):
7516}
7517[mutable/retrieve.py: Modify the retrieval process to support MDMF
7518Kevan Carstensen <kevan@isnotajoke.com>**20100819003409
7519 Ignore-this: c03f4e41aaa0366a9bf44847f2caf9db
7520 
7521 The logic behind a mutable file download had to be adapted to work with
7522 segmented mutable files; this patch performs those adaptations. It also
7523 exposes some decoding and decrypting functionality to make partial-file
7524 updates a little easier, and supports efficient random-access downloads
7525 of parts of an MDMF file.
7526] {
7527hunk ./src/allmydata/mutable/retrieve.py 2
7528 
7529-import struct, time
7530+import time
7531 from itertools import count
7532 from zope.interface import implements
7533 from twisted.internet import defer
7534merger 0.0 (
7535hunk ./src/allmydata/mutable/retrieve.py 10
7536+from allmydata.util.dictutil import DictOfSets
7537hunk ./src/allmydata/mutable/retrieve.py 7
7538-from foolscap.api import DeadReferenceError, eventually, fireEventually
7539-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
7540-from allmydata.util import hashutil, idlib, log
7541+from twisted.internet.interfaces import IPushProducer, IConsumer
7542+from foolscap.api import eventually, fireEventually
7543+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
7544+                                 MDMF_VERSION, SDMF_VERSION
7545+from allmydata.util import hashutil, log, mathutil
7546)
7547hunk ./src/allmydata/mutable/retrieve.py 16
7548 from pycryptopp.publickey import rsa
7549 
7550 from allmydata.mutable.common import CorruptShareError, UncoordinatedWriteError
7551-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
7552+from allmydata.mutable.layout import MDMFSlotReadProxy
7553 
7554 class RetrieveStatus:
7555     implements(IRetrieveStatus)
7556hunk ./src/allmydata/mutable/retrieve.py 83
7557     # times, and each will have a separate response chain. However the
7558     # Retrieve object will remain tied to a specific version of the file, and
7559     # will use a single ServerMap instance.
7560+    implements(IPushProducer)
7561 
7562hunk ./src/allmydata/mutable/retrieve.py 85
7563-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
7564+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
7565+                 verify=False):
7566         self._node = filenode
7567         assert self._node.get_pubkey()
7568         self._storage_index = filenode.get_storage_index()
7569hunk ./src/allmydata/mutable/retrieve.py 104
7570         self.verinfo = verinfo
7571         # during repair, we may be called upon to grab the private key, since
7572         # it wasn't picked up during a verify=False checker run, and we'll
7573-        # need it for repair to generate the a new version.
7574-        self._need_privkey = fetch_privkey
7575-        if self._node.get_privkey():
7576+        # need it for repair to generate a new version.
7577+        self._need_privkey = fetch_privkey or verify
7578+        if self._node.get_privkey() and not verify:
7579             self._need_privkey = False
7580 
7581hunk ./src/allmydata/mutable/retrieve.py 109
7582+        if self._need_privkey:
7583+            # TODO: Evaluate the need for this. We'll use it if we want
7584+            # to limit how many queries are on the wire for the privkey
7585+            # at once.
7586+            self._privkey_query_markers = [] # one Marker for each time we've
7587+                                             # tried to get the privkey.
7588+
7589+        # verify means that we are using the downloader logic to verify all
7590+        # of our shares. This tells the downloader a few things.
7591+        #
7592+        # 1. We need to download all of the shares.
7593+        # 2. We don't need to decode or decrypt the shares, since our
7594+        #    caller doesn't care about the plaintext, only the
7595+        #    information about which shares are or are not valid.
7596+        # 3. When we are validating readers, we need to validate the
7597+        #    signature on the prefix. Do we? We already do this in the
7598+        #    servermap update?
7599+        self._verify = False
7600+        if verify:
7601+            self._verify = True
7602+
7603         self._status = RetrieveStatus()
7604         self._status.set_storage_index(self._storage_index)
7605         self._status.set_helper(False)
7606hunk ./src/allmydata/mutable/retrieve.py 139
7607          offsets_tuple) = self.verinfo
7608         self._status.set_size(datalength)
7609         self._status.set_encoding(k, N)
7610+        self.readers = {}
7611+        self._paused = False
7612+        self._paused_deferred = None
7613+        self._offset = None
7614+        self._read_length = None
7615+        self.log("got seqnum %d" % self.verinfo[0])
7616+
7617 
7618     def get_status(self):
7619         return self._status
7620hunk ./src/allmydata/mutable/retrieve.py 157
7621             kwargs["facility"] = "tahoe.mutable.retrieve"
7622         return log.msg(*args, **kwargs)
7623 
7624-    def download(self):
7625+
7626+    ###################
7627+    # IPushProducer
7628+
7629+    def pauseProducing(self):
7630+        """
7631+        I am called by my download target if we have produced too much
7632+        data for it to handle. I make the downloader stop producing new
7633+        data until my resumeProducing method is called.
7634+        """
7635+        if self._paused:
7636+            return
7637+
7638+        self._old_status = self._status.get_status()
7639+        self._status.set_status("Paused")
7640+
7641+        # fired when the download is unpaused.
7642+        self._pause_deferred = defer.Deferred()
7643+        self._paused = True
7644+
7645+
7646+    def resumeProducing(self):
7647+        """
7648+        I am called by my download target once it is ready to begin
7649+        receiving data again.
7650+        """
7651+        if not self._paused:
7652+            return
7653+
7654+        self._paused = False
7655+        p = self._pause_deferred
7656+        self._pause_deferred = None
7657+        self._status.set_status(self._old_status)
7658+
7659+        eventually(p.callback, None)
7660+
7661+
7662+    def _check_for_paused(self, res):
7663+        """
7664+        I am called just before a write to the consumer. I return a
7665+        Deferred that eventually fires with the data that is to be
7666+        written to the consumer. If the download has not been paused,
7667+        the Deferred fires immediately. Otherwise, the Deferred fires
7668+        when the downloader is unpaused.
7669+        """
7670+        if self._paused:
7671+            d = defer.Deferred()
7672+            self._pause_deferred.addCallback(lambda ignored: d.callback(res))
7673+            return d
7674+        return defer.succeed(res)
7675+
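+    # (illustrative) The pause machinery above acts as a simple gate:
+    # the download code threads _check_for_paused into its callback
+    # chain just before delivering data to the consumer, e.g.
+    #
+    #   d.addCallback(self._check_for_paused)
+    #   d.addCallback(self._set_segment)
+    #
+    # so a paused download parks each pending write on
+    # self._pause_deferred until resumeProducing() fires it.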
7676+
7677+    def download(self, consumer=None, offset=0, size=None):
7678+        assert IConsumer.providedBy(consumer) or self._verify
7679+
7680+        if consumer:
7681+            self._consumer = consumer
7682+            # we provide IPushProducer, so streaming=True, per
7683+            # IConsumer.
7684+            self._consumer.registerProducer(self, streaming=True)
7685+
7686         self._done_deferred = defer.Deferred()
7687         self._started = time.time()
7688         self._status.set_status("Retrieving Shares")
7689hunk ./src/allmydata/mutable/retrieve.py 222
7690 
7691+        self._offset = offset
7692+        self._read_length = size
7693+
7694         # first, which servers can we use?
7695         versionmap = self.servermap.make_versionmap()
7696         shares = versionmap[self.verinfo]
7697hunk ./src/allmydata/mutable/retrieve.py 232
7698         self.remaining_sharemap = DictOfSets()
7699         for (shnum, peerid, timestamp) in shares:
7700             self.remaining_sharemap.add(shnum, peerid)
7701+            # If the servermap update fetched anything, it fetched at least 1
7702+            # KiB, so we ask for that much.
7703+            # TODO: Change the cache methods to allow us to fetch all of the
7704+            # data that they have, then change this method to do that.
7705+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
7706+                                                               shnum,
7707+                                                               0,
7708+                                                               1000)
7709+            ss = self.servermap.connections[peerid]
7710+            reader = MDMFSlotReadProxy(ss,
7711+                                       self._storage_index,
7712+                                       shnum,
7713+                                       any_cache)
7714+            reader.peerid = peerid
7715+            self.readers[shnum] = reader
7716+
7717 
7718         self.shares = {} # maps shnum to validated blocks
7719hunk ./src/allmydata/mutable/retrieve.py 250
7720+        self._active_readers = [] # list of active readers for this dl.
7721+        self._validated_readers = set() # set of readers that we have
7722+                                        # validated the prefix of
7723+        self._block_hash_trees = {} # shnum => hashtree
7724 
7725         # how many shares do we need?
7726hunk ./src/allmydata/mutable/retrieve.py 256
7727-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7728+        (seqnum,
7729+         root_hash,
7730+         IV,
7731+         segsize,
7732+         datalength,
7733+         k,
7734+         N,
7735+         prefix,
7736          offsets_tuple) = self.verinfo
7737hunk ./src/allmydata/mutable/retrieve.py 265
7738-        assert len(self.remaining_sharemap) >= k
7739-        # we start with the lowest shnums we have available, since FEC is
7740-        # faster if we're using "primary shares"
7741-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
7742-        for shnum in self.active_shnums:
7743-            # we use an arbitrary peer who has the share. If shares are
7744-            # doubled up (more than one share per peer), we could make this
7745-            # run faster by spreading the load among multiple peers. But the
7746-            # algorithm to do that is more complicated than I want to write
7747-            # right now, and a well-provisioned grid shouldn't have multiple
7748-            # shares per peer.
7749-            peerid = list(self.remaining_sharemap[shnum])[0]
7750-            self.get_data(shnum, peerid)
7751 
7752hunk ./src/allmydata/mutable/retrieve.py 266
7753-        # control flow beyond this point: state machine. Receiving responses
7754-        # from queries is the input. We might send out more queries, or we
7755-        # might produce a result.
7756 
7757hunk ./src/allmydata/mutable/retrieve.py 267
7758+        # We need one share hash tree for the entire file; its leaves
7759+        # are the roots of the block hash trees for the shares that
7760+        # comprise it, and its root is in the verinfo.
7761+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
7762+        self.share_hash_tree.set_hashes({0: root_hash})
7763+
7764+        # This will set up both the segment decoder and the tail segment
7765+        # decoder, as well as a variety of other instance variables that
7766+        # the download process will use.
7767+        self._setup_encoding_parameters()
7768+        assert len(self.remaining_sharemap) >= k
7769+
7770+        self.log("starting download")
7771+        self._paused = False
7772+        self._started_fetching = time.time()
7773+
7774+        self._add_active_peers()
7775+        # The download process beyond this is a state machine.
7776+        # _add_active_peers will select the peers that we want to use
7777+        # for the download, and then attempt to start downloading. After
7778+        # each segment, it will check for doneness, reacting to broken
7779+        # peers and corrupt shares as necessary. If it runs out of good
7780+        # peers before downloading all of the segments, _done_deferred
7781+        # will errback.  Otherwise, it will eventually callback with the
7782+        # contents of the mutable file.
7783         return self._done_deferred
7784 
7785hunk ./src/allmydata/mutable/retrieve.py 294
7786-    def get_data(self, shnum, peerid):
7787-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
7788-                 shnum=shnum,
7789-                 peerid=idlib.shortnodeid_b2a(peerid),
7790-                 level=log.NOISY)
7791-        ss = self.servermap.connections[peerid]
7792-        started = time.time()
7793-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7794+
7795+    def decode(self, blocks_and_salts, segnum):
7796+        """
7797+        I am a helper method that the mutable file update process uses
7798+        as a shortcut to decode and decrypt the segments that it needs
7799+        to fetch in order to perform a file update. I take in a
7800+        collection of blocks and salts, and pick some of those to make a
7801+        segment with. I return the plaintext associated with that
7802+        segment.
7803+        """
7804+        # shnum => block hash tree. Unused, but _setup_encoding_parameters will
7805+        # want to set this.
7806+        # XXX: Make it so that it won't set this if we're just decoding.
7807+        self._block_hash_trees = {}
7808+        self._setup_encoding_parameters()
7809+        # This is the form that _decode_blocks expects (the shape it
+        # normally receives from a DeferredList).
7810+        blocks_and_salts = blocks_and_salts.items()
7811+        blocks_and_salts = [(True, [d]) for d in blocks_and_salts]
7812+
7813+        d = self._decode_blocks(blocks_and_salts, segnum)
7814+        d.addCallback(self._decrypt_segment)
7815+        return d
7816+
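+    # (hedged) decode() expects a dict mapping shnum to a (block, salt)
+    # tuple; the comprehension above wraps each item as
+    # (True, [(shnum, (block, salt))]) so that _decode_blocks sees the
+    # same (success, results) structure it normally receives from a
+    # DeferredList.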
7817+
7818+    def _setup_encoding_parameters(self):
7819+        """
7820+        I set up the encoding parameters, including k, n, the number
7821+        of segments associated with this file, and the segment decoder.
7822+        """
7823+        (seqnum,
7824+         root_hash,
7825+         IV,
7826+         segsize,
7827+         datalength,
7828+         k,
7829+         n,
7830+         known_prefix,
7831          offsets_tuple) = self.verinfo
7832hunk ./src/allmydata/mutable/retrieve.py 332
7833-        offsets = dict(offsets_tuple)
7834+        self._required_shares = k
7835+        self._total_shares = n
7836+        self._segment_size = segsize
7837+        self._data_length = datalength
7838 
7839hunk ./src/allmydata/mutable/retrieve.py 337
7840-        # we read the checkstring, to make sure that the data we grab is from
7841-        # the right version.
7842-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
7843+        if not IV:
7844+            self._version = MDMF_VERSION
7845+        else:
7846+            self._version = SDMF_VERSION
7847 
7848hunk ./src/allmydata/mutable/retrieve.py 342
7849-        # We also read the data, and the hashes necessary to validate them
7850-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
7851-        # signature or the pubkey, since that was handled during the
7852-        # servermap phase, and we'll be comparing the share hash chain
7853-        # against the roothash that was validated back then.
7854+        if datalength and segsize:
7855+            self._num_segments = mathutil.div_ceil(datalength, segsize)
7856+            self._tail_data_size = datalength % segsize
7857+        else:
7858+            self._num_segments = 0
7859+            self._tail_data_size = 0
7860 
7861hunk ./src/allmydata/mutable/retrieve.py 349
7862-        readv.append( (offsets['share_hash_chain'],
7863-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
7864+        self._segment_decoder = codec.CRSDecoder()
7865+        self._segment_decoder.set_params(segsize, k, n)
7866 
7867hunk ./src/allmydata/mutable/retrieve.py 352
7868-        # if we need the private key (for repair), we also fetch that
7869-        if self._need_privkey:
7870-            readv.append( (offsets['enc_privkey'],
7871-                           offsets['EOF'] - offsets['enc_privkey']) )
7872+        if not self._tail_data_size:
7873+            self._tail_data_size = segsize
7874+
7875+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
7876+                                                         self._required_shares)
7877+        if self._tail_segment_size == self._segment_size:
7878+            self._tail_decoder = self._segment_decoder
7879+        else:
7880+            self._tail_decoder = codec.CRSDecoder()
7881+            self._tail_decoder.set_params(self._tail_segment_size,
7882+                                          self._required_shares,
7883+                                          self._total_shares)
7884 
7885hunk ./src/allmydata/mutable/retrieve.py 365
7886-        m = Marker()
7887-        self._outstanding_queries[m] = (peerid, shnum, started)
7888+        self.log("got encoding parameters: "
7889+                 "k: %d "
7890+                 "n: %d "
7891+                 "%d segments of %d bytes each (%d byte tail segment)" % \
7892+                 (k, n, self._num_segments, self._segment_size,
7893+                  self._tail_segment_size))
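+        # (worked example, using the sizes from the test suite: k=3,
+        # n=10, segsize=6, datalength=36. Then num_segments =
+        # div_ceil(36, 6) = 6 and tail_data_size = 36 % 6 = 0, which is
+        # promoted to a full 6-byte tail, so the tail decoder is simply
+        # the segment decoder.)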
7894 
7895         # ask the cache first
7896         got_from_cache = False
7897merger 0.0 (
7898hunk ./src/allmydata/mutable/retrieve.py 376
7899-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
7900-                                                            offset, length)
7901+            data = self._node._read_from_cache(self.verinfo, shnum, offset, length)
7902hunk ./src/allmydata/mutable/retrieve.py 372
7903-        # ask the cache first
7904-        got_from_cache = False
7905-        datavs = []
7906-        for (offset, length) in readv:
7907-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
7908-                                                            offset, length)
7909-            if data is not None:
7910-                datavs.append(data)
7911-        if len(datavs) == len(readv):
7912-            self.log("got data from cache")
7913-            got_from_cache = True
7914-            d = fireEventually({shnum: datavs})
7915-            # datavs is a dict mapping shnum to a pair of strings
7916+        for i in xrange(self._total_shares):
7917+            # So we don't have to do this later.
7918+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
7919+
7920+        # Our last task is to tell the downloader where to start and
7921+        # where to stop. We use three parameters for that:
7922+        #   - self._start_segment: the segment that we need to start
7923+        #     downloading from.
7924+        #   - self._current_segment: the next segment that we need to
7925+        #     download.
7926+        #   - self._last_segment: The last segment that we were asked to
7927+        #     download.
7928+        #
7929+        #  We say that the download is complete when
7930+        #  self._current_segment > self._last_segment. We use
7931+        #  self._start_segment and self._last_segment to know when to
7932+        #  strip things off of segments, and how much to strip.
7933+        if self._offset:
7934+            self.log("got offset: %d" % self._offset)
7935+            # our start segment is the first segment containing the
7936+            # offset we were given.
7937+            start = mathutil.div_ceil(self._offset,
7938+                                      self._segment_size)
7939+            # this gets us the first segment after self._offset. Then
7940+            # our start segment is the one before it.
7941+            start -= 1
7942+
7943+            assert start < self._num_segments
7944+            self._start_segment = start
7945+            self.log("got start segment: %d" % self._start_segment)
7946)
7947hunk ./src/allmydata/mutable/retrieve.py 386
7948             d = fireEventually({shnum: datavs})
7949             # datavs is a dict mapping shnum to a pair of strings
7950         else:
7951-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
7952-        self.remaining_sharemap.discard(shnum, peerid)
7953+            self._start_segment = 0
7954 
7955hunk ./src/allmydata/mutable/retrieve.py 388
7956-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
7957-        d.addErrback(self._query_failed, m, peerid)
7958-        # errors that aren't handled by _query_failed (and errors caused by
7959-        # _query_failed) get logged, but we still want to check for doneness.
7960-        def _oops(f):
7961-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
7962-                     shnum=shnum,
7963-                     peerid=idlib.shortnodeid_b2a(peerid),
7964-                     failure=f,
7965-                     level=log.WEIRD, umid="W0xnQA")
7966-        d.addErrback(_oops)
7967-        d.addBoth(self._check_for_done)
7968-        # any error during _check_for_done means the download fails. If the
7969-        # download is successful, _check_for_done will fire _done by itself.
7970-        d.addErrback(self._done)
7971-        d.addErrback(log.err)
7972-        return d # purely for testing convenience
7973 
7974hunk ./src/allmydata/mutable/retrieve.py 389
7975-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
7976-        # isolate the callRemote to a separate method, so tests can subclass
7977-        # Publish and override it
7978-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
7979-        return d
7980+        if self._read_length:
7981+            # our end segment is the last segment containing part of the
7982+            # segment that we were asked to read.
7983+            self.log("got read length %d" % self._read_length)
7984+            end_data = self._offset + self._read_length
7985+            end = mathutil.div_ceil(end_data,
7986+                                    self._segment_size)
7987+            end -= 1
7988+            assert end < self._num_segments
7989+            self._last_segment = end
7990+            self.log("got end segment: %d" % self._last_segment)
7991+        else:
7992+            self._last_segment = self._num_segments - 1
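+        # (worked example: with segsize=6, offset=10, and size=15, the
+        # read covers bytes 10..24. div_ceil(10, 6) - 1 = 1 gives
+        # start_segment 1; end_data = 25 and div_ceil(25, 6) - 1 = 4
+        # gives last_segment 4, so segments 1 through 4 are fetched.)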
7993 
7994hunk ./src/allmydata/mutable/retrieve.py 403
7995-    def remove_peer(self, peerid):
7996-        for shnum in list(self.remaining_sharemap.keys()):
7997-            self.remaining_sharemap.discard(shnum, peerid)
7998+        self._current_segment = self._start_segment
7999 
8000hunk ./src/allmydata/mutable/retrieve.py 405
8001-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
8002-        now = time.time()
8003-        elapsed = now - started
8004-        if not got_from_cache:
8005-            self._status.add_fetch_timing(peerid, elapsed)
8006-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
8007-                 shares=len(datavs),
8008-                 peerid=idlib.shortnodeid_b2a(peerid),
8009-                 level=log.NOISY)
8010-        self._outstanding_queries.pop(marker, None)
8011-        if not self._running:
8012-            return
8013+    def _add_active_peers(self):
8014+        """
8015+        I populate self._active_readers with enough active readers to
8016+        retrieve the contents of this mutable file. I am called before
8017+        downloading starts, and (eventually) after each validation
8018+        error, connection error, or other problem in the download.
8019+        """
8020+        # TODO: It would be cool to investigate other heuristics for
8021+        # reader selection. For instance, the cost (in time the user
8022+        # spends waiting for their file) of selecting a really slow peer
8023+        # that happens to have a primary share is probably more than
8024+        # selecting a really fast peer that doesn't have a primary
8025+        # share. Maybe the servermap could be extended to provide this
8026+        # information; it could keep track of latency information while
8027+        # it gathers more important data, and then this routine could
8028+        # use that to select active readers.
8029+        #
8030+        # (these and other questions would be easier to answer with a
8031+        #  robust, configurable tahoe-lafs simulator, which modeled node
8032+        #  failures, differences in node speed, and other characteristics
8033+        #  that we expect storage servers to have.  You could have
8034+        #  presets for really stable grids (like allmydata.com),
8035+        #  friendnets, make it easy to configure your own settings, and
8036+        #  then simulate the effect of big changes on these use cases
8037+        #  instead of just reasoning about what the effect might be. Out
8038+        #  of scope for MDMF, though.)
8039 
8040hunk ./src/allmydata/mutable/retrieve.py 432
8041-        # note that we only ask for a single share per query, so we only
8042-        # expect a single share back. On the other hand, we use the extra
8043-        # shares if we get them.. seems better than an assert().
8044+        # We need at least self._required_shares readers to download a
8045+        # segment.
8046+        if self._verify:
8047+            needed = self._total_shares
8048+        else:
8049+            needed = self._required_shares - len(self._active_readers)
8050+        # XXX: Why don't format= log messages work here?
8051+        self.log("adding %d peers to the active peers list" % needed)
8052 
8053hunk ./src/allmydata/mutable/retrieve.py 441
8054-        for shnum,datav in datavs.items():
8055-            (prefix, hash_and_data) = datav[:2]
8056-            try:
8057-                self._got_results_one_share(shnum, peerid,
8058-                                            prefix, hash_and_data)
8059-            except CorruptShareError, e:
8060-                # log it and give the other shares a chance to be processed
8061-                f = failure.Failure()
8062-                self.log(format="bad share: %(f_value)s",
8063-                         f_value=str(f.value), failure=f,
8064-                         level=log.WEIRD, umid="7fzWZw")
8065-                self.notify_server_corruption(peerid, shnum, str(e))
8066-                self.remove_peer(peerid)
8067-                self.servermap.mark_bad_share(peerid, shnum, prefix)
8068-                self._bad_shares.add( (peerid, shnum) )
8069-                self._status.problems[peerid] = f
8070-                self._last_failure = f
8071-                pass
8072-            if self._need_privkey and len(datav) > 2:
8073-                lp = None
8074-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
8075-        # all done!
8076+        # We favor lower numbered shares, since FEC is faster with
8077+        # primary shares than with other shares, and lower-numbered
8078+        # shares are more likely to be primary than higher numbered
8079+        # shares.
8080+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
8081+        # We shouldn't consider adding shares that we already have; this
8082+        # will cause problems later.
8083+        active_shnums -= set([reader.shnum for reader in self._active_readers])
8084+        active_shnums = list(active_shnums)[:needed]
8085+        if len(active_shnums) < needed and not self._verify:
8086+            # We don't have enough readers to retrieve the file; fail.
8087+            return self._failed()
8088 
8089hunk ./src/allmydata/mutable/retrieve.py 454
8090-    def notify_server_corruption(self, peerid, shnum, reason):
8091-        ss = self.servermap.connections[peerid]
8092-        ss.callRemoteOnly("advise_corrupt_share",
8093-                          "mutable", self._storage_index, shnum, reason)
8094+        for shnum in active_shnums:
8095+            self._active_readers.append(self.readers[shnum])
8096+            self.log("added reader for share %d" % shnum)
8097+        assert len(self._active_readers) >= self._required_shares
8098+        # Conceptually, this is part of the _add_active_peers step. It
8099+        # validates the prefixes of newly added readers to make sure
8100+        # that they match what we are expecting for self.verinfo. If
8101+        # validation is successful, _validate_active_prefixes will call
8102+        # _download_current_segment for us. If validation is
8103+        # unsuccessful, then _validate_active_prefixes will remove the peer and
8104+        # call _add_active_peers again, where we will attempt to rectify
8105+        # the problem by choosing another peer.
8106+        return self._validate_active_prefixes()
8107 
8108hunk ./src/allmydata/mutable/retrieve.py 468
8109-    def _got_results_one_share(self, shnum, peerid,
8110-                               got_prefix, got_hash_and_data):
8111-        self.log("_got_results: got shnum #%d from peerid %s"
8112-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
8113-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8114-         offsets_tuple) = self.verinfo
8115-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
8116-        if got_prefix != prefix:
8117-            msg = "someone wrote to the data since we read the servermap: prefix changed"
8118-            raise UncoordinatedWriteError(msg)
8119-        (share_hash_chain, block_hash_tree,
8120-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
8121 
8122hunk ./src/allmydata/mutable/retrieve.py 469
8123-        assert isinstance(share_data, str)
8124-        # build the block hash tree. SDMF has only one leaf.
8125-        leaves = [hashutil.block_hash(share_data)]
8126-        t = hashtree.HashTree(leaves)
8127-        if list(t) != block_hash_tree:
8128-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
8129-        share_hash_leaf = t[0]
8130-        t2 = hashtree.IncompleteHashTree(N)
8131-        # root_hash was checked by the signature
8132-        t2.set_hashes({0: root_hash})
8133-        try:
8134-            t2.set_hashes(hashes=share_hash_chain,
8135-                          leaves={shnum: share_hash_leaf})
8136-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
8137-                IndexError), e:
8138-            msg = "corrupt hashes: %s" % (e,)
8139-            raise CorruptShareError(peerid, shnum, msg)
8140-        self.log(" data valid! len=%d" % len(share_data))
8141-        # each query comes down to this: placing validated share data into
8142-        # self.shares
8143-        self.shares[shnum] = share_data
8144+    def _validate_active_prefixes(self):
8145+        """
8146+        I check to make sure that the prefixes on the peers that I am
8147+        currently reading from match the prefix that we want to see, as
8148+        said in self.verinfo.
8149 
8150hunk ./src/allmydata/mutable/retrieve.py 475
8151-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
8152+        If I find that all of the active peers have acceptable prefixes,
8153+        I pass control to _download_current_segment, which will use
8154+        those peers to do cool things. If I find that some of the active
8155+        peers have unacceptable prefixes, I will remove them from active
8156+        peers (and from further consideration) and call
8157+        _add_active_peers to attempt to rectify the situation. I keep
8158+        track of which peers I have already validated so that I don't
8159+        need to do so again.
8160+        """
8161+        assert self._active_readers, "No more active readers"
8162 
8163hunk ./src/allmydata/mutable/retrieve.py 486
8164-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
8165-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
8166-        if alleged_writekey != self._node.get_writekey():
8167-            self.log("invalid privkey from %s shnum %d" %
8168-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
8169-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
8170-            return
8171+        ds = []
8172+        # a list, so that results[i] in _check_results maps back to
+        # the reader that produced ds[i]
+        new_readers = list(set(self._active_readers) -
+                           self._validated_readers)
8173+        self.log('validating %d newly-added active readers' % len(new_readers))
8174 
8175hunk ./src/allmydata/mutable/retrieve.py 490
8176-        # it's good
8177-        self.log("got valid privkey from shnum %d on peerid %s" %
8178-                 (shnum, idlib.shortnodeid_b2a(peerid)),
8179-                 parent=lp)
8180-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
8181-        self._node._populate_encprivkey(enc_privkey)
8182-        self._node._populate_privkey(privkey)
8183-        self._need_privkey = False
8184+        for reader in new_readers:
8185+            # We force a remote read here -- otherwise, we are relying
8186+            # on cached data that we already verified as valid, and we
8187+            # won't detect an uncoordinated write that has occurred
8188+            # since the last servermap update.
8189+            d = reader.get_prefix(force_remote=True)
8190+            d.addCallback(self._try_to_validate_prefix, reader)
8191+            ds.append(d)
8192+        dl = defer.DeferredList(ds, consumeErrors=True)
8193+        def _check_results(results):
8194+            # Each result in results will be of the form (success, msg).
8195+            # We don't care about msg, but success will tell us whether
8196+            # or not the checkstring validated. If it didn't, we need to
8197+            # remove the offending (peer,share) from our active readers,
8198+            # and ensure that active readers is again populated.
8199+            bad_readers = []
8200+            for i, result in enumerate(results):
8201+                if not result[0]:
8202+                    reader = new_readers[i]
8203+                    f = result[1]
8204+                    assert isinstance(f, failure.Failure)
8205 
8206hunk ./src/allmydata/mutable/retrieve.py 512
8207-    def _query_failed(self, f, marker, peerid):
8208-        self.log(format="query to [%(peerid)s] failed",
8209-                 peerid=idlib.shortnodeid_b2a(peerid),
8210-                 level=log.NOISY)
8211-        self._status.problems[peerid] = f
8212-        self._outstanding_queries.pop(marker, None)
8213-        if not self._running:
8214-            return
8215-        self._last_failure = f
8216-        self.remove_peer(peerid)
8217-        level = log.WEIRD
8218-        if f.check(DeadReferenceError):
8219-            level = log.UNUSUAL
8220-        self.log(format="error during query: %(f_value)s",
8221-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
8222+                    self.log("The reader %s failed to "
8223+                             "properly validate: %s" % \
8224+                             (reader, str(f.value)))
8225+                    bad_readers.append((reader, f))
8226+                else:
8227+                    reader = new_readers[i]
8228+                    self.log("the reader %s checks out, so we'll use it" % \
8229+                             reader)
8230+                    self._validated_readers.add(reader)
8231+                    # Each time we validate a reader, we check to see if
8232+                    # we need the private key. If we do, we politely ask
8233+                    # for it and then continue computing. If we find
8234+                    # that we haven't gotten it at the end of
8235+                    # segment decoding, then we'll take more drastic
8236+                    # measures.
8237+                    if self._need_privkey and not self._node.is_readonly():
8238+                        d = reader.get_encprivkey()
8239+                        d.addCallback(self._try_to_validate_privkey, reader)
8240+            if bad_readers:
8241+                # We do them all at once, or else we screw up list indexing.
8242+                for (reader, f) in bad_readers:
8243+                    self._mark_bad_share(reader, f)
8244+                if self._verify:
8245+                    if len(self._active_readers) >= self._required_shares:
8246+                        return self._download_current_segment()
8247+                    else:
8248+                        return self._failed()
8249+                else:
8250+                    return self._add_active_peers()
8251+            else:
8252+                return self._download_current_segment()
8253+            # The next step will assert that it has enough active
8254+            # readers to fetch shares; we just need to remove it.
8255+        dl.addCallback(_check_results)
8256+        return dl
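+        # (hedged note: with consumeErrors=True, the DeferredList above
+        # never errbacks; each entry in results is a (success, value)
+        # pair whose value is a Failure when success is False, which is
+        # exactly what _check_results relies on when collecting
+        # bad_readers.)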
8257 
8258hunk ./src/allmydata/mutable/retrieve.py 548
8259-    def _check_for_done(self, res):
8260-        # exit paths:
8261-        #  return : keep waiting, no new queries
8262-        #  return self._send_more_queries(outstanding) : send some more queries
8263-        #  fire self._done(plaintext) : download successful
8264-        #  raise exception : download fails
8265 
8266hunk ./src/allmydata/mutable/retrieve.py 549
8267-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
8268-                 running=self._running, decoding=self._decoding,
8269-                 level=log.NOISY)
8270-        if not self._running:
8271-            return
8272-        if self._decoding:
8273-            return
8274-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8275+    def _try_to_validate_prefix(self, prefix, reader):
8276+        """
8277+        I check that the prefix returned by a candidate server for
8278+        (and, hence, the prefix that was validated earlier). If it does,
8279+        I return without incident, which means that I approve of the use
8280+        of the candidate server for segment retrieval. If it doesn't, I
8281+        raise UncoordinatedWriteError, so another server must be chosen.
8282+        False, which means that another server must be chosen.
8283+        """
8284+        (seqnum,
8285+         root_hash,
8286+         IV,
8287+         segsize,
8288+         datalength,
8289+         k,
8290+         N,
8291+         known_prefix,
8292          offsets_tuple) = self.verinfo
8293hunk ./src/allmydata/mutable/retrieve.py 567
8294+        if known_prefix != prefix:
8295+            self.log("prefix from share %d doesn't match" % reader.shnum)
8296+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
8297+                                          "indicate an uncoordinated write")
8298+        # Otherwise, we're okay -- no issues.
8299 
8300hunk ./src/allmydata/mutable/retrieve.py 573
8301-        if len(self.shares) < k:
8302-            # we don't have enough shares yet
8303-            return self._maybe_send_more_queries(k)
8304-        if self._need_privkey:
8305-            # we got k shares, but none of them had a valid privkey. TODO:
8306-            # look further. Adding code to do this is a bit complicated, and
8307-            # I want to avoid that complication, and this should be pretty
8308-            # rare (k shares with bitflips in the enc_privkey but not in the
8309-            # data blocks). If we actually do get here, the subsequent repair
8310-            # will fail for lack of a privkey.
8311-            self.log("got k shares but still need_privkey, bummer",
8312-                     level=log.WEIRD, umid="MdRHPA")
8313 
8314hunk ./src/allmydata/mutable/retrieve.py 574
8315-        # we have enough to finish. All the shares have had their hashes
8316-        # checked, so if something fails at this point, we don't know how
8317-        # to fix it, so the download will fail.
8318+    def _remove_reader(self, reader):
8319+        """
8320+        At various points, we will wish to remove a peer from
8321+        consideration and/or use. These include, but are not necessarily
8322+        limited to:
8323 
8324hunk ./src/allmydata/mutable/retrieve.py 580
8325-        self._decoding = True # avoid reentrancy
8326-        self._status.set_status("decoding")
8327-        now = time.time()
8328-        elapsed = now - self._started
8329-        self._status.timings["fetch"] = elapsed
8330+            - A connection error.
8331+            - A mismatched prefix (that is, a prefix that does not match
8332+              our conception of the version information string).
8333+            - A failing block hash, salt hash, or share hash, which can
8334+              indicate disk failure/bit flips, or network trouble.
8335 
8336hunk ./src/allmydata/mutable/retrieve.py 586
8337-        d = defer.maybeDeferred(self._decode)
8338-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
8339-        d.addBoth(self._done)
8340-        return d # purely for test convenience
8341+        This method will do that. I will make sure that the
8342+        (shnum,reader) combination represented by my reader argument is
8343+        not used for anything else during this download. I will not
8344+        advise the reader of any corruption, something that my callers
8345+        may wish to do on their own.
8346+        """
8347+        # TODO: When you're done writing this, see if this is ever
8348+        # actually used for something that _mark_bad_share isn't. I have
8349+        # a feeling that they will be used for very similar things, and
8350+        # that having them both here is just going to be an epic amount
8351+        # of code duplication.
8352+        #
8353+        # (well, okay, not epic, but meaningful)
8354+        self.log("removing reader %s" % reader)
8355+        # Remove the reader from _active_readers
8356+        self._active_readers.remove(reader)
8357+        # TODO: self.readers.remove(reader)?
8358+        for shnum in list(self.remaining_sharemap.keys()):
8359+            self.remaining_sharemap.discard(shnum, reader.peerid)
8360 
8361hunk ./src/allmydata/mutable/retrieve.py 606
8362-    def _maybe_send_more_queries(self, k):
8363-        # we don't have enough shares yet. Should we send out more queries?
8364-        # There are some number of queries outstanding, each for a single
8365-        # share. If we can generate 'needed_shares' additional queries, we do
8366-        # so. If we can't, then we know this file is a goner, and we raise
8367-        # NotEnoughSharesError.
8368-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
8369-                         "outstanding=%(outstanding)d"),
8370-                 have=len(self.shares), k=k,
8371-                 outstanding=len(self._outstanding_queries),
8372-                 level=log.NOISY)
8373 
8374hunk ./src/allmydata/mutable/retrieve.py 607
8375-        remaining_shares = k - len(self.shares)
8376-        needed = remaining_shares - len(self._outstanding_queries)
8377-        if not needed:
8378-            # we have enough queries in flight already
8379+    def _mark_bad_share(self, reader, f):
8380+        """
8381+        I mark the (peerid, shnum) encapsulated by my reader argument as
8382+        a bad share, which means that it will not be used anywhere else.
8383 
8384hunk ./src/allmydata/mutable/retrieve.py 612
8385-            # TODO: but if they've been in flight for a long time, and we
8386-            # have reason to believe that new queries might respond faster
8387-            # (i.e. we've seen other queries come back faster, then consider
8388-            # sending out new queries. This could help with peers which have
8389-            # silently gone away since the servermap was updated, for which
8390-            # we're still waiting for the 15-minute TCP disconnect to happen.
8391-            self.log("enough queries are in flight, no more are needed",
8392-                     level=log.NOISY)
8393-            return
8394+        There are several reasons to want to mark something as a bad
8395+        share. These include:
8396+
8397+            - A connection error to the peer.
8398+            - A mismatched prefix (that is, a prefix that does not match
8399+              our local conception of the version information string).
8400+            - A failing block hash, salt hash, share hash, or other
8401+              integrity check.
8402 
8403hunk ./src/allmydata/mutable/retrieve.py 621
8404-        outstanding_shnums = set([shnum
8405-                                  for (peerid, shnum, started)
8406-                                  in self._outstanding_queries.values()])
8407-        # prefer low-numbered shares, they are more likely to be primary
8408-        available_shnums = sorted(self.remaining_sharemap.keys())
8409-        for shnum in available_shnums:
8410-            if shnum in outstanding_shnums:
8411-                # skip ones that are already in transit
8412-                continue
8413-            if shnum not in self.remaining_sharemap:
8414-                # no servers for that shnum. note that DictOfSets removes
8415-                # empty sets from the dict for us.
8416-                continue
8417-            peerid = list(self.remaining_sharemap[shnum])[0]
8418-            # get_data will remove that peerid from the sharemap, and add the
8419-            # query to self._outstanding_queries
8420-            self._status.set_status("Retrieving More Shares")
8421-            self.get_data(shnum, peerid)
8422-            needed -= 1
8423-            if not needed:
8424+        This method will ensure that readers that we wish to mark bad
8425+        (for these reasons or other reasons) are not used for the rest
8426+        of the download. Additionally, it will attempt to tell the
8427+        remote peer (with no guarantee of success) that its share is
8428+        corrupt.
8429+        """
8430+        self.log("marking share %d on server %s as bad" % \
8431+                 (reader.shnum, reader))
8432+        prefix = self.verinfo[-2]
8433+        self.servermap.mark_bad_share(reader.peerid,
8434+                                      reader.shnum,
8435+                                      prefix)
8436+        self._remove_reader(reader)
8437+        self._bad_shares.add((reader.peerid, reader.shnum, f))
8438+        self._status.problems[reader.peerid] = f
8439+        self._last_failure = f
8440+        self.notify_server_corruption(reader.peerid, reader.shnum,
8441+                                      str(f.value))
8442+
8443+
8444+    def _download_current_segment(self):
8445+        """
8446+        I download, validate, decode, decrypt, and assemble the segment
8447+        that this Retrieve is currently responsible for downloading.
8448+        """
8449+        assert len(self._active_readers) >= self._required_shares
8450+        if self._current_segment <= self._last_segment:
8451+            d = self._process_segment(self._current_segment)
8452+        else:
8453+            d = defer.succeed(None)
8454+        d.addBoth(self._turn_barrier)
8455+        d.addCallback(self._check_for_done)
8456+        return d
8457+
8458+
8459+    def _turn_barrier(self, result):
8460+        """
8461+        I help the download process avoid the recursion limit issues
8462+        discussed in #237.
8463+        """
8464+        return fireEventually(result)
8465+
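+    # (hedged) fireEventually returns a Deferred that fires on a later
+    # reactor turn, so inserting _turn_barrier between segments unwinds
+    # the stack instead of chaining every segment's callbacks
+    # synchronously -- the behavior that triggered the #237
+    # recursion-limit blowups on files with many segments.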
8466+
8467+    def _process_segment(self, segnum):
8468+        """
8469+        I download, validate, decode, and decrypt one segment of the
8470+        file that this Retrieve is retrieving. This means coordinating
8471+        the process of getting k blocks of that file, validating them,
8472+        assembling them into one segment with the decoder, and then
8473+        decrypting them.
8474+        """
8475+        self.log("processing segment %d" % segnum)
8476+
8477+        # TODO: The old code uses a marker. Should this code do that
8478+        # too? What did the Marker do?
8479+        assert len(self._active_readers) >= self._required_shares
8480+
8481+        # We need to ask each of our active readers for its block and
8482+        # salt. We will then validate those. If validation is
8483+        # successful, we will assemble the results into plaintext.
8484+        ds = []
8485+        for reader in self._active_readers:
8486+            started = time.time()
8487+            d = reader.get_block_and_salt(segnum, queue=True)
8488+            d2 = self._get_needed_hashes(reader, segnum)
8489+            dl = defer.DeferredList([d, d2], consumeErrors=True)
8490+            dl.addCallback(self._validate_block, segnum, reader, started)
8491+            dl.addErrback(self._validation_or_decoding_failed, [reader])
8492+            ds.append(dl)
8493+            reader.flush()
8494+        dl = defer.DeferredList(ds)
8495+        if self._verify:
8496+            dl.addCallback(lambda ignored: "")
8497+            dl.addCallback(self._set_segment)
8498+        else:
8499+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
8500+        return dl
8501+
8502+
8503+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
8504+        """
8505+        I take the results of fetching and validating the blocks from a
8506+        callback chain in another method. If the results are such that
8507+        they tell me that validation and fetching succeeded without
8508+        incident, I will proceed with decoding and decryption.
8509+        Otherwise, I will do nothing.
8510+        """
8511+        self.log("trying to decode and decrypt segment %d" % segnum)
8512+        failures = False
8513+        for block_and_salt in blocks_and_salts:
8514+            if not block_and_salt[0] or block_and_salt[1] is None:
8515+                self.log("some validation operations failed; not proceeding")
8516+                failures = True
8517                 break
8518hunk ./src/allmydata/mutable/retrieve.py 715
8519+        if not failures:
8520+            self.log("everything looks ok, building segment %d" % segnum)
8521+            d = self._decode_blocks(blocks_and_salts, segnum)
8522+            d.addCallback(self._decrypt_segment)
8523+            d.addErrback(self._validation_or_decoding_failed,
8524+                         self._active_readers)
8525+            # check to see whether we've been paused before writing
8526+            # anything.
8527+            d.addCallback(self._check_for_paused)
8528+            d.addCallback(self._set_segment)
8529+            return d
8530+        else:
8531+            return defer.succeed(None)
8532+
8533+
8534+    def _set_segment(self, segment):
8535+        """
8536+        Given a plaintext segment, I register that segment with the
8537+        target that is handling the file download.
8538+        """
8539+        self.log("got plaintext for segment %d" % self._current_segment)
8540+        if self._current_segment == self._start_segment:
8541+            # We're on the first segment. It's possible that we want
8542+            # only some part of the end of this segment, and that we
8543+            # just downloaded the whole thing to get that part. If so,
8544+            # we need to account for that and give the reader just the
8545+            # data that they want.
8546+            n = self._offset % self._segment_size
8547+            self.log("stripping %d bytes off of the first segment" % n)
8548+            self.log("original segment length: %d" % len(segment))
8549+            segment = segment[n:]
8550+            self.log("new segment length: %d" % len(segment))
8551+
8552+        if self._current_segment == self._last_segment and self._read_length is not None:
8553+            # We're on the last segment. It's possible that we only want
8554+            # part of the beginning of this segment, and that we
8555+            # downloaded the whole thing anyway. Make sure to give the
8556+            # caller only the portion of the segment that they want to
8557+            # receive.
8558+            extra = self._read_length
8559+            if self._start_segment != self._last_segment:
8560+                extra -= self._segment_size - \
8561+                            (self._offset % self._segment_size)
8562+            extra %= self._segment_size
+            if extra == 0:
+                # a read ending exactly on a segment boundary wants all
+                # of the last segment, not none of it.
+                extra = self._segment_size
8563+            self.log("original segment length: %d" % len(segment))
8564+            segment = segment[:extra]
8565+            self.log("new segment length: %d" % len(segment))
8566+            self.log("only taking %d bytes of the last segment" % extra)
8567+
8568+        if not self._verify:
8569+            self._consumer.write(segment)
8570+        else:
8571+            # we don't care about the plaintext if we are doing a verify.
8572+            segment = None
8573+        self._current_segment += 1
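+        # (worked example, continuing offset=10, size=15, segsize=6:
+        # segment 1 loses n = 10 % 6 = 4 leading bytes, and on segment
+        # 4 extra = (15 - 2) % 6 = 1, so only the first byte of the
+        # last segment -- byte 24 -- is delivered.)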
8574 
8575hunk ./src/allmydata/mutable/retrieve.py 771
8576-        # at this point, we have as many outstanding queries as we can. If
8577-        # needed!=0 then we might not have enough to recover the file.
8578-        if needed:
8579-            format = ("ran out of peers: "
8580-                      "have %(have)d shares (k=%(k)d), "
8581-                      "%(outstanding)d queries in flight, "
8582-                      "need %(need)d more, "
8583-                      "found %(bad)d bad shares")
8584-            args = {"have": len(self.shares),
8585-                    "k": k,
8586-                    "outstanding": len(self._outstanding_queries),
8587-                    "need": needed,
8588-                    "bad": len(self._bad_shares),
8589-                    }
8590-            self.log(format=format,
8591-                     level=log.WEIRD, umid="ezTfjw", **args)
8592-            err = NotEnoughSharesError("%s, last failure: %s" %
8593-                                      (format % args, self._last_failure))
8594-            if self._bad_shares:
8595-                self.log("We found some bad shares this pass. You should "
8596-                         "update the servermap and try again to check "
8597-                         "more peers",
8598-                         level=log.WEIRD, umid="EFkOlA")
8599-                err.servermap = self.servermap
8600-            raise err
8601 
8602hunk ./src/allmydata/mutable/retrieve.py 772
8603+    def _validation_or_decoding_failed(self, f, readers):
8604+        """
8605+        I am called when a block or a salt fails to correctly validate, or when
8606+        the decryption or decoding operation fails for some reason.  I react to
8607+        this failure by notifying the remote server of corruption, and then
8608+        removing the remote peer from further activity.
8609+        """
8610+        assert isinstance(readers, list)
8611+        bad_shnums = [reader.shnum for reader in readers]
8612+
8613+        self.log("validation or decoding failed on share(s) %s, peer(s) %s "
8614+                 ", segment %d: %s" % \
8615+                 (bad_shnums, readers, self._current_segment, str(f)))
8616+        for reader in readers:
8617+            self._mark_bad_share(reader, f)
8618         return
8619 
8620hunk ./src/allmydata/mutable/retrieve.py 789
8621-    def _decode(self):
8622-        started = time.time()
8623-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8624-         offsets_tuple) = self.verinfo
8625 
8626hunk ./src/allmydata/mutable/retrieve.py 790
8627-        # shares_dict is a dict mapping shnum to share data, but the codec
8628-        # wants two lists.
8629-        shareids = []; shares = []
8630-        for shareid, share in self.shares.items():
8631+    def _validate_block(self, results, segnum, reader, started):
8632+        """
8633+        I validate a block from one share on a remote server.
8634+        """
8635+        # Grab the part of the block hash tree that is necessary to
8636+        # validate this block, then generate the block hash root.
8637+        self.log("validating share %d for segment %d" % (reader.shnum,
8638+                                                             segnum))
8639+        self._status.add_fetch_timing(reader.peerid, started)
8640+        self._status.set_status("Valdiating blocks for segment %d" % segnum)
8641+        # Did we fail to fetch either of the things that we were
8642+        # supposed to? Fail if so.
8643+        if not (results[0][0] and results[1][0]):
8644+            # The failure will be reported through the errback handler.
8645+
8646+            # These all get batched into one query, so the resulting
8647+            # failure should be the same for all of them, so we can just
8648+            # use the first one.
8649+            assert isinstance(results[0][1], failure.Failure)
8650+
8651+            f = results[0][1]
8652+            raise CorruptShareError(reader.peerid,
8653+                                    reader.shnum,
8654+                                    "Connection error: %s" % str(f))
8655+
8656+        block_and_salt, block_and_sharehashes = results
8657+        block, salt = block_and_salt[1]
8658+        blockhashes, sharehashes = block_and_sharehashes[1]
8659+
8660+        blockhashes = dict(enumerate(blockhashes[1]))
8661+        self.log("the reader gave me the following blockhashes: %s" % \
8662+                 blockhashes.keys())
8663+        self.log("the reader gave me the following sharehashes: %s" % \
8664+                 sharehashes[1].keys())
8665+        bht = self._block_hash_trees[reader.shnum]
8666+
8667+        if bht.needed_hashes(segnum, include_leaf=True):
8668+            try:
8669+                bht.set_hashes(blockhashes)
8670+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8671+                    IndexError), e:
8672+                raise CorruptShareError(reader.peerid,
8673+                                        reader.shnum,
8674+                                        "block hash tree failure: %s" % e)
8675+
8676+        if self._version == MDMF_VERSION:
8677+            blockhash = hashutil.block_hash(salt + block)
8678+        else:
8679+            blockhash = hashutil.block_hash(block)
8680+        # If this works without an error, then validation is
8681+        # successful.
8682+        try:
8683+            bht.set_hashes(leaves={segnum: blockhash})
8684+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8685+                IndexError), e:
8686+            raise CorruptShareError(reader.peerid,
8687+                                    reader.shnum,
8688+                                    "block hash tree failure: %s" % e)
8689+
8690+        # Reaching this point means that we know that this segment
8691+        # is correct. Now we need to check to see whether the share
8692+        # hash chain is also correct.
8693+        # SDMF wrote share hash chains that didn't contain the
8694+        # leaves, which would be produced from the block hash tree.
8695+        # So we need to validate the block hash tree first. If
8696+        # successful, then bht[0] will contain the root for the
8697+        # shnum, which will be a leaf in the share hash tree, which
8698+        # will allow us to validate the rest of the tree.
8699+        if self.share_hash_tree.needed_hashes(reader.shnum,
8700+                                              include_leaf=True) or \
8701+                                              self._verify:
8702+            try:
8703+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
8704+                                            leaves={reader.shnum: bht[0]})
8705+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8706+                    IndexError), e:
8707+                raise CorruptShareError(reader.peerid,
8708+                                        reader.shnum,
8709+                                        "corrupt hashes: %s" % e)
8710+
8711+        self.log('share %d is valid for segment %d' % (reader.shnum,
8712+                                                       segnum))
8713+        return {reader.shnum: (block, salt)}
8714+
8715+
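
The needed_hashes/set_hashes dance above follows the usual incomplete-tree
pattern: learn the root from a trusted source, fetch the sibling/uncle chain,
then supply the leaf and let the tree check everything. A minimal sketch,
assuming the HashTree and IncompleteHashTree API from allmydata.hashtree:

    from allmydata import hashtree
    from allmydata.util import hashutil

    blocks = ["block-%d" % i for i in range(4)]
    leaves = [hashutil.block_hash(b) for b in blocks]
    full = hashtree.HashTree(leaves)      # what the writer computed
    bht = hashtree.IncompleteHashTree(4)  # what the reader starts with

    bht.set_hashes({0: full[0]})          # root learned from the share hash tree
    segnum = 2
    needed = bht.needed_hashes(segnum)    # the sibling/uncle chain
    bht.set_hashes(hashes=dict([(i, full[i]) for i in needed]),
                   leaves={segnum: hashutil.block_hash(blocks[segnum])})
    # a corrupted block would change the leaf hash, set_hashes would raise
    # hashtree.BadHashError, and _validate_block would translate that into
    # a CorruptShareError for the offending share
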
8716+    def _get_needed_hashes(self, reader, segnum):
8717+        """
8718+        I get the hashes needed to validate segnum from the reader, then return
8719+        to my caller when this is done.
8720+        """
8721+        bht = self._block_hash_trees[reader.shnum]
8722+        needed = bht.needed_hashes(segnum, include_leaf=True)
8723+        # The root of the block hash tree is also a leaf in the share
8724+        # hash tree. So we don't need to fetch it from the remote
8725+        # server. In the case of files with one segment, this means that
8726+        # we won't fetch any block hash tree from the remote server,
8727+        # since the block hash of each share is the entire block
8728+        # hash tree, and is a leaf in the share hash tree. This is fine,
8729+        # since any share corruption will be detected in the share hash
8730+        # tree.
8731+        #needed.discard(0)
8732+        self.log("getting blockhashes for segment %d, share %d: %s" % \
8733+                 (segnum, reader.shnum, str(needed)))
8734+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
8735+        if self.share_hash_tree.needed_hashes(reader.shnum):
8736+            need = self.share_hash_tree.needed_hashes(reader.shnum)
8737+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
8738+                                                                 str(need)))
8739+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
8740+        else:
8741+            d2 = defer.succeed({}) # the logic in the next method
8742+                                   # expects a dict
8743+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
8744+        return dl
8745+
8746+
8747+    def _decode_blocks(self, blocks_and_salts, segnum):
8748+        """
8749+        I take a list of k blocks and salts, and decode that into a
8750+        single encrypted segment.
8751+        """
8752+        d = {}
8753+        # We want to merge our dictionaries to the form
8754+        # {shnum: blocks_and_salts}
8755+        #
8756+        # The dictionaries come back from _validate_block in that form,
8757+        # so we just need to merge them.
8758+        for block_and_salt in blocks_and_salts:
8759+            d.update(block_and_salt[1])
8760+
8761+        # All of these blocks should have the same salt; in SDMF, it is
8762+        # the file-wide IV, while in MDMF it is the per-segment salt. In
8763+        # either case, we just need to get one of them and use it.
8764+        #
8765+        # d.items()[0] is like (shnum, (block, salt))
8766+        # d.items()[0][1] is like (block, salt)
8767+        # d.items()[0][1][1] is the salt.
8768+        salt = d.items()[0][1][1]
8769+        # Next, extract just the blocks from the dict. We'll use the
8770+        # salt in the next step.
8771+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
8772+        d2 = dict(share_and_shareids)
8773+        shareids = []
8774+        shares = []
8775+        for shareid, share in d2.items():
8776             shareids.append(shareid)
8777             shares.append(share)
8778 
8779hunk ./src/allmydata/mutable/retrieve.py 938
8780-        assert len(shareids) >= k, len(shareids)
8781+        self._status.set_status("Decoding")
8782+        started = time.time()
8783+        assert len(shareids) >= self._required_shares, len(shareids)
8784         # zfec really doesn't want extra shares
8785hunk ./src/allmydata/mutable/retrieve.py 942
8786-        shareids = shareids[:k]
8787-        shares = shares[:k]
8788-
8789-        fec = codec.CRSDecoder()
8790-        fec.set_params(segsize, k, N)
8791-
8792-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
8793-        self.log("about to decode, shareids=%s" % (shareids,))
8794-        d = defer.maybeDeferred(fec.decode, shares, shareids)
8795-        def _done(buffers):
8796-            self._status.timings["decode"] = time.time() - started
8797-            self.log(" decode done, %d buffers" % len(buffers))
8798+        shareids = shareids[:self._required_shares]
8799+        shares = shares[:self._required_shares]
8800+        self.log("decoding segment %d" % segnum)
8801+        if segnum == self._num_segments - 1:
8802+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
8803+        else:
8804+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
8805+        def _process(buffers):
8806             segment = "".join(buffers)
8807hunk ./src/allmydata/mutable/retrieve.py 951
8808+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
8809+                     segnum=segnum,
8810+                     numsegs=self._num_segments,
8811+                     level=log.NOISY)
8812             self.log(" joined length %d, datalength %d" %
8813hunk ./src/allmydata/mutable/retrieve.py 956
8814-                     (len(segment), datalength))
8815-            segment = segment[:datalength]
8816+                     (len(segment), self._data_length))
8817+            if segnum == self._num_segments - 1:
8818+                size_to_use = self._tail_data_size
8819+            else:
8820+                size_to_use = self._segment_size
8821+            segment = segment[:size_to_use]
8822             self.log(" segment len=%d" % len(segment))
8823hunk ./src/allmydata/mutable/retrieve.py 963
8824-            return segment
8825-        def _err(f):
8826-            self.log(" decode failed: %s" % f)
8827-            return f
8828-        d.addCallback(_done)
8829-        d.addErrback(_err)
8830+            self._status.timings.setdefault("decode", 0)
8831+            self._status.timings['decode'] = time.time() - started
8832+            return segment, salt
8833+        d.addCallback(_process)
8834         return d
8835 
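
The segment and tail decoders used above wrap zfec. A standalone sketch of the
k-of-m round trip, assuming zfec's easyfec convenience API (Encoder(k,
m).encode(data) returning m blocks, and Decoder(k, m).decode(blocks,
sharenums, padlen)); the patch itself goes through allmydata.codec instead:

    from zfec import easyfec

    k, m = 3, 10
    segment = "twelve bytes"    # length 12 is a multiple of k, so padlen is 0
    blocks = easyfec.Encoder(k, m).encode(segment)  # m blocks; any k suffice
    subset, shareids = [blocks[1], blocks[5], blocks[9]], [1, 5, 9]
    recovered = easyfec.Decoder(k, m).decode(subset, shareids, 0)
    assert recovered == segment
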
8836hunk ./src/allmydata/mutable/retrieve.py 969
8837-    def _decrypt(self, crypttext, IV, readkey):
8838+
8839+    def _decrypt_segment(self, segment_and_salt):
8840+        """
8841+        I take a single segment and its salt, and decrypt it. I return
8842+        the plaintext of the segment that is in my argument.
8843+        """
8844+        segment, salt = segment_and_salt
8845         self._status.set_status("decrypting")
8846hunk ./src/allmydata/mutable/retrieve.py 977
8847+        self.log("decrypting segment %d" % self._current_segment)
8848         started = time.time()
8849hunk ./src/allmydata/mutable/retrieve.py 979
8850-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
8851+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
8852         decryptor = AES(key)
8853hunk ./src/allmydata/mutable/retrieve.py 981
8854-        plaintext = decryptor.process(crypttext)
8855-        self._status.timings["decrypt"] = time.time() - started
8856+        plaintext = decryptor.process(segment)
8857+        self._status.timings.setdefault("decrypt", 0)
8858+        self._status.timings['decrypt'] = time.time() - started
8859         return plaintext
8860 
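
pycryptopp's AES runs in CTR mode with a zero counter, so the same process()
call both encrypts and decrypts; that is why one pass suffices above. A sketch
of the per-segment key derivation, with made-up key material (real readkeys
come from the cap):

    from pycryptopp.cipher.aes import AES
    from allmydata.util import hashutil

    readkey = "\x01" * 16   # hypothetical; normally self._node.get_readkey()
    salt = "\x02" * 16      # per-segment salt (MDMF) or the file-wide IV (SDMF)

    key = hashutil.ssk_readkey_data_hash(salt, readkey)
    ciphertext = AES(key).process("segment plaintext")
    assert AES(key).process(ciphertext) == "segment plaintext"
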
8861hunk ./src/allmydata/mutable/retrieve.py 986
8862-    def _done(self, res):
8863-        if not self._running:
8864+
8865+    def notify_server_corruption(self, peerid, shnum, reason):
8866+        ss = self.servermap.connections[peerid]
8867+        ss.callRemoteOnly("advise_corrupt_share",
8868+                          "mutable", self._storage_index, shnum, reason)
8869+
8870+
8871+    def _try_to_validate_privkey(self, enc_privkey, reader):
8872+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
8873+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
8874+        if alleged_writekey != self._node.get_writekey():
8875+            self.log("invalid privkey from %s shnum %d" %
8876+                     (reader, reader.shnum),
8877+                     level=log.WEIRD, umid="YIw4tA")
8878+            if self._verify:
8879+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
8880+                                              self.verinfo[-2])
8881+                e = CorruptShareError(reader.peerid,
8882+                                      reader.shnum,
8883+                                      "invalid privkey")
8884+                f = failure.Failure(e)
8885+                self._bad_shares.add((reader.peerid, reader.shnum, f))
8886             return
8887hunk ./src/allmydata/mutable/retrieve.py 1009
8888+
8889+        # it's good
8890+        self.log("got valid privkey from shnum %d on reader %s" %
8891+                 (reader.shnum, reader))
8892+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
8893+        self._node._populate_encprivkey(enc_privkey)
8894+        self._node._populate_privkey(privkey)
8895+        self._need_privkey = False
8896+
8897+
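
The check above works because the writekey is itself a tagged hash of the RSA
signing key, so a server cannot substitute a different private key without the
hash failing to match. A hypothetical helper restating it:

    from allmydata.util import hashutil

    def privkey_matches_cap(alleged_privkey_s, writekey):
        # hypothetical helper; the expected writekey comes from the write cap
        return hashutil.ssk_writekey_hash(alleged_privkey_s) == writekey
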
8898+    def _check_for_done(self, res):
8899+        """
8900+        I check to see if this Retrieve object has successfully finished
8901+        its work.
8902+
8903+        I can exit in the following ways:
8904+            - If there are no more segments to download, then I exit by
8905+              causing self._done_deferred to fire with the plaintext
8906+              content requested by the caller.
8907+            - If there are still segments to be downloaded, and there
8908+              are enough active readers (readers which have not broken
8909+              and have not given us corrupt data) to continue
8910+              downloading, I send control back to
8911+              _download_current_segment.
8912+            - If there are still segments to be downloaded but there are
8913+              not enough active peers to download them, I ask
8914+              _add_active_peers to add more peers. If it is successful,
8915+              it will call _download_current_segment. If there are not
8916+              enough peers to retrieve the file, then that will cause
8917+              _done_deferred to errback.
8918+        """
8919+        self.log("checking for doneness")
8920+        if self._current_segment > self._last_segment:
8921+            # No more segments to download, we're done.
8922+            self.log("got plaintext, done")
8923+            return self._done()
8924+
8925+        if len(self._active_readers) >= self._required_shares:
8926+            # More segments to download, but we have enough good peers
8927+            # in self._active_readers that we can do that without issue,
8928+            # so go nab the next segment.
8929+            self.log("not done yet: on segment %d of %d" % \
8930+                     (self._current_segment + 1, self._num_segments))
8931+            return self._download_current_segment()
8932+
8933+        self.log("not done yet: on segment %d of %d, need to add peers" % \
8934+                 (self._current_segment + 1, self._num_segments))
8935+        return self._add_active_peers()
8936+
8937+
8938+    def _done(self):
8939+        """
8940+        I am called by _check_for_done when the download process has
8941+        finished successfully. After making some useful logging
8942+        statements, I return the decrypted contents to the owner of this
8943+        Retrieve object through self._done_deferred.
8944+        """
8945         self._running = False
8946         self._status.set_active(False)
8947hunk ./src/allmydata/mutable/retrieve.py 1068
8948-        self._status.timings["total"] = time.time() - self._started
8949-        # res is either the new contents, or a Failure
8950-        if isinstance(res, failure.Failure):
8951-            self.log("Retrieve done, with failure", failure=res,
8952-                     level=log.UNUSUAL)
8953-            self._status.set_status("Failed")
8954+        now = time.time()
8955+        self._status.timings['total'] = now - self._started
8956+        self._status.timings['fetch'] = now - self._started_fetching
8957+
8958+        if self._verify:
8959+            ret = list(self._bad_shares)
8960+            self.log("done verifying, found %d bad shares" % len(ret))
8961         else:
8962hunk ./src/allmydata/mutable/retrieve.py 1076
8963-            self.log("Retrieve done, success!")
8964-            self._status.set_status("Finished")
8965-            self._status.set_progress(1.0)
8966-            # remember the encoding parameters, use them again next time
8967-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8968-             offsets_tuple) = self.verinfo
8969-            self._node._populate_required_shares(k)
8970-            self._node._populate_total_shares(N)
8971-        eventually(self._done_deferred.callback, res)
8972+            # TODO: update the download status here?
8973+            ret = self._consumer
8974+            self._consumer.unregisterProducer()
8975+        eventually(self._done_deferred.callback, ret)
8976+
8977 
8978hunk ./src/allmydata/mutable/retrieve.py 1082
8979+    def _failed(self):
8980+        """
8981+        I am called by _add_active_peers when there are not enough
8982+        active peers left to complete the download. After making some
8983+        useful logging statements, I return an exception to that effect
8984+        to the caller of this Retrieve object through
8985+        self._done_deferred.
8986+        """
8987+        self._running = False
8988+        self._status.set_active(False)
8989+        now = time.time()
8990+        self._status.timings['total'] = now - self._started
8991+        self._status.timings['fetch'] = now - self._started_fetching
8992+
8993+        if self._verify:
8994+            ret = list(self._bad_shares)
8995+        else:
8996+            format = ("ran out of peers: "
8997+                      "have %(have)d of %(total)d segments "
8998+                      "found %(bad)d bad shares "
8999+                      "encoding %(k)d-of-%(n)d")
9000+            args = {"have": self._current_segment,
9001+                    "total": self._num_segments,
9002+                    "need": self._last_segment,
9003+                    "k": self._required_shares,
9004+                    "n": self._total_shares,
9005+                    "bad": len(self._bad_shares)}
9006+            e = NotEnoughSharesError("%s, last failure: %s" % \
9007+                                     (format % args, str(self._last_failure)))
9008+            f = failure.Failure(e)
9009+            ret = f
9010+        eventually(self._done_deferred.callback, ret)
9011}
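
Both _done and _failed hand their result to the caller through foolscap's
eventually(), which schedules the callback on a later reactor turn; this keeps
_done_deferred from firing reentrantly inside whatever frame noticed
completion. A minimal sketch (it needs a running reactor to actually fire):

    from twisted.internet import defer
    from foolscap.api import eventually

    d = defer.Deferred()
    d.addCallback(lambda contents: contents)  # runs on a later reactor turn
    eventually(d.callback, "retrieved contents")
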
9012[mutable/servermap.py: Alter the servermap updater to work with MDMF files
9013Kevan Carstensen <kevan@isnotajoke.com>**20100819003439
9014 Ignore-this: 7e408303194834bd59a2f27efab3bdb
9015 
9016 These modifications mostly serve to make the servermap updater use
9017 the unified MDMF + SDMF read interface whenever possible -- this
9018 reduces the complexity of the code, making it easier to read and
9019 maintain. To do this, I needed to modify the process of updating
9020 the servermap slightly.
9021 
9022 To support partial-file updates, I also modified the servermap updater
9023 to fetch the block hash trees and certain segments of files while it
9024 performed a servermap update (this can be done without adding any new
9025 roundtrips because of batch-read functionality that the read proxy has).
9026 
9027] {
9028hunk ./src/allmydata/mutable/servermap.py 2
9029 
9030-import sys, time
9031+import sys, time, struct
9032 from zope.interface import implements
9033 from itertools import count
9034 from twisted.internet import defer
9035merger 0.0 (
9036hunk ./src/allmydata/mutable/servermap.py 9
9037+from allmydata.util.dictutil import DictOfSets
9038hunk ./src/allmydata/mutable/servermap.py 7
9039-from foolscap.api import DeadReferenceError, RemoteException, eventually
9040-from allmydata.util import base32, hashutil, idlib, log
9041+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
9042+                         fireEventually
9043+from allmydata.util import base32, hashutil, idlib, log, deferredutil
9044)
9045merger 0.0 (
9046hunk ./src/allmydata/mutable/servermap.py 14
9047-     DictOfSets, CorruptShareError, NeedMoreDataError
9048+     CorruptShareError, NeedMoreDataError
9049hunk ./src/allmydata/mutable/servermap.py 14
9050-     DictOfSets, CorruptShareError, NeedMoreDataError
9051-from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
9052-     SIGNED_PREFIX_LENGTH
9053+     DictOfSets, CorruptShareError
9054+from allmydata.mutable.layout import SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
9055)
9056hunk ./src/allmydata/mutable/servermap.py 123
9057         self.bad_shares = {} # maps (peerid,shnum) to old checkstring
9058         self.last_update_mode = None
9059         self.last_update_time = 0
9060+        self.update_data = {} # (verinfo,shnum) => data
9061 
9062     def copy(self):
9063         s = ServerMap()
9064hunk ./src/allmydata/mutable/servermap.py 254
9065         """Return a set of versionids, one for each version that is currently
9066         recoverable."""
9067         versionmap = self.make_versionmap()
9068-
9069         recoverable_versions = set()
9070         for (verinfo, shares) in versionmap.items():
9071             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9072hunk ./src/allmydata/mutable/servermap.py 339
9073         return False
9074 
9075 
9076+    def get_update_data_for_share_and_verinfo(self, shnum, verinfo):
9077+        """
9078+        I return the update data for the given shnum and verinfo.
9079+        """
9080+        update_data = self.update_data[shnum]
9081+        update_datum = [i[1] for i in update_data if i[0] == verinfo][0]
9082+        return update_datum
9083+
9084+
9085+    def set_update_data_for_share_and_verinfo(self, shnum, verinfo, data):
9086+        """
9087+        I record the update data for the given shnum and verinfo.
9088+        """
9089+        self.update_data.setdefault(shnum, []).append((verinfo, data))
9090+
9091+
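
The pair of accessors above keeps update data in a plain dict mapping shnum to
a list of (verinfo, data) tuples. A sketch with placeholder values (verinfo is
the opaque version tuple built later in this patch):

    from allmydata.mutable.servermap import ServerMap

    sm = ServerMap()
    verinfo = ("placeholder", "version", "tuple")
    update_data = ("blockhashes", "first block", "last block")
    sm.set_update_data_for_share_and_verinfo(3, verinfo, update_data)
    assert sm.get_update_data_for_share_and_verinfo(3, verinfo) == update_data
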
9092 class ServermapUpdater:
9093     def __init__(self, filenode, storage_broker, monitor, servermap,
9094hunk ./src/allmydata/mutable/servermap.py 357
9095-                 mode=MODE_READ, add_lease=False):
9096+                 mode=MODE_READ, add_lease=False, update_range=None):
9097         """I update a servermap, locating a sufficient number of useful
9098         shares and remembering where they are located.
9099 
9100hunk ./src/allmydata/mutable/servermap.py 382
9101         self._servers_responded = set()
9102 
9103         # how much data should we read?
9104+        # SDMF:
9105         #  * if we only need the checkstring, then [0:75]
9106         #  * if we need to validate the checkstring sig, then [543ish:799ish]
9107         #  * if we need the verification key, then [107:436ish]
9108merger 0.0 (
9109hunk ./src/allmydata/mutable/servermap.py 392
9110-        # read 2000 bytes, which also happens to read enough actual data to
9111-        # pre-fetch a 9-entry dirnode.
9112+        # read 4000 bytes, which also happens to read enough actual data to
9113+        # pre-fetch an 18-entry dirnode.
9114hunk ./src/allmydata/mutable/servermap.py 390
9115-        # A future version of the SMDF slot format should consider using
9116-        # fixed-size slots so we can retrieve less data. For now, we'll just
9117-        # read 2000 bytes, which also happens to read enough actual data to
9118-        # pre-fetch a 9-entry dirnode.
9119+        # MDMF:
9120+        #  * Checkstring? [0:72]
9121+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
9122+        #    the offset table will tell us for sure.
9123+        #  * If we need the verification key, we have to consult the offset
9124+        #    table as well.
9125+        # At this point, we don't know which we are. Our filenode can
9126+        # tell us, but it might be lying -- in some cases, we're
9127+        # responsible for telling it which kind of file it is.
9128)
9129hunk ./src/allmydata/mutable/servermap.py 399
9130             # we use unpack_prefix_and_signature, so we need 1k
9131             self._read_size = 1000
9132         self._need_privkey = False
9133+
9134         if mode == MODE_WRITE and not self._node.get_privkey():
9135             self._need_privkey = True
9136         # check+repair: repair requires the privkey, so if we didn't happen
9137hunk ./src/allmydata/mutable/servermap.py 406
9138         # to ask for it during the check, we'll have problems doing the
9139         # publish.
9140 
9141+        self.fetch_update_data = False
9142+        if mode == MODE_WRITE and update_range:
9143+            # We're updating the servermap in preparation for an
9144+            # in-place file update, so we need to fetch some additional
9145+            # data from each share that we find.
9146+            assert len(update_range) == 2
9147+
9148+            self.start_segment = update_range[0]
9149+            self.end_segment = update_range[1]
9150+            self.fetch_update_data = True
9151+
9152         prefix = si_b2a(self._storage_index)[:5]
9153         self._log_number = log.msg(format="SharemapUpdater(%(si)s): starting (%(mode)s)",
9154                                    si=prefix, mode=mode)
9155merger 0.0 (
9156hunk ./src/allmydata/mutable/servermap.py 455
9157-        full_peerlist = sb.get_servers_for_index(self._storage_index)
9158+        full_peerlist = [(s.get_serverid(), s.get_rref())
9159+                         for s in sb.get_servers_for_psi(self._storage_index)]
9160hunk ./src/allmydata/mutable/servermap.py 455
9161+        # All of the peers, permuted by the storage index, as usual.
9162)
9163hunk ./src/allmydata/mutable/servermap.py 461
9164         self._good_peers = set() # peers who had some shares
9165         self._empty_peers = set() # peers who don't have any shares
9166         self._bad_peers = set() # peers to whom our queries failed
9167+        self._readers = {} # peerid -> dict(shnum -> reader), filled in
9168+                           # after responses come in.
9169 
9170         k = self._node.get_required_shares()
9171hunk ./src/allmydata/mutable/servermap.py 465
9172+        # For what cases can these conditions work?
9173         if k is None:
9174             # make a guess
9175             k = 3
9176hunk ./src/allmydata/mutable/servermap.py 478
9177         self.num_peers_to_query = k + self.EPSILON
9178 
9179         if self.mode == MODE_CHECK:
9180+            # We want to query all of the peers.
9181             initial_peers_to_query = dict(full_peerlist)
9182             must_query = set(initial_peers_to_query.keys())
9183             self.extra_peers = []
9184hunk ./src/allmydata/mutable/servermap.py 486
9185             # we're planning to replace all the shares, so we want a good
9186             # chance of finding them all. We will keep searching until we've
9187             # seen epsilon that don't have a share.
9188+            # We don't query all of the peers because that could take a while.
9189             self.num_peers_to_query = N + self.EPSILON
9190             initial_peers_to_query, must_query = self._build_initial_querylist()
9191             self.required_num_empty_peers = self.EPSILON
9192hunk ./src/allmydata/mutable/servermap.py 496
9193             # might also avoid the round trip required to read the encrypted
9194             # private key.
9195 
9196-        else:
9197+        else: # MODE_READ, MODE_ANYTHING
9198+            # 2k peers is good enough.
9199             initial_peers_to_query, must_query = self._build_initial_querylist()
9200 
9201         # this is a set of peers that we are required to get responses from:
9202hunk ./src/allmydata/mutable/servermap.py 512
9203         # before we can consider ourselves finished, and self.extra_peers
9204         # contains the overflow (peers that we should tap if we don't get
9205         # enough responses)
9206+        # I guess that self._must_query is a subset of
9207+        # initial_peers_to_query?
9208+        assert set(must_query).issubset(set(initial_peers_to_query))
9209 
9210         self._send_initial_requests(initial_peers_to_query)
9211         self._status.timings["initial_queries"] = time.time() - self._started
9212hunk ./src/allmydata/mutable/servermap.py 571
9213         # errors that aren't handled by _query_failed (and errors caused by
9214         # _query_failed) get logged, but we still want to check for doneness.
9215         d.addErrback(log.err)
9216-        d.addBoth(self._check_for_done)
9217         d.addErrback(self._fatal_error)
9218hunk ./src/allmydata/mutable/servermap.py 572
9219+        d.addCallback(self._check_for_done)
9220         return d
9221 
9222     def _do_read(self, ss, peerid, storage_index, shnums, readv):
9223hunk ./src/allmydata/mutable/servermap.py 591
9224         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
9225         return d
9226 
9227+
9228+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
9229+        """
9230+        I am called when a remote server returns a corrupt share in
9231+        response to one of our queries. By corrupt, I mean a share
9232+        without a valid signature. I then record the failure, notify the
9233+        server of the corruption, and record the share as bad.
9234+        """
9235+        f = failure.Failure(e)
9236+        self.log(format="bad share: %(f_value)s", f_value=str(f),
9237+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
9238+        # Notify the server that its share is corrupt.
9239+        self.notify_server_corruption(peerid, shnum, str(e))
9240+        # By flagging this as a bad peer, we won't count any of
9241+        # the other shares on that peer as valid, though if we
9242+        # happen to find a valid version string amongst those
9243+        # shares, we'll keep track of it so that we don't need
9244+        # to validate the signature on those again.
9245+        self._bad_peers.add(peerid)
9246+        self._last_failure = f
9247+        # XXX: Use the reader for this?
9248+        checkstring = data[:SIGNED_PREFIX_LENGTH]
9249+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
9250+        self._servermap.problems.append(f)
9251+
9252+
9253+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
9254+        """
9255+        If one of my queries returns successfully (which means that we
9256+        were able to and successfully did validate the signature), I
9257+        cache the data that we initially fetched from the storage
9258+        server. This will help reduce the number of roundtrips that need
9259+        to occur when the file is downloaded, or when the file is
9260+        updated.
9261+        """
9262+        if verinfo:
9263+            self._node._add_to_cache(verinfo, shnum, 0, data, now)
9264+
9265+
9266     def _got_results(self, datavs, peerid, readsize, stuff, started):
9267         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
9268                       peerid=idlib.shortnodeid_b2a(peerid),
9269hunk ./src/allmydata/mutable/servermap.py 633
9270-                      numshares=len(datavs),
9271-                      level=log.NOISY)
9272+                      numshares=len(datavs))
9273         now = time.time()
9274         elapsed = now - started
9275hunk ./src/allmydata/mutable/servermap.py 636
9276-        self._queries_outstanding.discard(peerid)
9277-        self._servermap.reachable_peers.add(peerid)
9278-        self._must_query.discard(peerid)
9279-        self._queries_completed += 1
9280+        def _done_processing(ignored=None):
9281+            self._queries_outstanding.discard(peerid)
9282+            self._servermap.reachable_peers.add(peerid)
9283+            self._must_query.discard(peerid)
9284+            self._queries_completed += 1
9285         if not self._running:
9286hunk ./src/allmydata/mutable/servermap.py 642
9287-            self.log("but we're not running, so we'll ignore it", parent=lp,
9288-                     level=log.NOISY)
9289+            self.log("but we're not running, so we'll ignore it", parent=lp)
9290+            _done_processing()
9291             self._status.add_per_server_time(peerid, "late", started, elapsed)
9292             return
9293         self._status.add_per_server_time(peerid, "query", started, elapsed)
9294hunk ./src/allmydata/mutable/servermap.py 653
9295         else:
9296             self._empty_peers.add(peerid)
9297 
9298-        last_verinfo = None
9299-        last_shnum = None
9300+        ss, storage_index = stuff
9301+        ds = []
9302+
9303         for shnum,datav in datavs.items():
9304             data = datav[0]
9305             try:
9306merger 0.0 (
9307hunk ./src/allmydata/mutable/servermap.py 662
9308-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
9309+                self._node._add_to_cache(verinfo, shnum, 0, data)
9310hunk ./src/allmydata/mutable/servermap.py 658
9311-            try:
9312-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
9313-                last_verinfo = verinfo
9314-                last_shnum = shnum
9315-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
9316-            except CorruptShareError, e:
9317-                # log it and give the other shares a chance to be processed
9318-                f = failure.Failure()
9319-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
9320-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
9321-                self.notify_server_corruption(peerid, shnum, str(e))
9322-                self._bad_peers.add(peerid)
9323-                self._last_failure = f
9324-                checkstring = data[:SIGNED_PREFIX_LENGTH]
9325-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
9326-                self._servermap.problems.append(f)
9327-                pass
9328+            reader = MDMFSlotReadProxy(ss,
9329+                                       storage_index,
9330+                                       shnum,
9331+                                       data)
9332+            self._readers.setdefault(peerid, dict())[shnum] = reader
9333+            # our goal, with each response, is to validate the version
9334+            # information and share data as best we can at this point --
9335+            # we do this by validating the signature. To do this, we
9336+            # need to do the following:
9337+            #   - If we don't already have the public key, fetch the
9338+            #     public key. We use this to validate the signature.
9339+            if not self._node.get_pubkey():
9340+                # fetch and set the public key.
9341+                d = reader.get_verification_key(queue=True)
9342+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
9343+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
9344+                # XXX: Make self._pubkey_query_failed?
9345+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
9346+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
9347+            else:
9348+                # we already have the public key.
9349+                d = defer.succeed(None)
9350)
9351hunk ./src/allmydata/mutable/servermap.py 676
9352                 self._servermap.problems.append(f)
9353                 pass
9354 
9355-        self._status.timings["cumulative_verify"] += (time.time() - now)
9356+            # Neither of these two branches returns anything of
9357+            # consequence, so the first entry in our deferredlist will
9358+            # be None.
9359 
9360hunk ./src/allmydata/mutable/servermap.py 680
9361-        if self._need_privkey and last_verinfo:
9362-            # send them a request for the privkey. We send one request per
9363-            # server.
9364-            lp2 = self.log("sending privkey request",
9365-                           parent=lp, level=log.NOISY)
9366-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9367-             offsets_tuple) = last_verinfo
9368-            o = dict(offsets_tuple)
9369+            # - Next, we need the version information. We almost
9370+            #   certainly got this by reading the first thousand or so
9371+            #   bytes of the share on the storage server, so we
9372+            #   shouldn't need to fetch anything at this step.
9373+            d2 = reader.get_verinfo()
9374+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
9375+                self._got_corrupt_share(error, shnum, peerid, data, lp))
9376+            # - Next, we need the signature. For an SDMF share, it is
9377+            #   likely that we fetched this when doing our initial fetch
9378+            #   to get the version information. In MDMF, this lives at
9379+            #   the end of the share, so unless the file is quite small,
9380+            #   we'll need to do a remote fetch to get it.
9381+            d3 = reader.get_signature(queue=True)
9382+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
9383+                self._got_corrupt_share(error, shnum, peerid, data, lp))
9384+            #  Once we have all three of these responses, we can move on
9385+            #  to validating the signature
9386 
9387hunk ./src/allmydata/mutable/servermap.py 698
9388-            self._queries_outstanding.add(peerid)
9389-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
9390-            ss = self._servermap.connections[peerid]
9391-            privkey_started = time.time()
9392-            d = self._do_read(ss, peerid, self._storage_index,
9393-                              [last_shnum], readv)
9394-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
9395-                          privkey_started, lp2)
9396-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
9397-            d.addErrback(log.err)
9398-            d.addCallback(self._check_for_done)
9399-            d.addErrback(self._fatal_error)
9400+            # Does the node already have a privkey? If not, we'll try to
9401+            # fetch it here.
9402+            if self._need_privkey:
9403+                d4 = reader.get_encprivkey(queue=True)
9404+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
9405+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
9406+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
9407+                    self._privkey_query_failed(error, shnum, data, lp))
9408+            else:
9409+                d4 = defer.succeed(None)
9410+
9411+
9412+            if self.fetch_update_data:
9413+                # fetch the block hash tree and first + last segment, as
9414+                # configured earlier.
9415+                # Then record them in the servermap for use during
9416+                # the partial-file update.
9417+                update_reads = [] # don't clobber the outer ds list
9418+                # XXX: We do this above, too. Is there a good way to
9419+                # make the two routines share the value without
9420+                # introducing more roundtrips?
9421+                update_reads.append(reader.get_verinfo())
9422+                update_reads.append(reader.get_blockhashes(queue=True))
9423+                update_reads.append(reader.get_block_and_salt(self.start_segment,
9424+                                                               queue=True))
9425+                update_reads.append(reader.get_block_and_salt(self.end_segment,
9426+                                                               queue=True))
9427+                d5 = deferredutil.gatherResults(update_reads)
9428+                d5.addCallback(self._got_update_results_one_share, shnum)
9429+            else:
9430+                d5 = defer.succeed(None)
9431 
9432hunk ./src/allmydata/mutable/servermap.py 730
9433+            dl = defer.DeferredList([d, d2, d3, d4, d5])
9434+            dl.addBoth(self._turn_barrier)
9435+            reader.flush()
9436+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
9437+                self._got_signature_one_share(results, shnum, peerid, lp))
9438+            dl.addErrback(lambda error, shnum=shnum, data=data:
9439+               self._got_corrupt_share(error, shnum, peerid, data, lp))
9440+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
9441+                self._cache_good_sharedata(verinfo, shnum, now, data))
9442+            ds.append(dl)
9443+        # dl is a deferred list that will fire when all of the shares
9444+        # that we found on this peer are done processing. When dl fires,
9445+        # we know that processing is done, so we can decrement the
9446+        # semaphore-like thing that we incremented earlier.
9447+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
9448+        # Are we done? Done means that there are no more queries to
9449+        # send, that there are no outstanding queries, and that we
9450+        # haven't received any queries that are still processing. If we
9451+        # are done, self._check_for_done will cause the done deferred
9452+        # that we returned to our caller to fire, which tells them that
9453+        # they have a complete servermap, and that we won't be touching
9454+        # the servermap anymore.
9455+        dl.addCallback(_done_processing)
9456+        dl.addCallback(self._check_for_done)
9457+        dl.addErrback(self._fatal_error)
9458         # all done!
9459         self.log("_got_results done", parent=lp, level=log.NOISY)
9460hunk ./src/allmydata/mutable/servermap.py 757
9461+        return dl
9462+
9463+
9464+    def _turn_barrier(self, result):
9465+        """
9466+        I help the servermap updater avoid the recursion limit issues
9467+        discussed in #237.
9468+        """
9469+        return fireEventually(result)
9470+
9471+
9472+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
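
fireEventually(result) returns a Deferred that fires with result on a later
reactor turn, so inserting it into a long callback chain bounds the Python
stack depth (the #237 recursion issue the docstring mentions). A minimal
sketch (again, it needs a running reactor to complete):

    from twisted.internet import defer
    from foolscap.api import fireEventually

    d = defer.succeed(0)
    for _ in range(10000):
        d.addCallback(lambda n: n + 1)
        d.addCallback(fireEventually)   # resume on a fresh reactor turn
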
9473+        if self._node.get_pubkey():
9474+            return # don't go through this again if we don't have to
9475+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
9476+        assert len(fingerprint) == 32
9477+        if fingerprint != self._node.get_fingerprint():
9478+            raise CorruptShareError(peerid, shnum,
9479+                                "pubkey doesn't match fingerprint")
9480+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
9481+        assert self._node.get_pubkey()
9482+
9483 
9484     def notify_server_corruption(self, peerid, shnum, reason):
9485         ss = self._servermap.connections[peerid]
9486hunk ./src/allmydata/mutable/servermap.py 785
9487         ss.callRemoteOnly("advise_corrupt_share",
9488                           "mutable", self._storage_index, shnum, reason)
9489 
9490-    def _got_results_one_share(self, shnum, data, peerid, lp):
9491+
9492+    def _got_signature_one_share(self, results, shnum, peerid, lp):
9493+        # It is our job to give versioninfo to our caller. We need to
9494+        # raise CorruptShareError if the share is corrupt for any
9495+        # reason, something that our caller will handle.
9496         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
9497                  shnum=shnum,
9498                  peerid=idlib.shortnodeid_b2a(peerid),
9499hunk ./src/allmydata/mutable/servermap.py 795
9500                  level=log.NOISY,
9501                  parent=lp)
9502+        if not self._running:
9503+            # We can't process the results, since we can't touch the
9504+            # servermap anymore.
9505+            self.log("but we're not running anymore.")
9506+            return None
9507 
9508hunk ./src/allmydata/mutable/servermap.py 801
9509-        # this might raise NeedMoreDataError, if the pubkey and signature
9510-        # live at some weird offset. That shouldn't happen, so I'm going to
9511-        # treat it as a bad share.
9512-        (seqnum, root_hash, IV, k, N, segsize, datalength,
9513-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
9514-
9515-        if not self._node.get_pubkey():
9516-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
9517-            assert len(fingerprint) == 32
9518-            if fingerprint != self._node.get_fingerprint():
9519-                raise CorruptShareError(peerid, shnum,
9520-                                        "pubkey doesn't match fingerprint")
9521-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
9522-
9523-        if self._need_privkey:
9524-            self._try_to_extract_privkey(data, peerid, shnum, lp)
9525-
9526-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
9527-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
9528+        _, verinfo, signature, __, ___ = results
9529+        (seqnum,
9530+         root_hash,
9531+         saltish,
9532+         segsize,
9533+         datalen,
9534+         k,
9535+         n,
9536+         prefix,
9537+         offsets) = verinfo[1]
9538         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
9539 
9540hunk ./src/allmydata/mutable/servermap.py 813
9541-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9542+        # XXX: This should be done for us in the method, so
9543+        # presumably you can go in there and fix it.
9544+        verinfo = (seqnum,
9545+                   root_hash,
9546+                   saltish,
9547+                   segsize,
9548+                   datalen,
9549+                   k,
9550+                   n,
9551+                   prefix,
9552                    offsets_tuple)
9553hunk ./src/allmydata/mutable/servermap.py 824
9554+        # This tuple uniquely identifies a share on the grid; we use it
9555+        # to keep track of the ones that we've already seen.
9556 
9557         if verinfo not in self._valid_versions:
9558hunk ./src/allmydata/mutable/servermap.py 828
9559-            # it's a new pair. Verify the signature.
9560-            valid = self._node.get_pubkey().verify(prefix, signature)
9561+            # This is a new version tuple, and we need to validate it
9562+            # against the public key before keeping track of it.
9563+            assert self._node.get_pubkey()
9564+            valid = self._node.get_pubkey().verify(prefix, signature[1])
9565             if not valid:
9566hunk ./src/allmydata/mutable/servermap.py 833
9567-                raise CorruptShareError(peerid, shnum, "signature is invalid")
9568+                raise CorruptShareError(peerid, shnum,
9569+                                        "signature is invalid")
9570 
9571hunk ./src/allmydata/mutable/servermap.py 836
9572-            # ok, it's a valid verinfo. Add it to the list of validated
9573-            # versions.
9574-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
9575-                     % (seqnum, base32.b2a(root_hash)[:4],
9576-                        idlib.shortnodeid_b2a(peerid), shnum,
9577-                        k, N, segsize, datalength),
9578-                     parent=lp)
9579-            self._valid_versions.add(verinfo)
9580-        # We now know that this is a valid candidate verinfo.
9581+        # ok, it's a valid verinfo. Add it to the list of validated
9582+        # versions.
9583+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
9584+                 % (seqnum, base32.b2a(root_hash)[:4],
9585+                    idlib.shortnodeid_b2a(peerid), shnum,
9586+                    k, n, segsize, datalen),
9587+                    parent=lp)
9588+        self._valid_versions.add(verinfo)
9589+        # We now know that this is a valid candidate verinfo. Whether or
9590+        # not this instance of it is valid is a matter for the next
9591+        # statement; at this point, we just know that if we see this
9592+        # version info again, that its signature checks out and that
9593+        # we're okay to skip the signature-checking step.
9594 
9595hunk ./src/allmydata/mutable/servermap.py 850
9596+        # (peerid, shnum) are bound in the method invocation.
9597         if (peerid, shnum) in self._servermap.bad_shares:
9598             # we've been told that the rest of the data in this share is
9599             # unusable, so don't add it to the servermap.
9600hunk ./src/allmydata/mutable/servermap.py 863
9601         self._servermap.add_new_share(peerid, shnum, verinfo, timestamp)
9602         # and the versionmap
9603         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
9604+
9605+        # It's our job to set the protocol version of our parent
9606+        # filenode if it isn't already set.
9607+        if not self._node.get_version():
9608+            # The first byte of the prefix is the version.
9609+            v = struct.unpack(">B", prefix[:1])[0]
9610+            self.log("got version %d" % v)
9611+            self._node.set_version(v)
9612+
9613         return verinfo
9614 
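
The version byte sits at the very front of the signed prefix; SDMF_VERSION is
0 and MDMF_VERSION is 1 in allmydata.interfaces. A sketch of the unpack used
above:

    import struct
    from allmydata.interfaces import SDMF_VERSION, MDMF_VERSION

    prefix = struct.pack(">B", MDMF_VERSION) + "rest of the signed prefix"
    v = struct.unpack(">B", prefix[:1])[0]
    assert v == MDMF_VERSION and v != SDMF_VERSION
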
9615hunk ./src/allmydata/mutable/servermap.py 874
9616-    def _deserialize_pubkey(self, pubkey_s):
9617-        verifier = rsa.create_verifying_key_from_string(pubkey_s)
9618-        return verifier
9619 
9620hunk ./src/allmydata/mutable/servermap.py 875
9621-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
9622-        try:
9623-            r = unpack_share(data)
9624-        except NeedMoreDataError, e:
9625-            # this share won't help us. oh well.
9626-            offset = e.encprivkey_offset
9627-            length = e.encprivkey_length
9628-            self.log("shnum %d on peerid %s: share was too short (%dB) "
9629-                     "to get the encprivkey; [%d:%d] ought to hold it" %
9630-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
9631-                      offset, offset+length),
9632-                     parent=lp)
9633-            # NOTE: if uncoordinated writes are taking place, someone might
9634-            # change the share (and most probably move the encprivkey) before
9635-            # we get a chance to do one of these reads and fetch it. This
9636-            # will cause us to see a NotEnoughSharesError(unable to fetch
9637-            # privkey) instead of an UncoordinatedWriteError . This is a
9638-            # nuisance, but it will go away when we move to DSA-based mutable
9639-            # files (since the privkey will be small enough to fit in the
9640-            # write cap).
9641+    def _got_update_results_one_share(self, results, share):
9642+        """
9643+        I record the update data fetched from one share in the servermap.
9644+        """
9645+        assert len(results) == 4
9646+        verinfo, blockhashes, start, end = results
9647+        (seqnum,
9648+         root_hash,
9649+         saltish,
9650+         segsize,
9651+         datalen,
9652+         k,
9653+         n,
9654+         prefix,
9655+         offsets) = verinfo
9656+        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
9657 
9658hunk ./src/allmydata/mutable/servermap.py 892
9659-            return
9660+        # XXX: This should be done for us in the method, so
9661+        # presumably you can go in there and fix it.
9662+        verinfo = (seqnum,
9663+                   root_hash,
9664+                   saltish,
9665+                   segsize,
9666+                   datalen,
9667+                   k,
9668+                   n,
9669+                   prefix,
9670+                   offsets_tuple)
9671 
9672hunk ./src/allmydata/mutable/servermap.py 904
9673-        (seqnum, root_hash, IV, k, N, segsize, datalen,
9674-         pubkey, signature, share_hash_chain, block_hash_tree,
9675-         share_data, enc_privkey) = r
9676+        update_data = (blockhashes, start, end)
9677+        self._servermap.set_update_data_for_share_and_verinfo(share,
9678+                                                              verinfo,
9679+                                                              update_data)
9680 
9681hunk ./src/allmydata/mutable/servermap.py 909
9682-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
9683+
9684+    def _deserialize_pubkey(self, pubkey_s):
9685+        verifier = rsa.create_verifying_key_from_string(pubkey_s)
9686+        return verifier
9687 
9688hunk ./src/allmydata/mutable/servermap.py 914
9689-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
9690 
9691hunk ./src/allmydata/mutable/servermap.py 915
9692+    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
9693+        """
9694+        Given an encrypted privkey from a remote server, I validate it
9695+        against the writekey stored in my node. If it is valid, then I set the
9696+        privkey and encprivkey properties of the node.
9697+        """
9698         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
9699         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
9700         if alleged_writekey != self._node.get_writekey():
9701hunk ./src/allmydata/mutable/servermap.py 993
9702         self._queries_completed += 1
9703         self._last_failure = f
9704 
9705-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
9706-        now = time.time()
9707-        elapsed = now - started
9708-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
9709-        self._queries_outstanding.discard(peerid)
9710-        if not self._need_privkey:
9711-            return
9712-        if shnum not in datavs:
9713-            self.log("privkey wasn't there when we asked it",
9714-                     level=log.WEIRD, umid="VA9uDQ")
9715-            return
9716-        datav = datavs[shnum]
9717-        enc_privkey = datav[0]
9718-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
9719 
9720     def _privkey_query_failed(self, f, peerid, shnum, lp):
9721         self._queries_outstanding.discard(peerid)
9722hunk ./src/allmydata/mutable/servermap.py 1007
9723         self._servermap.problems.append(f)
9724         self._last_failure = f
9725 
9726+
9727     def _check_for_done(self, res):
9728         # exit paths:
9729         #  return self._send_more_queries(outstanding) : send some more queries
9730hunk ./src/allmydata/mutable/servermap.py 1013
9731         #  return self._done() : all done
9732         #  return : keep waiting, no new queries
9733-
9734         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
9735                               "%(outstanding)d queries outstanding, "
9736                               "%(extra)d extra peers available, "
9737hunk ./src/allmydata/mutable/servermap.py 1204
9738 
9739     def _done(self):
9740         if not self._running:
9741+            self.log("not running; we're already done")
9742             return
9743         self._running = False
9744         now = time.time()
9745hunk ./src/allmydata/mutable/servermap.py 1219
9746         self._servermap.last_update_time = self._started
9747         # the servermap will not be touched after this
9748         self.log("servermap: %s" % self._servermap.summarize_versions())
9749+
9750         eventually(self._done_deferred.callback, self._servermap)
9751 
9752     def _fatal_error(self, f):
9753}
9754[tests:
9755Kevan Carstensen <kevan@isnotajoke.com>**20100819003531
9756 Ignore-this: 314e8bbcce532ea4d5d2cecc9f31cca0
9757 
9758     - A lot of existing tests relied on aspects of the mutable file
9759       implementation that were changed. This patch updates those tests
9760       to work with the changes.
9761     - This patch also adds tests for new features.
9762] {
9763hunk ./src/allmydata/test/common.py 11
9764 from foolscap.api import flushEventualQueue, fireEventually
9765 from allmydata import uri, dirnode, client
9766 from allmydata.introducer.server import IntroducerNode
9767-from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
9768-     FileTooLargeError, NotEnoughSharesError, ICheckable
9769+from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
9770+                                 NotEnoughSharesError, ICheckable, \
9771+                                 IMutableUploadable, SDMF_VERSION, \
9772+                                 MDMF_VERSION
9773 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
9774      DeepCheckResults, DeepCheckAndRepairResults
9775 from allmydata.mutable.common import CorruptShareError
9776hunk ./src/allmydata/test/common.py 19
9777 from allmydata.mutable.layout import unpack_header
9778+from allmydata.mutable.publish import MutableData
9779 from allmydata.storage.server import storage_index_to_dir
9780 from allmydata.storage.mutable import MutableShareFile
9781 from allmydata.util import hashutil, log, fileutil, pollmixin
9782hunk ./src/allmydata/test/common.py 153
9783         consumer.write(data[start:end])
9784         return consumer
9785 
9786+
9787+    def get_best_readable_version(self):
9788+        return defer.succeed(self)
9789+
9790+
9791+
9792+    def download_to_data(self):
9793+        return download_to_data(self)
9794+
9795+    download_best_version = download_to_data
9796+
9797+
9798+    def get_size_of_best_version(self):
9799+        return defer.succeed(self.get_size())
9800+
9801+
9802 def make_chk_file_cap(size):
9803     return uri.CHKFileURI(key=os.urandom(16),
9804                           uri_extension_hash=os.urandom(32),
9805hunk ./src/allmydata/test/common.py 193
9806     MUTABLE_SIZELIMIT = 10000
9807     all_contents = {}
9808     bad_shares = {}
9809+    file_types = {} # storage index => MDMF_VERSION or SDMF_VERSION
9810 
9811     def __init__(self, storage_broker, secret_holder,
9812                  default_encoding_parameters, history):
9813hunk ./src/allmydata/test/common.py 200
9814         self.init_from_cap(make_mutable_file_cap())
9815     def create(self, contents, key_generator=None, keysize=None):
9816         initial_contents = self._get_initial_contents(contents)
9817-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
9818-            raise FileTooLargeError("SDMF is limited to one segment, and "
9819-                                    "%d > %d" % (len(initial_contents),
9820-                                                 self.MUTABLE_SIZELIMIT))
9821-        self.all_contents[self.storage_index] = initial_contents
9822+        data = initial_contents.read(initial_contents.get_size())
9823+        data = "".join(data)
9824+        self.all_contents[self.storage_index] = data
9825         return defer.succeed(self)
9826     def _get_initial_contents(self, contents):
9827hunk ./src/allmydata/test/common.py 205
9828-        if isinstance(contents, str):
9829-            return contents
9830         if contents is None:
9831hunk ./src/allmydata/test/common.py 206
9832-            return ""
9833+            return MutableData("")
9834+
9835+        if IMutableUploadable.providedBy(contents):
9836+            return contents
9837+
9838         assert callable(contents), "%s should be callable, not %s" % \
9839                (contents, type(contents))
9840         return contents(self)
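For orientation, the IMutableUploadable shape that these fakes now expect,
inferred from how MutableData is used throughout this patch (read() returns a
list of chunks, so callers join them):

    from allmydata.mutable.publish import MutableData

    u = MutableData("hello world")
    assert u.get_size() == 11
    chunks = u.read(u.get_size())       # a list of byte-string chunks
    assert "".join(chunks) == "hello world"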
9841hunk ./src/allmydata/test/common.py 258
9842     def get_storage_index(self):
9843         return self.storage_index
9844 
9845+    def get_servermap(self, mode):
9846+        return defer.succeed(None)
9847+
9848+    def set_version(self, version):
9849+        assert version in (SDMF_VERSION, MDMF_VERSION)
9850+        self.file_types[self.storage_index] = version
9851+
9852+    def get_version(self):
9853+        assert self.storage_index in self.file_types
9854+        return self.file_types[self.storage_index]
9855+
9856     def check(self, monitor, verify=False, add_lease=False):
9857         r = CheckResults(self.my_uri, self.storage_index)
9858         is_bad = self.bad_shares.get(self.storage_index, None)
9859hunk ./src/allmydata/test/common.py 327
9860         return d
9861 
9862     def download_best_version(self):
9863+        return defer.maybeDeferred(self._download_best_version)
9864+
9865+
9866+    def _download_best_version(self, ignored=None):
9867         if isinstance(self.my_uri, uri.LiteralFileURI):
9868hunk ./src/allmydata/test/common.py 332
9869-            return defer.succeed(self.my_uri.data)
9870+            return self.my_uri.data
9871         if self.storage_index not in self.all_contents:
9872hunk ./src/allmydata/test/common.py 334
9873-            return defer.fail(NotEnoughSharesError(None, 0, 3))
9874-        return defer.succeed(self.all_contents[self.storage_index])
9875+            raise NotEnoughSharesError(None, 0, 3)
9876+        return self.all_contents[self.storage_index]
9877+
9878 
9879     def overwrite(self, new_contents):
9880hunk ./src/allmydata/test/common.py 339
9881-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
9882-            raise FileTooLargeError("SDMF is limited to one segment, and "
9883-                                    "%d > %d" % (len(new_contents),
9884-                                                 self.MUTABLE_SIZELIMIT))
9885         assert not self.is_readonly()
9886hunk ./src/allmydata/test/common.py 340
9887-        self.all_contents[self.storage_index] = new_contents
9888+        new_data = new_contents.read(new_contents.get_size())
9889+        new_data = "".join(new_data)
9890+        self.all_contents[self.storage_index] = new_data
9891         return defer.succeed(None)
9892     def modify(self, modifier):
9893         # this does not implement FileTooLargeError, but the real one does
9894hunk ./src/allmydata/test/common.py 350
9895     def _modify(self, modifier):
9896         assert not self.is_readonly()
9897         old_contents = self.all_contents[self.storage_index]
9898-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
9899+        new_data = modifier(old_contents, None, True)
9900+        self.all_contents[self.storage_index] = new_data
9901         return None
9902 
9903hunk ./src/allmydata/test/common.py 354
9904+    # As actually implemented, MutableFilenode and MutableFileVersion
9905+    # are distinct. However, nothing in the webapi uses (yet) that
9906+    # distinction -- it just uses the unified download interface
9907+    # provided by get_best_readable_version and read. When we start
9908+    # doing cooler things like LDMF, we will want to revise this code to
9909+    # be less simplistic.
9910+    def get_best_readable_version(self):
9911+        return defer.succeed(self)
9912+
9913+
9914+    def get_best_mutable_version(self):
9915+        return defer.succeed(self)
9916+
9917+    # Ditto for this, which is an implementation of IWritable.
9918+    # XXX: declare (via implements()) that this class provides IWritable.
9919+    def update(self, data, offset):
9920+        assert not self.is_readonly()
9921+        def modifier(old, servermap, first_time):
9922+            new = old[:offset] + "".join(data.read(data.get_size()))
9923+            new += old[len(new):]
9924+            return new
9925+        return self.modify(modifier)
9926+
9927+
9928+    def read(self, consumer, offset=0, size=None):
9929+        data = self._download_best_version()
9930+        if size is None: size = len(data) - offset
9931+        data = data[offset:offset+size]
9932+        consumer.write(data)
9933+        return defer.succeed(consumer)
9934+
9935+
9936 def make_mutable_file_cap():
9937     return uri.WriteableSSKFileURI(writekey=os.urandom(16),
9938                                    fingerprint=os.urandom(32))
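A worked trace of the splice semantics that the fake's update() implements
above (editorial example; values made up): writes replace bytes in place and
keep whatever tail lies past the written region, and writes at or past the
current end simply append, with no zero-fill gap.

    old = "abcdef"
    new = old[:2] + "XYZ"       # write "XYZ" at offset 2
    new += old[len(new):]       # keep the tail past the written region
    assert new == "abXYZf"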
9939hunk ./src/allmydata/test/test_checker.py 11
9940 from allmydata.test.no_network import GridTestMixin
9941 from allmydata.immutable.upload import Data
9942 from allmydata.test.common_web import WebRenderingMixin
9943+from allmydata.mutable.publish import MutableData
9944 
9945 class FakeClient:
9946     def get_storage_broker(self):
9947hunk ./src/allmydata/test/test_checker.py 291
9948         def _stash_immutable(ur):
9949             self.imm = c0.create_node_from_uri(ur.uri)
9950         d.addCallback(_stash_immutable)
9951-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
9952+        d.addCallback(lambda ign:
9953+            c0.create_mutable_file(MutableData("contents")))
9954         def _stash_mutable(node):
9955             self.mut = node
9956         d.addCallback(_stash_mutable)
9957hunk ./src/allmydata/test/test_cli.py 13
9958 from allmydata.util import fileutil, hashutil, base32
9959 from allmydata import uri
9960 from allmydata.immutable import upload
9961+from allmydata.mutable.publish import MutableData
9962 from allmydata.dirnode import normalize
9963 
9964 # Test that the scripts can be imported.
9974hunk ./src/allmydata/test/test_cli.py 967
9975         d.addCallback(lambda (rc,out,err): self.failUnlessReallyEqual(out, DATA2))
9976         return d
9977 
9978+    def test_mutable_type(self):
9979+        self.basedir = "cli/Put/mutable_type"
9980+        self.set_up_grid()
9981+        data = "data" * 100000
9982+        fn1 = os.path.join(self.basedir, "data")
9983+        fileutil.write(fn1, data)
9984+        d = self.do_cli("create-alias", "tahoe")
9985+        d.addCallback(lambda ignored:
9986+            self.do_cli("put", "--mutable", "--mutable-type=mdmf",
9987+                        fn1, "tahoe:uploaded.txt"))
9988+        d.addCallback(lambda ignored:
9989+            self.do_cli("ls", "--json", "tahoe:uploaded.txt"))
9990+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
9991+        d.addCallback(lambda ignored:
9992+            self.do_cli("put", "--mutable", "--mutable-type=sdmf",
9993+                        fn1, "tahoe:uploaded2.txt"))
9994+        d.addCallback(lambda ignored:
9995+            self.do_cli("ls", "--json", "tahoe:uploaded2.txt"))
9996+        d.addCallback(lambda (rc, json, err):
9997+            self.failUnlessIn("sdmf", json))
9998+        return d
9999+
10000+    def test_mutable_type_unlinked(self):
10001+        self.basedir = "cli/Put/mutable_type_unlinked"
10002+        self.set_up_grid()
10003+        data = "data" * 100000
10004+        fn1 = os.path.join(self.basedir, "data")
10005+        fileutil.write(fn1, data)
10006+        d = self.do_cli("put", "--mutable", "--mutable-type=mdmf", fn1)
10007+        d.addCallback(lambda (rc, cap, err):
10008+            self.do_cli("ls", "--json", cap))
10009+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
10010+        d.addCallback(lambda ignored:
10011+            self.do_cli("put", "--mutable", "--mutable-type=sdmf", fn1))
10012+        d.addCallback(lambda (rc, cap, err):
10013+            self.do_cli("ls", "--json", cap))
10014+        d.addCallback(lambda (rc, json, err):
10015+            self.failUnlessIn("sdmf", json))
10016+        return d
10017+
10018+    def test_mutable_type_invalid_format(self):
10019+        self.basedir = "cli/Put/mutable_type_invalid_format"
10020+        self.set_up_grid()
10021+        data = "data" * 100000
10022+        fn1 = os.path.join(self.basedir, "data")
10023+        fileutil.write(fn1, data)
10024+        d = self.do_cli("put", "--mutable", "--mutable-type=ldmf", fn1)
10025+        def _check_failure((rc, out, err)):
10026+            self.failIfEqual(rc, 0)
10027+            self.failUnlessIn("invalid", err)
10028+        d.addCallback(_check_failure)
10029+        return d
10030+
10031     def test_put_with_nonexistent_alias(self):
10032         # when invoked with an alias that doesn't exist, 'tahoe put'
10033         # should output a useful error message, not a stack trace
10034hunk ./src/allmydata/test/test_cli.py 2136
10035         self.set_up_grid()
10036         c0 = self.g.clients[0]
10037         DATA = "data" * 100
10038-        d = c0.create_mutable_file(DATA)
10039+        DATA_uploadable = MutableData(DATA)
10040+        d = c0.create_mutable_file(DATA_uploadable)
10041         def _stash_uri(n):
10042             self.uri = n.get_uri()
10043         d.addCallback(_stash_uri)
10044hunk ./src/allmydata/test/test_cli.py 2238
10045                                            upload.Data("literal",
10046                                                         convergence="")))
10047         d.addCallback(_stash_uri, "small")
10048-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
10049+        d.addCallback(lambda ign:
10050+            c0.create_mutable_file(MutableData(DATA+"1")))
10051         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
10052         d.addCallback(_stash_uri, "mutable")
10053 
10054hunk ./src/allmydata/test/test_cli.py 2257
10055         # root/small
10056         # root/mutable
10057 
10058+        # We haven't broken anything yet, so this should all be healthy.
10059         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
10060                                               self.rooturi))
10061         def _check2((rc, out, err)):
10062hunk ./src/allmydata/test/test_cli.py 2272
10063                             in lines, out)
10064         d.addCallback(_check2)
10065 
10066+        # Similarly, all of these results should be as we expect them to
10067+        # be for a healthy file layout.
10068         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
10069         def _check_stats((rc, out, err)):
10070             self.failUnlessReallyEqual(err, "")
10071hunk ./src/allmydata/test/test_cli.py 2289
10072             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
10073         d.addCallback(_check_stats)
10074 
10075+        # Now we break things.
10076         def _clobber_shares(ignored):
10077             shares = self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"])
10078             self.failUnlessReallyEqual(len(shares), 10)
10079hunk ./src/allmydata/test/test_cli.py 2314
10080 
10081         d.addCallback(lambda ign:
10082                       self.do_cli("deep-check", "--verbose", self.rooturi))
10083+        # This should reveal the missing share, but not the corrupt
10084+        # share, since we didn't tell the deep check operation to also
10085+        # verify.
10086         def _check3((rc, out, err)):
10087             self.failUnlessReallyEqual(err, "")
10088             self.failUnlessReallyEqual(rc, 0)
10089hunk ./src/allmydata/test/test_cli.py 2365
10090                                   "--verbose", "--verify", "--repair",
10091                                   self.rooturi))
10092         def _check6((rc, out, err)):
10093+            # We've just repaired the directory, so the repair should
10094+            # have succeeded.
10095             self.failUnlessReallyEqual(err, "")
10096             self.failUnlessReallyEqual(rc, 0)
10097             lines = out.splitlines()
10098hunk ./src/allmydata/test/test_deepcheck.py 9
10099 from twisted.internet import threads # CLI tests use deferToThread
10100 from allmydata.immutable import upload
10101 from allmydata.mutable.common import UnrecoverableFileError
10102+from allmydata.mutable.publish import MutableData
10103 from allmydata.util import idlib
10104 from allmydata.util import base32
10105 from allmydata.scripts import runner
10106hunk ./src/allmydata/test/test_deepcheck.py 38
10107         self.basedir = "deepcheck/MutableChecker/good"
10108         self.set_up_grid()
10109         CONTENTS = "a little bit of data"
10110-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10111+        CONTENTS_uploadable = MutableData(CONTENTS)
10112+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10113         def _created(node):
10114             self.node = node
10115             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10116hunk ./src/allmydata/test/test_deepcheck.py 61
10117         self.basedir = "deepcheck/MutableChecker/corrupt"
10118         self.set_up_grid()
10119         CONTENTS = "a little bit of data"
10120-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10121+        CONTENTS_uploadable = MutableData(CONTENTS)
10122+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10123         def _stash_and_corrupt(node):
10124             self.node = node
10125             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10126hunk ./src/allmydata/test/test_deepcheck.py 99
10127         self.basedir = "deepcheck/MutableChecker/delete_share"
10128         self.set_up_grid()
10129         CONTENTS = "a little bit of data"
10130-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10131+        CONTENTS_uploadable = MutableData(CONTENTS)
10132+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10133         def _stash_and_delete(node):
10134             self.node = node
10135             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10136hunk ./src/allmydata/test/test_deepcheck.py 223
10137             self.root = n
10138             self.root_uri = n.get_uri()
10139         d.addCallback(_created_root)
10140-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
10141+        d.addCallback(lambda ign:
10142+            c0.create_mutable_file(MutableData("mutable file contents")))
10143         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
10144         def _created_mutable(n):
10145             self.mutable = n
10146hunk ./src/allmydata/test/test_deepcheck.py 965
10147     def create_mangled(self, ignored, name):
10148         nodetype, mangletype = name.split("-", 1)
10149         if nodetype == "mutable":
10150-            d = self.g.clients[0].create_mutable_file("mutable file contents")
10151+            mutable_uploadable = MutableData("mutable file contents")
10152+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
10153             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
10154         elif nodetype == "large":
10155             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
10156hunk ./src/allmydata/test/test_dirnode.py 1304
10157     implements(IMutableFileNode)
10158     counter = 0
10159     def __init__(self, initial_contents=""):
10160-        self.data = self._get_initial_contents(initial_contents)
10161+        data = self._get_initial_contents(initial_contents)
10162+        self.data = data.read(data.get_size())
10163+        self.data = "".join(self.data)
10164+
10165         counter = FakeMutableFile.counter
10166         FakeMutableFile.counter += 1
10167         writekey = hashutil.ssk_writekey_hash(str(counter))
10168hunk ./src/allmydata/test/test_dirnode.py 1354
10169         pass
10170 
10171     def modify(self, modifier):
10172-        self.data = modifier(self.data, None, True)
10173+        data = modifier(self.data, None, True)
10174+        self.data = data
10175         return defer.succeed(None)
10176 
10177 class FakeNodeMaker(NodeMaker):
10178hunk ./src/allmydata/test/test_dirnode.py 1359
10179-    def create_mutable_file(self, contents="", keysize=None):
10180+    def create_mutable_file(self, contents="", keysize=None, version=None):
10181         return defer.succeed(FakeMutableFile(contents))
10182 
10183 class FakeClient2(Client):
10184hunk ./src/allmydata/test/test_filenode.py 98
10185         def _check_segment(res):
10186             self.failUnlessEqual(res, DATA[1:1+5])
10187         d.addCallback(_check_segment)
10188+        d.addCallback(lambda ignored: fn1.get_best_readable_version())
10189+        d.addCallback(lambda fn2: self.failUnlessEqual(fn1, fn2))
10190+        d.addCallback(lambda ignored:
10191+            fn1.get_size_of_best_version())
10192+        d.addCallback(lambda size:
10193+            self.failUnlessEqual(size, len(DATA)))
10194+        d.addCallback(lambda ignored:
10195+            fn1.download_to_data())
10196+        d.addCallback(lambda data:
10197+            self.failUnlessEqual(data, DATA))
10198+        d.addCallback(lambda ignored:
10199+            fn1.download_best_version())
10200+        d.addCallback(lambda data:
10201+            self.failUnlessEqual(data, DATA))
10202 
10203         return d
10204 
10205hunk ./src/allmydata/test/test_hung_server.py 10
10206 from allmydata.util.consumer import download_to_data
10207 from allmydata.immutable import upload
10208 from allmydata.mutable.common import UnrecoverableFileError
10209+from allmydata.mutable.publish import MutableData
10210 from allmydata.storage.common import storage_index_to_dir
10211 from allmydata.test.no_network import GridTestMixin
10212 from allmydata.test.common import ShouldFailMixin
10213hunk ./src/allmydata/test/test_hung_server.py 110
10214         self.servers = self.servers[5:] + self.servers[:5]
10215 
10216         if mutable:
10217-            d = nm.create_mutable_file(mutable_plaintext)
10218+            uploadable = MutableData(mutable_plaintext)
10219+            d = nm.create_mutable_file(uploadable)
10220             def _uploaded_mutable(node):
10221                 self.uri = node.get_uri()
10222                 self.shares = self.find_uri_shares(self.uri)
10223hunk ./src/allmydata/test/test_immutable.py 263
10224         d.addCallback(_after_attempt)
10225         return d
10226 
10227+    def test_download_to_data(self):
10228+        d = self.n.download_to_data()
10229+        d.addCallback(lambda data:
10230+            self.failUnlessEqual(data, common.TEST_DATA))
10231+        return d
10232 
10233hunk ./src/allmydata/test/test_immutable.py 269
10234+
10235+    def test_download_best_version(self):
10236+        d = self.n.download_best_version()
10237+        d.addCallback(lambda data:
10238+            self.failUnlessEqual(data, common.TEST_DATA))
10239+        return d
10240+
10241+
10242+    def test_get_best_readable_version(self):
10243+        d = self.n.get_best_readable_version()
10244+        d.addCallback(lambda n2:
10245+            self.failUnlessEqual(n2, self.n))
10246+        return d
10247+
10248+    def test_get_size_of_best_version(self):
10249+        d = self.n.get_size_of_best_version()
10250+        d.addCallback(lambda size:
10251+            self.failUnlessEqual(size, len(common.TEST_DATA)))
10252+        return d
10253+
10254+
10255 # XXX extend these tests to show bad behavior of various kinds from servers:
10256 # raising exception from each remove_foo() method, for example
10257 
10258hunk ./src/allmydata/test/test_mutable.py 2
10259 
10260-import struct
10261+import os
10262 from cStringIO import StringIO
10263 from twisted.trial import unittest
10264 from twisted.internet import defer, reactor
10265hunk ./src/allmydata/test/test_mutable.py 8
10266 from allmydata import uri, client
10267 from allmydata.nodemaker import NodeMaker
10268-from allmydata.util import base32
10269+from allmydata.util import base32, consumer
10270 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
10271      ssk_pubkey_fingerprint_hash
10272hunk ./src/allmydata/test/test_mutable.py 11
10273+from allmydata.util.deferredutil import gatherResults
10274 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
10275hunk ./src/allmydata/test/test_mutable.py 13
10276-     NotEnoughSharesError
10277+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
10278 from allmydata.monitor import Monitor
10279 from allmydata.test.common import ShouldFailMixin
10280 from allmydata.test.no_network import GridTestMixin
10281hunk ./src/allmydata/test/test_mutable.py 27
10282      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
10283      NotEnoughServersError, CorruptShareError
10284 from allmydata.mutable.retrieve import Retrieve
10285-from allmydata.mutable.publish import Publish
10286+from allmydata.mutable.publish import Publish, MutableFileHandle, \
10287+                                      MutableData, \
10288+                                      DEFAULT_MAX_SEGMENT_SIZE
10289 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
10290hunk ./src/allmydata/test/test_mutable.py 31
10291-from allmydata.mutable.layout import unpack_header, unpack_share
10292+from allmydata.mutable.layout import unpack_header, MDMFSlotReadProxy
10293 from allmydata.mutable.repairer import MustForceRepairError
10294 
10295 import allmydata.test.common_util as testutil
10296hunk ./src/allmydata/test/test_mutable.py 100
10297         self.storage = storage
10298         self.queries = 0
10299     def callRemote(self, methname, *args, **kwargs):
10300+        self.queries += 1
10301         def _call():
10302             meth = getattr(self, methname)
10303             return meth(*args, **kwargs)
10304hunk ./src/allmydata/test/test_mutable.py 107
10305         d = fireEventually()
10306         d.addCallback(lambda res: _call())
10307         return d
10308+
10309     def callRemoteOnly(self, methname, *args, **kwargs):
10312         d = self.callRemote(methname, *args, **kwargs)
10313         d.addBoth(lambda ignore: None)
10314         pass
10315hunk ./src/allmydata/test/test_mutable.py 157
10316             chr(ord(original[byte_offset]) ^ 0x01) +
10317             original[byte_offset+1:])
10318 
10319+def add_two(original, byte_offset):
10320+    # Flipping a single bit isn't enough for the version number, since
10321+    # 1 is also valid; XORing with two maps both 0 and 1 to invalid values.
10322+    return (original[:byte_offset] +
10323+            chr(ord(original[byte_offset]) ^ 0x02) +
10324+            original[byte_offset+1:])
10325+
10326 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
10327     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
10328     # list of shnums to corrupt.
10329hunk ./src/allmydata/test/test_mutable.py 167
10330+    ds = []
10331     for peerid in s._peers:
10332         shares = s._peers[peerid]
10333         for shnum in shares:
10334hunk ./src/allmydata/test/test_mutable.py 175
10335                 and shnum not in shnums_to_corrupt):
10336                 continue
10337             data = shares[shnum]
10338-            (version,
10339-             seqnum,
10340-             root_hash,
10341-             IV,
10342-             k, N, segsize, datalen,
10343-             o) = unpack_header(data)
10344-            if isinstance(offset, tuple):
10345-                offset1, offset2 = offset
10346-            else:
10347-                offset1 = offset
10348-                offset2 = 0
10349-            if offset1 == "pubkey":
10350-                real_offset = 107
10351-            elif offset1 in o:
10352-                real_offset = o[offset1]
10353-            else:
10354-                real_offset = offset1
10355-            real_offset = int(real_offset) + offset2 + offset_offset
10356-            assert isinstance(real_offset, int), offset
10357-            shares[shnum] = flip_bit(data, real_offset)
10358-    return res
10359+            # We feed the reader all of the share data up front, so it
10360+            # never needs the rref or the storage index that we didn't
10361+            # provide. We use MDMFSlotReadProxy because it can parse
10362+            # both MDMF and SDMF shares.
10363+            reader = MDMFSlotReadProxy(None, None, shnum, data)
10364+            # We need to get the offsets for the next part.
10365+            d = reader.get_verinfo()
10366+            def _do_corruption(verinfo, data, shnum):
10367+                (seqnum,
10368+                 root_hash,
10369+                 IV,
10370+                 segsize,
10371+                 datalen,
10372+                 k, n, prefix, o) = verinfo
10373+                if isinstance(offset, tuple):
10374+                    offset1, offset2 = offset
10375+                else:
10376+                    offset1 = offset
10377+                    offset2 = 0
10378+                if offset1 == "pubkey" and IV:
10379+                    real_offset = 107
10380+                elif offset1 == "share_data" and not IV:
10381+                    real_offset = 107
10382+                elif offset1 in o:
10383+                    real_offset = o[offset1]
10384+                else:
10385+                    real_offset = offset1
10386+                real_offset = int(real_offset) + offset2 + offset_offset
10387+                assert isinstance(real_offset, int), offset
10388+                if offset1 == 0: # verbyte
10389+                    f = add_two
10390+                else:
10391+                    f = flip_bit
10392+                shares[shnum] = f(data, real_offset)
10393+            d.addCallback(_do_corruption, data, shnum)
10394+            ds.append(d)
10395+    dl = defer.DeferredList(ds)
10396+    dl.addCallback(lambda ignored: res)
10397+    return dl
10398 
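Typical invocations of corrupt(), matching its signature above (the call
sites here are illustrative, not quoted from the patch):

    # flip a bit inside each share's pubkey field
    d.addCallback(corrupt, self._storage, "pubkey")
    # clobber the verbyte (offset 0) so it is no longer a valid version
    d.addCallback(corrupt, self._storage, 0)
    # corrupt only share 0, five bytes into its encrypted private key
    d.addCallback(corrupt, self._storage, ("enc_privkey", 5),
                  shnums_to_corrupt=[0])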
10399 def make_storagebroker(s=None, num_peers=10):
10400     if not s:
10401hunk ./src/allmydata/test/test_mutable.py 256
10402             self.failUnlessEqual(len(shnums), 1)
10403         d.addCallback(_created)
10404         return d
10405+    test_create.timeout = 15
10406+
10407+
10408+    def test_create_mdmf(self):
10409+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
10410+        def _created(n):
10411+            self.failUnless(isinstance(n, MutableFileNode))
10412+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
10413+            sb = self.nodemaker.storage_broker
10414+            peer0 = sorted(sb.get_all_serverids())[0]
10415+            shnums = self._storage._peers[peer0].keys()
10416+            self.failUnlessEqual(len(shnums), 1)
10417+        d.addCallback(_created)
10418+        return d
10419+
10420 
10421     def test_serialize(self):
10422         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
10423hunk ./src/allmydata/test/test_mutable.py 301
10424             d.addCallback(lambda smap: smap.dump(StringIO()))
10425             d.addCallback(lambda sio:
10426                           self.failUnless("3-of-10" in sio.getvalue()))
10427-            d.addCallback(lambda res: n.overwrite("contents 1"))
10428+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
10429             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
10430             d.addCallback(lambda res: n.download_best_version())
10431             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10432hunk ./src/allmydata/test/test_mutable.py 308
10433             d.addCallback(lambda res: n.get_size_of_best_version())
10434             d.addCallback(lambda size:
10435                           self.failUnlessEqual(size, len("contents 1")))
10436-            d.addCallback(lambda res: n.overwrite("contents 2"))
10437+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10438             d.addCallback(lambda res: n.download_best_version())
10439             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10440             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10441hunk ./src/allmydata/test/test_mutable.py 312
10442-            d.addCallback(lambda smap: n.upload("contents 3", smap))
10443+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
10444             d.addCallback(lambda res: n.download_best_version())
10445             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
10446             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
10447hunk ./src/allmydata/test/test_mutable.py 324
10448             # mapupdate-to-retrieve data caching (i.e. make the shares larger
10449             # than the default readsize, which is 2000 bytes). A 15kB file
10450             # will have 5kB shares.
10451-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
10452+            d.addCallback(lambda res: n.overwrite(MutableData("large size file" * 1000)))
10453             d.addCallback(lambda res: n.download_best_version())
10454             d.addCallback(lambda res:
10455                           self.failUnlessEqual(res, "large size file" * 1000))
10456hunk ./src/allmydata/test/test_mutable.py 332
10457         d.addCallback(_created)
10458         return d
10459 
10460+
10461+    def test_upload_and_download_mdmf(self):
10462+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
10463+        def _created(n):
10464+            d = defer.succeed(None)
10465+            d.addCallback(lambda ignored:
10466+                n.get_servermap(MODE_READ))
10467+            def _then(servermap):
10468+                dumped = servermap.dump(StringIO())
10469+                self.failUnlessIn("3-of-10", dumped.getvalue())
10470+            d.addCallback(_then)
10471+            # Now overwrite the contents with some new contents. We want
10472+            # to make them big enough to force the file to be uploaded
10473+            # in more than one segment.
10474+            big_contents = "contents1" * 100000 # about 900 KiB
10475+            big_contents_uploadable = MutableData(big_contents)
10476+            d.addCallback(lambda ignored:
10477+                n.overwrite(big_contents_uploadable))
10478+            d.addCallback(lambda ignored:
10479+                n.download_best_version())
10480+            d.addCallback(lambda data:
10481+                self.failUnlessEqual(data, big_contents))
10482+            # Overwrite the contents again with some new contents. As
10483+            # before, they need to be big enough to force multiple
10484+            # segments, so that we make the downloader deal with
10485+            # multiple segments.
10486+            bigger_contents = "contents2" * 1000000 # about 9MiB
10487+            bigger_contents_uploadable = MutableData(bigger_contents)
10488+            d.addCallback(lambda ignored:
10489+                n.overwrite(bigger_contents_uploadable))
10490+            d.addCallback(lambda ignored:
10491+                n.download_best_version())
10492+            d.addCallback(lambda data:
10493+                self.failUnlessEqual(data, bigger_contents))
10494+            return d
10495+        d.addCallback(_created)
10496+        return d
10497+
10498+
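For the record, the arithmetic behind the multiple-segment claims above,
assuming the imported DEFAULT_MAX_SEGMENT_SIZE is 128 KiB (131072 bytes): the
900,000-byte upload spans ceil(900000/131072) = 7 segments, and the
9,000,000-byte one spans 69.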
10499+    def test_mdmf_write_count(self):
10500+        # Publishing an MDMF file should only cause one write for each
10501+        # share that is to be published. Otherwise, we introduce
10502+        # undesirable semantics that are a regression from SDMF
10503+        upload = MutableData("MDMF" * 100000) # about 400 KiB
10504+        d = self.nodemaker.create_mutable_file(upload,
10505+                                               version=MDMF_VERSION)
10506+        def _check_server_write_counts(ignored):
10507+            sb = self.nodemaker.storage_broker
10508+            peers = sb.test_servers.values()
10509+            for peer in peers:
10510+                self.failUnlessEqual(peer.queries, 1)
10511+        d.addCallback(_check_server_write_counts)
10512+        return d
10513+
10514+
10515     def test_create_with_initial_contents(self):
10516hunk ./src/allmydata/test/test_mutable.py 388
10517-        d = self.nodemaker.create_mutable_file("contents 1")
10518+        upload1 = MutableData("contents 1")
10519+        d = self.nodemaker.create_mutable_file(upload1)
10520         def _created(n):
10521             d = n.download_best_version()
10522             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10523hunk ./src/allmydata/test/test_mutable.py 393
10524-            d.addCallback(lambda res: n.overwrite("contents 2"))
10525+            upload2 = MutableData("contents 2")
10526+            d.addCallback(lambda res: n.overwrite(upload2))
10527             d.addCallback(lambda res: n.download_best_version())
10528             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10529             return d
10530hunk ./src/allmydata/test/test_mutable.py 400
10531         d.addCallback(_created)
10532         return d
10533+    test_create_with_initial_contents.timeout = 15
10534+
10535+
10536+    def test_create_mdmf_with_initial_contents(self):
10537+        initial_contents = "foobarbaz" * 131072 # about 1.1 MiB
10538+        initial_contents_uploadable = MutableData(initial_contents)
10539+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
10540+                                               version=MDMF_VERSION)
10541+        def _created(n):
10542+            d = n.download_best_version()
10543+            d.addCallback(lambda data:
10544+                self.failUnlessEqual(data, initial_contents))
10545+            uploadable2 = MutableData(initial_contents + "foobarbaz")
10546+            d.addCallback(lambda ignored:
10547+                n.overwrite(uploadable2))
10548+            d.addCallback(lambda ignored:
10549+                n.download_best_version())
10550+            d.addCallback(lambda data:
10551+                self.failUnlessEqual(data, initial_contents +
10552+                                           "foobarbaz"))
10553+            return d
10554+        d.addCallback(_created)
10555+        return d
10556+    test_create_mdmf_with_initial_contents.timeout = 20
10557+
10558 
10559     def test_response_cache_memory_leak(self):
10560         d = self.nodemaker.create_mutable_file("contents")
10561hunk ./src/allmydata/test/test_mutable.py 451
10562             key = n.get_writekey()
10563             self.failUnless(isinstance(key, str), key)
10564             self.failUnlessEqual(len(key), 16) # AES key size
10565-            return data
10566+            return MutableData(data)
10567         d = self.nodemaker.create_mutable_file(_make_contents)
10568         def _created(n):
10569             return n.download_best_version()
10570hunk ./src/allmydata/test/test_mutable.py 459
10571         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
10572         return d
10573 
10574+
10575+    def test_create_mdmf_with_initial_contents_function(self):
10576+        data = "initial contents" * 100000
10577+        def _make_contents(n):
10578+            self.failUnless(isinstance(n, MutableFileNode))
10579+            key = n.get_writekey()
10580+            self.failUnless(isinstance(key, str), key)
10581+            self.failUnlessEqual(len(key), 16)
10582+            return MutableData(data)
10583+        d = self.nodemaker.create_mutable_file(_make_contents,
10584+                                               version=MDMF_VERSION)
10585+        d.addCallback(lambda n:
10586+            n.download_best_version())
10587+        d.addCallback(lambda data2:
10588+            self.failUnlessEqual(data2, data))
10589+        return d
10590+
10591+
10592     def test_create_with_too_large_contents(self):
10593         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
10594hunk ./src/allmydata/test/test_mutable.py 479
10595-        d = self.nodemaker.create_mutable_file(BIG)
10596+        BIG_uploadable = MutableData(BIG)
10597+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
10598         def _created(n):
10599hunk ./src/allmydata/test/test_mutable.py 482
10600-            d = n.overwrite(BIG)
10601+            other_BIG_uploadable = MutableData(BIG)
10602+            d = n.overwrite(other_BIG_uploadable)
10603             return d
10604         d.addCallback(_created)
10605         return d
10606hunk ./src/allmydata/test/test_mutable.py 497
10607 
10608     def test_modify(self):
10609         def _modifier(old_contents, servermap, first_time):
10610-            return old_contents + "line2"
10611+            new_contents = old_contents + "line2"
10612+            return new_contents
10613         def _non_modifier(old_contents, servermap, first_time):
10614             return old_contents
10615         def _none_modifier(old_contents, servermap, first_time):
10616hunk ./src/allmydata/test/test_mutable.py 506
10617         def _error_modifier(old_contents, servermap, first_time):
10618             raise ValueError("oops")
10619         def _toobig_modifier(old_contents, servermap, first_time):
10620-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
10621+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
10622+            return new_content
10623         calls = []
10624         def _ucw_error_modifier(old_contents, servermap, first_time):
10625             # simulate an UncoordinatedWriteError once
10626hunk ./src/allmydata/test/test_mutable.py 514
10627             calls.append(1)
10628             if len(calls) <= 1:
10629                 raise UncoordinatedWriteError("simulated")
10630-            return old_contents + "line3"
10631+            new_contents = old_contents + "line3"
10632+            return new_contents
10633         def _ucw_error_non_modifier(old_contents, servermap, first_time):
10634             # simulate an UncoordinatedWriteError once, and don't actually
10635             # modify the contents on subsequent invocations
10636hunk ./src/allmydata/test/test_mutable.py 524
10637                 raise UncoordinatedWriteError("simulated")
10638             return old_contents
10639 
10640-        d = self.nodemaker.create_mutable_file("line1")
10641+        initial_contents = "line1"
10642+        d = self.nodemaker.create_mutable_file(MutableData(initial_contents))
10643         def _created(n):
10644             d = n.modify(_modifier)
10645             d.addCallback(lambda res: n.download_best_version())
10646hunk ./src/allmydata/test/test_mutable.py 582
10647             return d
10648         d.addCallback(_created)
10649         return d
10650+    test_modify.timeout = 15
10651+
10652 
10653     def test_modify_backoffer(self):
10654         def _modifier(old_contents, servermap, first_time):
10655hunk ./src/allmydata/test/test_mutable.py 609
10656         giveuper._delay = 0.1
10657         giveuper.factor = 1
10658 
10659-        d = self.nodemaker.create_mutable_file("line1")
10660+        d = self.nodemaker.create_mutable_file(MutableData("line1"))
10661         def _created(n):
10662             d = n.modify(_modifier)
10663             d.addCallback(lambda res: n.download_best_version())
10664hunk ./src/allmydata/test/test_mutable.py 659
10665             d.addCallback(lambda smap: smap.dump(StringIO()))
10666             d.addCallback(lambda sio:
10667                           self.failUnless("3-of-10" in sio.getvalue()))
10668-            d.addCallback(lambda res: n.overwrite("contents 1"))
10669+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
10670             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
10671             d.addCallback(lambda res: n.download_best_version())
10672             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10673hunk ./src/allmydata/test/test_mutable.py 663
10674-            d.addCallback(lambda res: n.overwrite("contents 2"))
10675+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10676             d.addCallback(lambda res: n.download_best_version())
10677             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10678             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10679hunk ./src/allmydata/test/test_mutable.py 667
10680-            d.addCallback(lambda smap: n.upload("contents 3", smap))
10681+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
10682             d.addCallback(lambda res: n.download_best_version())
10683             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
10684             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
10685hunk ./src/allmydata/test/test_mutable.py 680
10686         return d
10687 
10688 
10689-class MakeShares(unittest.TestCase):
10690-    def test_encrypt(self):
10691-        nm = make_nodemaker()
10692-        CONTENTS = "some initial contents"
10693-        d = nm.create_mutable_file(CONTENTS)
10694-        def _created(fn):
10695-            p = Publish(fn, nm.storage_broker, None)
10696-            p.salt = "SALT" * 4
10697-            p.readkey = "\x00" * 16
10698-            p.newdata = CONTENTS
10699-            p.required_shares = 3
10700-            p.total_shares = 10
10701-            p.setup_encoding_parameters()
10702-            return p._encrypt_and_encode()
10703+    def test_size_after_servermap_update(self):
10704+        # a mutable file node should have something to say about how big
10705+        # it is after a servermap update is performed, since this tells
10706+        # us how large the best version of that mutable file is.
10707+        d = self.nodemaker.create_mutable_file()
10708+        def _created(n):
10709+            self.n = n
10710+            return n.get_servermap(MODE_READ)
10711+        d.addCallback(_created)
10712+        d.addCallback(lambda ignored:
10713+            self.failUnlessEqual(self.n.get_size(), 0))
10714+        d.addCallback(lambda ignored:
10715+            self.n.overwrite(MutableData("foobarbaz")))
10716+        d.addCallback(lambda ignored:
10717+            self.failUnlessEqual(self.n.get_size(), 9))
10718+        d.addCallback(lambda ignored:
10719+            self.nodemaker.create_mutable_file(MutableData("foobarbaz")))
10720+        d.addCallback(_created)
10721+        d.addCallback(lambda ignored:
10722+            self.failUnlessEqual(self.n.get_size(), 9))
10723+        return d
10724+
10725+
10726+class PublishMixin:
10727+    def publish_one(self):
10728+        # publish a file and create shares, which can then be manipulated
10729+        # later.
10730+        self.CONTENTS = "New contents go here" * 1000
10731+        self.uploadable = MutableData(self.CONTENTS)
10732+        self._storage = FakeStorage()
10733+        self._nodemaker = make_nodemaker(self._storage)
10734+        self._storage_broker = self._nodemaker.storage_broker
10735+        d = self._nodemaker.create_mutable_file(self.uploadable)
10736+        def _created(node):
10737+            self._fn = node
10738+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10739         d.addCallback(_created)
10740hunk ./src/allmydata/test/test_mutable.py 717
10741-        def _done(shares_and_shareids):
10742-            (shares, share_ids) = shares_and_shareids
10743-            self.failUnlessEqual(len(shares), 10)
10744-            for sh in shares:
10745-                self.failUnless(isinstance(sh, str))
10746-                self.failUnlessEqual(len(sh), 7)
10747-            self.failUnlessEqual(len(share_ids), 10)
10748-        d.addCallback(_done)
10749         return d
10750 
10751hunk ./src/allmydata/test/test_mutable.py 719
10752-    def test_generate(self):
10753-        nm = make_nodemaker()
10754-        CONTENTS = "some initial contents"
10755-        d = nm.create_mutable_file(CONTENTS)
10756-        def _created(fn):
10757-            self._fn = fn
10758-            p = Publish(fn, nm.storage_broker, None)
10759-            self._p = p
10760-            p.newdata = CONTENTS
10761-            p.required_shares = 3
10762-            p.total_shares = 10
10763-            p.setup_encoding_parameters()
10764-            p._new_seqnum = 3
10765-            p.salt = "SALT" * 4
10766-            # make some fake shares
10767-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
10768-            p._privkey = fn.get_privkey()
10769-            p._encprivkey = fn.get_encprivkey()
10770-            p._pubkey = fn.get_pubkey()
10771-            return p._generate_shares(shares_and_ids)
10772+    def publish_mdmf(self):
10773+        # like publish_one, except that the result is guaranteed to be
10774+        # an MDMF file.
10775+        # self.CONTENTS should have more than one segment.
10776+        self.CONTENTS = "This is an MDMF file" * 100000
10777+        self.uploadable = MutableData(self.CONTENTS)
10778+        self._storage = FakeStorage()
10779+        self._nodemaker = make_nodemaker(self._storage)
10780+        self._storage_broker = self._nodemaker.storage_broker
10781+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
10782+        def _created(node):
10783+            self._fn = node
10784+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10785         d.addCallback(_created)
10786hunk ./src/allmydata/test/test_mutable.py 733
10787-        def _generated(res):
10788-            p = self._p
10789-            final_shares = p.shares
10790-            root_hash = p.root_hash
10791-            self.failUnlessEqual(len(root_hash), 32)
10792-            self.failUnless(isinstance(final_shares, dict))
10793-            self.failUnlessEqual(len(final_shares), 10)
10794-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
10795-            for i,sh in final_shares.items():
10796-                self.failUnless(isinstance(sh, str))
10797-                # feed the share through the unpacker as a sanity-check
10798-                pieces = unpack_share(sh)
10799-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
10800-                 pubkey, signature, share_hash_chain, block_hash_tree,
10801-                 share_data, enc_privkey) = pieces
10802-                self.failUnlessEqual(u_seqnum, 3)
10803-                self.failUnlessEqual(u_root_hash, root_hash)
10804-                self.failUnlessEqual(k, 3)
10805-                self.failUnlessEqual(N, 10)
10806-                self.failUnlessEqual(segsize, 21)
10807-                self.failUnlessEqual(datalen, len(CONTENTS))
10808-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
10809-                sig_material = struct.pack(">BQ32s16s BBQQ",
10810-                                           0, p._new_seqnum, root_hash, IV,
10811-                                           k, N, segsize, datalen)
10812-                self.failUnless(p._pubkey.verify(sig_material, signature))
10813-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
10814-                self.failUnless(isinstance(share_hash_chain, dict))
10815-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
10816-                for shnum,share_hash in share_hash_chain.items():
10817-                    self.failUnless(isinstance(shnum, int))
10818-                    self.failUnless(isinstance(share_hash, str))
10819-                    self.failUnlessEqual(len(share_hash), 32)
10820-                self.failUnless(isinstance(block_hash_tree, list))
10821-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
10822-                self.failUnlessEqual(IV, "SALT"*4)
10823-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
10824-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
10825-        d.addCallback(_generated)
10826         return d
10827 
10828hunk ./src/allmydata/test/test_mutable.py 735
10829-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
10830-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
10831-    # when we publish to zero peers, we should get a NotEnoughSharesError
10832 
10833hunk ./src/allmydata/test/test_mutable.py 736
10834-class PublishMixin:
10835-    def publish_one(self):
10836-        # publish a file and create shares, which can then be manipulated
10837-        # later.
10838-        self.CONTENTS = "New contents go here" * 1000
10839+    def publish_sdmf(self):
10840+        # like publish_one, except that the result is guaranteed to be
10841+        # an SDMF file
10842+        self.CONTENTS = "This is an SDMF file" * 1000
10843+        self.uploadable = MutableData(self.CONTENTS)
10844         self._storage = FakeStorage()
10845         self._nodemaker = make_nodemaker(self._storage)
10846         self._storage_broker = self._nodemaker.storage_broker
10847hunk ./src/allmydata/test/test_mutable.py 744
10848-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
10849+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
10850         def _created(node):
10851             self._fn = node
10852             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10853hunk ./src/allmydata/test/test_mutable.py 751
10854         d.addCallback(_created)
10855         return d
10856 
10857-    def publish_multiple(self):
10858+
10859+    def publish_multiple(self, version=0):
10860         self.CONTENTS = ["Contents 0",
10861                          "Contents 1",
10862                          "Contents 2",
10863hunk ./src/allmydata/test/test_mutable.py 758
10864                          "Contents 3a",
10865                          "Contents 3b"]
10866+        self.uploadables = [MutableData(d) for d in self.CONTENTS]
10867         self._copied_shares = {}
10868         self._storage = FakeStorage()
10869         self._nodemaker = make_nodemaker(self._storage)
10870hunk ./src/allmydata/test/test_mutable.py 762
10871-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
10872+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
10873         def _created(node):
10874             self._fn = node
10875             # now create multiple versions of the same file, and accumulate
10876hunk ./src/allmydata/test/test_mutable.py 769
10877             # their shares, so we can mix and match them later.
10878             d = defer.succeed(None)
10879             d.addCallback(self._copy_shares, 0)
10880-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
10881+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
10882             d.addCallback(self._copy_shares, 1)
10883hunk ./src/allmydata/test/test_mutable.py 771
10884-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
10885+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
10886             d.addCallback(self._copy_shares, 2)
10887hunk ./src/allmydata/test/test_mutable.py 773
10888-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
10889+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
10890             d.addCallback(self._copy_shares, 3)
10891             # now we replace all the shares with version s3, and upload a new
10892             # version to get s4b.
10893hunk ./src/allmydata/test/test_mutable.py 779
10894             rollback = dict([(i,2) for i in range(10)])
10895             d.addCallback(lambda res: self._set_versions(rollback))
10896-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
10897+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
10898             d.addCallback(self._copy_shares, 4)
10899             # we leave the storage in state 4
10900             return d
10901hunk ./src/allmydata/test/test_mutable.py 786
10902         d.addCallback(_created)
10903         return d
10904 
10905+
10906     def _copy_shares(self, ignored, index):
10907         shares = self._storage._peers
10908         # we need a deep copy
10909hunk ./src/allmydata/test/test_mutable.py 810
10910                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
10911 
10912 
10913+
10914+
10915 class Servermap(unittest.TestCase, PublishMixin):
10916     def setUp(self):
10917         return self.publish_one()
10918hunk ./src/allmydata/test/test_mutable.py 816
10919 
10920-    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None):
10921+    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None,
10922+                       update_range=None):
10923         if fn is None:
10924             fn = self._fn
10925         if sb is None:
10926hunk ./src/allmydata/test/test_mutable.py 823
10927             sb = self._storage_broker
10928         smu = ServermapUpdater(fn, sb, Monitor(),
10929-                               ServerMap(), mode)
10930+                               ServerMap(), mode, update_range=update_range)
10931         d = smu.update()
10932         return d
10933 
10934hunk ./src/allmydata/test/test_mutable.py 889
10935         # create a new file, which is large enough to knock the privkey out
10936         # of the early part of the file
10937         LARGE = "These are Larger contents" * 200 # about 5KB
10938-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
10939+        LARGE_uploadable = MutableData(LARGE)
10940+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
10941         def _created(large_fn):
10942             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
10943             return self.make_servermap(MODE_WRITE, large_fn2)
10944hunk ./src/allmydata/test/test_mutable.py 898
10945         d.addCallback(lambda sm: self.failUnlessOneRecoverable(sm, 10))
10946         return d
10947 
10948+
10949     def test_mark_bad(self):
10950         d = defer.succeed(None)
10951         ms = self.make_servermap
10952hunk ./src/allmydata/test/test_mutable.py 944
10953         self._storage._peers = {} # delete all shares
10954         ms = self.make_servermap
10955         d = defer.succeed(None)
10956 
10958         d.addCallback(lambda res: ms(mode=MODE_CHECK))
10959         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
10960 
10961hunk ./src/allmydata/test/test_mutable.py 996
10962         return d
10963 
10964 
10965+    def test_servermapupdater_finds_mdmf_files(self):
10966+        # setUp already published an MDMF file for us. We just need to
10967+        # make sure that when we run the ServermapUpdater, the file is
10968+        # reported to have one recoverable version.
10969+        d = defer.succeed(None)
10970+        d.addCallback(lambda ignored:
10971+            self.publish_mdmf())
10972+        d.addCallback(lambda ignored:
10973+            self.make_servermap(mode=MODE_CHECK))
10974+        # Calling make_servermap also updates the servermap in the mode
10975+        # that we specify, so we just need to see what it says.
10976+        def _check_servermap(sm):
10977+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
10978+        d.addCallback(_check_servermap)
10979+        return d
10980+
10981+
10982+    def test_fetch_update(self):
10983+        d = defer.succeed(None)
10984+        d.addCallback(lambda ignored:
10985+            self.publish_mdmf())
10986+        d.addCallback(lambda ignored:
10987+            self.make_servermap(mode=MODE_WRITE, update_range=(1, 2)))
10988+        def _check_servermap(sm):
10989+            # 10 shares
10990+            self.failUnlessEqual(len(sm.update_data), 10)
10991+            # one version
10992+            for data in sm.update_data.itervalues():
10993+                self.failUnlessEqual(len(data), 1)
10994+        d.addCallback(_check_servermap)
10995+        return d
10996+
10997+
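    # Editor's note (assumption): update_range=(start, end) asks the
    # mapupdate to also fetch whatever a later in-place update of that byte
    # range will need, cached on the map as servermap.update_data. The
    # assertions above imply its shape:
    #
    #   sm.update_data           # one entry per share number (10 here)
    #   sm.update_data[shnum]    # one entry per recoverable version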
10998+    def test_servermapupdater_finds_sdmf_files(self):
10999+        d = defer.succeed(None)
11000+        d.addCallback(lambda ignored:
11001+            self.publish_sdmf())
11002+        d.addCallback(lambda ignored:
11003+            self.make_servermap(mode=MODE_CHECK))
11004+        d.addCallback(lambda servermap:
11005+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
11006+        return d
11007+
11008 
11009 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
11010     def setUp(self):
11011hunk ./src/allmydata/test/test_mutable.py 1079
11012         if version is None:
11013             version = servermap.best_recoverable_version()
11014         r = Retrieve(self._fn, servermap, version)
11015-        return r.download()
11016+        c = consumer.MemoryConsumer()
11017+        d = r.download(consumer=c)
11018+        d.addCallback(lambda mc: "".join(mc.chunks))
11019+        return d
11020+
11021 
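    # Editor's note: with the MDMF changes, Retrieve.download() streams the
    # plaintext into an IConsumer instead of returning it directly; the
    # tests collect it with the pattern above, reused throughout the patch:
    #
    #   c = consumer.MemoryConsumer()
    #   d = r.download(consumer=c)
    #   d.addCallback(lambda mc: "".join(mc.chunks))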
11022     def test_basic(self):
11023         d = self.make_servermap()
11024hunk ./src/allmydata/test/test_mutable.py 1160
11025         return d
11026     test_no_servers_download.timeout = 15
11027 
11028+
11029     def _test_corrupt_all(self, offset, substring,
11030hunk ./src/allmydata/test/test_mutable.py 1162
11031-                          should_succeed=False, corrupt_early=True,
11032-                          failure_checker=None):
11033+                          should_succeed=False,
11034+                          corrupt_early=True,
11035+                          failure_checker=None,
11036+                          fetch_privkey=False):
11037         d = defer.succeed(None)
11038         if corrupt_early:
11039             d.addCallback(corrupt, self._storage, offset)
11040hunk ./src/allmydata/test/test_mutable.py 1182
11041                     self.failUnlessIn(substring, "".join(allproblems))
11042                 return servermap
11043             if should_succeed:
11044-                d1 = self._fn.download_version(servermap, ver)
11045+                d1 = self._fn.download_version(servermap, ver,
11046+                                               fetch_privkey)
11047                 d1.addCallback(lambda new_contents:
11048                                self.failUnlessEqual(new_contents, self.CONTENTS))
11049             else:
11050hunk ./src/allmydata/test/test_mutable.py 1190
11051                 d1 = self.shouldFail(NotEnoughSharesError,
11052                                      "_corrupt_all(offset=%s)" % (offset,),
11053                                      substring,
11054-                                     self._fn.download_version, servermap, ver)
11055+                                     self._fn.download_version, servermap,
11056+                                                                ver,
11057+                                                                fetch_privkey)
11058             if failure_checker:
11059                 d1.addCallback(failure_checker)
11060             d1.addCallback(lambda res: servermap)
11061hunk ./src/allmydata/test/test_mutable.py 1201
11062         return d
11063 
11064     def test_corrupt_all_verbyte(self):
11065-        # when the version byte is not 0, we hit an UnknownVersionError error
11066-        # in unpack_share().
11067+        # when the version byte is not 0 or 1, we hit an UnknownVersionError
11068+        # error in unpack_share().
11069         d = self._test_corrupt_all(0, "UnknownVersionError")
11070         def _check_servermap(servermap):
11071             # and the dump should mention the problems
11072hunk ./src/allmydata/test/test_mutable.py 1208
11073             s = StringIO()
11074             dump = servermap.dump(s).getvalue()
11075-            self.failUnless("10 PROBLEMS" in dump, dump)
11076+            self.failUnless("30 PROBLEMS" in dump, dump)
11077         d.addCallback(_check_servermap)
11078         return d
11079 
11080hunk ./src/allmydata/test/test_mutable.py 1278
11081         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
11082 
11083 
11084+    def test_corrupt_all_encprivkey_late(self):
11085+        # this should work for the same reason as above, but we corrupt
11086+        # after the servermap update to exercise the error handling
11087+        # code.
11088+        # We need to remove the privkey from the node, or the retrieve
11089+        # process won't know to update it.
11090+        self._fn._privkey = None
11091+        return self._test_corrupt_all("enc_privkey",
11092+                                      None, # this shouldn't fail
11093+                                      should_succeed=True,
11094+                                      corrupt_early=False,
11095+                                      fetch_privkey=True)
11096+
11097+
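    # Editor's note (assumption): the new fetch_privkey argument threads
    # through download_version() into Retrieve, telling it to fetch and
    # validate the encrypted private key during the download when the node
    # does not already hold it:
    #
    #   self._fn._privkey = None   # a node that has lost its privkey
    #   d = self._fn.download_version(servermap, ver, fetch_privkey=True)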
11098     def test_corrupt_all_seqnum_late(self):
11099         # corrupting the seqnum between mapupdate and retrieve should result
11100         # in NotEnoughSharesError, since each share will look invalid
11101hunk ./src/allmydata/test/test_mutable.py 1298
11102         def _check(res):
11103             f = res[0]
11104             self.failUnless(f.check(NotEnoughSharesError))
11105-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
11106+            self.failUnless("uncoordinated write" in str(f))
11107         return self._test_corrupt_all(1, "ran out of peers",
11108                                       corrupt_early=False,
11109                                       failure_checker=_check)
11110hunk ./src/allmydata/test/test_mutable.py 1342
11111                             in str(servermap.problems[0]))
11112             ver = servermap.best_recoverable_version()
11113             r = Retrieve(self._fn, servermap, ver)
11114-            return r.download()
11115+            c = consumer.MemoryConsumer()
11116+            return r.download(c)
11117         d.addCallback(_do_retrieve)
11118hunk ./src/allmydata/test/test_mutable.py 1345
11119+        d.addCallback(lambda mc: "".join(mc.chunks))
11120         d.addCallback(lambda new_contents:
11121                       self.failUnlessEqual(new_contents, self.CONTENTS))
11122         return d
11123hunk ./src/allmydata/test/test_mutable.py 1350
11124 
11125-    def test_corrupt_some(self):
11126-        # corrupt the data of first five shares (so the servermap thinks
11127-        # they're good but retrieve marks them as bad), so that the
11128-        # MODE_READ set of 6 will be insufficient, forcing node.download to
11129-        # retry with more servers.
11130-        corrupt(None, self._storage, "share_data", range(5))
11131-        d = self.make_servermap()
11132+
11133+    def _test_corrupt_some(self, offset, mdmf=False):
11134+        if mdmf:
11135+            d = self.publish_mdmf()
11136+        else:
11137+            d = defer.succeed(None)
11138+        d.addCallback(lambda ignored:
11139+            corrupt(None, self._storage, offset, range(5)))
11140+        d.addCallback(lambda ignored:
11141+            self.make_servermap())
11142         def _do_retrieve(servermap):
11143             ver = servermap.best_recoverable_version()
11144             self.failUnless(ver)
11145hunk ./src/allmydata/test/test_mutable.py 1366
11146             return self._fn.download_best_version()
11147         d.addCallback(_do_retrieve)
11148         d.addCallback(lambda new_contents:
11149-                      self.failUnlessEqual(new_contents, self.CONTENTS))
11150+            self.failUnlessEqual(new_contents, self.CONTENTS))
11151         return d
11152 
11153hunk ./src/allmydata/test/test_mutable.py 1369
11154+
11155+    def test_corrupt_some(self):
11156+        # corrupt the data of first five shares (so the servermap thinks
11157+        # they're good but retrieve marks them as bad), so that the
11158+        # MODE_READ set of 6 will be insufficient, forcing node.download to
11159+        # retry with more servers.
11160+        return self._test_corrupt_some("share_data")
11161+
11162+
11163     def test_download_fails(self):
11164hunk ./src/allmydata/test/test_mutable.py 1379
11165-        corrupt(None, self._storage, "signature")
11166-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
11167+        d = corrupt(None, self._storage, "signature")
11168+        d.addCallback(lambda ignored:
11169+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
11170                             "no recoverable versions",
11171hunk ./src/allmydata/test/test_mutable.py 1383
11172-                            self._fn.download_best_version)
11173+                            self._fn.download_best_version))
11174         return d
11175 
11176 
11177hunk ./src/allmydata/test/test_mutable.py 1387
11178+
11179+    def test_corrupt_mdmf_block_hash_tree(self):
11180+        d = self.publish_mdmf()
11181+        d.addCallback(lambda ignored:
11182+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
11183+                                   "block hash tree failure",
11184+                                   corrupt_early=True,
11185+                                   should_succeed=False))
11186+        return d
11187+
11188+
11189+    def test_corrupt_mdmf_block_hash_tree_late(self):
11190+        d = self.publish_mdmf()
11191+        d.addCallback(lambda ignored:
11192+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
11193+                                   "block hash tree failure",
11194+                                   corrupt_early=False,
11195+                                   should_succeed=False))
11196+        return d
11197+
11198+
11199+    def test_corrupt_mdmf_share_data(self):
11200+        d = self.publish_mdmf()
11201+        d.addCallback(lambda ignored:
11202+            # TODO: Find out what the block size is and corrupt a
11203+            # specific block, rather than just guessing.
11204+            self._test_corrupt_all(("share_data", 12 * 40),
11205+                                    "block hash tree failure",
11206+                                    corrupt_early=True,
11207+                                    should_succeed=False))
11208+        return d
11209+
11210+
11211+    def test_corrupt_some_mdmf(self):
11212+        return self._test_corrupt_some(("share_data", 12 * 40),
11213+                                       mdmf=True)
11214+
11215+
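    # Editor's note (assumption): where the SDMF tests hand corrupt() a bare
    # field name, the MDMF tests pass a (field, byte_offset) tuple so the
    # damage lands at a known position inside that field -- e.g.
    # ("block_hash_tree", 12 * 32) starts 12 hash-lengths (32 bytes each)
    # into the share's block hash tree.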
11216 class CheckerMixin:
11217     def check_good(self, r, where):
11218         self.failUnless(r.is_healthy(), where)
11219hunk ./src/allmydata/test/test_mutable.py 1455
11220         d.addCallback(self.check_good, "test_check_good")
11221         return d
11222 
11223+    def test_check_mdmf_good(self):
11224+        d = self.publish_mdmf()
11225+        d.addCallback(lambda ignored:
11226+            self._fn.check(Monitor()))
11227+        d.addCallback(self.check_good, "test_check_mdmf_good")
11228+        return d
11229+
11230     def test_check_no_shares(self):
11231         for shares in self._storage._peers.values():
11232             shares.clear()
11233hunk ./src/allmydata/test/test_mutable.py 1469
11234         d.addCallback(self.check_bad, "test_check_no_shares")
11235         return d
11236 
11237+    def test_check_mdmf_no_shares(self):
11238+        d = self.publish_mdmf()
11239+        def _then(ignored):
11240+            for shares in self._storage._peers.values():
11241+                shares.clear()
11242+        d.addCallback(_then)
11243+        d.addCallback(lambda ignored:
11244+            self._fn.check(Monitor()))
11245+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
11246+        return d
11247+
11248     def test_check_not_enough_shares(self):
11249         for shares in self._storage._peers.values():
11250             for shnum in shares.keys():
11251hunk ./src/allmydata/test/test_mutable.py 1489
11252         d.addCallback(self.check_bad, "test_check_not_enough_shares")
11253         return d
11254 
11255+    def test_check_mdmf_not_enough_shares(self):
11256+        d = self.publish_mdmf()
11257+        def _then(ignored):
11258+            for shares in self._storage._peers.values():
11259+                for shnum in shares.keys():
11260+                    if shnum > 0:
11261+                        del shares[shnum]
11262+        d.addCallback(_then)
11263+        d.addCallback(lambda ignored:
11264+            self._fn.check(Monitor()))
11265+        d.addCallback(self.check_bad, "test_check_mdmf_not_enougH_shares")
11266+        return d
11267+
11268+
11269     def test_check_all_bad_sig(self):
11270hunk ./src/allmydata/test/test_mutable.py 1504
11271-        corrupt(None, self._storage, 1) # bad sig
11272-        d = self._fn.check(Monitor())
11273+        d = corrupt(None, self._storage, 1) # bad sig
11274+        d.addCallback(lambda ignored:
11275+            self._fn.check(Monitor()))
11276         d.addCallback(self.check_bad, "test_check_all_bad_sig")
11277         return d
11278 
11279hunk ./src/allmydata/test/test_mutable.py 1510
11280+    def test_check_mdmf_all_bad_sig(self):
11281+        d = self.publish_mdmf()
11282+        d.addCallback(lambda ignored:
11283+            corrupt(None, self._storage, 1))
11284+        d.addCallback(lambda ignored:
11285+            self._fn.check(Monitor()))
11286+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
11287+        return d
11288+
11289     def test_check_all_bad_blocks(self):
11290hunk ./src/allmydata/test/test_mutable.py 1520
11291-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
11292+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
11293         # the Checker won't notice this.. it doesn't look at actual data
11294hunk ./src/allmydata/test/test_mutable.py 1522
11295-        d = self._fn.check(Monitor())
11296+        d.addCallback(lambda ignored:
11297+            self._fn.check(Monitor()))
11298         d.addCallback(self.check_good, "test_check_all_bad_blocks")
11299         return d
11300 
11301hunk ./src/allmydata/test/test_mutable.py 1527
11302+
11303+    def test_check_mdmf_all_bad_blocks(self):
11304+        d = self.publish_mdmf()
11305+        d.addCallback(lambda ignored:
11306+            corrupt(None, self._storage, "share_data"))
11307+        d.addCallback(lambda ignored:
11308+            self._fn.check(Monitor()))
11309+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
11310+        return d
11311+
11312     def test_verify_good(self):
11313         d = self._fn.check(Monitor(), verify=True)
11314         d.addCallback(self.check_good, "test_verify_good")
11315hunk ./src/allmydata/test/test_mutable.py 1541
11316         return d
11317+    test_verify_good.timeout = 15
11318 
11319     def test_verify_all_bad_sig(self):
11320hunk ./src/allmydata/test/test_mutable.py 1544
11321-        corrupt(None, self._storage, 1) # bad sig
11322-        d = self._fn.check(Monitor(), verify=True)
11323+        d = corrupt(None, self._storage, 1) # bad sig
11324+        d.addCallback(lambda ignored:
11325+            self._fn.check(Monitor(), verify=True))
11326         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
11327         return d
11328 
11329hunk ./src/allmydata/test/test_mutable.py 1551
11330     def test_verify_one_bad_sig(self):
11331-        corrupt(None, self._storage, 1, [9]) # bad sig
11332-        d = self._fn.check(Monitor(), verify=True)
11333+        d = corrupt(None, self._storage, 1, [9]) # bad sig
11334+        d.addCallback(lambda ignored:
11335+            self._fn.check(Monitor(), verify=True))
11336         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
11337         return d
11338 
11339hunk ./src/allmydata/test/test_mutable.py 1558
11340     def test_verify_one_bad_block(self):
11341-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
11342+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
11343         # the Verifier *will* notice this, since it examines every byte
11344hunk ./src/allmydata/test/test_mutable.py 1560
11345-        d = self._fn.check(Monitor(), verify=True)
11346+        d.addCallback(lambda ignored:
11347+            self._fn.check(Monitor(), verify=True))
11348         d.addCallback(self.check_bad, "test_verify_one_bad_block")
11349         d.addCallback(self.check_expected_failure,
11350                       CorruptShareError, "block hash tree failure",
11351hunk ./src/allmydata/test/test_mutable.py 1569
11352         return d
11353 
11354     def test_verify_one_bad_sharehash(self):
11355-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
11356-        d = self._fn.check(Monitor(), verify=True)
11357+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
11358+        d.addCallback(lambda ignored:
11359+            self._fn.check(Monitor(), verify=True))
11360         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
11361         d.addCallback(self.check_expected_failure,
11362                       CorruptShareError, "corrupt hashes",
11363hunk ./src/allmydata/test/test_mutable.py 1579
11364         return d
11365 
11366     def test_verify_one_bad_encprivkey(self):
11367-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11368-        d = self._fn.check(Monitor(), verify=True)
11369+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11370+        d.addCallback(lambda ignored:
11371+            self._fn.check(Monitor(), verify=True))
11372         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
11373         d.addCallback(self.check_expected_failure,
11374                       CorruptShareError, "invalid privkey",
11375hunk ./src/allmydata/test/test_mutable.py 1589
11376         return d
11377 
11378     def test_verify_one_bad_encprivkey_uncheckable(self):
11379-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11380+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11381         readonly_fn = self._fn.get_readonly()
11382         # a read-only node has no way to validate the privkey
11383hunk ./src/allmydata/test/test_mutable.py 1592
11384-        d = readonly_fn.check(Monitor(), verify=True)
11385+        d.addCallback(lambda ignored:
11386+            readonly_fn.check(Monitor(), verify=True))
11387         d.addCallback(self.check_good,
11388                       "test_verify_one_bad_encprivkey_uncheckable")
11389         return d
11390hunk ./src/allmydata/test/test_mutable.py 1598
11391 
11392+
11393+    def test_verify_mdmf_good(self):
11394+        d = self.publish_mdmf()
11395+        d.addCallback(lambda ignored:
11396+            self._fn.check(Monitor(), verify=True))
11397+        d.addCallback(self.check_good, "test_verify_mdmf_good")
11398+        return d
11399+
11400+
11401+    def test_verify_mdmf_one_bad_block(self):
11402+        d = self.publish_mdmf()
11403+        d.addCallback(lambda ignored:
11404+            corrupt(None, self._storage, "share_data", [1]))
11405+        d.addCallback(lambda ignored:
11406+            self._fn.check(Monitor(), verify=True))
11407+        # We should find one bad block here
11408+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
11409+        d.addCallback(self.check_expected_failure,
11410+                      CorruptShareError, "block hash tree failure",
11411+                      "test_verify_mdmf_one_bad_block")
11412+        return d
11413+
11414+
11415+    def test_verify_mdmf_bad_encprivkey(self):
11416+        d = self.publish_mdmf()
11417+        d.addCallback(lambda ignored:
11418+            corrupt(None, self._storage, "enc_privkey", [1]))
11419+        d.addCallback(lambda ignored:
11420+            self._fn.check(Monitor(), verify=True))
11421+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
11422+        d.addCallback(self.check_expected_failure,
11423+                      CorruptShareError, "privkey",
11424+                      "test_verify_mdmf_bad_encprivkey")
11425+        return d
11426+
11427+
11428+    def test_verify_mdmf_bad_sig(self):
11429+        d = self.publish_mdmf()
11430+        d.addCallback(lambda ignored:
11431+            corrupt(None, self._storage, 1, [1]))
11432+        d.addCallback(lambda ignored:
11433+            self._fn.check(Monitor(), verify=True))
11434+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
11435+        return d
11436+
11437+
11438+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
11439+        d = self.publish_mdmf()
11440+        d.addCallback(lambda ignored:
11441+            corrupt(None, self._storage, "enc_privkey", [1]))
11442+        d.addCallback(lambda ignored:
11443+            self._fn.get_readonly())
11444+        d.addCallback(lambda fn:
11445+            fn.check(Monitor(), verify=True))
11446+        d.addCallback(self.check_good,
11447+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
11448+        return d
11449+
11450+
11451 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
11452 
11453     def get_shares(self, s):
11454hunk ./src/allmydata/test/test_mutable.py 1722
11455         current_shares = self.old_shares[-1]
11456         self.failUnlessEqual(old_shares, current_shares)
11457 
11458+
11459     def test_unrepairable_0shares(self):
11460         d = self.publish_one()
11461         def _delete_all_shares(ign):
11462hunk ./src/allmydata/test/test_mutable.py 1737
11463         d.addCallback(_check)
11464         return d
11465 
11466+    def test_mdmf_unrepairable_0shares(self):
11467+        d = self.publish_mdmf()
11468+        def _delete_all_shares(ign):
11469+            shares = self._storage._peers
11470+            for peerid in shares:
11471+                shares[peerid] = {}
11472+        d.addCallback(_delete_all_shares)
11473+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11474+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11475+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
11476+        return d
11477+
11478+
11479     def test_unrepairable_1share(self):
11480         d = self.publish_one()
11481         def _delete_all_shares(ign):
11482hunk ./src/allmydata/test/test_mutable.py 1766
11483         d.addCallback(_check)
11484         return d
11485 
11486+    def test_mdmf_unrepairable_1share(self):
11487+        d = self.publish_mdmf()
11488+        def _delete_all_shares(ign):
11489+            shares = self._storage._peers
11490+            for peerid in shares:
11491+                for shnum in list(shares[peerid]):
11492+                    if shnum > 0:
11493+                        del shares[peerid][shnum]
11494+        d.addCallback(_delete_all_shares)
11495+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11496+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11497+        def _check(crr):
11498+            self.failUnlessEqual(crr.get_successful(), False)
11499+        d.addCallback(_check)
11500+        return d
11501+
11502+    def test_repairable_5shares(self):
11503+        d = self.publish_sdmf()
11504+        def _delete_some_shares(ign):
11505+            shares = self._storage._peers
11506+            for peerid in shares:
11507+                for shnum in list(shares[peerid]):
11508+                    if shnum > 4:
11509+                        del shares[peerid][shnum]
11510+        d.addCallback(_delete_some_shares)
11511+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11512+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11513+        def _check(crr):
11514+            self.failUnlessEqual(crr.get_successful(), True)
11515+        d.addCallback(_check)
11516+        return d
11517+
11518+    def test_mdmf_repairable_5shares(self):
11519+        d = self.publish_mdmf()
11520+        def _delete_some_shares(ign):
11521+            shares = self._storage._peers
11522+            for peerid in shares:
11523+                for shnum in list(shares[peerid]):
11524+                    if shnum > 5:
11525+                        del shares[peerid][shnum]
11526+        d.addCallback(_delete_some_shares)
11527+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11528+        def _check(cr):
11529+            self.failIf(cr.is_healthy())
11530+            self.failUnless(cr.is_recoverable())
11531+            return cr
11532+        d.addCallback(_check)
11533+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11534+        def _check1(crr):
11535+            self.failUnlessEqual(crr.get_successful(), True)
11536+        d.addCallback(_check1)
11537+        return d
11538+
11539+
11540     def test_merge(self):
11541         self.old_shares = []
11542         d = self.publish_multiple()
11543hunk ./src/allmydata/test/test_mutable.py 1934
11544 class MultipleEncodings(unittest.TestCase):
11545     def setUp(self):
11546         self.CONTENTS = "New contents go here"
11547+        self.uploadable = MutableData(self.CONTENTS)
11548         self._storage = FakeStorage()
11549         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
11550         self._storage_broker = self._nodemaker.storage_broker
11551hunk ./src/allmydata/test/test_mutable.py 1938
11552-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
11553+        d = self._nodemaker.create_mutable_file(self.uploadable)
11554         def _created(node):
11555             self._fn = node
11556         d.addCallback(_created)
11557hunk ./src/allmydata/test/test_mutable.py 1944
11558         return d
11559 
11560-    def _encode(self, k, n, data):
11561+    def _encode(self, k, n, data, version=SDMF_VERSION):
11562         # encode 'data' into a peerid->shares dict.
11563 
11564         fn = self._fn
11565hunk ./src/allmydata/test/test_mutable.py 1960
11566         # and set the encoding parameters to something completely different
11567         fn2._required_shares = k
11568         fn2._total_shares = n
11569+        # Normally a servermap update would occur before a publish.
11570+        # Here, it doesn't, so we have to do it ourselves.
11571+        fn2.set_version(version)
11572 
11573         s = self._storage
11574         s._peers = {} # clear existing storage
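        # Editor's note (assumption): Publish picks the on-wire share format
        # from the filenode's version attribute, which a normal
        # get_servermap(MODE_WRITE) pass would have populated; because
        # _encode() skips the mapupdate, it seeds the format by hand:
        #
        #   fn2.set_version(version)   # SDMF_VERSION or MDMF_VERSION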
11575hunk ./src/allmydata/test/test_mutable.py 1967
11576         p2 = Publish(fn2, self._storage_broker, None)
11577-        d = p2.publish(data)
11578+        uploadable = MutableData(data)
11579+        d = p2.publish(uploadable)
11580         def _published(res):
11581             shares = s._peers
11582             s._peers = {}
11583hunk ./src/allmydata/test/test_mutable.py 2235
11584         self.basedir = "mutable/Problems/test_publish_surprise"
11585         self.set_up_grid()
11586         nm = self.g.clients[0].nodemaker
11587-        d = nm.create_mutable_file("contents 1")
11588+        d = nm.create_mutable_file(MutableData("contents 1"))
11589         def _created(n):
11590             d = defer.succeed(None)
11591             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
11592hunk ./src/allmydata/test/test_mutable.py 2245
11593             d.addCallback(_got_smap1)
11594             # then modify the file, leaving the old map untouched
11595             d.addCallback(lambda res: log.msg("starting winning write"))
11596-            d.addCallback(lambda res: n.overwrite("contents 2"))
11597+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11598             # now attempt to modify the file with the old servermap. This
11599             # will look just like an uncoordinated write, in which every
11600             # single share got updated between our mapupdate and our publish
11601hunk ./src/allmydata/test/test_mutable.py 2254
11602                           self.shouldFail(UncoordinatedWriteError,
11603                                           "test_publish_surprise", None,
11604                                           n.upload,
11605-                                          "contents 2a", self.old_map))
11606+                                          MutableData("contents 2a"), self.old_map))
11607             return d
11608         d.addCallback(_created)
11609         return d
11610hunk ./src/allmydata/test/test_mutable.py 2263
11611         self.basedir = "mutable/Problems/test_retrieve_surprise"
11612         self.set_up_grid()
11613         nm = self.g.clients[0].nodemaker
11614-        d = nm.create_mutable_file("contents 1")
11615+        d = nm.create_mutable_file(MutableData("contents 1"))
11616         def _created(n):
11617             d = defer.succeed(None)
11618             d.addCallback(lambda res: n.get_servermap(MODE_READ))
11619hunk ./src/allmydata/test/test_mutable.py 2273
11620             d.addCallback(_got_smap1)
11621             # then modify the file, leaving the old map untouched
11622             d.addCallback(lambda res: log.msg("starting winning write"))
11623-            d.addCallback(lambda res: n.overwrite("contents 2"))
11624+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11625             # now attempt to retrieve the old version with the old servermap.
11626             # This will look like someone has changed the file since we
11627             # updated the servermap.
11628hunk ./src/allmydata/test/test_mutable.py 2282
11629             d.addCallback(lambda res:
11630                           self.shouldFail(NotEnoughSharesError,
11631                                           "test_retrieve_surprise",
11632-                                          "ran out of peers: have 0 shares (k=3)",
11633+                                          "ran out of peers: have 0 of 1",
11634                                           n.download_version,
11635                                           self.old_map,
11636                                           self.old_map.best_recoverable_version(),
11637hunk ./src/allmydata/test/test_mutable.py 2291
11638         d.addCallback(_created)
11639         return d
11640 
11641+
11642     def test_unexpected_shares(self):
11643         # upload the file, take a servermap, shut down one of the servers,
11644         # upload it again (causing shares to appear on a new server), then
11645hunk ./src/allmydata/test/test_mutable.py 2301
11646         self.basedir = "mutable/Problems/test_unexpected_shares"
11647         self.set_up_grid()
11648         nm = self.g.clients[0].nodemaker
11649-        d = nm.create_mutable_file("contents 1")
11650+        d = nm.create_mutable_file(MutableData("contents 1"))
11651         def _created(n):
11652             d = defer.succeed(None)
11653             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
11654hunk ./src/allmydata/test/test_mutable.py 2313
11655                 self.g.remove_server(peer0)
11656                 # then modify the file, leaving the old map untouched
11657                 log.msg("starting winning write")
11658-                return n.overwrite("contents 2")
11659+                return n.overwrite(MutableData("contents 2"))
11660             d.addCallback(_got_smap1)
11661             # now attempt to modify the file with the old servermap. This
11662             # will look just like an uncoordinated write, in which every
11663hunk ./src/allmydata/test/test_mutable.py 2323
11664                           self.shouldFail(UncoordinatedWriteError,
11665                                           "test_surprise", None,
11666                                           n.upload,
11667-                                          "contents 2a", self.old_map))
11668+                                          MutableData("contents 2a"), self.old_map))
11669             return d
11670         d.addCallback(_created)
11671         return d
11672hunk ./src/allmydata/test/test_mutable.py 2327
11673+    test_unexpected_shares.timeout = 15
11674 
11675     def test_bad_server(self):
11676         # Break one server, then create the file: the initial publish should
11677hunk ./src/allmydata/test/test_mutable.py 2361
11678         d.addCallback(_break_peer0)
11679         # now "create" the file, using the pre-established key, and let the
11680         # initial publish finally happen
11681-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
11682+        d.addCallback(lambda res: nm.create_mutable_file(MutableData("contents 1")))
11683         # that ought to work
11684         def _got_node(n):
11685             d = n.download_best_version()
11686hunk ./src/allmydata/test/test_mutable.py 2370
11687             def _break_peer1(res):
11688                 self.g.break_server(self.server1.get_serverid())
11689             d.addCallback(_break_peer1)
11690-            d.addCallback(lambda res: n.overwrite("contents 2"))
11691+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11692             # that ought to work too
11693             d.addCallback(lambda res: n.download_best_version())
11694             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
11695hunk ./src/allmydata/test/test_mutable.py 2402
11696         peerids = [s.get_serverid() for s in sb.get_connected_servers()]
11697         self.g.break_server(peerids[0])
11698 
11699-        d = nm.create_mutable_file("contents 1")
11700+        d = nm.create_mutable_file(MutableData("contents 1"))
11701         def _created(n):
11702             d = n.download_best_version()
11703             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
11704hunk ./src/allmydata/test/test_mutable.py 2410
11705             def _break_second_server(res):
11706                 self.g.break_server(peerids[1])
11707             d.addCallback(_break_second_server)
11708-            d.addCallback(lambda res: n.overwrite("contents 2"))
11709+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11710             # that ought to work too
11711             d.addCallback(lambda res: n.download_best_version())
11712             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
11713hunk ./src/allmydata/test/test_mutable.py 2429
11714         d = self.shouldFail(NotEnoughServersError,
11715                             "test_publish_all_servers_bad",
11716                             "Ran out of non-bad servers",
11717-                            nm.create_mutable_file, "contents")
11718+                            nm.create_mutable_file, MutableData("contents"))
11719         return d
11720 
11721     def test_publish_no_servers(self):
11722hunk ./src/allmydata/test/test_mutable.py 2441
11723         d = self.shouldFail(NotEnoughServersError,
11724                             "test_publish_no_servers",
11725                             "Ran out of non-bad servers",
11726-                            nm.create_mutable_file, "contents")
11727+                            nm.create_mutable_file, MutableData("contents"))
11728         return d
11729     test_publish_no_servers.timeout = 30
11730 
11731hunk ./src/allmydata/test/test_mutable.py 2459
11732         # we need some contents that are large enough to push the privkey out
11733         # of the early part of the file
11734         LARGE = "These are Larger contents" * 2000 # about 50KB
11735-        d = nm.create_mutable_file(LARGE)
11736+        LARGE_uploadable = MutableData(LARGE)
11737+        d = nm.create_mutable_file(LARGE_uploadable)
11738         def _created(n):
11739             self.uri = n.get_uri()
11740             self.n2 = nm.create_from_cap(self.uri)
11741hunk ./src/allmydata/test/test_mutable.py 2495
11742         self.basedir = "mutable/Problems/test_privkey_query_missing"
11743         self.set_up_grid(num_servers=20)
11744         nm = self.g.clients[0].nodemaker
11745-        LARGE = "These are Larger contents" * 2000 # about 50KB
11746+        LARGE = "These are Larger contents" * 2000 # about 50KiB
11747+        LARGE_uploadable = MutableData(LARGE)
11748         nm._node_cache = DevNullDictionary() # disable the nodecache
11749 
11750hunk ./src/allmydata/test/test_mutable.py 2499
11751-        d = nm.create_mutable_file(LARGE)
11752+        d = nm.create_mutable_file(LARGE_uploadable)
11753         def _created(n):
11754             self.uri = n.get_uri()
11755             self.n2 = nm.create_from_cap(self.uri)
11756hunk ./src/allmydata/test/test_mutable.py 2509
11757         d.addCallback(_created)
11758         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
11759         return d
11760+
11761+
11762+    def test_block_and_hash_query_error(self):
11763+        # This tests for what happens when a query to a remote server
11764+        # fails in either the hash validation step or the block getting
11765+        # step (because of batching, this is the same actual query).
11766+        # We need to have the storage server persist up until the point
11767+        # that its prefix is validated, then suddenly die. This
11768+        # exercises some exception handling code in Retrieve.
11769+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
11770+        self.set_up_grid(num_servers=20)
11771+        nm = self.g.clients[0].nodemaker
11772+        CONTENTS = "contents" * 2000
11773+        CONTENTS_uploadable = MutableData(CONTENTS)
11774+        d = nm.create_mutable_file(CONTENTS_uploadable)
11775+        def _created(node):
11776+            self._node = node
11777+        d.addCallback(_created)
11778+        d.addCallback(lambda ignored:
11779+            self._node.get_servermap(MODE_READ))
11780+        def _then(servermap):
11781+            # we have our servermap. Now we set up the servers like the
11782+            # tests above -- the first one that gets a read call should
11783+            # start throwing errors, but only after returning its prefix
11784+            # for validation. Since we'll download without fetching the
11785+            # private key, the next query to the remote server will be
11786+            # for either a block and salt or for hashes, either of which
11787+            # will exercise the error handling code.
11788+            killer = FirstServerGetsKilled()
11789+            for (serverid, ss) in nm.storage_broker.get_all_servers():
11790+                ss.post_call_notifier = killer.notify
11791+            ver = servermap.best_recoverable_version()
11792+            assert ver
11793+            return self._node.download_version(servermap, ver)
11794+        d.addCallback(_then)
11795+        d.addCallback(lambda data:
11796+            self.failUnlessEqual(data, CONTENTS))
11797+        return d
11798+
11799+
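    # Editor's note: FirstServerGetsKilled is assumed to be a fixture
    # defined elsewhere in this module whose notify() hook, installed as
    # each fake server's post_call_notifier, lets one server answer its
    # first query and then start failing -- exactly the mid-download
    # failure the comment in _then() describes.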
11800+class FileHandle(unittest.TestCase):
11801+    def setUp(self):
11802+        self.test_data = "Test Data" * 50000
11803+        self.sio = StringIO(self.test_data)
11804+        self.uploadable = MutableFileHandle(self.sio)
11805+
11806+
11807+    def test_filehandle_read(self):
11808+        self.basedir = "mutable/FileHandle/test_filehandle_read"
11809+        chunk_size = 10
11810+        for i in xrange(0, len(self.test_data), chunk_size):
11811+            data = self.uploadable.read(chunk_size)
11812+            data = "".join(data)
11813+            start = i
11814+            end = i + chunk_size
11815+            self.failUnlessEqual(data, self.test_data[start:end])
11816+
11817+
11818+    def test_filehandle_get_size(self):
11819+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
11820+        actual_size = len(self.test_data)
11821+        size = self.uploadable.get_size()
11822+        self.failUnlessEqual(size, actual_size)
11823+
11824+
11825+    def test_filehandle_get_size_out_of_order(self):
11826+        # We should be able to call get_size whenever we want without
11827+        # disturbing the location of the seek pointer.
11828+        chunk_size = 100
11829+        data = self.uploadable.read(chunk_size)
11830+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
11831+
11832+        # Now get the size.
11833+        size = self.uploadable.get_size()
11834+        self.failUnlessEqual(size, len(self.test_data))
11835+
11836+        # Now get more data. We should be right where we left off.
11837+        more_data = self.uploadable.read(chunk_size)
11838+        start = chunk_size
11839+        end = chunk_size * 2
11840+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
11841+
11842+
11843+    def test_filehandle_file(self):
11844+        # Make sure that the MutableFileHandle works on a file as well
11845+        # as a StringIO object, since in some cases it will be asked to
11846+        # deal with files.
11847+        self.basedir = self.mktemp()
11848+        # trial's mktemp() only returns a path; it does not create the directory.
11849+        os.mkdir(self.basedir)
11850+        f_path = os.path.join(self.basedir, "test_file")
11851+        f = open(f_path, "w")
11852+        f.write(self.test_data)
11853+        f.close()
11854+        f = open(f_path, "r")
11855+
11856+        uploadable = MutableFileHandle(f)
11857+
11858+        data = uploadable.read(len(self.test_data))
11859+        self.failUnlessEqual("".join(data), self.test_data)
11860+        size = uploadable.get_size()
11861+        self.failUnlessEqual(size, len(self.test_data))
11862+
11863+
11864+    def test_close(self):
11865+        # Make sure that the MutableFileHandle closes its handle when
11866+        # told to do so.
11867+        self.uploadable.close()
11868+        self.failUnless(self.sio.closed)
11869+
11870+
11871+class DataHandle(unittest.TestCase):
11872+    def setUp(self):
11873+        self.test_data = "Test Data" * 50000
11874+        self.uploadable = MutableData(self.test_data)
11875+
11876+
11877+    def test_datahandle_read(self):
11878+        chunk_size = 10
11879+        for i in xrange(0, len(self.test_data), chunk_size):
11880+            data = self.uploadable.read(chunk_size)
11881+            data = "".join(data)
11882+            start = i
11883+            end = i + chunk_size
11884+            self.failUnlessEqual(data, self.test_data[start:end])
11885+
11886+
11887+    def test_datahandle_get_size(self):
11888+        actual_size = len(self.test_data)
11889+        size = self.uploadable.get_size()
11890+        self.failUnlessEqual(size, actual_size)
11891+
11892+
11893+    def test_datahandle_get_size_out_of_order(self):
11894+        # We should be able to call get_size whenever we want without
11895+        # disturbing the location of the seek pointer.
11896+        chunk_size = 100
11897+        data = self.uploadable.read(chunk_size)
11898+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
11899+
11900+        # Now get the size.
11901+        size = self.uploadable.get_size()
11902+        self.failUnlessEqual(size, len(self.test_data))
11903+
11904+        # Now get more data. We should be right where we left off.
11905+        more_data = self.uploadable.read(chunk_size)
11906+        start = chunk_size
11907+        end = chunk_size * 2
11908+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
11909+
11910+
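# Editor's note: taken together, the FileHandle and DataHandle tests above
# pin down the uploadable interface the new publish code is assumed to
# depend on:
#
#   uploadable.read(n)     # -> list of byte-strings; "".join() to merge
#   uploadable.get_size()  # -> total length, without moving the cursor
#   uploadable.close()     # -> closes the underlying handle, if any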
11911+class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
11912+              PublishMixin):
11913+    def setUp(self):
11914+        GridTestMixin.setUp(self)
11915+        self.basedir = self.mktemp()
11916+        self.set_up_grid()
11917+        self.c = self.g.clients[0]
11918+        self.nm = self.c.nodemaker
11919+        self.data = "test data" * 100000 # about 900 KiB; MDMF
11920+        self.small_data = "test data" * 10 # about 90 B; SDMF
11921+        return self.do_upload()
11922+
11923+
11924+    def do_upload(self):
11925+        d1 = self.nm.create_mutable_file(MutableData(self.data),
11926+                                         version=MDMF_VERSION)
11927+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
11928+        dl = gatherResults([d1, d2])
11929+        def _then((n1, n2)):
11930+            assert isinstance(n1, MutableFileNode)
11931+            assert isinstance(n2, MutableFileNode)
11932+
11933+            self.mdmf_node = n1
11934+            self.sdmf_node = n2
11935+        dl.addCallback(_then)
11936+        return dl
11937+
11938+
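    # Editor's note: create_mutable_file() is assumed to default to SDMF;
    # only the explicit version=MDMF_VERSION in do_upload() makes n1 an
    # MDMF file. The size comments in setUp() describe each fixture's
    # intent rather than a size-based format switch.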
11939+    def test_get_readonly_mutable_version(self):
11940+        # Attempting to get a mutable version of a mutable file from a
11941+        # filenode initialized with a readcap should return a readonly
11942+        # version of that same node.
11943+        ro = self.mdmf_node.get_readonly()
11944+        d = ro.get_best_mutable_version()
11945+        d.addCallback(lambda version:
11946+            self.failUnless(version.is_readonly()))
11947+        d.addCallback(lambda ignored:
11948+            self.sdmf_node.get_readonly())
11949+        d.addCallback(lambda version:
11950+            self.failUnless(version.is_readonly()))
11951+        return d
11952+
11953+
11954+    def test_get_sequence_number(self):
11955+        d = self.mdmf_node.get_best_readable_version()
11956+        d.addCallback(lambda bv:
11957+            self.failUnlessEqual(bv.get_sequence_number(), 1))
11958+        d.addCallback(lambda ignored:
11959+            self.sdmf_node.get_best_readable_version())
11960+        d.addCallback(lambda bv:
11961+            self.failUnlessEqual(bv.get_sequence_number(), 1))
11962+        # Now update. After the update, the sequence number in both
11963+        # cases should be 2.
11964+        def _do_update(ignored):
11965+            new_data = MutableData("foo bar baz" * 100000)
11966+            new_small_data = MutableData("foo bar baz" * 10)
11967+            d1 = self.mdmf_node.overwrite(new_data)
11968+            d2 = self.sdmf_node.overwrite(new_small_data)
11969+            dl = gatherResults([d1, d2])
11970+            return dl
11971+        d.addCallback(_do_update)
11972+        d.addCallback(lambda ignored:
11973+            self.mdmf_node.get_best_readable_version())
11974+        d.addCallback(lambda bv:
11975+            self.failUnlessEqual(bv.get_sequence_number(), 2))
11976+        d.addCallback(lambda ignored:
11977+            self.sdmf_node.get_best_readable_version())
11978+        d.addCallback(lambda bv:
11979+            self.failUnlessEqual(bv.get_sequence_number(), 2))
11980+        return d
11981+
11982+
11983+    def test_get_writekey(self):
11984+        d = self.mdmf_node.get_best_mutable_version()
11985+        d.addCallback(lambda bv:
11986+            self.failUnlessEqual(bv.get_writekey(),
11987+                                 self.mdmf_node.get_writekey()))
11988+        d.addCallback(lambda ignored:
11989+            self.sdmf_node.get_best_mutable_version())
11990+        d.addCallback(lambda bv:
11991+            self.failUnlessEqual(bv.get_writekey(),
11992+                                 self.sdmf_node.get_writekey()))
11993+        return d
11994+
11995+
11996+    def test_get_storage_index(self):
11997+        d = self.mdmf_node.get_best_mutable_version()
11998+        d.addCallback(lambda bv:
11999+            self.failUnlessEqual(bv.get_storage_index(),
12000+                                 self.mdmf_node.get_storage_index()))
12001+        d.addCallback(lambda ignored:
12002+            self.sdmf_node.get_best_mutable_version())
12003+        d.addCallback(lambda bv:
12004+            self.failUnlessEqual(bv.get_storage_index(),
12005+                                 self.sdmf_node.get_storage_index()))
12006+        return d
12007+
12008+
12009+    def test_get_readonly_version(self):
12010+        d = self.mdmf_node.get_best_readable_version()
12011+        d.addCallback(lambda bv:
12012+            self.failUnless(bv.is_readonly()))
12013+        d.addCallback(lambda ignored:
12014+            self.sdmf_node.get_best_readable_version())
12015+        d.addCallback(lambda bv:
12016+            self.failUnless(bv.is_readonly()))
12017+        return d
12018+
12019+
12020+    def test_get_mutable_version(self):
12021+        d = self.mdmf_node.get_best_mutable_version()
12022+        d.addCallback(lambda bv:
12023+            self.failIf(bv.is_readonly()))
12024+        d.addCallback(lambda ignored:
12025+            self.sdmf_node.get_best_mutable_version())
12026+        d.addCallback(lambda bv:
12027+            self.failIf(bv.is_readonly()))
12028+        return d
12029+
12030+
12031+    def test_toplevel_overwrite(self):
12032+        new_data = MutableData("foo bar baz" * 100000)
12033+        new_small_data = MutableData("foo bar baz" * 10)
12034+        d = self.mdmf_node.overwrite(new_data)
12035+        d.addCallback(lambda ignored:
12036+            self.mdmf_node.download_best_version())
12037+        d.addCallback(lambda data:
12038+            self.failUnlessEqual(data, "foo bar baz" * 100000))
12039+        d.addCallback(lambda ignored:
12040+            self.sdmf_node.overwrite(new_small_data))
12041+        d.addCallback(lambda ignored:
12042+            self.sdmf_node.download_best_version())
12043+        d.addCallback(lambda data:
12044+            self.failUnlessEqual(data, "foo bar baz" * 10))
12045+        return d
12046+
12047+
12048+    def test_toplevel_modify(self):
12049+        def modifier(old_contents, servermap, first_time):
12050+            return old_contents + "modified"
12051+        d = self.mdmf_node.modify(modifier)
12052+        d.addCallback(lambda ignored:
12053+            self.mdmf_node.download_best_version())
12054+        d.addCallback(lambda data:
12055+            self.failUnlessIn("modified", data))
12056+        d.addCallback(lambda ignored:
12057+            self.sdmf_node.modify(modifier))
12058+        d.addCallback(lambda ignored:
12059+            self.sdmf_node.download_best_version())
12060+        d.addCallback(lambda data:
12061+            self.failUnlessIn("modified", data))
12062+        return d
12063+
12064+
12065+    def test_version_modify(self):
12066+        # TODO: When we can publish multiple versions, alter this test
12067+        # to modify a version other than the best usable version, then
12068+        # test to see that the best recoverable version is that.
12069+        def modifier(old_contents, servermap, first_time):
12070+            return old_contents + "modified"
12071+        d = self.mdmf_node.modify(modifier)
12072+        d.addCallback(lambda ignored:
12073+            self.mdmf_node.download_best_version())
12074+        d.addCallback(lambda data:
12075+            self.failUnlessIn("modified", data))
12076+        d.addCallback(lambda ignored:
12077+            self.sdmf_node.modify(modifier))
12078+        d.addCallback(lambda ignored:
12079+            self.sdmf_node.download_best_version())
12080+        d.addCallback(lambda data:
12081+            self.failUnlessIn("modified", data))
12082+        return d
12083+
12084+
12085+    def test_download_version(self):
12086+        d = self.publish_multiple()
12087+        # We want to have two recoverable versions on the grid.
12088+        d.addCallback(lambda res:
12089+                      self._set_versions({0:0,2:0,4:0,6:0,8:0,
12090+                                          1:1,3:1,5:1,7:1,9:1}))
12091+        # Now try to download each version. We should get the plaintext
12092+        # associated with that version.
12093+        d.addCallback(lambda ignored:
12094+            self._fn.get_servermap(mode=MODE_READ))
12095+        def _got_servermap(smap):
12096+            versions = smap.recoverable_versions()
12097+            assert len(versions) == 2
12098+
12099+            self.servermap = smap
12100+            self.version1, self.version2 = versions
12101+            assert self.version1 != self.version2
12102+
12103+            self.version1_seqnum = self.version1[0]
12104+            self.version2_seqnum = self.version2[0]
12105+            self.version1_index = self.version1_seqnum - 1
12106+            self.version2_index = self.version2_seqnum - 1
12107+
12108+        d.addCallback(_got_servermap)
12109+        d.addCallback(lambda ignored:
12110+            self._fn.download_version(self.servermap, self.version1))
12111+        d.addCallback(lambda results:
12112+            self.failUnlessEqual(self.CONTENTS[self.version1_index],
12113+                                 results))
12114+        d.addCallback(lambda ignored:
12115+            self._fn.download_version(self.servermap, self.version2))
12116+        d.addCallback(lambda results:
12117+            self.failUnlessEqual(self.CONTENTS[self.version2_index],
12118+                                 results))
12119+        return d
12120+
12121+
12122+    def test_download_nonexistent_version(self):
12123+        d = self.mdmf_node.get_servermap(mode=MODE_WRITE)
12124+        def _set_servermap(servermap):
12125+            self.servermap = servermap
12126+        d.addCallback(_set_servermap)
12127+        d.addCallback(lambda ignored:
12128+            self.shouldFail(UnrecoverableFileError, "nonexistent version",
12129+                           None,
12130+                           self.mdmf_node.download_version, self.servermap,
12131+                           "not a version"))
12132+        return d
12133+
12134+
12135+    def test_partial_read(self):
12136+        # read only a few bytes at a time, and see that the results are
12137+        # what we expect.
12138+        d = self.mdmf_node.get_best_readable_version()
12139+        def _read_data(version):
12140+            c = consumer.MemoryConsumer()
12141+            d2 = defer.succeed(None)
12142+            for i in xrange(0, len(self.data), 10000):
12143+                d2.addCallback(lambda ignored, i=i: version.read(c, i, 10000))
12144+            d2.addCallback(lambda ignored:
12145+                self.failUnlessEqual(self.data, "".join(c.chunks)))
12146+            return d2
12147+        d.addCallback(_read_data)
12148+        return d
12149+
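A condensed sketch of the streaming-read API exercised above (and by
test_read below): version.read(consumer, offset, size) delivers the
requested byte range to an IConsumer, and with no arguments it delivers
the whole file:

    c = consumer.MemoryConsumer()
    d = version.read(c, 0, 10000)                 # first 10000 bytes
    d.addCallback(lambda ign: "".join(c.chunks))  # bytes delivered so far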
12150+
12151+    def test_read(self):
12152+        d = self.mdmf_node.get_best_readable_version()
12153+        def _read_data(version):
12154+            c = consumer.MemoryConsumer()
12155+            d2 = defer.succeed(None)
12156+            d2.addCallback(lambda ignored: version.read(c))
12157+            d2.addCallback(lambda ignored:
12158+                self.failUnlessEqual("".join(c.chunks), self.data))
12159+            return d2
12160+        d.addCallback(_read_data)
12161+        return d
12162+
12163+
12164+    def test_download_best_version(self):
12165+        d = self.mdmf_node.download_best_version()
12166+        d.addCallback(lambda data:
12167+            self.failUnlessEqual(data, self.data))
12168+        d.addCallback(lambda ignored:
12169+            self.sdmf_node.download_best_version())
12170+        d.addCallback(lambda data:
12171+            self.failUnlessEqual(data, self.small_data))
12172+        return d
12173+
12174+
12175+class Update(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin):
12176+    def setUp(self):
12177+        GridTestMixin.setUp(self)
12178+        self.basedir = self.mktemp()
12179+        self.set_up_grid()
12180+        self.c = self.g.clients[0]
12181+        self.nm = self.c.nodemaker
12182+        self.data = "test data" * 100000 # about 900 KiB; MDMF
12183+        self.small_data = "test data" * 10 # about 90 B; SDMF
12184+        return self.do_upload()
12185+
12186+
12187+    def do_upload(self):
12188+        d1 = self.nm.create_mutable_file(MutableData(self.data),
12189+                                         version=MDMF_VERSION)
12190+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
12191+        dl = gatherResults([d1, d2])
12192+        def _then((n1, n2)):
12193+            assert isinstance(n1, MutableFileNode)
12194+            assert isinstance(n2, MutableFileNode)
12195+
12196+            self.mdmf_node = n1
12197+            self.sdmf_node = n2
12198+        dl.addCallback(_then)
12199+        return dl
12200+
12201+
12202+    def test_append(self):
12203+        # We should be able to append data to the end of a mutable
12204+        # file and get what we expect.
12205+        new_data = self.data + "appended"
12206+        d = self.mdmf_node.get_best_mutable_version()
12207+        d.addCallback(lambda mv:
12208+            mv.update(MutableData("appended"), len(self.data)))
12209+        d.addCallback(lambda ignored:
12210+            self.mdmf_node.download_best_version())
12211+        d.addCallback(lambda results:
12212+            self.failUnlessEqual(results, new_data))
12213+        return d
12214+    test_append.timeout = 15
12215+
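These Update tests all follow the same shape; as a minimal sketch, the
version-update API they exercise looks like this ("node" stands for
self.mdmf_node or self.sdmf_node, and "offset" is a byte position --
passing offset == len(current contents) appends):

    d = node.get_best_mutable_version()
    d.addCallback(lambda mv: mv.update(MutableData("new bytes"), offset))
    d.addCallback(lambda ign: node.download_best_version())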
12216+
12217+    def test_replace(self):
12218+        # We should be able to replace data in the middle of a mutable
12219+        # file and get what we expect back.
12220+        new_data = self.data[:100]
12221+        new_data += "appended"
12222+        new_data += self.data[108:]
12223+        d = self.mdmf_node.get_best_mutable_version()
12224+        d.addCallback(lambda mv:
12225+            mv.update(MutableData("appended"), 100))
12226+        d.addCallback(lambda ignored:
12227+            self.mdmf_node.download_best_version())
12228+        d.addCallback(lambda results:
12229+            self.failUnlessEqual(results, new_data))
12230+        return d
12231+
12232+
12233+    def test_replace_and_extend(self):
12234+        # We should be able to replace data in the middle of a mutable
12235+        # file and extend that mutable file and get what we expect.
12236+        new_data = self.data[:100]
12237+        new_data += "modified " * 100000
12238+        d = self.mdmf_node.get_best_mutable_version()
12239+        d.addCallback(lambda mv:
12240+            mv.update(MutableData("modified " * 100000), 100))
12241+        d.addCallback(lambda ignored:
12242+            self.mdmf_node.download_best_version())
12243+        d.addCallback(lambda results:
12244+            self.failUnlessEqual(results, new_data))
12245+        return d
12246+
12247+
12248+    def test_append_power_of_two(self):
12249+        # If we attempt to extend a mutable file so that its segment
12250+        # count crosses a power-of-two boundary, the update operation
12251+        # should know how to reencode the file.
12252+
12253+        # Note that the data populating self.mdmf_node is about 900 KiB
12254+        # long -- this is 7 segments in the default segment size. So we
12255+        # need to add 2 segments worth of data to push it over a
12256+        # power-of-two boundary.
12257+        segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
12258+        new_data = self.data + (segment * 2)
12259+        d = self.mdmf_node.get_best_mutable_version()
12260+        d.addCallback(lambda mv:
12261+            mv.update(MutableData(segment * 2), len(self.data)))
12262+        d.addCallback(lambda ignored:
12263+            self.mdmf_node.download_best_version())
12264+        d.addCallback(lambda results:
12265+            self.failUnlessEqual(results, new_data))
12266+        return d
12267+    test_append_power_of_two.timeout = 15
12268+
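A quick check of the segment arithmetic in the comment above, assuming
the default 128 KiB maximum segment size (div_ceil is the helper from
allmydata.util.mathutil):

    from allmydata.util.mathutil import div_ceil
    SEGSIZE = 128 * 1024
    print div_ceil(900000, SEGSIZE)                # 7 segments today
    print div_ceil(900000 + 2 * SEGSIZE, SEGSIZE)  # 9: crosses the
                                                   # 8-segment boundary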
12269+
12270+    def test_update_sdmf(self):
12271+        # Running update on a single-segment file should still work.
12272+        new_data = self.small_data + "appended"
12273+        d = self.sdmf_node.get_best_mutable_version()
12274+        d.addCallback(lambda mv:
12275+            mv.update(MutableData("appended"), len(self.small_data)))
12276+        d.addCallback(lambda ignored:
12277+            self.sdmf_node.download_best_version())
12278+        d.addCallback(lambda results:
12279+            self.failUnlessEqual(results, new_data))
12280+        return d
12281+
12282+    def test_replace_in_last_segment(self):
12283+        # The wrapper should know how to handle the tail segment
12284+        # appropriately.
12285+        replace_offset = len(self.data) - 100
12286+        new_data = self.data[:replace_offset] + "replaced"
12287+        rest_offset = replace_offset + len("replaced")
12288+        new_data += self.data[rest_offset:]
12289+        d = self.mdmf_node.get_best_mutable_version()
12290+        d.addCallback(lambda mv:
12291+            mv.update(MutableData("replaced"), replace_offset))
12292+        d.addCallback(lambda ignored:
12293+            self.mdmf_node.download_best_version())
12294+        d.addCallback(lambda results:
12295+            self.failUnlessEqual(results, new_data))
12296+        return d
12297+
12298+
12299+    def test_multiple_segment_replace(self):
12300+        replace_offset = 2 * DEFAULT_MAX_SEGMENT_SIZE
12301+        new_data = self.data[:replace_offset]
12302+        new_segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
12303+        new_data += 2 * new_segment
12304+        new_data += "replaced"
12305+        rest_offset = len(new_data)
12306+        new_data += self.data[rest_offset:]
12307+        d = self.mdmf_node.get_best_mutable_version()
12308+        d.addCallback(lambda mv:
12309+            mv.update(MutableData((2 * new_segment) + "replaced"),
12310+                      replace_offset))
12311+        d.addCallback(lambda ignored:
12312+            self.mdmf_node.download_best_version())
12313+        d.addCallback(lambda results:
12314+            self.failUnlessEqual(results, new_data))
12315+        return d
12316hunk ./src/allmydata/test/test_sftp.py 32
12317 
12318 from allmydata.util.consumer import download_to_data
12319 from allmydata.immutable import upload
12320+from allmydata.mutable import publish
12321 from allmydata.test.no_network import GridTestMixin
12322 from allmydata.test.common import ShouldFailMixin
12323 from allmydata.test.common_util import ReallyEqualMixin
12324hunk ./src/allmydata/test/test_sftp.py 84
12325         return d
12326 
12327     def _set_up_tree(self):
12328-        d = self.client.create_mutable_file("mutable file contents")
12329+        u = publish.MutableData("mutable file contents")
12330+        d = self.client.create_mutable_file(u)
12331         d.addCallback(lambda node: self.root.set_node(u"mutable", node))
12332         def _created_mutable(n):
12333             self.mutable = n
12334hunk ./src/allmydata/test/test_sftp.py 1334
12335         d.addCallback(lambda ign: self.failUnlessEqual(sftpd.all_heisenfiles, {}))
12336         d.addCallback(lambda ign: self.failUnlessEqual(self.handler._heisenfiles, {}))
12337         return d
12338+    test_makeDirectory.timeout = 15
12339 
12340     def test_execCommand_and_openShell(self):
12341         class FakeProtocol:
12342hunk ./src/allmydata/test/test_storage.py 27
12343                                      LayoutInvalid, MDMFSIGNABLEHEADER, \
12344                                      SIGNED_PREFIX, MDMFHEADER, \
12345                                      MDMFOFFSETS, SDMFSlotWriteProxy
12346-from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
12347-                                 SDMF_VERSION
12348+from allmydata.interfaces import BadWriteEnablerError
12349 from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
12350 from allmydata.test.common_web import WebRenderingMixin
12351 from allmydata.web.storage import StorageStatus, remove_prefix
12352hunk ./src/allmydata/test/test_system.py 26
12353 from allmydata.monitor import Monitor
12354 from allmydata.mutable.common import NotWriteableError
12355 from allmydata.mutable import layout as mutable_layout
12356+from allmydata.mutable.publish import MutableData
12357 from foolscap.api import DeadReferenceError
12358 from twisted.python.failure import Failure
12359 from twisted.web.client import getPage
12360hunk ./src/allmydata/test/test_system.py 467
12361     def test_mutable(self):
12362         self.basedir = "system/SystemTest/test_mutable"
12363         DATA = "initial contents go here."  # 25 bytes % 3 != 0
12364+        DATA_uploadable = MutableData(DATA)
12365         NEWDATA = "new contents yay"
12366hunk ./src/allmydata/test/test_system.py 469
12367+        NEWDATA_uploadable = MutableData(NEWDATA)
12368         NEWERDATA = "this is getting old"
12369hunk ./src/allmydata/test/test_system.py 471
12370+        NEWERDATA_uploadable = MutableData(NEWERDATA)
12371 
12372         d = self.set_up_nodes(use_key_generator=True)
12373 
12374hunk ./src/allmydata/test/test_system.py 478
12375         def _create_mutable(res):
12376             c = self.clients[0]
12377             log.msg("starting create_mutable_file")
12378-            d1 = c.create_mutable_file(DATA)
12379+            d1 = c.create_mutable_file(DATA_uploadable)
12380             def _done(res):
12381                 log.msg("DONE: %s" % (res,))
12382                 self._mutable_node_1 = res
12383hunk ./src/allmydata/test/test_system.py 565
12384             self.failUnlessEqual(res, DATA)
12385             # replace the data
12386             log.msg("starting replace1")
12387-            d1 = newnode.overwrite(NEWDATA)
12388+            d1 = newnode.overwrite(NEWDATA_uploadable)
12389             d1.addCallback(lambda res: newnode.download_best_version())
12390             return d1
12391         d.addCallback(_check_download_3)
12392hunk ./src/allmydata/test/test_system.py 579
12393             newnode2 = self.clients[3].create_node_from_uri(uri)
12394             self._newnode3 = self.clients[3].create_node_from_uri(uri)
12395             log.msg("starting replace2")
12396-            d1 = newnode1.overwrite(NEWERDATA)
12397+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
12398             d1.addCallback(lambda res: newnode2.download_best_version())
12399             return d1
12400         d.addCallback(_check_download_4)
12401hunk ./src/allmydata/test/test_system.py 649
12402         def _check_empty_file(res):
12403             # make sure we can create empty files, this usually screws up the
12404             # segsize math
12405-            d1 = self.clients[2].create_mutable_file("")
12406+            d1 = self.clients[2].create_mutable_file(MutableData(""))
12407             d1.addCallback(lambda newnode: newnode.download_best_version())
12408             d1.addCallback(lambda res: self.failUnlessEqual("", res))
12409             return d1
12410hunk ./src/allmydata/test/test_system.py 680
12411                                  self.key_generator_svc.key_generator.pool_size + size_delta)
12412 
12413         d.addCallback(check_kg_poolsize, 0)
12414-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
12415+        d.addCallback(lambda junk:
12416+            self.clients[3].create_mutable_file(MutableData('hello, world')))
12417         d.addCallback(check_kg_poolsize, -1)
12418         d.addCallback(lambda junk: self.clients[3].create_dirnode())
12419         d.addCallback(check_kg_poolsize, -2)
12420hunk ./src/allmydata/test/test_web.py 28
12421 from allmydata.util.encodingutil import to_str
12422 from allmydata.test.common import FakeCHKFileNode, FakeMutableFileNode, \
12423      create_chk_filenode, WebErrorMixin, ShouldFailMixin, make_mutable_file_uri
12424-from allmydata.interfaces import IMutableFileNode
12425+from allmydata.interfaces import IMutableFileNode, SDMF_VERSION, MDMF_VERSION
12426 from allmydata.mutable import servermap, publish, retrieve
12427 import allmydata.test.common_util as testutil
12428 from allmydata.test.no_network import GridTestMixin
12429hunk ./src/allmydata/test/test_web.py 57
12430         return FakeCHKFileNode(cap)
12431     def _create_mutable(self, cap):
12432         return FakeMutableFileNode(None, None, None, None).init_from_cap(cap)
12433-    def create_mutable_file(self, contents="", keysize=None):
12434+    def create_mutable_file(self, contents="", keysize=None,
12435+                            version=SDMF_VERSION):
12436         n = FakeMutableFileNode(None, None, None, None)
12437hunk ./src/allmydata/test/test_web.py 60
12438+        n.set_version(version)
12439         return n.create(contents)
12440 
12441 class FakeUploader(service.Service):
12442hunk ./src/allmydata/test/test_web.py 157
12443         self.nodemaker = FakeNodeMaker(None, self._secret_holder, None,
12444                                        self.uploader, None,
12445                                        None, None)
12446+        self.mutable_file_default = SDMF_VERSION
12447 
12448     def startService(self):
12449         return service.MultiService.startService(self)
12450hunk ./src/allmydata/test/test_web.py 762
12451                              self.PUT, base + "/@@name=/blah.txt", "")
12452         return d
12453 
12454+
12455     def test_GET_DIRURL_named_bad(self):
12456         base = "/file/%s" % urllib.quote(self._foo_uri)
12457         d = self.shouldFail2(error.Error, "test_PUT_DIRURL_named_bad",
12458hunk ./src/allmydata/test/test_web.py 878
12459                                                       self.NEWFILE_CONTENTS))
12460         return d
12461 
12462+    def test_PUT_NEWFILEURL_unlinked_mdmf(self):
12463+        # this should get us a few segments of an MDMF mutable file,
12464+        # whose format we can then verify via its t=json representation.
12465+        contents = self.NEWFILE_CONTENTS * 300000
12466+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
12467+                     contents)
12468+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12469+        d.addCallback(lambda json: self.failUnlessIn("mdmf", json))
12470+        return d
12471+
12472+    def test_PUT_NEWFILEURL_unlinked_sdmf(self):
12473+        contents = self.NEWFILE_CONTENTS * 300000
12474+        d = self.PUT("/uri?mutable=true&mutable-type=sdmf",
12475+                     contents)
12476+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12477+        d.addCallback(lambda json: self.failUnlessIn("sdmf", json))
12478+        return d
12479+
12480     def test_PUT_NEWFILEURL_range_bad(self):
12481         headers = {"content-range": "bytes 1-10/%d" % len(self.NEWFILE_CONTENTS)}
12482         target = self.public_url + "/foo/new.txt"
12483hunk ./src/allmydata/test/test_web.py 928
12484         return d
12485 
12486     def test_PUT_NEWFILEURL_mutable_toobig(self):
12487-        d = self.shouldFail2(error.Error, "test_PUT_NEWFILEURL_mutable_toobig",
12488-                             "413 Request Entity Too Large",
12489-                             "SDMF is limited to one segment, and 10001 > 10000",
12490-                             self.PUT,
12491-                             self.public_url + "/foo/new.txt?mutable=true",
12492-                             "b" * (self.s.MUTABLE_SIZELIMIT+1))
12493+        # Large mutable files are now allowed, so this upload should
12494+        # succeed instead of failing with "413 Request Entity Too Large".
12495+        d = self.PUT(self.public_url + "/foo/new.txt?mutable=true",
12496+                     "b" * (self.s.MUTABLE_SIZELIMIT + 1))
12497         return d
12498 
12499     def test_PUT_NEWFILEURL_replace(self):
12500hunk ./src/allmydata/test/test_web.py 1026
12501         d.addCallback(_check1)
12502         return d
12503 
12504+    def test_GET_FILEURL_json_mutable_type(self):
12505+        # The JSON should include mutable-type, which says whether the
12506+        # file is SDMF or MDMF
12507+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
12508+                     self.NEWFILE_CONTENTS * 300000)
12509+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12510+        def _got_json(json, version):
12511+            data = simplejson.loads(json)
12512+            assert "filenode" == data[0]
12513+            data = data[1]
12514+            assert isinstance(data, dict)
12515+
12516+            self.failUnlessIn("mutable-type", data)
12517+            self.failUnlessEqual(data['mutable-type'], version)
12518+
12519+        d.addCallback(_got_json, "mdmf")
12520+        # Now make an SDMF file and check that it is reported correctly.
12521+        d.addCallback(lambda ignored:
12522+            self.PUT("/uri?mutable=true&mutable-type=sdmf",
12523+                      self.NEWFILE_CONTENTS * 300000))
12524+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12525+        d.addCallback(_got_json, "sdmf")
12526+        return d
12527+
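For reference, the t=json response parsed by _got_json has roughly this
shape (a sketch reconstructed from the assertions above; the other
filenode fields are omitted):

    ["filenode", {"mutable": true,
                  "mutable-type": "mdmf",
                  ...}]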
12528     def test_GET_FILEURL_json_missing(self):
12529         d = self.GET(self.public_url + "/foo/missing?json")
12530         d.addBoth(self.should404, "test_GET_FILEURL_json_missing")
12531hunk ./src/allmydata/test/test_web.py 1088
12532         d.addBoth(self.should404, "test_GET_FILEURL_uri_missing")
12533         return d
12534 
12535-    def test_GET_DIRECTORY_html_banner(self):
12536+    def test_GET_DIRECTORY_html(self):
12537         d = self.GET(self.public_url + "/foo", followRedirect=True)
12538         def _check(res):
12539             self.failUnlessIn('<div class="toolbar-item"><a href="../../..">Return to Welcome page</a></div>',res)
12540hunk ./src/allmydata/test/test_web.py 1092
12541+            self.failUnlessIn("mutable-type-mdmf", res)
12542+            self.failUnlessIn("mutable-type-sdmf", res)
12543         d.addCallback(_check)
12544         return d
12545 
12546hunk ./src/allmydata/test/test_web.py 1097
12547+    def test_GET_root_html(self):
12548+        # make sure that we have the option to upload an unlinked
12549+        # mutable file in SDMF and MDMF formats.
12550+        d = self.GET("/")
12551+        def _got_html(html):
12552+            # These are radio buttons that allow the user to toggle
12553+            # whether a particular mutable file is MDMF or SDMF.
12554+            self.failUnlessIn("mutable-type-mdmf", html)
12555+            self.failUnlessIn("mutable-type-sdmf", html)
12556+        d.addCallback(_got_html)
12557+        return d
12558+
12559+    def test_mutable_type_defaults(self):
12560+        # The checked="checked" attribute of the inputs corresponding to
12561+        # the mutable-type parameter should change as expected with the
12562+        # value configured in tahoe.cfg.
12563+        #
12564+        # By default, the value configured with the client is
12565+        # SDMF_VERSION, so that should be checked.
12566+        assert self.s.mutable_file_default == SDMF_VERSION
12567+
12568+        d = self.GET("/")
12569+        def _got_html(html, value):
12570+            i = 'input checked="checked" type="radio" id="mutable-type-%s"'
12571+            self.failUnlessIn(i % value, html)
12572+        d.addCallback(_got_html, "sdmf")
12573+        d.addCallback(lambda ignored:
12574+            self.GET(self.public_url + "/foo", followRedirect=True))
12575+        d.addCallback(_got_html, "sdmf")
12576+        # Now switch the configuration value to MDMF. The MDMF radio
12577+        # buttons should now be checked on these pages.
12578+        def _swap_values(ignored):
12579+            self.s.mutable_file_default = MDMF_VERSION
12580+        d.addCallback(_swap_values)
12581+        d.addCallback(lambda ignored: self.GET("/"))
12582+        d.addCallback(_got_html, "mdmf")
12583+        d.addCallback(lambda ignored:
12584+            self.GET(self.public_url + "/foo", followRedirect=True))
12585+        d.addCallback(_got_html, "mdmf")
12586+        return d
12587+
12588     def test_GET_DIRURL(self):
12589         # the addSlash means we get a redirect here
12590         # from /uri/$URI/foo/ , we need ../../../ to get back to the root
12591hunk ./src/allmydata/test/test_web.py 1227
12592         d.addCallback(self.failUnlessIsFooJSON)
12593         return d
12594 
12595+    def test_GET_DIRURL_json_mutable_type(self):
12596+        d = self.PUT(self.public_url + \
12597+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
12598+                     self.NEWFILE_CONTENTS * 300000)
12599+        d.addCallback(lambda ignored:
12600+            self.PUT(self.public_url + \
12601+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
12602+                     self.NEWFILE_CONTENTS * 300000))
12603+        # Now we have an MDMF and an SDMF file in the directory. If we
12604+        # GET its JSON, we should see each file's format in mutable-type.
12605+        d.addCallback(lambda ignored:
12606+            self.GET(self.public_url + "/foo?t=json"))
12607+        def _got_json(json):
12608+            data = simplejson.loads(json)
12609+            assert data[0] == "dirnode"
12610+
12611+            data = data[1]
12612+            kids = data['children']
12613+
12614+            mdmf_data = kids['mdmf.txt'][1]
12615+            self.failUnlessIn("mutable-type", mdmf_data)
12616+            self.failUnlessEqual(mdmf_data['mutable-type'], "mdmf")
12617+
12618+            sdmf_data = kids['sdmf.txt'][1]
12619+            self.failUnlessIn("mutable-type", sdmf_data)
12620+            self.failUnlessEqual(sdmf_data['mutable-type'], "sdmf")
12621+        d.addCallback(_got_json)
12622+        return d
12623+
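The dirnode JSON checked here nests the same filenode metadata under
each child; roughly (a sketch based on the assertions above, other
fields omitted):

    ["dirnode", {"children": {
        "mdmf.txt": ["filenode", {"mutable-type": "mdmf", ...}],
        "sdmf.txt": ["filenode", {"mutable-type": "sdmf", ...}]}}]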
12624 
12625     def test_POST_DIRURL_manifest_no_ophandle(self):
12626         d = self.shouldFail2(error.Error,
12627hunk ./src/allmydata/test/test_web.py 1810
12628         return d
12629 
12630     def test_POST_upload_no_link_mutable_toobig(self):
12631-        d = self.shouldFail2(error.Error,
12632-                             "test_POST_upload_no_link_mutable_toobig",
12633-                             "413 Request Entity Too Large",
12634-                             "SDMF is limited to one segment, and 10001 > 10000",
12635-                             self.POST,
12636-                             "/uri", t="upload", mutable="true",
12637-                             file=("new.txt",
12638-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
12639+        # The SDMF size limit is no longer in place, so we should be
12640+        # able to upload mutable files that are as large as we want them
12641+        # to be.
12642+        d = self.POST("/uri", t="upload", mutable="true",
12643+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
12644         return d
12645 
12646hunk ./src/allmydata/test/test_web.py 1817
12647+
12648+    def test_POST_upload_mutable_type_unlinked(self):
12649+        d = self.POST("/uri?t=upload&mutable=true&mutable-type=sdmf",
12650+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
12651+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12652+        def _got_json(json, version):
12653+            data = simplejson.loads(json)
12654+            data = data[1]
12655+
12656+            self.failUnlessIn("mutable-type", data)
12657+            self.failUnlessEqual(data['mutable-type'], version)
12658+        d.addCallback(_got_json, "sdmf")
12659+        d.addCallback(lambda ignored:
12660+            self.POST("/uri?t=upload&mutable=true&mutable-type=mdmf",
12661+                      file=('mdmf.txt', self.NEWFILE_CONTENTS * 300000)))
12662+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12663+        d.addCallback(_got_json, "mdmf")
12664+        return d
12665+
12666+    def test_POST_upload_mutable_type(self):
12667+        d = self.POST(self.public_url + \
12668+                      "/foo?t=upload&mutable=true&mutable-type=sdmf",
12669+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
12670+        fn = self._foo_node
12671+        def _got_cap(filecap, filename):
12672+            filenameu = unicode(filename)
12673+            self.failUnlessURIMatchesRWChild(filecap, fn, filenameu)
12674+            return self.GET(self.public_url + "/foo/%s?t=json" % filename)
12675+        d.addCallback(_got_cap, "sdmf.txt")
12676+        def _got_json(json, version):
12677+            data = simplejson.loads(json)
12678+            data = data[1]
12679+
12680+            self.failUnlessIn("mutable-type", data)
12681+            self.failUnlessEqual(data['mutable-type'], version)
12682+        d.addCallback(_got_json, "sdmf")
12683+        d.addCallback(lambda ignored:
12684+            self.POST(self.public_url + \
12685+                      "/foo?t=upload&mutable=true&mutable-type=mdmf",
12686+                      file=("mdmf.txt", self.NEWFILE_CONTENTS * 300000)))
12687+        d.addCallback(_got_cap, "mdmf.txt")
12688+        d.addCallback(_got_json, "mdmf")
12689+        return d
12690+
12691     def test_POST_upload_mutable(self):
12692         # this creates a mutable file
12693         d = self.POST(self.public_url + "/foo", t="upload", mutable="true",
12694hunk ./src/allmydata/test/test_web.py 1985
12695             self.failUnlessReallyEqual(headers["content-type"], ["text/plain"])
12696         d.addCallback(_got_headers)
12697 
12698-        # make sure that size errors are displayed correctly for overwrite
12699-        d.addCallback(lambda res:
12700-                      self.shouldFail2(error.Error,
12701-                                       "test_POST_upload_mutable-toobig",
12702-                                       "413 Request Entity Too Large",
12703-                                       "SDMF is limited to one segment, and 10001 > 10000",
12704-                                       self.POST,
12705-                                       self.public_url + "/foo", t="upload",
12706-                                       mutable="true",
12707-                                       file=("new.txt",
12708-                                             "b" * (self.s.MUTABLE_SIZELIMIT+1)),
12709-                                       ))
12710-
12711+        # make sure that outdated size limits aren't enforced anymore.
12712+        d.addCallback(lambda ignored:
12713+            self.POST(self.public_url + "/foo", t="upload",
12714+                      mutable="true",
12715+                      file=("new.txt",
12716+                            "b" * (self.s.MUTABLE_SIZELIMIT+1))))
12717         d.addErrback(self.dump_error)
12718         return d
12719 
12720hunk ./src/allmydata/test/test_web.py 1995
12721     def test_POST_upload_mutable_toobig(self):
12722-        d = self.shouldFail2(error.Error,
12723-                             "test_POST_upload_mutable_toobig",
12724-                             "413 Request Entity Too Large",
12725-                             "SDMF is limited to one segment, and 10001 > 10000",
12726-                             self.POST,
12727-                             self.public_url + "/foo",
12728-                             t="upload", mutable="true",
12729-                             file=("new.txt",
12730-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
12731+        # SDMF had a size limit that was removed a while ago. MDMF has
12732+        # never had a size limit. Test to make sure that we do not
12733+        # encounter errors when trying to upload large mutable files,
12734+        # since nothing in the code should prohibit large mutable
12735+        # files anymore.
12736+        d = self.POST(self.public_url + "/foo",
12737+                      t="upload", mutable="true",
12738+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
12739         return d
12740 
12741     def dump_error(self, f):
12742hunk ./src/allmydata/test/test_web.py 3005
12743                                                       contents))
12744         return d
12745 
12746+    def test_PUT_NEWFILEURL_mdmf(self):
12747+        new_contents = self.NEWFILE_CONTENTS * 300000
12748+        d = self.PUT(self.public_url + \
12749+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
12750+                     new_contents)
12751+        d.addCallback(lambda ignored:
12752+            self.GET(self.public_url + "/foo/mdmf.txt?t=json"))
12753+        def _got_json(json):
12754+            data = simplejson.loads(json)
12755+            data = data[1]
12756+            self.failUnlessIn("mutable-type", data)
12757+            self.failUnlessEqual(data['mutable-type'], "mdmf")
12758+        d.addCallback(_got_json)
12759+        return d
12760+
12761+    def test_PUT_NEWFILEURL_sdmf(self):
12762+        new_contents = self.NEWFILE_CONTENTS * 300000
12763+        d = self.PUT(self.public_url + \
12764+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
12765+                     new_contents)
12766+        d.addCallback(lambda ignored:
12767+            self.GET(self.public_url + "/foo/sdmf.txt?t=json"))
12768+        def _got_json(json):
12769+            data = simplejson.loads(json)
12770+            data = data[1]
12771+            self.failUnlessIn("mutable-type", data)
12772+            self.failUnlessEqual(data['mutable-type'], "sdmf")
12773+        d.addCallback(_got_json)
12774+        return d
12775+
12776     def test_PUT_NEWFILEURL_uri_replace(self):
12777         contents, n, new_uri = self.makefile(8)
12778         d = self.PUT(self.public_url + "/foo/bar.txt?t=uri", new_uri)
12779hunk ./src/allmydata/test/test_web.py 3156
12780         d.addCallback(_done)
12781         return d
12782 
12783+
12784+    def test_PUT_update_at_offset(self):
12785+        file_contents = "test file" * 100000 # about 900 KiB
12786+        d = self.PUT("/uri?mutable=true", file_contents)
12787+        def _then(filecap):
12788+            self.filecap = filecap
12789+            new_data = file_contents[:100]
12790+            new = "replaced and so on"
12791+            new_data += new
12792+            new_data += file_contents[len(new_data):]
12793+            assert len(new_data) == len(file_contents)
12794+            self.new_data = new_data
12795+        d.addCallback(_then)
12796+        d.addCallback(lambda ignored:
12797+            self.PUT("/uri/%s?replace=True&offset=100" % self.filecap,
12798+                     "replaced and so on"))
12799+        def _get_data(filecap):
12800+            n = self.s.create_node_from_uri(filecap)
12801+            return n.download_best_version()
12802+        d.addCallback(_get_data)
12803+        d.addCallback(lambda results:
12804+            self.failUnlessEqual(results, self.new_data))
12805+        # Now try appending things to the file
12806+        d.addCallback(lambda ignored:
12807+            self.PUT("/uri/%s?offset=%d" % (self.filecap, len(self.new_data)),
12808+                     "puppies" * 100))
12809+        d.addCallback(_get_data)
12810+        d.addCallback(lambda results:
12811+            self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
12812+        return d
12813+
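Condensed, the webapi contract exercised here (self.PUT is the test
helper used throughout this file; offset is a byte position, and an
offset equal to the current length appends -- "curlen" below is a
hypothetical name for that length):

    self.PUT("/uri/%s?replace=True&offset=100" % filecap, "new bytes")
    self.PUT("/uri/%s?offset=%d" % (filecap, curlen), "appended bytes")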
12814+
12815+    def test_PUT_update_at_offset_immutable(self):
12816+        file_contents = "Test file" * 100000
12817+        d = self.PUT("/uri", file_contents)
12818+        def _then(filecap):
12819+            self.filecap = filecap
12820+        d.addCallback(_then)
12821+        d.addCallback(lambda ignored:
12822+            self.shouldHTTPError("test immutable update",
12823+                                 400, "Bad Request",
12824+                                 "immutable",
12825+                                 self.PUT,
12826+                                 "/uri/%s?offset=50" % self.filecap,
12827+                                 "foo"))
12828+        return d
12829+
12830+
12831     def test_bad_method(self):
12832         url = self.webish_url + self.public_url + "/foo/bar.txt"
12833         d = self.shouldHTTPError("test_bad_method",
12834hunk ./src/allmydata/test/test_web.py 3473
12835         def _stash_mutable_uri(n, which):
12836             self.uris[which] = n.get_uri()
12837             assert isinstance(self.uris[which], str)
12838-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
12839+        d.addCallback(lambda ign:
12840+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
12841         d.addCallback(_stash_mutable_uri, "corrupt")
12842         d.addCallback(lambda ign:
12843                       c0.upload(upload.Data("literal", convergence="")))
12844hunk ./src/allmydata/test/test_web.py 3620
12845         def _stash_mutable_uri(n, which):
12846             self.uris[which] = n.get_uri()
12847             assert isinstance(self.uris[which], str)
12848-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
12849+        d.addCallback(lambda ign:
12850+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
12851         d.addCallback(_stash_mutable_uri, "corrupt")
12852 
12853         def _compute_fileurls(ignored):
12854hunk ./src/allmydata/test/test_web.py 4283
12855         def _stash_mutable_uri(n, which):
12856             self.uris[which] = n.get_uri()
12857             assert isinstance(self.uris[which], str)
12858-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
12859+        d.addCallback(lambda ign:
12860+            c0.create_mutable_file(publish.MutableData(DATA+"2")))
12861         d.addCallback(_stash_mutable_uri, "mutable")
12862 
12863         def _compute_fileurls(ignored):
12864hunk ./src/allmydata/test/test_web.py 4383
12865                                                         convergence="")))
12866         d.addCallback(_stash_uri, "small")
12867 
12868-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
12869+        d.addCallback(lambda ign:
12870+            c0.create_mutable_file(publish.MutableData("mutable")))
12871         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
12872         d.addCallback(_stash_uri, "mutable")
12873 
12874}
12875[resolve conflicts between 393-MDMF patches and trunk as of 1.8.2
12876"Brian Warner <warner@lothar.com>"**20110220230201
12877 Ignore-this: 9bbf5d26c994e8069202331dcb4cdd95
12878] {
12879hunk ./docs/configuration.rst 323
12880     (Mutable files use a different share placement algorithm that does not
12881     currently consider this parameter.)
12882 
12883+``mutable.format = sdmf or mdmf``
12884+
12885+    This value tells Tahoe what the default mutable file format should
12886+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
12887+    in the old SDMF format. This is desirable for clients that operate on
12888+    grids where some peers run older versions of Tahoe, as these older
12889+    versions cannot read the new MDMF mutable file format. If
12890+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
12891+    the new MDMF format, which supports efficient in-place modification and
12892+    streaming downloads. You can override this value on a per-file basis
12893+    using the mutable-type parameter in the webapi. If you do not specify
12894+    a value here, Tahoe will use SDMF for all newly-created mutable files.
12895+
12896+    Note that this parameter only applies to mutable files. Mutable
12897+    directories, which are stored as mutable files, are not controlled by
12898+    this parameter and will always use SDMF. We may revisit this decision
12899+    in future versions of Tahoe-LAFS.
12900+
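For illustration, a client that should create MDMF mutable files by
default would carry a stanza like this in tahoe.cfg (a sketch; [client]
is the section holding the other client options described in this file):

    [client]
    mutable.format = mdmf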
12901 
12902 Storage Server Configuration
12903 ============================
12904hunk ./docs/configuration.rst 401
12905     `<garbage-collection.rst>`_ for full details.
12906 
12907 
12908-shares.needed = (int, optional) aka "k", default 3
12909-shares.total = (int, optional) aka "N", N >= k, default 10
12910-shares.happy = (int, optional) 1 <= happy <= N, default 7
12911-
12912- These three values set the default encoding parameters. Each time a new file
12913- is uploaded, erasure-coding is used to break the ciphertext into separate
12914- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
12915- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
12916- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
12917- Setting k to 1 is equivalent to simple replication (uploading N copies of
12918- the file).
12919-
12920- These values control the tradeoff between storage overhead, performance, and
12921- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
12922- backend storage space (the actual value will be a bit more, because of other
12923- forms of overhead). Up to N-k shares can be lost before the file becomes
12924- unrecoverable, so assuming there are at least N servers, up to N-k servers
12925- can be offline without losing the file. So large N/k ratios are more
12926- reliable, and small N/k ratios use less disk space. Clearly, k must never be
12927- smaller than N.
12928-
12929- Large values of N will slow down upload operations slightly, since more
12930- servers must be involved, and will slightly increase storage overhead due to
12931- the hash trees that are created. Large values of k will cause downloads to
12932- be marginally slower, because more servers must be involved. N cannot be
12933- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
12934- uses.
12935-
12936- shares.happy allows you control over the distribution of your immutable file.
12937- For a successful upload, shares are guaranteed to be initially placed on
12938- at least 'shares.happy' distinct servers, the correct functioning of any
12939- k of which is sufficient to guarantee the availability of the uploaded file.
12940- This value should not be larger than the number of servers on your grid.
12941-
12942- A value of shares.happy <= k is allowed, but does not provide any redundancy
12943- if some servers fail or lose shares.
12944-
12945- (Mutable files use a different share placement algorithm that does not
12946-  consider this parameter.)
12947-
12948-
12949-== Storage Server Configuration ==
12950-
12951-[storage]
12952-enabled = (boolean, optional)
12953-
12954- If this is True, the node will run a storage server, offering space to other
12955- clients. If it is False, the node will not run a storage server, meaning
12956- that no shares will be stored on this node. Use False this for clients who
12957- do not wish to provide storage service. The default value is True.
12958-
12959-readonly = (boolean, optional)
12960-
12961- If True, the node will run a storage server but will not accept any shares,
12962- making it effectively read-only. Use this for storage servers which are
12963- being decommissioned: the storage/ directory could be mounted read-only,
12964- while shares are moved to other servers. Note that this currently only
12965- affects immutable shares. Mutable shares (used for directories) will be
12966- written and modified anyway. See ticket #390 for the current status of this
12967- bug. The default value is False.
12968-
12969-reserved_space = (str, optional)
12970-
12971- If provided, this value defines how much disk space is reserved: the storage
12972- server will not accept any share which causes the amount of free disk space
12973- to drop below this value. (The free space is measured by a call to statvfs(2)
12974- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
12975- user account under which the storage server runs.)
12976-
12977- This string contains a number, with an optional case-insensitive scale
12978- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
12979- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
12980- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
12981-
12982-expire.enabled =
12983-expire.mode =
12984-expire.override_lease_duration =
12985-expire.cutoff_date =
12986-expire.immutable =
12987-expire.mutable =
12988-
12989- These settings control garbage-collection, in which the server will delete
12990- shares that no longer have an up-to-date lease on them. Please see the
12991- neighboring "garbage-collection.txt" document for full details.
12992-
12993-
12994-== Running A Helper ==
12995+Running A Helper
12996+================
12997 
12998 A "helper" is a regular client node that also offers the "upload helper"
12999 service.
13000replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
13001hunk ./src/allmydata/mutable/retrieve.py 7
13002 from zope.interface import implements
13003 from twisted.internet import defer
13004 from twisted.python import failure
13005-from foolscap.api import DeadReferenceError, eventually, fireEventually
13006-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
13007-from allmydata.util import hashutil, idlib, log
13008+from twisted.internet.interfaces import IPushProducer, IConsumer
13009+from foolscap.api import eventually, fireEventually
13010+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
13011+                                 MDMF_VERSION, SDMF_VERSION
13012+from allmydata.util import hashutil, log, mathutil
13013+from allmydata.util.dictutil import DictOfSets
13014 from allmydata import hashtree, codec
13015 from allmydata.storage.server import si_b2a
13016 from pycryptopp.cipher.aes import AES
13017hunk ./src/allmydata/mutable/retrieve.py 239
13018             # KiB, so we ask for that much.
13019             # TODO: Change the cache methods to allow us to fetch all of the
13020             # data that they have, then change this method to do that.
13021-            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
13022-                                                               shnum,
13023-                                                               0,
13024-                                                               1000)
13025+            any_cache = self._node._read_from_cache(self.verinfo, shnum,
13026+                                                    0, 1000)
13027             ss = self.servermap.connections[peerid]
13028             reader = MDMFSlotReadProxy(ss,
13029                                        self._storage_index,
13030hunk ./src/allmydata/mutable/retrieve.py 373
13031                  (k, n, self._num_segments, self._segment_size,
13032                   self._tail_segment_size))
13033 
13034-        # ask the cache first
13035-        got_from_cache = False
13036-        datavs = []
13037-        for (offset, length) in readv:
13038-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
13039-                                                            offset, length)
13040-            if data is not None:
13041-                datavs.append(data)
13042-        if len(datavs) == len(readv):
13043-            self.log("got data from cache")
13044-            got_from_cache = True
13045-            d = fireEventually({shnum: datavs})
13046-            # datavs is a dict mapping shnum to a pair of strings
13047+        for i in xrange(self._total_shares):
13048+            # Create the block hash trees now so we don't have to later.
13049+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
13050+
13051+        # Our last task is to tell the downloader where to start and
13052+        # where to stop. We use three parameters for that:
13053+        #   - self._start_segment: the segment that we need to start
13054+        #     downloading from.
13055+        #   - self._current_segment: the next segment that we need to
13056+        #     download.
13057+        #   - self._last_segment: The last segment that we were asked to
13058+        #     download.
13059+        #
13060+        #  We say that the download is complete when
13061+        #  self._current_segment > self._last_segment. We use
13062+        #  self._start_segment and self._last_segment to know when to
13063+        #  strip things off of segments, and how much to strip.
13064+        if self._offset:
13065+            self.log("got offset: %d" % self._offset)
13066+            # our start segment is the first segment containing the
13067+            # offset we were given.
13068+            start = mathutil.div_ceil(self._offset,
13069+                                      self._segment_size)
13070+            # this gets us the first segment after self._offset. Then
13071+            # our start segment is the one before it.
13072+            start -= 1
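+            # Worked example, assuming the default 128 KiB (131072-byte)
+            # segment size: for self._offset = 300000, div_ceil gives 3,
+            # and 3 - 1 = 2; segment 2 covers bytes 262144..393215, which
+            # contains the offset, so the fetch begins there.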
13073+
13074+            assert start < self._num_segments
13075+            self._start_segment = start
13076+            self.log("got start segment: %d" % self._start_segment)
13077         else:
13078             self._start_segment = 0
13079 
13080hunk ./src/allmydata/mutable/servermap.py 7
13081 from itertools import count
13082 from twisted.internet import defer
13083 from twisted.python import failure
13084-from foolscap.api import DeadReferenceError, RemoteException, eventually
13085-from allmydata.util import base32, hashutil, idlib, log
13086+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
13087+                         fireEventually
13088+from allmydata.util import base32, hashutil, idlib, log, deferredutil
13089+from allmydata.util.dictutil import DictOfSets
13090 from allmydata.storage.server import si_b2a
13091 from allmydata.interfaces import IServermapUpdaterStatus
13092 from pycryptopp.publickey import rsa
13093hunk ./src/allmydata/mutable/servermap.py 16
13094 
13095 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
13096-     DictOfSets, CorruptShareError, NeedMoreDataError
13097-from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
13098-     SIGNED_PREFIX_LENGTH
13099+     CorruptShareError
13100+from allmydata.mutable.layout import SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
13101 
13102 class UpdateStatus:
13103     implements(IServermapUpdaterStatus)
13104hunk ./src/allmydata/mutable/servermap.py 391
13105         #  * if we need the encrypted private key, we want [-1216ish:]
13106         #   * but we can't read from negative offsets
13107         #   * the offset table tells us the 'ish', also the positive offset
13108-        # A future version of the SMDF slot format should consider using
13109-        # fixed-size slots so we can retrieve less data. For now, we'll just
13110-        # read 2000 bytes, which also happens to read enough actual data to
13111-        # pre-fetch a 9-entry dirnode.
13112+        # MDMF:
13113+        #  * Checkstring? [0:72]
13114+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
13115+        #    the offset table will tell us for sure.
13116+        #  * If we need the verification key, we have to consult the offset
13117+        #    table as well.
13118+        # At this point, we don't know which we are. Our filenode can
13119+        # tell us, but it might be lying -- in some cases, we're
13120+        # responsible for telling it which kind of file it is.
13121         self._read_size = 4000
13122         if mode == MODE_CHECK:
13123             # we use unpack_prefix_and_signature, so we need 1k
13124hunk ./src/allmydata/mutable/servermap.py 633
13125         updated.
13126         """
13127         if verinfo:
13128-            self._node._add_to_cache(verinfo, shnum, 0, data, now)
13129+            self._node._add_to_cache(verinfo, shnum, 0, data)
13130 
13131 
13132     def _got_results(self, datavs, peerid, readsize, stuff, started):
13133hunk ./src/allmydata/mutable/servermap.py 664
13134 
13135         for shnum,datav in datavs.items():
13136             data = datav[0]
13137-            try:
13138-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
13139-                last_verinfo = verinfo
13140-                last_shnum = shnum
13141-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
13142-            except CorruptShareError, e:
13143-                # log it and give the other shares a chance to be processed
13144-                f = failure.Failure()
13145-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
13146-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
13147-                self.notify_server_corruption(peerid, shnum, str(e))
13148-                self._bad_peers.add(peerid)
13149-                self._last_failure = f
13150-                checkstring = data[:SIGNED_PREFIX_LENGTH]
13151-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
13152-                self._servermap.problems.append(f)
13153-                pass
13154+            reader = MDMFSlotReadProxy(ss,
13155+                                       storage_index,
13156+                                       shnum,
13157+                                       data)
13158+            self._readers.setdefault(peerid, dict())[shnum] = reader
13159+            # our goal, with each response, is to validate the version
13160+            # information and share data as best we can at this point --
13161+            # we do this by validating the signature. To do this, we
13162+            # need to do the following:
13163+            #   - If we don't already have the public key, fetch the
13164+            #     public key. We use this to validate the signature.
13165+            if not self._node.get_pubkey():
13166+                # fetch and set the public key.
13167+                d = reader.get_verification_key(queue=True)
13168+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
13169+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
13170+                # XXX: Make self._pubkey_query_failed?
13171+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
13172+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
13173+            else:
13174+                # we already have the public key.
13175+                d = defer.succeed(None)
13176 
13177             # Neither of these two branches return anything of
13178             # consequence, so the first entry in our deferredlist will
13179hunk ./src/allmydata/test/test_storage.py 1
13180-import time, os.path, platform, stat, re, simplejson, struct
13181+import time, os.path, platform, stat, re, simplejson, struct, shutil
13182 
13183hunk ./src/allmydata/test/test_storage.py 3
13184-import time, os.path, stat, re, simplejson, struct
13185+import mock
13186 
13187 from twisted.trial import unittest
13188 
13189}
13190[mutable/filenode.py: fix create_mutable_file('string')
13191"Brian Warner <warner@lothar.com>"**20110221014659
13192 Ignore-this: dc6bdad761089f0199681eeb784f1001
13193] hunk ./src/allmydata/mutable/filenode.py 137
13194         if contents is None:
13195             return MutableData("")
13196 
13197+        if isinstance(contents, str):
13198+            return MutableData(contents)
13199+
13200         if IMutableUploadable.providedBy(contents):
13201             return contents
13202 
13203[resolve more conflicts with current trunk
13204"Brian Warner <warner@lothar.com>"**20110221055600
13205 Ignore-this: 77ad038a478dbf5d9b34f7a68159a3e0
13206] hunk ./src/allmydata/mutable/servermap.py 461
13207         self._queries_completed = 0
13208 
13209         sb = self._storage_broker
13210-        full_peerlist = sb.get_servers_for_index(self._storage_index)
13211+        # All of the peers, permuted by the storage index, as usual.
13212+        full_peerlist = [(s.get_serverid(), s.get_rref())
13213+                         for s in sb.get_servers_for_psi(self._storage_index)]
13214         self.full_peerlist = full_peerlist # for use later, immutable
13215         self.extra_peers = full_peerlist[:] # peers are removed as we use them
13216         self._good_peers = set() # peers who had some shares
13217[update MDMF code with StorageFarmBroker changes
13218"Brian Warner <warner@lothar.com>"**20110221061004
13219 Ignore-this: a693b201d31125b391cebe0412ddd027
13220] {
13221hunk ./src/allmydata/mutable/publish.py 203
13222         self._encprivkey = self._node.get_encprivkey()
13223 
13224         sb = self._storage_broker
13225-        full_peerlist = sb.get_servers_for_index(self._storage_index)
13226+        full_peerlist = [(s.get_serverid(), s.get_rref())
13227+                         for s in sb.get_servers_for_psi(self._storage_index)]
13228         self.full_peerlist = full_peerlist # for use later, immutable
13229         self.bad_peers = set() # peerids who have errbacked/refused requests
13230 
13231hunk ./src/allmydata/test/test_mutable.py 2538
13232             # for either a block and salt or for hashes, either of which
13233             # will exercise the error handling code.
13234             killer = FirstServerGetsKilled()
13235-            for (serverid, ss) in nm.storage_broker.get_all_servers():
13236-                ss.post_call_notifier = killer.notify
13237+            for s in nm.storage_broker.get_connected_servers():
13238+                s.get_rref().post_call_notifier = killer.notify
13239             ver = servermap.best_recoverable_version()
13240             assert ver
13241             return self._node.download_version(servermap, ver)
13242}
13243
13244Context:
13245
13246[web/filenode.py: avoid calling req.finish() on closed HTTP connections. Closes #1366
13247"Brian Warner <warner@lothar.com>"**20110221061544
13248 Ignore-this: 799d4de19933f2309b3c0c19a63bb888
13249] 
13250[Refactor StorageFarmBroker handling of servers
13251Brian Warner <warner@lothar.com>**20110221015804
13252 Ignore-this: 842144ed92f5717699b8f580eab32a51
13253 
13254 Pass around IServer instance instead of (peerid, rref) tuple. Replace
13255 "descriptor" with "server". Other replacements:
13256 
13257  get_all_servers -> get_connected_servers/get_known_servers
13258  get_servers_for_index -> get_servers_for_psi (now returns IServers)
13259 
13260 This change still needs to be pushed further down: lots of code is now
13261 getting the IServer and then distributing (peerid, rref) internally.
13262 Instead, it ought to distribute the IServer internally and delay
13263 extracting a serverid or rref until the last moment.
13264 
13265 no_network.py was updated to retain parallelism.
13266] 
13267[Add unit tests for cross_check_pkg_resources_versus_import, and a regression test for ref #1355. This requires a little refactoring to make it testable.
13268david-sarah@jacaranda.org**20110221015817
13269 Ignore-this: 51d181698f8c20d3aca58b057e9c475a
13270] 
13271[allmydata/__init__.py: .name was used in place of the correct .__name__ when printing an exception. Also, robustify string formatting by using %r instead of %s in some places. fixes #1355.
13272david-sarah@jacaranda.org**20110221020125
13273 Ignore-this: b0744ed58f161bf188e037bad077fc48
13274] 
13275[TAG allmydata-tahoe-1.8.2
13276warner@lothar.com**20110131020101] 
13277Patch bundle hash:
13278e288e8563ec43b3cc70be7fa050899dda0f6903e