Ticket #993: interface-changes.py

File interface-changes.py, 21.1 KB (added by davidsarah, at 2010-03-12T06:01:13Z)

Proposed refactoring of file download interfaces

# Interface is assumed to be zope.interface.Interface, which the other
# interfaces in this file already subclass.
from zope.interface import Interface


class IReadable(Interface):
    """I represent a readable object -- either an immutable file, or a
    specific version of a mutable file.
    """

    def is_readonly():
        """Return True if this reference is read-only (i.e. if you cannot
        modify the file or directory through it), or False if it provides
        mutable access. Note that even if this reference is read-only,
        someone else may hold a read-write reference to it.

        For an IReadable returned by get_best_readable_version(), this will
        always return True, but for instances of subinterfaces such as
        IMutableFileVersion, it may return False."""

    def is_mutable():
        """Return True if this file or directory is mutable (by *somebody*,
        not necessarily you), False if it is immutable. Note that a file
        might be mutable overall, but your reference to it might be
        read-only. On the other hand, all references to an immutable file
        will be read-only; there are no read-write references to an immutable
        file."""

    def get_storage_index():
        """Return the storage index of the file."""

    def get_size():
        """Return the length (in bytes) of this readable object."""

    def download_to_data():
        """Download all of the file contents. I return a Deferred that fires
        with the contents as a byte string."""

    def read(consumer, offset=0, size=None):
        """Download a portion (possibly all) of the file's contents, making
        them available to the given IConsumer. Return a Deferred that fires
        (with the consumer) when the consumer is unregistered (either because
        the last byte has been given to it, or because the consumer threw an
        exception during write(), possibly because it no longer wants to
        receive data). The portion downloaded will start at 'offset' and
        contain 'size' bytes (or the remainder of the file if size==None).

        The consumer will be used in non-streaming mode: an IPullProducer
        will be attached to it.

        The consumer will not receive data right away: several network trips
        must occur first. The order of events will be::

         consumer.registerProducer(p, streaming)
          (if streaming == False)::
           consumer does p.resumeProducing()
            consumer.write(data)
           consumer does p.resumeProducing()
            consumer.write(data).. (repeat until all data is written)
         consumer.unregisterProducer()
         deferred.callback(consumer)

        If a download error occurs, or an exception is raised by
        consumer.registerProducer() or consumer.write(), I will call
        consumer.unregisterProducer() and then deliver the exception via
        deferred.errback(). To cancel the download, the consumer should call
        p.stopProducing(), which will result in an exception being delivered
        via deferred.errback().

        See src/allmydata/util/consumer.py for an example of a simple
        download-to-memory consumer.
        """

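# Illustrative sketch (not part of the proposed interfaces): the event order
# that IReadable.read() documents for a non-streaming (pull) consumer,
# modelled synchronously with plain Python objects instead of Twisted's
# IConsumer, IPullProducer and Deferred. Every name beginning with
# _Sketch or _sketch is hypothetical and exists only for this example.

class _SketchPullProducer(object):
    """Hand one chunk to the consumer each time resumeProducing() is called."""

    def __init__(self, data, consumer, chunk_size):
        self._data = data
        self._consumer = consumer
        self._chunk_size = chunk_size
        self._pos = 0
        self.finished = False

    def resumeProducing(self):
        chunk = self._data[self._pos:self._pos + self._chunk_size]
        self._pos += len(chunk)
        if chunk:
            self._consumer.write(chunk)
        if self._pos >= len(self._data):
            self.finished = True

    def stopProducing(self):
        self.finished = True


class _SketchMemoryConsumer(object):
    """Collect chunks in memory and record the order of events."""

    def __init__(self):
        self.chunks = []
        self.events = []

    def registerProducer(self, producer, streaming):
        self.events.append("registerProducer")
        # Non-streaming mode: the consumer must ask for every chunk.
        while not producer.finished:
            self.events.append("resumeProducing")
            producer.resumeProducing()

    def write(self, data):
        self.events.append("write")
        self.chunks.append(data)

    def unregisterProducer(self):
        self.events.append("unregisterProducer")


def _sketch_read(contents, consumer, offset=0, size=None, chunk_size=4):
    """Deliver contents[offset:offset+size] to the consumer, then return the
    consumer (standing in for the Deferred that would fire with it)."""
    end = len(contents) if size is None else offset + size
    producer = _SketchPullProducer(contents[offset:end], consumer, chunk_size)
    try:
        consumer.registerProducer(producer, False)
    finally:
        # The producer is unregistered on success and on error alike; an
        # exception then propagates, much as errback() would deliver it.
        consumer.unregisterProducer()
    return consumer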

class IMutableFileVersion(IReadable):
    """I provide access to a particular version of a mutable file. The
    access is read/write if I was obtained from a filenode derived from
    a write cap, or read-only if the filenode was derived from a read cap.
    """

    def get_sequence_number():
        """Return the sequence number of this version."""

    def get_servermap():
        """Return the IMutableFileServerMap instance that was used to create
        this object.
        """

    def get_writekey():
        """Return this filenode's writekey, or None if the node does not have
        write-capability. This may be used to assist with data structures
        that need to make certain data available only to writers, such as the
        read-write child caps in dirnodes. The recommended process is to have
        reader-visible data be submitted to the filenode in the clear (where
        it will be encrypted by the filenode using the readkey), but encrypt
        writer-visible data using this writekey.
        """

    def replace(new_contents):
        """Replace the contents of the mutable file, provided that no other
        node has published (or is attempting to publish, concurrently) a
        newer version of the file than this one.

        I will avoid modifying any share that is different from the version
        given by get_sequence_number(). However, if another node is writing
        to the file at the same time as me, I may manage to update some shares
        while they update others. If I see any evidence of this, I will signal
        UncoordinatedWriteError, and the file will be left in an inconsistent
        state (possibly the version you provided, possibly the old version,
        possibly somebody else's version, and possibly a mix of shares from
        all of these).

        The recommended response to UncoordinatedWriteError is to either
        return it to the caller (since they failed to coordinate their
        writes), or to attempt some sort of recovery. It may be sufficient to
        wait a random interval (with exponential backoff) and repeat your
        operation. If I do not signal UncoordinatedWriteError, then I was
        able to write the new version without incident.

        I return a Deferred that fires (with a PublishStatus object) when the
        update has completed.
        """

    def modify(modifier_cb):
        """Modify the contents of the file, by downloading this version,
        applying the modifier function (or bound method), then uploading
        the new version. This will succeed as long as no other node
        publishes a version between the download and the upload.
        I return a Deferred that fires (with a PublishStatus object) when
        the update is complete.

        The modifier callable will be given three arguments: a string (with
        the old contents), a 'first_time' boolean, and a servermap. As with
        download_to_data(), the old contents will be from this version,
        but the modifier can use the servermap to make other decisions
        (such as refusing to apply the delta if there are multiple parallel
        versions, or if there is evidence of a newer unrecoverable version).
        'first_time' will be True the first time the modifier is called,
        and False on any subsequent calls.

        The callable should return a string with the new contents. The
        callable must be prepared to be called multiple times, and must
        examine the input string to see if the change that it wants to make
        is already present in the old version. If it does not need to make
        any changes, it can either return None, or return its input string.

        If the modifier raises an exception, it will be returned in the
        errback.
        """

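# Illustrative sketch (not part of the proposed interfaces): a modifier
# callable of the shape that modify() describes. It checks whether its change
# is already present, so being called more than once does no harm.
# _sketch_make_add_line_modifier is a hypothetical helper, the contents are
# assumed to be newline-separated text, and the servermap argument is
# accepted but not consulted here.

def _sketch_make_add_line_modifier(line):
    """Return a modifier that appends 'line' unless it is already present."""
    def modifier(old_contents, first_time, servermap):
        if line in old_contents.splitlines():
            # The change is already there (perhaps from an earlier attempt),
            # so signal that no update is needed.
            return None
        if old_contents and not old_contents.endswith("\n"):
            old_contents += "\n"
        return old_contents + line + "\n"
    return modifier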

# The hierarchy looks like this:
#  IFilesystemNode
#   IFileNode
#    IMutableFileNode
#    IImmutableFileNode
#   IDirectoryNode

class IFilesystemNode(Interface):
    def get_cap():
        """Return the strongest 'cap instance' associated with this node.
        (writecap for writeable-mutable files/directories, readcap for
        immutable or readonly-mutable files/directories). To convert this
        into a string, call .to_string() on the result."""

    def get_readcap():
        """Return a readonly cap instance for this node. For immutable or
        readonly nodes, get_cap() and get_readcap() return the same thing."""

    def get_repair_cap():
        """Return an IURI instance that can be used to repair the file, or
        None if this node cannot be repaired (either because it is not
        distributed, like a LIT file, or because the node does not represent
        sufficient authority to create a repair-cap, like a read-only RSA
        mutable file node [which cannot create the correct write-enablers]).
        """

    def get_verify_cap():
        """Return an IVerifierURI instance that represents the
        'verify/refresh capability' for this node. The holder of this
        capability will be able to renew the lease for this node, protecting
        it from garbage-collection. They will also be able to ask a server if
        it holds a share for the file or directory.
        """

    def get_uri():
        """Return the URI string corresponding to the strongest cap associated
        with this node. If this node is read-only, the URI will only offer
        read-only access. If this node is read-write, the URI will offer
        read-write access.

        If you have read-write access to a node and wish to share merely
        read-only access with others, use get_readonly_uri().
        """

    def get_write_uri():
        """Return the URI string that can be used by others to get write
        access to this node, if it is writeable. If this is a read-only node,
        return None."""

    def get_readonly_uri():
        """Return the URI string that can be used by others to get read-only
        access to this node. The result is a read-only URI, regardless of
        whether this node is read-only or read-write.

        If you have merely read-only access to this node, get_readonly_uri()
        will return the same thing as get_uri().
        """

    def get_storage_index():
        """Return a string with the (binary) storage index in use on this
        download. This may be None if there is no storage index (i.e. LIT
        files)."""

    def is_readonly():
        """Return True if this reference is read-only (i.e. if you cannot
        modify the file or directory through it), or False if it provides
        mutable access. Note that even if this reference is read-only,
        someone else may hold a read-write reference to it."""

    def is_mutable():
        """Return True if this file or directory is mutable (by *somebody*,
        not necessarily you), False if it is immutable. Note that a file
        might be mutable overall, but your reference to it might be
        read-only. On the other hand, all references to an immutable file
        will be read-only; there are no read-write references to an immutable
        file.
        """

    def is_unknown():
        """Return True if this is an unknown node."""

    def is_allowed_in_immutable_directory():
        """Return True if this node is allowed as a child of a deep-immutable
        directory. This is true if either the node is of a known-immutable type,
        or it is unknown and read-only.
        """

    def raise_error():
        """Raise any error associated with this node."""

    def get_size():
        """Return the length (in bytes) of the data this node represents. For
        directory nodes, I return the size of the backing store. I return
        synchronously and do not consult the network, so for mutable objects,
        I will return the most recently observed size for the object, or None
        if I don't remember a size. Use get_current_size, which returns a
        Deferred, if you want more up-to-date information."""

    def get_current_size():
        """I return a Deferred that fires with the length (in bytes) of the
        data this node represents.
        """


class IFileNode(IFilesystemNode):
    """I am a node representing a file: a sequence of bytes. I am not a
    container, like IDirectoryNode."""

    def get_best_readable_version():
        """Return a Deferred that fires with an IReadable for the 'best'
        available version of the file. The IReadable provides only read
        access, even if this filenode was derived from a write cap.

        For an immutable file, there is only one version. For a mutable
        file, the 'best' version is the recoverable version with the
        highest sequence number. If no uncoordinated writes have occurred,
        and if enough shares are available, then this will be the most
        recent version that has been uploaded. If no version is recoverable,
        the Deferred will errback with an UnrecoverableFileError.
        """

    def download_best_version():
        """Download the contents of the version that would be returned
        by get_best_readable_version(). This is equivalent to calling
        download_to_data() on the IReadable given by that method.

        I return a Deferred that fires with a byte string when the file
        has been fully downloaded. To support streaming download, use
        the 'read' method of IReadable. If no version is recoverable,
        the Deferred will errback with an UnrecoverableFileError.
        """

    def get_size_of_best_version():
        """Find the size of the version that would be returned by
        get_best_readable_version().

        I return a Deferred that fires with an integer. If no version
        is recoverable, the Deferred will errback with an
        UnrecoverableFileError.
        """

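# Illustrative sketch (not part of the proposed interfaces): the rule used to
# pick the 'best' version, i.e. the recoverable version with the highest
# sequence number. The (seqnum, share_count) tuples, the explicit k argument
# and the helper name are assumptions made for this example; a real servermap
# carries much richer information.

def _sketch_best_recoverable_seqnum(versions, k):
    """versions: iterable of (seqnum, number_of_shares_seen) pairs.

    A version counts as recoverable when at least k shares were seen.
    Return the highest recoverable seqnum, or None if no version is
    recoverable (the case in which the interface methods would errback
    with UnrecoverableFileError).
    """
    recoverable = [seqnum for (seqnum, shares) in versions if shares >= k]
    if not recoverable:
        return None
    return max(recoverable)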

class IImmutableFileNode(IFileNode, IReadable):
    """I am a node representing an immutable file. Immutable files have
    only one version."""


class IMutableFileNode(IFileNode):
    """I provide access to a 'mutable file', which retains its identity
    regardless of what contents are put in it.

    The consistency-vs-availability problem means that there might be
    multiple versions of a file present in the grid, some of which might be
    unrecoverable (i.e. have fewer than 'k' shares). These versions are
    loosely ordered: each has a sequence number and a hash, and any version
    with seqnum=N was uploaded by a node which has seen at least one version
    with seqnum=N-1.

    The 'servermap' (an instance of IMutableFileServerMap) is used to
    describe the versions that are known to be present in the grid, and which
    servers are hosting their shares. It is used to represent the 'state of
    the world', and is used for this purpose by my test-and-set operations.
    Downloading the contents of the mutable file will also return a
    servermap. Uploading a new version into the mutable file requires a
    servermap as input, and the semantics of the replace operation is
    'replace the file with my new version if it looks like nobody else has
    changed the file since my previous download'. Because the file is
    distributed, this is not a perfect test-and-set operation, but it will do
    its best. If the replace process sees evidence of a simultaneous write,
    it will signal an UncoordinatedWriteError, so that the caller can take
    corrective action.


    Most readers will want to use the 'best' current version of the file,
    and should use my 'get_best_mutable_version()' method (or
    'get_best_readable_version()' for read-only access).

    To unconditionally replace the file, callers should use overwrite(). This
    is the mode that user-visible mutable files will probably use.

    To apply some delta to the file, call modify() with a callable modifier
    function that can apply the modification that you want to make. This is
    the mode that dirnodes will use, since most directory modification
    operations can be expressed in terms of deltas to the directory state.


    Three methods are available for users who need to perform more complex
    operations. The first is get_servermap(), which returns an up-to-date
    servermap using a specified mode. The second is download_version(), which
    downloads a specific version (not necessarily the 'best' one). The third
    is upload(), which accepts new contents and a servermap (which must have
    been updated with MODE_WRITE). The upload method will attempt to apply
    the new contents as long as no other node has modified the file since the
    servermap was updated. This might be useful to a caller who wants to
    merge multiple versions into a single new one.

    Note that each time the servermap is updated, a specific 'mode' is used,
    which determines how many peers are queried. To use a servermap for my
    upload() method, that servermap must have been updated in MODE_WRITE.
    These modes are defined in allmydata.mutable.common, and consist of
    MODE_READ, MODE_WRITE, MODE_ANYTHING, and MODE_CHECK. Please look in
    allmydata/mutable/servermap.py for details about the differences.

    Mutable files are currently limited in size (about 3.5MB max) and can
    only be retrieved and updated all-at-once, as a single big string. Future
    versions of our mutable files will remove this restriction.
    """

    def get_best_mutable_version():
        """Return a Deferred that fires with an IMutableFileVersion for
        the 'best' available version of the file. The best version is
        the recoverable version with the highest sequence number. If no
        uncoordinated writes have occurred, and if enough shares are
        available, then this will be the most recent version that has
        been uploaded.

        If no version is recoverable, the Deferred will errback with an
        UnrecoverableFileError.
        """

    def overwrite(new_contents):
        """Unconditionally replace the contents of the mutable file with new
        ones. This simply chains get_servermap(MODE_WRITE) and upload(). This
        is only appropriate to use when the new contents of the file are
        completely unrelated to the old ones, and you do not care about other
        clients' changes.

        I return a Deferred that fires (with a PublishStatus object) when the
        update has completed.
        """

    def modify(modifier_cb):
        """Modify the contents of the file, by downloading the current
        version, applying the modifier function (or bound method), then
        uploading the new version. I return a Deferred that fires (with a
        PublishStatus object) when the update is complete.

        The modifier callable will be given three arguments: a string (with
        the old contents), a 'first_time' boolean, and a servermap. As with
        download_best_version(), the old contents will be from the best
        recoverable version, but the modifier can use the servermap to make
        other decisions (such as refusing to apply the delta if there are
        multiple parallel versions, or if there is evidence of a newer
        unrecoverable version). 'first_time' will be True the first time the
        modifier is called, and False on any subsequent calls.

        The callable should return a string with the new contents. The
        callable must be prepared to be called multiple times, and must
        examine the input string to see if the change that it wants to make
        is already present in the old version. If it does not need to make
        any changes, it can either return None, or return its input string.

        If the modifier raises an exception, it will be returned in the
        errback.
        """

    def get_servermap(mode):
        """Return a Deferred that fires with an IMutableFileServerMap
        instance, updated using the given mode.
        """

    def download_version(servermap, version):
        """Download a specific version of the file, using the servermap
        as a guide to where the shares are located.

        I return a Deferred that fires with the requested contents, or
        errbacks with UnrecoverableFileError. Note that a servermap which was
        updated with MODE_ANYTHING or MODE_READ may not know about shares for
        all versions (those modes stop querying servers as soon as they can
        fulfil their goals), so you may want to use MODE_CHECK (which checks
        everything) to get increased visibility.
        """

    def upload(new_contents, servermap):
        """Replace the contents of the file with new ones. This requires a
        servermap that was previously updated with MODE_WRITE.

        I attempt to provide test-and-set semantics, in that I will avoid
        modifying any share that is different from the version I saw in the
        servermap. However, if another node is writing to the file at the
        same time as me, I may manage to update some shares while they update
        others. If I see any evidence of this, I will signal
        UncoordinatedWriteError, and the file will be left in an inconsistent
        state (possibly the version you provided, possibly the old version,
        possibly somebody else's version, and possibly a mix of shares from
        all of these).

        The recommended response to UncoordinatedWriteError is to either
        return it to the caller (since they failed to coordinate their
        writes), or to attempt some sort of recovery. It may be sufficient to
        wait a random interval (with exponential backoff) and repeat your
        operation. If I do not signal UncoordinatedWriteError, then I was
        able to write the new version without incident.

        I return a Deferred that fires (with a PublishStatus object) when the
        publish has completed. I will update the servermap in-place with the
        location of all new shares.
        """

    def get_writekey():
        """Return this filenode's writekey, or None if the node does not have
        write-capability. This may be used to assist with data structures
        that need to make certain data available only to writers, such as the
        read-write child caps in dirnodes. The recommended process is to have
        reader-visible data be submitted to the filenode in the clear (where
        it will be encrypted by the filenode using the readkey), but encrypt
        writer-visible data using this writekey.
        """
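

# Illustrative sketch (not part of the proposed interfaces): one way to follow
# the advice to "wait a random interval (with exponential backoff) and repeat
# your operation" after an UncoordinatedWriteError. It is synchronous for
# brevity, whereas real callers would chain Deferreds. All names here are
# hypothetical, including the stand-in exception class.

import random
import time


class _SketchUncoordinatedWriteError(Exception):
    """Stand-in for UncoordinatedWriteError."""


def _sketch_retry_with_backoff(operation, max_tries=4, base_delay=1.0,
                               sleep=time.sleep, rng=random.random):
    """Call operation() until it succeeds or max_tries attempts have failed.

    Before retry number n (counting from 1), wait a random interval in
    [0, base_delay * 2**n). The last failure is re-raised to the caller.
    """
    for attempt in range(max_tries):
        try:
            return operation()
        except _SketchUncoordinatedWriteError:
            if attempt == max_tries - 1:
                raise
            sleep(rng() * base_delay * (2 ** (attempt + 1)))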