Ticket #393: fencepost-test.dpatch

File fencepost-test.dpatch, 12.8 KB (added by warner, at 2011-02-28T02:15:13Z)

exercise fencepost-update bug
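
As a rough illustration of what the attached test probes (a sketch, not part of the patch): `test_replace_locations` writes two bytes at each offset from three bytes before a segment boundary up to the boundary itself, for the first two boundaries. The helper names below are hypothetical, the 128 KiB segment size is copied from the `SEGSIZE` constant in the test rather than read from a real file, and the code is written for Python 3 although the patch itself targets Python 2.

```python
import re

SEGSIZE = 128 * 1024  # assumption: same constant the test hard-codes


def suspect_offsets(segsize=SEGSIZE):
    # Offsets used by test_replace_locations: from three bytes before each
    # of the first two segment boundaries up to the boundary itself.
    return (list(range(segsize - 3, segsize + 1)) +
            list(range(2 * segsize - 3, 2 * segsize + 1)))


def segments_touched(offset, length, segsize=SEGSIZE):
    # Indices of the segments overlapped by bytes [offset, offset+length).
    first = offset // segsize
    last = (offset + length - 1) // segsize
    return list(range(first, last + 1))


def apply_writes(data, offsets):
    # Overwrite two bytes at each offset with "AA", then "BB", and so on,
    # in the same order as the test builds its expected string.
    letters = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    for offset in offsets:
        new_data = next(letters) * 2
        data = data[:offset] + new_data + data[offset + 2:]
    return data


if __name__ == "__main__":
    data = "testdata " * 100000
    offsets = suspect_offsets()
    for offset in offsets:
        print(offset, segments_touched(offset, 2))
    expected = apply_writes(data, offsets)
    # Uppercase runs mark the modified regions, as in _check_differences.
    print([(m.start(), m.end(), m.group())
           for m in re.finditer("[A-Z]+", expected)])
```

Under these assumptions, the writes at `SEGSIZE-1` and `2*SEGSIZE-1` are the ones that straddle a boundary, while the writes at `SEGSIZE` and `2*SEGSIZE` start exactly on one.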

1 patch for repository /Users/warner2/stuff/tahoe/t4:

Sun Feb 27 18:10:56 PST 2011  warner@lothar.com
  * test_mutable.py: add test to exercise fencepost bug

New patches:

[test_mutable.py: add test to exercise fencepost bug
warner@lothar.com**20110228021056
 Ignore-this: d2f9cf237ce6db42fb250c8ad71a4fc3
] {
hunk ./src/allmydata/test/test_mutable.py 2
 
-import os
+import os, re
 from cStringIO import StringIO
 from twisted.trial import unittest
 from twisted.internet import defer, reactor
hunk ./src/allmydata/test/test_mutable.py 2931
         self.set_up_grid()
         self.c = self.g.clients[0]
         self.nm = self.c.nodemaker
-        self.data = "test data" * 100000 # about 900 KiB; MDMF
+        self.data = "testdata " * 100000 # about 900 KiB; MDMF
         self.small_data = "test data" * 10 # about 90 B; SDMF
         return self.do_upload()
 
hunk ./src/allmydata/test/test_mutable.py 2981
             self.failUnlessEqual(results, new_data))
         return d
 
+    def test_replace_segstart1(self):
+        offset = 128*1024+1
+        new_data = "NNNN"
+        expected = self.data[:offset]+new_data+self.data[offset+4:]
+        d = self.mdmf_node.get_best_mutable_version()
+        d.addCallback(lambda mv:
+            mv.update(MutableData(new_data), offset))
+        d.addCallback(lambda ignored:
+            self.mdmf_node.download_best_version())
+        def _check(results):
+            if results != expected:
+                print
+                print "got: %s ... %s" % (results[:20], results[-20:])
+                print "exp: %s ... %s" % (expected[:20], expected[-20:])
+                self.fail("results != expected")
+        d.addCallback(_check)
+        return d
+
+    def _check_differences(self, got, expected):
+        # displaying arbitrary file corruption is tricky for a
+        # 1MB file of repeating data, so look for likely places
+        # with problems and display them separately
+        gotmods = [mo.span() for mo in re.finditer('([A-Z]+)', got)]
+        expmods = [mo.span() for mo in re.finditer('([A-Z]+)', expected)]
+        gotspans = ["%d:%d=%s" % (start,end,got[start:end])
+                    for (start,end) in gotmods]
+        expspans = ["%d:%d=%s" % (start,end,expected[start:end])
+                    for (start,end) in expmods]
+        #print "expecting: %s" % expspans
+
+        SEGSIZE = 128*1024
+        if got != expected:
+            print "differences:"
+            for segnum in range(len(expected)//SEGSIZE):
+                start = segnum * SEGSIZE
+                end = (segnum+1) * SEGSIZE
+                got_ends = "%s .. %s" % (got[start:start+20], got[end-20:end])
+                exp_ends = "%s .. %s" % (expected[start:start+20], expected[end-20:end])
+                if got_ends != exp_ends:
+                    print "expected[%d]: %s" % (start, exp_ends)
+                    print "got     [%d]: %s" % (start, got_ends)
+            if expspans != gotspans:
+                print "expected: %s" % expspans
+                print "got     : %s" % gotspans
+            open("EXPECTED","wb").write(expected)
+            open("GOT","wb").write(got)
+            print "wrote data to EXPECTED and GOT"
+            self.fail("didn't get expected data")
+
+
+    def test_replace_locations(self):
+        # exercise fencepost conditions
+        expected = self.data
+        SEGSIZE = 128*1024
+        suspects = range(SEGSIZE-3, SEGSIZE+1)+range(2*SEGSIZE-3, 2*SEGSIZE+1)
+        letters = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
+        d = defer.succeed(None)
+        for offset in suspects:
+            new_data = letters.next()*2 # "AA", then "BB", etc
+            expected = expected[:offset]+new_data+expected[offset+2:]
+            d.addCallback(lambda ign:
+                          self.mdmf_node.get_best_mutable_version())
+            def _modify(mv, offset=offset, new_data=new_data):
+                # close over 'offset','new_data'
+                md = MutableData(new_data)
+                return mv.update(md, offset)
+            d.addCallback(_modify)
+            d.addCallback(lambda ignored:
+                          self.mdmf_node.download_best_version())
+            d.addCallback(self._check_differences, expected)
+        return d
+
 
     def test_replace_and_extend(self):
         # We should be able to replace data in the middle of a mutable
}

Context:

[web/filenode.py: avoid calling req.finish() on closed HTTP connections. Closes #1366
"Brian Warner <warner@lothar.com>"**20110221061544
 Ignore-this: 799d4de19933f2309b3c0c19a63bb888
] 
[update MDMF code with StorageFarmBroker changes
"Brian Warner <warner@lothar.com>"**20110221061004
 Ignore-this: a693b201d31125b391cebe0412ddd027
] 
[resolve more conflicts with current trunk
"Brian Warner <warner@lothar.com>"**20110221055600
 Ignore-this: 77ad038a478dbf5d9b34f7a68159a3e0
] 
[Refactor StorageFarmBroker handling of servers
Brian Warner <warner@lothar.com>**20110221015804
 Ignore-this: 842144ed92f5717699b8f580eab32a51
 
 Pass around IServer instance instead of (peerid, rref) tuple. Replace
 "descriptor" with "server". Other replacements:
 
  get_all_servers -> get_connected_servers/get_known_servers
  get_servers_for_index -> get_servers_for_psi (now returns IServers)
 
 This change still needs to be pushed further down: lots of code is now
 getting the IServer and then distributing (peerid, rref) internally.
 Instead, it ought to distribute the IServer internally and delay
 extracting a serverid or rref until the last moment.
 
 no_network.py was updated to retain parallelism.
] 
[Add unit tests for cross_check_pkg_resources_versus_import, and a regression test for ref #1355. This requires a little refactoring to make it testable.
david-sarah@jacaranda.org**20110221015817
 Ignore-this: 51d181698f8c20d3aca58b057e9c475a
] 
[allmydata/__init__.py: .name was used in place of the correct .__name__ when printing an exception. Also, robustify string formatting by using %r instead of %s in some places. fixes #1355.
david-sarah@jacaranda.org**20110221020125
 Ignore-this: b0744ed58f161bf188e037bad077fc48
] 
[mutable/filenode.py: fix create_mutable_file('string')
"Brian Warner <warner@lothar.com>"**20110221014659
 Ignore-this: dc6bdad761089f0199681eeb784f1001
] 
[resolve conflicts between 393-MDMF patches and trunk as of 1.8.2
"Brian Warner <warner@lothar.com>"**20110220230201
 Ignore-this: 9bbf5d26c994e8069202331dcb4cdd95
] 
[tests:
Kevan Carstensen <kevan@isnotajoke.com>**20100819003531
 Ignore-this: 314e8bbcce532ea4d5d2cecc9f31cca0
 
     - A lot of existing tests relied on aspects of the mutable file
       implementation that were changed. This patch updates those tests
       to work with the changes.
     - This patch also adds tests for new features.
] 
[mutable/servermap.py: Alter the servermap updater to work with MDMF files
Kevan Carstensen <kevan@isnotajoke.com>**20100819003439
 Ignore-this: 7e408303194834bd59a2f27efab3bdb
 
 These modifications were basically all to the end of having the
 servermap updater use the unified MDMF + SDMF read interface whenever
 possible -- this reduces the complexity of the code, making it easier to
 read and maintain. To do this, I needed to modify the process of
 updating the servermap a little bit.
 
 To support partial-file updates, I also modified the servermap updater
 to fetch the block hash trees and certain segments of files while it
 performed a servermap update (this can be done without adding any new
 roundtrips because of batch-read functionality that the read proxy has).
 
] 
[mutable/retrieve.py: Modify the retrieval process to support MDMF
Kevan Carstensen <kevan@isnotajoke.com>**20100819003409
 Ignore-this: c03f4e41aaa0366a9bf44847f2caf9db
 
 The logic behind a mutable file download had to be adapted to work with
 segmented mutable files; this patch performs those adaptations. It also
 exposes some decoding and decrypting functionality to make partial-file
 updates a little easier, and supports efficient random-access downloads
 of parts of an MDMF file.
] 
[mutable/layout.py and interfaces.py: add MDMF writer and reader
Kevan Carstensen <kevan@isnotajoke.com>**20100819003304
 Ignore-this: 44400fec923987b62830da2ed5075fb4
 
 The MDMF writer is responsible for keeping state as plaintext is
 gradually processed into share data by the upload process. When the
 upload finishes, it will write all of its share data to a remote server,
 reporting its status back to the publisher.
 
 The MDMF reader is responsible for abstracting an MDMF file as it sits
 on the grid from the downloader; specifically, by receiving and
 responding to requests for arbitrary data within the MDMF file.
 
 The interfaces.py file has also been modified to contain an interface
 for the writer.
] 
[docs: update docs to mention MDMF
Kevan Carstensen <kevan@isnotajoke.com>**20100814225644
 Ignore-this: 1c3caa3cd44831007dcfbef297814308
] 
[nodemaker.py: Make nodemaker expose a way to create MDMF files
Kevan Carstensen <kevan@isnotajoke.com>**20100819003509
 Ignore-this: a6701746d6b992fc07bc0556a2b4a61d
] 
[mutable/publish.py: Modify the publish process to support MDMF
Kevan Carstensen <kevan@isnotajoke.com>**20100819003342
 Ignore-this: 2bb379974927e2e20cff75bae8302d1d
 
 The inner workings of the publishing process needed to be reworked to a
 large extent to cope with segmented mutable files, and to cope with
 partial-file updates of mutable files. This patch does that. It also
 introduces wrappers for uploadable data, allowing the use of
 filehandle-like objects as data sources, in addition to strings. This
 reduces memory inefficiency when dealing with large files through the
 webapi, and clarifies update code there.
] 
[mutable/filenode.py: add versions and partial-file updates to the mutable file node
Kevan Carstensen <kevan@isnotajoke.com>**20100819003231
 Ignore-this: b7b5434201fdb9b48f902d7ab25ef45c
 
 One of the goals of MDMF as a GSoC project is to lay the groundwork for
 LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
 multiple versions of a single cap on the grid. In line with this, there
 is now a distinction between an overriding mutable file (which can be
 thought to correspond to the cap/unique identifier for that mutable
 file) and versions of the mutable file (which we can download, update,
 and so on). All download, upload, and modification operations end up
 happening on a particular version of a mutable file, but there are
 shortcut methods on the object representing the overriding mutable file
 that perform these operations on the best version of the mutable file
 (which is what code should be doing until we have LDMF and better
 support for other paradigms).
 
 Another goal of MDMF was to take advantage of segmentation to give
 callers more efficient partial file updates or appends. This patch
 implements methods that do that, too.
 
] 
[mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
Kevan Carstensen <kevan@isnotajoke.com>**20100819003216
 Ignore-this: d3bd3260742be8964877f0a53543b01b
 
 The checker and repairer required minimal changes to work with the MDMF
 modifications made elsewhere. The checker duplicated a lot of the code
 that was already in the downloader, so I modified the downloader
 slightly to expose this functionality to the checker and removed the
 duplicated code. The repairer only required a minor change to deal with
 data representation.
] 
[client.py: learn how to create different kinds of mutable files
Kevan Carstensen <kevan@isnotajoke.com>**20100814225711
 Ignore-this: 61ff665bc050cba5f58bf2ed779d692b
] 
[web: Alter the webapi to get along with and take advantage of the MDMF changes
Kevan Carstensen <kevan@isnotajoke.com>**20100814081012
 Ignore-this: 96c2ed4e4a9f450fb84db5d711d10bd6
 
 The main benefit that the webapi gets from MDMF, at least initially, is
 the ability to do a streaming download of an MDMF mutable file. It also
 exposes a way (through the PUT verb) to append to or otherwise modify
 (in-place) an MDMF mutable file.
] 
[scripts: tell 'tahoe put' about MDMF
Kevan Carstensen <kevan@isnotajoke.com>**20100813234957
 Ignore-this: c106b3384fc676bd3c0fb466d2a52b1b
] 
[immutable/literal.py: implement the same interfaces as other filenodes
Kevan Carstensen <kevan@isnotajoke.com>**20100810000633
 Ignore-this: b50dd5df2d34ecd6477b8499a27aef13
] 
[immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
Kevan Carstensen <kevan@isnotajoke.com>**20100810000619
 Ignore-this: 93e536c0f8efb705310f13ff64621527
] 
[frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
Kevan Carstensen <kevan@isnotajoke.com>**20100809233535
 Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f
] 
[interfaces.py: Add #993 interfaces
Kevan Carstensen <kevan@isnotajoke.com>**20100809233244
 Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce
] 
[TAG allmydata-tahoe-1.8.2
warner@lothar.com**20110131020101] 
Patch bundle hash:
85ba2dfc67d9e255e8b82316f147ac92a2b7896e
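
The `_modify` helper in `test_replace_locations` takes `offset` and `new_data` as default arguments so that each callback keeps the values from its own loop iteration, rather than whatever the loop variables hold when the Deferred chain finally runs. A minimal sketch of that Python idiom, with hypothetical function names and no Twisted dependency:

```python
def late_bound(offsets):
    # Each lambda looks up 'offset' only when it is called, so after the
    # loop finishes every callback sees the final value.
    callbacks = []
    for offset in offsets:
        callbacks.append(lambda: offset)
    return callbacks


def early_bound(offsets):
    # A default argument is evaluated when the function is defined, so each
    # callback captures the value from its own iteration.
    callbacks = []
    for offset in offsets:
        def _cb(offset=offset):
            return offset
        callbacks.append(_cb)
    return callbacks


if __name__ == "__main__":
    print([cb() for cb in late_bound([1, 2, 3])])   # [3, 3, 3]
    print([cb() for cb in early_bound([1, 2, 3])])  # [1, 2, 3]
```

Without the default-argument binding, every queued update in the test would likely use the last offset and the last letter pair, and the boundary cases would not each be exercised.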