#698 closed defect (invalid)
corrupted file displayed to user after failure to download followed by retry
Reported by: | zooko | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | 1.5.0 |
Component: | code-network | Version: | 1.4.1 |
Keywords: | integrity | Cc: | |
Launchpad Bug: |
Description
I clicked the bookmark to load my blog writably, and I got an HTML page saying:
<class 'twisted.internet.defer.FirstError'>: FirstError(<twisted.python.failure.Failure <class 'foolscap.ipb.DeadReferenceError'>>, 2)
<class 'twisted.internet.defer.FirstError'>: FirstError(<twisted.python.failure.Failure <class 'foolscap.ipb.DeadReferenceError'>>, 2)
I looked at the "Recent Uploads/Downloads?" page and saw that my attempt to load it had failed:
09:23:12 07-May-2009 download lxershd2xflho66w6yikhwg3ne No 588.7kB 0.0% Failed
09:23:12 07-May-2009 retrieve 6s64wyhfbm7yxb5cwzqblnndpe No 2.0kB 100.0% Done
09:23:12 07-May-2009 mapupdate MODE_READ 6s64wyhfbm7yxb5cwzqblnndpe No -NA- 100.0% Done
The three details pages are attached: mapupdate-35.html, retrieve-35.html, and down-13.html.
I looked in the logs directory of my Tahoe node, and saw that the twistd.log had these same error messages about DeadReferenceError. twistd.log is attached (bzipped).
I looked in the logs/incidents directory and saw that there was one incident that was recorded at the time of this attempt to load. It is attached as incident-2009-05-07-094319-jg54cni.flog.bz2. The triggering incident line is
13:51:05.704 [6928]: SCARY <CiphertextDownloader #10>(hekksgbfsn6w): download failed! FAILURE: [CopiedFailure instance: Traceback from remote host -- Traceback (most recent call last): Failure: twisted.internet.defer.FirstError: FirstError(<twisted.python.failure.Failure <class 'foolscap.ipb.DeadReferenceError'>>, 0) ]
So far I think that this is Tahoe demonstrating suboptimal handling of a network failure -- it should probably have returned an HTTP 503 "Service Unavailable" (or maybe 504 "Gateway Timeout" or just 500 "Internal Server Error"?) instead of an HTML page containing cryptic error messages. But it gets worse:
Then I hit the "Reload" button on my web browser, and I got the same two error message lines followed by a partial copy of the contents of my blog source code! This result is attached as wiki.html (bzipped). This is what I mean by a corrupted file being displayed to the user.
The "Recent Uploads and Downloads" now says:
09:47:10 07-May-2009 retrieve 6s64wyhfbm7yxb5cwzqblnndpe No 2.0kB 100.0% Done
09:47:10 07-May-2009 mapupdate MODE_READ 6s64wyhfbm7yxb5cwzqblnndpe No -NA- 100.0% Done
09:23:12 07-May-2009 download lxershd2xflho66w6yikhwg3ne No 588.7kB 0.0% Failed
09:23:12 07-May-2009 retrieve 6s64wyhfbm7yxb5cwzqblnndpe No 2.0kB 100.0% Done
09:23:12 07-May-2009 mapupdate MODE_READ 6s64wyhfbm7yxb5cwzqblnndpe No -NA- 100.0% Done
The fact that there is no download following the map-update is surprising to me.
The details from the new mapupdate-36.html and retrieve-36.html are attached. There are no new problems reported in the twistd.log or the logs/incidents/.
I'm going to go ahead and mark this with Priority: critical because I see a corrupted file and I don't understand why. Hopefully it will turn out to be a bug in the web browser, which is an unstable release: firefox-3.5 3.5~b4~hg20090330r24021+nobinonly-0ubuntu1.
Attachments (9)
Change History (19)
Changed at 2009-05-07T16:22:51Z by zooko
Changed at 2009-05-07T16:23:00Z by zooko
Changed at 2009-05-07T16:23:07Z by zooko
Changed at 2009-05-07T16:23:55Z by zooko
Changed at 2009-05-07T16:24:45Z by zooko
Changed at 2009-05-07T16:25:33Z by zooko
Changed at 2009-05-07T16:25:52Z by zooko
Changed at 2009-05-07T16:26:01Z by zooko
comment:1 Changed at 2009-05-07T19:47:52Z by warner
comment:2 Changed at 2009-05-14T21:16:24Z by zooko
By the way, the client doing the fetch was Firefox 3.5 beta 4 -- more details in the original report. It's the package named "firefox-3.5" in Ubuntu Jaunty.
Also, Brian, what do you think about the fact that there was no download step after the
09:47:10 07-May-2009 retrieve 6s64wyhfbm7yxb5cwzqblnndpe No 2.0kB 100.0% Done
09:47:10 07-May-2009 mapupdate MODE_READ 6s64wyhfbm7yxb5cwzqblnndpe No -NA- 100.0% Done
When I reloaded, but there was a download step after those two steps when I initially loaded? This means the client (Firefox-3.5b4 web browser) didn't actually fetch the contents of my blog when I hit "Reload" and instead just spewed that messed-up page (attached as "wiki.html.bz2"), right? In that case the "displaying corrupt page" is actually Firefox's problem (although as David-Sarah Hopwood pointed out on the mailing list, the way that Tahoe commits to a 200 success code before completely downloading the file means this can always happen by accident).
Certainly something that Tahoe did was unusual, see the attached file incident-2009-05-07-094319-jg54cni.flog.bz2, which, when viewed with flogtool web-viewer incident-2009-05-07-094319-jg54cni.flog.bz2 shows this among other things:
# Incident Triggers: * 09:43:19.240 [31906]: SCARY <CiphertextDownloader #14>(lxershd2xflh): download failed! FAILURE: [CopiedFailure instance: Traceback from remote host -- Traceback (most recent call last): Failure: twisted.internet.defer.FirstError: FirstError(<twisted.python.failure.Failure <class 'foolscap.ipb.DeadReferenceError'>>, 2) ] [INCIDENT-TRIGGER]
comment:3 Changed at 2009-05-14T21:45:09Z by zooko
I just got this failure again. The file in question (my blog) has been appended-to since the original report, so now its total size is different from last time. Here is the accompanying foolscap incident report. I did *not* hit re-load in this case, I just opened a new tab and clicked on the bookmark that opens up my blog in writeable mode.
Changed at 2009-05-14T21:46:13Z by zooko
comment:4 Changed at 2009-05-14T21:55:27Z by zooko
The triggering incident in that new incident log (as displayed by flogtool web-viewer) is:
# Incident Triggers: * 14:29:35.644 [816]: WEIRD Tub.connectorFinished: WEIRD, <foolscap.negotiate.TubConnector object at 0x21ace10 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to qvqv7jmm76yfhdjuuzeqyfuwlqinpofd> is not in [<foolscap.negotiate.TubConnector object at 0x20a8510 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to gdfa2zqvj7l2ng26bwalyubitdxtmywf>, <foolscap.negotiate.TubConnector object at 0x20a8810 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to i3u2uz4ewpz3n36ckaqmtmfrubpifapd>, <foolscap.negotiate.TubConnector object at 0x20b3450 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to x4pds7hcgaq2exnumxbsjkv2jrkzyyxf>, <foolscap.negotiate.TubConnector object at 0x20b3750 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to k6h4otc2f2wmmysam5ofciaenhngurwm>, <foolscap.negotiate.TubConnector object at 0x20b3d50 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to gkc4vgeeyktdzwklygxnummy3cin2fri>, <foolscap.negotiate.TubConnector object at 0x20bb090 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to srb3paczogfsh6blsmopelry2ogc4uii>, <foolscap.negotiate.TubConnector object at 0x20bb410 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to jg6qoqzfunqhsvjsrquize3tpcsyglvo>, <foolscap.negotiate.TubConnector object at 0x20bbd90 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to pstqfjqxsfpu7ac7crfzgm26jervfz37>, <foolscap.negotiate.TubConnector object at 0x20c5090 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to mtuj2lnh5sarkpummx77w7jjhynz2jea>, <foolscap.negotiate.TubConnector object at 0x20c5350 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to okjeolejndkuenpuuaiduvk2plebgjao>, <foolscap.negotiate.TubConnector object at 0x20c5710 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to jfccf453ml75gscmpugihlg2bcjzgswr>, <foolscap.negotiate.TubConnector object at 0x20c5c10 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to yzdb43kciqbv3oech66pq54lrtddbbvf>, <foolscap.negotiate.TubConnector object at 0x20d0250 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to i4kd2dpoaj7wnd6kicsklcswayop24wy>, <foolscap.negotiate.TubConnector object at 0x20d0a90 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to 
jobn3i5i7dlu6vt2y6duc3yhvif3d4vr>, <foolscap.negotiate.TubConnector object at 0x20d0f50 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to zzkq2nfnpukh5uajj7aedsl6sc3m5rup>, <foolscap.negotiate.TubConnector object at 0x20d9d50 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to t5zibyamutxhjtazj7fdc5q4nqmifqnj>, <foolscap.negotiate.TubConnector object at 0x21ac150 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to sttpea3ze6sqi5qyqmomeiwfsayytkar>, <foolscap.negotiate.TubConnector object at 0x21b8210 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to 3wrurfqxq4xqzlqatks53b2ekwaw5jt2>, <foolscap.negotiate.TubConnector object at 0x21b8850 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to g5662mtbvxlfcipvcpvicellm2ljdqow>, <foolscap.negotiate.TubConnector object at 0x21bf350 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to 6fyx5u4zr7tvz3szynihc4x3uc6ct5gh>, <foolscap.negotiate.TubConnector object at 0x21bf650 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to xpyajfs4rvm3yps5szojjrxjsgkmshfp>, <foolscap.negotiate.TubConnector object at 0x21bfb50 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to lwkv6cjicbzqjwwwuifik3pogeupsicb>, <foolscap.negotiate.TubConnector object at 0x21ca190 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to xiktf6ok5f5ao5znxxttriv233hmvi4v>, <foolscap.negotiate.TubConnector object at 0x21ca490 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to t5g7egomnnktbpydbuijt6zgtmw4oqi5>, <foolscap.negotiate.TubConnector object at 0x21caa90 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to odcd5enlzmv7iobwad63tchotty3lotx>, <foolscap.negotiate.TubConnector object at 0x21e0710 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to iul3ljbknqud666muthssyfbyrbd4xae>] [INCIDENT-TRIGGER]
comment:5 follow-up: ↓ 9 Changed at 2009-05-14T22:01:42Z by zooko
So I think we need to split this ticket up into (at least):
- A ticket to switch the wui/wapi to chunked encoding so that we can unambiguously signal late errors to the client.
- This ticket, whose focus is probably going to turn to seeing if I can reproduce this problem with any other web browser than the Firefox-3.5b4 that I've seen it on.
- A ticket to investigate the WEIRD Tub.connectorFinished mentioned in 4. Is that right, Brian, that it deserves investigation?
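To illustrate why chunked encoding would let the wui/wapi signal a late error unambiguously, here is a minimal sketch of HTTP/1.1 chunked framing. It is not Tahoe code: `encode_chunk`, `LAST_CHUNK`, and `decode_chunked` are hypothetical names made up for this example, and trailers are ignored. The point is only that a chunked body is complete when the zero-length chunk arrives, so a server that hits a late failure can close the connection without sending it and the client can tell the body was cut short.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: hex length, CRLF, payload, CRLF."""
    return b"%x\r\n" % len(data) + data + b"\r\n"


# A zero-length chunk marks the end of a complete body (no trailers here).
LAST_CHUNK = b"0\r\n\r\n"


def decode_chunked(stream: bytes):
    """Return (payload, complete).

    `complete` is True only if the terminating zero-length chunk was seen.
    This is a simplified decoder for illustration: it ignores trailers.
    """
    body = bytearray()
    pos = 0
    while True:
        eol = stream.find(b"\r\n", pos)
        if eol == -1:
            # Stream ended where a chunk-size line should start.
            return bytes(body), False
        # Drop any chunk extensions after ';' before parsing the hex size.
        size = int(stream[pos:eol].split(b";")[0], 16)
        if size == 0:
            return bytes(body), True
        start = eol + 2
        end = start + size
        body += stream[start:end]
        if end + 2 > len(stream):
            # Chunk payload or its trailing CRLF is missing.
            return bytes(body), False
        pos = end + 2


good = encode_chunk(b"hello") + encode_chunk(b" world") + LAST_CHUNK
late_failure = encode_chunk(b"hello") + encode_chunk(b" world")  # no LAST_CHUNK

print(decode_chunked(good))
print(decode_chunked(late_failure))
```

With a fixed Content-Length the client has to count bytes to notice the same thing, and this ticket suggests not every client does that carefully.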
comment:6 Changed at 2009-05-14T22:33:58Z by zooko
Okay, re topic 2 from 5, I see that the corrupted page that I get is actually the complete contents of my wiki with the first 2302 or 2306 bytes of it overwritten by the Twisted error messages.
That is: my wiki begins with the following 4000 bytes:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xml:lang="en" xmlns="http://www.w3.org/1999/xhtml" lang="en"><head> <script id="versionArea" type="text/javascript"> //<![CDATA[ var version = {title: "TiddlyWiki", major: 2, minor: 5, revision: 0, date: new Date("Mar 9, 2009"), extensions: {}}; //]]> </script> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <meta name="copyright" content="TiddlyWiki created by Jeremy Ruston, (jeremy [at] osmosoft [dot] com) Copyright (c) UnaMesa Association 2004-2009 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. Neither the name of the UnaMesa Association nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE."> <script id="jsheadArea" type="text/javascript"> //<![CDATA[ /* * jQuery JavaScript Library v1.3.2 * http://jquery.com/ * * Copyright (c) 2009 John Resig * Dual licensed under the MIT and GPL licenses. * http://docs.jquery.com/License * * Date: 2009-02-19 17:34:21 -0500 (Thu, 19 Feb 2009) * Revision: 6246 */ (function(){var l=this,g,y=l.jQuery,p=l.$,o=l.jQuery=l.$=function(E,F){return new o.fn.init(E,F)},D=/^[^<]*(<(.|\s)+>)[^>]*$|^#([\w-]+)$/,f=/^.[^:#\[\.,]*$/;o.fn=o.prototype={init:function(E,H){E=E||document;if(E.nodeType){this[0]=E;this.length=1;this.context=E;return this}if(typeof E==="string"){var G=D.exec(E);if(G&&(G[1]||!H)){if(G[1]){E=o.clean([G[1]],H)}else{var I=document.getElementById(G[3]);if(I&&I.id!=G[3]){return o().find(E)}var F=o(I||[]);F.context=document;F.selector=E;return F}}else{return o(H).find(E)}}else{if(o.isFunction(E)){return o(document).ready(E)}}if(E.selector&&E.context){this.selector=E.selector;this.context=E.context}return this.setArray(o.isArray(E)?E:o.makeArray(E))},selector:"",jquery:"1.3.2",size:function(){return this.length},get:function(E){return E===g?Array.prototype.slice.call(this):this[E]},pushStack:function(F,H,E){var G=o(F);G.prevObject=this;G.context=this.context;if(H==="find"){G.selector=this.selector+(this.selector?" 
":"")+E}else{if(H){G.selector=this.selector+"."+H+"("+E+")"}}return G},setArray:function(E){this.length=0;Array.prototype.push.apply(this,E);return this},each:function(F,E){return o.each(this,F,E)},index:function(E){return o.inArray(E&&E.jquery?E[0]:E,this)},attr:function(F,H,G){var E=F;if(typeof F==="string"){if(H===g){return this[0]&&o[G||"attr"](this[0],F)}else{E={};E[F]=H}}return this.each(function(I){for(F in E){o.attr(G?this.style:this,F,o.prop(this,E[F],G,I,F))}})},css:function(E,F){if((E=="width"||E=="height")&&parseFloat(F)<0){F=g}return this.attr(E,F,"curCSS")},text:function(F){if(typeof F!=="object"&&F!=null){return this.empty().append((this[0]&&this[0].ownerDocument||do
When I encountered this failure the first time, the first 4000 bytes of the resulting page that I got looked like this:
<html><head><title>Exception</title></head><body><style type="text/css"> p.error { color: black; font-family: Verdana, Arial, helvetica, sans-serif; font-weight: bold; font-size: large; margin: 0.25em; } div { font-family: Verdana, Arial, helvetica, sans-serif; } strong.variableClass { font-size: small; } div.stackTrace { } div.frame { padding: 0.25em; background: white; border-bottom: thin black dotted; } div.firstFrame { padding: 0.25em; background: white; border-top: thin black dotted; border-bottom: thin black dotted; } div.location { font-size: small; } div.snippet { background: #FFFFDD; padding: 0.25em; } div.snippetHighlightLine { color: red; } span.lineno { font-size: small; } pre.code { margin: 0px; padding: 0px; display: inline; font-size: small; font-family: "Courier New", courier, monotype; } span.function { font-weight: bold; font-family: "Courier New", courier, monotype; } table.variables { border-collapse: collapse; width: 100%; } td.varName { width: 1in; vertical-align: top; font-style: italic; font-size: small; padding-right: 0.25em; } td.varValue { padding-left: 0.25em; padding-right: 0.25em; font-size: small; } div.variables { margin-top: 0.5em; } div.dict { background: #cccc99; padding: 2px; float: left; } td.dictKey { background: #ffff99; font-weight: bold; } td.dictValue { background: #ffff99; } div.list { background: #7777cc; padding: 2px; float: left; } div.listItem { background: #9999ff; } div.instance { width: 100%; background: #efefef; padding: 2px; float: left; } span.instanceName { font-size: small; display: block; } span.instanceRepr { font-family: "Courier New", courier, monotype; } div.function { background: orange; font-weight: bold; float: left; } </style><a href="#tracebackEnd"><p class="error"><class 'twisted.internet.defer.FirstError'>: FirstError(<twisted.python.failure.Failure <class 'foolscap.ipb.DeadReferenceError'>>, 2)</p></a><div class="stackTrace"></div><a name="tracebackEnd"><p class="error"><class 
'twisted.internet.defer.FirstError'>: FirstError(<twisted.python.failure.Failure <class 'foolscap.ipb.DeadReferenceError'>>, 2)</p></a></body></html>, 19 Feb 2009) * Revision: 6246 */ (function(){var l=this,g,y=l.jQuery,p=l.$,o=l.jQuery=l.$=function(E,F){return new o.fn.init(E,F)},D=/^[^<]*(<(.|\s)+>)[^>]*$|^#([\w-]+)$/,f=/^.[^:#\[\.,]*$/;o.fn=o.prototype={init:function(E,H){E=E||document;if(E.nodeType){this[0]=E;this.length=1;this.context=E;return this}if(typeof E==="string"){var G=D.exec(E);if(G&&(G[1]||!H)){if(G[1]){E=o.clean([G[1]],H)}else{var I=document.getElementById(G[3]);if(I&&I.id!=G[3]){return o().find(E)}var F=o(I||[]);F.context=document;F.selector=E;return F}}else{return o(H).find(E)}}else{if(o.isFunction(E)){return o(document).ready(E)}}if(E.selector&&E.context){this.selector=E.selector;this.context=E.context}return this.setArray(o.isArray(E)?E:o.makeArray(E))},selector:"",jquery:"1.3.2",size:function(){return this.length},get:function(E){return E===g?Array.prototype.slice.call(this):this[E]},pushStack:function(F,H,E){var G=o(F);G.prevObject=this;G.context=this.context;if(H==="find"){G.selector=this.selector+(this.selector?" ":"")+E}else{if(H){G.selector=this.selector+"."+H+"("+E+")"}}return G},setArray:function(E){this.length=0;Array.prototype.push.apply(this,E);return this},each:function(F,E){return o.each(this,F,E)},index:function(E){return o.inArray(E&&E.jquery?E[0]:E,this)},attr:function(F,H,G){var E=F;if(typeof F==="string"){if(H===g){return this[0]&&o[G||"attr"](this[0],F)}else{E={};E[F]=H}}return this.each(function(I){for(F in E){o.attr(G?this.style:this,F,o.prop(this,E[F],G,I,F))}})},css:function(E,F){if((E=="width"||E=="height")&&parseFloat(F)<0){F=g}return this.attr(E,F,"curCSS")},text:function(F){if(typeof F!=="object"&&F!=null){return this.empty().append((this[0]&&this[0].ownerDocument
See what I mean? It looks like Firefox 3.5b4 has overwritten the first 2302 or so bytes of the file with the twisted error message. Note that it doesn't exactly line up -- the twisted error message from <html> to </html> inclusive is 2302 bytes, but I would expect the bytes of the original file that follow it (", 19 Feb 2009)") to begin at byte 2306 rather than byte 2302. Maybe there was a newline conversion or the header got subtly rewritten a little between the time I captured the above view of "my wiki" and the time the bad wiki was emitted.
Anyway, this makes me pretty sure that Firefox-3.5b4 is doing something mighty funny here. Certainly I don't think Tahoe is sending those bytes down in an HTTP response. Indeed, from the "Recent Uploads and Downloads" log, we can see that Tahoe wasn't asked for the contents of the wiki at all (the "mapupdate MODE_READ" and the "retrieve" operations are to get the directory, not to get the actual wiki.html file from the directory). Firefox-3.5b4 must have supplied the cached contents of the wiki itself, then overwritten the first few bytes of it with the twisted error message, then displayed it to the user. Ugh.
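As a rough model of the behaviour hypothesized above (not Firefox's actual code, which I have not read), the corrupted page looks like what you would get by laying a short response over the start of a cached copy of the file. The function name `overlay` and the sample byte strings are made up for the illustration.

```python
def overlay(cached: bytes, short_response: bytes) -> bytes:
    """Model the observed corruption: the short response replaces the
    first len(short_response) bytes of the cached body, and the rest of
    the cached body is left in place."""
    return short_response + cached[len(short_response):]


# Tiny stand-ins for the real data; the real error page was about 2302 bytes.
cached_wiki = b"<!DOCTYPE HTML ...> original wiki contents"
error_page = b"<html>Exception</html>"

corrupted = overlay(cached_wiki, error_page)
print(corrupted)
```

Under this model the total length is unchanged and the original content resumes at exactly offset len(error_page), which is why the 2302 versus 2306 discrepancy noted above is a little surprising.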
I suppose we should report this on the firefox issue tracker...
comment:7 Changed at 2009-06-01T17:25:34Z by zooko
- Resolution set to invalid
- Status changed from new to closed
Okay I reported this to the Ubuntu maintainers of firefox via launchpad:
https://bugs.launchpad.net/ubuntu/+source/firefox/+bug/382444 # short read from server results in cached page with the first few bytes overwritten by the short read result
Closing this ticket as 'invalid'.
comment:8 Changed at 2009-06-03T21:46:37Z by zooko
I've opened a ticket for Firefox-3.5b4: https://bugzilla.mozilla.org/show_bug.cgi?id=496205
comment:9 in reply to: ↑ 5 Changed at 2009-10-29T05:47:51Z by davidsarah
Replying to zooko:
So I think we need to split this ticket up into (at least):
- A ticket to switch the wui/wapi to chunked encoding so that we can unambiguously signal late errors to the client.
This is now #822.
- This ticket, whose focus is probably going to turn to seeing if I can reproduce this problem with any other web browser than the Firefox-3.5b4 that I've seen it on.
- A ticket to investigate the WEIRD Tub.connectorFinished mentioned in 4. Is that right, Brian, that it deserves investigation?
Does this still deserve its own ticket?
comment:10 Changed at 2009-10-29T17:18:11Z by zooko
I believe Brian downgraded that event to no longer be considered "weird" so it would stop triggering incident reports. WEIRD Tub.connectorFinished: WEIRD, <foolscap.negotiate.TubConnector object at 0x21ace10 from vraj3cgb5eqs5xrb44qgqvwo4obbdhfd to qvqv7jmm76yfhdjuuzeqyfuwlqinpofd> is not in [....]
This unexplained behavior may be related to #653, as described in ticket:653#comment:20. So, I guess it isn't clear to me whether we should ticket that issue or instead engage in some strategy which makes our use of foolscap immune to such issues. I suppose if we were going to do that then we should ticket that.
That's pretty weird. The first thing that comes to mind is that a server connection could have been lost in the middle of the download (in this case, after we've retrieved the UEB and some of the hashes, but before we've retrieved the first data block). The web server has to commit to success (200) or failure (404 or 500 or something) before it starts sending any of the plaintext, but it doesn't want to store the entire file either. So it bases the HTTP response code upon the initial availability of k servers, and hopes they'll stick around for the whole download.
When we get a "late failure" (i.e. one of the servers disconnects in the middle), the webapi doesn't have a lot of choices. At the moment, it emits a brief error message (attached to whatever partial content has already been written out), then drops the HTTP connection, and hopes that the client is observant enough to notice that the number of received bytes does not match the previously-sent Content-Length header, and then announce an error on the client side.
If the application doing the fetch (perhaps the browser, perhaps TiddlyWiki itself?) doesn't strictly check the Content-Length header, then it could get partial content without an error message.
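For concreteness, the strict check that the webapi is hoping the client performs amounts to something like the sketch below. The names `TruncatedBody` and `verify_length` are hypothetical and exist only for this example; real HTTP clients do this internally (Python's http.client, for instance, raises IncompleteRead on a short body).

```python
class TruncatedBody(Exception):
    """Raised when fewer (or more) bytes arrived than Content-Length promised."""


def verify_length(content_length_header: str, body: bytes) -> bytes:
    """Return body unchanged if its size matches the Content-Length header,
    otherwise raise TruncatedBody instead of passing partial data along."""
    expected = int(content_length_header)
    if len(body) != expected:
        raise TruncatedBody(
            "expected %d bytes, received %d" % (expected, len(body))
        )
    return body


# A late failure: the header promised 20 bytes, the connection dropped after 9.
try:
    verify_length("20", b"partial..")
except TruncatedBody as e:
    print("download failed:", e)
```

A client that skips this check would hand the partial content (with the error message appended) to the user as if it were the whole file.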
There are two directions to fix this:
I'm not sure what's up with the happens-again-after-retry part of this. For the benefit of partial-range fetches, we sometimes cache the file's contents locally, and I don't know how that would interact with lost-server errors. It's at least conceivable that the caching mechanism doesn't realize that an error occurred, and tries to pass partial data to the second download attempt. But most browsers don't send a Range: header at all (it's mostly streaming media players which do that), and I believe that the webapi will skip this whole caching thing unless it sees a Range: header.