Opened at 2012-09-08T13:15:38Z
Closed at 2020-10-30T12:35:44Z
#1803 closed defect (wontfix)
S3 backend: AttributeError: 'NoneType' object has no attribute 'startswith'
Reported by: | zooko | Owned by: | davidsarah |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | code-storage | Version: | 1.9.0-s3branch |
Keywords: | s3-backend error txaws | Cc: | |
Launchpad Bug: |
Description (last modified by davidsarah)
This was after it had been doing a "tahoe backup" job for about 90 minutes. There could have been a transient network failure.
File "/home/zooko/playground/tahoe-lafs/cloud-backend/src/allmydata/scripts/tahoe_backup.py", line 305, in upload raise HTTPError("Error during file PUT", resp) HTTPError: Error during file PUT: 500 Internal Server Error "Traceback (most recent call last):\x0a File \"/usr/local/lib/python2.7/dist-packages/foolscap-0.6.3.post0-py2.7.egg/foolscap/call.py\", line 753, in receiveClose\x0a self.request.fail(f)\x0a File \"/usr/local/lib/python2.7/dist-packages/foolscap-0.6.3.post0-py2.7.egg/foolscap/call.py\", line 95, in fail\x0a self.deferred.errback(why)\x0a File \"/home/zooko/playground/tahoe-lafs/cloud-backend/support/lib/python2.7/site-packages/Twisted-11.1.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py\", line 391, in errback\x0a self._startRunCallbacks(fail)\x0a File \"/home/zooko/playground/tahoe-lafs/cloud-backend/support/lib/python2.7/site-packages/Twisted-11.1.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py\", line 458, in _startRunCallbacks\x0a self._runCallbacks()\x0a--- <exception caught here> ---\x0a File \"/home/zooko/playground/tahoe-lafs/cloud-backend/support/lib/python2.7/site-packages/Twisted-11.1.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py\", line 545, in _runCallbacks\x0a current.result = callback(current.result, *args, **kw)\x0a File \"/home/zooko/playground/tahoe-lafs/cloud-backend/src/allmydata/immutable/upload.py\", line 604, in _got_response\x0a return self._loop()\x0a File \"/home/zooko/playground/tahoe-lafs/cloud-backend/src/allmydata/immutable/upload.py\", line 516, in _loop\x0a return self._failed(msg)\x0a File \"/home/zooko/playground/tahoe-lafs/cloud-backend/src/allmydata/immutable/upload.py\", line 617, in _failed\x0a raise UploadUnhappinessError(msg)\x0aallmydata.interfaces.UploadUnhappinessError: server selection failed for <Tahoe2ServerSelector for upload yyoj4>: shares could be placed or found on only 0 server(s). 
We were asked to place shares on at least 1 server(s) such that any 1 of them have enough shares to recover the file. (placed 0 shares out of 1 total (1 homeless), want to place shares on at least 1 servers such that any 1 of them have enough shares to recover the file, sent 1 queries to 1 servers, 0 queries placed some shares, 1 placed none (of which 0 placed none due to the server being full and 1 placed none due to an error)) (last failure (from <ServerTracker for server 66rayr and SI yyoj4>) was: [Failure instance: Traceback (failure with no frames): <class 'foolscap.tokens.RemoteException'>: <RemoteException around '[CopiedFailure instance: Traceback from remote host -- Traceback (most recent call last):\x0a File \"/usr/local/lib/python2.6/dist-packages/Twisted-11.1.0-py2.6-linux-i686.egg/twisted/internet/tcp.py\", line 277, in connectionLost\x0a protocol.connectionLost(reason)\x0a File \"/usr/local/lib/python2.6/dist-packages/Twisted-11.1.0-py2.6-linux-i686.egg/twisted/web/client.py\", line 191, in connectionLost\x0a self.factory._disconnectedDeferred.callback(None)\x0a File \"/usr/local/lib/python2.6/dist-packages/Twisted-11.1.0-py2.6-linux-i686.egg/twisted/internet/defer.py\", line 362, in callback\x0a self._startRunCallbacks(result)\x0a File \"/usr/local/lib/python2.6/dist-packages/Twisted-11.1.0-py2.6-linux-i686.egg/twisted/internet/defer.py\", line 458, in _startRunCallbacks\x0a self._runCallbacks()\x0a--- <exception caught here> ---\x0a File \"/usr/local/lib/python2.6/dist-packages/Twisted-11.1.0-py2.6-linux-i686.egg/twisted/internet/defer.py\", line 545, in _runCallbacks\x0a current.result = callback(current.result, *args, **kw)\x0a File \"/home/customer/LAFS_source/src/allmydata/storage/backends/s3/s3_backend.py\", line 20, in _make_share\x0a if data.startswith(MUTABLE_MAGIC):\x0aexceptions.AttributeError: 'NoneType' object has no attribute 'startswith'\x0a]'>\x0a])\x0a"
The client is actually the cloud backend, not trunk.
allmydata-tahoe: 1.9.0-r5930 foolscap: 0.6.3.post0 pycryptopp: 0.6.0.1206569328141510525648634803928199668821045408958 zfec: 1.4.24 Twisted: 11.1.0 Nevow: 0.10.0 zope.interface: unknown python: 2.7.3 platform: Linux-Ubuntu_12.04-x86_64-64bit_ELF pyOpenSSL: 0.12 simplejson: 2.3.2 pycrypto: 2.5 pyasn1: unknown mock: 0.8.0beta3 txAWS: 0.2.1.post4 Epsilon: 0.6.0 setuptools: 0.6c16dev3
The storage server is LAE's ticket999-S3-backend branch.
Change History (5)
comment:1 Changed at 2012-09-08T13:19:14Z by zooko
- Summary changed from s3_backend: AttributeError: 'NoneType' object has no attribute 'startswith' to cloud backend: AttributeError: 'NoneType' object has no attribute 'startswith'
comment:2 Changed at 2012-09-10T03:34:04Z by davidsarah
- Description modified (diff)
- Keywords s3-backend error added
- Owner set to davidsarah
- Priority changed from normal to major
- Status changed from new to assigned
- Summary changed from cloud backend: AttributeError: 'NoneType' object has no attribute 'startswith' to S3 backend: AttributeError: 'NoneType' object has no attribute 'startswith'
- Version changed from cloud-branch to 1.9.0-s3branch
comment:3 Changed at 2012-11-22T01:25:50Z by davidsarah
- Keywords txaws added
comment:4 Changed at 2013-04-23T06:00:21Z by daira
We could work around this relatively straightforwardly by treating a None return from txaws as a failure that should cause a retry (in the current cloud backend, replace _do_request in cloud_common.py with the code below), but it goes against the grain to do that without understanding what the cause of the bug is, or at least whether it is in txaws or in HTTPClientFactory.
def _do_request(self, description, operation, *args, **kwargs):
    d = defer.maybeDeferred(operation, *args, **kwargs)
    def _maybe_retry(r):
        if r is None:
            r = Failure(CloudError("unexpected value None returned from container operation"))
        if not isinstance(r, Failure):
            return r
        d2 = self._handle_error(r, 1, None, description, operation, *args, **kwargs)
        def _trigger_incident(res):
            log.msg(format="error(s) on cloud container operation: %(description)s %(arguments)s %(kwargs)s %(res)s",
                    arguments=args[:2], kwargs=kwargs, description=description, res=res,
                    level=log.WEIRD)
            return res
        d2.addBoth(_trigger_incident)
        return d2
    d.addBoth(_maybe_retry)
    return d
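As a rough illustration of the workaround described in comment:4, here is a synchronous, Twisted-free sketch of the same idea: a None result is converted into an error so that it goes through the same retry path as any other failure. The names `call_with_retry` and `max_tries`, and this stand-in `CloudError` class, are hypothetical and exist only for this example. The real code works with Deferreds and `self._handle_error`, whose retry policy is not shown in this ticket.

```python
class CloudError(Exception):
    """Stand-in for the backend's error type (assumed for this sketch)."""


def call_with_retry(operation, *args, max_tries=3, **kwargs):
    """Call operation(*args, **kwargs), retrying when it raises CloudError
    or returns None. Re-raises the last error once max_tries is exhausted."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    last_error = None
    for _attempt in range(max_tries):
        try:
            result = operation(*args, **kwargs)
            if result is None:
                # Treat a missing result as a failure instead of passing None on.
                raise CloudError("unexpected value None returned from container operation")
            return result
        except CloudError as e:
            last_error = e  # remember the failure and try again
    raise last_error


# Usage: a hypothetical operation that returns None twice, then real bytes.
attempts = []

def flaky_get_object():
    attempts.append(1)
    return None if len(attempts) < 3 else b"share data"

data = call_with_retry(flaky_get_object, max_tries=3)
```

Routing None through the ordinary error path keeps a single retry policy, but as the comment notes, it would mask the underlying cause rather than explain it.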
comment:5 Changed at 2020-10-30T12:35:44Z by exarkun
- Resolution set to wontfix
- Status changed from assigned to closed
The established line of development on the "cloud backend" branch has been abandoned. This ticket is being closed as part of a batch-ticket cleanup for "cloud backend"-related tickets.
If this is a bug, it is probably genuinely no longer relevant. The "cloud backend" branch is too large and unwieldy to ever be merged into the main line of development (particularly now that the Python 3 porting effort is significantly underway).
If this is a feature, it may be relevant to some future efforts - if they are sufficiently similar to the "cloud backend" effort - but I am still closing it because there are no immediate plans for a new development effort in such a direction.
Tickets related to the "leasedb" are included in this set because the "leasedb" code is in the "cloud backend" branch and fairly well intertwined with the "cloud backend". If there is interest in lease implementation change at some future time then that effort will essentially have to be restarted as well.
From the traceback it seems to be the S3 backend branch (/home/customer/LAFS_source on the storage server), not the cloud backend branch (/home/zooko/playground/tahoe-lafs/cloud-backend on the gateway), that is relevant. In any case the S3Bucket class from which the error is raised appears in both with few changes.
It seems that S3Bucket.get_object returned a Deferred for None (rather than a Deferred for a byte string), which is confusing. I think this can only happen if txaws.s3.client.S3Client.get_object itself returns a Deferred for None.
I looked briefly at the txaws code and it seems as though the return value comes from the client returned by twisted.web.client.HTTPClientFactory, but I can't tell whether that is giving None or whether the data is getting lost somewhere in txaws, or (less likely) in our error handling code.
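The symptom itself is easy to reproduce in isolation: calling `startswith` on None raises the same AttributeError seen in the traceback. The sketch below adds an explicit check so that a None payload produces a more descriptive error at the point of use. `classify_share` and the placeholder value of `MUTABLE_MAGIC` are assumptions for illustration; they are not the actual `_make_share` code or the real magic string.

```python
# Placeholder only: the real MUTABLE_MAGIC value is not shown in this ticket.
MUTABLE_MAGIC = b"EXAMPLE-MUTABLE-MAGIC"


def classify_share(data):
    """Return "mutable" or "immutable", rejecting a None payload explicitly."""
    if data is None:
        # Without this check, data.startswith(...) raises AttributeError on None.
        raise ValueError("expected share bytes from the container, got None")
    if data.startswith(MUTABLE_MAGIC):
        return "mutable"
    return "immutable"
```

A check like this would only improve the error message; it would not explain why the container operation yielded None in the first place.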