Opened at 2022-01-04T16:32:27Z
Closed at 2022-01-07T19:37:42Z
#3854 closed defect (fixed)
builtins.TypeError: write() argument must be str, not bytes from allmydata/webish.py usage of FileUploadFieldStorage
Reported by: | exarkun | Owned by: | itamarst |
---|---|---|---|
Priority: | normal | Milestone: | undecided |
Component: | unknown | Version: | n/a |
Keywords: | python3 | Cc: | |
Launchpad Bug: |
Description
On Python 3.9 when issuing some request to the Tahoe-LAFS web API, this traceback comes up:
2022-01-04T10:36:04-0500 [_GenericHTTPChannelProtocol,1,127.0.0.1] Unhandled Error
    Traceback (most recent call last):
      File "python3.9/site-packages/twisted/python/log.py", line 103, in callWithLogger
        return callWithContext({"system": lp}, func, *args, **kw)
      File "python3.9/site-packages/twisted/python/log.py", line 86, in callWithContext
        return context.call({ILogContext: newCtx}, func, *args, **kw)
      File "python3.9/site-packages/twisted/python/context.py", line 122, in callWithContext
        return self.currentContext().callWithContext(ctx, func, *args, **kw)
      File "python3.9/site-packages/twisted/python/context.py", line 85, in callWithContext
        return func(*args,**kw)
    --- <exception caught here> ---
      File "python3.9/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
        why = selectable.doRead()
      File "python3.9/site-packages/twisted/internet/tcp.py", line 243, in doRead
        return self._dataReceived(data)
      File "python3.9/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
        rval = self.protocol.dataReceived(data)
      File "python3.9/site-packages/twisted/web/http.py", line 3024, in dataReceived
        return self._channel.dataReceived(data)
      File "python3.9/site-packages/twisted/web/http.py", line 2305, in dataReceived
        return basic.LineReceiver.dataReceived(self, data)
      File "python3.9/site-packages/twisted/protocols/basic.py", line 579, in dataReceived
        why = self.rawDataReceived(data)
      File "python3.9/site-packages/twisted/web/http.py", line 2312, in rawDataReceived
        self._transferDecoder.dataReceived(data)
      File "python3.9/site-packages/twisted/web/http.py", line 1755, in dataReceived
        finishCallback(data[contentLength:])
      File "python3.9/site-packages/twisted/web/http.py", line 2171, in _finishRequestBody
        self.allContentReceived()
      File "python3.9/site-packages/twisted/web/http.py", line 2284, in allContentReceived
        req.requestReceived(command, path, version)
      File "python3.9/site-packages/allmydata/webish.py", line 134, in requestReceived
        self.fields = FileUploadFieldStorage(
      File "python3.9/cgi.py", line 482, in __init__
        self.read_single()
      File "python3.9/cgi.py", line 675, in read_single
        self.read_binary()
      File "python3.9/cgi.py", line 697, in read_binary
        self.file.write(data)
    builtins.TypeError: write() argument must be str, not bytes
I'm not exactly sure yet what request triggers this.
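The final frame of the traceback can be illustrated in isolation. The sketch below assumes that the file object being written to was opened in text mode, which is what the error message suggests; it does not use cgi or Tahoe-LAFS at all, and `try_write` is a hypothetical helper, not code from the project.

```python
import tempfile


def try_write(mode, data):
    """Write data to a temporary file opened with the given mode.

    Returns the TypeError message if the write is rejected, else None.
    """
    with tempfile.TemporaryFile(mode) as f:
        try:
            f.write(data)
        except TypeError as e:
            return str(e)
    return None


# Text-mode file: bytes are rejected, as in the traceback.
print(try_write("w+", b"{}"))
# Binary-mode file: the same bytes are accepted.
print(try_write("wb+", b"{}"))
```

The first call reports the same message as the ticket title, and the second succeeds, so the question is why the request body ended up going to a text-mode file.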
Change History (9)
comment:1 Changed at 2022-01-04T16:55:00Z by itamarst
- Owner set to itamarst
comment:2 Changed at 2022-01-04T16:59:30Z by itamarst
comment:3 Changed at 2022-01-04T17:04:01Z by itamarst
Ah, so:
In order to work around problems in Python 3's decision about whether something is bytes or unicode, we implement a heuristic that says "if the 'name' MIME field of the upload was 'file', assume it's bytes."
And my guess is that heuristic isn't good enough for all clients, so we need some more aggressive heuristic. I guess we could just say "it's always bytes"? But that might break some other bits, like the web UI.
comment:4 Changed at 2022-01-04T17:06:46Z by itamarst
One problem with the current heuristic is that it breaks Python 3's "'filename' field was set" heuristic. So that's one change to make.
And then clients could be required to set that going forward.
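A rough sketch of how the two heuristics discussed above could be combined, so that the 'name' check no longer overrides the 'filename' check. The function and parameter names are hypothetical and this is not the code in allmydata/webish.py; it only models the decision described in these comments.

```python
def wants_binary_file(field_name, filename):
    """Decide whether an uploaded field should be stored as bytes.

    field_name: the 'name' parameter of the field (may be None).
    filename:   the 'filename' parameter of the field (None if not set).
    """
    # Existing heuristic from comment:3: a field named "file" is bytes.
    if field_name == "file":
        return True
    # Heuristic from comment:4: a field with a filename set is bytes.
    return filename is not None


print(wants_binary_file("file", None))
print(wants_binary_file("children", "data.bin"))
print(wants_binary_file("children", None))
```

Under this sketch a client that sets 'filename' gets bytes handling regardless of the field name, which is the behaviour clients could be required to rely on going forward.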
comment:5 Changed at 2022-01-04T17:07:15Z by itamarst
Anyway I would look for uploads in the client code that is triggering this to get a reproducer.
comment:6 Changed at 2022-01-06T13:53:05Z by exarkun
The request that triggers the traceback is a POST to /storage-plugins/privatestorageio-zkapauthz-v1/calculate-price (so, not a first-party resource).
The headers are:
{ 'content-length': '2433' , 'authorization': 'tahoe-lafs O_0Cs...' , 'content-type': 'application/json' , 'accept-encoding': 'gzip' , 'host': '127.0.0.1:39053' }
There are some first-party resources that accept JSON so maybe it is possible to reproduce this without involving third-party plugins. Although I don't know what it means that there are not already any failing unit tests for this code path (apart from the obvious guess of incomplete test coverage).
Looking at docs/frontends/webapi.rst I see POST /uri?t=mkdir-with-children which should be pretty similar (there are also a lot of variations on that action, "create a directory", that also take JSON bodies).
The docs *don't* say that a content-type: application/json header is required in these cases. I don't know if that's a relevant distinction or not.
comment:7 Changed at 2022-01-06T13:54:07Z by exarkun
- Keywords python3 added
comment:8 Changed at 2022-01-06T17:35:35Z by itamarst
So now my guess is that the bug is "the code does multipart/form-data parsing even when that MIME type isn't set." Which in Python 2 would be a harmless waste of effort, and I guess blows up in Python 3.
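If that guess is right, one possible shape for a guard is to look at the request's content-type before attempting any form parsing. This is a minimal sketch under that assumption, not the fix that was actually applied; `should_parse_form` is a hypothetical name, and treating a missing header as "not a form" is also an assumption.

```python
def should_parse_form(content_type):
    """Return True only when the content-type header says multipart/form-data.

    content_type: the raw header value as a str, or None if the header is absent.
    """
    if content_type is None:
        # Assumption: with no content-type header, do not attempt form parsing.
        return False
    # Drop parameters such as "; boundary=..." and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "multipart/form-data"


# The request from comment:6 would skip form parsing entirely.
print(should_parse_form("application/json"))
# A browser-style upload would still be parsed.
print(should_parse_form("multipart/form-data; boundary=abc123"))
```

With a check like this, a JSON body such as the calculate-price request would never reach the field-storage code path shown in the traceback.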
comment:9 Changed at 2022-01-07T19:37:42Z by itamarst
- Resolution set to fixed
- Status changed from new to closed
I wonder if this has to do with the hack in FileUploadFieldStorage.