Opened at 2016-12-17T22:18:32Z
Last modified at 2021-03-30T18:40:19Z
#2850 new defect
ugly (temporary) error message when connecting to new onion server
Reported by: | warner | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | soon |
Component: | code-frontend-web | Version: | 1.11.0 |
Keywords: | Cc: | ||
Launchpad Bug: |
Description
While running some smoketests on 1.12, I spun up a new onion-based server (tahoe create-node --listen=tor) on an existing grid, while watching a client on that grid.
I think the client hears about the new service before it's really ready to listen, because for the first 10 or 15 seconds, the client's Welcome page displays an ugly traceback squeezed into the "Connection" column (screenshots attached). The full traceback (obtained from the flogtool log) was:
21:56:02.363 L20 []#132 TubConnector created from p4pmhkdfwp2hdrm33xgvyoiijzkkj6cm to likvz52mql57cxczpf4nnthfjsm4tngj 21:56:02.363 L20 []#133 considering hint: tor:7uq4kbwg5cbiwlhv.onion:3457 21:56:02.363 L20 []#134 connecting to hint: tor:7uq4kbwg5cbiwlhv.onion:3457 21:56:03.671 L28 []#135 FAILURE: [CopiedFailure instance: Traceback from remote host -- Traceback (most recent call last): File "/home/warner/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 498, in errback self._startRunCallbacks(fail) File "/home/warner/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 565, in _startRunCallbacks self._runCallbacks() File "/home/warner/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 651, in _runCallbacks current.result = callback(current.result, *args, **kw) File "/home/warner/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1355, in gotResult _inlineCallbacks(r, g, deferred) --- <exception caught here> --- File "/home/warner/v -- TRACEBACK ELIDED -- receive(data) File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/tube.py", line 41, in receive status = self._interp.receive(data) File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 48, in receive for x in self.next: File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 177, in apply for x in self._apply(f, ruleName, argvals): File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 110, in _apply for x in rule(): File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 256, in parse_Or for x in self._eval(subexpr): File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 241, in parse_And for x in self._eval(subexpr): File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 440, in parse_Action val = eval(expr.data, self.globals, self._localsStack[-1]) File "<string>", line 1, in <module> File "/home/warner/venv/local/lib/python2.7/site-packages/txsocksx/client.py", line 153, in serverResponse raise e.socks5ErrorMap.get(status)() txsocksx.errors.HostUnreachable: ] 21:56:03.671 L20 []#136 connectorFinished (<foolscap.connection.TubConnector object at 0x7fe0acf919d0 from p4pmhkdfwp2hdrm33xgvyoiijzkkj6cm to likvz52mql57cxczpf4nnthfjsm4tngj>)
After about 15 or maybe 30 seconds, the server became reachable, and the error message cleared up. I believe the client was using the SOCKS port of the local Tor daemon, so the delay probably wasn't because the client was waiting for a local Tor to launch, nor waiting for a control connection to the local Tor.
I think the fix for this will be to have the foolscap Tor handler map this txsocksx.errors.HostUnreachable to something more concise, like just "onion server unreachable".
It's possible that we could do something on the server side too. The server was using a control port on the system Tor daemon to register the onion service. It will have published the .onion address as soon as it started (just after establishing a Tor-mediated connection to the Introducer), but it tells Tor to set up the server at startup too, and that process probably didn't finish by the time the client heard about the address. The .onion address is generated during tahoe create-node, and written into tahoe.cfg, so it's available very early, before the Tor control connection is even started.
So maybe we want to change the server to wait for some sort of ACK from the tor_provider, telling us that the Tor daemon says the onion service descriptors and rendezvous points are ready to go, before allowing the IntroducerClient to publish the storage server. That might help with the first launch of the server. It won't help with later launches, when the Introducer (and clients) already have the announcement, and they're busy trying+retrying to connect, but there's nothing we can do about that.
Attachments (2)
Change History (4)
Changed at 2016-12-17T22:19:01Z by warner
comment:1 Changed at 2020-06-30T14:45:13Z by exarkun
- Milestone changed from 1.13.0 to 1.15.0
Moving open issues out of closed milestones.
comment:2 Changed at 2021-03-30T18:40:19Z by meejah
- Milestone changed from 1.15.0 to soon
Ticket retargeted after milestone closed
error during connect