#2364 new defect

Clients in onion grid busy-wait if a storage node is unreachable — at Version 3

Reported by: mhazinsk
Owned by:
Priority: major
Milestone: undecided
Component: code-network
Version: 1.10.0
Keywords: availability reliability anti-censorship tor-protocol anonymity
Cc:
Launchpad Bug:

Description (last modified by mhazinsk)

When running "torify tahoe start", the process uses 100% CPU while attempting to connect to a storage node that is down. Meanwhile, connections to all other storage nodes and to the introducer show as intermittent on the grid status page, so the grid is unusable.

I'm using the latest trunk version of Tahoe on Arch Linux with Torsocks 2.0.0.

Here are the errors that get printed:

[Jan 18 14:42:14] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
2015-01-18 14:42:14-0500 [-] Unhandled Error
	Traceback (most recent call last):
	Failure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=awo7) (during method=RIStorageServer.tahoe.allmydata.com:get_version)
	
2015-01-18 14:42:14-0500 [-] Unhandled Error
	Traceback (most recent call last):
	Failure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=poce) (during method=RIIntroducerPublisherAndSubscriberService_v2.tahoe.allmydata.com:subscribe_v2)
	
[Jan 18 14:42:24] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:42:33] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:42:56] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
2015-01-18 14:42:56-0500 [-] Unhandled Error
	Traceback (most recent call last):
	Failure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=rnfm) (during method=RIStorageServer.tahoe.allmydata.com:get_version)
	
[Jan 18 14:43:20] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:43:33] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:43:43] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
2015-01-18 14:43:43-0500 [-] Unhandled Error
	Traceback (most recent call last):
	Failure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=xivp) (during method=RIStorageServer.tahoe.allmydata.com:get_version)
	
2015-01-18 14:43:43-0500 [-] Unhandled Error
	Traceback (most recent call last):
	Failure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=rnfm) (during method=RIStorageServer.tahoe.allmydata.com:get_version)
	
2015-01-18 14:43:43-0500 [-] Unhandled Error
	Traceback (most recent call last):
	Failure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=poce) (during method=RIIntroducerPublisherAndSubscriberService_v2.tahoe.allmydata.com:subscribe_v2)
	
[Jan 18 14:43:53] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:44:20] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:44:35] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:44:47] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
2015-01-18 14:44:47-0500 [-] Unhandled Error
	Traceback (most recent call last):
	Failure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=rnfm) (during method=RIStorageServer.tahoe.allmydata.com:get_version)
	
[Jan 18 14:44:58] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:45:23] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:45:34] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:45:53] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
[Jan 18 14:46:21] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)
2015-01-18 14:46:21-0500 [-] Reconnector._failed (furl=[redacted]): [Failure instance: Traceback (failure with no frames): <class 'foolscap.tokens.NegotiationError'>: no connection established within client timeout

This problem persists across restarts of tahoe. However, when I remove the tahoe directory, recreate a tahoe directory with "tahoe create-client ~/.tahoe", and restart tahoe, the problem goes away. So far I have not been able to reproduce this on any other machines.
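
For anyone triaging a similar log, it may help to tally how often torsocks reports an unreachable host and which tubids lose their connection. The following is only a rough sketch: the `tally` helper and the `sample` list are hypothetical, not part of Tahoe or Foolscap, and the regular expressions assume the two line formats seen in the log excerpt above.

```python
import re
from collections import Counter

# Patterns assume the two kinds of lines seen in the log excerpt above.
TORSOCKS_RE = re.compile(r"ERROR torsocks\[\d+\]: Host unreachable")
DEAD_REF_RE = re.compile(r"DeadReferenceError: .*\(to tubid=(\w+)\)")


def tally(lines):
    """Return (number of torsocks 'Host unreachable' errors,
    Counter of lost connections keyed by tubid)."""
    unreachable = 0
    lost = Counter()
    for line in lines:
        if TORSOCKS_RE.search(line):
            unreachable += 1
        match = DEAD_REF_RE.search(line)
        if match:
            lost[match.group(1)] += 1
    return unreachable, lost


# A few lines copied from the log above, used as illustrative input.
sample = [
    "[Jan 18 14:42:14] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)",
    "\tFailure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=awo7) (during method=RIStorageServer.tahoe.allmydata.com:get_version)",
    "\tFailure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=poce) (during method=RIIntroducerPublisherAndSubscriberService_v2.tahoe.allmydata.com:subscribe_v2)",
    "[Jan 18 14:42:24] ERROR torsocks[15086]: Host unreachable (in socks5_recv_connect_reply() at socks5.c:528)",
]

unreachable, lost = tally(sample)
print(unreachable, dict(lost))
```

Feeding it the full captured output (for example, the lines of a saved log file) would show whether the lost connections cluster on one tubid or are spread across all peers, which is the distinction the description draws between the down node and the others.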

Change History (3)

comment:1 Changed at 2015-01-18T22:26:24Z by daira

  • Component changed from unknown to code-network
  • Keywords tor added

comment:2 Changed at 2015-01-18T22:26:53Z by daira

  • Description modified (diff)

comment:3 Changed at 2015-01-19T01:20:01Z by mhazinsk

  • Description modified (diff)