Opened at 2011-02-23T00:44:51Z
Last modified at 2011-02-23T02:40:20Z
#1367 new defect
tolerance for broken TCP connections due to incorrect/restrictive firewalls
Reported by: | gdt | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | code-network | Version: | 1.8.2 |
Keywords: | availability firewall reliability | Cc: | |
Launchpad Bug: | | | |
Description
I've run a server and seen problems caused by an overzealous firewall that impairs TCP connections after a short time. When clients try to talk to the server, I see queued bytes that are never ACKed, and each access then seems to take 4 or 8 minutes to time out and finish.
Somehow, Tahoe should avoid repeatedly waiting a long time for servers that recent history predicts will not answer. Operations that can be completed with the subset of responding servers should finish reasonably quickly.
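As a rough illustration of "history predicts will not answer", here is a minimal sketch of a per-server timeout record that pushes recently unresponsive servers to the back of the list. It is not Tahoe-LAFS code: the `ServerHealth` class, its method names, and the 10-minute penalty window are all assumptions made up for this example.

```python
import time


class ServerHealth:
    """Hypothetical record of recent timeouts per server (not a Tahoe-LAFS API)."""

    def __init__(self, penalty_seconds=600.0, clock=time.monotonic):
        # penalty_seconds is an assumed value: how long a timeout counts against a server.
        self._penalty = penalty_seconds
        self._clock = clock
        self._last_timeout = {}  # server id -> time of its most recent timeout

    def record_timeout(self, server_id):
        """Remember that a request to this server just timed out."""
        self._last_timeout[server_id] = self._clock()

    def record_success(self, server_id):
        """A successful answer clears the server's bad history."""
        self._last_timeout.pop(server_id, None)

    def is_suspect(self, server_id):
        """True if the server timed out within the penalty window."""
        t = self._last_timeout.get(server_id)
        return t is not None and (self._clock() - t) < self._penalty

    def order_servers(self, server_ids):
        """Responsive servers first, suspects last; relative order is otherwise kept."""
        return sorted(server_ids, key=self.is_suspect)
```

A caller could contact servers in the order returned by `order_servers()` and only fall back to suspect servers when the responsive ones are not enough. Whether a suspect server should be skipped outright or just tried last is a design choice this sketch leaves open.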
To reproduce without my firewall, add debug code to the server to discard (instead of processing) data on all TCP connections older than 3 minutes. Then bring a storage node with this impairment up on a grid.
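The debug code could look roughly like the sketch below. It is only an illustration under stated assumptions: `AgingConnection` and `handler` are hypothetical names, and `handler` stands in for whatever callback the server really uses to process incoming bytes. The 180-second limit is the 3 minutes mentioned above.

```python
import time

MAX_AGE_SECONDS = 180.0  # the 3 minutes from the description


class AgingConnection:
    """Hypothetical wrapper that drops incoming data once a connection is too old."""

    def __init__(self, handler, max_age=MAX_AGE_SECONDS, clock=time.monotonic):
        self._handler = handler      # assumed stand-in for the server's real data callback
        self._max_age = max_age
        self._clock = clock
        self._opened_at = clock()    # connection age is measured from construction
        self.discarded_bytes = 0     # debugging aid: how much data was thrown away

    def data_received(self, data):
        """Process data normally while the connection is young, discard it afterwards."""
        if self._clock() - self._opened_at > self._max_age:
            self.discarded_bytes += len(data)
            return
        self._handler(data)
```

Note that discarding at the application layer is not identical to the firewall case: the kernel still ACKs the bytes, so the client sees a peer that never replies rather than bytes stuck unacknowledged in its send queue.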
Change History (1)
comment:1 Changed at 2011-02-23T02:40:20Z by davidsarah
- Keywords reliability added