#1467 closed defect (fixed)
nondeterministic test failure in allmydata.test.test_system.SystemTest.test_filesystem because the Welcome page says "Connected to helper: no" when the test expected "yes"
Reported by: | davidsarah | Owned by: | warner |
---|---|---|---|
Priority: | normal | Milestone: | 1.9.2 |
Component: | code | Version: | 1.8.2 |
Keywords: | test heisenbug | Cc: | |
Launchpad Bug: |
Description
[FAIL]: allmydata.test.test_system.SystemTest.test_filesystem Traceback (most recent call last): File "/home/buildbot/bb-tahoe/brian-linode/build/src/allmydata/test/test_system.py", line 1102, in _got_welcome_helper page) twisted.trial.unittest.FailTest: <html><head> [...] <div class="connected-no"> <div>Helper: <span>pb://ncieuno4yjvioe7fz5y7tqhl3nfrmawh@72.14.177.254:36780,127.0.0.1:36780/2w3qpdgcqegdd3nvhyn2eyaiphzmyog3</span></div> <div>Connected to helper?: <span>no</span></div> </div> [...] </body></html>
Change History (13)
comment:1 Changed at 2011-08-04T02:58:21Z by davidsarah
comment:2 Changed at 2011-08-04T02:59:46Z by davidsarah
- Keywords heisenbug added
- Summary changed from test failure in allmydata.test.test_system.SystemTest.test_filesystem because the Welcome page says "Connected to helper: no" when the test expected "yes" to nondeterministic test failure in allmydata.test.test_system.SystemTest.test_filesystem because the Welcome page says "Connected to helper: no" when the test expected "yes"
zooko: "Yes, I've seen that. It is non-deterministic."
comment:3 follow-up: ↓ 4 Changed at 2012-02-08T18:26:10Z by zooko
test_filesystem just irreproducible FAILed again, and this time it happened on one my favorite buildslaves -- the eminently stable and well-performing "Kyle OpenBSD amd64":
https://tahoe-lafs.org/buildbot-tahoe-lafs/builders/Kyle%20OpenBSD%20amd64/builds/22
This time there was a different error message, which might be a useful clue:
twisted.web.error.Error: 404 Not Found
Here is the part of the test.log which appears after it announces SystemTest.test_filesystem and before it announces the next test (SystemTest.test_filesystem_with_cli_in_subprocess):
2012-02-01 01:36:53.422Z [-] --> allmydata.test.test_system.SystemTest.test_filesystem <-- 2012-02-01 01:36:53.434Z [-] foolscap.pb.Listener starting on 19624 2012-02-01 01:36:53.435Z [-] nevow.appserver.NevowSite starting on 19673 2012-02-01 01:36:53.435Z [-] Starting factory <nevow.appserver.NevowSite instance at 0x21d8becf8> 2012-02-01 01:36:53.436Z [-] My pid: 30725 2012-02-01 01:36:53.438Z [-] twisted.internet.protocol.DatagramProtocol starting on 2948 2012-02-01 01:36:53.439Z [-] Starting protocol <twisted.internet.protocol.DatagramProtocol instance at 0x202adedd0> 2012-02-01 01:36:53.450Z [-] (Port 2948 Closed) 2012-02-01 01:36:53.452Z [-] Stopping protocol <twisted.internet.protocol.DatagramProtocol instance at 0x202adedd0> 2012-02-01 01:36:53.614Z [-] foolscap.pb.Listener starting on 35768 2012-02-01 01:36:54.296Z [-] twisted.internet.protocol.DatagramProtocol starting on 34635 2012-02-01 01:36:54.297Z [-] Starting protocol <twisted.internet.protocol.DatagramProtocol instance at 0x202adedd0> 2012-02-01 01:36:54.361Z [-] foolscap.pb.Listener starting on 37699 2012-02-01 01:36:54.362Z [-] nevow.appserver.NevowSite starting on 46227 2012-02-01 01:36:54.362Z [-] Starting factory <nevow.appserver.NevowSite instance at 0x21d2cc488> 2012-02-01 01:36:54.363Z [-] My pid: 30725 2012-02-01 01:36:54.363Z [-] (Port 34635 Closed) 2012-02-01 01:36:54.364Z [-] Stopping protocol <twisted.internet.protocol.DatagramProtocol instance at 0x202adedd0> 2012-02-01 01:36:54.365Z [-] twisted.internet.protocol.DatagramProtocol starting on 19862 2012-02-01 01:36:54.365Z [-] Starting protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21d937f80> 2012-02-01 01:36:54.366Z [-] (Port 19862 Closed) 2012-02-01 01:36:54.366Z [-] Stopping protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21d937f80> 2012-02-01 01:36:54.439Z [-] foolscap.pb.Listener starting on 7790 2012-02-01 01:36:54.440Z [-] My pid: 30725 2012-02-01 01:36:54.487Z [-] foolscap.pb.Listener starting on 35570 2012-02-01 01:36:54.488Z [-] My pid: 30725 2012-02-01 01:36:54.537Z [-] foolscap.pb.Listener starting on 7049 2012-02-01 01:36:54.537Z [-] nevow.appserver.NevowSite starting on 21595 2012-02-01 01:36:54.538Z [-] Starting factory <nevow.appserver.NevowSite instance at 0x2085d2ef0> 2012-02-01 01:36:54.538Z [-] My pid: 30725 2012-02-01 01:36:54.586Z [-] foolscap.pb.Listener starting on 24152 2012-02-01 01:36:54.587Z [-] My pid: 30725 2012-02-01 01:36:54.589Z [-] twisted.internet.protocol.DatagramProtocol starting on 37407 2012-02-01 01:36:54.590Z [-] Starting protocol <twisted.internet.protocol.DatagramProtocol instance at 0x20e547b90> 2012-02-01 01:36:54.591Z [-] twisted.internet.protocol.DatagramProtocol starting on 30005 2012-02-01 01:36:54.592Z [-] Starting protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21dc17680> 2012-02-01 01:36:54.639Z [-] twisted.internet.protocol.DatagramProtocol starting on 6706 2012-02-01 01:36:54.639Z [-] Starting protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21e3a4098> 2012-02-01 01:36:54.662Z [-] twisted.internet.protocol.DatagramProtocol starting on 30125 2012-02-01 01:36:54.663Z [-] Starting protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21dc2bc68> 2012-02-01 01:36:54.668Z [-] (Port 37407 Closed) 2012-02-01 01:36:54.669Z [-] Stopping protocol <twisted.internet.protocol.DatagramProtocol instance at 0x20e547b90> 2012-02-01 01:36:54.670Z [-] (Port 30005 Closed) 2012-02-01 01:36:54.670Z [-] Stopping protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21dc17680> 2012-02-01 01:36:54.672Z [-] (Port 6706 Closed) 2012-02-01 01:36:54.672Z [-] Stopping protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21e3a4098> 2012-02-01 01:36:54.673Z [-] (Port 30125 Closed) 2012-02-01 01:36:54.673Z [-] Stopping protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21dc2bc68> 2012-02-01 01:36:55.806Z [-] Starting factory <HTTPClientFactory: http://127.0.0.1:19673/> 2012-02-01 01:36:55.877Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:19673/?t=json> 2012-02-01 01:36:55.877Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:19673/> 2012-02-01 01:36:55.908Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:19673/?t=json> 2012-02-01 01:36:57.423Z [-] (Port 46227 Closed) 2012-02-01 01:36:57.423Z [-] Stopping factory <nevow.appserver.NevowSite instance at 0x21d2cc488> 2012-02-01 01:36:57.424Z [-] (Port 37699 Closed) 2012-02-01 01:36:58.484Z [-] foolscap.pb.Listener starting on 37699 2012-02-01 01:36:58.485Z [-] nevow.appserver.NevowSite starting on 6309 2012-02-01 01:36:58.485Z [-] Starting factory <nevow.appserver.NevowSite instance at 0x21d930ea8> 2012-02-01 01:36:58.486Z [-] My pid: 30725 2012-02-01 01:36:58.489Z [-] twisted.internet.protocol.DatagramProtocol starting on 34314 2012-02-01 01:36:58.490Z [-] Starting protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21d910b90> 2012-02-01 01:36:58.494Z [-] (Port 34314 Closed) 2012-02-01 01:36:58.494Z [-] Stopping protocol <twisted.internet.protocol.DatagramProtocol instance at 0x21d910b90> 2012-02-01 01:37:05.172Z [-] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/> 2012-02-01 01:37:05.211Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:21595/> 2012-02-01 01:37:05.212Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/> 2012-02-01 01:37:05.227Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa> 2012-02-01 01:37:05.228Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:21595/> 2012-02-01 01:37:05.421Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir1> 2012-02-01 01:37:05.422Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI%3ADIR2%3Ayodtx22564zgatzzowv765umse%3Agnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/> 2012-02-01 01:37:05.766Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir1/mydata567> 2012-02-01 01:37:05.767Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI%3ADIR2%3Ayodtx22564zgatzzowv765umse%3Agnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir1/> 2012-02-01 01:37:05.971Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:CHK:gmdbvycnymujkjlj646vkngaw4:zyx5xzvredovnxvio2w6iavlikt7i4ie5yh7okbavsf6zo7c6d5q:3:10:112?filename=mydata567> 2012-02-01 01:37:05.972Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir1/mydata567> 2012-02-01 01:37:06.006Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri?uri=URI:CHK:gmdbvycnymujkjlj646vkngaw4:zyx5xzvredovnxvio2w6iavlikt7i4ie5yh7okbavsf6zo7c6d5q:3:10:112> 2012-02-01 01:37:06.007Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:CHK:gmdbvycnymujkjlj646vkngaw4:zyx5xzvredovnxvio2w6iavlikt7i4ie5yh7okbavsf6zo7c6d5q:3:10:112?filename=mydata567> 2012-02-01 01:37:06.060Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:CHK:gmdbvycnymujkjlj646vkngawy:zyx5xzvredovnxvio2w6iavlikt7i4ie5yh7okbavsf6zo7c6d5q:3:10:112?filename=mydata567> 2012-02-01 01:37:06.061Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI%3ACHK%3Agmdbvycnymujkjlj646vkngaw4%3Azyx5xzvredovnxvio2w6iavlikt7i4ie5yh7okbavsf6zo7c6d5q%3A3%3A10%3A112> 2012-02-01 01:37:06.076Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/new.txt> 2012-02-01 01:37:06.077Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:CHK:gmdbvycnymujkjlj646vkngawy:zyx5xzvredovnxvio2w6iavlikt7i4ie5yh7okbavsf6zo7c6d5q:3:10:112?filename=mydata567> 2012-02-01 01:37:06.567Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/new.txt> 2012-02-01 01:37:06.567Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/new.txt> 2012-02-01 01:37:06.703Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/big.txt> 2012-02-01 01:37:06.707Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/new.txt> 2012-02-01 01:37:08.019Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/big.txt> 2012-02-01 01:37:08.019Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/big.txt> 2012-02-01 01:37:08.734Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/new.txt> 2012-02-01 01:37:08.736Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/big.txt> 2012-02-01 01:37:09.054Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/new.txt> 2012-02-01 01:37:09.055Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/new.txt> 2012-02-01 01:37:09.191Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/uri> 2012-02-01 01:37:09.192Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri/URI:DIR2:yodtx22564zgatzzowv765umse:gnlv5ryhwfjvzgqcedx5hlihwrbugh6bd5uiu3vupnsvhqqvq2qa/subdir3/new.txt> 2012-02-01 01:37:09.583Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:21595/uri> 2012-02-01 01:37:09.584Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/uri> 2012-02-01 01:37:10.382Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/status> 2012-02-01 01:37:10.383Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:21595/uri> 2012-02-01 01:37:10.433Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/status/up-229> 2012-02-01 01:37:10.433Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/status/> 2012-02-01 01:37:10.454Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/status/down-135> 2012-02-01 01:37:10.454Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/status/up-229> 2012-02-01 01:37:10.644Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/status/mapupdate-2143> 2012-02-01 01:37:10.645Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/status/down-135> 2012-02-01 01:37:10.659Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/status/publish-890> 2012-02-01 01:37:10.660Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/status/mapupdate-2143> 2012-02-01 01:37:10.674Z [HTTPPageGetter,client] Starting factory <HTTPClientFactory: http://127.0.0.1:6309/status/retrieve-1440> 2012-02-01 01:37:10.675Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/status/publish-890> 2012-02-01 01:37:10.687Z [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: http://127.0.0.1:6309/status/retrieve-1440> 2012-02-01 01:37:10.687Z [-] (Port 6309 Closed) 2012-02-01 01:37:10.688Z [-] Stopping factory <nevow.appserver.NevowSite instance at 0x21d930ea8> 2012-02-01 01:37:10.689Z [-] (Port 37699 Closed) 2012-02-01 01:37:10.690Z [-] (Port 24152 Closed) 2012-02-01 01:37:10.691Z [-] (Port 21595 Closed) 2012-02-01 01:37:10.691Z [-] Stopping factory <nevow.appserver.NevowSite instance at 0x2085d2ef0> 2012-02-01 01:37:10.692Z [-] (Port 7049 Closed) 2012-02-01 01:37:10.693Z [-] (Port 35570 Closed) 2012-02-01 01:37:10.693Z [-] (Port 7790 Closed) 2012-02-01 01:37:10.694Z [-] (Port 35768 Closed) 2012-02-01 01:37:10.695Z [-] (Port 19673 Closed) 2012-02-01 01:37:10.695Z [-] Stopping factory <nevow.appserver.NevowSite instance at 0x21d8becf8> 2012-02-01 01:37:10.696Z [-] (Port 19624 Closed) 2012-02-01 01:37:10.708Z [-] Main loop terminated.
comment:4 in reply to: ↑ 3 Changed at 2012-02-08T22:36:40Z by davidsarah
Replying to zooko:
test_filesystem just irreproducible FAILed again, and this time it happened on one my favorite buildslaves -- the eminently stable and well-performing "Kyle OpenBSD amd64":
https://tahoe-lafs.org/buildbot-tahoe-lafs/builders/Kyle%20OpenBSD%20amd64/builds/22
This time there was a different error message, which might be a useful clue:
twisted.web.error.Error: 404 Not Found
Since there's no traceback, I don't think we can reliably conclude that this is the same bug. test_filesystem is a huge test, and there could easily be more than one nondeterministic bug in it. Let's keep this ticket for the check that we know fails at src/allmydata/test/test_system.py?rev=5051#L1101.
comment:5 Changed at 2012-06-13T16:17:11Z by davidsarah
- Owner changed from somebody to warner
warner recently changed something relating to race conditions in these tests, I think.
comment:6 Changed at 2012-06-13T16:17:24Z by davidsarah
- Priority changed from major to normal
comment:7 Changed at 2012-06-14T18:43:37Z by warner
Oh, I think I see the problem. test/common.py (in bounce_client(), when it polls with wait_for_connections()) isn't looking for a Helper connection, just for the introducer and server connections. _check_connections() on common.py line 676 should also look for a helper connection (if a helper is configured).
(the change in 51a5aaa6 should fix a couple of race issues in test_system, but not this particular one)
comment:8 Changed at 2012-06-14T19:21:40Z by Brian Warner <warner@…>
- Resolution set to fixed
- Status changed from new to closed
In b2dcbbb62dc759d4:
comment:9 Changed at 2012-06-14T19:24:11Z by Brian Warner <warner@…>
In b2dcbbb62dc759d4:
comment:10 Changed at 2012-06-14T19:25:19Z by warner
- Milestone changed from undecided to 1.10.0
comment:11 Changed at 2012-06-14T20:39:45Z by Brian Warner <warner@…>
In 5512/1.9.2:
comment:12 Changed at 2012-06-14T20:40:52Z by davidsarah
- Milestone changed from 1.10.0 to 1.9.2
Reviewed and applied to 1.9.2.
The test that failed is at src/allmydata/test/test_system.py@5051#L1101.