Opened at 2009-03-10T21:02:07Z
Closed at 2009-03-19T23:10:04Z
#659 closed defect (invalid)
rusty dusty server fails on tests
Reported by: | arch_o_median | Owned by: | arch_o_median |
---|---|---|---|
Priority: | major | Milestone: | eventually |
Component: | code | Version: | 1.3.0 |
Keywords: | rusty dusty server setup.py setup test build old minimal | Cc: | |
Launchpad Bug: |
Description
I'm installing on a:
Hardware: Pentium III; 64 MB RAM; 731 MHz
OS: Debian 5.0 Lenny 2.6.26-1-686
The attached file includes:
1) dpkg_l_Log: the output of a dpkg -l
"included because I installed non-base packages including python-devel gcc and darcs 2.1.2"
2) buildlog and testlog[_2]: outputs of setup.py build and test respectively
Attachments have brief descriptive headers echoing the commandlines that invoked them.
Attachments (7)
Change History (23)
Changed at 2009-03-10T21:02:31Z by arch_o_median
comment:1 Changed at 2009-03-10T21:05:48Z by arch_o_median
- Keywords rusty dusty server added
- Summary changed from Install on Old Sytem setup.py test output to Minimal Resource Server: "setup.py build; setup.py test"
comment:2 Changed at 2009-03-10T22:40:46Z by zooko
- Summary changed from Minimal Resource Server: "setup.py build; setup.py test" to tests fail on rusty dusty server
The test log includes failures, starting with this:
allmydata.test.test_immutable
  Test
    test_download ... [OK]
    test_download_abort_if_too_many_corrupted_shares ... [OK]
    test_download_abort_if_too_many_missing_shares ... [ERROR] [ERROR] [ERROR]
    test_download_from_only_3_remaining_shares ... [ERROR] [ERROR] [ERROR]
    test_download_from_only_3_shares_with_good_crypttext_hash ... [ERROR] [ERROR]
    test_test_code ... [ERROR] [ERROR]
Would you please re-run just these tests (so that it won't take all day), with python ./setup.py test -s allmydata.test.test_immutable, let it run to completion, and attach the output as well as the contents of _trial_temp/test.log ?
comment:3 follow-up: ↓ 4 Changed at 2009-03-10T22:41:00Z by zooko
- Component changed from packaging to code
- Priority changed from minor to major
comment:4 in reply to: ↑ 3 ; follow-up: ↓ 5 Changed at 2009-03-11T02:20:00Z by arch_o_median
Replying to zooko:
Here they come!
Changed at 2009-03-11T02:20:39Z by arch_o_median
Changed at 2009-03-11T02:20:59Z by arch_o_median
comment:5 in reply to: ↑ 4 Changed at 2009-03-11T02:21:55Z by arch_o_median
Replying to arch_o_median:
Replying to zooko:
test.log is _trial_temp/test.log
comment:6 Changed at 2009-03-11T02:41:19Z by zooko
What? Tests passed this time! Darn! Nondeterministic computing!
Could you please try to figure out what makes the tests pass or fail? Perhaps the test_immutable part fails only when you run the full suite? Perhaps it has to do with whether the machine is loaded with other processes at the same time? Or maybe you have to run test_immutable several times in a row to see it fail?
comment:7 Changed at 2009-03-11T04:37:40Z by zooko
Handy dandy shell script:
/bin/true # to set $? to 0
while [ $? = 0 ] ; do
    python ./setup.py test -s allmydata.test.test_immutable
done
That will execute the test_immutable tests over and over, stopping if python exits with a non-zero exit code, which it will if any test fails.
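A variant of the loop above, sketched as a reusable function, may be handy when the run should also stop after a fixed number of passes. The function name repeat_until_failure and the max_runs argument are illustrative only and do not come from the ticket or from Tahoe; like the loop above, it relies on the command exiting non-zero when something fails.

```shell
# Hypothetical helper (not part of the ticket or of Tahoe): rerun a command
# until it exits non-zero or until max_runs attempts have passed.
repeat_until_failure() {
    max_runs=$1
    shift
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        runs=$((runs + 1))
        # "$@" is the command to repeat, passed through unchanged.
        if ! "$@"; then
            echo "run $runs failed"
            return 1
        fi
    done
    echo "all $runs runs passed"
    return 0
}
```

It could be invoked as: repeat_until_failure 100 python ./setup.py test -s allmydata.test.test_immutable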
comment:8 Changed at 2009-03-13T06:32:26Z by arch_o_median
OK, running overnight or until test failure... using my very first from-scratch Bash script of more than a couple of lines.
comment:9 Changed at 2009-03-13T22:31:34Z by arch_o_median
I'm running "python ./setup.py test" repeatedly. I plan to let it run for 72 hours.
comment:10 follow-up: ↓ 15 Changed at 2009-03-13T22:52:20Z by zooko
Wait a minute! You mean that when you reran the exact same test which earlier produced "test.log" which you attached in http://allmydata.org/trac/tahoe/ticket/659#comment:1 , it didn't produce the same output? (The test_immutable and some other ones didn't fail?)
comment:11 Changed at 2009-03-14T03:28:39Z by arch_o_median
I ran it using the attached script. Maybe it doesn't do what I think it does? I assume a failed test produces a non-zero exit code.
The script ran 95 times, before I killed it.
I have changed state on the system. In particular I darcs "got" the latest version of tahoe and put it in a directory parallel to the one containing the version obtained by walking through the install. I ran setup.py in the darcs version. I then rm -rf'd the darcs-version directory, and reacquired the "walk-through" 1.3.0 version, and built it. The tests I mention were run on this reinstalled version of 1.3.0, so if tahoe changed state outside of its install directory (I also rm -rf'd ~/.tahoe) then I've made things harder on myself.
I've also run apt-get install vim. I _may_ have apt-get installed other packages, but I don't remember for sure, and I haven't yet convinced dpkg to tell me what I installed most recently.
I notice that the ERROR's in the first and second logs here: http://allmydata.org/trac/tahoe/attachment/ticket/659/old_sys_inst_setup_test_logs.tar.bz2 are not in the same tests. The second log passes test_immutable, but produces errors in test_repairer.
comment:12 Changed at 2009-03-14T15:36:47Z by zooko
> Maybe it doesn't do what I think it does? I assume a failed test produces a non-zero exit code.
Did you capture the stdout/stderr so you can grep it and see if any tests failed?
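One way to capture the output and check it afterwards, as a minimal sketch: the function below saves stdout and stderr to a log file and counts the lines carrying a failure marker. The name run_and_count_failures is hypothetical, and treating [FAIL] as a failure marker alongside the [ERROR] seen in this ticket's logs is an assumption about the test runner's output.

```shell
# Hypothetical helper: run a command, save stdout and stderr to a log file,
# and report how many log lines carry an [ERROR] or [FAIL] marker.
# Assumption: the test runner marks failures with those literal strings.
run_and_count_failures() {
    logfile=$1
    shift
    status=0
    "$@" > "$logfile" 2>&1 || status=$?
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, hence the "|| true".
    failures=$(grep -c -F -e '[ERROR]' -e '[FAIL]' "$logfile" || true)
    echo "exit status: $status, failure lines: $failures"
    # Succeed only if the command succeeded and no failure marker was logged.
    [ "$status" -eq 0 ] && [ "$failures" -eq 0 ]
}
```

For example: run_and_count_failures run1.log python ./setup.py test (run1.log is just a placeholder file name). A mismatch between the exit status and the count would show whether the assumption about non-zero exit codes holds.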
comment:13 Changed at 2009-03-14T15:38:27Z by zooko
> if tahoe changed state outside of its install directory (I also rm -rf'd ~/.tahoe)
It shouldn't have changed anything outside of those two directories. In fact, just running the unit tests shouldn't have changed anything outside the install directory.
comment:14 Changed at 2009-03-14T15:48:04Z by zooko
> I have changed state on the system. In particular I darcs "got" the latest version of tahoe and put it in a directory parallel to the one containing the version obtained by walking through the install. I ran setup.py in the darcs version. I then rm -rf'd the darcs-version directory, and reacquired the "walk-through" 1.3.0 version, and built it. The tests I mention were run on this reinstalled version of 1.3.0,
Which tests? Maybe you could identify which tests you mean using a URL to their log file or something.
I still don't understand why it failed one time but didn't fail another time. I think, but am not entirely certain, that you actually ran the same tests on the same source code with the same system both times, which implies that there is a non-determinism problem in the tests (probably a timing problem). But can you confirm that this is what you did? I really want a report of a controlled experiment in which the exact same test produced two different results.
By the way, I like the command_repeater.sh script!
Changed at 2009-03-19T22:28:42Z by arch_o_median
Changed at 2009-03-19T22:31:14Z by arch_o_median
Changed at 2009-03-19T22:31:22Z by arch_o_median
comment:15 in reply to: ↑ 10 Changed at 2009-03-19T22:42:43Z by arch_o_median
Replying to zooko:
> Wait a minute! You mean that when you reran the exact same test which earlier produced "test.log" which you attached in http://allmydata.org/trac/tahoe/ticket/659#comment:1 , it didn't produce the same output? (The test_immutable and some other ones didn't fail?)
Not quite:
When I ran "python ./setup.py test" the first time I got this result: http://allmydata.org/trac/tahoe/attachment/ticket/659/test_all_one
When I ran "python ./setup.py test" the second time I got this result: http://allmydata.org/trac/tahoe/attachment/ticket/659/test_all_two
When I ran "python ./setup.py test -s allmydata.test.test_immutable" 96 times I generated 0 [ERRORS].
The [ERROR] sets generated by "setup.py test" are not the same.
Given the above observations, I thought it would be interesting to look at patterns in [ERROR] messages generated across a large number of "setup.py test" runs. I had resolved to do this when a new development arose: the Rusty Dusty server died.
That is, it generated some terrifying error message and became totally non-responsive. Unfortunately, this occurred several days ago and I cannot remember the specifics of the error. I eventually powered the machine down. Today, when I pressed the power button to bring it back up, it started emitting a regular BEEEP BEEEP noise and gave no other signs of life (e.g. no activity on the monitor). I did this twice.
My new plan is to run the above tests on a different old desktop, while debugging Rusty.
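The comparison of [ERROR] sets between two full runs, as described above, could be sketched like this. The function names are hypothetical, and the code assumes each result sits on its own log line in the form "test_name ... [ERROR]", as in the excerpt in comment:2. Test methods that share a name across classes are not told apart, so this is only a rough first look.

```shell
# Print the sorted, de-duplicated names of tests whose result line contains
# [ERROR]. Assumes lines of the form "test_name ... [ERROR]".
erroring_tests() {
    awk '$2 == "..." && index($0, "[ERROR]") { print $1 }' "$1" | sort -u
}

# Show which tests errored in only one of the two logs.
compare_error_sets() {
    a=$(mktemp)
    b=$(mktemp)
    erroring_tests "$1" > "$a"
    erroring_tests "$2" > "$b"
    echo "only in $1:"
    comm -23 "$a" "$b"   # lines unique to the first log
    echo "only in $2:"
    comm -13 "$a" "$b"   # lines unique to the second log
    rm -f "$a" "$b"
}
```

With the two attachments saved locally, this would be run as: compare_error_sets test_all_one test_all_two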
comment:16 Changed at 2009-03-19T23:10:04Z by zooko
- Resolution set to invalid
- Status changed from new to closed
- Summary changed from tests fail on rusty dusty server to rusty dusty server fails on tests
Heh heh heh. Closing this ticket as "invalid". You should try running memtest86 and other such diagnostics.
build test logs