Opened at 2012-05-23T04:14:47Z
Closed at 2012-06-16T18:20:42Z
#1749 closed defect (fixed)
bug in mutable publish that could cause an IndexError when a writer is removed in Publish._connection_problem
Reported by: | davidsarah | Owned by: | warner |
---|---|---|---|
Priority: | critical | Milestone: | 1.9.2 |
Component: | code-mutable | Version: | 1.9.1 |
Keywords: | publish regression test-needed review-needed | Cc: | |
Launchpad Bug: |
Description (last modified by davidsarah)
"Traceback (most recent call last): Failure: allmydata.mutable.common.NotEnoughServersError: (\"Publish ran out of good servers, last failure was: [Failure instance: Traceback: <type 'exceptions.IndexError'>: list index out of range /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/base.py:800:runUntilCurrent /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/foolscap-0.6.3-py2.6.egg/foolscap/eventual.py:26:_turn /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:368:callback /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:464:_startRunCallbacks --- <exception caught here> ---\\n/home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:551:_runCallbacks /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:634:_push /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:651:push_segment /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:637:_push /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:773:push_everything_else /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:878:finish_publishing /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:886:_record_verinfo
I can reproduce this, at least on the cloud-branch, when I do a tahoe put --mutable shortly after the gateway has started.
Attachments (1)
Change History (11)
comment:1 Changed at 2012-05-23T04:16:23Z by davidsarah
- Description modified (diff)
- Owner set to zooko
comment:2 Changed at 2012-05-23T04:25:29Z by warner
sounds like a job for dictutil.DictOfSets!
comment:3 Changed at 2012-05-23T04:42:26Z by warner
Seriously though, yes, DictOfSets was made for this. It deletes the key from the dict when the last value is removed. I skimmed through publish.py: the only place that populates self.writers does so from an unordered set, so self.writers does not need to retain order, so DictOfSets is sufficient, and DictOfLists is not necessary.
comment:4 Changed at 2012-05-23T15:36:26Z by davidsarah
+1 for using DictOfSets.
comment:5 Changed at 2012-06-12T16:56:00Z by davidsarah
- Owner changed from zooko to davidsarah
- Status changed from new to assigned
comment:6 Changed at 2012-06-15T03:30:48Z by davidsarah
- Owner changed from davidsarah to kevan
- Status changed from assigned to new
I looked at changing it to use DictOfSets, but there are several instances of self.writers.values()[0][0]. Currently, that will always pick the first writer added for some arbitrary shnum. If I use list(self.writers.values()[0])[0], that does pass tests, but it is using an arbitrary writer for an arbitrary shnum. Kevan (who wrote this code), is that change ok?
comment:7 Changed at 2012-06-15T03:33:34Z by davidsarah
- Owner changed from kevan to davidsarah
Oh, warner is right, the writers are populated based on self.goal which is unordered.
comment:8 Changed at 2012-06-15T03:44:13Z by davidsarah
Note that there are two places where self.writers is initialized, and the diff in comment:1 only changes one of them. (In general there's a lot of duplicated code between the update and publish methods of class Publish.)
Changed at 2012-06-15T03:47:23Z by davidsarah
Fix a bug in mutable publish that could cause an IndexError? when a writer is removed in Publish._connection_problem. This version uses DictOfSets? as suggested by warner. fixes #1749
comment:9 Changed at 2012-06-15T03:49:34Z by davidsarah
- Keywords review-needed added
- Owner changed from davidsarah to warner
The existing tests should be sufficient to check that the changed code works, but we probably should have a test for the original issue. Review needed for the fix.
comment:10 Changed at 2012-06-16T18:20:42Z by david-sarah@…
- Resolution set to fixed
- Status changed from new to closed
In 635e87bd7b460324:
zooko suggests this fix, which I reviewed and approved of: