#1729 closed defect (fixed)

memory leak in allmydata.test.test_web

Reported by: zooko
Owned by: Brian Warner <warner@…>
Priority: normal
Milestone: 1.9.2
Component: code
Version: 1.9.1
Keywords: test memory
Cc:
Launchpad Bug:

Description

Running allmydata.test.test_web with --until-failure makes memory usage grow until the run aborts because a subprocess can't be created ([Errno 12] Cannot allocate memory):

time  ./bin/tahoe @trial --rterror --until-failure allmydata.test.test_web

    test_welcome_page_mkdir_button ...                                     [OK]

-------------------------------------------------------------------------------
Ran 243 tests in 22.676s

PASSED (successes=243)
Test Pass 13
allmydata.test.test_web
  Grid
    test_add_lease ...                                                     [OK]
    test_blacklist ...                                                     [OK]
    test_deep_add_lease ...                                                [OK]
    test_deep_check ...                                                    [OK]
    test_deep_check_and_repair ...                                         [OK]
    test_exceptions ...                                                    [OK]
    test_filecheck ...                                                     [OK]
    test_immutable_unknown ...                                             [OK]
    test_mutant_dirnodes_are_omitted ...                                   [OK]
    test_repair_html ...                                                   [OK]
    test_repair_json ...                                                   [OK]
    test_unknown ...                                                       [OK]
  IntroducerWeb
    test_welcome ... Node._startService failed, aborting
[Failure instance: Traceback: <type 'exceptions.OSError'>: [Errno 12] Cannot allocate memory
/usr/lib/python2.7/threading.py:524:__bootstrap
/usr/lib/python2.7/threading.py:551:__bootstrap_inner
/usr/lib/python2.7/threading.py:504:run
--- <exception caught here> ---
/home/zooko/playground/twisted/twisted/twisted/python/threadpool.py:167:_worker
/home/zooko/playground/twisted/twisted/twisted/python/context.py:118:callWithContext
/home/zooko/playground/twisted/twisted/twisted/python/context.py:81:callWithContext
/home/zooko/playground/tahoe-lafs/dw/src/allmydata/util/iputil.py:224:_synchronously_find_addresses_via_config
/home/zooko/playground/tahoe-lafs/dw/src/allmydata/util/iputil.py:238:_query
/usr/lib/python2.7/subprocess.py:679:__init__
/usr/lib/python2.7/subprocess.py:1143:_execute_child
]
calling os.abort()

real    4m41.030s
user    4m20.853s
sys     0m13.239s

This is on my MacBook Pro running Ubuntu 12.04. Tahoe is the current (darcs) trunk:

$ darcs pull
Pulling from "zooko@tahoe-lafs.org:/home/source/darcs/tahoe-lafs/trunk"...
No remote changes to pull in!
$ python setup.py update_version
running update_version
darcsver: wrote '1.9.0-r5493' into src/allmydata/_version.py
$ ./bin/tahoe --version
allmydata-tahoe: 1.9.0-r5493,
foolscap: 0.6.3.post0,
pycryptopp: 0.6.0.1206569328141510525648634803928199668821045408958,
zfec: 1.4.24,
Twisted: 10.1.0,
Nevow: 0.10.0,
zope.interface: unknown,
python: 2.7.3,
platform: Linux-Ubuntu_12.04-x86_64-64bit_ELF,
pyOpenSSL: 0.12,
simplejson: 2.3.2,
pycrypto: 2.4.1,
pyasn1: unknown,
mock: 0.8.0beta3,
sqlite3: 2.6.0 [sqlite 3.7.9],
setuptools: 0.6c16dev3

Change History (8)

comment:1 Changed at 2012-05-22T21:23:13Z by warner

There are four test classes in test_web: Web, IntroducerWeb, Util, and Grid. I ruled out Util and IntroducerWeb: running each of them for several minutes in a row shows a flat memory footprint. Running Grid by itself uses a fair amount of memory, but it seems to peak at 78M VMSize after about 2 or 3 passes. Running Web by itself consumes more and more memory.
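For anyone who wants to repeat this kind of bisection, here is a rough sketch of checking whether repeated passes have a flat or a growing footprint. It is an assumption-laden stand-in, not what was done above: the numbers in this comment are VMSize of the whole trial process, whereas the sketch uses Python 3's tracemalloc (which is not available on the Python 2.7 used in this ticket) and counts only Python-level allocations. The names memory_after_each_pass, leaky_pass and flat_pass are hypothetical.

```python
import gc
import tracemalloc


def memory_after_each_pass(run_pass, passes=5):
    """Call run_pass repeatedly and record traced memory after each call."""
    tracemalloc.start()
    try:
        readings = []
        for _ in range(passes):
            run_pass()
            gc.collect()  # drop collectable garbage before measuring
            current, _peak = tracemalloc.get_traced_memory()
            readings.append(current)
        return readings
    finally:
        tracemalloc.stop()


# Hypothetical workloads standing in for "one pass over a test class".
_retained = []


def leaky_pass():
    # Keeps every buffer alive in a module-level list.
    _retained.append(bytearray(100000))


def flat_pass():
    # Allocates the same amount but releases it before returning.
    scratch = bytearray(100000)
    del scratch


leaky = memory_after_each_pass(leaky_pass)
flat = memory_after_each_pass(flat_pass)
leaky_growth = leaky[-1] - leaky[0]
flat_growth = flat[-1] - flat[0]
print("leaky growth:", leaky_growth, "flat growth:", flat_growth)
```

A class whose readings keep climbing from pass to pass is the one worth looking at; a class that levels off after a few passes is probably just warming up caches.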

I'm pretty sure this is due to the "all_contents" class-level dictionary in allmydata.test.common.FakeCHKFileNode and FakeMutableFileNode. I'll look into parameterizing that with a container which can be discarded between test runs.
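To illustrate the suspected pattern, here is a minimal sketch. It is not the real allmydata.test.common code: LeakyFakeNode and run_one_test are hypothetical stand-ins, and only the attribute name "all_contents" is taken from the description above.

```python
# Hypothetical stand-in for a fake node class; not the real
# allmydata.test.common.FakeCHKFileNode.
class LeakyFakeNode:
    # Class-level dict: a single object shared by every instance, and kept
    # alive for as long as the class itself is alive.
    all_contents = {}

    def __init__(self, uri, data):
        self.uri = uri
        # Item assignment mutates the shared class-level dict; it does not
        # create a per-instance attribute.
        self.all_contents[uri] = data


def run_one_test(pass_number):
    # Each "test" creates fresh node instances and then drops them.
    for i in range(3):
        LeakyFakeNode("URI:pass%d-file%d" % (pass_number, i), b"x" * 1024)


for pass_number in range(5):
    run_one_test(pass_number)

# The instances are gone, but their data is still reachable from the class.
leaked_entries = len(LeakyFakeNode.all_contents)
print(leaked_entries)
```

Because the dictionary hangs off the class rather than off anything created per test, nothing that a test tears down will ever release it.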

comment:2 Changed at 2012-05-22T22:31:33Z by warner

Yeah, that seemed to be the problem. The patch I'm about to commit moves that container out into the FakeClient created fresh for each test case, which removes the lingering buildup. With that in place, I see the memory usage for --until-failure looping of test_web.Web grow to maybe 250MB or 300MB and then drop back to 150MB (as GC kicks in). I think that closes the leak, although it might be nice to identify why it's still using so much RAM (maybe the test files it's uploading are excessively large).
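A minimal sketch of the shape of that change, under the assumption that each test builds its own client object and hands it to the fake nodes. SketchClient, SketchNode and run_one_test are hypothetical names for illustration, not the classes in the actual patch.

```python
import gc
import weakref


class SketchClient:
    """Per-test owner of the fake storage (hypothetical name)."""

    def __init__(self):
        # Instance-level containers: they live and die with this client.
        self.all_contents = {}
        self.file_types = {}


class SketchNode:
    """Fake node that stores data in its client's containers."""

    def __init__(self, client, uri, data, file_type="chk"):
        self.uri = uri
        self.all_contents = client.all_contents
        self.file_types = client.file_types
        self.all_contents[uri] = data
        self.file_types[uri] = file_type


def run_one_test(pass_number):
    client = SketchClient()  # built fresh for each test
    for i in range(3):
        SketchNode(client, "URI:pass%d-file%d" % (pass_number, i), b"x" * 1024)
    # Return a weak reference so the caller can check that the client is freed.
    return len(client.all_contents), weakref.ref(client)


sizes = []
refs = []
for pass_number in range(5):
    size, ref = run_one_test(pass_number)
    sizes.append(size)
    refs.append(ref)

gc.collect()
all_released = all(ref() is None for ref in refs)
print(sizes, all_released)
```

Each pass sees only its own three entries, and once the test returns there is no remaining reference that keeps the client or its dictionaries alive.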

comment:3 Changed at 2012-05-22T23:07:30Z by Brian Warner <warner@…>

  • Owner set to Brian Warner <warner@…>
  • Resolution set to fixed
  • Status changed from new to closed

In bfee999e20aa9fdc:

(The changeset message doesn't reference this ticket)

comment:4 Changed at 2012-05-22T23:08:08Z by Brian Warner <warner@…>

In bfee999e20aa9fdc:

test_web.py: fix memory leak when run with --until-failure

The Fake*Node classes in test/common.py were accumulating share data in
a class-level dictionary, which persisted from one test run to the next.
As a result, running test_web.py over and over (with trial's
--until-failure feature) made this dictionary grow without bound,
eventually running out of memory.

This fix moves that dictionary into the FakeClient built fresh for each
test, so it doesn't build up. It does the same thing for "file_types",
which was much smaller but still lived at the class level.

Closes #1729

comment:5 Changed at 2012-05-22T23:09:01Z by warner

  • Milestone changed from undecided to 1.10.0

comment:6 Changed at 2012-05-23T05:28:01Z by Brian Warner <warner@…>

In 5503/1.9.2:

test_web.py: fix memory leak when run with --until-failure

The Fake*Node classes in test/common.py were accumulating share data in
a class-level dictionary, which persisted from one test run to the next.
As a result, running test_web.py over and over (with trial's
--until-failure feature) made this dictionary grow without bound,
eventually running out of memory.

This fix moves that dictionary into the FakeClient built fresh for each
test, so it doesn't build up. It does the same thing for "file_types",
which was much smaller but still lived at the class level.

Closes #1729

comment:7 Changed at 2012-05-23T05:28:30Z by davidsarah

  • Milestone changed from 1.10.0 to 1.9.2

Safe to apply to 1.9.2 since it only affects tests.

comment:8 Changed at 2012-07-10T20:04:56Z by Brian Warner <warner@…>

In 5852/cloud-backend:

test_web.py: fix memory leak when run with --until-failure

The Fake*Node classes in test/common.py were accumulating share data in
a class-level dictionary, which persisted from one test run to the next.
As a result, running test_web.py over and over (with trial's
--until-failure feature) made this dictionary grow without bound,
eventually running out of memory.

This fix moves that dictionary into the FakeClient built fresh for each
test, so it doesn't build up. It does the same thing for "file_types",
which was much smaller but still lived at the class level.

Closes #1729
