Ticket #1225: docs-txt-rst-conversion.patch

File docs-txt-rst-conversion.patch, 130.7 KB (added by p-static at 2010-10-14T07:49:17Z)

Patch against latest darcs checkout

  • docs/architecture.txt

    diff -rN -u old-tahoe-lafs/docs/architecture.txt new-tahoe-lafs/docs/architecture.txt
    old new  
    1 = Tahoe-LAFS Architecture =
     1=======================
     2Tahoe-LAFS Architecture
     3=======================
     4
     51.  `Overview`_
     62.  `The Key-Value Store`_
     73.  `File Encoding`_
     84.  `Capabilities`_
     95.  `Server Selection`_
     106.  `Swarming Download, Trickling Upload`_
     117.  `The Filesystem Layer`_
     128.  `Leases, Refreshing, Garbage Collection`_
     139.  `File Repairer`_
     1410. `Security`_
     1511. `Reliability`_
    216
    3 1.  Overview
    4 2.  The Key-Value Store
    5 3.  File Encoding
    6 4.  Capabilities
    7 5.  Server Selection
    8 6.  Swarming Download, Trickling Upload
    9 7.  The Filesystem Layer
    10 8.  Leases, Refreshing, Garbage Collection
    11 9.  File Repairer
    12 10. Security
    13 11. Reliability
    1417
    15 
    16 == Overview  ==
     18Overview
     19========
    1720
    1821(See the docs/specifications directory for more details.)
    1922
     
    4043copies files from the local disk onto the decentralized filesystem. We later
    4144provide read-only access to those files, allowing users to recover them.
    4245There are several other applications built on top of the Tahoe-LAFS
    43 filesystem (see the RelatedProjects page of the wiki for a list).
     46filesystem (see the `RelatedProjects
     47<http://tahoe-lafs.org/trac/tahoe-lafs/wiki/RelatedProjects>`_ page of the
     48wiki for a list).
    4449
    4550
    46 == The Key-Value Store ==
     51The Key-Value Store
     52===================
    4753
    4854The key-value store is implemented by a grid of Tahoe-LAFS storage servers --
    4955user-space processes. Tahoe-LAFS storage clients communicate with the storage
     
    7682server to tell a new client about all the others.
    7783
    7884
    79 == File Encoding ==
     85File Encoding
     86=============
    8087
    8188When a client stores a file on the grid, it first encrypts the file. It then
    8289breaks the encrypted file into small segments, in order to reduce the memory
     
    117124into plaintext, then emit the plaintext bytes to the output target.
    118125
    119126
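    A minimal sketch of the segmenting idea described in this section, assuming
    an illustrative 128 KiB segment size (the real encoder also erasure-codes
    each segment into shares as it goes)::

      SEGMENT_SIZE = 128 * 1024   # illustrative only, not Tahoe's actual default

      def segments(ciphertext_file):
          """Yield the encrypted file one segment at a time, so encoding
          never needs to hold the whole file in memory."""
          while True:
              seg = ciphertext_file.read(SEGMENT_SIZE)
              if not seg:
                  return
              yield seg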
    120 == Capabilities ==
     127Capabilities
     128============
    121129
    122130Capabilities to immutable files represent a specific set of bytes. Think of
    123131it like a hash function: you feed in a bunch of bytes, and you get out a
     
    142150that these potential bytes are indeed the ones that you were looking for.
    143151
    144152The "key-value store" layer doesn't include human-meaningful names.
    145 Capabilities sit on the "global+secure" edge of Zooko's Triangle[1]. They are
     153Capabilities sit on the "global+secure" edge of `Zooko's Triangle`_. They are
    146154self-authenticating, meaning that nobody can trick you into accepting a file
    147155that doesn't match the capability you used to refer to that file. The
    148156filesystem layer (described below) adds human-meaningful names atop the
    149157key-value layer.
    150158
     159.. _`Zooko's Triangle`: http://en.wikipedia.org/wiki/Zooko%27s_triangle
     160
    151161
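    The self-authenticating property can be modeled in miniature like this
    (purely illustrative: real capabilities carry encryption keys and Merkle
    hash-tree roots, not a bare SHA-256 digest)::

      import hashlib

      def make_cap(file_bytes):
          # the "capability" is derived from the bytes themselves
          return hashlib.sha256(file_bytes).digest()

      def verify(cap, downloaded_bytes):
          # bytes that don't match the capability are rejected
          return hashlib.sha256(downloaded_bytes).digest() == cap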
    152 == Server Selection ==
     162Server Selection
     163================
    153164
    154165When a file is uploaded, the encoded shares are sent to some servers. But to
    155166which ones? The "server selection" algorithm is used to make this choice.
    156167
     157168The storage index is used to consistently permute the set of all server nodes
    158 (by sorting them by HASH(storage_index+nodeid)). Each file gets a different
     169(by sorting them by ``HASH(storage_index+nodeid)``). Each file gets a different
    159170permutation, which (on average) will evenly distribute shares among the grid
    160171and avoid hotspots. Each server has announced its available space when it
    161172connected to the introducer, and we use that available space information to
     
    254265  significantly hurt reliability (sometimes the permutation resulted in most
    255266  of the shares being dumped on a single node).
    256267
    257   Another algorithm (known as "denver airport"[2]) uses the permuted hash to
     268  Another algorithm (known as "denver airport" [#naming]_) uses the permuted hash to
    258269  decide on an approximate target for each share, then sends lease requests
    259270  via Chord routing. The request includes the contact information of the
    260271  uploading node, and asks that the node which eventually accepts the lease
     
    263274  the same approach. This allows nodes to avoid maintaining a large number of
    264275  long-term connections, at the expense of complexity and latency.
    265276
     277.. [#naming]  all of these names are derived from the location where they were
     278        concocted, in this case in a car ride from Boulder to DEN. To be
     279        precise, "Tahoe 1" was an unworkable scheme in which everyone who holds
     280        shares for a given file would form a sort of cabal which kept track of
     281        all the others, "Tahoe 2" is the first-100-nodes in the permuted hash
     282        described in this document, and "Tahoe 3" (or perhaps "Potrero hill 1")
     283        was the abandoned ring-with-many-hands approach.
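    A minimal sketch of the "Tahoe 2" permuted-hash ranking, assuming SHA-256
    stands in for HASH and nodeids are byte strings (the real uploader also
    skips full servers and stops once all shares are placed)::

      import hashlib

      def permuted_order(storage_index, nodeids):
          # sort servers by HASH(storage_index + nodeid); each storage index
          # yields a different but consistent permutation of the grid
          return sorted(nodeids,
                        key=lambda nid: hashlib.sha256(storage_index + nid).digest())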
    266284
    267 == Swarming Download, Trickling Upload ==
     285
     286Swarming Download, Trickling Upload
     287===================================
    268288
    269289Because the shares being downloaded are distributed across a large number of
    270290nodes, the download process will pull from many of them at the same time. The
     
    295315See "helper.txt" for details about the upload helper.
    296316
    297317
    298 == The Filesystem Layer ==
     318The Filesystem Layer
     319====================
    299320
    300321The "filesystem" layer is responsible for mapping human-meaningful pathnames
    301322(directories and filenames) to pieces of data. The actual bytes inside these
     
    325346that are globally visible.
    326347
    327348
    328 == Leases, Refreshing, Garbage Collection ==
     349Leases, Refreshing, Garbage Collection
     350======================================
    329351
    330352When a file or directory in the virtual filesystem is no longer referenced,
    331353the space that its shares occupied on each storage server can be freed,
     
    346368garbage collection.
    347369
    348370
    349 == File Repairer ==
     371File Repairer
     372=============
    350373
    351374Shares may go away because the storage server hosting them has suffered a
    352375failure: either temporary downtime (affecting availability of the file), or a
     
    403426  in client behavior.
    404427
    405428
    406 == Security ==
     429Security
     430========
    407431
    408432The design goal for this project is that an attacker may be able to deny
    409433service (i.e. prevent you from recovering a file that was uploaded earlier)
    410434but can accomplish none of the following three attacks:
    411435
    412  1) violate confidentiality: the attacker gets to view data to which you have
    413     not granted them access
    414  2) violate integrity: the attacker convinces you that the wrong data is
    415     actually the data you were intending to retrieve
    416  3) violate unforgeability: the attacker gets to modify a mutable file or
    417     directory (either the pathnames or the file contents) to which you have
    418     not given them write permission
     4361) violate confidentiality: the attacker gets to view data to which you have
     437   not granted them access
     4382) violate integrity: the attacker convinces you that the wrong data is
     439   actually the data you were intending to retrieve
     4403) violate unforgeability: the attacker gets to modify a mutable file or
     441   directory (either the pathnames or the file contents) to which you have
     442   not given them write permission
    419443
    420444Integrity (the promise that the downloaded data will match the uploaded data)
    421445is provided by the hashes embedded in the capability (for immutable files) or
     
    467491capabilities).
    468492
    469493
    470 == Reliability ==
     494Reliability
     495===========
    471496
    472497File encoding and peer-node selection parameters can be adjusted to achieve
    473498different goals. Each choice results in a number of properties; there are
     
    532557view the disk consumption of each. It is also acquiring some sections with
    533558availability/reliability numbers, as well as preliminary cost analysis data.
    534559This tool will continue to evolve as our analysis improves.
    535 
    536 ------------------------------
    537 
    538 [1]: http://en.wikipedia.org/wiki/Zooko%27s_triangle
    539 
    540 [2]: all of these names are derived from the location where they were
    541      concocted, in this case in a car ride from Boulder to DEN. To be
    542      precise, "Tahoe 1" was an unworkable scheme in which everyone who holds
    543      shares for a given file would form a sort of cabal which kept track of
    544      all the others, "Tahoe 2" is the first-100-nodes in the permuted hash
    545      described in this document, and "Tahoe 3" (or perhaps "Potrero hill 1")
    546      was the abandoned ring-with-many-hands approach.
    547 
  • docs/backdoors.txt

    diff -rN -u old-tahoe-lafs/docs/backdoors.txt new-tahoe-lafs/docs/backdoors.txt
    old new  
    1 Statement on Backdoors
     1======================
     2Statement on Backdoors
     3======================
    24
    35October 5, 2010
    46
    5 The New York Times has recently reported that the current U.S. administration is proposing a bill that would apparently, if passed, require communication systems to facilitate government wiretapping and access to encrypted data:
     7The New York Times has recently reported that the current U.S. administration
     8is proposing a bill that would apparently, if passed, require communication
     9systems to facilitate government wiretapping and access to encrypted data:
    610
     711 http://www.nytimes.com/2010/09/27/us/27wiretap.html (login required; username/password pairs available at http://www.bugmenot.com/view/nytimes.com).
    812
    9 Commentary by the  Electronic Frontier Foundation (https://www.eff.org/deeplinks/2010/09/government-seeks ),  Peter Suderman / Reason (http://reason.com/blog/2010/09/27/obama-administration-frustrate ),  Julian Sanchez / Cato Institute (http://www.cato-at-liberty.org/designing-an-insecure-internet/ ).
    10 
    11 The core Tahoe developers promise never to change Tahoe-LAFS to facilitate government access to data stored or transmitted by it. Even if it were desirable to facilitate such access—which it is not—we believe it would not be technically feasible to do so without severely compromising Tahoe-LAFS' security against other attackers. There have been many examples in which backdoors intended for use by government have introduced vulnerabilities exploitable by other parties (a notable example being the Greek cellphone eavesdropping scandal in 2004/5). RFCs  1984 and  2804 elaborate on the security case against such backdoors.
    12 
    13 Note that since Tahoe-LAFS is open-source software, forks by people other than the current core developers are possible. In that event, we would try to persuade any such forks to adopt a similar policy.
      13Commentary by the Electronic Frontier Foundation
      14(https://www.eff.org/deeplinks/2010/09/government-seeks), Peter Suderman /
      15Reason (http://reason.com/blog/2010/09/27/obama-administration-frustrate),
      16Julian Sanchez / Cato Institute
      17(http://www.cato-at-liberty.org/designing-an-insecure-internet/).
     18
     19The core Tahoe developers promise never to change Tahoe-LAFS to facilitate
     20government access to data stored or transmitted by it. Even if it were
     21desirable to facilitate such access—which it is not—we believe it would not be
     22technically feasible to do so without severely compromising Tahoe-LAFS'
     23security against other attackers. There have been many examples in which
     24backdoors intended for use by government have introduced vulnerabilities
     25exploitable by other parties (a notable example being the Greek cellphone
      26eavesdropping scandal in 2004/5). RFCs 1984 and 2804 elaborate on the
     27security case against such backdoors.
     28
     29Note that since Tahoe-LAFS is open-source software, forks by people other than
     30the current core developers are possible. In that event, we would try to
     31persuade any such forks to adopt a similar policy.
    1432
    1533The following Tahoe-LAFS developers agree with this statement:
    1634
    1735David-Sarah Hopwood
     36
    1837Zooko Wilcox-O'Hearn
     38
    1939Brian Warner
     40
    2041Kevan Carstensen
     42
    2143Frédéric Marti
     44
    2245Jack Lloyd
     46
    2347François Deppierraz
     48
    2449Yu Xue
     50
    2551Marc Tooley
  • docs/backupdb.txt

    diff -rN -u old-tahoe-lafs/docs/backupdb.txt new-tahoe-lafs/docs/backupdb.txt
    old new  
    1 = The Tahoe BackupDB =
     1==================
     2The Tahoe BackupDB
     3==================
     4
     51.  `Overview`_
     62.  `Schema`_
     73.  `Upload Operation`_
     84.  `Directory Operations`_
    29
    3 == Overview ==
     10Overview
     11========
    412To speed up backup operations, Tahoe maintains a small database known as the
    513"backupdb". This is used to avoid re-uploading files which have already been
    614uploaded recently.
     
     3341as Debian etch (4.0 "oldstable") or Ubuntu Edgy (6.10), the "python-pysqlite2"
    3442package won't work, but the "sqlite3-dev" package will.
    3543
    36 == Schema ==
     44Schema
     45======
    3746
    38 The database contains the following tables:
     47The database contains the following tables::
    3948
    40 CREATE TABLE version
    41 (
    42  version integer  # contains one row, set to 1
    43 );
    44 
    45 CREATE TABLE local_files
    46 (
    47  path  varchar(1024),  PRIMARY KEY -- index, this is os.path.abspath(fn)
    48  size  integer,         -- os.stat(fn)[stat.ST_SIZE]
    49  mtime number,          -- os.stat(fn)[stat.ST_MTIME]
    50  ctime number,          -- os.stat(fn)[stat.ST_CTIME]
    51  fileid integer
    52 );
    53 
    54 CREATE TABLE caps
    55 (
    56  fileid integer PRIMARY KEY AUTOINCREMENT,
    57  filecap varchar(256) UNIQUE    -- URI:CHK:...
    58 );
    59 
    60 CREATE TABLE last_upload
    61 (
    62  fileid INTEGER PRIMARY KEY,
    63  last_uploaded TIMESTAMP,
    64  last_checked TIMESTAMP
    65 );
    66 
    67 CREATE TABLE directories
    68 (
    69  dirhash varchar(256) PRIMARY KEY,
    70  dircap varchar(256),
    71  last_uploaded TIMESTAMP,
    72  last_checked TIMESTAMP
    73 );
     49  CREATE TABLE version
     50  (
      51   version integer  -- contains one row, set to 1
     52  );
     53 
     54  CREATE TABLE local_files
     55  (
     56   path  varchar(1024),  PRIMARY KEY -- index, this is os.path.abspath(fn)
     57   size  integer,         -- os.stat(fn)[stat.ST_SIZE]
     58   mtime number,          -- os.stat(fn)[stat.ST_MTIME]
     59   ctime number,          -- os.stat(fn)[stat.ST_CTIME]
     60   fileid integer
     61  );
     62 
     63  CREATE TABLE caps
     64  (
     65   fileid integer PRIMARY KEY AUTOINCREMENT,
     66   filecap varchar(256) UNIQUE    -- URI:CHK:...
     67  );
     68 
     69  CREATE TABLE last_upload
     70  (
     71   fileid INTEGER PRIMARY KEY,
     72   last_uploaded TIMESTAMP,
     73   last_checked TIMESTAMP
     74  );
     75 
     76  CREATE TABLE directories
     77  (
     78   dirhash varchar(256) PRIMARY KEY,
     79   dircap varchar(256),
     80   last_uploaded TIMESTAMP,
     81   last_checked TIMESTAMP
     82  );
    7483
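    For illustration, the database can be created with Python's stdlib sqlite3
    module (the filename is made up, and SCHEMA would hold the statements
    above; only one table is shown here)::

      import sqlite3

      SCHEMA = """
      CREATE TABLE version
      (
       version integer  -- contains one row, set to 1
      );
      """

      db = sqlite3.connect("backupdb.sqlite")
      db.executescript(SCHEMA)
      db.execute("INSERT INTO version VALUES (1)")
      db.commit()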
    75 == Upload Operation ==
     84Upload Operation
     85================
    7686
    7787The upload process starts with a pathname (like ~/.emacs) and wants to end up
    7888with a file-cap (like URI:CHK:...).
     
    8292is not present in this table, the file must be uploaded. The upload process
     8393is (sketched in code below):
    8494
    85  1. record the file's size, creation time, and modification time
    86  2. upload the file into the grid, obtaining an immutable file read-cap
    87  3. add an entry to the 'caps' table, with the read-cap, to get a fileid
    88  4. add an entry to the 'last_upload' table, with the current time
    89  5. add an entry to the 'local_files' table, with the fileid, the path,
    90     and the local file's size/ctime/mtime
     951. record the file's size, creation time, and modification time
     96
     972. upload the file into the grid, obtaining an immutable file read-cap
     98
     993. add an entry to the 'caps' table, with the read-cap, to get a fileid
     100
     1014. add an entry to the 'last_upload' table, with the current time
     102
     1035. add an entry to the 'local_files' table, with the fileid, the path,
     104   and the local file's size/ctime/mtime
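    A sketch of those five steps (hypothetical code: upload_to_grid stands in
    for the real grid upload and is not an actual API)::

      import os, stat, time

      def remember_upload(db, path, upload_to_grid):
          s = os.stat(path)                                   # step 1
          filecap = upload_to_grid(path)                      # step 2: URI:CHK:...
          cur = db.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
          fileid = cur.lastrowid                              # step 3
          now = time.time()
          db.execute("INSERT INTO last_upload VALUES (?,?,?)",
                     (fileid, now, now))                      # step 4
          db.execute("INSERT INTO local_files VALUES (?,?,?,?,?)",
                     (os.path.abspath(path), s[stat.ST_SIZE],
                      s[stat.ST_MTIME], s[stat.ST_CTIME], fileid))   # step 5
          db.commit()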
    91105
    92106If the path *is* present in 'local_files', the easy-to-compute identifying
    93107information is compared: file size and ctime/mtime. If these differ, the file
     
     140154into the grid. The --no-timestamps option can be used to disable this optimization,
    141155forcing every byte of the file to be hashed and encoded.
    142156
    143 == Directory Operations ==
     157Directory Operations
     158====================
    144159
    145160Once the contents of a directory are known (a filecap for each file, and a
    146161dircap for each directory), the backup process must find or create a tahoe
  • docs/configuration.txt

    diff -rN -u old-tahoe-lafs/docs/configuration.txt new-tahoe-lafs/docs/configuration.txt
    old new  
    1 
    2 = Configuring a Tahoe node =
     1========================
     2Configuring a Tahoe node
     3========================
     4
     51.  `Overall Node Configuration`_
     62.  `Client Configuration`_
     73.  `Storage Server Configuration`_
     84.  `Running A Helper`_
     95.  `Running An Introducer`_
     106.  `Other Files in BASEDIR`_
     117.  `Other files`_
     128.  `Backwards Compatibility Files`_
     139.  `Example`_
    314
    415A Tahoe node is configured by writing to files in its base directory. These
    516files are read by the node when it starts, so each time you change them, you
     
    2233
    2334The item descriptions below use the following types:
    2435
    25  boolean: one of (True, yes, on, 1, False, off, no, 0), case-insensitive
    26  strports string: a Twisted listening-port specification string, like "tcp:80"
    27                   or "tcp:3456:interface=127.0.0.1". For a full description of
    28                   the format, see
    29                   http://twistedmatrix.com/documents/current/api/twisted.application.strports.html
    30  FURL string: a Foolscap endpoint identifier, like
    31               pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm
     36boolean
     37    one of (True, yes, on, 1, False, off, no, 0), case-insensitive
     38
     39strports string
     40    a Twisted listening-port specification string, like "tcp:80"
     41    or "tcp:3456:interface=127.0.0.1". For a full description of
     42    the format, see
     43    http://twistedmatrix.com/documents/current/api/twisted.application.strports.html
     44
     45FURL string
     46    a Foolscap endpoint identifier, like
     47    pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm
    3248
    3349
    34 == Overall Node Configuration ==
     50Overall Node Configuration
     51==========================
    3552
    3653This section controls the network behavior of the node overall: which ports
    3754and IP addresses are used, when connections are timed out, etc. This
     
    4360that port number in the tub.port option. If behind a NAT, you *may* need to
    4461set the tub.location option described below.
    4562
     63::
    4664
    47 [node]
     65  [node]
    4866
    49 nickname = (UTF-8 string, optional)
     67  nickname = (UTF-8 string, optional)
    5068
    51  This value will be displayed in management tools as this node's "nickname".
    52  If not provided, the nickname will be set to "<unspecified>". This string
    53  shall be a UTF-8 encoded unicode string.
    54 
    55 web.port = (strports string, optional)
    56 
    57  This controls where the node's webserver should listen, providing filesystem
    58  access and node status as defined in webapi.txt . This file contains a
    59  Twisted "strports" specification such as "3456" or
    60  "tcp:3456:interface=127.0.0.1". The 'tahoe create-node' or 'tahoe create-client'
    61  commands set the web.port to "tcp:3456:interface=127.0.0.1" by default; this
    62  is overridable by the "--webport" option. You can make it use SSL by writing
    63  "ssl:3456:privateKey=mykey.pem:certKey=cert.pem" instead.
    64 
    65  If this is not provided, the node will not run a web server.
    66 
    67 web.static = (string, optional)
    68 
    69  This controls where the /static portion of the URL space is served. The
    70  value is a directory name (~username is allowed, and non-absolute names are
    71  interpreted relative to the node's basedir) which can contain HTML and other
    72  files. This can be used to serve a javascript-based frontend to the Tahoe
    73  node, or other services.
    74 
    75  The default value is "public_html", which will serve $BASEDIR/public_html .
    76  With the default settings, http://127.0.0.1:3456/static/foo.html will serve
    77  the contents of $BASEDIR/public_html/foo.html .
    78 
    79 tub.port = (integer, optional)
    80 
    81  This controls which port the node uses to accept Foolscap connections from
    82  other nodes. If not provided, the node will ask the kernel for any available
    83  port. The port will be written to a separate file (named client.port or
    84  introducer.port), so that subsequent runs will re-use the same port.
    85 
    86 tub.location = (string, optional)
    87 
    88  In addition to running as a client, each Tahoe node also runs as a server,
    89  listening for connections from other Tahoe clients. The node announces its
    90  location by publishing a "FURL" (a string with some connection hints) to the
    91  Introducer. The string it publishes can be found in
    92  $BASEDIR/private/storage.furl . The "tub.location" configuration controls
    93  what location is published in this announcement.
    94 
    95  If you don't provide tub.location, the node will try to figure out a useful
    96  one by itself, by using tools like 'ifconfig' to determine the set of IP
    97  addresses on which it can be reached from nodes both near and far. It will
    98  also include the TCP port number on which it is listening (either the one
    99  specified by tub.port, or whichever port was assigned by the kernel when
    100  tub.port is left unspecified).
    101 
    102  You might want to override this value if your node lives behind a firewall
    103  that is doing inbound port forwarding, or if you are using other proxies
    104  such that the local IP address or port number is not the same one that
    105  remote clients should use to connect. You might also want to control this
    106  when using a Tor proxy to avoid revealing your actual IP address through the
    107  Introducer announcement.
    108 
    109  The value is a comma-separated string of host:port location hints, like
    110  this:
    111 
    112   123.45.67.89:8098,tahoe.example.com:8098,127.0.0.1:8098
    113 
    114  A few examples:
    115 
    116   Emulate default behavior, assuming your host has IP address 123.45.67.89
    117   and the kernel-allocated port number was 8098:
    118 
    119    tub.port = 8098
    120    tub.location = 123.45.67.89:8098,127.0.0.1:8098
    121 
    122   Use a DNS name so you can change the IP address more easily:
    123 
    124    tub.port = 8098
    125    tub.location = tahoe.example.com:8098
    126 
    127   Run a node behind a firewall (which has an external IP address) that has
    128   been configured to forward port 7912 to our internal node's port 8098:
    129 
    130    tub.port = 8098
    131    tub.location = external-firewall.example.com:7912
    132 
    133   Run a node behind a Tor proxy (perhaps via torsocks), in client-only mode
    134   (i.e. we can make outbound connections, but other nodes will not be able to
    135   connect to us). The literal 'unreachable.example.org' will not resolve, but
    136   will serve as a reminder to human observers that this node cannot be
    137   reached. "Don't call us.. we'll call you":
    138 
    139    tub.port = 8098
    140    tub.location = unreachable.example.org:0
    141 
    142   Run a node behind a Tor proxy, and make the server available as a Tor
    143   "hidden service". (this assumes that other clients are running their node
    144   with torsocks, such that they are prepared to connect to a .onion address).
    145   The hidden service must first be configured in Tor, by giving it a local
    146   port number and then obtaining a .onion name, using something in the torrc
    147   file like:
    148 
    149     HiddenServiceDir /var/lib/tor/hidden_services/tahoe
    150     HiddenServicePort 29212 127.0.0.1:8098
    151 
    152    once Tor is restarted, the .onion hostname will be in
    153    /var/lib/tor/hidden_services/tahoe/hostname . Then set up your tahoe.cfg
    154    like:
    155 
    156     tub.port = 8098
    157     tub.location = ualhejtq2p7ohfbb.onion:29212
    158 
    159  Most users will not need to set tub.location .
    160 
    161  Note that the old 'advertised_ip_addresses' file from earlier releases is no
    162  longer supported. Tahoe 1.3.0 and later will ignore this file.
    163 
    164 log_gatherer.furl = (FURL, optional)
    165 
    166  If provided, this contains a single FURL string which is used to contact a
    167  'log gatherer', which will be granted access to the logport. This can be
    168  used by centralized storage meshes to gather operational logs in a single
    169  place. Note that when an old-style BASEDIR/log_gatherer.furl file exists
    170  (see 'Backwards Compatibility Files', below), both are used. (for most other
    171  items, the separate config file overrides the entry in tahoe.cfg)
    172 
    173 timeout.keepalive = (integer in seconds, optional)
    174 timeout.disconnect = (integer in seconds, optional)
    175 
    176  If timeout.keepalive is provided, it is treated as an integral number of
    177  seconds, and sets the Foolscap "keepalive timer" to that value. For each
    178  connection to another node, if nothing has been heard for a while, we will
    179  attempt to provoke the other end into saying something. The duration of
    180  silence that passes before sending the PING will be between KT and 2*KT.
    181  This is mainly intended to keep NAT boxes from expiring idle TCP sessions,
    182  but also gives TCP's long-duration keepalive/disconnect timers some traffic
    183  to work with. The default value is 240 (i.e. 4 minutes).
    184 
    185  If timeout.disconnect is provided, this is treated as an integral number of
    186  seconds, and sets the Foolscap "disconnect timer" to that value. For each
    187  connection to another node, if nothing has been heard for a while, we will
    188  drop the connection. The duration of silence that passes before dropping the
    189  connection will be between DT-2*KT and 2*DT+2*KT (please see ticket #521 for
    190  more details). If we are sending a large amount of data to the other end
    191  (which takes more than DT-2*KT to deliver), we might incorrectly drop the
    192  connection. The default behavior (when this value is not provided) is to
    193  disable the disconnect timer.
    194 
    195  See ticket #521 for a discussion of how to pick these timeout values. Using
    196  30 minutes means we'll disconnect after 22 to 68 minutes of inactivity.
    197  Receiving data will reset this timeout, however if we have more than 22min
    198  of data in the outbound queue (such as 800kB in two pipelined segments of 10
    199  shares each) and the far end has no need to contact us, our ping might be
    200  delayed, so we may disconnect them by accident.
    201 
    202 ssh.port = (strports string, optional)
    203 ssh.authorized_keys_file = (filename, optional)
    204 
    205  This enables an SSH-based interactive Python shell, which can be used to
    206  inspect the internal state of the node, for debugging. To cause the node to
    207  accept SSH connections on port 8022 from the same keys as the rest of your
    208  account, use:
    209 
    210    [tub]
    211    ssh.port = 8022
    212    ssh.authorized_keys_file = ~/.ssh/authorized_keys
    213 
    214 tempdir = (string, optional)
    215 
    216  This specifies a temporary directory for the webapi server to use, for
    217  holding large files while they are being uploaded. If a webapi client
    218  attempts to upload a 10GB file, this tempdir will need to have at least 10GB
    219  available for the upload to complete.
    220 
    221  The default value is the "tmp" directory in the node's base directory (i.e.
    222  $NODEDIR/tmp), but it can be placed elsewhere. This directory is used for
    223  files that usually (on a unix system) go into /tmp . The string will be
    224  interpreted relative to the node's base directory.
    225 
    226 == Client Configuration ==
    227 
    228 [client]
    229 introducer.furl = (FURL string, mandatory)
    230 
    231  This FURL tells the client how to connect to the introducer. Each Tahoe grid
    232  is defined by an introducer. The introducer's furl is created by the
    233  introducer node and written into its base directory when it starts,
    234  whereupon it should be published to everyone who wishes to attach a client
    235  to that grid
    236 
    237 helper.furl = (FURL string, optional)
    238 
    239  If provided, the node will attempt to connect to and use the given helper
    240  for uploads. See docs/helper.txt for details.
    241 
    242 key_generator.furl = (FURL string, optional)
    243 
    244  If provided, the node will attempt to connect to and use the given
    245  key-generator service, using RSA keys from the external process rather than
    246  generating its own.
    247 
    248 stats_gatherer.furl = (FURL string, optional)
    249 
    250  If provided, the node will connect to the given stats gatherer and provide
    251  it with operational statistics.
    252 
    253 shares.needed = (int, optional) aka "k", default 3
    254 shares.total = (int, optional) aka "N", N >= k, default 10
    255 shares.happy = (int, optional) 1 <= happy <= N, default 7
    256 
    257  These three values set the default encoding parameters. Each time a new file
    258  is uploaded, erasure-coding is used to break the ciphertext into separate
    259  pieces. There will be "N" (i.e. shares.total) pieces created, and the file
    260  will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
    261  The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
    262  Setting k to 1 is equivalent to simple replication (uploading N copies of
    263  the file).
    264 
    265  These values control the tradeoff between storage overhead, performance, and
    266  reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
    267  backend storage space (the actual value will be a bit more, because of other
    268  forms of overhead). Up to N-k shares can be lost before the file becomes
    269  unrecoverable, so assuming there are at least N servers, up to N-k servers
    270  can be offline without losing the file. So large N/k ratios are more
    271  reliable, and small N/k ratios use less disk space. Clearly, k must never be
    272  smaller than N.
    273 
    274  Large values of N will slow down upload operations slightly, since more
    275  servers must be involved, and will slightly increase storage overhead due to
    276  the hash trees that are created. Large values of k will cause downloads to
    277  be marginally slower, because more servers must be involved. N cannot be
    278  larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
    279  uses.
    280 
    281  shares.happy allows you control over the distribution of your immutable file.
    282  For a successful upload, shares are guaranteed to be initially placed on
    283  at least 'shares.happy' distinct servers, the correct functioning of any
    284  k of which is sufficient to guarantee the availability of the uploaded file.
    285  This value should not be larger than the number of servers on your grid.
    286  
    287  A value of shares.happy <= k is allowed, but does not provide any redundancy
    288  if some servers fail or lose shares.
    289 
    290  (Mutable files use a different share placement algorithm that does not
    291   consider this parameter.)
    292 
    293 
    294 == Storage Server Configuration ==
    295 
    296 [storage]
    297 enabled = (boolean, optional)
    298 
    299  If this is True, the node will run a storage server, offering space to other
    300  clients. If it is False, the node will not run a storage server, meaning
    301  that no shares will be stored on this node. Use False this for clients who
    302  do not wish to provide storage service. The default value is True.
    303 
    304 readonly = (boolean, optional)
    305 
    306  If True, the node will run a storage server but will not accept any shares,
    307  making it effectively read-only. Use this for storage servers which are
    308  being decommissioned: the storage/ directory could be mounted read-only,
    309  while shares are moved to other servers. Note that this currently only
    310  affects immutable shares. Mutable shares (used for directories) will be
    311  written and modified anyway. See ticket #390 for the current status of this
    312  bug. The default value is False.
    313 
    314 reserved_space = (str, optional)
    315 
    316  If provided, this value defines how much disk space is reserved: the storage
    317  server will not accept any share which causes the amount of free disk space
    318  to drop below this value. (The free space is measured by a call to statvfs(2)
    319  on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
    320  user account under which the storage server runs.)
    321 
    322  This string contains a number, with an optional case-insensitive scale
    323  suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
    324  "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
    325  thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
    326 
    327 expire.enabled =
    328 expire.mode =
    329 expire.override_lease_duration =
    330 expire.cutoff_date =
    331 expire.immutable =
    332 expire.mutable =
    333 
    334  These settings control garbage-collection, in which the server will delete
    335  shares that no longer have an up-to-date lease on them. Please see the
    336  neighboring "garbage-collection.txt" document for full details.
     69    This value will be displayed in management tools as this node's "nickname".
     70    If not provided, the nickname will be set to "<unspecified>". This string
     71    shall be a UTF-8 encoded unicode string.
     72
     73  web.port = (strports string, optional)
     74
     75    This controls where the node's webserver should listen, providing filesystem
     76    access and node status as defined in webapi.txt . This file contains a
     77    Twisted "strports" specification such as "3456" or
     78    "tcp:3456:interface=127.0.0.1". The 'tahoe create-node' or 'tahoe create-client'
     79    commands set the web.port to "tcp:3456:interface=127.0.0.1" by default; this
     80    is overridable by the "--webport" option. You can make it use SSL by writing
     81    "ssl:3456:privateKey=mykey.pem:certKey=cert.pem" instead.
     82   
     83    If this is not provided, the node will not run a web server.
     84
     85  web.static = (string, optional)
     86
     87    This controls where the /static portion of the URL space is served. The
     88    value is a directory name (~username is allowed, and non-absolute names are
     89    interpreted relative to the node's basedir) which can contain HTML and other
     90    files. This can be used to serve a javascript-based frontend to the Tahoe
     91    node, or other services.
     92   
     93    The default value is "public_html", which will serve $BASEDIR/public_html .
     94    With the default settings, http://127.0.0.1:3456/static/foo.html will serve
     95    the contents of $BASEDIR/public_html/foo.html .
     96
     97  tub.port = (integer, optional)
     98
     99    This controls which port the node uses to accept Foolscap connections from
     100    other nodes. If not provided, the node will ask the kernel for any available
     101    port. The port will be written to a separate file (named client.port or
     102    introducer.port), so that subsequent runs will re-use the same port.
     103
     104  tub.location = (string, optional)
     105
     106    In addition to running as a client, each Tahoe node also runs as a server,
     107    listening for connections from other Tahoe clients. The node announces its
     108    location by publishing a "FURL" (a string with some connection hints) to the
     109    Introducer. The string it publishes can be found in
     110    $BASEDIR/private/storage.furl . The "tub.location" configuration controls
     111    what location is published in this announcement.
     112   
     113    If you don't provide tub.location, the node will try to figure out a useful
     114    one by itself, by using tools like 'ifconfig' to determine the set of IP
     115    addresses on which it can be reached from nodes both near and far. It will
     116    also include the TCP port number on which it is listening (either the one
     117    specified by tub.port, or whichever port was assigned by the kernel when
     118    tub.port is left unspecified).
     119   
     120    You might want to override this value if your node lives behind a firewall
     121    that is doing inbound port forwarding, or if you are using other proxies
     122    such that the local IP address or port number is not the same one that
     123    remote clients should use to connect. You might also want to control this
     124    when using a Tor proxy to avoid revealing your actual IP address through the
     125    Introducer announcement.
     126   
     127    The value is a comma-separated string of host:port location hints, like
     128    this:
     129
     130      123.45.67.89:8098,tahoe.example.com:8098,127.0.0.1:8098
     131
     132    A few examples:
     133
     134      Emulate default behavior, assuming your host has IP address 123.45.67.89
     135      and the kernel-allocated port number was 8098:
     136   
     137        tub.port = 8098
     138        tub.location = 123.45.67.89:8098,127.0.0.1:8098
     139   
     140      Use a DNS name so you can change the IP address more easily:
     141   
     142        tub.port = 8098
     143        tub.location = tahoe.example.com:8098
     144   
     145      Run a node behind a firewall (which has an external IP address) that has
     146      been configured to forward port 7912 to our internal node's port 8098:
     147   
     148        tub.port = 8098
     149        tub.location = external-firewall.example.com:7912
     150   
     151      Run a node behind a Tor proxy (perhaps via torsocks), in client-only mode
     152      (i.e. we can make outbound connections, but other nodes will not be able to
     153      connect to us). The literal 'unreachable.example.org' will not resolve, but
     154      will serve as a reminder to human observers that this node cannot be
     155      reached. "Don't call us.. we'll call you":
     156   
     157        tub.port = 8098
     158        tub.location = unreachable.example.org:0
     159   
     160      Run a node behind a Tor proxy, and make the server available as a Tor
     161      "hidden service". (this assumes that other clients are running their node
     162      with torsocks, such that they are prepared to connect to a .onion address).
     163      The hidden service must first be configured in Tor, by giving it a local
     164      port number and then obtaining a .onion name, using something in the torrc
     165      file like:
     166   
     167        HiddenServiceDir /var/lib/tor/hidden_services/tahoe
     168        HiddenServicePort 29212 127.0.0.1:8098
     169   
     170      once Tor is restarted, the .onion hostname will be in
     171      /var/lib/tor/hidden_services/tahoe/hostname . Then set up your tahoe.cfg
     172      like:
     173   
     174        tub.port = 8098
     175        tub.location = ualhejtq2p7ohfbb.onion:29212
     176   
     177    Most users will not need to set tub.location .
     178   
     179    Note that the old 'advertised_ip_addresses' file from earlier releases is no
     180    longer supported. Tahoe 1.3.0 and later will ignore this file.
     181
     182  log_gatherer.furl = (FURL, optional)
     183
     184    If provided, this contains a single FURL string which is used to contact a
     185    'log gatherer', which will be granted access to the logport. This can be
     186    used by centralized storage meshes to gather operational logs in a single
     187    place. Note that when an old-style BASEDIR/log_gatherer.furl file exists
     188    (see 'Backwards Compatibility Files', below), both are used. (for most other
     189    items, the separate config file overrides the entry in tahoe.cfg)
     190
     191  timeout.keepalive = (integer in seconds, optional)
     192  timeout.disconnect = (integer in seconds, optional)
     193
     194    If timeout.keepalive is provided, it is treated as an integral number of
     195    seconds, and sets the Foolscap "keepalive timer" to that value. For each
     196    connection to another node, if nothing has been heard for a while, we will
     197    attempt to provoke the other end into saying something. The duration of
     198    silence that passes before sending the PING will be between KT and 2*KT.
     199    This is mainly intended to keep NAT boxes from expiring idle TCP sessions,
     200    but also gives TCP's long-duration keepalive/disconnect timers some traffic
     201    to work with. The default value is 240 (i.e. 4 minutes).
     202   
     203    If timeout.disconnect is provided, this is treated as an integral number of
     204    seconds, and sets the Foolscap "disconnect timer" to that value. For each
     205    connection to another node, if nothing has been heard for a while, we will
     206    drop the connection. The duration of silence that passes before dropping the
     207    connection will be between DT-2*KT and 2*DT+2*KT (please see ticket #521 for
     208    more details). If we are sending a large amount of data to the other end
     209    (which takes more than DT-2*KT to deliver), we might incorrectly drop the
     210    connection. The default behavior (when this value is not provided) is to
     211    disable the disconnect timer.
     212   
     213    See ticket #521 for a discussion of how to pick these timeout values. Using
     214    30 minutes means we'll disconnect after 22 to 68 minutes of inactivity.
      215    Receiving data will reset this timeout; however, if we have more than 22min
     216    of data in the outbound queue (such as 800kB in two pipelined segments of 10
     217    shares each) and the far end has no need to contact us, our ping might be
     218    delayed, so we may disconnect them by accident.
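    Checking that arithmetic (using the default KT of 240 seconds and a DT of
    30 minutes)::

      KT = 240                       # timeout.keepalive, the default (seconds)
      DT = 30 * 60                   # timeout.disconnect (seconds)
      minimum = (DT - 2*KT) / 60     # -> 22 minutes of silence
      maximum = (2*DT + 2*KT) / 60   # -> 68 minutes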
     219
     220  ssh.port = (strports string, optional)
     221  ssh.authorized_keys_file = (filename, optional)
     222
     223    This enables an SSH-based interactive Python shell, which can be used to
     224    inspect the internal state of the node, for debugging. To cause the node to
     225    accept SSH connections on port 8022 from the same keys as the rest of your
     226    account, use:
     227   
     228      [tub]
     229      ssh.port = 8022
     230      ssh.authorized_keys_file = ~/.ssh/authorized_keys
     231
     232  tempdir = (string, optional)
     233
     234    This specifies a temporary directory for the webapi server to use, for
     235    holding large files while they are being uploaded. If a webapi client
     236    attempts to upload a 10GB file, this tempdir will need to have at least 10GB
     237    available for the upload to complete.
     238   
     239    The default value is the "tmp" directory in the node's base directory (i.e.
     240    $NODEDIR/tmp), but it can be placed elsewhere. This directory is used for
     241    files that usually (on a unix system) go into /tmp . The string will be
     242    interpreted relative to the node's base directory.
     243
     244Client Configuration
     245====================
     246
     247::
     248
     249  [client]
     250  introducer.furl = (FURL string, mandatory)
     251 
     252    This FURL tells the client how to connect to the introducer. Each Tahoe grid
     253    is defined by an introducer. The introducer's furl is created by the
     254    introducer node and written into its base directory when it starts,
     255    whereupon it should be published to everyone who wishes to attach a client
      256    to that grid.
     257 
     258  helper.furl = (FURL string, optional)
     259 
     260    If provided, the node will attempt to connect to and use the given helper
     261    for uploads. See docs/helper.txt for details.
     262 
     263  key_generator.furl = (FURL string, optional)
     264 
     265    If provided, the node will attempt to connect to and use the given
     266    key-generator service, using RSA keys from the external process rather than
     267    generating its own.
     268 
     269  stats_gatherer.furl = (FURL string, optional)
     270 
     271    If provided, the node will connect to the given stats gatherer and provide
     272    it with operational statistics.
     273 
     274  shares.needed = (int, optional) aka "k", default 3
     275  shares.total = (int, optional) aka "N", N >= k, default 10
     276  shares.happy = (int, optional) 1 <= happy <= N, default 7
     277 
     278    These three values set the default encoding parameters. Each time a new file
     279    is uploaded, erasure-coding is used to break the ciphertext into separate
     280    pieces. There will be "N" (i.e. shares.total) pieces created, and the file
     281    will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
     282    The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
     283    Setting k to 1 is equivalent to simple replication (uploading N copies of
     284    the file).
     285 
     286    These values control the tradeoff between storage overhead, performance, and
     287    reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
     288    backend storage space (the actual value will be a bit more, because of other
     289    forms of overhead). Up to N-k shares can be lost before the file becomes
     290    unrecoverable, so assuming there are at least N servers, up to N-k servers
     291    can be offline without losing the file. So large N/k ratios are more
     292    reliable, and small N/k ratios use less disk space. Clearly, k must never be
      293    larger than N.
     294   
     295    Large values of N will slow down upload operations slightly, since more
     296    servers must be involved, and will slightly increase storage overhead due to
     297    the hash trees that are created. Large values of k will cause downloads to
     298    be marginally slower, because more servers must be involved. N cannot be
     299    larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
     300    uses.
     301   
      302    shares.happy allows you to control the distribution of your immutable file.
     303    For a successful upload, shares are guaranteed to be initially placed on
     304    at least 'shares.happy' distinct servers, the correct functioning of any
     305    k of which is sufficient to guarantee the availability of the uploaded file.
     306    This value should not be larger than the number of servers on your grid.
     307   
     308    A value of shares.happy <= k is allowed, but does not provide any redundancy
     309    if some servers fail or lose shares.
     310   
     311    (Mutable files use a different share placement algorithm that does not
     312    consider this parameter.)
     313
     314
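    A worked example of the expansion factor with the default 3-of-10 encoding
    (a sketch that ignores the small hash-tree overhead noted above)::

      k, N = 3, 10                 # shares.needed, shares.total
      file_size = 1000000          # 1MB of ciphertext
      share_size = file_size / k   # each share holds 1/k of the file
      grid_usage = N * share_size  # ~3.33MB stored for a 1MB file
      tolerated_losses = N - k     # up to 7 shares (or servers) can be lost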
     315Storage Server Configuration
     316============================
     317
     318::
     319
     320  [storage]
     321  enabled = (boolean, optional)
     322 
     323    If this is True, the node will run a storage server, offering space to other
     324    clients. If it is False, the node will not run a storage server, meaning
      325    that no shares will be stored on this node. Use False for clients who
     326    do not wish to provide storage service. The default value is True.
     327 
     328  readonly = (boolean, optional)
     329 
     330    If True, the node will run a storage server but will not accept any shares,
     331    making it effectively read-only. Use this for storage servers which are
     332    being decommissioned: the storage/ directory could be mounted read-only,
     333    while shares are moved to other servers. Note that this currently only
     334    affects immutable shares. Mutable shares (used for directories) will be
     335    written and modified anyway. See ticket #390 for the current status of this
     336    bug. The default value is False.
     337 
     338  reserved_space = (str, optional)
     339 
     340    If provided, this value defines how much disk space is reserved: the storage
     341    server will not accept any share which causes the amount of free disk space
     342    to drop below this value. (The free space is measured by a call to statvfs(2)
     343    on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
     344    user account under which the storage server runs.)
     345   
     346    This string contains a number, with an optional case-insensitive scale
     347    suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
     348    "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
     349    thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
     350 
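    A sketch of a parser for these size strings (not Tahoe's actual
    implementation; behavior beyond the examples above is guesswork)::

      import re

      def parse_size(s):
          # "100MB" -> 100000000, "1MiB" -> 1048576, "100000kb" -> 100000000
          m = re.match(r"^\s*(\d+)\s*([kmgKMG]?)(i?[bB]?)\s*$", s)
          if m is None:
              raise ValueError("unparseable size: %r" % (s,))
          digits, scale, rest = m.groups()
          base = 1024 if rest.lower().startswith("i") else 1000
          power = "kmg".index(scale.lower()) + 1 if scale else 0
          return int(digits) * base ** power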
     351  expire.enabled =
     352  expire.mode =
     353  expire.override_lease_duration =
     354  expire.cutoff_date =
     355  expire.immutable =
     356  expire.mutable =
     357 
     358    These settings control garbage-collection, in which the server will delete
     359    shares that no longer have an up-to-date lease on them. Please see the
     360    neighboring "garbage-collection.txt" document for full details.
    337361
    338362
    339 == Running A Helper ==
     363Running A Helper
     364================
    340365
    341366A "helper" is a regular client node that also offers the "upload helper"
    342367service.
    343368
    344 [helper]
    345 enabled = (boolean, optional)
     369::
    346370
    347  If True, the node will run a helper (see docs/helper.txt for details). The
    348  helper's contact FURL will be placed in private/helper.furl, from which it
    349  can be copied to any clients which wish to use it. Clearly nodes should not
    350  both run a helper and attempt to use one: do not create both helper.furl and
    351  run_helper in the same node. The default is False.
     371  [helper]
     372  enabled = (boolean, optional)
     373 
     374    If True, the node will run a helper (see docs/helper.txt for details). The
     375    helper's contact FURL will be placed in private/helper.furl, from which it
     376    can be copied to any clients which wish to use it. Clearly nodes should not
     377    both run a helper and attempt to use one: do not create both helper.furl and
     378    run_helper in the same node. The default is False.
    352379
    353380
    354 == Running An Introducer ==
     381Running An Introducer
     382=====================
    355383
    356384The introducer node uses a different '.tac' file (named introducer.tac), and
    357385pays attention to the "[node]" section, but not the others.
     
    365393copied into new client nodes before they are started for the first time.
    366394
    367395
    368 == Other Files in BASEDIR ==
     396Other Files in BASEDIR
     397======================
    369398
    370399Some configuration is not kept in tahoe.cfg, for the following reasons:
    371400
    372  * it is generated by the node at startup, e.g. encryption keys. The node
    373    never writes to tahoe.cfg
    374  * it is generated by user action, e.g. the 'tahoe create-alias' command
     401* it is generated by the node at startup, e.g. encryption keys. The node
     402  never writes to tahoe.cfg
     403* it is generated by user action, e.g. the 'tahoe create-alias' command
    375404
    376405In addition, non-configuration persistent state is kept in the node's base
    377406directory, next to the configuration knobs.
    378407
    379408This section describes these other files.
    380409
    381 
    382 private/node.pem : This contains an SSL private-key certificate. The node
    383 generates this the first time it is started, and re-uses it on subsequent
    384 runs. This certificate allows the node to have a cryptographically-strong
    385 identifier (the Foolscap "TubID"), and to establish secure connections to
    386 other nodes.
    387 
    388 storage/ : Nodes which host StorageServers will create this directory to hold
    389 shares of files on behalf of other clients. There will be a directory
    390 underneath it for each StorageIndex for which this node is holding shares.
    391 There is also an "incoming" directory where partially-completed shares are
    392 held while they are being received.
    393 
    394 client.tac : this file defines the client, by constructing the actual Client
    395 instance each time the node is started. It is used by the 'twistd'
    396 daemonization program (in the "-y" mode), which is run internally by the
    397 "tahoe start" command. This file is created by the "tahoe create-node" or
    398 "tahoe create-client" commands.
    399 
    400 private/control.furl : this file contains a FURL that provides access to a
    401 control port on the client node, from which files can be uploaded and
    402 downloaded. This file is created with permissions that prevent anyone else
    403 from reading it (on operating systems that support such a concept), to insure
    404 that only the owner of the client node can use this feature. This port is
    405 intended for debugging and testing use.
    406 
    407 private/logport.furl : this file contains a FURL that provides access to a
    408 'log port' on the client node, from which operational logs can be retrieved.
    409 Do not grant logport access to strangers, because occasionally secret
    410 information may be placed in the logs.
    411 
    412 private/helper.furl : if the node is running a helper (for use by other
    413 clients), its contact FURL will be placed here. See docs/helper.txt for more
    414 details.
    415 
    416 private/root_dir.cap (optional): The command-line tools will read a directory
    417 cap out of this file and use it, if you don't specify a '--dir-cap' option or
    418 if you specify '--dir-cap=root'.
    419 
    420 private/convergence (automatically generated): An added secret for encrypting
    421 immutable files. Everyone who has this same string in their
    422 private/convergence file encrypts their immutable files in the same way when
    423 uploading them. This causes identical files to "converge" -- to share the
    424 same storage space since they have identical ciphertext -- which conserves
    425 space and optimizes upload time, but it also exposes files to the possibility
    426 of a brute-force attack by people who know that string. In this attack, if
    427 the attacker can guess most of the contents of a file, then they can use
    428 brute-force to learn the remaining contents.
     410private/node.pem
     411  This contains an SSL private-key certificate. The node
     412  generates this the first time it is started, and re-uses it on subsequent
     413  runs. This certificate allows the node to have a cryptographically-strong
     414  identifier (the Foolscap "TubID"), and to establish secure connections to
     415  other nodes.
     416
     417storage/
     418  Nodes which host StorageServers will create this directory to hold
     419  shares of files on behalf of other clients. There will be a directory
     420  underneath it for each StorageIndex for which this node is holding shares.
     421  There is also an "incoming" directory where partially-completed shares are
     422  held while they are being received.
     423
     424client.tac
     425  This file defines the client by constructing the actual Client
     426  instance each time the node is started. It is used by the 'twistd'
     427  daemonization program (in the "-y" mode), which is run internally by the
     428  "tahoe start" command. This file is created by the "tahoe create-node" or
     429  "tahoe create-client" commands.
     430
     431private/control.furl
     432  This file contains a FURL that provides access to a
     433  control port on the client node, from which files can be uploaded and
     434  downloaded. This file is created with permissions that prevent anyone else
     435  from reading it (on operating systems that support such a concept), to ensure
     436  that only the owner of the client node can use this feature. This port is
     437  intended for debugging and testing use.
     438
     439private/logport.furl
     440  This file contains a FURL that provides access to a
     441  'log port' on the client node, from which operational logs can be retrieved.
     442  Do not grant logport access to strangers, because occasionally secret
     443  information may be placed in the logs.
     444
     445private/helper.furl
     446  If the node is running a helper (for use by other
     447  clients), its contact FURL will be placed here. See docs/helper.txt for more
     448  details.
     449
     450private/root_dir.cap (optional)
     451  The command-line tools will read a directory
     452  cap out of this file and use it, if you don't specify a '--dir-cap' option or
     453  if you specify '--dir-cap=root'.
     454
     455private/convergence (automatically generated)
     456  An added secret for encrypting
     457  immutable files. Everyone who has this same string in their
     458  private/convergence file encrypts their immutable files in the same way when
     459  uploading them. This causes identical files to "converge" -- to share the
     460  same storage space since they have identical ciphertext -- which conserves
     461  space and optimizes upload time, but it also exposes files to the possibility
     462  of a brute-force attack by people who know that string. In this attack, if
     463  the attacker can guess most of the contents of a file, then they can use
     464  brute-force to learn the remaining contents.
    429465
    430466So the set of people who know your private/convergence string is the set of
    431467people who converge their storage space with you when you and they upload
     
    439475possible, put the empty string (so that private/convergence is a zero-length
    440476file).
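
As a sketch using standard shell tools, convergence can be disabled by
truncating the secret and restarting the node so that it re-reads the
file::

  : > BASEDIR/private/convergence   # make it a zero-length file
  tahoe restart BASEDIR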
    441477
     478Other files
     479===========
    442480
    443 == Other files ==
     481logs/
     482  Each Tahoe node creates a directory to hold the log messages produced
     483  as the node runs. These logfiles are created and rotated by the "twistd"
     484  daemonization program, so logs/twistd.log will contain the most recent
     485  messages, logs/twistd.log.1 will contain the previous ones, logs/twistd.log.2
     486  will be older still, and so on. twistd rotates logfiles after they grow
     487  beyond 1MB in size. If the space consumed by logfiles becomes troublesome,
     488  they should be pruned: a cron job to delete all files created more than a
     489  month ago in this logs/ directory should be sufficient (sketch below).
     490
     491my_nodeid
     492  This is written by all nodes after startup, and contains a
     493  base32-encoded (i.e. human-readable) NodeID that identifies this specific
     494  node. This NodeID is the same string that gets displayed on the web page (in
     495  the "which peers am I connected to" list), and the shortened form (the
     496  first few characters) is recorded in various log messages.
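
A sketch of the pruning cron job suggested in the logs/ entry above; note
that find's "-mtime" tests modification time, a reasonable stand-in for
creation time on rotated logs, and the path is illustrative::

  # crontab entry: once a day, delete rotated logs older than a month
  @daily find BASEDIR/logs -type f -mtime +31 -delete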
    444497
    445 logs/ : Each Tahoe node creates a directory to hold the log messages produced
    446 as the node runs. These logfiles are created and rotated by the "twistd"
    447 daemonization program, so logs/twistd.log will contain the most recent
    448 messages, logs/twistd.log.1 will contain the previous ones, logs/twistd.log.2
    449 will be older still, and so on. twistd rotates logfiles after they grow
    450 beyond 1MB in size. If the space consumed by logfiles becomes troublesome,
    451 they should be pruned: a cron job to delete all files that were created more
    452 than a month ago in this logs/ directory should be sufficient.
    453 
    454 my_nodeid : this is written by all nodes after startup, and contains a
    455 base32-encoded (i.e. human-readable) NodeID that identifies this specific
    456 node. This NodeID is the same string that gets displayed on the web page (in
    457 the "which peers am I connected to" list), and the shortened form (the first
    458 characters) is recorded in various log messages.
    459 
    460 
    461 == Backwards Compatibility Files ==
     498Backwards Compatibility Files
     499=============================
    462500
    463501Tahoe releases before 1.3.0 had no 'tahoe.cfg' file, and used distinct files
    464502for each item listed below. For each configuration knob, if the distinct file
    465 exists, it will take precedence over the corresponding item in tahoe.cfg .
    466 
     503exists, it will take precedence over the corresponding item in tahoe.cfg.
    467504
    468 [node]nickname : BASEDIR/nickname
    469 [node]web.port : BASEDIR/webport
    470 [node]tub.port : BASEDIR/client.port  (for Clients, not Introducers)
    471 [node]tub.port : BASEDIR/introducer.port  (for Introducers, not Clients)
    472       (note that, unlike other keys, tahoe.cfg overrides the *.port file)
    473 [node]tub.location : replaces BASEDIR/advertised_ip_addresses
    474 [node]log_gatherer.furl : BASEDIR/log_gatherer.furl (one per line)
    475 [node]timeout.keepalive : BASEDIR/keepalive_timeout
    476 [node]timeout.disconnect : BASEDIR/disconnect_timeout
    477 [client]introducer.furl : BASEDIR/introducer.furl
    478 [client]helper.furl : BASEDIR/helper.furl
    479 [client]key_generator.furl : BASEDIR/key_generator.furl
    480 [client]stats_gatherer.furl : BASEDIR/stats_gatherer.furl
    481 [storage]enabled : BASEDIR/no_storage (False if no_storage exists)
    482 [storage]readonly : BASEDIR/readonly_storage (True if readonly_storage exists)
    483 [storage]sizelimit : BASEDIR/sizelimit
    484 [storage]debug_discard : BASEDIR/debug_discard_storage
    485 [helper]enabled : BASEDIR/run_helper (True if run_helper exists)
     505===========================  ===============================  =================
     506Config setting               File                             Comment
     507===========================  ===============================  =================
     508[node]nickname               BASEDIR/nickname
     509[node]web.port               BASEDIR/webport
     510[node]tub.port               BASEDIR/client.port              (for Clients, not Introducers)
     511[node]tub.port               BASEDIR/introducer.port          (for Introducers, not Clients) (note that, unlike other keys, tahoe.cfg overrides this file)
     512[node]tub.location           BASEDIR/advertised_ip_addresses
     513[node]log_gatherer.furl      BASEDIR/log_gatherer.furl        (one per line)
     514[node]timeout.keepalive      BASEDIR/keepalive_timeout
     515[node]timeout.disconnect     BASEDIR/disconnect_timeout
     516[client]introducer.furl      BASEDIR/introducer.furl
     517[client]helper.furl          BASEDIR/helper.furl
     518[client]key_generator.furl   BASEDIR/key_generator.furl
     519[client]stats_gatherer.furl  BASEDIR/stats_gatherer.furl
     520[storage]enabled             BASEDIR/no_storage               (False if no_storage exists)
     521[storage]readonly            BASEDIR/readonly_storage         (True if readonly_storage exists)
     522[storage]sizelimit           BASEDIR/sizelimit
     523[storage]debug_discard       BASEDIR/debug_discard_storage
     524[helper]enabled              BASEDIR/run_helper               (True if run_helper exists)
     525===========================  ===============================  =================
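
As a hypothetical migration sketch (assuming standard shell tools, and that
tahoe.cfg does not already contain a [node] section), a pre-1.3.0 nickname
file can be folded into tahoe.cfg so that the legacy file no longer takes
precedence::

  printf '[node]\nnickname = %s\n' "$(cat BASEDIR/nickname)" >> BASEDIR/tahoe.cfg
  rm BASEDIR/nickname   # remove the legacy file; tahoe.cfg now governs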
    486526
    487527Note: the functionality of [node]ssh.port and [node]ssh.authorized_keys_file
    488528was previously combined, controlled by the presence of a
     
    490530indicated which port the ssh server should listen on, and the contents of the
    491531file provided the ssh public keys to accept. Support for these files has been
    492532removed completely. To ssh into your Tahoe node, add [node]ssh.port and
    493 [node].ssh_authorized_keys_file statements to your tahoe.cfg .
     533[node]ssh.authorized_keys_file statements to your tahoe.cfg.
    494534
    495535Likewise, the functionality of [node]tub.location is a variant of the
    496536now-unsupported BASEDIR/advertised_ip_addresses . The old file was additive
     
    499539is not (tub.location is used verbatim).
    500540
    501541
    502 == Example ==
     542Example
     543=======
    503544
    504545The following is a sample tahoe.cfg file, containing values for all keys
    505546described above. Note that this is not a recommended configuration (most of
    506547these are not the default values), merely a legal one.
    507548
    508 [node]
    509 nickname = Bob's Tahoe Node
    510 tub.port = 34912
    511 tub.location = 123.45.67.89:8098,44.55.66.77:8098
    512 web.port = 3456
    513 log_gatherer.furl = pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm
    514 timeout.keepalive = 240
    515 timeout.disconnect = 1800
    516 ssh.port = 8022
    517 ssh.authorized_keys_file = ~/.ssh/authorized_keys
    518 
    519 [client]
    520 introducer.furl = pb://ok45ssoklj4y7eok5c3xkmj@tahoe.example:44801/ii3uumo
    521 helper.furl = pb://ggti5ssoklj4y7eok5c3xkmj@helper.tahoe.example:7054/kk8lhr
    522 
    523 [storage]
    524 enabled = True
    525 readonly_storage = True
    526 sizelimit = 10000000000
     549::
    527550
    528 [helper]
    529 run_helper = True
     551  [node]
     552  nickname = Bob's Tahoe Node
     553  tub.port = 34912
     554  tub.location = 123.45.67.89:8098,44.55.66.77:8098
     555  web.port = 3456
     556  log_gatherer.furl = pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm
     557  timeout.keepalive = 240
     558  timeout.disconnect = 1800
     559  ssh.port = 8022
     560  ssh.authorized_keys_file = ~/.ssh/authorized_keys
     561 
     562  [client]
     563  introducer.furl = pb://ok45ssoklj4y7eok5c3xkmj@tahoe.example:44801/ii3uumo
     564  helper.furl = pb://ggti5ssoklj4y7eok5c3xkmj@helper.tahoe.example:7054/kk8lhr
     565 
     566  [storage]
     567  enabled = True
     568  readonly = True
     569  sizelimit = 10000000000
     570 
     571  [helper]
     572  enabled = True
  • docs/debian.txt

    diff -rN -u old-tahoe-lafs/docs/debian.txt new-tahoe-lafs/docs/debian.txt
    old new  
    1 = Debian Support =
     1==============
     2Debian Support
     3==============
     4
     51.  `Overview`_
     62.  `TL;DR supporting package building instructions`_
     73.  `TL;DR package building instructions for Tahoe`_
     84.  `Building Debian Packages`_
     95.  `Using Pre-Built Debian Packages`_
     106.  `Building From Source on Debian Systems`_
    211
    3 1.  Overview
    4 2.  TL;DR supporting package building instructions
    5 3.  TL;DR package building instructions for Tahoe
    6 4.  Building Debian Packages
    7 5.  Using Pre-Built Debian Packages
    8 6.  Building From Source on Debian Systems
    9 
    10 = Overview ==
     12Overview
     13========
    1114
    1215One convenient way to install Tahoe-LAFS is with debian packages.
    1316This document attempts to explain how to complete a desert island build for
    1417people in a hurry. It also attempts to explain more about our Debian packaging
    1518for those willing to read beyond the simple pragmatic packaging exercises.
    1619
    17 == TL;DR supporting package building instructions ==
     20TL;DR supporting package building instructions
     21==============================================
    1822
    1923There are only four supporting packages that are currently not available from
    20 the debian apt repositories in Debian Lenny:
     24the debian apt repositories in Debian Lenny::
    2125
    2226    python-foolscap python-zfec argparse zbase32
    2327
    24 First, we'll install some common packages for development:
     28First, we'll install some common packages for development::
    2529
    2630    sudo apt-get install -y build-essential debhelper cdbs python-central \
    2731                    python-setuptools python python-dev python-twisted-core \
     
    3135    sudo apt-file update
    3236
    3337
    34 To create packages for Lenny, we'll also install stdeb:   
     38To create packages for Lenny, we'll also install stdeb:: 
    3539
    3640    sudo apt-get install python-all-dev
    3741    STDEB_VERSION="0.5.1"
     
    4145    python setup.py --command-packages=stdeb.command bdist_deb
    4246    sudo dpkg -i deb_dist/python-stdeb_$STDEB_VERSION-1_all.deb
    4347
    44 Now we're ready to build and install the zfec Debian package:
     48Now we're ready to build and install the zfec Debian package::
    4549
    4650    darcs get http://allmydata.org/source/zfec/trunk zfac
    4751    cd zfac/zfec/
     
    5054    dpkg-buildpackage -rfakeroot -uc -us
    5155    sudo dpkg -i ../python-zfec_1.4.6-r333-1_amd64.deb
    5256
    53 We need to build a pyutil package:
     57We need to build a pyutil package::
    5458
    5559    wget http://pypi.python.org/packages/source/p/pyutil/pyutil-1.6.1.tar.gz
    5660    tar -xvzf pyutil-1.6.1.tar.gz
     
    6064    dpkg-buildpackage -rfakeroot -uc -us
    6165    sudo dpkg -i ../python-pyutil_1.6.1-1_all.deb
    6266
    63 We also need to install argparse and zbase32:
     67We also need to install argparse and zbase32::
    6468
    6569    sudo easy_install argparse # argparse won't install with stdeb (!) :-(
    6670    sudo easy_install zbase32 # XXX TODO: package with stdeb
    6771
    68 Finally, we'll fetch, unpack, build and install foolscap:
     72Finally, we'll fetch, unpack, build and install foolscap::
    6973
    7074    # You may not already have Brian's key:
    7175    # gpg --recv-key 0x1514A7BD
     
    7983    dpkg-buildpackage -rfakeroot -uc -us
    8084    sudo dpkg -i ../python-foolscap_0.5.0-1_all.deb
    8185
    82 == TL;DR package building instructions for Tahoe ==
     86TL;DR package building instructions for Tahoe
     87=============================================
    8388
    8489If you want to build your own Debian packages from the darcs tree or from
    85 a source release, do the following:
     90a source release, do the following::
    8691
    8792    cd ~/
    8893    mkdir src && cd src/
     
    98103/etc/defaults/allmydata-tahoe file to get Tahoe started. Data is by default
    99104stored in /var/lib/tahoelafsd/ and Tahoe runs as the 'tahoelafsd' user.
    100105
    101 == Building Debian Packages ==
     106Building Debian Packages
     107========================
    102108
    103109The Tahoe source tree comes with limited support for building debian packages
    104110on a variety of Debian and Ubuntu platforms. For each supported platform,
     
    109115
    110116To create debian packages from a Tahoe tree, you will need some additional
    111117tools installed. The canonical list of these packages is in the
    112 "Build-Depends" clause of misc/sid/debian/control , and includes:
     118"Build-Depends" clause of misc/sid/debian/control , and includes::
    113119
    114120 build-essential
    115121 debhelper
     
    130136Note that we haven't tried to build source packages (.orig.tar.gz + dsc) yet,
    131137and there are no such source packages in our APT repository.
    132138
    133 == Using Pre-Built Debian Packages ==
     139Using Pre-Built Debian Packages
     140===============================
    134141
    135142The allmydata.org site hosts an APT repository with debian packages that are
    136 built after each checkin. The following wiki page describes this repository:
    137 
    138  http://allmydata.org/trac/tahoe/wiki/DownloadDebianPackages
     143built after each checkin. `This wiki page
     144<http://allmydata.org/trac/tahoe/wiki/DownloadDebianPackages>`_ describes this
     145repository.
    139146
    140147The allmydata.org APT repository also includes debian packages of support
    141148libraries, like Foolscap, zfec, pycryptopp, and everything else you need that
    142149isn't already in debian.
    143150
    144 == Building From Source on Debian Systems ==
     151Building From Source on Debian Systems
     152======================================
    145153
    146154Many of Tahoe's build dependencies can be satisfied by first installing
    147155certain debian packages: simplejson is one of these. Some debian/ubuntu
  • docs/filesystem-notes.txt

    diff -rN -u old-tahoe-lafs/docs/filesystem-notes.txt new-tahoe-lafs/docs/filesystem-notes.txt
    old new  
     1=========================
     2Filesystem-specific notes
     3=========================
     4
     51. ext3_
    16
    27Tahoe storage servers use a large number of subdirectories to store their
    38shares on local disk. This format is simple and robust, but depends upon the
    49local filesystem to provide fast access to those directories.
    510
    6 = ext3 =
     11ext3
     12====
    713
    814For moderate- or large-sized storage servers, you'll want to make sure the
    915"directory index" feature is enabled on your ext3 directories, otherwise
    1016share lookup may be very slow. Recent versions of ext3 enable this
    11 automatically, but older filesystems may not have it enabled.
     17automatically, but older filesystems may not have it enabled::
    1218
    13 $ sudo tune2fs -l /dev/sda1 |grep feature
    14 Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
     19  $ sudo tune2fs -l /dev/sda1 |grep feature
     20  Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
    1521
    1622If "dir_index" is present in the "features:" line, then you're all set. If
    1723not, you'll need to use tune2fs and e2fsck to enable and build the index. See
    18 this page for some hints: http://wiki.dovecot.org/MailboxFormat/Maildir .
     24<http://wiki.dovecot.org/MailboxFormat/Maildir> for some hints.
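
A sketch of enabling the index on an existing (unmounted) filesystem,
assuming current e2fsprogs::

  sudo tune2fs -O dir_index /dev/sda1   # turn the feature on
  sudo e2fsck -fD /dev/sda1             # rebuild and optimize directories
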
  • docs/garbage-collection.txt

    diff -rN -u old-tahoe-lafs/docs/garbage-collection.txt new-tahoe-lafs/docs/garbage-collection.txt
    old new  
    1 = Garbage Collection in Tahoe =
     1===========================
     2Garbage Collection in Tahoe
     3===========================
     4
     51. `Overview`_
     62. `Client-side Renewal`_
     73. `Server Side Expiration`_
     84. `Expiration Progress`_
     95. `Future Directions`_
    210
    3 1. Overview
    4 2. Client-side Renewal
    5 3. Server Side Expiration
    6 4. Expiration Progress
    7 5. Future Directions
    8 
    9 == Overview ==
     11Overview
     12========
    1013
    1114When a file or directory in the virtual filesystem is no longer referenced,
    1215the space that its shares occupied on each storage server can be freed,
     
    4043server can use the "expire.override_lease_duration" configuration setting to
    4144increase or decrease the effective duration to something other than 31 days).
    4245
    43 == Client-side Renewal ==
     46Client-side Renewal
     47===================
    4448
    4549If all of the files and directories which you care about are reachable from a
    4650single starting point (usually referred to as a "rootcap"), and you store
     
    6973appropriate for use by individual users as well, and may be incorporated
    7074directly into the client node.
    7175
    72 == Server Side Expiration ==
     76Server Side Expiration
     77======================
    7378
    7479Expiration must be explicitly enabled on each storage server, since the
    7580default behavior is to never expire shares. Expiration is enabled by adding
     
    112117expired whatever it is going to expire, the second and subsequent passes are
    113118not going to find any new leases to remove.
    114119
    115 The tahoe.cfg file uses the following keys to control lease expiration:
     120The tahoe.cfg file uses the following keys to control lease expiration::
    116121
    117 [storage]
     122  [storage]
    118123
    119 expire.enabled = (boolean, optional)
     124  expire.enabled = (boolean, optional)
    120125
    121  If this is True, the storage server will delete shares on which all leases
    122  have expired. Other controls dictate when leases are considered to have
    123  expired. The default is False.
     126        If this is True, the storage server will delete shares on which all leases
     127        have expired. Other controls dictate when leases are considered to have
     128        expired. The default is False.
    124129
    125 expire.mode = (string, "age" or "cutoff-date", required if expiration enabled)
     130  expire.mode = (string, "age" or "cutoff-date", required if expiration enabled)
    126131
    127  If this string is "age", the age-based expiration scheme is used, and the
    128  "expire.override_lease_duration" setting can be provided to influence the
    129  lease ages. If it is "cutoff-date", the absolute-date-cutoff mode is used,
    130  and the "expire.cutoff_date" setting must be provided to specify the cutoff
    131  date. The mode setting currently has no default: you must provide a value.
     132        If this string is "age", the age-based expiration scheme is used, and the
     133        "expire.override_lease_duration" setting can be provided to influence the
     134        lease ages. If it is "cutoff-date", the absolute-date-cutoff mode is used,
     135        and the "expire.cutoff_date" setting must be provided to specify the cutoff
     136        date. The mode setting currently has no default: you must provide a value.
    132137
    133  In a future release, this setting is likely to default to "age", but in this
    134  release it was deemed safer to require an explicit mode specification.
     138        In a future release, this setting is likely to default to "age", but in this
     139        release it was deemed safer to require an explicit mode specification.
    135140
    136 expire.override_lease_duration = (duration string, optional)
     141  expire.override_lease_duration = (duration string, optional)
    137142
    138  When age-based expiration is in use, a lease will be expired if its
    139  "lease.create_renew" timestamp plus its "lease.duration" time is
    140  earlier/older than the current time. This key, if present, overrides the
    141  duration value for all leases, changing the algorithm from:
     143        When age-based expiration is in use, a lease will be expired if its
     144        "lease.create_renew" timestamp plus its "lease.duration" time is
     145        earlier/older than the current time. This key, if present, overrides the
     146        duration value for all leases, changing the algorithm from:
    142147
    143    if (lease.create_renew_timestamp + lease.duration) < now:
    144        expire_lease()
     148          if (lease.create_renew_timestamp + lease.duration) < now:
     149              expire_lease()
    145150
    146  to:
     151        to:
    147152
    148    if (lease.create_renew_timestamp + override_lease_duration) < now:
    149        expire_lease()
     153          if (lease.create_renew_timestamp + override_lease_duration) < now:
     154              expire_lease()
    150155
    151  The value of this setting is a "duration string", which is a number of days,
    152  months, or years, followed by a units suffix, and optionally separated by a
    153  space, such as one of the following:
     156        The value of this setting is a "duration string", which is a number of days,
     157        months, or years, followed by a units suffix, and optionally separated by a
     158        space, such as one of the following:
    154159
    155   7days
    156   31day
    157   60 days
    158   2mo
    159   3 month
    160   12 months
    161   2years
     160          7days
     161          31day
     162          60 days
     163          2mo
     164          3 month
     165          12 months
     166          2years
    162167
    163  This key is meant to compensate for the fact that clients do not yet have
    164  the ability to ask for leases that last longer than 31 days. A grid which
    165  wants to use faster or slower GC than a 31-day lease timer permits can use
    166  this parameter to implement it. The current fixed 31-day lease duration
    167  makes the server behave as if "lease.override_lease_duration = 31days" had
    168  been passed.
     168        This key is meant to compensate for the fact that clients do not yet have
     169        the ability to ask for leases that last longer than 31 days. A grid which
     170        wants to use faster or slower GC than a 31-day lease timer permits can use
     171        this parameter to implement it. The current fixed 31-day lease duration
     172        makes the server behave as if "expire.override_lease_duration = 31days" had
     173        been passed.
    169174
    170  This key is only valid when age-based expiration is in use (i.e. when
    171  "expire.mode = age" is used). It will be rejected if cutoff-date expiration
    172  is in use.
     175        This key is only valid when age-based expiration is in use (i.e. when
     176        "expire.mode = age" is used). It will be rejected if cutoff-date expiration
     177        is in use.
    173178
    174 expire.cutoff_date = (date string, required if mode=cutoff-date)
     179  expire.cutoff_date = (date string, required if mode=cutoff-date)
    175180
    176  When cutoff-date expiration is in use, a lease will be expired if its
    177  create/renew timestamp is older than the cutoff date. This string will be a
    178  date in the following format:
     181        When cutoff-date expiration is in use, a lease will be expired if its
     182        create/renew timestamp is older than the cutoff date. This string will be a
     183        date in the following format:
    179184
    180   2009-01-16   (January 16th, 2009)
    181   2008-02-02
    182   2007-12-25
     185          2009-01-16   (January 16th, 2009)
     186          2008-02-02
     187          2007-12-25
    183188
    184  The actual cutoff time shall be midnight UTC at the beginning of the given
    185  day. Lease timers should naturally be generous enough to not depend upon
    186  differences in timezone: there should be at least a few days between the
    187  last renewal time and the cutoff date.
     189        The actual cutoff time shall be midnight UTC at the beginning of the given
     190        day. Lease timers should naturally be generous enough to not depend upon
     191        differences in timezone: there should be at least a few days between the
     192        last renewal time and the cutoff date.
    188193
    189  This key is only valid when cutoff-based expiration is in use (i.e. when
    190  "expire.mode = cutoff-date"). It will be rejected if age-based expiration is
    191  in use.
     194        This key is only valid when cutoff-based expiration is in use (i.e. when
     195        "expire.mode = cutoff-date"). It will be rejected if age-based expiration is
     196        in use.
    192197
    193 expire.immutable = (boolean, optional)
     198  expire.immutable = (boolean, optional)
    194199
    195  If this is False, then immutable shares will never be deleted, even if their
    196  leases have expired. This can be used in special situations to perform GC on
    197  mutable files but not immutable ones. The default is True.
     200        If this is False, then immutable shares will never be deleted, even if their
     201        leases have expired. This can be used in special situations to perform GC on
     202        mutable files but not immutable ones. The default is True.
    198203
    199 expire.mutable = (boolean, optional)
     204  expire.mutable = (boolean, optional)
    200205
    201  If this is False, then mutable shares will never be deleted, even if their
    202  leases have expired. This can be used in special situations to perform GC on
    203  immutable files but not mutable ones. The default is True.
     206        If this is False, then mutable shares will never be deleted, even if their
     207        leases have expired. This can be used in special situations to perform GC on
     208        immutable files but not mutable ones. The default is True.
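
Putting these keys together, an illustrative (not necessarily recommended)
age-based configuration might look like::

  [storage]
  enabled = True
  expire.enabled = True
  expire.mode = age
  expire.override_lease_duration = 60 days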
    204209
    205 == Expiration Progress ==
     210Expiration Progress
     211===================
    206212
    207213In the current release, leases are stored as metadata in each share file, and
    208214no separate database is maintained. As a result, checking and expiring leases
     
    229235crawler can be forcibly reset by stopping the node, deleting these two files,
    230236then restarting the node.
    231237
    232 == Future Directions ==
     238Future Directions
     239=================
    233240
    234241Tahoe's GC mechanism is undergoing significant changes. The global
    235242mark-and-sweep garbage-collection scheme can require considerable network
  • docs/helper.txt

    diff -rN -u old-tahoe-lafs/docs/helper.txt new-tahoe-lafs/docs/helper.txt
    old new  
    1 = The Tahoe Upload Helper =
     1=======================
     2The Tahoe Upload Helper
     3=======================
     4
     51. `Overview`_
     62. `Setting Up A Helper`_
     73. `Using a Helper`_
     84. `Other Helper Modes`_
    29
    3 1. Overview
    4 2. Setting Up A Helper
    5 3. Using a Helper
    6 4. Other Helper Modes
    7 
    8 == Overview ==
     10Overview
     11========
    912
    1013As described in the "SWARMING DOWNLOAD, TRICKLING UPLOAD" section of
    1114architecture.txt, Tahoe uploads require more bandwidth than downloads: you
     
    4548other applications that are sharing the same uplink to compete more evenly
    4649for the limited bandwidth.
    4750
    48 
    49 
    50 == Setting Up A Helper ==
     51Setting Up A Helper
     52===================
    5153
    5254Who should consider running a helper?
    5355
    54  * Benevolent entities which wish to provide better upload speed for clients
    55    that have slow uplinks
    56  * Folks which have machines with upload bandwidth to spare.
    57  * Server grid operators who want clients to connect to a small number of
    58    helpers rather than a large number of storage servers (a "multi-tier"
    59    architecture)
     56* Benevolent entities which wish to provide better upload speed for clients
     57  that have slow uplinks
     58* Folks who have machines with upload bandwidth to spare.
     59* Server grid operators who want clients to connect to a small number of
     60  helpers rather than a large number of storage servers (a "multi-tier"
     61  architecture)
    6062
    6163What sorts of machines are good candidates for running a helper?
    6264
    63  * The Helper needs to have good bandwidth to the storage servers. In
    64    particular, it needs to have at least 3.3x better upload bandwidth than
    65    the client does, or the client might as well upload directly to the
    66    storage servers. In a commercial grid, the helper should be in the same
    67    colo (and preferably in the same rack) as the storage servers.
    68  * The Helper will take on most of the CPU load involved in uploading a file.
    69    So having a dedicated machine will give better results.
    70  * The Helper buffers ciphertext on disk, so the host will need at least as
    71    much free disk space as there will be simultaneous uploads. When an upload
    72    is interrupted, that space will be used for a longer period of time.
     65* The Helper needs to have good bandwidth to the storage servers. In
     66  particular, it needs to have at least 3.3x better upload bandwidth than
     67  the client does (see the note below), or the client might as well upload
     68  directly to the storage servers. In a commercial grid, the helper should
     69  be in the same colo (and preferably in the same rack) as the storage servers.
     70* The Helper will take on most of the CPU load involved in uploading a file.
     71  So having a dedicated machine will give better results.
     72* The Helper buffers ciphertext on disk, so the host will need at least as
     73  much free disk space as there will be simultaneous uploads. When an upload
     74  is interrupted, that space will be used for a longer period of time.
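
Note: the 3.3x figure above is just the expansion factor of the default
3-of-10 erasure coding: the client sends the helper roughly A bytes of
ciphertext, while the helper must push N/k = 10/3 (about 3.3) times that
much share data out to the storage servers.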
    7375
    7476To turn a Tahoe-LAFS node into a helper (i.e. to run a helper service in
    7577addition to whatever else that node is doing), edit the tahoe.cfg file in your
     
    8284helper: you will need to give this FURL to any clients that wish to use your
    8385helper.
    8486
    85  cat $BASEDIR/private/helper.furl |mail -s "helper furl" friend@example.com
     87::
     88
     89  cat $BASEDIR/private/helper.furl | mail -s "helper furl" friend@example.com
    8690
    8791You can tell if your node is running a helper by looking at its web status
    8892page. Assuming that you've set up the 'webport' to use port 3456, point your
     
    105109files in these directories that have not been modified for a week or two.
    106110Future versions of tahoe will try to self-manage these files a bit better.
    107111
    108 == Using a Helper ==
     112Using a Helper
     113==============
    109114
    110115Who should consider using a Helper?
    111116
    112  * clients with limited upstream bandwidth, such as a consumer ADSL line
    113  * clients who believe that the helper will give them faster uploads than
    114    they could achieve with a direct upload
    115  * clients who experience problems with TCP connection fairness: if other
    116    programs or machines in the same home are getting less than their fair
    117    share of upload bandwidth. If the connection is being shared fairly, then
    118    a Tahoe upload that is happening at the same time as a single FTP upload
    119    should get half the bandwidth.
    120  * clients who have been given the helper.furl by someone who is running a
    121    Helper and is willing to let them use it
     117* clients with limited upstream bandwidth, such as a consumer ADSL line
     118* clients who believe that the helper will give them faster uploads than
     119  they could achieve with a direct upload
     120* clients who experience problems with TCP connection fairness: if other
     121  programs or machines in the same home are getting less than their fair
     122  share of upload bandwidth. If the connection is being shared fairly, then
     123  a Tahoe upload that is happening at the same time as a single FTP upload
     124  should get half the bandwidth.
     125* clients who have been given the helper.furl by someone who is running a
     126  Helper and is willing to let them use it
    122127
    123128To take advantage of somebody else's Helper, take the helper.furl file that
    124129they give you, and copy it into your node's base directory, then restart the
    125130node:
    126131
    127  cat email >$BASEDIR/helper.furl
    128  tahoe restart $BASEDIR
     132::
     133
     134  cat email >$BASEDIR/helper.furl
     135  tahoe restart $BASEDIR
    129136
    130137This will signal the client to try to connect to the helper. Subsequent
    131138uploads will use the helper rather than using direct connections to the
     
    146153The upload/download status page (http://localhost:3456/status) will announce
    147154the using-helper-or-not state of each upload, in the "Helper?" column.
    148155
    149 == Other Helper Modes ==
     156Other Helper Modes
     157==================
    150158
    151159The Tahoe Helper only currently helps with one kind of operation: uploading
    152160immutable files. There are three other things it might be able to help with
    153161in the future:
    154162
    155  * downloading immutable files
    156  * uploading mutable files (such as directories)
    157  * downloading mutable files (like directories)
     163* downloading immutable files
     164* uploading mutable files (such as directories)
     165* downloading mutable files (like directories)
    158166
    159167Since mutable files are currently limited in size, the ADSL upstream penalty
    160168is not so severe for them. There is no ADSL penalty to downloads, but there
  • docs/known_issues.txt

    diff -rN -u old-tahoe-lafs/docs/known_issues.txt new-tahoe-lafs/docs/known_issues.txt
    old new  
    1 = known issues =
     1============
     2Known issues
     3============
     4
     5* `Overview`_
     6* `Issues in Tahoe-LAFS v1.8.0, released 2010-09-23`_
     7
     8  *  `Potential unauthorized access by JavaScript in unrelated files`_
     9  *  `Potential disclosure of file through embedded hyperlinks or JavaScript in that file`_
     10  *  `Command-line arguments are leaked to other local users`_
     11  *  `Capabilities may be leaked to web browser phishing filter / "safe browsing" servers`_
     12  *  `Known issues in the FTP and SFTP frontends`_
    213
    3 *  overview
    4 *  issues in Tahoe-LAFS v1.8.0, released 2010-09-23
    5   -  potential unauthorized access by JavaScript in unrelated files
    6   -  potential disclosure of file through embedded hyperlinks or JavaScript in that file
    7   -  command-line arguments are leaked to other local users
    8   -  capabilities may be leaked to web browser phishing filter / "safe browsing" servers ===
    9   -  known issues in the FTP and SFTP frontends ===
    10 
    11 == overview ==
     14Overview
     15========
    1216
    1317Below is a list of known issues in recent releases of Tahoe-LAFS, and how to
    1418manage them.  The current version of this file can be found at
     
    2125
    2226http://tahoe-lafs.org/source/tahoe-lafs/trunk/docs/historical/historical_known_issues.txt
    2327
    24 == issues in Tahoe-LAFS v1.8.0, released 2010-09-18 ==
     28Issues in Tahoe-LAFS v1.8.0, released 2010-09-23
     29================================================
    2530
    26 === potential unauthorized access by JavaScript in unrelated files ===
     31Potential unauthorized access by JavaScript in unrelated files
     32--------------------------------------------------------------
    2733
    2834If you view a file stored in Tahoe-LAFS through a web user interface,
    2935JavaScript embedded in that file might be able to access other files or
     
    3339have the ability to modify the contents of those files or directories,
    3440then that script could modify or delete those files or directories.
    3541
    36 ==== how to manage it ====
     42how to manage it
     43~~~~~~~~~~~~~~~~
    3744
    3845For future versions of Tahoe-LAFS, we are considering ways to close off
    3946this leakage of authority while preserving ease of use -- the discussion
    40 of this issue is ticket #615.
     47of this issue is ticket `#615 <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/615>`_.
    4148
    4249For the present, either do not view files stored in Tahoe-LAFS through a
    4350web user interface, or turn off JavaScript in your web browser before
     
    4552malicious JavaScript.
    4653
    4754
    48 === potential disclosure of file through embedded hyperlinks or JavaScript in that file ===
     55Potential disclosure of file through embedded hyperlinks or JavaScript in that file
     56-----------------------------------------------------------------------------------
    4957
    5058If there is a file stored on a Tahoe-LAFS storage grid, and that file
    5159gets downloaded and displayed in a web browser, then JavaScript or
     
    6169browsers, so being careful which hyperlinks you click on is not
    6270sufficient to prevent this from happening.
    6371
    64 ==== how to manage it ====
     72how to manage it
     73~~~~~~~~~~~~~~~~
    6574
    6675For future versions of Tahoe-LAFS, we are considering ways to close off
    6776this leakage of authority while preserving ease of use -- the discussion
    68 of this issue is ticket #127.
     77of this issue is ticket `#127 <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/127>`_.
    6978
    7079For the present, a good work-around is that if you want to store and
    7180view a file on Tahoe-LAFS and you want that file to remain private, then
     
    7483written to maliciously leak access.
    7584
    7685
    77 === command-line arguments are leaked to other local users ===
     86Command-line arguments are leaked to other local users
     87------------------------------------------------------
    7888
    7989Remember that command-line arguments are visible to other users (through
    7990the 'ps' command, or the Windows Process Explorer tool), so if you are
     
    8393arguments.  This includes directory caps that you set up with the "tahoe
    8494add-alias" command.
    8595
    86 ==== how to manage it ====
     96how to manage it
     97~~~~~~~~~~~~~~~~
    8798
    8899As of Tahoe-LAFS v1.3.0 there is a "tahoe create-alias" command that performs
    89100the following technique for you.
     
    91102Bypass add-alias and edit the NODEDIR/private/aliases file directly, by
    92103adding a line like this:
    93104
    94 fun: URI:DIR2:ovjy4yhylqlfoqg2vcze36dhde:4d4f47qko2xm5g7osgo2yyidi5m4muyo2vjjy53q4vjju2u55mfa
     105  fun: URI:DIR2:ovjy4yhylqlfoqg2vcze36dhde:4d4f47qko2xm5g7osgo2yyidi5m4muyo2vjjy53q4vjju2u55mfa
    95106
    96107By entering the dircap through the editor, the command-line arguments
    97108are bypassed, and other users will not be able to see them. Once you've
     
    102113access to your files and directories.
    103114
    104115
    105 === capabilities may be leaked to web browser phishing filter / "safe browsing" servers ===
     116Capabilities may be leaked to web browser phishing filter / "safe browsing" servers
     117-----------------------------------------------------------------------------------
    106118
    107119Firefox, Internet Explorer, and Chrome include a "phishing filter" or
    108120"safe browing" component, which is turned on by default, and which sends
     
    134146version of this file stated that Firefox had abandoned their phishing
    135147filter; this was incorrect.
    136148
    137 ==== how to manage it ====
     149how to manage it
     150~~~~~~~~~~~~~~~~
    138151
    139152If you use any phishing filter or "safe browsing" feature, consider either
    140153disabling it, or not using the WUI via that browser. Phishing filters have
     
    143156or malware attackers have learnt how to bypass them.
    144157
    145158To disable the filter in IE7 or IE8:
    146  - Click Internet Options from the Tools menu.
    147  - Click the Advanced tab.
    148  - If an "Enable SmartScreen Filter" option is present, uncheck it.
    149    If a "Use Phishing Filter" or "Phishing Filter" option is present,
    150    set it to Disable.
    151  - Confirm (click OK or Yes) out of all dialogs.
     159````````````````````````````````````
     160
     161- Click Internet Options from the Tools menu.
     162
     163- Click the Advanced tab.
     164
     165- If an "Enable SmartScreen Filter" option is present, uncheck it.
     166  If a "Use Phishing Filter" or "Phishing Filter" option is present,
     167  set it to Disable.
     168
     169- Confirm (click OK or Yes) out of all dialogs.
    152170
    153171If you have a version of IE that splits the settings between security
    154172zones, do this for all zones.
    155173
    156174To disable the filter in Firefox:
    157  - Click Options from the Tools menu.
    158  - Click the Security tab.
    159  - Uncheck both the "Block reported attack sites" and "Block reported
    160    web forgeries" options.
    161  - Click OK.
     175`````````````````````````````````
     176
     177- Click Options from the Tools menu.
     178
     179- Click the Security tab.
     180
     181- Uncheck both the "Block reported attack sites" and "Block reported
     182  web forgeries" options.
     183
     184- Click OK.
    162185
    163186To disable the filter in Chrome:
    164  - Click Options from the Tools menu.
    165  - Click the "Under the Hood" tab and find the "Privacy" section.
    166  - Uncheck the "Enable phishing and malware protection" option.
    167  - Click Close.
     187````````````````````````````````
     188
     189- Click Options from the Tools menu.
     190
     191- Click the "Under the Hood" tab and find the "Privacy" section.
     192
     193- Uncheck the "Enable phishing and malware protection" option.
     194
     195- Click Close.
    168196
    169197
    170 === known issues in the FTP and SFTP frontends ===
     198Known issues in the FTP and SFTP frontends
     199------------------------------------------
    171200
    172201These are documented in docs/frontends/FTP-and-SFTP.txt and at
    173202<http://tahoe-lafs.org/trac/tahoe-lafs/wiki/SftpFrontend>.
  • docs/logging.txt

    diff -rN -u old-tahoe-lafs/docs/logging.txt new-tahoe-lafs/docs/logging.txt
    old new  
    1 = Tahoe Logging =
     1=============
     2Tahoe Logging
     3=============
     4
     51.  `Overview`_
     62.  `Realtime Logging`_
     73.  `Incidents`_
     84.  `Working with flogfiles`_
     95.  `Gatherers`_
     10
     11    1.  `Incident Gatherer`_
     12    2.  `Log Gatherer`_
     13
     146.  `Local twistd.log files`_
     157.  `Adding log messages`_
     168.  `Log Messages During Unit Tests`_
    217
    3 1.  Overview
    4 2.  Realtime Logging
    5 3.  Incidents
    6 4.  Working with flogfiles
    7 5.  Gatherers
    8   5.1.  Incident Gatherer
    9   5.2.  Log Gatherer
    10 6.  Local twistd.log files
    11 7.  Adding log messages
    12 8.  Log Messages During Unit Tests
    13 
    14 == Overview ==
     18Overview
     19========
    1520
    1621Tahoe uses the Foolscap logging mechanism (known as the "flog" subsystem) to
    1722record information about what is happening inside the Tahoe node. This is
     
    2631/usr/bin/flogtool) which is used to get access to many foolscap logging
    2732features.
    2833
    29 == Realtime Logging ==
     34Realtime Logging
     35================
    3036
    3137When you are working on Tahoe code, and want to see what the node is doing,
    3238the easiest tool to use is "flogtool tail". This connects to the tahoe node
     
    3743BASEDIR/private/logport.furl . The following command will connect to this
    3844port and start emitting log information:
    3945
    40  flogtool tail BASEDIR/private/logport.furl
     46  flogtool tail BASEDIR/private/logport.furl
    4147
    4248The "--save-to FILENAME" option will save all received events to a file,
    4349where they can be examined later with "flogtool dump" or "flogtool
     
    4551before subscribing to new ones (without --catch-up, you will only hear about
    4652events that occur after the tool has connected and subscribed).
    4753
    48 == Incidents ==
     54Incidents
     55=========
    4956
    5057Foolscap keeps a short list of recent events in memory. When something goes
    5158wrong, it writes all the history it has (and everything that gets logged in
     
    7279parent/child relationships of log events is displayed in a nested format.
    7380"flogtool web-viewer" is still fairly immature.
    7481
    75 == Working with flogfiles ==
     82Working with flogfiles
     83======================
    7684
    7785The "flogtool filter" command can be used to take a large flogfile (perhaps
    7886one created by the log-gatherer, see below) and copy a subset of events into
     
    8593were emitted with a given facility (like foolscap.negotiation or
    8694tahoe.upload).
    8795
    88 == Gatherers ==
     96Gatherers
     97=========
    8998
    9099In a deployed Tahoe grid, it is useful to get log information automatically
    91100transferred to a central log-gatherer host. This offloads the (admittedly
     
    101110The gatherer will write to files in its working directory, which can then be
    102111examined with tools like "flogtool dump" as described above.
    103112
    104 === Incident Gatherer ===
     113Incident Gatherer
     114-----------------
    105115
    106116The "incident gatherer" only collects Incidents: records of the log events
    107117that occurred just before and slightly after some high-level "trigger event"
     
    120130"gatherer.tac" file should be modified to add classifier functions.
    121131
    122132The incident gatherer writes incident names (which are simply the relative
    123 pathname of the incident-*.flog.bz2 file) into classified/CATEGORY. For
     133pathname of the incident-\*.flog.bz2 file) into classified/CATEGORY. For
    124134example, the classified/mutable-retrieve-uncoordinated-write-error file
    125135contains a list of all incidents which were triggered by an uncoordinated
    126136write that was detected during mutable file retrieval (caused when somebody
     
    145155node which generated it to the gatherer. The gatherer will automatically
    146156catch up to any incidents which occurred while it was offline.
    147157
    148 === Log Gatherer ===
     158Log Gatherer
     159------------
    149160
    150161The "Log Gatherer" subscribes to hear about every single event published by
    151162the connected nodes, regardless of severity. This server writes these log
     
    172183the outbound queue grows too large. When this occurs, there will be gaps
    173184(non-sequential event numbers) in the log-gatherer's flogfiles.
    174185
    175 == Local twistd.log files ==
     186Local twistd.log files
     187======================
    176188
    177189[TODO: not yet true, requires foolscap-0.3.1 and a change to allmydata.node]
    178190
     
    188200(i.e. not the log.NOISY debugging events). In addition, foolscap internal
    189201events (like connection negotiation messages) are not bridged to twistd.log .
    190202
    191 == Adding log messages ==
     203Adding log messages
     204===================
    192205
    193206When adding new code, the Tahoe developer should add a reasonable number of
    194207new log events. For details, please see the Foolscap logging documentation,
    195208but a few notes are worth stating here:
    196209
    197  * use a facility prefix of "tahoe.", like "tahoe.mutable.publish"
     210* use a facility prefix of "tahoe.", like "tahoe.mutable.publish"
    198211
    199  * assign each severe (log.WEIRD or higher) event a unique message
    200    identifier, as the umid= argument to the log.msg() call. The
    201    misc/coding_tools/make_umid script may be useful for this purpose. This will make it
    202    easier to write a classification function for these messages.
    203 
    204  * use the parent= argument whenever the event is causally/temporally
    205    clustered with its parent. For example, a download process that involves
    206    three sequential hash fetches could announce the send and receipt of those
    207    hash-fetch messages with a parent= argument that ties them to the overall
    208    download process. However, each new wapi download request should be
    209    unparented.
    210 
    211  * use the format= argument in preference to the message= argument. E.g.
    212    use log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k) instead of
    213    log.msg("got %d shares, need %d" % (n,k)). This will allow later tools to
    214    analyze the event without needing to scrape/reconstruct the structured
    215    data out of the formatted string.
    216 
    217  * Pass extra information as extra keyword arguments, even if they aren't
    218    included in the format= string. This information will be displayed in the
    219    "flogtool dump --verbose" output, as well as being available to other
    220    tools. The umid= argument should be passed this way.
    221 
    222  * use log.err for the catch-all addErrback that gets attached to the end of
    223    any given Deferred chain. When used in conjunction with LOGTOTWISTED=1,
    224    log.err() will tell Twisted about the error-nature of the log message,
    225    causing Trial to flunk the test (with an "ERROR" indication that prints a
    226    copy of the Failure, including a traceback). Don't use log.err for events
    227    that are BAD but handled (like hash failures: since these are often
    228    deliberately provoked by test code, they should not cause test failures):
    229    use log.msg(level=BAD) for those instead.
     212* assign each severe (log.WEIRD or higher) event a unique message
     213  identifier, as the umid= argument to the log.msg() call. The
     214  misc/coding_tools/make_umid script may be useful for this purpose. This
     215  will make it easier to write a classification function for these messages.
     216
     217* use the parent= argument whenever the event is causally/temporally
     218  clustered with its parent. For example, a download process that involves
     219  three sequential hash fetches could announce the send and receipt of those
     220  hash-fetch messages with a parent= argument that ties them to the overall
     221  download process. However, each new wapi download request should be
     222  unparented.
     223
     224* use the format= argument in preference to the message= argument. E.g.
     225  use log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k) instead of
     226  log.msg("got %d shares, need %d" % (n,k)). This will allow later tools to
     227  analyze the event without needing to scrape/reconstruct the structured
     228  data out of the formatted string.
     229
     230* Pass extra information as extra keyword arguments, even if they aren't
     231  included in the format= string. This information will be displayed in the
     232  "flogtool dump --verbose" output, as well as being available to other
     233  tools. The umid= argument should be passed this way.
     234
     235* use log.err for the catch-all addErrback that gets attached to the end of
     236  any given Deferred chain. When used in conjunction with LOGTOTWISTED=1,
     237  log.err() will tell Twisted about the error-nature of the log message,
     238  causing Trial to flunk the test (with an "ERROR" indication that prints a
     239  copy of the Failure, including a traceback). Don't use log.err for events
     240  that are BAD but handled (like hash failures: since these are often
     241  deliberately provoked by test code, they should not cause test failures):
     242  use log.msg(level=BAD) for those instead.
    230243
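To illustrate the conventions above, here is a minimal sketch (not part of
this patch; the umid strings are placeholders for values that the
misc/coding_tools/make_umid script would generate)::

    from foolscap.logging import log

    def report_download(si, n, k):
        # each new download request gets an unparented event; log.msg()
        # returns an event number that children can pass as parent=
        ev = log.msg(format="starting download of %(si)s", si=si,
                     umid="aaaaaaaa")
        # prefer format= plus structured kwargs over a pre-formatted string
        log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k,
                parent=ev, umid="bbbbbbbb")
        # a BAD-but-handled condition: level=BAD, not log.err()
        log.msg("hash failure (handled)", level=log.BAD, parent=ev,
                umid="cccccccc")
        # at the very end of a Deferred chain: d.addErrback(log.err)
        return ev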
    231244
    232 == Log Messages During Unit Tests ==
     245Log Messages During Unit Tests
     246==============================
    233247
    234248If a test is failing and you aren't sure why, start by enabling
    235249FLOGTOTWISTED=1 like this:
    236250
    237  make test FLOGTOTWISTED=1
     251  make test FLOGTOTWISTED=1
    238252
    239253With FLOGTOTWISTED=1, sufficiently-important log events will be written into
    240254_trial_temp/test.log, which may give you more ideas about why the test is
     
    246260If that isn't enough, look at the detailed foolscap logging messages instead,
    247261by running the tests like this:
    248262
    249  make test FLOGFILE=flog.out.bz2 FLOGLEVEL=1 FLOGTOTWISTED=1
     263  make test FLOGFILE=flog.out.bz2 FLOGLEVEL=1 FLOGTOTWISTED=1
    250264
    251265The first environment variable will cause foolscap log events to be written
    252266to ./flog.out.bz2 (instead of merely being recorded in the circular buffers
  • docs/performance.txt

    diff -rN -u old-tahoe-lafs/docs/performance.txt new-tahoe-lafs/docs/performance.txt
    old new  
    1 = Performance costs for some common operations =
     1============================================
     2Performance costs for some common operations
     3============================================
     4
     51.  `Publishing an A-byte immutable file`_
     62.  `Publishing an A-byte mutable file`_
     73.  `Downloading B bytes of an A-byte immutable file`_
     84.  `Downloading B bytes of an A-byte mutable file`_
     95.  `Modifying B bytes of an A-byte mutable file`_
     106.  `Inserting/Removing B bytes in an A-byte mutable file`_
     117.  `Adding an entry to an A-entry directory`_
     128.  `Listing an A-entry directory`_
     139.  `Performing a file-check on an A-byte file`_
     1410. `Performing a file-verify on an A-byte file`_
     1511. `Repairing an A-byte file (mutable or immutable)`_
    216
    3 1.  Publishing an A-byte immutable file
    4 2.  Publishing an A-byte mutable file
    5 3.  Downloading B bytes of an A-byte immutable file
    6 4.  Downloading B bytes of an A-byte mutable file
    7 5.  Modifying B bytes of an A-byte mutable file
    8 6.  Inserting/Removing B bytes in an A-byte mutable file
    9 7.  Adding an entry to an A-entry directory
    10 8.  Listing an A entry directory
    11 9.  Performing a file-check on an A-byte file
    12 10. Performing a file-verify on an A-byte file
    13 11. Repairing an A-byte file (mutable or immutable)
    14 
    15 == Publishing an A-byte immutable file ==
     17Publishing an ``A``-byte immutable file
     18=======================================
    1619
    1720network: A
     21
    1822memory footprint: N/k*128KiB
    1923
    2024notes: An immutable file upload requires an additional I/O pass over the entire
    21        source file before the upload process can start, since convergent
    22        encryption derives the encryption key in part from the contents of the
    23        source file.
     25source file before the upload process can start, since convergent
     26encryption derives the encryption key in part from the contents of the
     27source file.
    2428
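To make the formulas concrete, a quick illustrative calculation (not part of
the original document) using Tahoe-LAFS's default 3-of-10 encoding::

    k, N = 3, 10                    # default encoding parameters
    SEGMENT = 128 * 1024            # default segment size, in bytes
    peak = N * SEGMENT / float(k)   # the N/k*128KiB footprint above
    print("peak encoding memory: ~%d KiB" % (peak / 1024))   # ~426 KiB
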
    25 == Publishing an A-byte mutable file ==
     29Publishing an ``A``-byte mutable file
     30=====================================
    2631
    2732network: A
     33
    2834memory footprint: N/k*A
     35
    2936cpu: O(A) + a large constant for RSA keypair generation
    3037
    31 notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that
    32        it publishes to a grid. This takes up to 1 or 2 seconds on a
    33        typical desktop PC.
    34 
    35        Part of the process of encrypting, encoding, and uploading a
    36        mutable file to a Tahoe-LAFS grid requires that the entire file
    37        be in memory at once. For larger files, this may cause
    38        Tahoe-LAFS to have an unacceptably large memory footprint (at
    39        least when uploading a mutable file).
     38notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that it
     39publishes to a grid. This can take 1 or 2 seconds on a typical desktop PC.
    4040
    41 == Downloading B bytes of an A-byte immutable file ==
     41Part of the process of encrypting, encoding, and uploading a mutable file to a
     42Tahoe-LAFS grid requires that the entire file be in memory at once. For larger
     43files, this may cause Tahoe-LAFS to have an unacceptably large memory footprint
     44(at least when uploading a mutable file).
     45
     46Downloading ``B`` bytes of an ``A``-byte immutable file
     47=======================================================
    4248
    4349network: B
     50
    4451memory footprint: 128KiB
    4552
    4653notes: When Tahoe-LAFS 1.8.0 or later is asked to read an arbitrary range
    47        of an immutable file, only the 128-KiB segments that overlap the
    48        requested range will be downloaded.
     54of an immutable file, only the 128-KiB segments that overlap the
     55requested range will be downloaded.
    4956
    50        (Earlier versions would download from the beginning of the file up
    51        until the end of the requested range, and then continue to download
    52        the rest of the file even after the request was satisfied.)
     57(Earlier versions would download from the beginning of the file up
     58until the end of the requested range, and then continue to download
     59the rest of the file even after the request was satisfied.)
    5360
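The segment arithmetic is simple; this sketch (an illustration, not code from
Tahoe-LAFS) shows which 128-KiB segments a range request touches::

    SEGMENT = 128 * 1024

    def segments_touched(offset, length):
        first = offset // SEGMENT
        last = (offset + length - 1) // SEGMENT
        return list(range(first, last + 1))

    # reading 4096 bytes at a 1-MiB offset touches only segment 8, so
    # only ~128 KiB is fetched no matter how large the file is
    print(segments_touched(1024 * 1024, 4096))   # [8]
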
    54 == Downloading B bytes of an A-byte mutable file ==
     61Downloading ``B`` bytes of an ``A``-byte mutable file
     62=====================================================
    5563
    5664network: A
     65
    5766memory footprint: A
    5867
    5968notes: As currently implemented, mutable files must be downloaded in
    60        their entirety before any part of them can be read. We are
    61        exploring fixes for this; see ticket #393 for more information.
     69their entirety before any part of them can be read. We are
     70exploring fixes for this; see ticket #393 for more information.
    6271
    63 == Modifying B bytes of an A-byte mutable file ==
     72Modifying ``B`` bytes of an ``A``-byte mutable file
     73===================================================
    6474
    6575network: A
     76
    6677memory footprint: N/k*A
    6778
    6879notes: If you upload a changed version of a mutable file that you
    69        earlier put onto your grid with, say, 'tahoe put --mutable',
    70        Tahoe-LAFS will replace the old file with the new file on the
    71        grid, rather than attempting to modify only those portions of the
    72        file that have changed. Modifying a file in this manner is
    73        essentially uploading the file over again, except that it re-uses
    74        the existing RSA keypair instead of generating a new one.
     80earlier put onto your grid with, say, 'tahoe put --mutable',
     81Tahoe-LAFS will replace the old file with the new file on the
     82grid, rather than attempting to modify only those portions of the
     83file that have changed. Modifying a file in this manner is
     84essentially uploading the file over again, except that it re-uses
     85the existing RSA keypair instead of generating a new one.
    7586
    76 == Inserting/Removing B bytes in an A-byte mutable file ==
     87Inserting/Removing ``B`` bytes in an ``A``-byte mutable file
     88============================================================
    7789
    7890network: A
     91
    7992memory footprint: N/k*A
    8093
    8194notes: Modifying any part of a mutable file in Tahoe-LAFS requires that
    82        the entire file be downloaded, modified, held in memory while it is
    83        encrypted and encoded, and then re-uploaded. A future version of the
    84        mutable file layout ("LDMF") may provide efficient inserts and
    85        deletes. Note that this sort of modification is mostly used internally
    86        for directories, and isn't something that the WUI, CLI, or other
    87        interfaces will do -- instead, they will simply overwrite the file to
    88        be modified, as described in "Modifying B bytes of an A-byte mutable
    89        file".
     95the entire file be downloaded, modified, held in memory while it is
     96encrypted and encoded, and then re-uploaded. A future version of the
     97mutable file layout ("LDMF") may provide efficient inserts and
     98deletes. Note that this sort of modification is mostly used internally
     99for directories, and isn't something that the WUI, CLI, or other
     100interfaces will do -- instead, they will simply overwrite the file to
     101be modified, as described in "Modifying B bytes of an A-byte mutable
     102file".
    90103
    91 == Adding an entry to an A-entry directory ==
     104Adding an entry to an ``A``-entry directory
     105===========================================
    92106
    93107network: O(A)
     108
    94109memory footprint: N/k*A
    95110
    96111notes: In Tahoe-LAFS, directories are implemented as specialized mutable
    97        files. So adding an entry to a directory is essentially adding B
    98        (actually, 300-330) bytes somewhere in an existing mutable file.
     112files. So adding an entry to a directory is essentially adding a small
     113number of bytes (about 300-330) somewhere in an existing mutable file.
    99114
    100 == Listing an A entry directory ==
     115Listing an ``A``-entry directory
     116================================
    101117
    102118network: O(A)
     119
    103120memory footprint: N/k*A
    104121
    105122notes: Listing a directory requires that the mutable file storing the
    106        directory be downloaded from the grid. So listing an A entry
    107        directory requires downloading a (roughly) 330 * A byte mutable
    108        file, since each directory entry is about 300-330 bytes in size.
     123directory be downloaded from the grid. So listing an A-entry
     124directory requires downloading a (roughly) 330 * A byte mutable
     125file, since each directory entry is about 300-330 bytes in size.
    109126
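For a sense of scale (illustrative arithmetic only, not from the original
document), listing a 1000-entry directory means downloading roughly::

    entries = 1000
    bytes_per_entry = 330              # upper end of the 300-330 byte figure
    print(entries * bytes_per_entry)   # ~330,000 bytes of mutable file
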
    110 == Performing a file-check on an A-byte file ==
     127Performing a file-check on an ``A``-byte file
     128=============================================
    111129
    112130network: O(S), where S is the number of servers on your grid
     131
    113132memory footprint: negligible
    114133
    115134notes: To check a file, Tahoe-LAFS queries all the servers that it knows
    116        about. Note that neither of these values directly depend on the size
    117        of the file. This is relatively inexpensive, compared to the verify
    118        and repair operations.
     135about. Note that neither of these values directly depends on the size
     136of the file. This is relatively inexpensive, compared to the verify
     137and repair operations.
    119138
    120 == Performing a file-verify on an A-byte file ==
     139Performing a file-verify on an ``A``-byte file
     140==============================================
    121141
    122142network: N/k*A
     143
    123144memory footprint: N/k*128KiB
    124145
    125146notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext
    126        shares that were originally uploaded to the grid and integrity
    127        checks them. This is, for well-behaved grids, likely to be more
    128        expensive than downloading an A-byte file, since only a fraction
    129        of these shares are necessary to recover the file.
     147shares that were originally uploaded to the grid and integrity
     148checks them. This is, for well-behaved grids, likely to be more
     149expensive than downloading an A-byte file, since only a fraction
     150of these shares are necessary to recover the file.
    130151
    131 == Repairing an A-byte file (mutable or immutable) ==
     152Repairing an ``A``-byte file (mutable or immutable)
     153===================================================
    132154
    133155network: variable; up to around O(A)
     156
    134157memory footprint: from 128KiB to (1+N/k)*128KiB
    135158
    136159notes: To repair a file, Tahoe-LAFS downloads the file, and generates/uploads
    137        missing shares in the same way as when it initially uploads the file.
    138        So, depending on how many shares are missing, this can be about as
    139        expensive as initially uploading the file in the first place.
     160missing shares in the same way as when it initially uploads the file.
     161So, depending on how many shares are missing, this can be about as
     162expensive as initially uploading the file in the first place.
  • docs/stats.txt

    diff -rN -u old-tahoe-lafs/docs/stats.txt new-tahoe-lafs/docs/stats.txt
    old new  
    1 = Tahoe Statistics =
     1================
     2Tahoe Statistics
     3================
     4
     51. `Overview`_
     62. `Statistics Categories`_
     73. `Running a Tahoe Stats-Gatherer Service`_
     84. `Using Munin To Graph Stats Values`_
    29
    3 1. Overview
    4 2. Statistics Categories
    5 3. Running a Tahoe Stats-Gatherer Service
    6 4. Using Munin To Graph Stats Values
    7 
    8 == Overview ==
     10Overview
     11========
    912
    1013Each Tahoe node collects and publishes statistics about its operations as it
    1114runs. These include counters of how many files have been uploaded and
     
    2023block, along with a copy of the raw counters. To obtain just the raw counters
    2124(in JSON format), use /statistics?t=json instead.
    2225
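For example, a monitoring script could pull the raw counters like this (a
sketch only: it assumes a node webapi listening on localhost:3456 and the
dotted key names described below)::

    import json, urllib2

    f = urllib2.urlopen("http://localhost:3456/statistics?t=json")
    data = json.load(f)
    print(data["counters"].get("uploader.files_uploaded", 0))
    print(data["stats"].get("node.uptime"))
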
    23 == Statistics Categories ==
     26Statistics Categories
     27=====================
    2428
    2529The stats dictionary contains two keys: 'counters' and 'stats'. 'counters'
    2630are strictly counters: they are reset to zero when the node is started, and
     
    3539
    3640The currently available stats (as of release 1.6.0 or so) are described here:
    3741
    38 counters.storage_server.*: this group counts inbound storage-server
    39                            operations. They are not provided by client-only
    40                            nodes which have been configured to not run a
    41                            storage server (with [storage]enabled=false in
    42                            tahoe.cfg)
    43   allocate, write, close, abort: these are for immutable file uploads.
    44                                  'allocate' is incremented when a client asks
    45                                  if it can upload a share to the server.
    46                                  'write' is incremented for each chunk of
    47                                  data written. 'close' is incremented when
    48                                  the share is finished. 'abort' is
    49                                  incremented if the client abandons the
    50                                  uploaed.
    51   get, read: these are for immutable file downloads. 'get' is incremented
    52              when a client asks if the server has a specific share. 'read' is
    53              incremented for each chunk of data read.
    54   readv, writev: these are for immutable file creation, publish, and
    55                  retrieve. 'readv' is incremented each time a client reads
    56                  part of a mutable share. 'writev' is incremented each time a
    57                  client sends a modification request.
    58   add-lease, renew, cancel: these are for share lease modifications.
    59                             'add-lease' is incremented when an 'add-lease'
    60                             operation is performed (which either adds a new
    61                             lease or renews an existing lease). 'renew' is
    62                             for the 'renew-lease' operation (which can only
    63                             be used to renew an existing one). 'cancel' is
    64                             used for the 'cancel-lease' operation.
    65   bytes_freed: this counts how many bytes were freed when a 'cancel-lease'
    66                operation removed the last lease from a share and the share
    67                was thus deleted.
    68   bytes_added: this counts how many bytes were consumed by immutable share
    69                uploads. It is incremented at the same time as the 'close'
    70                counter.
    71 
    72 stats.storage_server.*:
    73  allocated: this counts how many bytes are currently 'allocated', which
    74             tracks the space that will eventually be consumed by immutable
    75             share upload operations. The stat is increased as soon as the
    76             upload begins (at the same time the 'allocated' counter is
    77             incremented), and goes back to zero when the 'close' or 'abort'
    78             message is received (at which point the 'disk_used' stat should
    79             incremented by the same amount).
    80  disk_total
    81  disk_used
    82  disk_free_for_root
    83  disk_free_for_nonroot
    84  disk_avail
    85  reserved_space: these all reflect disk-space usage policies and status.
    86                  'disk_total' is the total size of disk where the storage
    87                  server's BASEDIR/storage/shares directory lives, as reported
    88                  by /bin/df or equivalent. 'disk_used', 'disk_free_for_root',
    89                  and 'disk_free_for_nonroot' show related information.
    90                  'reserved_space' reports the reservation configured by the
    91                  tahoe.cfg [storage]reserved_space value. 'disk_avail'
    92                  reports the remaining disk space available for the Tahoe
    93                  server after subtracting reserved_space from disk_avail. All
    94                  values are in bytes.
    95  accepting_immutable_shares: this is '1' if the storage server is currently
    96                              accepting uploads of immutable shares. It may be
    97                              '0' if a server is disabled by configuration, or
    98                              if the disk is full (i.e. disk_avail is less
    99                              than reserved_space).
    100  total_bucket_count: this counts the number of 'buckets' (i.e. unique
    101                      storage-index values) currently managed by the storage
    102                      server. It indicates roughly how many files are managed
    103                      by the server.
    104  latencies.*.*: these stats keep track of local disk latencies for
    105                 storage-server operations. A number of percentile values are
    106                 tracked for many operations. For example,
    107                 'storage_server.latencies.readv.50_0_percentile' records the
    108                 median response time for a 'readv' request. All values are in
    109                 seconds. These are recorded by the storage server, starting
    110                 from the time the request arrives (post-deserialization) and
    111                 ending when the response begins serialization. As such, they
    112                 are mostly useful for measuring disk speeds. The operations
    113                 tracked are the same as the counters.storage_server.* counter
    114                 values (allocate, write, close, get, read, add-lease, renew,
    115                 cancel, readv, writev). The percentile values tracked are:
    116                 mean, 01_0_percentile, 10_0_percentile, 50_0_percentile,
    117                 90_0_percentile, 95_0_percentile, 99_0_percentile,
    118                 99_9_percentile. (the last value, 99.9 percentile, means that
    119                 999 out of the last 1000 operations were faster than the
    120                 given number, and is the same threshold used by Amazon's
    121                 internal SLA, according to the Dynamo paper).
    122 
    123 counters.uploader.files_uploaded
    124 counters.uploader.bytes_uploaded
    125 counters.downloader.files_downloaded
    126 counters.downloader.bytes_downloaded
    127 
    128  These count client activity: a Tahoe client will increment these when it
    129  uploads or downloads an immutable file. 'files_uploaded' is incremented by
    130  one for each operation, while 'bytes_uploaded' is incremented by the size of
    131  the file.
    132 
    133 counters.mutable.files_published
    134 counters.mutable.bytes_published
    135 counters.mutable.files_retrieved
    136 counters.mutable.bytes_retrieved
     42**counters.storage_server.\***
     43
     44    this group counts inbound storage-server operations. They are not provided
     45    by client-only nodes which have been configured to not run a storage server
     46    (with [storage]enabled=false in tahoe.cfg)
     47
     48    allocate, write, close, abort
     49        these are for immutable file uploads. 'allocate' is incremented when a
     50        client asks if it can upload a share to the server. 'write' is
     51        incremented for each chunk of data written. 'close' is incremented when
     52        the share is finished. 'abort' is incremented if the client abandons
     53        the upload.
     54
     55    get, read
     56        these are for immutable file downloads. 'get' is incremented
     57        when a client asks if the server has a specific share. 'read' is
     58        incremented for each chunk of data read.
     59
     60    readv, writev
     61        these are for mutable file creation, publish, and retrieve. 'readv'
     62        is incremented each time a client reads part of a mutable share.
     63        'writev' is incremented each time a client sends a modification
     64        request.
     65
     66    add-lease, renew, cancel
     67        these are for share lease modifications. 'add-lease' is incremented
     68        when an 'add-lease' operation is performed (which either adds a new
     69        lease or renews an existing lease). 'renew' is for the 'renew-lease'
     70        operation (which can only be used to renew an existing one). 'cancel'
     71        is used for the 'cancel-lease' operation.
     72
     73    bytes_freed
     74        this counts how many bytes were freed when a 'cancel-lease'
     75        operation removed the last lease from a share and the share
     76        was thus deleted.
     77
     78    bytes_added
     79        this counts how many bytes were consumed by immutable share
     80        uploads. It is incremented at the same time as the 'close'
     81        counter.
     82
     83**stats.storage_server.\***
     84
     85    allocated
     86        this counts how many bytes are currently 'allocated', which
     87        tracks the space that will eventually be consumed by immutable
     88        share upload operations. The stat is increased as soon as the
     89        upload begins (at the same time the 'allocate' counter is
     90        incremented), and goes back to zero when the 'close' or 'abort'
     91        message is received (at which point the 'disk_used' stat should
     92        be incremented by the same amount).
     93
     94    disk_total, disk_used, disk_free_for_root, disk_free_for_nonroot, disk_avail, reserved_space
     95        these all reflect disk-space usage policies and status.
     96        'disk_total' is the total size of disk where the storage
     97        server's BASEDIR/storage/shares directory lives, as reported
     98        by /bin/df or equivalent. 'disk_used', 'disk_free_for_root',
     99        and 'disk_free_for_nonroot' show related information.
     100        'reserved_space' reports the reservation configured by the
     101        tahoe.cfg [storage]reserved_space value. 'disk_avail' reports
     102        the remaining disk space available for the Tahoe server after
     103        subtracting reserved_space from disk_free_for_nonroot. All
     104        values are in bytes.
     105
     106    accepting_immutable_shares
     107        this is '1' if the storage server is currently accepting uploads of
     108        immutable shares. It may be '0' if a server is disabled by
     109        configuration, or if the disk is full (i.e. disk_avail is less than
     110        reserved_space).
     111
     112    total_bucket_count
     113        this counts the number of 'buckets' (i.e. unique
     114        storage-index values) currently managed by the storage
     115        server. It indicates roughly how many files are managed
     116        by the server.
     117
     118    latencies.*.*
     119        these stats keep track of local disk latencies for
     120        storage-server operations. A number of percentile values are
     121        tracked for many operations. For example,
     122        'storage_server.latencies.readv.50_0_percentile' records the
     123        median response time for a 'readv' request. All values are in
     124        seconds. These are recorded by the storage server, starting
     125        from the time the request arrives (post-deserialization) and
     126        ending when the response begins serialization. As such, they
     127        are mostly useful for measuring disk speeds. The operations
     128        tracked are the same as the counters.storage_server.* counter
     129        values (allocate, write, close, get, read, add-lease, renew,
     130        cancel, readv, writev). The percentile values tracked are:
     131        mean, 01_0_percentile, 10_0_percentile, 50_0_percentile,
     132        90_0_percentile, 95_0_percentile, 99_0_percentile,
     133        99_9_percentile. (the last value, 99.9 percentile, means that
     134        999 out of the last 1000 operations were faster than the
     135        given number, and is the same threshold used by Amazon's
     136        internal SLA, according to the Dynamo paper).
     137
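The percentile bookkeeping can be pictured like this (an illustration of the
definition above, not Tahoe-LAFS's actual implementation)::

    def percentile(samples, fraction):
        ordered = sorted(samples)
        return ordered[int(len(ordered) * fraction)]

    recent = list(range(1000))        # stand-in latency samples
    print(percentile(recent, 0.999))  # 999 of the 1000 samples are smaller
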
     138**counters.uploader.files_uploaded**
     139
     140**counters.uploader.bytes_uploaded**
     141
     142**counters.downloader.files_downloaded**
     143
     144**counters.downloader.bytes_downloaded**
     145
     146    These count client activity: a Tahoe client will increment these when it
     147    uploads or downloads an immutable file. 'files_uploaded' is incremented by
     148    one for each operation, while 'bytes_uploaded' is incremented by the size of
     149    the file.
     150
     151**counters.mutable.files_published**
     152
     153**counters.mutable.bytes_published**
     154
     155**counters.mutable.files_retrieved**
     156
     157**counters.mutable.bytes_retrieved**
    137158
    138159 These count client activity for mutable files. 'published' is the act of
    139160 changing an existing mutable file (or creating a brand-new mutable file).
    140161 'retrieved' is the act of reading its current contents.
    141162
    142 counters.chk_upload_helper.*
     163**counters.chk_upload_helper.\***
     164
     165    These count activity of the "Helper", which receives ciphertext from clients
     166    and performs erasure-coding and share upload for files that are not already
     167    in the grid. The code which implements these counters is in
     168    src/allmydata/immutable/offloaded.py .
     169
     170    upload_requests
     171        incremented each time a client asks to upload a file
     172
     173    upload_already_present
     174        incremented when the file is already in the grid
     175
     174    upload_need_upload
     175        incremented when the file is not already in the grid
     176
     177    resumes
     178        incremented when the helper already has partial ciphertext for
     179        the requested upload, indicating that the client is resuming an
     180        earlier upload
     181
     182    fetched_bytes
     183        this counts how many bytes of ciphertext have been fetched
     184        from uploading clients
     185
     186    encoded_bytes
     187        this counts how many bytes of ciphertext have been
     188        encoded and turned into successfully-uploaded shares. If no
     189        uploads have failed or been abandoned, encoded_bytes should
     190        eventually equal fetched_bytes.
     191
     192**stats.chk_upload_helper.\***
     193
     194    These also track Helper activity:
     195
     196    active_uploads
     197        how many files are currently being uploaded. 0 when idle.
     198
     199    incoming_count
     200        how many cache files are present in the incoming/ directory,
     201        which holds ciphertext files that are still being fetched
     202        from the client
    143203
    144  These count activity of the "Helper", which receives ciphertext from clients
    145  and performs erasure-coding and share upload for files that are not already
    146  in the grid. The code which implements these counters is in
    147  src/allmydata/immutable/offloaded.py .
    148 
    149   upload_requests: incremented each time a client asks to upload a file
    150   upload_already_present: incremented when the file is already in the grid
    151   upload_need_upload: incremented when the file is not already in the grid
    152   resumes: incremented when the helper already has partial ciphertext for
    153            the requested upload, indicating that the client is resuming an
    154            earlier upload
    155   fetched_bytes: this counts how many bytes of ciphertext have been fetched
    156                  from uploading clients
    157   encoded_bytes: this counts how many bytes of ciphertext have been
    158                  encoded and turned into successfully-uploaded shares. If no
    159                  uploads have failed or been abandoned, encoded_bytes should
    160                  eventually equal fetched_bytes.
    161 
    162 stats.chk_upload_helper.*
    163 
    164  These also track Helper activity:
    165 
    166   active_uploads: how many files are currently being uploaded. 0 when idle.
    167   incoming_count: how many cache files are present in the incoming/ directory,
    168                   which holds ciphertext files that are still being fetched
    169                   from the client
    170   incoming_size: total size of cache files in the incoming/ directory
    171   incoming_size_old: total size of 'old' cache files (more than 48 hours)
    172   encoding_count: how many cache files are present in the encoding/ directory,
    173                   which holds ciphertext files that are being encoded and
    174                   uploaded
    175   encoding_size: total size of cache files in the encoding/ directory
    176   encoding_size_old: total size of 'old' cache files (more than 48 hours)
    177 
    178 stats.node.uptime: how many seconds since the node process was started
    179 
    180 stats.cpu_monitor.*:
    181   .1min_avg, 5min_avg, 15min_avg: estimate of what percentage of system CPU
    182                                   time was consumed by the node process, over
    183                                   the given time interval. Expressed as a
    184                                   float, 0.0 for 0%, 1.0 for 100%
    185   .total: estimate of total number of CPU seconds consumed by node since
    186           the process was started. Ticket #472 indicates that .total may
    187           sometimes be negative due to wraparound of the kernel's counter.
    188 
    189 stats.load_monitor.*:
    190  When enabled, the "load monitor" continually schedules a one-second
    191  callback, and measures how late the response is. This estimates system load
    192  (if the system is idle, the response should be on time). This is only
    193  enabled if a stats-gatherer is configured.
     204    incoming_size
     205        total size of cache files in the incoming/ directory
    194206
    195  .avg_load: average "load" value (seconds late) over the last minute
    196  .max_load: maximum "load" value over the last minute
     207    incoming_size_old
     208        total size of 'old' cache files (more than 48 hours)
    197209
     210    encoding_count
     211        how many cache files are present in the encoding/ directory,
     212        which holds ciphertext files that are being encoded and
     213        uploaded
    198214
    199 == Running a Tahoe Stats-Gatherer Service ==
     215    encoding_size
     216        total size of cache files in the encoding/ directory
     217
     218    encoding_size_old
     219        total size of 'old' cache files (more than 48 hours)
     220
     221**stats.node.uptime**
     222    how many seconds since the node process was started
     223
     224**stats.cpu_monitor.\***
     225
     226    1min_avg, 5min_avg, 15min_avg
     227        estimate of what percentage of system CPU time was consumed by the
     228        node process, over the given time interval. Expressed as a float, 0.0
     229        for 0%, 1.0 for 100%
     230
     231    total
     232        estimate of total number of CPU seconds consumed by node since
     233        the process was started. Ticket #472 indicates that 'total' may
     234        sometimes be negative due to wraparound of the kernel's counter.
     235
     236**stats.load_monitor.\***
     237
     238    When enabled, the "load monitor" continually schedules a one-second
     239    callback, and measures how late the response is. This estimates system load
     240    (if the system is idle, the response should be on time). This is only
     241    enabled if a stats-gatherer is configured.
     242
     243    avg_load
     244        average "load" value (seconds late) over the last minute
     245
     246    max_load
     247        maximum "load" value over the last minute
     248
     249
     250Running a Tahoe Stats-Gatherer Service
     251======================================
    200252
    201253The "stats-gatherer" is a simple daemon that periodically collects stats from
    202254several tahoe nodes. It could be useful, e.g., in a production environment,
     
    204256host. It merely gathers statistics from many nodes into a single place: it
    205257does not do any actual analysis.
    206258
    207 The stats gatherer listens on a network port using the same Foolscap
     259The stats gatherer listens on a network port using the same Foolscap_
    208260connection library that Tahoe clients use to connect to storage servers.
    209261Tahoe nodes can be configured to connect to the stats gatherer and publish
    210 their stats on a periodic basis. (in fact, what happens is that nodes connect
     262their stats on a periodic basis. (In fact, what happens is that nodes connect
    211263to the gatherer and offer it a second FURL which points back to the node's
    212264"stats port", which the gatherer then uses to pull stats on a periodic basis.
    213265The initial connection is flipped to allow the nodes to live behind NAT
    214 boxes, as long as the stats-gatherer has a reachable IP address)
     266boxes, as long as the stats-gatherer has a reachable IP address.)
     267
     268.. _Foolscap: http://foolscap.lothar.com/trac
    215269
    216270The stats-gatherer is created in the same fashion as regular tahoe client
    217271nodes and introducer nodes. Choose a base directory for the gatherer to live
    218272in (but do not create the directory). Then run:
    219273
    220  tahoe create-stats-gatherer $BASEDIR
     274::
     275
     276   tahoe create-stats-gatherer $BASEDIR
    221277
    222278and start it with "tahoe start $BASEDIR". Once running, the gatherer will
    223279write a FURL into $BASEDIR/stats_gatherer.furl .
     
    226282this FURL into the node's tahoe.cfg file, in a section named "[client]",
    227283under a key named "stats_gatherer.furl", like so:
    228284
    229  [client]
    230  stats_gatherer.furl = pb://qbo4ktl667zmtiuou6lwbjryli2brv6t@192.168.0.8:49997/wxycb4kaexzskubjnauxeoptympyf45y
     285::
     286
     287    [client]
     288    stats_gatherer.furl = pb://qbo4ktl667zmtiuou6lwbjryli2brv6t@192.168.0.8:49997/wxycb4kaexzskubjnauxeoptympyf45y
    231289
    232290or simply copy the stats_gatherer.furl file into the node's base directory
    233291(next to the tahoe.cfg file): it will be interpreted in the same way.
     
    256314total-disk-available number for the entire grid (however, the "disk watcher"
    257315daemon, in misc/operations_helpers/spacetime/, is better suited for this specific task).
    258316
    259 == Using Munin To Graph Stats Values ==
     317Using Munin To Graph Stats Values
     318=================================
    260319
    261320The misc/munin/ directory contains various plugins to graph stats for Tahoe
    262 nodes. They are intended for use with the Munin system-management tool, which
     321nodes. They are intended for use with the Munin_ system-management tool, which
    263322typically polls target systems every 5 minutes and produces a web page with
    264323graphs of various things over multiple time scales (last hour, last month,
    265324last year).
    266325
     326.. _Munin: http://munin-monitoring.org/
     327
    267328Most of the plugins are designed to pull stats from a single Tahoe node, and
    268329are configured with, e.g., the http://localhost:3456/statistics?t=json URL. The
    269330"tahoe_stats" plugin is designed to read from the pickle file created by the