Ticket #1225: docs-txt-rst-conversion.patch
File docs-txt-rst-conversion.patch, 130.7 KB (added by p-static at 2010-10-14T07:49:17Z)
docs/architecture.txt
diff -rN -u old-tahoe-lafs/docs/architecture.txt new-tahoe-lafs/docs/architecture.txt
=======================
Tahoe-LAFS Architecture
=======================

1. `Overview`_
2. `The Key-Value Store`_
3. `File Encoding`_
4. `Capabilities`_
5. `Server Selection`_
6. `Swarming Download, Trickling Upload`_
7. `The Filesystem Layer`_
8. `Leases, Refreshing, Garbage Collection`_
9. `File Repairer`_
10. `Security`_
11. `Reliability`_


Overview
========

(See the docs/specifications directory for more details.)

[...]

copies files from the local disk onto the decentralized filesystem. We later
provide read-only access to those files, allowing users to recover them.
There are several other applications built on top of the Tahoe-LAFS
filesystem (see the `RelatedProjects
<http://tahoe-lafs.org/trac/tahoe-lafs/wiki/RelatedProjects>`_ page of the
wiki for a list).


The Key-Value Store
===================

The key-value store is implemented by a grid of Tahoe-LAFS storage servers --
user-space processes. Tahoe-LAFS storage clients communicate with the storage

[...]

server to tell a new client about all the others.


File Encoding
=============

When a client stores a file on the grid, it first encrypts the file. It then
breaks the encrypted file into small segments, in order to reduce the memory

[...]

into plaintext, then emit the plaintext bytes to the output target.


Capabilities
============

Capabilities to immutable files represent a specific set of bytes. Think of
it like a hash function: you feed in a bunch of bytes, and you get out a

[...]

that these potential bytes are indeed the ones that you were looking for.

The "key-value store" layer doesn't include human-meaningful names.
Capabilities sit on the "global+secure" edge of `Zooko's Triangle`_. They are
self-authenticating, meaning that nobody can trick you into accepting a file
that doesn't match the capability you used to refer to that file. The
filesystem layer (described below) adds human-meaningful names atop the
key-value layer.

.. _`Zooko's Triangle`: http://en.wikipedia.org/wiki/Zooko%27s_triangle


Server Selection
================

When a file is uploaded, the encoded shares are sent to some servers. But to
which ones? The "server selection" algorithm is used to make this choice.

The storage index is used to consistently-permute the set of all server nodes
(by sorting them by ``HASH(storage_index+nodeid)``). Each file gets a different
permutation, which (on average) will evenly distribute shares among the grid
and avoid hotspots. Each server has announced its available space when it
connected to the introducer, and we use that available space information to

[...]

significantly hurt reliability (sometimes the permutation resulted in most
of the shares being dumped on a single node).

Another algorithm (known as "denver airport" [#naming]_) uses the permuted hash to
decide on an approximate target for each share, then sends lease requests
via Chord routing. The request includes the contact information of the
uploading node, and asks that the node which eventually accepts the lease

[...]

the same approach. This allows nodes to avoid maintaining a large number of
long-term connections, at the expense of complexity and latency.

.. [#naming] all of these names are derived from the location where they were
   concocted, in this case in a car ride from Boulder to DEN. To be
   precise, "Tahoe 1" was an unworkable scheme in which everyone who holds
   shares for a given file would form a sort of cabal which kept track of
   all the others, "Tahoe 2" is the first-100-nodes in the permuted hash
   described in this document, and "Tahoe 3" (or perhaps "Potrero hill 1")
   was the abandoned ring-with-many-hands approach.


Swarming Download, Trickling Upload
===================================

Because the shares being downloaded are distributed across a large number of
nodes, the download process will pull from many of them at the same time. The

[...]

See "helper.txt" for details about the upload helper.


The Filesystem Layer
====================

The "filesystem" layer is responsible for mapping human-meaningful pathnames
(directories and filenames) to pieces of data. The actual bytes inside these

[...]

that are globally visible.


Leases, Refreshing, Garbage Collection
======================================

When a file or directory in the virtual filesystem is no longer referenced,
the space that its shares occupied on each storage server can be freed,

[...]

garbage collection.


File Repairer
=============

Shares may go away because the storage server hosting them has suffered a
failure: either temporary downtime (affecting availability of the file), or a

[...]

in client behavior.


Security
========

The design goal for this project is that an attacker may be able to deny
service (i.e. prevent you from recovering a file that was uploaded earlier)
but can accomplish none of the following three attacks:

1) violate confidentiality: the attacker gets to view data to which you have
   not granted them access
2) violate integrity: the attacker convinces you that the wrong data is
   actually the data you were intending to retrieve
3) violate unforgeability: the attacker gets to modify a mutable file or
   directory (either the pathnames or the file contents) to which you have
   not given them write permission

Integrity (the promise that the downloaded data will match the uploaded data)
is provided by the hashes embedded in the capability (for immutable files) or

[...]

capabilities).


Reliability
===========

File encoding and peer-node selection parameters can be adjusted to achieve
different goals. Each choice results in a number of properties; there are

[...]

view the disk consumption of each. It is also acquiring some sections with
availability/reliability numbers, as well as preliminary cost analysis data.
This tool will continue to evolve as our analysis improves.
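As a concrete illustration of the permuted-ring selection described above, the
sketch below sorts candidate servers by a hash of the storage index
concatenated with each server's nodeid, so every file sees the same servers in
a different but stable order. This is not Tahoe's actual peer-selection code;
the choice of SHA-256 and the toy byte-string nodeids are assumptions made
only for this example::

  import hashlib

  def permute_servers(storage_index, nodeids):
      # Sort nodeids by HASH(storage_index + nodeid), giving each storage
      # index its own stable permutation of the ring (a toy version of the
      # "Tahoe 2" scheme described in this document).
      return sorted(nodeids,
                    key=lambda nodeid: hashlib.sha256(storage_index + nodeid).digest())

  servers = [b"server-A", b"server-B", b"server-C", b"server-D"]
  print(permute_servers(b"storage-index-1", servers))
  print(permute_servers(b"storage-index-2", servers))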
docs/backdoors.txt
diff -rN -u old-tahoe-lafs/docs/backdoors.txt new-tahoe-lafs/docs/backdoors.txt
======================
Statement on Backdoors
======================

October 5, 2010

The New York Times has recently reported that the current U.S. administration
is proposing a bill that would apparently, if passed, require communication
systems to facilitate government wiretapping and access to encrypted data:

http://www.nytimes.com/2010/09/27/us/27wiretap.html (login required; username/password pairs available at http://www.bugmenot.com/view/nytimes.com).

Commentary by the Electronic Frontier Foundation
(https://www.eff.org/deeplinks/2010/09/government-seeks ), Peter Suderman /
Reason (http://reason.com/blog/2010/09/27/obama-administration-frustrate ),
Julian Sanchez / Cato Institute
(http://www.cato-at-liberty.org/designing-an-insecure-internet/ ).

The core Tahoe developers promise never to change Tahoe-LAFS to facilitate
government access to data stored or transmitted by it. Even if it were
desirable to facilitate such access—which it is not—we believe it would not be
technically feasible to do so without severely compromising Tahoe-LAFS'
security against other attackers. There have been many examples in which
backdoors intended for use by government have introduced vulnerabilities
exploitable by other parties (a notable example being the Greek cellphone
eavesdropping scandal in 2004/5). RFCs 1984 and 2804 elaborate on the
security case against such backdoors.

Note that since Tahoe-LAFS is open-source software, forks by people other than
the current core developers are possible. In that event, we would try to
persuade any such forks to adopt a similar policy.

The following Tahoe-LAFS developers agree with this statement:

David-Sarah Hopwood

Zooko Wilcox-O'Hearn

Brian Warner

Kevan Carstensen

Frédéric Marti

Jack Lloyd

François Deppierraz

Yu Xue

Marc Tooley
docs/backupdb.txt
diff -rN -u old-tahoe-lafs/docs/backupdb.txt new-tahoe-lafs/docs/backupdb.txt
==================
The Tahoe BackupDB
==================

1. `Overview`_
2. `Schema`_
3. `Upload Operation`_
4. `Directory Operations`_

Overview
========

To speed up backup operations, Tahoe maintains a small database known as the
"backupdb". This is used to avoid re-uploading files which have already been
uploaded recently.

[...]

as Debian etch (4.0 "oldstable") or Ubuntu Edgy (6.10) the "python-pysqlite2"
package won't work, but the "sqlite3-dev" package will.

Schema
======

The database contains the following tables::

  CREATE TABLE version
  (
   version integer  # contains one row, set to 1
  );

  CREATE TABLE local_files
  (
   path  varchar(1024),  PRIMARY KEY -- index, this is os.path.abspath(fn)
   size  integer,         -- os.stat(fn)[stat.ST_SIZE]
   mtime number,          -- os.stat(fn)[stat.ST_MTIME]
   ctime number,          -- os.stat(fn)[stat.ST_CTIME]
   fileid integer
  );

  CREATE TABLE caps
  (
   fileid integer PRIMARY KEY AUTOINCREMENT,
   filecap varchar(256) UNIQUE    -- URI:CHK:...
  );

  CREATE TABLE last_upload
  (
   fileid INTEGER PRIMARY KEY,
   last_uploaded TIMESTAMP,
   last_checked TIMESTAMP
  );

  CREATE TABLE directories
  (
   dirhash varchar(256) PRIMARY KEY,
   dircap varchar(256),
   last_uploaded TIMESTAMP,
   last_checked TIMESTAMP
  );

Upload Operation
================

The upload process starts with a pathname (like ~/.emacs) and wants to end up
with a file-cap (like URI:CHK:...).

[...]

is not present in this table, the file must be uploaded. The upload process
is:

1. record the file's size, creation time, and modification time

2. upload the file into the grid, obtaining an immutable file read-cap

3. add an entry to the 'caps' table, with the read-cap, to get a fileid

4. add an entry to the 'last_upload' table, with the current time

5. add an entry to the 'local_files' table, with the fileid, the path,
   and the local file's size/ctime/mtime

If the path *is* present in 'local_files', the easy-to-compute identifying
information is compared: file size and ctime/mtime. If these differ, the file

[...]

into the grid. The --no-timestamps option can be used to disable this
optimization, forcing every byte of the file to be hashed and encoded.

Directory Operations
====================

Once the contents of a directory are known (a filecap for each file, and a
dircap for each directory), the backup process must find or create a tahoe
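The check-then-upload flow above maps directly onto a few SQL statements. The
following sketch uses Python's sqlite3 module against a database laid out like
the schema above; it is illustrative only (it is not the code in Tahoe's
backupdb module), and the caller is assumed to perform the actual grid upload
and pass in the resulting read-cap::

  import os, stat

  def check_backupdb(db, path):
      # `db` is an sqlite3.Connection. Return the cached filecap if the file
      # looks unchanged (same size/mtime/ctime as recorded), else None.
      path = os.path.abspath(path)
      s = os.stat(path)
      size, mtime, ctime = s[stat.ST_SIZE], s[stat.ST_MTIME], s[stat.ST_CTIME]
      row = db.execute("SELECT size, mtime, ctime, fileid FROM local_files"
                       " WHERE path=?", (path,)).fetchone()
      if row is None or (row[0], row[1], row[2]) != (size, mtime, ctime):
          return None                      # unknown or changed: must upload
      cap = db.execute("SELECT filecap FROM caps WHERE fileid=?",
                       (row[3],)).fetchone()
      return cap[0] if cap else None

  def record_upload(db, path, filecap):
      # Steps 1-5 of the upload process above, in miniature.
      path = os.path.abspath(path)
      s = os.stat(path)
      fileid = db.execute("INSERT INTO caps (filecap) VALUES (?)",
                          (filecap,)).lastrowid
      db.execute("INSERT INTO last_upload VALUES (?, datetime('now'), datetime('now'))",
                 (fileid,))
      db.execute("INSERT INTO local_files VALUES (?, ?, ?, ?, ?)",
                 (path, s[stat.ST_SIZE], s[stat.ST_MTIME], s[stat.ST_CTIME], fileid))
      db.commit()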
docs/configuration.txt
diff -rN -u old-tahoe-lafs/docs/configuration.txt new-tahoe-lafs/docs/configuration.txt
old new 1 2 = Configuring a Tahoe node = 1 ======================== 2 Configuring a Tahoe node 3 ======================== 4 5 1. `Overall Node Configuration`_ 6 2. `Client Configuration`_ 7 3. `Storage Server Configuration`_ 8 4. `Running A Helper`_ 9 5. `Running An Introducer`_ 10 6. `Other Files in BASEDIR`_ 11 7. `Other files`_ 12 8. `Backwards Compatibility Files`_ 13 9. `Example`_ 3 14 4 15 A Tahoe node is configured by writing to files in its base directory. These 5 16 files are read by the node when it starts, so each time you change them, you … … 22 33 23 34 The item descriptions below use the following types: 24 35 25 boolean: one of (True, yes, on, 1, False, off, no, 0), case-insensitive 26 strports string: a Twisted listening-port specification string, like "tcp:80" 27 or "tcp:3456:interface=127.0.0.1". For a full description of 28 the format, see 29 http://twistedmatrix.com/documents/current/api/twisted.application.strports.html 30 FURL string: a Foolscap endpoint identifier, like 31 pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm 36 boolean 37 one of (True, yes, on, 1, False, off, no, 0), case-insensitive 38 39 strports string 40 a Twisted listening-port specification string, like "tcp:80" 41 or "tcp:3456:interface=127.0.0.1". For a full description of 42 the format, see 43 http://twistedmatrix.com/documents/current/api/twisted.application.strports.html 44 45 FURL string 46 a Foolscap endpoint identifier, like 47 pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm 32 48 33 49 34 == Overall Node Configuration == 50 Overall Node Configuration 51 ========================== 35 52 36 53 This section controls the network behavior of the node overall: which ports 37 54 and IP addresses are used, when connections are timed out, etc. This … … 43 60 that port number in the tub.port option. If behind a NAT, you *may* need to 44 61 set the tub.location option described below. 45 62 63 :: 46 64 47 [node]65 [node] 48 66 49 nickname = (UTF-8 string, optional)67 nickname = (UTF-8 string, optional) 50 68 51 This value will be displayed in management tools as this node's "nickname". 52 If not provided, the nickname will be set to "<unspecified>". This string 53 shall be a UTF-8 encoded unicode string. 54 55 web.port = (strports string, optional) 56 57 This controls where the node's webserver should listen, providing filesystem 58 access and node status as defined in webapi.txt . This file contains a 59 Twisted "strports" specification such as "3456" or 60 "tcp:3456:interface=127.0.0.1". The 'tahoe create-node' or 'tahoe create-client' 61 commands set the web.port to "tcp:3456:interface=127.0.0.1" by default; this 62 is overridable by the "--webport" option. You can make it use SSL by writing 63 "ssl:3456:privateKey=mykey.pem:certKey=cert.pem" instead. 64 65 If this is not provided, the node will not run a web server. 66 67 web.static = (string, optional) 68 69 This controls where the /static portion of the URL space is served. The 70 value is a directory name (~username is allowed, and non-absolute names are 71 interpreted relative to the node's basedir) which can contain HTML and other 72 files. This can be used to serve a javascript-based frontend to the Tahoe 73 node, or other services. 74 75 The default value is "public_html", which will serve $BASEDIR/public_html . 76 With the default settings, http://127.0.0.1:3456/static/foo.html will serve 77 the contents of $BASEDIR/public_html/foo.html . 
78 79 tub.port = (integer, optional) 80 81 This controls which port the node uses to accept Foolscap connections from 82 other nodes. If not provided, the node will ask the kernel for any available 83 port. The port will be written to a separate file (named client.port or 84 introducer.port), so that subsequent runs will re-use the same port. 85 86 tub.location = (string, optional) 87 88 In addition to running as a client, each Tahoe node also runs as a server, 89 listening for connections from other Tahoe clients. The node announces its 90 location by publishing a "FURL" (a string with some connection hints) to the 91 Introducer. The string it publishes can be found in 92 $BASEDIR/private/storage.furl . The "tub.location" configuration controls 93 what location is published in this announcement. 94 95 If you don't provide tub.location, the node will try to figure out a useful 96 one by itself, by using tools like 'ifconfig' to determine the set of IP 97 addresses on which it can be reached from nodes both near and far. It will 98 also include the TCP port number on which it is listening (either the one 99 specified by tub.port, or whichever port was assigned by the kernel when 100 tub.port is left unspecified). 101 102 You might want to override this value if your node lives behind a firewall 103 that is doing inbound port forwarding, or if you are using other proxies 104 such that the local IP address or port number is not the same one that 105 remote clients should use to connect. You might also want to control this 106 when using a Tor proxy to avoid revealing your actual IP address through the 107 Introducer announcement. 108 109 The value is a comma-separated string of host:port location hints, like 110 this: 111 112 123.45.67.89:8098,tahoe.example.com:8098,127.0.0.1:8098 113 114 A few examples: 115 116 Emulate default behavior, assuming your host has IP address 123.45.67.89 117 and the kernel-allocated port number was 8098: 118 119 tub.port = 8098 120 tub.location = 123.45.67.89:8098,127.0.0.1:8098 121 122 Use a DNS name so you can change the IP address more easily: 123 124 tub.port = 8098 125 tub.location = tahoe.example.com:8098 126 127 Run a node behind a firewall (which has an external IP address) that has 128 been configured to forward port 7912 to our internal node's port 8098: 129 130 tub.port = 8098 131 tub.location = external-firewall.example.com:7912 132 133 Run a node behind a Tor proxy (perhaps via torsocks), in client-only mode 134 (i.e. we can make outbound connections, but other nodes will not be able to 135 connect to us). The literal 'unreachable.example.org' will not resolve, but 136 will serve as a reminder to human observers that this node cannot be 137 reached. "Don't call us.. we'll call you": 138 139 tub.port = 8098 140 tub.location = unreachable.example.org:0 141 142 Run a node behind a Tor proxy, and make the server available as a Tor 143 "hidden service". (this assumes that other clients are running their node 144 with torsocks, such that they are prepared to connect to a .onion address). 145 The hidden service must first be configured in Tor, by giving it a local 146 port number and then obtaining a .onion name, using something in the torrc 147 file like: 148 149 HiddenServiceDir /var/lib/tor/hidden_services/tahoe 150 HiddenServicePort 29212 127.0.0.1:8098 151 152 once Tor is restarted, the .onion hostname will be in 153 /var/lib/tor/hidden_services/tahoe/hostname . 
Then set up your tahoe.cfg 154 like: 155 156 tub.port = 8098 157 tub.location = ualhejtq2p7ohfbb.onion:29212 158 159 Most users will not need to set tub.location . 160 161 Note that the old 'advertised_ip_addresses' file from earlier releases is no 162 longer supported. Tahoe 1.3.0 and later will ignore this file. 163 164 log_gatherer.furl = (FURL, optional) 165 166 If provided, this contains a single FURL string which is used to contact a 167 'log gatherer', which will be granted access to the logport. This can be 168 used by centralized storage meshes to gather operational logs in a single 169 place. Note that when an old-style BASEDIR/log_gatherer.furl file exists 170 (see 'Backwards Compatibility Files', below), both are used. (for most other 171 items, the separate config file overrides the entry in tahoe.cfg) 172 173 timeout.keepalive = (integer in seconds, optional) 174 timeout.disconnect = (integer in seconds, optional) 175 176 If timeout.keepalive is provided, it is treated as an integral number of 177 seconds, and sets the Foolscap "keepalive timer" to that value. For each 178 connection to another node, if nothing has been heard for a while, we will 179 attempt to provoke the other end into saying something. The duration of 180 silence that passes before sending the PING will be between KT and 2*KT. 181 This is mainly intended to keep NAT boxes from expiring idle TCP sessions, 182 but also gives TCP's long-duration keepalive/disconnect timers some traffic 183 to work with. The default value is 240 (i.e. 4 minutes). 184 185 If timeout.disconnect is provided, this is treated as an integral number of 186 seconds, and sets the Foolscap "disconnect timer" to that value. For each 187 connection to another node, if nothing has been heard for a while, we will 188 drop the connection. The duration of silence that passes before dropping the 189 connection will be between DT-2*KT and 2*DT+2*KT (please see ticket #521 for 190 more details). If we are sending a large amount of data to the other end 191 (which takes more than DT-2*KT to deliver), we might incorrectly drop the 192 connection. The default behavior (when this value is not provided) is to 193 disable the disconnect timer. 194 195 See ticket #521 for a discussion of how to pick these timeout values. Using 196 30 minutes means we'll disconnect after 22 to 68 minutes of inactivity. 197 Receiving data will reset this timeout, however if we have more than 22min 198 of data in the outbound queue (such as 800kB in two pipelined segments of 10 199 shares each) and the far end has no need to contact us, our ping might be 200 delayed, so we may disconnect them by accident. 201 202 ssh.port = (strports string, optional) 203 ssh.authorized_keys_file = (filename, optional) 204 205 This enables an SSH-based interactive Python shell, which can be used to 206 inspect the internal state of the node, for debugging. To cause the node to 207 accept SSH connections on port 8022 from the same keys as the rest of your 208 account, use: 209 210 [tub] 211 ssh.port = 8022 212 ssh.authorized_keys_file = ~/.ssh/authorized_keys 213 214 tempdir = (string, optional) 215 216 This specifies a temporary directory for the webapi server to use, for 217 holding large files while they are being uploaded. If a webapi client 218 attempts to upload a 10GB file, this tempdir will need to have at least 10GB 219 available for the upload to complete. 220 221 The default value is the "tmp" directory in the node's base directory (i.e. 
222 $NODEDIR/tmp), but it can be placed elsewhere. This directory is used for 223 files that usually (on a unix system) go into /tmp . The string will be 224 interpreted relative to the node's base directory. 225 226 == Client Configuration == 227 228 [client] 229 introducer.furl = (FURL string, mandatory) 230 231 This FURL tells the client how to connect to the introducer. Each Tahoe grid 232 is defined by an introducer. The introducer's furl is created by the 233 introducer node and written into its base directory when it starts, 234 whereupon it should be published to everyone who wishes to attach a client 235 to that grid 236 237 helper.furl = (FURL string, optional) 238 239 If provided, the node will attempt to connect to and use the given helper 240 for uploads. See docs/helper.txt for details. 241 242 key_generator.furl = (FURL string, optional) 243 244 If provided, the node will attempt to connect to and use the given 245 key-generator service, using RSA keys from the external process rather than 246 generating its own. 247 248 stats_gatherer.furl = (FURL string, optional) 249 250 If provided, the node will connect to the given stats gatherer and provide 251 it with operational statistics. 252 253 shares.needed = (int, optional) aka "k", default 3 254 shares.total = (int, optional) aka "N", N >= k, default 10 255 shares.happy = (int, optional) 1 <= happy <= N, default 7 256 257 These three values set the default encoding parameters. Each time a new file 258 is uploaded, erasure-coding is used to break the ciphertext into separate 259 pieces. There will be "N" (i.e. shares.total) pieces created, and the file 260 will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved. 261 The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10). 262 Setting k to 1 is equivalent to simple replication (uploading N copies of 263 the file). 264 265 These values control the tradeoff between storage overhead, performance, and 266 reliability. To a first approximation, a 1MB file will use (1MB*N/k) of 267 backend storage space (the actual value will be a bit more, because of other 268 forms of overhead). Up to N-k shares can be lost before the file becomes 269 unrecoverable, so assuming there are at least N servers, up to N-k servers 270 can be offline without losing the file. So large N/k ratios are more 271 reliable, and small N/k ratios use less disk space. Clearly, k must never be 272 smaller than N. 273 274 Large values of N will slow down upload operations slightly, since more 275 servers must be involved, and will slightly increase storage overhead due to 276 the hash trees that are created. Large values of k will cause downloads to 277 be marginally slower, because more servers must be involved. N cannot be 278 larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe 279 uses. 280 281 shares.happy allows you control over the distribution of your immutable file. 282 For a successful upload, shares are guaranteed to be initially placed on 283 at least 'shares.happy' distinct servers, the correct functioning of any 284 k of which is sufficient to guarantee the availability of the uploaded file. 285 This value should not be larger than the number of servers on your grid. 286 287 A value of shares.happy <= k is allowed, but does not provide any redundancy 288 if some servers fail or lose shares. 289 290 (Mutable files use a different share placement algorithm that does not 291 consider this parameter.) 
292 293 294 == Storage Server Configuration == 295 296 [storage] 297 enabled = (boolean, optional) 298 299 If this is True, the node will run a storage server, offering space to other 300 clients. If it is False, the node will not run a storage server, meaning 301 that no shares will be stored on this node. Use False this for clients who 302 do not wish to provide storage service. The default value is True. 303 304 readonly = (boolean, optional) 305 306 If True, the node will run a storage server but will not accept any shares, 307 making it effectively read-only. Use this for storage servers which are 308 being decommissioned: the storage/ directory could be mounted read-only, 309 while shares are moved to other servers. Note that this currently only 310 affects immutable shares. Mutable shares (used for directories) will be 311 written and modified anyway. See ticket #390 for the current status of this 312 bug. The default value is False. 313 314 reserved_space = (str, optional) 315 316 If provided, this value defines how much disk space is reserved: the storage 317 server will not accept any share which causes the amount of free disk space 318 to drop below this value. (The free space is measured by a call to statvfs(2) 319 on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the 320 user account under which the storage server runs.) 321 322 This string contains a number, with an optional case-insensitive scale 323 suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So 324 "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same 325 thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing. 326 327 expire.enabled = 328 expire.mode = 329 expire.override_lease_duration = 330 expire.cutoff_date = 331 expire.immutable = 332 expire.mutable = 333 334 These settings control garbage-collection, in which the server will delete 335 shares that no longer have an up-to-date lease on them. Please see the 336 neighboring "garbage-collection.txt" document for full details. 69 This value will be displayed in management tools as this node's "nickname". 70 If not provided, the nickname will be set to "<unspecified>". This string 71 shall be a UTF-8 encoded unicode string. 72 73 web.port = (strports string, optional) 74 75 This controls where the node's webserver should listen, providing filesystem 76 access and node status as defined in webapi.txt . This file contains a 77 Twisted "strports" specification such as "3456" or 78 "tcp:3456:interface=127.0.0.1". The 'tahoe create-node' or 'tahoe create-client' 79 commands set the web.port to "tcp:3456:interface=127.0.0.1" by default; this 80 is overridable by the "--webport" option. You can make it use SSL by writing 81 "ssl:3456:privateKey=mykey.pem:certKey=cert.pem" instead. 82 83 If this is not provided, the node will not run a web server. 84 85 web.static = (string, optional) 86 87 This controls where the /static portion of the URL space is served. The 88 value is a directory name (~username is allowed, and non-absolute names are 89 interpreted relative to the node's basedir) which can contain HTML and other 90 files. This can be used to serve a javascript-based frontend to the Tahoe 91 node, or other services. 92 93 The default value is "public_html", which will serve $BASEDIR/public_html . 94 With the default settings, http://127.0.0.1:3456/static/foo.html will serve 95 the contents of $BASEDIR/public_html/foo.html . 
96 97 tub.port = (integer, optional) 98 99 This controls which port the node uses to accept Foolscap connections from 100 other nodes. If not provided, the node will ask the kernel for any available 101 port. The port will be written to a separate file (named client.port or 102 introducer.port), so that subsequent runs will re-use the same port. 103 104 tub.location = (string, optional) 105 106 In addition to running as a client, each Tahoe node also runs as a server, 107 listening for connections from other Tahoe clients. The node announces its 108 location by publishing a "FURL" (a string with some connection hints) to the 109 Introducer. The string it publishes can be found in 110 $BASEDIR/private/storage.furl . The "tub.location" configuration controls 111 what location is published in this announcement. 112 113 If you don't provide tub.location, the node will try to figure out a useful 114 one by itself, by using tools like 'ifconfig' to determine the set of IP 115 addresses on which it can be reached from nodes both near and far. It will 116 also include the TCP port number on which it is listening (either the one 117 specified by tub.port, or whichever port was assigned by the kernel when 118 tub.port is left unspecified). 119 120 You might want to override this value if your node lives behind a firewall 121 that is doing inbound port forwarding, or if you are using other proxies 122 such that the local IP address or port number is not the same one that 123 remote clients should use to connect. You might also want to control this 124 when using a Tor proxy to avoid revealing your actual IP address through the 125 Introducer announcement. 126 127 The value is a comma-separated string of host:port location hints, like 128 this: 129 130 123.45.67.89:8098,tahoe.example.com:8098,127.0.0.1:8098 131 132 A few examples: 133 134 Emulate default behavior, assuming your host has IP address 123.45.67.89 135 and the kernel-allocated port number was 8098: 136 137 tub.port = 8098 138 tub.location = 123.45.67.89:8098,127.0.0.1:8098 139 140 Use a DNS name so you can change the IP address more easily: 141 142 tub.port = 8098 143 tub.location = tahoe.example.com:8098 144 145 Run a node behind a firewall (which has an external IP address) that has 146 been configured to forward port 7912 to our internal node's port 8098: 147 148 tub.port = 8098 149 tub.location = external-firewall.example.com:7912 150 151 Run a node behind a Tor proxy (perhaps via torsocks), in client-only mode 152 (i.e. we can make outbound connections, but other nodes will not be able to 153 connect to us). The literal 'unreachable.example.org' will not resolve, but 154 will serve as a reminder to human observers that this node cannot be 155 reached. "Don't call us.. we'll call you": 156 157 tub.port = 8098 158 tub.location = unreachable.example.org:0 159 160 Run a node behind a Tor proxy, and make the server available as a Tor 161 "hidden service". (this assumes that other clients are running their node 162 with torsocks, such that they are prepared to connect to a .onion address). 163 The hidden service must first be configured in Tor, by giving it a local 164 port number and then obtaining a .onion name, using something in the torrc 165 file like: 166 167 HiddenServiceDir /var/lib/tor/hidden_services/tahoe 168 HiddenServicePort 29212 127.0.0.1:8098 169 170 once Tor is restarted, the .onion hostname will be in 171 /var/lib/tor/hidden_services/tahoe/hostname . 
Then set up your tahoe.cfg 172 like: 173 174 tub.port = 8098 175 tub.location = ualhejtq2p7ohfbb.onion:29212 176 177 Most users will not need to set tub.location . 178 179 Note that the old 'advertised_ip_addresses' file from earlier releases is no 180 longer supported. Tahoe 1.3.0 and later will ignore this file. 181 182 log_gatherer.furl = (FURL, optional) 183 184 If provided, this contains a single FURL string which is used to contact a 185 'log gatherer', which will be granted access to the logport. This can be 186 used by centralized storage meshes to gather operational logs in a single 187 place. Note that when an old-style BASEDIR/log_gatherer.furl file exists 188 (see 'Backwards Compatibility Files', below), both are used. (for most other 189 items, the separate config file overrides the entry in tahoe.cfg) 190 191 timeout.keepalive = (integer in seconds, optional) 192 timeout.disconnect = (integer in seconds, optional) 193 194 If timeout.keepalive is provided, it is treated as an integral number of 195 seconds, and sets the Foolscap "keepalive timer" to that value. For each 196 connection to another node, if nothing has been heard for a while, we will 197 attempt to provoke the other end into saying something. The duration of 198 silence that passes before sending the PING will be between KT and 2*KT. 199 This is mainly intended to keep NAT boxes from expiring idle TCP sessions, 200 but also gives TCP's long-duration keepalive/disconnect timers some traffic 201 to work with. The default value is 240 (i.e. 4 minutes). 202 203 If timeout.disconnect is provided, this is treated as an integral number of 204 seconds, and sets the Foolscap "disconnect timer" to that value. For each 205 connection to another node, if nothing has been heard for a while, we will 206 drop the connection. The duration of silence that passes before dropping the 207 connection will be between DT-2*KT and 2*DT+2*KT (please see ticket #521 for 208 more details). If we are sending a large amount of data to the other end 209 (which takes more than DT-2*KT to deliver), we might incorrectly drop the 210 connection. The default behavior (when this value is not provided) is to 211 disable the disconnect timer. 212 213 See ticket #521 for a discussion of how to pick these timeout values. Using 214 30 minutes means we'll disconnect after 22 to 68 minutes of inactivity. 215 Receiving data will reset this timeout, however if we have more than 22min 216 of data in the outbound queue (such as 800kB in two pipelined segments of 10 217 shares each) and the far end has no need to contact us, our ping might be 218 delayed, so we may disconnect them by accident. 219 220 ssh.port = (strports string, optional) 221 ssh.authorized_keys_file = (filename, optional) 222 223 This enables an SSH-based interactive Python shell, which can be used to 224 inspect the internal state of the node, for debugging. To cause the node to 225 accept SSH connections on port 8022 from the same keys as the rest of your 226 account, use: 227 228 [tub] 229 ssh.port = 8022 230 ssh.authorized_keys_file = ~/.ssh/authorized_keys 231 232 tempdir = (string, optional) 233 234 This specifies a temporary directory for the webapi server to use, for 235 holding large files while they are being uploaded. If a webapi client 236 attempts to upload a 10GB file, this tempdir will need to have at least 10GB 237 available for the upload to complete. 238 239 The default value is the "tmp" directory in the node's base directory (i.e. 
240 $NODEDIR/tmp), but it can be placed elsewhere. This directory is used for 241 files that usually (on a unix system) go into /tmp . The string will be 242 interpreted relative to the node's base directory. 243 244 Client Configuration 245 ==================== 246 247 :: 248 249 [client] 250 introducer.furl = (FURL string, mandatory) 251 252 This FURL tells the client how to connect to the introducer. Each Tahoe grid 253 is defined by an introducer. The introducer's furl is created by the 254 introducer node and written into its base directory when it starts, 255 whereupon it should be published to everyone who wishes to attach a client 256 to that grid 257 258 helper.furl = (FURL string, optional) 259 260 If provided, the node will attempt to connect to and use the given helper 261 for uploads. See docs/helper.txt for details. 262 263 key_generator.furl = (FURL string, optional) 264 265 If provided, the node will attempt to connect to and use the given 266 key-generator service, using RSA keys from the external process rather than 267 generating its own. 268 269 stats_gatherer.furl = (FURL string, optional) 270 271 If provided, the node will connect to the given stats gatherer and provide 272 it with operational statistics. 273 274 shares.needed = (int, optional) aka "k", default 3 275 shares.total = (int, optional) aka "N", N >= k, default 10 276 shares.happy = (int, optional) 1 <= happy <= N, default 7 277 278 These three values set the default encoding parameters. Each time a new file 279 is uploaded, erasure-coding is used to break the ciphertext into separate 280 pieces. There will be "N" (i.e. shares.total) pieces created, and the file 281 will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved. 282 The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10). 283 Setting k to 1 is equivalent to simple replication (uploading N copies of 284 the file). 285 286 These values control the tradeoff between storage overhead, performance, and 287 reliability. To a first approximation, a 1MB file will use (1MB*N/k) of 288 backend storage space (the actual value will be a bit more, because of other 289 forms of overhead). Up to N-k shares can be lost before the file becomes 290 unrecoverable, so assuming there are at least N servers, up to N-k servers 291 can be offline without losing the file. So large N/k ratios are more 292 reliable, and small N/k ratios use less disk space. Clearly, k must never be 293 smaller than N. 294 295 Large values of N will slow down upload operations slightly, since more 296 servers must be involved, and will slightly increase storage overhead due to 297 the hash trees that are created. Large values of k will cause downloads to 298 be marginally slower, because more servers must be involved. N cannot be 299 larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe 300 uses. 301 302 shares.happy allows you control over the distribution of your immutable file. 303 For a successful upload, shares are guaranteed to be initially placed on 304 at least 'shares.happy' distinct servers, the correct functioning of any 305 k of which is sufficient to guarantee the availability of the uploaded file. 306 This value should not be larger than the number of servers on your grid. 307 308 A value of shares.happy <= k is allowed, but does not provide any redundancy 309 if some servers fail or lose shares. 310 311 (Mutable files use a different share placement algorithm that does not 312 consider this parameter.) 
313 314 315 Storage Server Configuration 316 ============================ 317 318 :: 319 320 [storage] 321 enabled = (boolean, optional) 322 323 If this is True, the node will run a storage server, offering space to other 324 clients. If it is False, the node will not run a storage server, meaning 325 that no shares will be stored on this node. Use False this for clients who 326 do not wish to provide storage service. The default value is True. 327 328 readonly = (boolean, optional) 329 330 If True, the node will run a storage server but will not accept any shares, 331 making it effectively read-only. Use this for storage servers which are 332 being decommissioned: the storage/ directory could be mounted read-only, 333 while shares are moved to other servers. Note that this currently only 334 affects immutable shares. Mutable shares (used for directories) will be 335 written and modified anyway. See ticket #390 for the current status of this 336 bug. The default value is False. 337 338 reserved_space = (str, optional) 339 340 If provided, this value defines how much disk space is reserved: the storage 341 server will not accept any share which causes the amount of free disk space 342 to drop below this value. (The free space is measured by a call to statvfs(2) 343 on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the 344 user account under which the storage server runs.) 345 346 This string contains a number, with an optional case-insensitive scale 347 suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So 348 "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same 349 thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing. 350 351 expire.enabled = 352 expire.mode = 353 expire.override_lease_duration = 354 expire.cutoff_date = 355 expire.immutable = 356 expire.mutable = 357 358 These settings control garbage-collection, in which the server will delete 359 shares that no longer have an up-to-date lease on them. Please see the 360 neighboring "garbage-collection.txt" document for full details. 337 361 338 362 339 == Running A Helper == 363 Running A Helper 364 ================ 340 365 341 366 A "helper" is a regular client node that also offers the "upload helper" 342 367 service. 343 368 344 [helper] 345 enabled = (boolean, optional) 369 :: 346 370 347 If True, the node will run a helper (see docs/helper.txt for details). The 348 helper's contact FURL will be placed in private/helper.furl, from which it 349 can be copied to any clients which wish to use it. Clearly nodes should not 350 both run a helper and attempt to use one: do not create both helper.furl and 351 run_helper in the same node. The default is False. 371 [helper] 372 enabled = (boolean, optional) 373 374 If True, the node will run a helper (see docs/helper.txt for details). The 375 helper's contact FURL will be placed in private/helper.furl, from which it 376 can be copied to any clients which wish to use it. Clearly nodes should not 377 both run a helper and attempt to use one: do not create both helper.furl and 378 run_helper in the same node. The default is False. 352 379 353 380 354 == Running An Introducer == 381 Running An Introducer 382 ===================== 355 383 356 384 The introducer node uses a different '.tac' file (named introducer.tac), and 357 385 pays attention to the "[node]" section, but not the others. … … 365 393 copied into new client nodes before they are started for the first time. 
366 394 367 395 368 == Other Files in BASEDIR == 396 Other Files in BASEDIR 397 ====================== 369 398 370 399 Some configuration is not kept in tahoe.cfg, for the following reasons: 371 400 372 373 374 401 * it is generated by the node at startup, e.g. encryption keys. The node 402 never writes to tahoe.cfg 403 * it is generated by user action, e.g. the 'tahoe create-alias' command 375 404 376 405 In addition, non-configuration persistent state is kept in the node's base 377 406 directory, next to the configuration knobs. 378 407 379 408 This section describes these other files. 380 409 381 382 private/node.pem : This contains an SSL private-key certificate. The node 383 generates this the first time it is started, and re-uses it on subsequent 384 runs. This certificate allows the node to have a cryptographically-strong 385 identifier (the Foolscap "TubID"), and to establish secure connections to 386 other nodes. 387 388 storage/ : Nodes which host StorageServers will create this directory to hold 389 shares of files on behalf of other clients. There will be a directory 390 underneath it for each StorageIndex for which this node is holding shares. 391 There is also an "incoming" directory where partially-completed shares are 392 held while they are being received. 393 394 client.tac : this file defines the client, by constructing the actual Client 395 instance each time the node is started. It is used by the 'twistd' 396 daemonization program (in the "-y" mode), which is run internally by the 397 "tahoe start" command. This file is created by the "tahoe create-node" or 398 "tahoe create-client" commands. 399 400 private/control.furl : this file contains a FURL that provides access to a 401 control port on the client node, from which files can be uploaded and 402 downloaded. This file is created with permissions that prevent anyone else 403 from reading it (on operating systems that support such a concept), to insure 404 that only the owner of the client node can use this feature. This port is 405 intended for debugging and testing use. 406 407 private/logport.furl : this file contains a FURL that provides access to a 408 'log port' on the client node, from which operational logs can be retrieved. 409 Do not grant logport access to strangers, because occasionally secret 410 information may be placed in the logs. 411 412 private/helper.furl : if the node is running a helper (for use by other 413 clients), its contact FURL will be placed here. See docs/helper.txt for more 414 details. 415 416 private/root_dir.cap (optional): The command-line tools will read a directory 417 cap out of this file and use it, if you don't specify a '--dir-cap' option or 418 if you specify '--dir-cap=root'. 419 420 private/convergence (automatically generated): An added secret for encrypting 421 immutable files. Everyone who has this same string in their 422 private/convergence file encrypts their immutable files in the same way when 423 uploading them. This causes identical files to "converge" -- to share the 424 same storage space since they have identical ciphertext -- which conserves 425 space and optimizes upload time, but it also exposes files to the possibility 426 of a brute-force attack by people who know that string. In this attack, if 427 the attacker can guess most of the contents of a file, then they can use 428 brute-force to learn the remaining contents. 410 private/node.pem 411 This contains an SSL private-key certificate. 
The node 412 generates this the first time it is started, and re-uses it on subsequent 413 runs. This certificate allows the node to have a cryptographically-strong 414 identifier (the Foolscap "TubID"), and to establish secure connections to 415 other nodes. 416 417 storage/ 418 Nodes which host StorageServers will create this directory to hold 419 shares of files on behalf of other clients. There will be a directory 420 underneath it for each StorageIndex for which this node is holding shares. 421 There is also an "incoming" directory where partially-completed shares are 422 held while they are being received. 423 424 client.tac 425 this file defines the client, by constructing the actual Client 426 instance each time the node is started. It is used by the 'twistd' 427 daemonization program (in the "-y" mode), which is run internally by the 428 "tahoe start" command. This file is created by the "tahoe create-node" or 429 "tahoe create-client" commands. 430 431 private/control.furl 432 this file contains a FURL that provides access to a 433 control port on the client node, from which files can be uploaded and 434 downloaded. This file is created with permissions that prevent anyone else 435 from reading it (on operating systems that support such a concept), to insure 436 that only the owner of the client node can use this feature. This port is 437 intended for debugging and testing use. 438 439 private/logport.furl 440 this file contains a FURL that provides access to a 441 'log port' on the client node, from which operational logs can be retrieved. 442 Do not grant logport access to strangers, because occasionally secret 443 information may be placed in the logs. 444 445 private/helper.furl 446 if the node is running a helper (for use by other 447 clients), its contact FURL will be placed here. See docs/helper.txt for more 448 details. 449 450 private/root_dir.cap (optional) 451 The command-line tools will read a directory 452 cap out of this file and use it, if you don't specify a '--dir-cap' option or 453 if you specify '--dir-cap=root'. 454 455 private/convergence (automatically generated) 456 An added secret for encrypting 457 immutable files. Everyone who has this same string in their 458 private/convergence file encrypts their immutable files in the same way when 459 uploading them. This causes identical files to "converge" -- to share the 460 same storage space since they have identical ciphertext -- which conserves 461 space and optimizes upload time, but it also exposes files to the possibility 462 of a brute-force attack by people who know that string. In this attack, if 463 the attacker can guess most of the contents of a file, then they can use 464 brute-force to learn the remaining contents. 429 465 430 466 So the set of people who know your private/convergence string is the set of 431 467 people who converge their storage space with you when you and they upload … … 439 475 possible, put the empty string (so that private/convergence is a zero-length 440 476 file). 441 477 478 Other files 479 =========== 442 480 443 == Other files == 481 logs/ 482 Each Tahoe node creates a directory to hold the log messages produced 483 as the node runs. These logfiles are created and rotated by the "twistd" 484 daemonization program, so logs/twistd.log will contain the most recent 485 messages, logs/twistd.log.1 will contain the previous ones, logs/twistd.log.2 486 will be older still, and so on. twistd rotates logfiles after they grow 487 beyond 1MB in size. 
If the space consumed by logfiles becomes troublesome, 488 they should be pruned: a cron job to delete all files that were created more 489 than a month ago in this logs/ directory should be sufficient. 490 491 my_nodeid 492 this is written by all nodes after startup, and contains a 493 base32-encoded (i.e. human-readable) NodeID that identifies this specific 494 node. This NodeID is the same string that gets displayed on the web page (in 495 the "which peers am I connected to" list), and the shortened form (the first 496 characters) is recorded in various log messages. 444 497 445 logs/ : Each Tahoe node creates a directory to hold the log messages produced 446 as the node runs. These logfiles are created and rotated by the "twistd" 447 daemonization program, so logs/twistd.log will contain the most recent 448 messages, logs/twistd.log.1 will contain the previous ones, logs/twistd.log.2 449 will be older still, and so on. twistd rotates logfiles after they grow 450 beyond 1MB in size. If the space consumed by logfiles becomes troublesome, 451 they should be pruned: a cron job to delete all files that were created more 452 than a month ago in this logs/ directory should be sufficient. 453 454 my_nodeid : this is written by all nodes after startup, and contains a 455 base32-encoded (i.e. human-readable) NodeID that identifies this specific 456 node. This NodeID is the same string that gets displayed on the web page (in 457 the "which peers am I connected to" list), and the shortened form (the first 458 characters) is recorded in various log messages. 459 460 461 == Backwards Compatibility Files == 498 Backwards Compatibility Files 499 ============================= 462 500 463 501 Tahoe releases before 1.3.0 had no 'tahoe.cfg' file, and used distinct files 464 502 for each item listed below. For each configuration knob, if the distinct file 465 exists, it will take precedence over the corresponding item in tahoe.cfg . 466 503 exists, it will take precedence over the corresponding item in tahoe.cfg. 
467 504 468 [node]nickname : BASEDIR/nickname 469 [node]web.port : BASEDIR/webport 470 [node]tub.port : BASEDIR/client.port (for Clients, not Introducers) 471 [node]tub.port : BASEDIR/introducer.port (for Introducers, not Clients) 472 (note that, unlike other keys, tahoe.cfg overrides the *.port file) 473 [node]tub.location : replaces BASEDIR/advertised_ip_addresses 474 [node]log_gatherer.furl : BASEDIR/log_gatherer.furl (one per line) 475 [node]timeout.keepalive : BASEDIR/keepalive_timeout 476 [node]timeout.disconnect : BASEDIR/disconnect_timeout 477 [client]introducer.furl : BASEDIR/introducer.furl 478 [client]helper.furl : BASEDIR/helper.furl 479 [client]key_generator.furl : BASEDIR/key_generator.furl 480 [client]stats_gatherer.furl : BASEDIR/stats_gatherer.furl 481 [storage]enabled : BASEDIR/no_storage (False if no_storage exists) 482 [storage]readonly : BASEDIR/readonly_storage (True if readonly_storage exists) 483 [storage]sizelimit : BASEDIR/sizelimit 484 [storage]debug_discard : BASEDIR/debug_discard_storage 485 [helper]enabled : BASEDIR/run_helper (True if run_helper exists) 505 =========================== =============================== ================= 506 Config setting File Comment 507 =========================== =============================== ================= 508 [node]nickname BASEDIR/nickname 509 [node]web.port BASEDIR/webport 510 [node]tub.port BASEDIR/client.port (for Clients, not Introducers) 511 [node]tub.port BASEDIR/introducer.port (for Introducers, not Clients) (note that, unlike other keys, tahoe.cfg overrides this file) 512 [node]tub.location BASEDIR/advertised_ip_addresses 513 [node]log_gatherer.furl BASEDIR/log_gatherer.furl (one per line) 514 [node]timeout.keepalive BASEDIR/keepalive_timeout 515 [node]timeout.disconnect BASEDIR/disconnect_timeout 516 [client]introducer.furl BASEDIR/introducer.furl 517 [client]helper.furl BASEDIR/helper.furl 518 [client]key_generator.furl BASEDIR/key_generator.furl 519 [client]stats_gatherer.furl BASEDIR/stats_gatherer.furl 520 [storage]enabled BASEDIR/no_storage (False if no_storage exists) 521 [storage]readonly BASEDIR/readonly_storage (True if readonly_storage exists) 522 [storage]sizelimit BASEDIR/sizelimit 523 [storage]debug_discard BASEDIR/debug_discard_storage 524 [helper]enabled BASEDIR/run_helper (True if run_helper exists) 525 =========================== =============================== ================= 486 526 487 527 Note: the functionality of [node]ssh.port and [node]ssh.authorized_keys_file 488 528 were previously combined, controlled by the presence of a … … 490 530 indicated which port the ssh server should listen on, and the contents of the 491 531 file provided the ssh public keys to accept. Support for these files has been 492 532 removed completely. To ssh into your Tahoe node, add [node]ssh.port and 493 [node].ssh_authorized_keys_file statements to your tahoe.cfg 533 [node].ssh_authorized_keys_file statements to your tahoe.cfg. 494 534 495 535 Likewise, the functionality of [node]tub.location is a variant of the 496 536 now-unsupported BASEDIR/advertised_ip_addresses . The old file was additive … … 499 539 is not (tub.location is used verbatim). 500 540 501 541 502 == Example == 542 Example 543 ======= 503 544 504 545 The following is a sample tahoe.cfg file, containing values for all keys 505 546 described above. Note that this is not a recommended configuration (most of 506 547 these are not the default values), merely a legal one. 
507 548 508 [node] 509 nickname = Bob's Tahoe Node 510 tub.port = 34912 511 tub.location = 123.45.67.89:8098,44.55.66.77:8098 512 web.port = 3456 513 log_gatherer.furl = pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm 514 timeout.keepalive = 240 515 timeout.disconnect = 1800 516 ssh.port = 8022 517 ssh.authorized_keys_file = ~/.ssh/authorized_keys 518 519 [client] 520 introducer.furl = pb://ok45ssoklj4y7eok5c3xkmj@tahoe.example:44801/ii3uumo 521 helper.furl = pb://ggti5ssoklj4y7eok5c3xkmj@helper.tahoe.example:7054/kk8lhr 522 523 [storage] 524 enabled = True 525 readonly_storage = True 526 sizelimit = 10000000000 549 :: 527 550 528 [helper] 529 run_helper = True 551 [node] 552 nickname = Bob's Tahoe Node 553 tub.port = 34912 554 tub.location = 123.45.67.89:8098,44.55.66.77:8098 555 web.port = 3456 556 log_gatherer.furl = pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm 557 timeout.keepalive = 240 558 timeout.disconnect = 1800 559 ssh.port = 8022 560 ssh.authorized_keys_file = ~/.ssh/authorized_keys 561 562 [client] 563 introducer.furl = pb://ok45ssoklj4y7eok5c3xkmj@tahoe.example:44801/ii3uumo 564 helper.furl = pb://ggti5ssoklj4y7eok5c3xkmj@helper.tahoe.example:7054/kk8lhr 565 566 [storage] 567 enabled = True 568 readonly_storage = True 569 sizelimit = 10000000000 570 571 [helper] 572 run_helper = True -
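The precedence rule above (a surviving per-item file wins over the corresponding tahoe.cfg entry, except for the \*.port files) can be illustrated with a short Python sketch. The ``read_nickname`` helper and the BASEDIR path are hypothetical; this is not Tahoe's actual config loader::

    import os
    try:
        from configparser import ConfigParser              # Python 3
    except ImportError:
        from ConfigParser import SafeConfigParser as ConfigParser  # Python 2

    def read_nickname(basedir):
        # The legacy BASEDIR/nickname file, if present, takes precedence
        # over [node]nickname in tahoe.cfg, per the table above.
        legacy = os.path.join(basedir, "nickname")
        if os.path.exists(legacy):
            with open(legacy) as f:
                return f.read().strip()
        cfg = ConfigParser()
        cfg.read(os.path.join(basedir, "tahoe.cfg"))
        if cfg.has_option("node", "nickname"):
            return cfg.get("node", "nickname")
        return ""  # unset

    print(read_nickname(os.path.expanduser("~/.tahoe")))  # example BASEDIR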
docs/debian.txt
diff -rN -u old-tahoe-lafs/docs/debian.txt new-tahoe-lafs/docs/debian.txt
old new 1 = Debian Support = 1 ============== 2 Debian Support 3 ============== 4 5 1. `Overview`_ 6 2. `TL;DR supporting package building instructions`_ 7 3. `TL;DR package building instructions for Tahoe`_ 8 4. `Building Debian Packages`_ 9 5. `Using Pre-Built Debian Packages`_ 10 6. `Building From Source on Debian Systems`_ 2 11 3 1. Overview 4 2. TL;DR supporting package building instructions 5 3. TL;DR package building instructions for Tahoe 6 4. Building Debian Packages 7 5. Using Pre-Built Debian Packages 8 6. Building From Source on Debian Systems 9 10 = Overview == 12 Overview 13 ======== 11 14 12 15 One convenient way to install Tahoe-LAFS is with debian packages. 13 16 This document attempts to explain how to complete a desert island build for 14 17 people in a hurry. It also attempts to explain more about our Debian packaging 15 18 for those willing to read beyond the simple pragmatic packaging exercises. 16 19 17 == TL;DR supporting package building instructions == 20 TL;DR supporting package building instructions 21 ============================================== 18 22 19 23 There are only four supporting packages that are currently not available from 20 the debian apt repositories in Debian Lenny: 24 the debian apt repositories in Debian Lenny:: 21 25 22 26 python-foolscap python-zfec argparse zbase32 23 27 24 First, we'll install some common packages for development: 28 First, we'll install some common packages for development:: 25 29 26 30 sudo apt-get install -y build-essential debhelper cdbs python-central \ 27 31 python-setuptools python python-dev python-twisted-core \ … … 31 35 sudo apt-file update 32 36 33 37 34 To create packages for Lenny, we'll also install stdeb: 38 To create packages for Lenny, we'll also install stdeb:: 35 39 36 40 sudo apt-get install python-all-dev 37 41 STDEB_VERSION="0.5.1" … … 41 45 python setup.py --command-packages=stdeb.command bdist_deb 42 46 sudo dpkg -i deb_dist/python-stdeb_$STDEB_VERSION-1_all.deb 43 47 44 Now we're ready to build and install the zfec Debian package: 48 Now we're ready to build and install the zfec Debian package:: 45 49 46 50 darcs get http://allmydata.org/source/zfec/trunk zfac 47 51 cd zfac/zfec/ … … 50 54 dpkg-buildpackage -rfakeroot -uc -us 51 55 sudo dpkg -i ../python-zfec_1.4.6-r333-1_amd64.deb 52 56 53 We need to build a pyutil package: 57 We need to build a pyutil package:: 54 58 55 59 wget http://pypi.python.org/packages/source/p/pyutil/pyutil-1.6.1.tar.gz 56 60 tar -xvzf pyutil-1.6.1.tar.gz … … 60 64 dpkg-buildpackage -rfakeroot -uc -us 61 65 sudo dpkg -i ../python-pyutil_1.6.1-1_all.deb 62 66 63 We also need to install argparse and zbase32: 67 We also need to install argparse and zbase32:: 64 68 65 69 sudo easy_install argparse # argparse won't install with stdeb (!) 
:-( 66 70 sudo easy_install zbase32 # XXX TODO: package with stdeb 67 71 68 Finally, we'll fetch, unpack, build and install foolscap: 72 Finally, we'll fetch, unpack, build and install foolscap:: 69 73 70 74 # You may not already have Brian's key: 71 75 # gpg --recv-key 0x1514A7BD … … 79 83 dpkg-buildpackage -rfakeroot -uc -us 80 84 sudo dpkg -i ../python-foolscap_0.5.0-1_all.deb 81 85 82 == TL;DR package building instructions for Tahoe == 86 TL;DR package building instructions for Tahoe 87 ============================================= 83 88 84 89 If you want to build your own Debian packages from the darcs tree or from 85 a source release, do the following: 90 a source release, do the following:: 86 91 87 92 cd ~/ 88 93 mkdir src && cd src/ … … 98 103 /etc/defaults/allmydata-tahoe file to get Tahoe started. Data is by default 99 104 stored in /var/lib/tahoelafsd/ and Tahoe runs as the 'tahoelafsd' user. 100 105 101 == Building Debian Packages == 106 Building Debian Packages 107 ======================== 102 108 103 109 The Tahoe source tree comes with limited support for building debian packages 104 110 on a variety of Debian and Ubuntu platforms. For each supported platform, … … 109 115 110 116 To create debian packages from a Tahoe tree, you will need some additional 111 117 tools installed. The canonical list of these packages is in the 112 "Build-Depends" clause of misc/sid/debian/control , and includes: 118 "Build-Depends" clause of misc/sid/debian/control , and includes:: 113 119 114 120 build-essential 115 121 debhelper … … 130 136 Note that we haven't tried to build source packages (.orig.tar.gz + dsc) yet, 131 137 and there are no such source packages in our APT repository. 132 138 133 == Using Pre-Built Debian Packages == 139 Using Pre-Built Debian Packages 140 =============================== 134 141 135 142 The allmydata.org site hosts an APT repository with debian packages that are 136 built after each checkin. The following wiki page describes this repository:137 138 http://allmydata.org/trac/tahoe/wiki/DownloadDebianPackages 143 built after each checkin. `This wiki page 144 <http://allmydata.org/trac/tahoe/wiki/DownloadDebianPackages>`_ describes this 145 repository. 139 146 140 147 The allmydata.org APT repository also includes debian packages of support 141 148 libraries, like Foolscap, zfec, pycryptopp, and everything else you need that 142 149 isn't already in debian. 143 150 144 == Building From Source on Debian Systems == 151 Building From Source on Debian Systems 152 ====================================== 145 153 146 154 Many of Tahoe's build dependencies can be satisfied by first installing 147 155 certain debian packages: simplejson is one of these. Some debian/ubuntu -
docs/filesystem-notes.txt
diff -rN -u old-tahoe-lafs/docs/filesystem-notes.txt new-tahoe-lafs/docs/filesystem-notes.txt
old new 1 ========================= 2 Filesystem-specific notes 3 ========================= 4 5 1. ext3_ 1 6 2 7 Tahoe storage servers use a large number of subdirectories to store their 3 8 shares on local disk. This format is simple and robust, but depends upon the 4 9 local filesystem to provide fast access to those directories. 5 10 6 = ext3 = 11 ext3 12 ==== 7 13 8 14 For moderate- or large-sized storage servers, you'll want to make sure the 9 15 "directory index" feature is enabled on your ext3 directories, otherwise 10 16 share lookup may be very slow. Recent versions of ext3 enable this 11 automatically, but older filesystems may not have it enabled .17 automatically, but older filesystems may not have it enabled:: 12 18 13 $ sudo tune2fs -l /dev/sda1 |grep feature14 Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file19 $ sudo tune2fs -l /dev/sda1 |grep feature 20 Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file 15 21 16 22 If "dir_index" is present in the "features:" line, then you're all set. If 17 23 not, you'll need to use tune2fs and e2fsck to enable and build the index. See 18 this page for some hints: http://wiki.dovecot.org/MailboxFormat/Maildir.24 <http://wiki.dovecot.org/MailboxFormat/Maildir> for some hints. -
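To check the dir_index feature programmatically rather than by eye, the ``tune2fs`` output shown above can be parsed with a few lines of Python. The device path is only an example, and root privileges are required::

    import subprocess

    def has_dir_index(device):
        # Run "tune2fs -l DEVICE" (as shown above) and report whether
        # "dir_index" appears on the "Filesystem features:" line.
        out = subprocess.check_output(["tune2fs", "-l", device])
        for line in out.decode("utf-8", "replace").splitlines():
            if line.startswith("Filesystem features:"):
                return "dir_index" in line.split(":", 1)[1].split()
        return False

    if __name__ == "__main__":
        print(has_dir_index("/dev/sda1"))  # example device from the text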
docs/garbage-collection.txt
diff -rN -u old-tahoe-lafs/docs/garbage-collection.txt new-tahoe-lafs/docs/garbage-collection.txt
old new 1 = Garbage Collection in Tahoe = 1 =========================== 2 Garbage Collection in Tahoe 3 =========================== 4 5 1. `Overview`_ 6 2. `Client-side Renewal`_ 7 3. `Server Side Expiration`_ 8 4. `Expiration Progress`_ 9 5. `Future Directions`_ 2 10 3 1. Overview 4 2. Client-side Renewal 5 3. Server Side Expiration 6 4. Expiration Progress 7 5. Future Directions 8 9 == Overview == 11 Overview 12 ======== 10 13 11 14 When a file or directory in the virtual filesystem is no longer referenced, 12 15 the space that its shares occupied on each storage server can be freed, … … 40 43 server can use the "expire.override_lease_duration" configuration setting to 41 44 increase or decrease the effective duration to something other than 31 days). 42 45 43 == Client-side Renewal == 46 Client-side Renewal 47 =================== 44 48 45 49 If all of the files and directories which you care about are reachable from a 46 50 single starting point (usually referred to as a "rootcap"), and you store … … 69 73 appropriate for use by individual users as well, and may be incorporated 70 74 directly into the client node. 71 75 72 == Server Side Expiration == 76 Server Side Expiration 77 ====================== 73 78 74 79 Expiration must be explicitly enabled on each storage server, since the 75 80 default behavior is to never expire shares. Expiration is enabled by adding … … 112 117 expired whatever it is going to expire, the second and subsequent passes are 113 118 not going to find any new leases to remove. 114 119 115 The tahoe.cfg file uses the following keys to control lease expiration: 120 The tahoe.cfg file uses the following keys to control lease expiration:: 116 121 117 [storage]122 [storage] 118 123 119 expire.enabled = (boolean, optional)124 expire.enabled = (boolean, optional) 120 125 121 If this is True, the storage server will delete shares on which all leases122 have expired. Other controls dictate when leases are considered to have123 expired. The default is False.126 If this is True, the storage server will delete shares on which all leases 127 have expired. Other controls dictate when leases are considered to have 128 expired. The default is False. 124 129 125 expire.mode = (string, "age" or "cutoff-date", required if expiration enabled)130 expire.mode = (string, "age" or "cutoff-date", required if expiration enabled) 126 131 127 128 129 130 131 132 If this string is "age", the age-based expiration scheme is used, and the 133 "expire.override_lease_duration" setting can be provided to influence the 134 lease ages. If it is "cutoff-date", the absolute-date-cutoff mode is used, 135 and the "expire.cutoff_date" setting must be provided to specify the cutoff 136 date. The mode setting currently has no default: you must provide a value. 132 137 133 134 138 In a future release, this setting is likely to default to "age", but in this 139 release it was deemed safer to require an explicit mode specification. 135 140 136 expire.override_lease_duration = (duration string, optional)141 expire.override_lease_duration = (duration string, optional) 137 142 138 139 140 141 143 When age-based expiration is in use, a lease will be expired if its 144 "lease.create_renew" timestamp plus its "lease.duration" time is 145 earlier/older than the current time. 
This key, if present, overrides the 146 duration value for all leases, changing the algorithm from: 142 147 143 if (lease.create_renew_timestamp + lease.duration) < now:144 expire_lease()148 if (lease.create_renew_timestamp + lease.duration) < now: 149 expire_lease() 145 150 146 to:151 to: 147 152 148 if (lease.create_renew_timestamp + override_lease_duration) < now:149 expire_lease()153 if (lease.create_renew_timestamp + override_lease_duration) < now: 154 expire_lease() 150 155 151 152 153 156 The value of this setting is a "duration string", which is a number of days, 157 months, or years, followed by a units suffix, and optionally separated by a 158 space, such as one of the following: 154 159 155 7days156 31day157 60 days158 2mo159 3 month160 12 months161 2years160 7days 161 31day 162 60 days 163 2mo 164 3 month 165 12 months 166 2years 162 167 163 164 165 166 167 168 168 This key is meant to compensate for the fact that clients do not yet have 169 the ability to ask for leases that last longer than 31 days. A grid which 170 wants to use faster or slower GC than a 31-day lease timer permits can use 171 this parameter to implement it. The current fixed 31-day lease duration 172 makes the server behave as if "lease.override_lease_duration = 31days" had 173 been passed. 169 174 170 171 172 175 This key is only valid when age-based expiration is in use (i.e. when 176 "expire.mode = age" is used). It will be rejected if cutoff-date expiration 177 is in use. 173 178 174 expire.cutoff_date = (date string, required if mode=cutoff-date)179 expire.cutoff_date = (date string, required if mode=cutoff-date) 175 180 176 177 178 181 When cutoff-date expiration is in use, a lease will be expired if its 182 create/renew timestamp is older than the cutoff date. This string will be a 183 date in the following format: 179 184 180 2009-01-16 (January 16th, 2009)181 2008-02-02182 2007-12-25185 2009-01-16 (January 16th, 2009) 186 2008-02-02 187 2007-12-25 183 188 184 185 186 187 189 The actual cutoff time shall be midnight UTC at the beginning of the given 190 day. Lease timers should naturally be generous enough to not depend upon 191 differences in timezone: there should be at least a few days between the 192 last renewal time and the cutoff date. 188 193 189 190 191 194 This key is only valid when cutoff-based expiration is in use (i.e. when 195 "expire.mode = cutoff-date"). It will be rejected if age-based expiration is 196 in use. 192 197 193 expire.immutable = (boolean, optional)198 expire.immutable = (boolean, optional) 194 199 195 196 197 200 If this is False, then immutable shares will never be deleted, even if their 201 leases have expired. This can be used in special situations to perform GC on 202 mutable files but not immutable ones. The default is True. 198 203 199 expire.mutable = (boolean, optional)204 expire.mutable = (boolean, optional) 200 205 201 202 203 206 If this is False, then mutable shares will never be deleted, even if their 207 leases have expired. This can be used in special situations to perform GC on 208 immutable files but not mutable ones. The default is True. 204 209 205 == Expiration Progress == 210 Expiration Progress 211 =================== 206 212 207 213 In the current release, leases are stored as metadata in each share file, and 208 214 no separate database is maintained. As a result, checking and expiring leases … … 229 235 crawler can be forcibly reset by stopping the node, deleting these two files, 230 236 then restarting the node. 
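A compact way to see how the expiration keys above interact is to restate the two decision rules in Python. This is only an illustration of the policy, not the storage server's lease crawler, and the duration-string parser is a simplification of what the config file accepts::

    import re, time, calendar

    def parse_duration(s):
        # Rough parser for duration strings such as "31days", "2mo",
        # "3 month", "2years" (simplified; months/years are approximated).
        m = re.match(r"\s*(\d+)\s*([a-z]+)\s*$", s.lower())
        if not m:
            raise ValueError("unparseable duration: %r" % (s,))
        n, unit = int(m.group(1)), m.group(2)
        for prefix, secs in [("day", 86400), ("mo", 30 * 86400),
                             ("month", 30 * 86400), ("year", 365 * 86400)]:
            if unit.startswith(prefix) or prefix.startswith(unit):
                return n * secs
        raise ValueError("unknown unit: %r" % (s,))

    def lease_is_expired(create_renew_timestamp, lease_duration,
                         mode="age", override_lease_duration=None,
                         cutoff_date=None, now=None):
        # Policy sketch mirroring the rules described above.
        now = time.time() if now is None else now
        if mode == "age":
            duration = lease_duration
            if override_lease_duration is not None:
                duration = parse_duration(override_lease_duration)
            return (create_renew_timestamp + duration) < now
        if mode == "cutoff-date":
            # midnight UTC at the beginning of the given day, e.g. "2009-01-16"
            cutoff = calendar.timegm(time.strptime(cutoff_date, "%Y-%m-%d"))
            return create_renew_timestamp < cutoff
        raise ValueError("unknown expire.mode: %r" % (mode,))

    # A lease renewed 40 days ago, under the fixed 31-day duration:
    print(lease_is_expired(time.time() - 40 * 86400, 31 * 86400))  # True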
231 237 232 == Future Directions == 238 Future Directions 239 ================= 233 240 234 241 Tahoe's GC mechanism is undergoing significant changes. The global 235 242 mark-and-sweep garbage-collection scheme can require considerable network -
docs/helper.txt
diff -rN -u old-tahoe-lafs/docs/helper.txt new-tahoe-lafs/docs/helper.txt
old new 1 = The Tahoe Upload Helper = 1 ======================= 2 The Tahoe Upload Helper 3 ======================= 4 5 1. `Overview`_ 6 2. `Setting Up A Helper`_ 7 3. `Using a Helper`_ 8 4. `Other Helper Modes`_ 2 9 3 1. Overview 4 2. Setting Up A Helper 5 3. Using a Helper 6 4. Other Helper Modes 7 8 == Overview == 10 Overview 11 ======== 9 12 10 13 As described in the "SWARMING DOWNLOAD, TRICKLING UPLOAD" section of 11 14 architecture.txt, Tahoe uploads require more bandwidth than downloads: you … … 45 48 other applications that are sharing the same uplink to compete more evenly 46 49 for the limited bandwidth. 47 50 48 49 50 == Setting Up A Helper == 51 Setting Up A Helper 52 =================== 51 53 52 54 Who should consider running a helper? 53 55 54 55 56 57 58 59 56 * Benevolent entities which wish to provide better upload speed for clients 57 that have slow uplinks 58 * Folks which have machines with upload bandwidth to spare. 59 * Server grid operators who want clients to connect to a small number of 60 helpers rather than a large number of storage servers (a "multi-tier" 61 architecture) 60 62 61 63 What sorts of machines are good candidates for running a helper? 62 64 63 64 65 66 67 68 69 70 71 72 65 * The Helper needs to have good bandwidth to the storage servers. In 66 particular, it needs to have at least 3.3x better upload bandwidth than 67 the client does, or the client might as well upload directly to the 68 storage servers. In a commercial grid, the helper should be in the same 69 colo (and preferably in the same rack) as the storage servers. 70 * The Helper will take on most of the CPU load involved in uploading a file. 71 So having a dedicated machine will give better results. 72 * The Helper buffers ciphertext on disk, so the host will need at least as 73 much free disk space as there will be simultaneous uploads. When an upload 74 is interrupted, that space will be used for a longer period of time. 73 75 74 76 To turn a Tahoe-LAFS node into a helper (i.e. to run a helper service in 75 77 addition to whatever else that node is doing), edit the tahoe.cfg file in your … … 82 84 helper: you will need to give this FURL to any clients that wish to use your 83 85 helper. 84 86 85 cat $BASEDIR/private/helper.furl |mail -s "helper furl" friend@example.com 87 :: 88 89 cat $BASEDIR/private/helper.furl | mail -s "helper furl" friend@example.com 86 90 87 91 You can tell if your node is running a helper by looking at its web status 88 92 page. Assuming that you've set up the 'webport' to use port 3456, point your … … 105 109 files in these directories that have not been modified for a week or two. 106 110 Future versions of tahoe will try to self-manage these files a bit better. 107 111 108 == Using a Helper == 112 Using a Helper 113 ============== 109 114 110 115 Who should consider using a Helper? 111 116 112 113 114 115 116 117 118 119 120 121 117 * clients with limited upstream bandwidth, such as a consumer ADSL line 118 * clients who believe that the helper will give them faster uploads than 119 they could achieve with a direct upload 120 * clients who experience problems with TCP connection fairness: if other 121 programs or machines in the same home are getting less than their fair 122 share of upload bandwidth. If the connection is being shared fairly, then 123 a Tahoe upload that is happening at the same time as a single FTP upload 124 should get half the bandwidth. 
125 * clients who have been given the helper.furl by someone who is running a 126 Helper and is willing to let them use it 122 127 123 128 To take advantage of somebody else's Helper, take the helper.furl file that 124 129 they give you, and copy it into your node's base directory, then restart the 125 130 node: 126 131 127 cat email >$BASEDIR/helper.furl 128 tahoe restart $BASEDIR 132 :: 133 134 cat email >$BASEDIR/helper.furl 135 tahoe restart $BASEDIR 129 136 130 137 This will signal the client to try and connect to the helper. Subsequent 131 138 uploads will use the helper rather than using direct connections to the … … 146 153 The upload/download status page (http://localhost:3456/status) will announce 147 154 the using-helper-or-not state of each upload, in the "Helper?" column. 148 155 149 == Other Helper Modes == 156 Other Helper Modes 157 ================== 150 158 151 159 The Tahoe Helper only currently helps with one kind of operation: uploading 152 160 immutable files. There are three other things it might be able to help with 153 161 in the future: 154 162 155 156 157 163 * downloading immutable files 164 * uploading mutable files (such as directories) 165 * downloading mutable files (like directories) 158 166 159 167 Since mutable files are currently limited in size, the ADSL upstream penalty 160 168 is not so severe for them. There is no ADSL penalty to downloads, but there -
docs/known_issues.txt
diff -rN -u old-tahoe-lafs/docs/known_issues.txt new-tahoe-lafs/docs/known_issues.txt
old new 1 = known issues = 1 ============ 2 Known issues 3 ============ 4 5 * `Overview`_ 6 * `Issues in Tahoe-LAFS v1.8.0, released 2010-09-23` 7 8 * `Potential unauthorized access by JavaScript in unrelated files`_ 9 * `Potential disclosure of file through embedded hyperlinks or JavaScript in that file`_ 10 * `Command-line arguments are leaked to other local users`_ 11 * `Capabilities may be leaked to web browser phishing filter / "safe browsing" servers`_ 12 * `Known issues in the FTP and SFTP frontends`_ 2 13 3 * overview 4 * issues in Tahoe-LAFS v1.8.0, released 2010-09-23 5 - potential unauthorized access by JavaScript in unrelated files 6 - potential disclosure of file through embedded hyperlinks or JavaScript in that file 7 - command-line arguments are leaked to other local users 8 - capabilities may be leaked to web browser phishing filter / "safe browsing" servers === 9 - known issues in the FTP and SFTP frontends === 10 11 == overview == 14 Overview 15 ======== 12 16 13 17 Below is a list of known issues in recent releases of Tahoe-LAFS, and how to 14 18 manage them. The current version of this file can be found at … … 21 25 22 26 http://tahoe-lafs.org/source/tahoe-lafs/trunk/docs/historical/historical_known_issues.txt 23 27 24 == issues in Tahoe-LAFS v1.8.0, released 2010-09-18 == 28 Issues in Tahoe-LAFS v1.8.0, released 2010-09-23 29 ================================================ 25 30 26 === potential unauthorized access by JavaScript in unrelated files === 31 Potential unauthorized access by JavaScript in unrelated files 32 -------------------------------------------------------------- 27 33 28 34 If you view a file stored in Tahoe-LAFS through a web user interface, 29 35 JavaScript embedded in that file might be able to access other files or … … 33 39 have the ability to modify the contents of those files or directories, 34 40 then that script could modify or delete those files or directories. 35 41 36 ==== how to manage it ==== 42 how to manage it 43 ~~~~~~~~~~~~~~~~ 37 44 38 45 For future versions of Tahoe-LAFS, we are considering ways to close off 39 46 this leakage of authority while preserving ease of use -- the discussion 40 of this issue is ticket #615.47 of this issue is ticket `#615 <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/615>`_. 41 48 42 49 For the present, either do not view files stored in Tahoe-LAFS through a 43 50 web user interface, or turn off JavaScript in your web browser before … … 45 52 malicious JavaScript. 46 53 47 54 48 === potential disclosure of file through embedded hyperlinks or JavaScript in that file === 55 Potential disclosure of file through embedded hyperlinks or JavaScript in that file 56 ----------------------------------------------------------------------------------- 49 57 50 58 If there is a file stored on a Tahoe-LAFS storage grid, and that file 51 59 gets downloaded and displayed in a web browser, then JavaScript or … … 61 69 browsers, so being careful which hyperlinks you click on is not 62 70 sufficient to prevent this from happening. 63 71 64 ==== how to manage it ==== 72 how to manage it 73 ~~~~~~~~~~~~~~~~ 65 74 66 75 For future versions of Tahoe-LAFS, we are considering ways to close off 67 76 this leakage of authority while preserving ease of use -- the discussion 68 of this issue is ticket #127.77 of this issue is ticket `#127 <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/127>`_. 
69 78 70 79 For the present, a good work-around is that if you want to store and 71 80 view a file on Tahoe-LAFS and you want that file to remain private, then … … 74 83 written to maliciously leak access. 75 84 76 85 77 === command-line arguments are leaked to other local users === 86 Command-line arguments are leaked to other local users 87 ------------------------------------------------------ 78 88 79 89 Remember that command-line arguments are visible to other users (through 80 90 the 'ps' command, or the windows Process Explorer tool), so if you are … … 83 93 arguments. This includes directory caps that you set up with the "tahoe 84 94 add-alias" command. 85 95 86 ==== how to manage it ==== 96 how to manage it 97 ~~~~~~~~~~~~~~~~ 87 98 88 99 As of Tahoe-LAFS v1.3.0 there is a "tahoe create-alias" command that does 89 100 the following technique for you. … … 91 102 Bypass add-alias and edit the NODEDIR/private/aliases file directly, by 92 103 adding a line like this: 93 104 94 fun: URI:DIR2:ovjy4yhylqlfoqg2vcze36dhde:4d4f47qko2xm5g7osgo2yyidi5m4muyo2vjjy53q4vjju2u55mfa105 fun: URI:DIR2:ovjy4yhylqlfoqg2vcze36dhde:4d4f47qko2xm5g7osgo2yyidi5m4muyo2vjjy53q4vjju2u55mfa 95 106 96 107 By entering the dircap through the editor, the command-line arguments 97 108 are bypassed, and other users will not be able to see them. Once you've … … 102 113 access to your files and directories. 103 114 104 115 105 === capabilities may be leaked to web browser phishing filter / "safe browsing" servers === 116 Capabilities may be leaked to web browser phishing filter / "safe browsing" servers 117 ----------------------------------------------------------------------------------- 106 118 107 119 Firefox, Internet Explorer, and Chrome include a "phishing filter" or 108 120 "safe browing" component, which is turned on by default, and which sends … … 134 146 version of this file stated that Firefox had abandoned their phishing 135 147 filter; this was incorrect. 136 148 137 ==== how to manage it ==== 149 how to manage it 150 ~~~~~~~~~~~~~~~~ 138 151 139 152 If you use any phishing filter or "safe browsing" feature, consider either 140 153 disabling it, or not using the WUI via that browser. Phishing filters have … … 143 156 or malware attackers have learnt how to bypass them. 144 157 145 158 To disable the filter in IE7 or IE8: 146 - Click Internet Options from the Tools menu. 147 - Click the Advanced tab. 148 - If an "Enable SmartScreen Filter" option is present, uncheck it. 149 If a "Use Phishing Filter" or "Phishing Filter" option is present, 150 set it to Disable. 151 - Confirm (click OK or Yes) out of all dialogs. 159 ```````````````````````````````````` 160 161 - Click Internet Options from the Tools menu. 162 163 - Click the Advanced tab. 164 165 - If an "Enable SmartScreen Filter" option is present, uncheck it. 166 If a "Use Phishing Filter" or "Phishing Filter" option is present, 167 set it to Disable. 168 169 - Confirm (click OK or Yes) out of all dialogs. 152 170 153 171 If you have a version of IE that splits the settings between security 154 172 zones, do this for all zones. 155 173 156 174 To disable the filter in Firefox: 157 - Click Options from the Tools menu. 158 - Click the Security tab. 159 - Uncheck both the "Block reported attack sites" and "Block reported 160 web forgeries" options. 161 - Click OK. 175 ````````````````````````````````` 176 177 - Click Options from the Tools menu. 178 179 - Click the Security tab. 
180 181 - Uncheck both the "Block reported attack sites" and "Block reported 182 web forgeries" options. 183 184 - Click OK. 162 185 163 186 To disable the filter in Chrome: 164 - Click Options from the Tools menu. 165 - Click the "Under the Hood" tab and find the "Privacy" section. 166 - Uncheck the "Enable phishing and malware protection" option. 167 - Click Close. 187 ```````````````````````````````` 188 189 - Click Options from the Tools menu. 190 191 - Click the "Under the Hood" tab and find the "Privacy" section. 192 193 - Uncheck the "Enable phishing and malware protection" option. 194 195 - Click Close. 168 196 169 197 170 === known issues in the FTP and SFTP frontends === 198 Known issues in the FTP and SFTP frontends 199 ------------------------------------------ 171 200 172 201 These are documented in docs/frontends/FTP-and-SFTP.txt and at 173 202 <http://tahoe-lafs.org/trac/tahoe-lafs/wiki/SftpFrontend>. -
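The command-line leakage issue above also suggests a programmatic variant of the work-around: append the alias line to NODEDIR/private/aliases from a script, so the dircap never appears as a process argument. The node directory below is a placeholder and the dircap is the example from the text; recent releases provide ``tahoe create-alias``, which does this for you::

    import os

    def add_alias(nodedir, alias, dircap):
        # Append "ALIAS: DIRCAP" to NODEDIR/private/aliases, as in the
        # manual work-around described above.
        path = os.path.join(nodedir, "private", "aliases")
        with open(path, "a") as f:
            f.write("%s: %s\n" % (alias, dircap))

    add_alias(os.path.expanduser("~/.tahoe"),  # placeholder node directory
              "fun",
              "URI:DIR2:ovjy4yhylqlfoqg2vcze36dhde:"
              "4d4f47qko2xm5g7osgo2yyidi5m4muyo2vjjy53q4vjju2u55mfa")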
docs/logging.txt
diff -rN -u old-tahoe-lafs/docs/logging.txt new-tahoe-lafs/docs/logging.txt
old new 1 = Tahoe Logging = 1 ============= 2 Tahoe Logging 3 ============= 4 5 1. `Overview`_ 6 2. `Realtime Logging`_ 7 3. `Incidents`_ 8 4. `Working with flogfiles`_ 9 5. `Gatherers`_ 10 11 1. `Incident Gatherer`_ 12 2. `Log Gatherer`_ 13 14 6. `Local twistd.log files`_ 15 7. `Adding log messages`_ 16 8. `Log Messages During Unit Tests`_ 2 17 3 1. Overview 4 2. Realtime Logging 5 3. Incidents 6 4. Working with flogfiles 7 5. Gatherers 8 5.1. Incident Gatherer 9 5.2. Log Gatherer 10 6. Local twistd.log files 11 7. Adding log messages 12 8. Log Messages During Unit Tests 13 14 == Overview == 18 Overview 19 ======== 15 20 16 21 Tahoe uses the Foolscap logging mechanism (known as the "flog" subsystem) to 17 22 record information about what is happening inside the Tahoe node. This is … … 26 31 /usr/bin/flogtool) which is used to get access to many foolscap logging 27 32 features. 28 33 29 == Realtime Logging == 34 Realtime Logging 35 ================ 30 36 31 37 When you are working on Tahoe code, and want to see what the node is doing, 32 38 the easiest tool to use is "flogtool tail". This connects to the tahoe node … … 37 43 BASEDIR/private/logport.furl . The following command will connect to this 38 44 port and start emitting log information: 39 45 40 flogtool tail BASEDIR/private/logport.furl46 flogtool tail BASEDIR/private/logport.furl 41 47 42 48 The "--save-to FILENAME" option will save all received events to a file, 43 49 where then can be examined later with "flogtool dump" or "flogtool … … 45 51 before subscribing to new ones (without --catch-up, you will only hear about 46 52 events that occur after the tool has connected and subscribed). 47 53 48 == Incidents == 54 Incidents 55 ========= 49 56 50 57 Foolscap keeps a short list of recent events in memory. When something goes 51 58 wrong, it writes all the history it has (and everything that gets logged in … … 72 79 parent/child relationships of log events is displayed in a nested format. 73 80 "flogtool web-viewer" is still fairly immature. 74 81 75 == Working with flogfiles == 82 Working with flogfiles 83 ====================== 76 84 77 85 The "flogtool filter" command can be used to take a large flogfile (perhaps 78 86 one created by the log-gatherer, see below) and copy a subset of events into … … 85 93 were emitted with a given facility (like foolscap.negotiation or 86 94 tahoe.upload). 87 95 88 == Gatherers == 96 Gatherers 97 ========= 89 98 90 99 In a deployed Tahoe grid, it is useful to get log information automatically 91 100 transferred to a central log-gatherer host. This offloads the (admittedly … … 101 110 The gatherer will write to files in its working directory, which can then be 102 111 examined with tools like "flogtool dump" as described above. 103 112 104 === Incident Gatherer === 113 Incident Gatherer 114 ----------------- 105 115 106 116 The "incident gatherer" only collects Incidents: records of the log events 107 117 that occurred just before and slightly after some high-level "trigger event" … … 120 130 "gatherer.tac" file should be modified to add classifier functions. 121 131 122 132 The incident gatherer writes incident names (which are simply the relative 123 pathname of the incident- *.flog.bz2 file) into classified/CATEGORY. For133 pathname of the incident-\*.flog.bz2 file) into classified/CATEGORY. 
For 124 134 example, the classified/mutable-retrieve-uncoordinated-write-error file 125 135 contains a list of all incidents which were triggered by an uncoordinated 126 136 write that was detected during mutable file retrieval (caused when somebody … … 145 155 node which generated it to the gatherer. The gatherer will automatically 146 156 catch up to any incidents which occurred while it is offline. 147 157 148 === Log Gatherer === 158 Log Gatherer 159 ------------ 149 160 150 161 The "Log Gatherer" subscribes to hear about every single event published by 151 162 the connected nodes, regardless of severity. This server writes these log … … 172 183 the outbound queue grows too large. When this occurs, there will be gaps 173 184 (non-sequential event numbers) in the log-gatherer's flogfiles. 174 185 175 == Local twistd.log files == 186 Local twistd.log files 187 ====================== 176 188 177 189 [TODO: not yet true, requires foolscap-0.3.1 and a change to allmydata.node] 178 190 … … 188 200 (i.e. not the log.NOISY debugging events). In addition, foolscap internal 189 201 events (like connection negotiation messages) are not bridged to twistd.log . 190 202 191 == Adding log messages == 203 Adding log messages 204 =================== 192 205 193 206 When adding new code, the Tahoe developer should add a reasonable number of 194 207 new log events. For details, please see the Foolscap logging documentation, 195 208 but a few notes are worth stating here: 196 209 197 210 * use a facility prefix of "tahoe.", like "tahoe.mutable.publish" 198 211 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 212 * assign each severe (log.WEIRD or higher) event a unique message 213 identifier, as the umid= argument to the log.msg() call. The 214 misc/coding_tools/make_umid script may be useful for this purpose. This will make it 215 easier to write a classification function for these messages. 216 217 * use the parent= argument whenever the event is causally/temporally 218 clustered with its parent. For example, a download process that involves 219 three sequential hash fetches could announce the send and receipt of those 220 hash-fetch messages with a parent= argument that ties them to the overall 221 download process. However, each new wapi download request should be 222 unparented. 223 224 * use the format= argument in preference to the message= argument. E.g. 225 use log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k) instead of 226 log.msg("got %d shares, need %d" % (n,k)). This will allow later tools to 227 analyze the event without needing to scrape/reconstruct the structured 228 data out of the formatted string. 229 230 * Pass extra information as extra keyword arguments, even if they aren't 231 included in the format= string. This information will be displayed in the 232 "flogtool dump --verbose" output, as well as being available to other 233 tools. The umid= argument should be passed this way. 234 235 * use log.err for the catch-all addErrback that gets attached to the end of 236 any given Deferred chain. When used in conjunction with LOGTOTWISTED=1, 237 log.err() will tell Twisted about the error-nature of the log message, 238 causing Trial to flunk the test (with an "ERROR" indication that prints a 239 copy of the Failure, including a traceback). 
Don't use log.err for events 240 that are BAD but handled (like hash failures: since these are often 241 deliberately provoked by test code, they should not cause test failures): 242 use log.msg(level=BAD) for those instead. 230 243 231 244 232 == Log Messages During Unit Tests == 245 Log Messages During Unit Tests 246 ============================== 233 247 234 248 If a test is failing and you aren't sure why, start by enabling 235 249 FLOGTOTWISTED=1 like this: 236 250 237 make test FLOGTOTWISTED=1251 make test FLOGTOTWISTED=1 238 252 239 253 With FLOGTOTWISTED=1, sufficiently-important log events will be written into 240 254 _trial_temp/test.log, which may give you more ideas about why the test is … … 246 260 If that isn't enough, look at the detailed foolscap logging messages instead, 247 261 by running the tests like this: 248 262 249 make test FLOGFILE=flog.out.bz2 FLOGLEVEL=1 FLOGTOTWISTED=1263 make test FLOGFILE=flog.out.bz2 FLOGLEVEL=1 FLOGTOTWISTED=1 250 264 251 265 The first environment variable will cause foolscap log events to be written 252 266 to ./flog.out.bz2 (instead of merely being recorded in the circular buffers -
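To make the ``format=``/``parent=``/``umid=`` guidance above concrete, here is a small sketch using the Foolscap log module. The import path, level name, and keyword arguments follow the Foolscap logging API as described in this file (consult the Foolscap documentation if they have moved); the facility name and umid values are placeholders::

    from foolscap.logging import log
    from foolscap.logging.log import BAD

    def report_shares(n, k, parent=None):
        # Structured event: format= plus keyword data, so later tools can
        # analyze the fields without scraping the rendered string.
        return log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k,
                       facility="tahoe.example", parent=parent, umid="xxxxxx")

    def report_hash_failure(which):
        # A BAD-but-handled event uses log.msg(level=...), not log.err().
        log.msg(format="hash failure on share %(which)d", which=which,
                level=BAD, facility="tahoe.example", umid="yyyyyy")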
docs/performance.txt
diff -rN -u old-tahoe-lafs/docs/performance.txt new-tahoe-lafs/docs/performance.txt
old new 1 = Performance costs for some common operations = 1 ============================================ 2 Performance costs for some common operations 3 ============================================ 4 5 1. `Publishing an A-byte immutable file`_ 6 2. `Publishing an A-byte mutable file`_ 7 3. `Downloading B bytes of an A-byte immutable file`_ 8 4. `Downloading B bytes of an A-byte mutable file`_ 9 5. `Modifying B bytes of an A-byte mutable file`_ 10 6. `Inserting/Removing B bytes in an A-byte mutable file`_ 11 7. `Adding an entry to an A-entry directory`_ 12 8. `Listing an A entry directory`_ 13 9. `Performing a file-check on an A-byte file`_ 14 10. `Performing a file-verify on an A-byte file`_ 15 11. `Repairing an A-byte file (mutable or immutable)`_ 2 16 3 1. Publishing an A-byte immutable file 4 2. Publishing an A-byte mutable file 5 3. Downloading B bytes of an A-byte immutable file 6 4. Downloading B bytes of an A-byte mutable file 7 5. Modifying B bytes of an A-byte mutable file 8 6. Inserting/Removing B bytes in an A-byte mutable file 9 7. Adding an entry to an A-entry directory 10 8. Listing an A entry directory 11 9. Performing a file-check on an A-byte file 12 10. Performing a file-verify on an A-byte file 13 11. Repairing an A-byte file (mutable or immutable) 14 15 == Publishing an A-byte immutable file == 17 Publishing an ``A``-byte immutable file 18 ======================================= 16 19 17 20 network: A 21 18 22 memory footprint: N/k*128KiB 19 23 20 24 notes: An immutable file upload requires an additional I/O pass over the entire 21 22 23 25 source file before the upload process can start, since convergent 26 encryption derives the encryption key in part from the contents of the 27 source file. 24 28 25 == Publishing an A-byte mutable file == 29 Publishing an ``A``-byte mutable file 30 ===================================== 26 31 27 32 network: A 33 28 34 memory footprint: N/k*A 35 29 36 cpu: O(A) + a large constant for RSA keypair generation 30 37 31 notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that 32 it publishes to a grid. This takes up to 1 or 2 seconds on a 33 typical desktop PC. 34 35 Part of the process of encrypting, encoding, and uploading a 36 mutable file to a Tahoe-LAFS grid requires that the entire file 37 be in memory at once. For larger files, this may cause 38 Tahoe-LAFS to have an unacceptably large memory footprint (at 39 least when uploading a mutable file). 38 notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that it 39 publishes to a grid. This takes up to 1 or 2 seconds on a typical desktop PC. 40 40 41 == Downloading B bytes of an A-byte immutable file == 41 Part of the process of encrypting, encoding, and uploading a mutable file to a 42 Tahoe-LAFS grid requires that the entire file be in memory at once. For larger 43 files, this may cause Tahoe-LAFS to have an unacceptably large memory footprint 44 (at least when uploading a mutable file). 45 46 Downloading ``B`` bytes of an ``A``-byte immutable file 47 ======================================================= 42 48 43 49 network: B 50 44 51 memory footprint: 128KiB 45 52 46 53 notes: When Tahoe-LAFS 1.8.0 or later is asked to read an arbitrary range 47 48 54 of an immutable file, only the 128-KiB segments that overlap the 55 requested range will be downloaded. 
49 56 50 51 52 57 (Earlier versions would download from the beginning of the file up 58 until the end of the requested range, and then continue to download 59 the rest of the file even after the request was satisfied.) 53 60 54 == Downloading B bytes of an A-byte mutable file == 61 Downloading ``B`` bytes of an ``A``-byte mutable file 62 ===================================================== 55 63 56 64 network: A 65 57 66 memory footprint: A 58 67 59 68 notes: As currently implemented, mutable files must be downloaded in 60 61 69 their entirety before any part of them can be read. We are 70 exploring fixes for this; see ticket #393 for more information. 62 71 63 == Modifying B bytes of an A-byte mutable file == 72 Modifying ``B`` bytes of an ``A``-byte mutable file 73 =================================================== 64 74 65 75 network: A 76 66 77 memory footprint: N/k*A 67 78 68 79 notes: If you upload a changed version of a mutable file that you 69 70 71 72 73 74 80 earlier put onto your grid with, say, 'tahoe put --mutable', 81 Tahoe-LAFS will replace the old file with the new file on the 82 grid, rather than attempting to modify only those portions of the 83 file that have changed. Modifying a file in this manner is 84 essentially uploading the file over again, except that it re-uses 85 the existing RSA keypair instead of generating a new one. 75 86 76 == Inserting/Removing B bytes in an A-byte mutable file == 87 Inserting/Removing ``B`` bytes in an ``A``-byte mutable file 88 ============================================================ 77 89 78 90 network: A 91 79 92 memory footprint: N/k*A 80 93 81 94 notes: Modifying any part of a mutable file in Tahoe-LAFS requires that 82 83 84 85 86 87 88 89 95 the entire file be downloaded, modified, held in memory while it is 96 encrypted and encoded, and then re-uploaded. A future version of the 97 mutable file layout ("LDMF") may provide efficient inserts and 98 deletes. Note that this sort of modification is mostly used internally 99 for directories, and isn't something that the WUI, CLI, or other 100 interfaces will do -- instead, they will simply overwrite the file to 101 be modified, as described in "Modifying B bytes of an A-byte mutable 102 file". 90 103 91 == Adding an entry to an A-entry directory == 104 Adding an entry to an ``A``-entry directory 105 =========================================== 92 106 93 107 network: O(A) 108 94 109 memory footprint: N/k*A 95 110 96 111 notes: In Tahoe-LAFS, directories are implemented as specialized mutable 97 98 112 files. So adding an entry to a directory is essentially adding B 113 (actually, 300-330) bytes somewhere in an existing mutable file. 99 114 100 == Listing an A entry directory == 115 Listing an ``A`` entry directory 116 ================================ 101 117 102 118 network: O(A) 119 103 120 memory footprint: N/k*A 104 121 105 122 notes: Listing a directory requires that the mutable file storing the 106 107 108 123 directory be downloaded from the grid. So listing an A entry 124 directory requires downloading a (roughly) 330 * A byte mutable 125 file, since each directory entry is about 300-330 bytes in size. 
109 126 110 == Performing a file-check on an A-byte file == 127 Performing a file-check on an ``A``-byte file 128 ============================================= 111 129 112 130 network: O(S), where S is the number of servers on your grid 131 113 132 memory footprint: negligible 114 133 115 134 notes: To check a file, Tahoe-LAFS queries all the servers that it knows 116 117 118 135 about. Note that neither of these values directly depend on the size 136 of the file. This is relatively inexpensive, compared to the verify 137 and repair operations. 119 138 120 == Performing a file-verify on an A-byte file == 139 Performing a file-verify on an ``A``-byte file 140 ============================================== 121 141 122 142 network: N/k*A 143 123 144 memory footprint: N/k*128KiB 124 145 125 146 notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext 126 127 128 129 147 shares that were originally uploaded to the grid and integrity 148 checks them. This is, for well-behaved grids, likely to be more 149 expensive than downloading an A-byte file, since only a fraction 150 of these shares are necessary to recover the file. 130 151 131 == Repairing an A-byte file (mutable or immutable) == 152 Repairing an ``A``-byte file (mutable or immutable) 153 =================================================== 132 154 133 155 network: variable; up to around O(A) 156 134 157 memory footprint: from 128KiB to (1+N/k)*128KiB 135 158 136 159 notes: To repair a file, Tahoe-LAFS downloads the file, and generates/uploads 137 138 139 160 missing shares in the same way as when it initially uploads the file. 161 So, depending on how many shares are missing, this can be about as 162 expensive as initially uploading the file in the first place. -
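The entries above lend themselves to quick back-of-the-envelope estimates. Here is a sketch using the formulas from this file, assuming the default 3-of-10 encoding and the ~330-bytes-per-entry figure from the directory discussion::

    KiB = 1024

    def expansion(n=10, k=3):
        # Share expansion factor N/k (3-of-10 is the default encoding).
        return float(n) / k

    def publish_immutable(size, n=10, k=3):
        # Per the entry above: network ~= A bytes, memory ~= N/k * 128KiB.
        return {"network_bytes": size,
                "memory_bytes": expansion(n, k) * 128 * KiB}

    def list_directory(entries, n=10, k=3):
        # Listing an A-entry directory downloads a ~330*A byte mutable file,
        # which is held in memory with roughly an N/k expansion.
        dirsize = 330 * entries
        return {"network_bytes": dirsize,
                "memory_bytes": expansion(n, k) * dirsize}

    print(publish_immutable(10 * 1000 * 1000))  # a 10 MB example file
    print(list_directory(1000))                 # a 1000-entry example directory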
docs/stats.txt
diff -rN -u old-tahoe-lafs/docs/stats.txt new-tahoe-lafs/docs/stats.txt
old new 1 = Tahoe Statistics = 1 ================ 2 Tahoe Statistics 3 ================ 4 5 1. `Overview`_ 6 2. `Statistics Categories`_ 7 3. `Running a Tahoe Stats-Gatherer Service`_ 8 4. `Using Munin To Graph Stats Values`_ 2 9 3 1. Overview 4 2. Statistics Categories 5 3. Running a Tahoe Stats-Gatherer Service 6 4. Using Munin To Graph Stats Values 7 8 == Overview == 10 Overview 11 ======== 9 12 10 13 Each Tahoe node collects and publishes statistics about its operations as it 11 14 runs. These include counters of how many files have been uploaded and … … 20 23 block, along with a copy of the raw counters. To obtain just the raw counters 21 24 (in JSON format), use /statistics?t=json instead. 22 25 23 == Statistics Categories == 26 Statistics Categories 27 ===================== 24 28 25 29 The stats dictionary contains two keys: 'counters' and 'stats'. 'counters' 26 30 are strictly counters: they are reset to zero when the node is started, and … … 35 39 36 40 The currently available stats (as of release 1.6.0 or so) are described here: 37 41 38 counters.storage_server.*: this group counts inbound storage-server 39 operations. They are not provided by client-only 40 nodes which have been configured to not run a 41 storage server (with [storage]enabled=false in 42 tahoe.cfg) 43 allocate, write, close, abort: these are for immutable file uploads. 44 'allocate' is incremented when a client asks 45 if it can upload a share to the server. 46 'write' is incremented for each chunk of 47 data written. 'close' is incremented when 48 the share is finished. 'abort' is 49 incremented if the client abandons the 50 uploaed. 51 get, read: these are for immutable file downloads. 'get' is incremented 52 when a client asks if the server has a specific share. 'read' is 53 incremented for each chunk of data read. 54 readv, writev: these are for immutable file creation, publish, and 55 retrieve. 'readv' is incremented each time a client reads 56 part of a mutable share. 'writev' is incremented each time a 57 client sends a modification request. 58 add-lease, renew, cancel: these are for share lease modifications. 59 'add-lease' is incremented when an 'add-lease' 60 operation is performed (which either adds a new 61 lease or renews an existing lease). 'renew' is 62 for the 'renew-lease' operation (which can only 63 be used to renew an existing one). 'cancel' is 64 used for the 'cancel-lease' operation. 65 bytes_freed: this counts how many bytes were freed when a 'cancel-lease' 66 operation removed the last lease from a share and the share 67 was thus deleted. 68 bytes_added: this counts how many bytes were consumed by immutable share 69 uploads. It is incremented at the same time as the 'close' 70 counter. 71 72 stats.storage_server.*: 73 allocated: this counts how many bytes are currently 'allocated', which 74 tracks the space that will eventually be consumed by immutable 75 share upload operations. The stat is increased as soon as the 76 upload begins (at the same time the 'allocated' counter is 77 incremented), and goes back to zero when the 'close' or 'abort' 78 message is received (at which point the 'disk_used' stat should 79 incremented by the same amount). 80 disk_total 81 disk_used 82 disk_free_for_root 83 disk_free_for_nonroot 84 disk_avail 85 reserved_space: these all reflect disk-space usage policies and status. 86 'disk_total' is the total size of disk where the storage 87 server's BASEDIR/storage/shares directory lives, as reported 88 by /bin/df or equivalent. 
'disk_used', 'disk_free_for_root', 89 and 'disk_free_for_nonroot' show related information. 90 'reserved_space' reports the reservation configured by the 91 tahoe.cfg [storage]reserved_space value. 'disk_avail' 92 reports the remaining disk space available for the Tahoe 93 server after subtracting reserved_space from disk_avail. All 94 values are in bytes. 95 accepting_immutable_shares: this is '1' if the storage server is currently 96 accepting uploads of immutable shares. It may be 97 '0' if a server is disabled by configuration, or 98 if the disk is full (i.e. disk_avail is less 99 than reserved_space). 100 total_bucket_count: this counts the number of 'buckets' (i.e. unique 101 storage-index values) currently managed by the storage 102 server. It indicates roughly how many files are managed 103 by the server. 104 latencies.*.*: these stats keep track of local disk latencies for 105 storage-server operations. A number of percentile values are 106 tracked for many operations. For example, 107 'storage_server.latencies.readv.50_0_percentile' records the 108 median response time for a 'readv' request. All values are in 109 seconds. These are recorded by the storage server, starting 110 from the time the request arrives (post-deserialization) and 111 ending when the response begins serialization. As such, they 112 are mostly useful for measuring disk speeds. The operations 113 tracked are the same as the counters.storage_server.* counter 114 values (allocate, write, close, get, read, add-lease, renew, 115 cancel, readv, writev). The percentile values tracked are: 116 mean, 01_0_percentile, 10_0_percentile, 50_0_percentile, 117 90_0_percentile, 95_0_percentile, 99_0_percentile, 118 99_9_percentile. (the last value, 99.9 percentile, means that 119 999 out of the last 1000 operations were faster than the 120 given number, and is the same threshold used by Amazon's 121 internal SLA, according to the Dynamo paper). 122 123 counters.uploader.files_uploaded 124 counters.uploader.bytes_uploaded 125 counters.downloader.files_downloaded 126 counters.downloader.bytes_downloaded 127 128 These count client activity: a Tahoe client will increment these when it 129 uploads or downloads an immutable file. 'files_uploaded' is incremented by 130 one for each operation, while 'bytes_uploaded' is incremented by the size of 131 the file. 132 133 counters.mutable.files_published 134 counters.mutable.bytes_published 135 counters.mutable.files_retrieved 136 counters.mutable.bytes_retrieved 42 **counters.storage_server.\*** 43 44 this group counts inbound storage-server operations. They are not provided 45 by client-only nodes which have been configured to not run a storage server 46 (with [storage]enabled=false in tahoe.cfg) 47 48 allocate, write, close, abort 49 these are for immutable file uploads. 'allocate' is incremented when a 50 client asks if it can upload a share to the server. 'write' is 51 incremented for each chunk of data written. 'close' is incremented when 52 the share is finished. 'abort' is incremented if the client abandons 53 the upload. 54 55 get, read 56 these are for immutable file downloads. 'get' is incremented 57 when a client asks if the server has a specific share. 'read' is 58 incremented for each chunk of data read. 59 60 readv, writev 61 these are for immutable file creation, publish, and retrieve. 'readv' 62 is incremented each time a client reads part of a mutable share. 63 'writev' is incremented each time a client sends a modification 64 request. 
65 66 add-lease, renew, cancel 67 these are for share lease modifications. 'add-lease' is incremented 68 when an 'add-lease' operation is performed (which either adds a new 69 lease or renews an existing lease). 'renew' is for the 'renew-lease' 70 operation (which can only be used to renew an existing one). 'cancel' 71 is used for the 'cancel-lease' operation. 72 73 bytes_freed 74 this counts how many bytes were freed when a 'cancel-lease' 75 operation removed the last lease from a share and the share 76 was thus deleted. 77 78 bytes_added 79 this counts how many bytes were consumed by immutable share 80 uploads. It is incremented at the same time as the 'close' 81 counter. 82 83 **stats.storage_server.\*** 84 85 allocated 86 this counts how many bytes are currently 'allocated', which 87 tracks the space that will eventually be consumed by immutable 88 share upload operations. The stat is increased as soon as the 89 upload begins (at the same time the 'allocated' counter is 90 incremented), and goes back to zero when the 'close' or 'abort' 91 message is received (at which point the 'disk_used' stat should 92 incremented by the same amount). 93 94 disk_total, disk_used, disk_free_for_root, disk_free_for_nonroot, disk_avail, reserved_space 95 these all reflect disk-space usage policies and status. 96 'disk_total' is the total size of disk where the storage 97 server's BASEDIR/storage/shares directory lives, as reported 98 by /bin/df or equivalent. 'disk_used', 'disk_free_for_root', 99 and 'disk_free_for_nonroot' show related information. 100 'reserved_space' reports the reservation configured by the 101 tahoe.cfg [storage]reserved_space value. 'disk_avail' 102 reports the remaining disk space available for the Tahoe 103 server after subtracting reserved_space from disk_avail. All 104 values are in bytes. 105 106 accepting_immutable_shares 107 this is '1' if the storage server is currently accepting uploads of 108 immutable shares. It may be '0' if a server is disabled by 109 configuration, or if the disk is full (i.e. disk_avail is less than 110 reserved_space). 111 112 total_bucket_count 113 this counts the number of 'buckets' (i.e. unique 114 storage-index values) currently managed by the storage 115 server. It indicates roughly how many files are managed 116 by the server. 117 118 latencies.*.* 119 these stats keep track of local disk latencies for 120 storage-server operations. A number of percentile values are 121 tracked for many operations. For example, 122 'storage_server.latencies.readv.50_0_percentile' records the 123 median response time for a 'readv' request. All values are in 124 seconds. These are recorded by the storage server, starting 125 from the time the request arrives (post-deserialization) and 126 ending when the response begins serialization. As such, they 127 are mostly useful for measuring disk speeds. The operations 128 tracked are the same as the counters.storage_server.* counter 129 values (allocate, write, close, get, read, add-lease, renew, 130 cancel, readv, writev). The percentile values tracked are: 131 mean, 01_0_percentile, 10_0_percentile, 50_0_percentile, 132 90_0_percentile, 95_0_percentile, 99_0_percentile, 133 99_9_percentile. (the last value, 99.9 percentile, means that 134 999 out of the last 1000 operations were faster than the 135 given number, and is the same threshold used by Amazon's 136 internal SLA, according to the Dynamo paper). 
137 138 **counters.uploader.files_uploaded** 139 140 **counters.uploader.bytes_uploaded** 141 142 **counters.downloader.files_downloaded** 143 144 **counters.downloader.bytes_downloaded** 145 146 These count client activity: a Tahoe client will increment these when it 147 uploads or downloads an immutable file. 'files_uploaded' is incremented by 148 one for each operation, while 'bytes_uploaded' is incremented by the size of 149 the file. 150 151 **counters.mutable.files_published** 152 153 **counters.mutable.bytes_published** 154 155 **counters.mutable.files_retrieved** 156 157 **counters.mutable.bytes_retrieved** 137 158 138 159 These count client activity for mutable files. 'published' is the act of 139 160 changing an existing mutable file (or creating a brand-new mutable file). 140 161 'retrieved' is the act of reading its current contents. 141 162 142 counters.chk_upload_helper.* 163 **counters.chk_upload_helper.\*** 164 165 These count activity of the "Helper", which receives ciphertext from clients 166 and performs erasure-coding and share upload for files that are not already 167 in the grid. The code which implements these counters is in 168 src/allmydata/immutable/offloaded.py . 169 170 upload_requests 171 incremented each time a client asks to upload a file 172 upload_already_present: incremented when the file is already in the grid 173 174 upload_need_upload 175 incremented when the file is not already in the grid 176 177 resumes 178 incremented when the helper already has partial ciphertext for 179 the requested upload, indicating that the client is resuming an 180 earlier upload 181 182 fetched_bytes 183 this counts how many bytes of ciphertext have been fetched 184 from uploading clients 185 186 encoded_bytes 187 this counts how many bytes of ciphertext have been 188 encoded and turned into successfully-uploaded shares. If no 189 uploads have failed or been abandoned, encoded_bytes should 190 eventually equal fetched_bytes. 191 192 **stats.chk_upload_helper.\*** 193 194 These also track Helper activity: 195 196 active_uploads 197 how many files are currently being uploaded. 0 when idle. 198 199 incoming_count 200 how many cache files are present in the incoming/ directory, 201 which holds ciphertext files that are still being fetched 202 from the client 143 203 144 These count activity of the "Helper", which receives ciphertext from clients 145 and performs erasure-coding and share upload for files that are not already 146 in the grid. The code which implements these counters is in 147 src/allmydata/immutable/offloaded.py . 148 149 upload_requests: incremented each time a client asks to upload a file 150 upload_already_present: incremented when the file is already in the grid 151 upload_need_upload: incremented when the file is not already in the grid 152 resumes: incremented when the helper already has partial ciphertext for 153 the requested upload, indicating that the client is resuming an 154 earlier upload 155 fetched_bytes: this counts how many bytes of ciphertext have been fetched 156 from uploading clients 157 encoded_bytes: this counts how many bytes of ciphertext have been 158 encoded and turned into successfully-uploaded shares. If no 159 uploads have failed or been abandoned, encoded_bytes should 160 eventually equal fetched_bytes. 161 162 stats.chk_upload_helper.* 163 164 These also track Helper activity: 165 166 active_uploads: how many files are currently being uploaded. 0 when idle. 
167 incoming_count: how many cache files are present in the incoming/ directory,
168 which holds ciphertext files that are still being fetched
169 from the client
170 incoming_size: total size of cache files in the incoming/ directory
171 incoming_size_old: total size of 'old' cache files (more than 48 hours)
172 encoding_count: how many cache files are present in the encoding/ directory,
173 which holds ciphertext files that are being encoded and
174 uploaded
175 encoding_size: total size of cache files in the encoding/ directory
176 encoding_size_old: total size of 'old' cache files (more than 48 hours)
177
178 stats.node.uptime: how many seconds since the node process was started
179
180 stats.cpu_monitor.*:
181 .1min_avg, 5min_avg, 15min_avg: estimate of what percentage of system CPU
182 time was consumed by the node process, over
183 the given time interval. Expressed as a
184 float, 0.0 for 0%, 1.0 for 100%
185 .total: estimate of total number of CPU seconds consumed by node since
186 the process was started. Ticket #472 indicates that .total may
187 sometimes be negative due to wraparound of the kernel's counter.
188
189 stats.load_monitor.*:
190 When enabled, the "load monitor" continually schedules a one-second
191 callback, and measures how late the response is. This estimates system load
192 (if the system is idle, the response should be on time). This is only
193 enabled if a stats-gatherer is configured.
204 incoming_size
205 total size of cache files in the incoming/ directory
194 206
195 .avg_load: average "load" value (seconds late) over the last minute
196 .max_load: maximum "load" value over the last minute
207 incoming_size_old
208 total size of 'old' cache files (more than 48 hours)
197 209
210 encoding_count
211 how many cache files are present in the encoding/ directory,
212 which holds ciphertext files that are being encoded and
213 uploaded
198 214
199 == Running a Tahoe Stats-Gatherer Service ==
215 encoding_size
216 total size of cache files in the encoding/ directory
217
218 encoding_size_old
219 total size of 'old' cache files (more than 48 hours)
220
221 **stats.node.uptime**
222 how many seconds since the node process was started
223
224 **stats.cpu_monitor.\***
225
226 1min_avg, 5min_avg, 15min_avg
227 estimate of what percentage of system CPU time was consumed by the
228 node process, over the given time interval. Expressed as a float, 0.0
229 for 0%, 1.0 for 100%
230
231 total
232 estimate of total number of CPU seconds consumed by node since
233 the process was started. Ticket #472 indicates that .total may
234 sometimes be negative due to wraparound of the kernel's counter.
235
236 **stats.load_monitor.\***
237
238 When enabled, the "load monitor" continually schedules a one-second
239 callback, and measures how late the response is. This estimates system load
240 (if the system is idle, the response should be on time). This is only
241 enabled if a stats-gatherer is configured.
242
243 avg_load
244 average "load" value (seconds late) over the last minute
245
246 max_load
247 maximum "load" value over the last minute
248
249
250 Running a Tahoe Stats-Gatherer Service
251 ======================================
200 252
201 253 The "stats-gatherer" is a simple daemon that periodically collects stats from
202 254 several tahoe nodes. It could be useful, e.g., in a production environment,
…
204 256 host. It merely gathers statistics from many nodes into a single place: it
205 257 does not do any actual analysis.
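The node-level stats above lend themselves to a one-line health summary. The sketch below makes the same assumptions about key names (a ``stats`` dict from the node's statistics page); the load_monitor values are only present when a stats-gatherer is configured::

    def node_health(stats):
        # key names are assumed to mirror the dotted names in this document
        uptime = stats.get("node.uptime", 0)
        cpu_5min = stats.get("cpu_monitor.5min_avg", 0.0)   # fraction, 0.0 .. 1.0
        avg_load = stats.get("load_monitor.avg_load")       # seconds late, may be absent

        line = "up %dh%02dm, CPU %.1f%% (5 min avg)" % (
            uptime // 3600, (uptime % 3600) // 60, cpu_5min * 100.0)
        if avg_load is not None:
            line += ", callbacks %.2fs late on average" % avg_load
        return line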
206 258
207 The stats gatherer listens on a network port using the same Foolscap
259 The stats gatherer listens on a network port using the same Foolscap_
208 260 connection library that Tahoe clients use to connect to storage servers.
209 261 Tahoe nodes can be configured to connect to the stats gatherer and publish
210 their stats on a periodic basis. ( in fact, what happens is that nodes connect
262 their stats on a periodic basis. (In fact, what happens is that nodes connect
211 263 to the gatherer and offer it a second FURL which points back to the node's
212 264 "stats port", which the gatherer then uses to pull stats on a periodic basis.
213 265 The initial connection is flipped to allow the nodes to live behind NAT
214 boxes, as long as the stats-gatherer has a reachable IP address)
266 boxes, as long as the stats-gatherer has a reachable IP address.)
267
268 .. _Foolscap: http://foolscap.lothar.com/trac
215 269
216 270 The stats-gatherer is created in the same fashion as regular tahoe client
217 271 nodes and introducer nodes. Choose a base directory for the gatherer to live
218 272 in (but do not create the directory). Then run:
219 273
220 tahoe create-stats-gatherer $BASEDIR
274 ::
275
276 tahoe create-stats-gatherer $BASEDIR
221 277
222 278 and start it with "tahoe start $BASEDIR". Once running, the gatherer will
223 279 write a FURL into $BASEDIR/stats_gatherer.furl .
… …
226 282 this FURL into the node's tahoe.cfg file, in a section named "[client]",
227 283 under a key named "stats_gatherer.furl", like so:
228 284
229 [client]
230 stats_gatherer.furl = pb://qbo4ktl667zmtiuou6lwbjryli2brv6t@192.168.0.8:49997/wxycb4kaexzskubjnauxeoptympyf45y
285 ::
286
287 [client]
288 stats_gatherer.furl = pb://qbo4ktl667zmtiuou6lwbjryli2brv6t@192.168.0.8:49997/wxycb4kaexzskubjnauxeoptympyf45y
231 289
232 290 or simply copy the stats_gatherer.furl file into the node's base directory
233 291 (next to the tahoe.cfg file): it will be interpreted in the same way.
… …
256 314 total-disk-available number for the entire grid (however, the "disk watcher"
257 315 daemon, in misc/operations_helpers/spacetime/, is better suited for this specific task).
258 316
259 == Using Munin To Graph Stats Values ==
317 Using Munin To Graph Stats Values
318 =================================
260 319
261 320 The misc/munin/ directory contains various plugins to graph stats for Tahoe
262 nodes. They are intended for use with the Munin system-management tool, which
321 nodes. They are intended for use with the Munin_ system-management tool, which
263 322 typically polls target systems every 5 minutes and produces a web page with
264 323 graphs of various things over multiple time scales (last hour, last month,
265 324 last year).
266 325
326 .. _Munin: http://munin-monitoring.org/
327
267 328 Most of the plugins are designed to pull stats from a single Tahoe node, and
268 329 are configured with a URL like http://localhost:3456/statistics?t=json . The
269 330 "tahoe_stats" plugin is designed to read from the pickle file created by the