
Version 11 (modified by zooko, at 2012-02-03T13:46:35Z)

the names of those docs have changed to "old-"...

This is a place to share our work on the Accounting task. Ticket #666 is being used to track this long-term project.

Accounting

Tahoe-1.1 is lacking a significant feature: the ability for the admin of a storage server to keep track of how much disk space is being used by various clients. Friendnet operators can use this information to see how much their friends are using (and perhaps ask them to use less). Commercial grid operators can use this to bill customers according to how much space they use, or to limit them to a pre-defined plan ("1GB maximum"). (related tickets: #119)

Tahoe needs better handling of disk-full situations (related tickets: #390), but this is a server-wide issue, whereas Accounting is specifically per-user.

Potential Requirements

Here is a list of potential requirements, in no particular order of when (or if) we actually need them to be completed:

  1. only grant space to approved clients: "Larry the Leech" should not be able to upload files or cause existing files to be retained
  2. be able to answer the question "How much space is Bob using?"
    • 2a. asking this question about a single server (friendnet)
    • 2b. asking this question system-wide: summed across all storage servers (commercial grid)
  3. prior restraint: prevent Bob from consuming more than X bytes per server
  4. disable a previously-allowed account / revocation
  5. expiration: revoke permission after some amount of time, unless explicitly renewed
  6. delegation: if Bob has permission, he can grant some to Little Bobby Junior
    • 6a. subdivision / resellers : commercial grid operator grants space to a business partner
    • 6b. repair caps: clients delegate limited upload authority to a repairer
    • 6c. renewal cap: clients delegate lease-renewal authority
    • 6d. helper: clients enable the helper to upload files for them
  7. auditing: who owns this share, how did they get permission to upload it
  8. reconciliation / garbage collection : which shares does Bob own?
  9. measure traffic: how many bytes did Bob upload or download (as opposed to how much is he currently storing)

Immediate Goals

We've established a smaller set of goals for the next few releases:

  • 1.2 (august 08):
    • be able to answer "how much does Bob store?"
    • deny service to Larry The Leech
    • enable accounting in both friendnet and commercial grids
    • enable accounting in webapi and helper interfaces
  • 1.3 (september?)
    • be able to enumerate Bob's shares, for reconciliation
  • next after that
    • more generalized delegation

Design Overview

As touched upon in source:docs/proposed/old-accounts-pubkey.txt (and source:docs/proposed/old-accounts-introducer.txt), each share on a storage server is kept alive (in a garbage-collection sense) by one or more "leases", and each lease is assigned to a given "account"/"user"/"owner". The server has an imaginary "lease table" (imaginary in the sense that it is not actually implemented as a giant table: instead the data is broken up into more efficient/robust pieces). This two-dimensional lease table has Storage Index along one axis, and Account on the other, and each cell of the table represents a potential lease.

Each account-owner gets control over their column of the table: they can add leases to existing shares, upload new shares (which immediately acquire new leases), cancel their lease on a share (possibly causing the share to be garbage-collected), or get a list of all of their leases (for reconciliation).
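
The lease table and the per-column operations above can be sketched as a sparse mapping. This is only an illustration of the model, not the on-disk representation: the class name, method names, and the string account labels are all hypothetical.

```python
from collections import defaultdict


class LeaseTable:
    """Sparse two-dimensional table: one cell per (account, storage index)."""

    def __init__(self):
        # account -> {storage_index -> share size in bytes}
        self._columns = defaultdict(dict)

    def add_lease(self, account, storage_index, size):
        self._columns[account][storage_index] = size

    def cancel_lease(self, account, storage_index):
        """Remove one lease; return True if the share has no leases left."""
        self._columns[account].pop(storage_index, None)
        return not any(storage_index in col for col in self._columns.values())

    def list_leases(self, account):
        """Enumerate one account's column (the reconciliation list)."""
        return sorted(self._columns[account])

    def usage(self, account):
        """Answer 'how much space is this account using?' on this server."""
        return sum(self._columns[account].values())


table = LeaseTable()
table.add_lease("bob", "si-1", 1000)
table.add_lease("bob", "si-2", 500)
table.add_lease("alice", "si-1", 1000)
print(table.usage("bob"))                 # 1500
print(table.cancel_lease("bob", "si-1"))  # False: alice still holds a lease
```

A share becomes eligible for garbage collection only when cancel_lease reports that no column holds a lease on it any more.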

Some clients may be "super-powered", meaning that they may have the authority to affect more than one column of the table. It may be necessary to give a Repairer this sort of authority to let it keep files alive when the original uploading client is not participating in the maintenance process. POLA dictates that we try to avoid needing this sort of authority inflation, so superpower delegation is just a fallback plan.

The admin of each storage server decides their own policy by configuring their server with various certificates and public keys: fundamentally, storage authority originates with the server, and is delegated outwards to the clients. Clients are configured with certificates and private keys that allow them to use some portion of the server's authority.

Each time a client uploads a file (or otherwise makes use of storage authority), they must demonstrate their authority to each server, through a negotiation protocol. The client.upload() API will be modified to accept a new argument, tentatively named "cred=", that represents this authority. The webapi will also acquire such an argument, allowing the HTTP client to pass its authority to the webapi server so the server can perform the upload.

Design Pieces

Rough design tasks that need to be done.

  • Add cred= to upload API
    • client.upload() needs a cred= argument
    • the webapi PUT/POST commands need a cred= argument
    • the javascript-based webfront program (used by allmydata.com) needs cred
    • the human-oriented "wui" needs a way (cookies? sessions?) to express storage authority
  • Define how to configure clients with their storage authority
  • define how to create these credentials
    • certificate-signing tools
    • "tahoe sign" subcommand
  • define how to configure servers with their certificates
  • changes to Introduction
    • advertise accepted pubkeys in the storage-v2 announcements?
  • changes to peer selection
  • furlification process, persistence/optimization
  • label format: how should leases be labeled
  • usage-table management: databases, size totals, what to store in each lease
  • Usage/Aggregator? service
    • web interface
    • petname database / display

Design

Things that we've come to an agreement about.

Terminology

  • pubkey: enough data to securely verify a signature
  • pubkey identifier: enough data to securely identify a pubkey
  • pubkey hint: when trying to find a pubkey that validates a signature, the pubkey hint provides enough data to reduce the search space to an acceptable level.

(Since we're planning to use ECDSA-192, public keys are short enough to use them directly as both pubkey identifiers and pubkey hints. But if we were using, say, RSA-2048, then we might instead want to use the SHA-256 hash of the pubkey as its identifier. If we are tight on space, we can use an arbitrarily short prefix of the ECDSA-192 public key as the pubkey hint.)
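
A minimal sketch of the three terms, using placeholder byte strings in place of real keys. The function names, the short_key flag, and the 4-byte default hint length are assumptions made for illustration.

```python
import hashlib


def pubkey_identifier(pubkey: bytes, short_key: bool) -> bytes:
    """Short keys identify themselves; long keys are identified by SHA-256."""
    return pubkey if short_key else hashlib.sha256(pubkey).digest()


def pubkey_hint(pubkey: bytes, hint_len: int = 4) -> bytes:
    """A prefix of the key, just long enough to narrow the search."""
    return pubkey[:hint_len]


def find_candidates(hint: bytes, known_pubkeys):
    """Return the known keys worth trying against a signature."""
    return [pk for pk in known_pubkeys if pk.startswith(hint)]


# Placeholder bytes standing in for real public keys.
known = [b"AAAA-key-one", b"AAAB-key-two", b"ZZZZ-key-three"]
hint = pubkey_hint(known[1])
print(find_candidates(hint, known))  # [b'AAAB-key-two']
```

A shorter hint saves space but leaves more candidate keys to try, each of which costs one signature verification.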

Lease Labels

Each lease will be labeled with a single public key. This identifies who is responsible for the lease: which account should "pay" for the storage required by this share. The actual definition of "pay" will depend upon the server's policy: in most systems, simply being able to produce a total of the sizes of all shares with leases held by a given user will be enough to make decisions about that user (restrict to limit, pay-per-byte, nag-above-limit, whatever).

Certificate Chain

The v1 cert chain format: each element in the chain has three parts: the encoded certificate, the signature, and the pubkey hint. The encoded certificate has a number of fields that describe what is being delegated, but the most important is a pubkey identifier that indicates to whom this authority is being delegated. The fields we'll define for v1 are:

  • delegate-pubkey: (string) a pubkey identifier. The holder of the corresponding private key is hereby authorized to use the authority of the signer, as attenuated by the remainder of the fields in this certificate.
  • signer-gets-lease: (bool) if True, the signer of this certificate will be given a lease on the resulting shares. A privkey authorized by this chain will have control over a single full column of the lease table (all leases labeled with the signer's pubkey). In a full request chain (which contains a signed operation as well as the certificate chain), there must be exactly one True signer-gets-lease field, to make sure that there is exactly one lease on the resulting share.
  • other attenuations: TBD (things like until=, SI=, UEBhash=, operation=, max-size=)
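
The chain element and the "exactly one lease holder" rule might be modeled as follows. This is a sketch only: the class names are hypothetical, signatures are placeholder bytes that are never verified, and treating the signed request as carrying its own signer-gets-lease flag is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Certificate:
    delegate_pubkey: str             # pubkey identifier of the delegate
    signer_gets_lease: bool = False
    attenuations: dict = field(default_factory=dict)  # TBD fields such as until=


@dataclass
class ChainElement:
    certificate: Certificate
    signature: bytes                 # placeholder; nothing is verified here
    pubkey_hint: bytes


def exactly_one_lease_holder(chain, request_signer_gets_lease):
    """Check the 'exactly one True signer-gets-lease' rule for a full request."""
    flags = [element.certificate.signer_gets_lease for element in chain]
    flags.append(request_signer_gets_lease)
    return sum(flags) == 1


# Hypothetical chain: an account manager delegates to Bob, and Bob signs
# the request so that he is the one who holds the lease.
chain = [ChainElement(Certificate("bob-pubkey"), b"sig-by-AM", b"am")]
print(exactly_one_lease_holder(chain, request_signer_gets_lease=True))   # True
print(exactly_one_lease_holder(chain, request_signer_gets_lease=False))  # False
```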

Lease Tables

The server will maintain a "lease table", to provide efficient lookup by account. This primarily supports the "how much is Bob using?" question, and will (in a future version) support the reconciliation operation.

To avoid revising the v1 share file format (which only offers a 4-byte "ownerid" field), the server maintains a second table that maps from these 4-byte "pubkeyid" numbers to the full pubkey, and puts an additional column in the lease table to map from pubkey to pubkeyid.

  • NODEDIR/accounting/pubkeyids: 32 bytes per pubkeyid, assigned sequentially. pubkey[4] is the 32 bytes at offset 4*32.
  • NODEDIR/accounting/usage/base32(PUBKEY): one file per pubkey. Contents are:
    • bytes 0-3: pubkeyid
    • bytes 4-11: total size
    • bytes 12-19: total number of files
    • (deferred until v1.3): reconciliation list (variable length list of SIs)
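
The fixed-size part of these two files could be read and written with the struct module, as sketched below. The byte order is not specified above, so big-endian is an assumption, and the helper names are hypothetical.

```python
import struct

# bytes 0-3: pubkeyid, bytes 4-11: total size, bytes 12-19: number of files.
# Big-endian is an assumption; the layout above does not state a byte order.
USAGE_HEADER = struct.Struct(">IQQ")
PUBKEY_RECORD_SIZE = 32


def pack_usage(pubkeyid, total_size, total_files):
    return USAGE_HEADER.pack(pubkeyid, total_size, total_files)


def unpack_usage(data):
    """Return (pubkeyid, total_size, total_files) from a usage file's header."""
    return USAGE_HEADER.unpack(data[:USAGE_HEADER.size])


def lookup_pubkey(pubkeyids_blob, pubkeyid):
    """pubkey[n] is the 32 bytes at offset n*32 of the pubkeyids file."""
    start = pubkeyid * PUBKEY_RECORD_SIZE
    record = pubkeyids_blob[start:start + PUBKEY_RECORD_SIZE]
    if len(record) != PUBKEY_RECORD_SIZE:
        raise KeyError(pubkeyid)
    return record


header = pack_usage(pubkeyid=4, total_size=73_000_000, total_files=12)
print(len(header))           # 20
print(unpack_usage(header))  # (4, 73000000, 12)
```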

The lease table may be switched to use an intermediate prefix directory later, to make lookup more efficient (some native filesystems get slow when you put thousands or tens-of-thousands of files in a single directory).

For ext3, a 300k-entry lease table is likely to use 1.2GB. For something like reiserfs3 that can pack small files, it might take just 18MB. A SQL representation would probably be fairly compact too.
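
The ext3 figure is consistent with each small usage file occupying one whole filesystem block. The 4 KiB block size below is an assumption about where the estimate comes from, and inode and directory overhead are ignored.

```python
ENTRIES = 300_000
ASSUMED_BLOCK_BYTES = 4096  # assumption: one 4 KiB block per small file

ext3_bytes = ENTRIES * ASSUMED_BLOCK_BYTES
packed_bytes_per_entry = 18_000_000 / ENTRIES

print(ext3_bytes)              # 1228800000, about 1.2 GB
print(packed_bytes_per_entry)  # 60.0 bytes per entry for the 18MB figure
```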

Pet Name Table

The "Usage/Aggregator?" component (described below) can display a "pet name" along with each key, to make the results more meaningful ("Bob is using 73MB" instead of "pubkey j7sta2uvcr345znqwfwlxitnii is using 73MB"). This is more likely to be used by the friendnet case than the commercial grid (in which this functionality will most likely exist in an external database). The "pet name table" may contain other information about the public keys to which storage has been delegated.

  • NODEDIR/?/petnames: one line per pubkey, each line is:
    • base32(pubkey): PETNAME\n
  • if multiple pubkeys map to the same pet name, their usage will be added together at display time.
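
A sketch of reading the petnames file and summing usage at display time. The function names and the sample keys are made up for illustration, and the fallback label for keys without a pet name is an assumption.

```python
from collections import Counter


def parse_petnames(text):
    """Parse 'base32(pubkey): PETNAME' lines into {pubkey_b32: petname}."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, name = line.partition(":")
        table[key.strip()] = name.strip()
    return table


def usage_by_petname(usage_by_pubkey, petnames):
    """Add together the usage of all pubkeys that share a pet name."""
    totals = Counter()
    for key, size in usage_by_pubkey.items():
        totals[petnames.get(key, "pubkey " + key)] += size
    return dict(totals)


petnames = parse_petnames("j7sta2uv: Bob\nq3xkm5ab: Bob\nzz9pl2cd: Carol\n")
usage = {"j7sta2uv": 50, "q3xkm5ab": 23, "zz9pl2cd": 10, "unknown1": 5}
print(usage_by_petname(usage, petnames))
# {'Bob': 73, 'Carol': 10, 'pubkey unknown1': 5}
```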

Furlification

An authorized client will have a private key and a certificate chain that authorizes that privkey to perform some operation. Since we use Foolscap to perform storage operations, we need a way to get from the cert-chain/pubkey world to the live-RemoteReference world. (An alternative would be to sign each storage operation, encrypting the result to some public key for confidentiality, but we would prefer an approach that does not require quite so many expensive public-key operations on the server.) This process is called "furlification", since it serves to convert the certchain+privkey into a FURL that references an object which has the delegated authority.

The process starts by creating an encoded message that looks very much like the certificate described above. This message can contain any of the limitations or attenuations that the cert-chain message holds. But instead of a field named delegate-pubkey, it will have one named beneficiary-FURL. This message is signed by the private key that sits at the end of the certificate chain. Then the cert-chain, the message, the signature, a nonce, and one other argument named "precache-ignored" are all sent to the storage server's "login" facet. The return value from this message is ignored.

After checking the signatures, the login facet is then required to create a new foolscap.Referenceable object with the given authority (called the "operational object", or "Personal Storage Server Facet"), and to send (nonce, object-FURL, object) to the client-side object designated by the beneficiary-FURL. Any successful return value from this message is ignored (although it may raise AuthorityExpiredError, see below).

The beneficiary-FURL is used for the return path (instead of the return value from the login message) because the server that receives the signed message could easily forward it on to server2 in an attempt to steal the corresponding server2 authority. Since server2 will only send the operational object to the beneficiary, server1 cannot benefit from this sort of violation. However, to avoid a Foolscap round-trip, the beneficiary object is sent as the "precache-ignored" argument: this allows Foolscap to pre-cache the beneficiary without harming any of the security properties.

The object-FURL is expected to be persistent: clients should be able to cache them for later use (to reduce the number of pubkey operations that servers are required to perform). The object itself is sent in the beneficiary message mainly to pre-fill the Foolscap table; the client is allowed to either use the object directly or to getReference the object-FURL.

Servers are expected to create a FURL that contains a MAC'ed description of the object's limitations (and use Foolscap's Tub.registerNameLookupHandler), rather than maintain a big table from random swissnum to object description. Servers can include expiration times in these swissnums. If the client experiences AuthorityExpiredError when using their object reference (or when using getReference on the object-FURL), they must attempt the login process again with a new signed request message. If they experience AuthorityExpiredError during the login process, then part of their certificate chain may have expired.
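
One way to build such a self-describing name is to append an HMAC tag to a serialized description, so the server can recompute the object's limitations from the name alone. The sketch below makes several assumptions: JSON as the encoding, HMAC-SHA256 as the MAC, and hypothetical function names. It does not call Foolscap; check_name only shows the kind of logic a name lookup handler would run.

```python
import base64
import hashlib
import hmac
import json
import time


class AuthorityExpiredError(Exception):
    pass


def make_name(secret, limitations, expires):
    """Encode the limitations and expiry time, prefixed by a MAC tag."""
    body = json.dumps({"lim": limitations, "exp": expires}, sort_keys=True).encode()
    tag = hmac.new(secret, body, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + body).decode()


def check_name(secret, name, now=None):
    """Return the limitations, None if the MAC is wrong, or raise if expired."""
    raw = base64.urlsafe_b64decode(name.encode())
    tag, body = raw[:32], raw[32:]
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # not a name this server issued
    description = json.loads(body)
    if (time.time() if now is None else now) >= description["exp"]:
        raise AuthorityExpiredError(name)
    return description["lim"]


secret = b"server-side secret, never sent to clients"
name = make_name(secret, {"account": "bob", "max-size": 1000}, expires=2000)
print(check_name(secret, name, now=1000))  # {'account': 'bob', 'max-size': 1000}
```

In this sketch the description is authenticated but not encrypted, so a client can read its own limitations out of the name. An expired name raises AuthorityExpiredError, which is the client's cue to repeat the login process.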

Typical Delegation Patterns

We've identified 5 kinds of point-to-point delegation operations, described here. Types 1, 2, and 3 involve creating a new certificate (authorizing some private key to perform some set of operations). Type 4 involves giving a public key to someone else and having them start accepting requests that are signed by the corresponding private key.

In the friend-net use case, we expect that type-4 delegation will be used everywhere. When Alice sets up a storage server and wants to let Bob start using it, Bob will give her a public key, and Alice will drop that public key into her server's config with a note that says "accept requests that are signed by the corresponding private key". If Bob also wants Alice to be able to use his storage space ("reciprocity"), then Alice will give a public key to Bob.

In the commercial grid case, we want to achieve some isolation between clients and servers: we will be adding new clients and servers constantly, and it is important that configuring both is cheap. Requiring a client update for every new server (or a server update for every new client) is unacceptable. So in this case, we define an "Account Manager", with its own private key. We use type-4 delegation to have all storage servers delegate authority to the account manager (by dropping the AM's pubkey in the new storage server's cert-roots directory). We then use type 1 (or 2 or 3) delegation to grant authority to the clients: the AM signs a certificate that names each client's public key, and gives that certificate to the client: this certificate is called their "membership card". Each client, when it needs to upload a file, will then send their membership card and a signed request to the storage servers.

User Experience

Since Alice (who is running the storage server in our example) is the one who starts with full authority over that storage server, we would like the act of delegating authority to Bob to be expressed with a message from Alice to Bob (and not the other way around). Since the most likely form of delegation (type 1) involves a public key being sent in the opposite direction, we need to define a user-friendly command that can establish a communication channel for the two nodes to achieve the desired delegation.

So Alice will type tahoe invite Bob, and her node will create an Invitation object for Bob. This object is single-use, persistent (until claimed), and remembers the pet name "Bob". It gets an unguessable swissnum, and the command then emits a FURL that references this object. Alice sends the invitation FURL to Bob via any convenient secure channel.

Once Bob has received the invitation, he pastes it into an invocation of tahoe accept-invitation Alice (where "Alice" is his pet name for the person who sent him this invitation). Bob's node will then contact the Invitation object and claim it by sending Bob's public key to Alice. Alice's node will add Bob's public key to her root-certs directory, allowing Bob to use storage space on Alice's server. Her node will also record "Bob" as the pet name associated with this pubkey, so that her Usage display will report the usage of Bob instead of the less-useful "pubkey j7sta2uvcr345znqwfwlxitnii".

The default friend-net behavior is reciprocity, so unless this is disabled in the command arguments, the Invitation process will also deliver Alice's pubkey into Bob's root-certs list, and add "Alice" and her key to Bob's pet-name table.
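
The invite/accept exchange can be sketched in memory as below. The class and method names are hypothetical, the swissnum stands in for the full invitation FURL, and persistence, the network transport, and the reciprocal key exchange are all left out.

```python
import secrets


class Invitation:
    """Single-use object that remembers the pet name chosen by the inviter."""

    def __init__(self, petname):
        self.petname = petname
        self.swissnum = secrets.token_urlsafe(16)  # unguessable


class InvitingNode:
    def __init__(self):
        self.invitations = {}    # swissnum -> Invitation, until claimed
        self.root_certs = set()  # pubkeys allowed to use this server
        self.petnames = {}       # pubkey -> pet name

    def invite(self, petname):
        invitation = Invitation(petname)
        self.invitations[invitation.swissnum] = invitation
        return invitation.swissnum

    def claim(self, swissnum, pubkey):
        """Accept the claimant's pubkey; each invitation works only once."""
        invitation = self.invitations.pop(swissnum, None)
        if invitation is None:
            raise KeyError("unknown or already-claimed invitation")
        self.root_certs.add(pubkey)
        self.petnames[pubkey] = invitation.petname


alice = InvitingNode()
swissnum = alice.invite("Bob")          # roughly: tahoe invite Bob
alice.claim(swissnum, "bob-pubkey")     # roughly: tahoe accept-invitation Alice
print(alice.petnames["bob-pubkey"])     # Bob
```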

Other material may be exchanged during the invitation process. A pair of new Tahoe directories can be established as an "inbox/outbox" pair, to let Alice and Bob exchange files securely (associated with the pre-established petnames). A FURL referencing an arbitrary object can be established for later expansion use (perhaps to enable real-time communication): while not useful in the current release, by establishing one during the initial invite/accept process, later releases will be able to build upon this secure reference.

All CLI tools should have web-form equivalents, which will probably require that a set of authority-bearing non-file-related web pages must be defined. The easiest way to accomplish this is to have an unguessable "control URL", stored in NODEDIR/private/control.url, and put all these authority-bearing pages (with "Invite" buttons, etc) as children of the control URL. The user would run "tahoe show-control-url" (or "tahoe webopen control-url" or something) to access it.
