Changes between Version 15 and Version 16 of Summit2016

Timestamp: 2016-11-10T19:10:52Z
Author: warner
Comment: add etherpad notes from summit

http://freehaven.net/anonbib/cache/trickle02.pdf

== Raw Notes ==

* https://pad.lqdn.fr/p/tahoe-lafs-summit-2016
* Attendees: daira, dawuud, meejah, liz, warner, zooko, secorp. remote: exarkun

=== Tuesday AM: applications, use-cases, productization, integration with other apps ===

* use cases
 * travelling through dangerous places: erase the laptop first, travel, then restore your home directory
  * lawyers, business travel rules: to guarantee client confidentiality, they forbid proprietary data from being accessible on stealable devices
  * journalists: even the suggestion of encrypted data on a laptop could be dangerous in some regimes
  * medical information
  * whistleblowing
 * digital will: longer-term preservation of data
  * erasure-coding slightly more relevant, for long-term reliability
  * repair service is more relevant
 * backup
 * sysadmin/devops secret/credential management
  * password manager among an ops team, also ssh keys, AWS creds
  * who gets to see what: admin control
  * could include revocation management, integration with AWS/etc (automatically roll creds when a user is revoked)
 * business information: sensitive client data protection
  * lawyers, organizations, medical records, activists, journalists
  * technical secrets/proprietary information
  * sometimes run the server yourself, sometimes pay a commodity/cloud provider, or friendnet
 * generalized communication tool: Slack-like UI, chat, file-sharing, directory-syncing
 * enterprise document sharing
  * some folks use SVN for this
 * git-over-tahoe
  * back up git repos
  * https://git-annex.branchable.com/special_remotes/tahoe/
  * use git to share, but backed by tahoe
   * on a personal VPS
  * can do a Dropbox-like thing
   * start with at least 3 participants
 * slack-like in-band file-sharing
 * new chat app which includes file-sharing UI: maybe base it on RocketChat
 * plugins for existing apps to share large files via tahoe
  * Thunderbird large-file-attachment upload
  * gmail suggesting attachments go into Google Docs instead of embedding in email
 * basic key-value database
  * would probably need to emulate an existing API (etcd? gconf?)
  * backup of ~/.config/* (LSB?)

* features that (some of) these use cases need
 * better multi-writer support

* existing tools
 * subversion-based enterprise document-sharing
  * separate fs-explorer app; a "check-out" button copies the document to a tempdir and launches the application, "check-in" copies it back into SVN
  * TortoiseSVN
 * RocketChat: open-source Slack-alike
* sketching out an enterprise document-sharing tool
 * provisioning:
  * admin gives the app, and maybe a provisioning string, to each client
  * client installs the app
   * on Windows, maybe the app reads from the Windows registry, maybe the user enters name+password, to get the provisioning string installed
    * daira doesn't like this
    * Alan Fairless from SpiderOak suggested this
    * let's not do this *until* somebody asks for it. and maybe after they pay for it.
   * maybe the admin mints a new copy of the app for each user with the provisioning data baked in
    * probably confusing if user A shares their app with user B
  * type in a provisioning string (provided by the admin), maybe in argv
   * provisioning string: maybe a full JSON file, maybe a (meta)filecap, maybe a magic-wormhole invitation code (see the sketch after this list)
   * provides:
    * grid information: which servers to contact
    * accounting: authority to write (maybe also read) to storage servers
    * initial shared directory cap
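
A minimal sketch of what such a provisioning string could carry, if it were a JSON blob handed out by the admin and delivered (for example) through a magic-wormhole code. The field names are invented for illustration and are not an existing Tahoe-LAFS format:

{{{#!python
import json

# Hypothetical provisioning payload; the field names are invented and just
# mirror the three items listed above.
provisioning = {
    "grid": {
        # grid information: which servers to contact (could instead be an
        # introducer fURL)
        "servers": [
            {"nickname": "server-1", "anonymous-storage-FURL": "pb://..."},
            {"nickname": "server-2", "anonymous-storage-FURL": "pb://..."},
        ],
    },
    # accounting: authority to write (maybe also read) to the storage servers
    "storage-authority": "signed-token-or-pubkey...",
    # initial shared directory cap
    "rootcap": "URI:DIR2:...",
}

# Serialized, this becomes the single string the admin hands to each client
# (e.g. typed in argv or sent as a magic-wormhole invitation).
provisioning_string = json.dumps(provisioning)
}}}
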
* one-shot (single-file, single-directory) sharing case - "all in one configuration"
 * need a string that includes: grid info, read (maybe write) authority, readcap
 * in IPFS and other "one true grid" architectures, this is a single hash
 * how to deliver the access authority?
  * friendnet/accounts vs agoric/payments
  * long digression about pay-per-read as DDoS mitigation ("put a nickel in your router each month")
* grid info / gridcaps
 * also see https://tahoe-lafs.org/trac/tahoe-lafs/ticket/403
 * "grid id": signing keypair, metagrid/DHT holding signed grid-membership rosters (see the sketch after this list)
 * maybe tahoe-lafs.org runs the durable seeds for the DHT
 * contents could be equal to servers.yaml, maybe include introducers
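
One way to read the '"grid id": signing keypair' line: the grid id is the public half of a signing keypair, and the roster stored in the metagrid/DHT is just a signed servers.yaml-style document. A rough sketch using PyNaCl, with the roster layout invented for illustration:

{{{#!python
from nacl.signing import SigningKey, VerifyKey
import json

# The grid operator holds the signing key; its verify key *is* the grid id.
grid_signing_key = SigningKey.generate()
grid_id = grid_signing_key.verify_key.encode().hex()

# Roster contents: roughly what servers.yaml holds today (layout invented).
roster = {
    "version": 1,
    "storage": {
        "server-1": {"anonymous-storage-FURL": "pb://..."},
        "server-2": {"anonymous-storage-FURL": "pb://..."},
    },
}
signed_roster = grid_signing_key.sign(json.dumps(roster).encode())

# A client that knows only the grid id (e.g. from a gridcap) fetches the
# signed roster from the metagrid/DHT and verifies it before using it:
verified = VerifyKey(bytes.fromhex(grid_id)).verify(signed_roster)
servers = json.loads(verified)["storage"]
}}}
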
* server operators: tend to allow up to N bytes for free, and only then need to charge (or even pay attention)
* integration with existing apps
 * their main focus is not document storage/sharing, but they could use a plugin to help with it
  * push the data to somewhere more convenient
  * add nice crypto feels
  * tahoe is generally not visible to those users
 * Thunderbird: file attachments
 * Slack: drag a file into the chat window
* accounting priorities:
 * first: permissions: should a given client send shares to a given server, should a server accept shares from a given client
 * second: measuring usage
 * third: limiting: cut someone off when they're using too much (mark read-only, or delete all data)
 * fourth: in-band payment

=== Tuesday PM: accounting, provisioning (including magic-wormhole, allow/deny storage servers), new GUI/WUI/CLI/API ===

* zooko's proposed sequence: what's the simplest thing that would work, then identify the likely attacks, then figure out the next step
 * 1: everything is free
  * attack: spam, freeloaders, tragedy-of-commons
 * 2: servers charge for upload and download. storage (once uploaded) is free. Assume payment efficiency is good enough to allow one-payment-per-operation. No global reputation system, but individual clients remember servers
  * global pool of servers, any server can add themselves to this advertisement list
  * clients (for each upload) use 10 known-good old servers and 10 new-unknown servers from the list (see the sketch after this list)
  * servers have an advantage: evil-server behavior is to accept the upload fee and then run away
  * a server can charge enough for the upload to pay the data-retention costs for some amount of time, then if nobody has downloaded/paid for it, delete the data
  * the server is never at a disadvantage
  * client disadvantage is: if they have known-good servers, then up to half (10/(10+10)) of their money goes to evil servers
  * the more new servers they use, the faster they can find good ones
  * if the client pays for a download at the end, the client has the advantage (they can download and then not pay)
   * if the client pays at the beginning of the download, the server has the advantage (they can accept payment and then send random data)
   * maybe do incremental payment: XYZ btc per chunk of data
   * or some kind of partially-refundable deposit
 * possible next step: payment amortization
  * every time you send a coin, include a pubkey, establish a deposit. later if you need to ask the same server to do something, reference the deposit
 * another possible next step: ask one of the servers that you've already paid to find the share and download it (and pay for it) for you
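
A sketch of the step-2 client-side selection rule (data structures invented; server ids are assumed to be plain strings):

{{{#!python
import random

def pick_servers(known_good, advertised, n_good=10, n_new=10):
    """Step-2 client strategy: for each upload, use up to 10 servers that
    behaved well before plus up to 10 unknown servers from the public
    advertisement list, so good new servers get discovered over time."""
    good = random.sample(sorted(known_good), min(n_good, len(known_good)))
    unknown = [s for s in advertised if s not in known_good]
    new = random.sample(unknown, min(n_new, len(unknown)))
    return good + new

def record_outcome(known_good, server, data_survived):
    """Later (e.g. after a successful download or verify), promote servers
    that kept the data and forget ones that took the fee and ran away."""
    if data_survived:
        known_good.add(server)
    else:
        known_good.discard(server)
}}}
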
* why do agoric/pay-for-service over choose-a-grid/account/relationship-based storage?
 * sharing is easier when there's less context/hierarchy ("one true grid" is the best for sharing)
 * one-true-grid is easier for clients to connect to (fewer things to provision), easier for clients to understand (one fewer concept to learn)
* OH (overheard): "how spoffy do you want it to be?" "that's spiffy!"
 * define "spiffy" (resiliency/redundancy) as the opposite of "spoffy"
* "One True Grid" OTGv1
 * 5 introducers run by 5 different orgs, introducers.yaml points to all of them
  * anybody can run a server (which charges for uploads/downloads as above)
  * clients learn about all servers
 * one predictable problem: once too many servers appear, clients are talking to too many of them
  * once too many clients appear, servers are talking to too many, cannot accept new clients
  * idea (warner): introducer charges both clients and servers, charges more when more of them connect (client_price = N * len(clients)) (see the sketch after this list)
  * idea (zooko): reject clients after some fixed limit
 * the introducer could be moved to HTTP, probably scale just fine. client->server foolscap connections are the problem
 * how do we tell that we're overloaded?
  * servers running out of memory
  * requests taking too long to complete
  * clients unable to reach servers
  * clients running out of memory
 * how to limit growth?
  * closed beta? issue tickets, one batch at a time, tokens that clients/servers must deliver to the introducer
   * or introducers are all under our control, they reject requests after some number
   * or give tokens to people who pay a nominal BTC/ZEC fee, and the fee grows when we near the scaling limit
  * again, how to tell that we're overloaded
  * have servers report metrics to introducers: current client connections, response times, rate of requests
  * have clients report request rates/success rates
  * servers pay (introducers) to get published
   * if there are too many servers, clients are overloaded: this throttles it
   * money goes to the tahoe project, to pay programmers to write code to fix the congestion problem
   * since this particular problem needs to be fixed in code/architecture
  * clients pay servers per connection?
   * servers advertise price (via introducer)
 * v1: servers pay tahoe to be advertised, clients get tokens (first N are free, then a nominal charge) to use the introducer
  * servers accept anybody who learns about them, clients connect to anyone they learn about
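
The two admission-control ideas above, written out as toy functions (all constants are placeholders):

{{{#!python
def client_price(n_connected_clients, unit_price=1):
    """warner's idea: the introducer charges more as more clients connect
    (client_price = N * len(clients))."""
    return unit_price * n_connected_clients

def admit_client(n_connected_clients, limit=10000):
    """zooko's idea: simply reject new clients past some fixed limit."""
    return n_connected_clients < limit
}}}
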
* raspberry pi with a barcode printer running a storage server that dumps a pile of ZEC private keys on your living room floor
* server price curve, client price curve: how to achieve stable/convergent share placement?
 * (zooko): if a server hasn't received requests in a while, lower the price. if it receives lots of requests, raise the price. (see the sketch after this list)
  * maybe track uploads and downloads separately
  * if the link is saturated and requests can't get through, it will look like no requests -> lower price -> more traffic -> oops
  * principle: if you're failing, you probably can't tell. only other people can tell, and they might not be incentivized to tell you
  * if you're getting paid, you should raise your prices
  * admin configures a lower bound on price (based on e.g. their S3 costs)
   * (zooko): don't even do that, let server admins decide at the end of the month whether they made money or not, whether to continue or not
   * (warner): eek, unbounded S3 costs, server admins need to be responsible (write extra limiting code, don't use S3, find a pre-paid cloud provider)
  * server starts with a completely random price. whee!
 * clients: ignore the top 10% or 50% of server prices
  * is mostly convergent, only increases search cost by 10/50%
  * deposit a nickel (in BTC), pay whatever the servers ask
  * (warner): put half the shares on the cheapest servers, half on the "right" (convergent-placement) servers
 * pay half up front, half when the upload/download is complete
 * (meejah): start with an arbitrary (org-selected) price (maybe 2x S3) (maybe absolute minimum: 1 satoshi per something)
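
A rough sketch of the price-adjustment rule and the client-side "ignore the most expensive servers" filter described above; the thresholds, step size, and data shapes are made up:

{{{#!python
def adjust_price(price, recent_requests, floor, busy=100, idle=5, step=0.1):
    """zooko's rule of thumb: raise the price when busy, lower it when idle,
    but never drop below the admin-configured floor (e.g. their S3 cost)."""
    if recent_requests >= busy:
        return price * (1 + step)
    if recent_requests <= idle:
        return max(floor, price * (1 - step))
    return price

def affordable_servers(server_prices, ignore_fraction=0.10):
    """Client side: ignore the top 10% (or 50%) most expensive servers;
    mostly convergent, and only increases the search cost by that fraction."""
    ranked = sorted(server_prices.items(), key=lambda kv: kv[1])
    keep = max(1, int(len(ranked) * (1 - ignore_fraction)))
    return [server for server, _ in ranked[:keep]]
}}}
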
* run an experiment, figure out the rough max concurrent connections, call that M
 * for server tokens, pay 1 zatoshi for the first 0.5*M tokens, then start paying more to limit congestion (see the sketch below)
 * because we believe max-concurrent-connections will be the first bottleneck, also it's probably a crashy/non-scaling limit (accepted load drops drastically once capacity is hit)
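
The token schedule above, written out; the growth curve past 0.5*M is arbitrary since the notes only say "start paying more":

{{{#!python
def server_token_price(tokens_already_issued, M):
    """1 zatoshi per token for the first 0.5*M tokens, then a rising price
    to keep concurrent connections below the measured limit M."""
    if tokens_already_issued < 0.5 * M:
        return 1
    # arbitrary growth: double the price for each further 0.5*M tokens issued
    steps = int((tokens_already_issued - 0.5 * M) // (0.5 * M)) + 1
    return 2 ** steps
}}}
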
* v1: server requires 1 zatoshi up front, 1 at the end, for both uploads and downloads
 * if the txn fee is 0.1 cents, you get about 5000 operations for a $5 investment
 * v2: amortize by establishing a deposit, send pubkey+minimal money (10x txn fee). spend complexity on protocol to stop spending money on miners
* over beers later:
 * "deposit" / short-term not-necessarily-named "accounts": useful for amortizing payment fees and settlement time, and for friendnet (preauthorized pubkeys)
 * would there still be leases?
 * what "serious" use case would tolerate the uncertainty of storage without some kind of SLA or expected lease period?
* over breakfast later:
 * OneTrueGrid is one product, other things (with explicit provisioning) "powered by Tahoe" for more durable/"professional" applications
 * "ThePublicGrid" "OnePublicGrid"

=== Wednesday AM: magic-folder-ish protocols, refresh our brains on #1382 (peer-selection / servers-of-happiness) ===

* #1382 "servers of happiness"
 * current error message is... bad
 * (exarkun) the audience is someone who has just tried an upload, which failed. are they in a position to understand and act upon it?
 * can we make the error message more actionable?
 * "i only see N servers" or "only N servers were willing to accept shares", and "but you asked me to require H"
 * maybe use a Foolscap "incident" to report this (in a managed environment) to an admin
  * especially if the admin is the only one who can fix it
 * "I was unable to place shares with enough redundancy (N=x/k=x/H=x/etc)"
  * searching for that phrase should get people to the tahoe docs that explain the issue and expand N/k/H/etc
 * rewriting the algorithm spec (warner's paraphrase; see the sketch after this list)
  * find all the pre-existing shares on readonly servers. choose one of the best mappings, call it M1
  * find all remaining pre-existing shares on readwrite servers (ignore shares that are in M1, since placing those shares in additional places doesn't help). choose one of the best mappings of this, call it M2
  * find all potential placements of the remaining shares (to readwrite servers that aren't used in M2). choose one of the best mappings of this, call it M3. Prefer earlier servers.
  * renew M1+M2, upload M3
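
A compressed sketch of that paraphrase; the data shapes are invented and `best_mapping` stands in for the real maximum-matching step in #1382, which is not shown:

{{{#!python
def plan_placement(existing_ro, existing_rw, writable, all_shares, best_mapping):
    """Sketch of warner's paraphrase above.  existing_ro / existing_rw map
    server -> set of share numbers already found there; writable is the list
    of read-write servers in permuted order; best_mapping() stands in for
    the real maximum-matching step and returns {server: sharenum}."""
    # M1: best use of shares already sitting on read-only servers
    m1 = best_mapping(existing_ro, set(all_shares))
    # M2: best use of the remaining pre-existing shares on read-write servers
    # (shares already covered by M1 gain nothing from extra copies here)
    remaining = set(all_shares) - set(m1.values())
    m2 = best_mapping(existing_rw, remaining)
    # M3: place whatever is still missing on read-write servers not already
    # used by M2, preferring earlier servers in the permuted order
    still_missing = remaining - set(m2.values())
    candidates = {s: still_missing for s in writable if s not in m2}
    m3 = best_mapping(candidates, still_missing)
    # renew leases for M1+M2, upload the shares named in M3
    return m1, m2, m3
}}}
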

* options for PR140 (#573): https://github.com/tahoe-lafs/tahoe-lafs/pull/140
 * the __import__ weirds us out
 * twisted plugins: the function declares ISomething, the config file stores a qualified name
 * maybe hard-code a list of algorithms, tahoe.cfg specifies a name, big switch statement (see the sketch after this list)
 * the goal of #573 is to enable more kinds of placement, e.g. "3 shares per colo, no more than 1 share per rack"
 * probably needs to merge with #1382: one plugin that does both
  * needs to make network calls, so it can't be synchronous
  * current PR140 is sync
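
A sketch of the "hard-coded list plus a name in tahoe.cfg" option. The config key and function bodies are invented, the config is read with the stdlib configparser rather than Tahoe's own config object, and per the note above the real hook would have to be asynchronous:

{{{#!python
# Invented config key and placeholder algorithms, just to show the shape of
# the "name in tahoe.cfg -> big switch statement" idea.

def default_placement(peers, shares):
    raise NotImplementedError  # today's behaviour would live here

def per_colo_placement(peers, shares):
    raise NotImplementedError  # e.g. "3 shares per colo, <=1 share per rack"

PLACEMENT_ALGORITHMS = {
    "default": default_placement,
    "per-colo": per_colo_placement,
}

def choose_placement(config):
    # e.g. in tahoe.cfg:  [client] placement.algorithm = per-colo   (invented)
    name = config.get("client", "placement.algorithm", fallback="default")
    return PLACEMENT_ALGORITHMS[name]
}}}
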

=== Wednesday PM: new caps / encoding formats (chacha20, rainhill/elk-point, etc), mutable 2-phase commit, storage protocols, deletion/revocation ===

* looking at Rainhill: https://tahoe-lafs.org/trac/tahoe-lafs/wiki/NewCaps/Rainhill
 * needs 2+1 passes: one to compute keys, a second to encrypt+encode and produce the SI, a third to push the actual shares (see the sketch after this list)
 * we're (but not zooko) probably ok with non-streaming / save-intermediates-to-disk these days, because of SSDs
 * diagram/protocol needs updating to:
  * omit the plaintext hash tree (assume the decryption function works correctly)
  * include/explain the ciphertext hash tree, share hash tree
  * show information/encoding/decoding flow (swirly arrows)
  * maybe we can throw out P (needed for diversity/multicollision defense?)
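
This is not the actual Rainhill construction, just the "2+1 pass" shape from the first bullet, with plain SHA-256 standing in for the real key and SI derivations and with `encrypt`/`encode` left as stand-ins:

{{{#!python
import hashlib

def pass1_compute_key(plaintext_path):
    """Pass 1: read the file once to derive the (convergent) encryption key."""
    h = hashlib.sha256()
    with open(plaintext_path, "rb") as f:
        for chunk in iter(lambda: f.read(2 ** 16), b""):
            h.update(chunk)
    return h.digest()

def pass2_encrypt_encode(plaintext_path, key, encrypt, encode):
    """Pass 2: encrypt + encode (spooling intermediates to disk is mostly OK
    now that SSDs are common) and derive the storage index."""
    shares, ciphertext_hash = encode(encrypt(plaintext_path, key))
    storage_index = hashlib.sha256(b"SI:" + ciphertext_hash).digest()[:16]
    return storage_index, shares

def pass3_push(storage_index, shares, servers):
    """Pass 3: with the SI known, push the actual shares."""
    for server, share in zip(servers, shares):
        server.upload(storage_index, share)
}}}
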
* deletion:
 * a long time ago, we discussed "deletecap -> readcap -> verifycap"
 * or for mutables: petrifycap -> writecap -> readcap -> verifycap
  * zooko preferred petrifycap==writecap
 * use cases:
  * I screwed up: upload of sensitive data, omg delete now
  * short-term sharing, which then expires
  * digital will, revoke with confirmation of was-read or never-read
  * share with group
 * what should the server do if one person wants to delete it and another wants to keep it?
  * either the preservationist wins or the deletionist wins
  * uncontested deletion should just work
 * zooko's old mark-and-sweep explicit-deletion (as opposed to timed GC) idea #1832
  * for each rootcap, build a manifest of childcaps. give the whole set to the storage server, then the server immediately deletes anything removed from that manifest (zooko is uncertain this is accurate)
  * actually: the client fetches a "garbage collection marker" from the server. then the client adds storage-index values to that marker. after adding everything they like, they say "flush everything not included in this marker", and the server deletes them (see the sketch after this list)
  * markers are scoped to some sort of (accounting identifier, rootcap/machine identifier) pair
  * still some race conditions, but probably fail-safe (fail-preservationist)
  * this approach has another use: if I could push a list of verifycaps to my local server, it could download any that aren't present, which would allow a download-then-open workflow that works much better on low-bandwidth or flaky internet connections. I think this is a very important use case that's currently missed.
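
A sketch of the marker flow from the server's point of view (class and method names invented); it is fail-safe in the preservationist direction because nothing is deleted until the explicit flush:

{{{#!python
import uuid

class GCMarkerServer:
    """Sketch of the mark-and-sweep idea in #1832 (names invented).
    Markers are scoped to an (accounting id, rootcap/machine id) pair."""

    def __init__(self, stored_shares):
        # stored_shares: {scope: set of storage-index values held for it}
        self.stored_shares = stored_shares
        self.markers = {}          # marker_id -> (scope, set of SIs to keep)

    def open_marker(self, scope):
        marker_id = uuid.uuid4().hex
        self.markers[marker_id] = (scope, set())
        return marker_id

    def add_to_marker(self, marker_id, storage_indexes):
        self.markers[marker_id][1].update(storage_indexes)

    def flush(self, marker_id):
        """Delete everything in this scope that the client did not mark."""
        scope, keep = self.markers.pop(marker_id)
        garbage = self.stored_shares[scope] - keep
        self.stored_shares[scope] = keep & self.stored_shares[scope]
        return garbage   # the shares the server may now delete
}}}
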

=== concurrent writes: discussion ideas ===

 - 2PC might help in some cases:
   - the servers written to by various clients overlap
     - either because servers-of-happiness is more than half the size of the grid
     - or because of some algorithm making the clients choose the same servers for writing
   - servers do compare-and-swap, so if there was a write by another node during the read/modify/write cycle it's not easily overwritten (see the sketch after this list)
   - a partition of servers-of-happiness size will cause split-brain behaviour (bad for large grids!)
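
The compare-and-swap step in miniature (a generic sketch, not the existing test-and-set vectors in the storage protocol):

{{{#!python
class MutableSlot:
    """Server-side compare-and-swap: a write only lands if the writer saw
    the current version, so a concurrent write inside someone else's
    read/modify/write cycle is not silently overwritten."""

    def __init__(self):
        self.version = 0
        self.data = b""

    def read(self):
        return self.version, self.data

    def compare_and_swap(self, expected_version, new_data):
        if expected_version != self.version:
            return False      # lost the race: caller must re-read and retry
        self.version += 1
        self.data = new_data
        return True
}}}
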
 - locking (either for writes, or for certain operations of more concurrency-aware caps)
   - we can have a list of nodes that act as lock arbitrators; locking has to succeed on more than half of them (see the sketch after this list)
   - if the quorum goes down then the files become read-only unless an unsafe write is manually forced
   - we could maintain this list per-grid or store it in the cap itself
   - locking could use the write keypair as the identifier to lock on
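
A sketch of the majority-lock rule; the arbitrator API (`try_lock`/`unlock`) is invented, and the lock holder is identified by the write keypair as suggested above:

{{{#!python
def acquire_write_lock(arbitrators, write_pubkey, storage_index):
    """Try to take the lock on every arbitrator; the write only proceeds if
    strictly more than half of them granted it.  If the quorum is
    unreachable the file is effectively read-only (unless an unsafe write
    is forced manually)."""
    granted = []
    for node in arbitrators:
        try:
            if node.try_lock(storage_index, holder=write_pubkey):  # invented API
                granted.append(node)
        except ConnectionError:
            pass
    if len(granted) * 2 > len(arbitrators):
        return granted            # caller must release these afterwards
    for node in granted:          # failed to reach a majority: back out
        node.unlock(storage_index, holder=write_pubkey)
    return None
}}}
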
 - concurrency-aware capabilities
   - message queue (insertcap / retrievecap) could be realized by a keypair
     - storage servers store all writes (unordered), each having a UUID
     - the reader removes processed messages using this UUID (optionally locking if there's more than one reader) (see the sketch after this list)
     - use-case: a single message recipient that treats messages as pull requests and manages mutable data by itself as the sole writer
     - use-case: email-like inbox with encryption
   - append-only sets/dirs
     - use-case: backup storage (until you run out of cake^Wspace and need to erase old ones)
       - rotation can be done by creating a new append-cap
   - CRDT storage
     - always needs to present all the updates to the client; storage can't merge encrypted data
     - the client could potentially merge the updates
       - merge them all using locking
       - merge particular ones using UUIDs
     - use-case: directories, various application data (e.g. caldav/carddav, possibly imap-like storage with flags)
     - we can look at how e.g. http://www.coda.cs.cmu.edu/ deals with it (it supports merging back offline-modified cached directories and files)
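
A sketch of the insertcap/retrievecap queue idea; the server interface is invented, with the insertcap authorizing appends and the retrievecap authorizing reads and removal:

{{{#!python
import uuid

class MessageQueueSlot:
    """Sketch of a queue-style cap: the server keeps an unordered bag of
    writes keyed by UUID; writers only append, and the reader deletes
    entries once it has processed them."""

    def __init__(self):
        self.messages = {}                    # uuid -> encrypted blob

    def insert(self, encrypted_blob):         # authorized by the insertcap
        message_id = uuid.uuid4().hex
        self.messages[message_id] = encrypted_blob
        return message_id

    def retrieve_all(self):                   # authorized by the retrievecap
        return dict(self.messages)

    def remove(self, message_ids):            # reader acks processed messages
        for message_id in message_ids:
            self.messages.pop(message_id, None)
}}}
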