See also OneHundredYearCryptography.

Here are some papers that are potentially of interest.

== Tahoe-LAFS ==

 * [//~zooko/VailComputingElementsWorkshop2013/slides.html slides] presented at “Vail Computing Elements Workshop” in Vail 2013
 * [//~zooko/Ecrypt2020-Tenerife-InternetCryptoWorkshop/slides.html slides] presented at “ECRYPT II Crypto For 2020” in Tenerife 2013
 * [//trac/tahoe-lafs/attachment/wiki/News/tahoe-RSA-slides.pdf slides] presented at RSA 2010
 * [http://eprint.iacr.org/2012/524 Tahoe – The Least-Authority Filesystem] presented at [http://storagess.org/2008 Storage Security and Survivability '08] ([//~trac/lafs.pdf local mirror of PDF])
 * [//~warner/pycon-tahoe.html Tahoe: A Secure Distributed Filesystem] presented at [http://us.pycon.org/2008/about/ PyCon2008], providing an overview of the Tahoe-LAFS design, and the [//~warner/pycon-tahoe-slides.zip slides] (.zip) that were used for the presentation
 * [http://www.cs.utk.edu/~plank/plank/papers/FAST-2009.html A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage] presented at [http://www.usenix.org/events/fast09 FAST-2009: 7th USENIX Conference on File and Storage Technologies]
 * [https://zooko.com/uri/URI%3ADIR2-CHK%3Aooyppj6eshxwmweeelqm3x54nq%3Au5pauln65blikfn5peq7e4s7x5fwdvvhvsklmfmwbjxlvlosldcq%3A1%3A1%3A105588/Carstensen-2011-Robust_Resource_Allocation_In_Distributed_Filesystem.pdf Robust Resource Allocation in Distributed Filesystems], Kevan Carstensen's master's thesis

== Crypto ==

=== Symmetric Primitives ===

==== Ciphers ====

 * [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.59.9522 Salsa20 Design] a fast and secure cipher
 * [http://cr.yp.to/snuffle.html#security Salsa20 Security Arguments] why Salsa20 is probably safe against this and that threat
 * [http://www.ecrypt.eu.org/stream The European Stream Cipher project] which evaluated many stream ciphers including Salsa20

==== Hash Functions ====

 * [https://zooko.com/file/URI:CHK:tldsmrqpapayqby3cjwzg4ocqe:6pyawohhusgt3ra765ex6ucv2zljs5ks7bgs32in6nfbcdmr6rta:1:1:1694043/@@named=/Haver-2010-Experimenting_With_Sha-3_Candidates_In_Tahoe-lafs.pdf Experimenting with SHA-3 Candidates in Tahoe-LAFS] performance evaluation of different hash functions in Tahoe-LAFS

===== Combiners a.k.a. Multiple Encryption a.k.a. Cascade Ciphers =====

====== Hash Function Combiners ======

 * [http://tuprints.ulb.tu-darmstadt.de/2094/1/thesis.lehmann.pdf On the Security of Hash Function Combiners] Anja Lehmann's dissertation on hash function combiners

====== Cipher Combiners ======

 * [http://www.ciphersbyritter.com/GLOSSARY.HTM#MultipleEncryption web page on multiple encryption] by Terry Ritter
 * [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.8477 Chosen-Ciphertext Security of Multiple Encryption] by Dodis and Katz, 2005; combining two or more ciphers together
 * [https://sites.google.com/site/amirherzberg/tolerant.pdf?attredirects=0 Folklore, practice and theory of robust combiners] by Amir Herzberg
 * [http://www.cs.purdue.edu/homes/ninghui/courses/Spring04/homeworks/p465-merkle.pdf On the security of multiple encryption] by Merkle and Hellman, 1981

=== Public Key Cryptography ===

==== Hash-Based Digital Signatures ====

 * [http://sphincs.cr.yp.to/papers.html SPHINCS: practical stateless hash-based signatures] by Daniel J. Bernstein, Daira Hopwood, Andreas Hülsing, Tanja Lange, Ruben Niederhagen, Louiza Papachristodoulou, Michael Schneider, Peter Schwabe, and Zooko Wilcox-O'Hearn; "introduces the HORST few-time signature scheme, the SPHINCS many-time signature scheme, and SPHINCS-256". This is the current state of the art in stateless hash-based signatures (but I may be biased --Daira).
 * [http://eprint.iacr.org/2011/484 XMSS - A Practical Forward Secure Signature Scheme based on Minimal Security Assumptions] by Buchmann, Dahmen, Hülsing; “the first provably forward secure and practical [stateful] signature scheme with minimal security requirements: a pseudorandom and a second preimage resistant (hash) function family. Its signature size is reduced to less than 25% compared to the best provably secure [stateful] hash based signature scheme.”
 * [http://www.cdc.informatik.tu-darmstadt.de/~dahmen/papers/DOTV08.pdf Digital Signatures out of Second-Preimage Resistant Hash Functions] by Dahmen, Okeya, Takagi, Vuillaume; this scheme is secure as long as the underlying hash function has ''second-preimage resistance'', which real hash functions are much more likely to have than a stronger property such as ''collision-resistance''.
 * [http://www.cdc.informatik.tu-darmstadt.de/~dahmen/papers/hashbasedcrypto.pdf Hash-based Digital Signature Schemes] by Buchmann, Dahmen, and Szydlo; a survey of why it might be a good idea.

==== Elliptic Curve Cryptography ====

 * [http://ed25519.cr.yp.to/ Ed25519] fast, well-engineered elliptic curve digital signatures by Daniel J. Bernstein
 * [http://eprint.iacr.org/2009/389 On the Security of 1024-bit RSA and 160-bit Elliptic Curve Cryptography] crypto gurus try to predict whether 160-bit elliptic curve crypto can be brute-force-cracked in the next decade. They conclude: "Right now most certainly not: 2.5 billion PS3s or equivalent devices (such as desktops) for a year is way out of reach. In a decade, very optimistically incorporating 10-fold cryptanalytic advances, still millions of devices would be required, and a successful open community attack on 160-bit ECC even by the year 2020 must be considered very unlikely."
 * [http://eprint.iacr.org/2009/466 The Certicom Challenges ECC2-X] other crypto gurus launch an effort to brute-force-crack 130-bit and 160-bit ECC.

== Erasure Coding ==

 * [http://www.cs.utk.edu/~plank/plank/gflib/index.html a tutorial] and some software for erasure coding. This isn't the software that we use because it isn't as fast as Rizzo's implementation, but the tutorial is nice.
 * [http://www.cs.utk.edu/~plank/plank/papers/CS-08-625.pdf A Performance Comparison of Open-Source Erasure Coding Libraries] benchmarking some fec implementations including zfec

== Direct Attached Storage ==

=== Local Filesystems ===

 * [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.8911 Model-Based Failure Analysis of Journaling File Systems] [http://www.cs.wisc.edu/wind/Publications/sfa-dsn05.pdf PDF] compares ext3, reiserfs, and JFS under conditions of latent sector errors. (Impatient people: read the Introduction and look at the table on page 9.)
 * [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.66.3785 IRON Filesystems] [https://www.cs.wisc.edu/wind/Publications/iron-sosp05.pdf PDF], a follow-on by the authors of "Model-Based Failure Analysis of Journaling File Systems" examines how ext3, reiserfs, xfs, and ntfs handle various sorts of errors (impatient people, see table on page 8, "File System Summary" on page 9, and table on page 10).
 * [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.80.8142 Using Model Checking to Find Serious File System Errors] [https://www.stanford.edu/~engler/osdi04-fisc.pdf PDF] analyzes ext3, JFS, and reiserfs (impatient: page 10).
 * [https://www.stanford.edu/~engler/explode-osdi06.pdf eXplode: A lightweight, general approach for finding serious errors in storage systems], a follow-on by the authors of "Using Model Checking to Find Serious File System Errors", tests ext2, ext3, reiserfs, reiser4, jfs, xfs, msdos, vfat, hfs, and hfs+ to see whether, if you sync them and then crash them, your allegedly synced data is actually recoverable (impatient: page 11) (Summary: basically it looks to me (Zooko) like reiser3 is better-engineered for handling faults than are the other local filesystems. See also the recent revelation that ext3 has been running with write barriers turned off all this time: http://lwn.net/Articles/283161 .)

=== Disk Failure Rates ===

 * [http://labs.google.com/papers/disk_failures.pdf Failure Trends in a Large Disk Drive Population] by Google engineers

== P2P / Distributed Systems / Decentralization ==

 * [http://conferences.sigcomm.org/co-next/2009/papers/Jacobson.pdf Networking Named Content] -- "Content-Centric Networking" is decentralized storage as envisioned by networking experts
 * [http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf Dynamo: Amazon's Highly Available Key-value Store] -- sophisticated distributed hash table polished by extensive high-performance practical usage; an excellent paper!
 * [http://citeseer.ist.psu.edu/rhea05fixing.html Fixing the Embarrassing Slowness of OpenDHT on PlanetLab (2005)] -- practical lessons in DHT performance that theoreticians learned by deployment
 * [http://betathoughts.blogspot.com/2007/06/brief-history-of-consensus-2pc-and.html A brief history of Consensus, 2PC and Transaction Commit.] -- a web page summarizing the evolution of the academic theory of decentralized, reliable systems.

== See Also ==

 * This page is inspired by [http://flud.org flud]'s [http://flud.org/wiki/index.php/RelatedPapers Related Papers] page, which is well worth reading.
 * See also Ludovic Courtès's excellent [http://www.laas.fr/~lcourtes/ludo-1.html bibliography of cooperative backup]. ''Whoops, broken link!''
 * See also our [wiki:RelatedProjects RelatedProjects page].

== The Back Shelf ==

These are some references which are less interesting or relevant than the ones above.

=== Public Key Cryptography ===

 * [http://www.cs.umd.edu/~jkatz/papers/dh-sigs-full.pdf Efficient Signature Schemes with Tight Reductions to the Diffie-Hellman Problems] Scheme 1 in this paper comes with a tight reduction to the Computational Diffie-Hellman problem, which means it is definitely at least as secure as any discrete-log-based scheme and could be more secure. It also has a good pedigree (having been suggested by David Chaum et al. in 1989 and having been proven to tightly reduce to Computational Diffie-Hellman by Katz et al. in 2003). It also has a nice short public key, which could be good for fitting it into our capability security schemes.
 * [http://tools.ietf.org/html/draft-lochter-pkix-brainpool-ecc-03 ECC Brainpool Standard Curves and Curve Generation] new elliptic curve parameters which come with a proof that they were generated deterministically and pseudorandomly from the first few bits of π, as well as proofs that they are immune to certain other potential cryptographic weaknesses.

=== Hash-based Signatures ===

 * [http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=8AC81C407AA3CBF35093032BD01F3085?doi=10.1.1.95.1374&rep=rep1&type=pdf Merkle Signatures with Virtually Unlimited Signature Capacity] by Buchmann, Dahmen, Klintsevich, Okeya, and Vuillaume; includes treating the parameters as an optimization problem and solving it with various weights or constraints to find various good settings for the parameters. Unfortunately their weights and constraints are different from ours: they thought it was fine to let key generation time take tens of hours! We want key generation time to be as few milliseconds as possible.
   A good rule of thumb for us would probably be to try to reduce the time of whichever of the three operations (key generation, signing, and verification) is the slowest.
 * [https://www.minicrypt.cdc.informatik.tu-darmstadt.de/reports/reports/REDBP08.pdf Fast Hash-Based Signatures on Constrained Devices] by Rohde, Eisenbarth, Dahmen, Buchmann, and Paar; a case study of implementing hash-based digital signatures for an 8-bit microcontroller. Their implementation had some trade-offs that we wouldn't want: it is a "key-evolving" design (the signer has to maintain state in order to avoid a security failure), it can only handle a limited number of signatures, and they spent a lot of time in key generation. Hm, they don't say how long key generation took in this paper, only that it took so long that they had to run it on a PC instead of on their microcontroller. In [Merkle Signatures with Virtually Unlimited Signature Capacity], the key generation took tens of hours on a PC!!! On the other hand, they do show a digital signature scheme which is faster at signing and verifying and is also arguably safer than RSA or ECDSA on their 8-bit microcontroller.

=== Miscellaneous ===

 * [http://citeseer.ist.psu.edu/mislove03post.html POST: A Secure, Resilient, Cooperative Messaging System] -- use a DHT for messaging; includes a suggestion to ameliorate the confidentiality problems of single-instance store by adding random bits to small text messages
 * [http://srhea.net/papers/ntr-worlds05.pdf Non-Transitive Connectivity and DHTs] -- practical lessons in dealing with not-fully-connected DHTs that theoreticians learned in deployment
 * [http://www.cs.cmu.edu/~dga/papers/incast-fast2008-abstract.html Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems] -- Hm... Could this happen to us?
 * [http://eprint.iacr.org/2008/194 Endomorphisms for faster elliptic curve cryptography on general curves] techniques to compute elliptic curve cryptography significantly faster in software.
 * [http://eprint.iacr.org/2005/391 Some thoughts on Collision Attacks in the Hash Functions MD5, SHA-0 and SHA-1] general musings about design of secure hash functions
 * [http://enrupt.com EnRUPT] a very simple, fast, and flexible primitive which could be used as a stream cipher, secure hash function, or MAC (the first two are primitives that we currently need, and the third one -- MAC -- is a primitive that we may want in the future) and which relies for its security on a large number of rounds. The question of how many rounds to use is decided by semi-automated cryptanalysis. (Note: the SHA-3 candidate version of EnRUPT in stream hashing mode was insecure. The current block cipher mode is insecure. There is a minor change (use a few more rounds) which is thought to fix the stream hashing mode. The author is apparently working on a fix for the block cipher mode.)
 * [http://defectoscopy.com/results.html defectoscopy.com] a table of semi-automated cryptanalysis results from the inventors of EnRUPT. This technique has not been peer-reviewed by other cryptographers. I (Zooko) can't judge how valid it is. Note that MD4, MD5, SHA-0, SHA-1, SHA-2-256, and GOST are predicted to be insecure, while Tiger is predicted to be secure. AES-128 is predicted to be insecure. Salsa20 is predicted to be secure.
 * [http://webee.technion.ac.il/~hugo/kdf/kdf.pdf HKDF full paper] defines and analyzes the ''HKDF'' key-derivation algorithm; a KDF is a linchpin component of our crypto schemes.
 * [http://cr.yp.to/chacha.html ChaCha20] an even better stream cipher; it might be slightly safer than Salsa20 and it is certainly slightly faster on some platforms, but slightly slower on others. However, the author of Salsa20 and !ChaCha20, Daniel J.
   Bernstein, seems to have settled on using Salsa20 (or a tweak of it named XSalsa20), so probably that is the one to use.
 * [https://online.tu-graz.ac.at/tug_online/voe_main2.getvolltext?pDocumentNr=81263 Cryptanalysis of the Tiger Hash Function] by Mendel and Rijmen
 * [http://www.cryptojedi.org/crypto/index.shtml#aesbs Bitsliced AES implementation] the faster, timing-resistant implementation of AES-CTR in bitsliced mode by Peter Schwabe and Emilia Käsper.
 * [http://crypto.stanford.edu/vpaes/ Vector permutations and AES] the fast and timing-resistant implementations by Mike Hamburg using vector permute instructions (read: pshufb and vperm).