Here are some papers that are potentially of interest.

== Crypto ==

=== Symmetric Primitives ===

[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.59.9522 Salsa20 Design] a fast and secure cipher

[http://cr.yp.to/snuffle.html#security Salsa20 Security Arguments] why Salsa20 is probably safe against this and that threat

[http://cr.yp.to/chacha.html ChaCha20] an even better stream cipher. It might be slightly safer than Salsa20, and it is certainly slightly faster on some platforms but slightly slower on others. However, the author of Salsa20 and !ChaCha20, Daniel J. Bernstein, seems to have settled on using Salsa20 (or a tweak of it named XSalsa20), so that is probably the one to use.

[http://enrupt.com EnRUPT] a very simple, fast, and flexible primitive which could be used as a stream cipher, secure hash function, or MAC (the first two are primitives that we currently need, and the third one -- MAC -- is a primitive that we may want in the future) and which relies for its security on a large number of rounds. The question of how many rounds to use is decided by semi-automated cryptanalysis. (Note: the SHA-3 candidate version of EnRUPT in stream hashing mode was insecure. The current block cipher mode is insecure. There is a minor change (use a few more rounds) which is thought to fix the stream hashing mode. The author is apparently working on a fix for the block cipher mode.)

[https://online.tu-graz.ac.at/tug_online/voe_main2.getvolltext?pDocumentNr=81263 Cryptanalysis of the Tiger Hash Function] by Mendel and Rijmen

[http://defectoscopy.com/results.html defectoscopy.com] a table of semi-automated cryptanalysis results from the inventors of EnRUPT. This technique has not been peer-reviewed by other cryptographers. I (Zooko) can't judge how valid it is. Note that Tiger is one of only two hash functions that are predicted to be secure by this analysis -- the other is Whirlpool. MD-4/5, SHA-0/1/2, and GOST are predicted to be insecure.

=== Elliptic Curve Cryptography ===

[http://tools.ietf.org/html/draft-lochter-pkix-brainpool-ecc-03 ECC Brainpool Standard Curves and Curve Generation] new elliptic curve parameters which come with a proof that they were generated deterministically and pseudorandomly from the first few bits of pi, as well as proofs that they are immune to certain other potential cryptographic weaknesses.

== Erasure Coding ==

[http://www.cs.utk.edu/~plank/plank/gflib/index.html a tutorial] and some software for erasure coding. This isn't the software that we use because it isn't as fast as Rizzo's implementation, but the tutorial is nice.

[http://www.cs.utk.edu/~library/TechReports/2008/ut-cs-08-625.pdf tech report] benchmarking some FEC implementations, including zfec

== Local Filesystems ==

[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.8911 Model-Based Failure Analysis of Journaling File Systems] [http://www.cs.wisc.edu/wind/Publications/sfa-dsn05.pdf PDF] compares ext3, reiserfs, and JFS under conditions of latent sector errors. (Impatient people: read the Introduction and look at the table on page 9.)

[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.66.3785 IRON Filesystems] [https://www.cs.wisc.edu/wind/Publications/iron-sosp05.pdf PDF], a follow-on by the authors of "Model-Based Failure Analysis of Journaling File Systems", examines how ext3, reiserfs, xfs, and ntfs handle various sorts of errors. (Impatient people: see the table on page 8, "File System Summary" on page 9, and the table on page 10.)

[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.80.8142 Using Model Checking to Find Serious File System Errors] [https://www.stanford.edu/~engler/osdi04-fisc.pdf PDF] analyzes ext3, JFS, and reiserfs. (Impatient people: see page 10.)

[https://www.stanford.edu/~engler/explode-osdi06.pdf eXplode: A lightweight, general approach for finding serious errors in storage systems], a follow-on by the authors of "Using Model Checking to Find Serious File System Errors", compares ext2, ext3, reiserfs, reiser4, jfs, xfs, msdos, vfat, hfs, and hfs+ to see whether, if you sync them and then crash them, your allegedly synced data is actually recoverable. (Impatient people: see page 11.)

(Summary: basically it looks to me (Zooko) like reiser3 is better-engineered for handling faults than are the other local filesystems. See also the recent revelation that ext3 has been running with write barriers turned off all this time: http://lwn.net/Articles/283161 .)

== P2P / Distributed Systems / Decentralization ==

[http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf Dynamo: Amazon's Highly Available Key-value Store] -- sophisticated distributed hash table polished by extensive high-performance practical usage. An excellent paper!

[http://citeseer.ist.psu.edu/rhea05fixing.html Fixing the Embarrassing Slowness of OpenDHT on PlanetLab (2005)] -- practical lessons in DHT performance that theoreticians learned by deployment

[http://betathoughts.blogspot.com/2007/06/brief-history-of-consensus-2pc-and.html A brief history of Consensus, 2PC and Transaction Commit.] -- a web page summarizing the evolution of the academic theory of decentralized, reliable systems.

== See Also ==

This page is inspired by [http://flud.org flud]'s [http://flud.org/wiki/index.php/RelatedPapers Related Papers] page, which is well worth reading.

See also Ludovic Courtès's excellent [http://www.laas.fr/~lcourtes/ludo-1.html bibliography of cooperative backup].

See also our [wiki:RelatedProjects RelatedProjects page].

== The Back Shelf ==

These are some references which are less interesting or relevant than the ones above.

[http://citeseer.ist.psu.edu/mislove03post.html POST: A Secure, Resilient, Cooperative Messaging System] -- uses a DHT for messaging; includes a suggestion to ameliorate the confidentiality problems of single-instance store by adding random bits to small text messages

[http://srhea.net/papers/ntr-worlds05.pdf Non-Transitive Connectivity and DHTs] -- practical lessons in dealing with not-fully-connected DHTs that theoreticians learned in deployment

[http://www.cs.cmu.edu/~dga/papers/incast-fast2008-abstract.html Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems] -- Hm... Could this happen to us?

[http://eprint.iacr.org/2008/194 Endomorphisms for faster elliptic curve cryptography on general curves] techniques to compute elliptic curve cryptography significantly faster in software.

[http://eprint.iacr.org/2005/391 Some thoughts on Collision Attacks in the Hash Functions MD5, SHA-0 and SHA-1] general musings about the design of secure hash functions