= Tahoe-LAFS Summer-of-Code Projects =

This page contains specific suggestions for projects we would like to see in the Summer of Code. Note that they vary a lot in required skills and difficulty. We hope to get applications across a broad spectrum.

If you are interested in working on any of these projects, please contact the Mentors listed at the bottom of the page. In addition, you may wish to discuss your proposal on IRC: join us on #tahoe-lafs on irc.freenode.net.

If you cannot find a suitable project here, we encourage you to come up with your own suggestions. You can find more project ideas by [wiki:ViewTickets exploring the issue tracker]. Especially see [http://allmydata.org/trac/tahoe-lafs/query?status=!closed&order=priority&keywords=~gsoc tickets labelled 'gsoc'] (developers: please add this label to any tickets that might make a good GSoC project).

Deadlines and directions for students' applications to the Google Summer-of-Code can be found on [http://code.google.com/soc/ the Google pages].

||''Project''||''Difficulty''||''Contact''||
||[#RedundantArrayofIndependentClouds Redundant Array of Independent Clouds]||Medium||[mailto:zooko@zooko.com Zooko Wilcox-O'Hearn] or any mentor||
||[#ShareMigration Share Migration]||Medium||any mentor||
||[#CloudApps Cloud Apps]||Easy–Hard||[http://www.randombit.net Jack Lloyd] or any mentor||
||[#WebDAV WebDAV]||Medium||[mailto:david-sarah@jacaranda.org David-Sarah Hopwood] or any mentor||

----

= Redundant Array of Independent Clouds =

Add backends to the storage servers so that they store their shares on a cloud storage system instead of on their local filesystem. This means that you can get all of the availability and scalability of services such as Amazon S3 or Rackspace CloudFiles combined with the security properties of Tahoe-LAFS. See [http://allmydata.org/~zooko/RAIC.png the RAIC diagram].

For details, read ticket #999, which includes pointers to the relevant source code and instructions on how to begin writing the code.
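To make the idea of a pluggable backend concrete, here is a minimal sketch in Python. It is not the design from ticket #999 and not the existing storage server code. The names `ShareBackend`, `DiskBackend`, `ObjectStoreBackend` and `roundtrip` are hypothetical, and the "cloud" is simulated with an in-memory dictionary rather than a real S3 or CloudFiles client. It only illustrates that the storage server could talk to one interface while the share bytes live either on local disk or in an object store.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class ShareBackend(ABC):
    """Hypothetical interface: where a storage server keeps share bytes."""

    @abstractmethod
    def put_share(self, storage_index: str, shnum: int, data: bytes) -> None:
        ...

    @abstractmethod
    def get_share(self, storage_index: str, shnum: int) -> bytes:
        ...


class DiskBackend(ShareBackend):
    """Keeps each share as a file under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root

    def _path(self, storage_index: str, shnum: int) -> str:
        return os.path.join(self.root, storage_index, str(shnum))

    def put_share(self, storage_index: str, shnum: int, data: bytes) -> None:
        path = self._path(storage_index, shnum)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)

    def get_share(self, storage_index: str, shnum: int) -> bytes:
        with open(self._path(storage_index, shnum), "rb") as f:
            return f.read()


class ObjectStoreBackend(ShareBackend):
    """Stand-in for a cloud bucket: a dict keyed by object name.

    A real backend would replace the dict with calls to a cloud
    storage client; that part is deliberately left out here.
    """

    def __init__(self) -> None:
        self.objects = {}

    def _key(self, storage_index: str, shnum: int) -> str:
        return "shares/%s/%d" % (storage_index, shnum)

    def put_share(self, storage_index: str, shnum: int, data: bytes) -> None:
        self.objects[self._key(storage_index, shnum)] = bytes(data)

    def get_share(self, storage_index: str, shnum: int) -> bytes:
        return self.objects[self._key(storage_index, shnum)]


def roundtrip(backend: ShareBackend) -> bytes:
    """Store one share and read it back through the common interface."""
    backend.put_share("exampleindex", 0, b"share bytes")
    return backend.get_share("exampleindex", 0)


if __name__ == "__main__":
    print(roundtrip(DiskBackend(tempfile.mkdtemp())))
    print(roundtrip(ObjectStoreBackend()))
```

Because shares are already encrypted and erasure-coded by the client before they reach a storage server, a backend like this only ever handles opaque bytes, which is why the security properties do not depend on the cloud provider.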
= Share Migration =

When uploading a file to a grid, Tahoe-LAFS will make sure that the file is healthy (a good discussion of what healthy means is found in #778) before reporting that the file is uploaded successfully. Tools to effectively maintain file health (or to adapt to new definitions of health) aren't quite complete, however: our users have had several use cases that aren't easily addressed with what we have. Students taking this project would be building tools to address those use cases.

A good starting point would be to become familiar with how files are placed on a grid. [http://allmydata.org/trac/tahoe-lafs/browser/docs/architecture.txt architecture.txt], [http://allmydata.org/trac/tahoe-lafs/browser/docs/specifications/file-encoding.txt file-encoding.txt], [http://allmydata.org/trac/tahoe-lafs/browser/docs/specifications/mutable.txt mutable.txt], [http://allmydata.org/trac/tahoe-lafs/browser/src/allmydata/immutable/upload.py the immutable file upload code], and [http://allmydata.org/trac/tahoe-lafs/browser/src/allmydata/mutable/publish.py the mutable file upload code] are good places to do that. Also, you might want to look at the [http://allmydata.org/trac/tahoe-lafs/browser/src/allmydata/storage/server.py storage server code] to understand how shares are stored.

Some good tickets to start looking at are #699, #543, and #232; you'll find that those link to other tickets.

There are many ways to help address these issues. Some ideas:

 * Alter the CLI and the WUI to give users the ability to rebalance files that they've uploaded already. (#699)
 * Build tools that allow node administrators to move shares around a grid. (#543, #864)
 * Alter Tahoe-LAFS to rebalance mutable files when uploading a new version of them. (#232)

Any one of these projects is probably too small to fill a summer, but combined they would be a big usability improvement for Tahoe-LAFS.
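As a rough illustration of what "rebalancing" could mean, the sketch below plans share moves for a single file. It assumes a deliberately simplified goal (spread shares over as many servers as possible), which is not the definition of health discussed in #778. The function name `plan_rebalance` and the `placement` structure are hypothetical and do not correspond to anything in the Tahoe-LAFS code base.

```python
def plan_rebalance(placement):
    """Plan share moves from crowded servers to empty ones.

    placement: dict mapping a server id to the set of share numbers
    it currently holds for one file (hypothetical structure).
    Returns a list of (shnum, source_server, destination_server).
    The input is not modified.
    """
    holdings = {server: set(shares) for server, shares in placement.items()}
    # Servers that hold nothing are the candidate destinations.
    empty = sorted(s for s, shares in holdings.items() if not shares)
    moves = []
    # Visit the most crowded servers first; ties are broken by server id.
    for src in sorted(holdings, key=lambda s: (-len(holdings[s]), s)):
        while len(holdings[src]) > 1 and empty:
            dst = empty.pop(0)
            shnum = max(holdings[src])
            holdings[src].remove(shnum)
            holdings[dst].add(shnum)
            moves.append((shnum, src, dst))
    return moves


if __name__ == "__main__":
    example = {"A": {0, 1, 2}, "B": {3}, "C": set(), "D": set()}
    for shnum, src, dst in plan_rebalance(example):
        print("move share %d from %s to %s" % (shnum, src, dst))
```

A real tool would also have to copy the share data, verify it on the destination, and only then delete it from the source, and it would need to use whatever health metric the project settles on.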
Depending on how you approach it, this project is tightly integrated with the ideas of file health and accounting, so prospective students would do well to explore those open issues, too. A good jumping-off point for accounting is #666. A good jumping-off point for health is #778.

= Cloud Apps =

Difficulty: easy to hard, depending on project choice and how far you want to push it

There are a lot of applications that could potentially make good use of Tahoe-LAFS, replacing the typical centralized storage of flat files or SQL databases. Currently supported projects include [http://www.tiddlywiki.com/ TiddlyWiki] (one of the Tahoe-LAFS developers hosts his blog using [http://allmydata.org/trac/tiddly_on_tahoe TiddlyWiki stored in Tahoe-LAFS]), [http://hadoop.apache.org/ Hadoop], and [RelatedProjects a number of others].

There are still many useful and interesting things that have yet to be built using Tahoe-LAFS. Perhaps the most promising area is web applications; what applications can you think of that could make use of a highly reliable filesystem accessible from both desktops and [http://github.com/ctrlaltdel/TahoeLAFS-android handheld devices]? Keep in mind that Tahoe-LAFS's architecture allows sharing and delegation opportunities that are difficult or impossible to implement using other backends. Some ideas people have suggested include a calendar or photo album, or porting Mozilla's [https://bespin.mozilla.com Bespin] editor.

Nathan Wilcox wrote most of an interactive tree browser frontend in !JavaScript (see [wiki:RelatedProjects the RelatedProjects page]); Toby Murray wrote [http://allmydata.org/pipermail/tahoe-dev/2010-March/004137.html a front-end in Cajita]; in what interesting ways might these be extended?

This is in some ways the most interesting area for development, as it combines security and distributed systems problems with providing a user interface that lets a person who isn't particularly security-minded operate safely by default.
This is a hard problem, but it offers great rewards in terms of learning, and even the chance to break new ground in safe-by-default interface design.

Required skills: HTML and !JavaScript for web applications. For other tie-ins, the required skills will depend on the base project (for instance, porting the git DVCS to run on Tahoe would require good C-fu, with git experience helpful).

= WebDAV =

Implement a WebDAV front-end for Tahoe-LAFS so that files and directories stored in a distributed grid can be accessed by operating systems (including Windows, Mac, and Linux) and applications that speak the WebDAV protocol.

For details, see #451, which describes what the Tahoe-LAFS web server does now, how this differs from what a WebDAV web server does, and how to get started experimenting with the relevant source code.

----

= Mentors =

''Who is willing to spend about five hours a week (estimated) helping a student do it right?'' [[br]]

 * [http://testgrid.allmydata.org:3567/uri/URI:DIR2-RO:j74uhg25nwdpjpacl6rkat2yhm:kav7ijeft5h7r7rxdp5bgtlt3viv32yabqajkrdykozia5544jqa/wiki.html Zooko Wilcox-O'Hearn] (Python/C/C++/JavaScript, cryptography) [mailto:zooko@zooko.com ]
 * [http://www.randombit.net Jack Lloyd] (C/C++/Python, cryptography)
 * [mailto:david-sarah@jacaranda.org David-Sarah Hopwood] (Python/C/JavaScript, SFTP frontend, security+cryptography)

----

This page was modelled on [http://www.netbsd.org/contrib/soc-projects.html the NetBSD Summer-of-Code page].