[tahoe-dev] FAQ? - What happens if I loose the Tahoe-LAFS gateway machine? production ready?

Brian Warner warner at lothar.com
Wed Jan 5 21:34:14 UTC 2011


On 1/5/11 12:56 PM, Carsten Krüger wrote:

> maybe it's a FAQ but I didn't find an answer on the website.

Excellent questions! Yeah, we should definitely add these to the FAQ.

> What happens if I loose the gateway? Did the informations on the
> storage servers are sufficent to "rebuild" the hole system? Is the
> index etc. stored in a distributed way?

Nothing, yes, and yes.

The gateway is merely that: a gateway between your HTTP-speaking client
and the Tahoe storage grid. Nothing on the gateway needs to be backed
up or preserved.

What matters most is the filecap, dircap, or "rootcap" under which you
stored your data. You must retain access to that string.

As long as you still have the rootcap, you can use a different
gateway to get your data back: there can be lots of gateways for a
single grid, and you can paste the rootcap into any of them and get back
to your data. Every Tahoe node is a gateway, although you may not be
able to reach the HTTP port on all of them.
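Concretely, every gateway serves caps through the same web API, so the same cap resolves to the same data no matter which gateway you paste it into. A minimal sketch of building such a request URL (the cap string and gateway address below are made-up placeholders):

```python
from urllib.parse import quote

def webapi_url(gateway, cap):
    """Build a gateway's web-API URL for a given filecap/dircap/rootcap.

    Any gateway attached to the same grid will resolve the same cap
    to the same data.
    """
    # Caps contain colons, so percent-encode the whole string.
    return "%s/uri/%s" % (gateway.rstrip("/"), quote(cap, safe=""))

# Hypothetical rootcap -- a real one is a long URI:DIR2:... string.
cap = "URI:DIR2:exampleexampleexample:exampleexampleexample"
print(webapi_url("http://127.0.0.1:3456", cap))
```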

If you know the grid's "introducer.furl", then you can build a new
gateway to connect to that grid. Typically you get this from somebody
who's already attached to that grid.
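The new gateway just needs that furl in its tahoe.cfg; the value below is a placeholder, not a real introducer:

```ini
[client]
# Obtain the real furl from somebody already on the grid.
introducer.furl = pb://abcdefghijklmnop@example.com:56789/introducer
```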

Your files do depend upon enough of the shares surviving, which means
that enough of the servers are still running and reachable, and that
they haven't deleted your shares for some reason. Your gateway might
have been storing some shares, so if you lose the gateway then you may
also lose some shares, but typically your one gateway node will be a
small fraction of the overall grid, so you are unlikely to lose enough
shares to threaten the safety of your files.
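To put rough numbers on that: with the default 3-of-10 encoding, a file stays recoverable as long as any 3 of its 10 shares are reachable. A back-of-envelope sketch (the per-server availability figure is an assumption for illustration, not a Tahoe parameter):

```python
from math import comb

def file_survival(n=10, k=3, p=0.9):
    """Probability that at least k of n independently-placed shares
    sit on reachable servers, if each server is up with probability p.
    Requires Python 3.8+ for math.comb."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Even if each server were only 90% available, simultaneously losing
# 8 of 10 shares is extremely unlikely:
print("P(file recoverable) = %.7f" % file_survival())
```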

> What amount of data is expected to be backuped from the gateway if you
> distribute 1 TB of data (10.000 files) to 20 storage servers.

The default encoding parameters are "3-of-10", which means that e.g. a
3MB file would be turned into 10 shares of 1MB each, and spread
mostly-evenly across all the storage servers. If you have 20 servers,
then 10 of them would get one share each, and the other 10 would not get
a share. The exact set of servers used will be different for each file,
so overall, the servers should be filled at the same average rate.

1TB of data in 10k files means each file is about 100MB. Each file will
result in 10 shares of about 33MB each, so there will be 100k shares
total. Spread evenly, you should expect each server to wind up with
about 5000 shares of 33MB each, so about 165GB per server.

(another way to do this calculation: the 1TB of data will be expanded by
10/3 into 3.3TB, then divided across 20 servers, so 3.3TB/20= 165GB per
server)
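The arithmetic above is easy to sketch in a few lines (the post rounds shares to 33MB, so it quotes ~165GB; the unrounded figure is ~166.7GB):

```python
def per_server_storage(data_bytes, n_files, servers, k=3, n=10):
    """Back-of-envelope share accounting for k-of-n erasure coding."""
    file_size = data_bytes / n_files      # average file size
    share_size = file_size / k            # each share is 1/k of the file
    total_shares = n_files * n            # every file yields n shares
    shares_per_server = total_shares / servers
    return shares_per_server, shares_per_server * share_size

TB = 10**12
shares, bytes_per_server = per_server_storage(1 * TB, 10_000, 20)
print(shares)                    # 5000 shares per server
print(bytes_per_server / 10**9)  # ~166.7 GB per server (~165 GB with
                                 # shares rounded down to 33 MB)
```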

> I would like to use LAFS for a distributed backup but I'm unsure if
> it's fitting for that purpose.

That's exactly what we built it for :). That number and size of files is
about ideal for Tahoe; it should have no problems with it.

> Is LAFS stable enough for production use?

Yes. It's quite stable. You'll need to experiment to see if it is fast
enough for your purposes, but I've got no concerns about its stability.

> storage servers: windows xp, gateway server windows 2k3 or linux

Linux is always better, of course :). But if you can get Tahoe-LAFS
running on those windows boxes (sometimes it's harder to get the
dependencies installed on windows), then you shouldn't see much of a
difference at runtime.


cheers,
 -Brian

