Storage Server Maintenance

Deon George deon at leenooks.net
Wed Nov 11 01:32:41 UTC 2015


Hi,

I’m very new to Tahoe-LAFS, and I like what I see, both what it is and its potential. I’m a part-time coder and have never done any Python work, but if spare time comes my way I would love to dig deeper into the code and perhaps even help with some of the work.

In the meantime, I’m looking at “maintaining” a storage node, on the premise that the node owner may not be the maintainer of all nodes in the grid. In my case, I will have a storage node with some peers, holding my data as well as theirs, and I want to make sure that my data is actually well distributed.

Things I want to get a handle on are:

* Expired data - does it actually get removed from all nodes (and, importantly, from my node)?
* When my space is filling up because of an imbalance (I have more shares than I should have), kicking off a procedure to fix that.
* When I discover my data is not well distributed - another node has more shares than it should - how can I rectify that?
* Determining who is using more than their fair share of my storage - I know that in the current incarnation of Tahoe-LAFS this is not possible, so I’m looking forward to the Storage Authority/Quotas work that I’ve read about. (There’s a rough sketch of how I’m poking at this just after this list.)
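To get at the last two points, I’ve started poking at the share store directly from the shell. This is only a sketch of what I’ve cobbled together, not a blessed tool: it walks my node’s storage/shares directory, counts shares and bytes per storage index, and then pulls the lease “owner=” lines out of “tahoe debug dump-share” for every share. The relative paths assume the same directory layout as my test grid node shown further down.

# sketch: run from the same directory as the dump-share example further down
for si in node/storage/shares/??/*/; do
    printf '%s  %s shares  %sK\n' "$si" "$(ls "$si" | wc -l)" "$(du -sk "$si" | cut -f1)"
done

# and which lease owners appear on each share (slow: one dump-share call per share)
for share in node/storage/shares/??/*/*; do
    tahoe debug dump-share "$share" | grep 'owner=' | sed "s|^|$share: |"
done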

So, to start with, I’ve been playing around with “tahoe debug” and my node on the test grid, “leenooks”. Its space is full, and I can see it’s because of an imbalance of shares (which may or may not have been instigated by me). Here is one example:

[tahoe at c7tahoe testgrid]$ tahoe debug dump-share node/storage/shares/f2/f27gtgapmemd7uqqmnghjveb5m/2
share filename: '/tlafs/testgrid/node/storage/shares/f2/f27gtgapmemd7uqqmnghjveb5m/2'
...
Lease #0: owner=0, expire in -6154048s

For this storage index I hold 4 shares (2, 7, 8, 9) of a 3-of-10 encoding, so I would want to:
a) check --repair to make sure that the other 6 shares exist (on 6 other nodes),
b) delete 3 of the shares on my node (leaving 1),
c) check --repair again so that the file is redistributed.

But I can only do b), right? (I’ve sketched roughly what I’m picturing just after the two points below.)

I.e.:
* for a) and c), I only have access to the verify cap, so I cannot instigate a repair (do I have that right?)
* for a) and c), even if I could repair it, could I ever know, with only the verify cap, that the shares had been rebalanced across 10 nodes?
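To make it concrete, this is roughly the sequence I’m picturing for the example file above. The <verify-cap> placeholder stands for the only cap I hold for it, and whether “tahoe check” will actually do a) and c) from just a verify cap is exactly the part I’m unsure about:

# a) verify how many of the 10 shares currently exist, and regenerate any missing ones
#    (assuming a verify cap is enough for this)
tahoe check --verify <verify-cap>
tahoe check --repair <verify-cap>

# b) the only step I can definitely do locally: drop three of my four shares
rm node/storage/shares/f2/f27gtgapmemd7uqqmnghjveb5m/{7,8,9}

# c) repair again so the file is redistributed, then re-check placement
tahoe check --repair <verify-cap>
tahoe check --verify <verify-cap>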

Also, for this share it appears the lease expired a long time ago - the negative lease time above works out to roughly 71 days past expiry - so why does the share still exist (I have expire.enabled = true)?
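For reference, this is roughly the garbage collection section I believe I have in tahoe.cfg on that node. expire.enabled = true is the only line I’m sure of; the rest are what I understand the defaults to be, so treat them as assumptions:

[storage]
enabled = true
expire.enabled = true
# assumed defaults below, not values I have confirmed:
expire.mode = age
# expire.override_lease_duration = 31 days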

I’d appreciate any guidance on “maintaining a storage node”.

…deon

