#2106 closed defect (duplicate)

RAIC behaviour different from RAID behaviour

Reported by: sickness
Owned by:
Priority: normal
Milestone: 1.10.1
Component: code
Version: 1.10.0
Keywords:
Cc:
Launchpad Bug:

Description

Let's assume we have a local RAID5 set of 4 identical disks attached to a controller inside a computer.

This RAID5 level guarantees that if we lose 1 of the 4 disks, we can continue not only to read but also to write to the set, albeit in degraded mode.

When we replace the failed disk with a new one, the RAID takes care of repairing the set, syncing the data in the background, and the 4th disk gets populated again with chunks of our valuable data (not only parity, because we know that in RAID5 parity is striped, but explaining this is outside the scope of this ticket). A minimal XOR sketch of this rebuild follows the diagrams below.

starting condition:

DISK1[chunk1] DISK2[chunk2] DISK3[chunk3] DISK4[chunk4]

broken disk:

DISK1[chunk1] DISK2[chunk2] DISK3[chunk3] DISK4[XXXXXX]

new disk is put in place:

DISK1[chunk1] DISK2[chunk2] DISK3[chunk3] DISK4[ ]

repair rebuilds DISK4's chunk of data reading the other 3 disks:

DISK1[chunk1] DISK2[chunk2] DISK3[chunk3] DISK4[chunk4]
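
A minimal sketch of that rebuild step, assuming the simplest single-parity view of one stripe (real RAID5 stripes the parity across the disks and works at the block-device level, but the arithmetic is the same): the missing block is just the XOR of the surviving blocks.

    # Illustrative only: rebuilding a missing RAID5 block from the surviving
    # blocks of the same stripe. The parity block is the XOR of the data
    # blocks, so any single missing block is the XOR of the other three.

    def xor_blocks(blocks):
        """XOR a list of equal-length byte blocks together."""
        result = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                result[i] ^= byte
        return bytes(result)

    # one stripe across 4 disks: 3 data blocks + 1 parity block
    d1, d2, d3 = b"AAAA", b"BBBB", b"CCCC"
    parity = xor_blocks([d1, d2, d3])

    # the disk holding d3 in this stripe fails; rebuild its block by
    # reading the three surviving blocks:
    rebuilt = xor_blocks([d1, d2, parity])
    assert rebuilt == d3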

Now let's assume we have a tahoe-lafs RAIC set of 4 identical servers on a LAN.

To mimic the RAID5 behaviour, we configure it to write 4 shares for every file, needing only any 3 of them to successfully read the file.
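
For reference, a 3-of-4 encoding like this is set with the erasure-coding parameters in the client's tahoe.cfg, roughly as below (option names as in tahoe-lafs 1.10; shares.happy is the "servers of happiness" upload threshold, which comes up again in the comments):

    [client]
    shares.needed = 3
    shares.happy = 4
    shares.total = 4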

So in this way we have a RAIC that should behave like a RAID5.

We can lose any 1 of these 4 servers and still be able to read the data, and to repair it.

But what happens if we actually lose 1 of those 4 servers and then try to read/repair the data? Or maybe even write new data?

We will end up having ALL 4 shares on just 3 servers, and when we rebuild the 4th server and put it back online, even repairing will not place shares on it, because the file will be seen as already healthy. But now what if we lose that one server which actually holds 2 shares of the same file?

starting condition:

SERV1[share1] SERV2[share2] SERV3[share3] SERV4[share4]

broken server:

SERV1[share1] SERV2[share2] SERV3[share3] SERV4[XXXXXX]

data is written, or a scheduled repair is attempted, and we get to this situation:

SERV1[share1,share4] SERV2[share2] SERV3[share3] SERV4[XXXXXX]

new server is put in place:

SERV1[share1,share4] SERV2[share2] SERV3[share3] SERV4[ ]

now if we try to repair, the situation remains the same, because as of now the repairer DOESN'T know that it has to actually rebalance share4 onto SERV4; it just tells us the file is healthy (the sketch after this walkthrough shows why a share-count-only check says so)

we can still read and write data, so far so good, right?

but what if SERV1 now suddenly breaks?

SERV1[XXXXXX] SERV2[share2] SERV3[share3] SERV4[ ]

OK, we can replace it:

SERV1[ ] SERV2[share2] SERV3[share3] SERV4[ ]

OK, now we have a problem: how can we rebuild the file if we need 3 shares out of 4 but we have just 2, even though we previously had 4 servers and the file was listed as "healthy" by the repairer?
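
A minimal sketch of the two notions of "health" in play here, using a plain dict from server to the set of share numbers it holds (illustrative Python, not tahoe-lafs code): a check that only counts distinct shares calls the problematic layout healthy, while counting the servers that actually hold shares shows why losing SERV1 is fatal.

    NEEDED, TOTAL = 3, 4   # the 3-of-4 encoding from the description

    def healthy_by_share_count(placement):
        # counts distinct shares, no matter where they live
        shares = set()
        for held in placement.values():
            shares |= held
        return len(shares) >= TOTAL

    def servers_holding_shares(placement):
        return sum(1 for held in placement.values() if held)

    def survives_loss_of(server, placement):
        # shares still readable if `server` disappears
        remaining = set()
        for s, held in placement.items():
            if s != server:
                remaining |= held
        return len(remaining) >= NEEDED

    placement = {"SERV1": {1, 4}, "SERV2": {2}, "SERV3": {3}, "SERV4": set()}

    print(healthy_by_share_count(placement))     # True: all 4 shares exist...
    print(servers_holding_shares(placement))     # 3: ...but on only 3 servers
    print(survives_loss_of("SERV1", placement))  # False: only shares 2 and 3 remain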

Change History (3)

comment:1 Changed at 2013-11-14T22:53:44Z by zooko

sickness: thanks for the detailed description of the issue! I agree with you that it would be a problem if we got to the end of this story you've written and lost a file that way.

There are several improvements we can make.

improvement 1: let repair improve file health (#1382)

The last chance we have to avoid this fate is in the step where a repair is attempted when the placement is already:

 SERV1[share1,share4] SERV2[share2] SERV3[share3] SERV4[ ] 

If we are ever in that state, and a repair (or upload) is attempted, then a copy of either share1 or share4 must be uploaded to SERV4 in order to improve the health of the file. The #1382 branch (by Mark Berger; currently in review and almost ready to commit to trunk!) fixes this, so that a repair or upload in that case would upload a share to SERV4.

Note that this improvement "let repair improve file health" is the same whether the state is:

 SERV1[share1,share4] SERV2[share2] SERV3[share3] SERV4[ ] 

or:

 SERV1[share1] SERV2[share2] SERV3[share3] SERV4[ ] 

In either case, we want to upload a share to SERV4! The #1382 branch does this right.
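
A rough sketch of the placement behaviour being described, again on a plain server-to-shares map rather than the actual #1382 code: whenever one server holds surplus shares and another holds none, a copy of a surplus share gets uploaded to the empty server.

    def rebalance(placement):
        """placement: dict server -> set of share numbers. Copies surplus
        shares onto empty servers; nothing is deleted from the crowded
        server (whether it should be is possible improvement 3 / #2107)."""
        empty = [s for s, held in placement.items() if not held]
        # every share beyond the first one on each server is "surplus"
        surplus = [(s, sh) for s, held in placement.items()
                   for sh in sorted(held)[1:]]
        moves = []
        for dst, (src, share) in zip(empty, surplus):
            placement[dst].add(share)   # in real life: upload a copy of the share
            moves.append((share, src, dst))
        return moves

    placement = {"SERV1": {1, 4}, "SERV2": {2}, "SERV3": {3}, "SERV4": set()}
    print(rebalance(placement))    # [(4, 'SERV1', 'SERV4')]
    print(placement["SERV4"])      # {4}: any single server can now be lost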

improvement 2: launch a repair job when needed (#614)

If a "check" job is running, and it detects a layout like:

 SERV1[share1,share4] SERV2[share2] SERV3[share3] SERV4[ ] 

or:

 SERV1[share1] SERV2[share2] SERV3[share3] SERV4[ ] 

What should it do then? Trigger a repair job, or leave well enough alone? That depends on the user's preferred trade-off between file health and bandwidth consumption. If the user has configured the setting that says "Try to keep the file spread across at least 4 servers", then it will trigger a repair. If the user has configured it to "Try to keep the file spread across at least 3 servers", then it will not, because doing so would just annoy the user by using up their network bandwidth.
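
A sketch of that policy decision, with illustrative names rather than any actual tahoe-lafs API: the checker triggers a repair only when the number of distinct servers holding shares falls below the user's configured threshold.

    def should_repair(placement, wanted_servers):
        """placement: dict server -> set of share numbers;
        wanted_servers: 'keep the file spread across at least N servers'."""
        servers_with_shares = sum(1 for held in placement.values() if held)
        return servers_with_shares < wanted_servers

    layout = {"SERV1": {1, 4}, "SERV2": {2}, "SERV3": {3}, "SERV4": set()}
    print(should_repair(layout, 4))  # True:  user wants >= 4 servers, so repair
    print(should_repair(layout, 3))  # False: 3 servers is acceptable, leave it alone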

This is the topic of #614. There is a patch from Mark Berger on that ticket, but I think there is disagreement or confusion over how it should work.

possible improvement 3: don't put multiple shares on a server (#2107)

Another possible change we could make is in the step where an upload-or-repair process is running, sees this state:

 SERV1[share1] SERV2[share2] SERV3[share3] SERV4[XXXXXX]

and decides to send an extra share to SERV1, resulting in this state:

 SERV1[share1,share4] SERV2[share2] SERV3[share3] SERV4[XXXXXX]

I used to think it was a good idea for the uploader/repairer to do this (provided we also implemented improvement 1 and improvement 2 above!), but now I've changed my mind; I explained my current reasoning on #2107. Possible improvement 3 is not provided by the #1382 branch. As far as I understand, the #1382 branch will go ahead and upload an extra share in this case.

comment:2 Changed at 2013-11-14T22:54:40Z by zooko

sickness: each of the three (possible) improvements listed in comment:1 has a separate ticket to track it. So, unless there are any other changes you think we should consider to help with this situation, we should close this ticket.

comment:3 Changed at 2013-11-14T23:58:48Z by daira

  • Resolution set to duplicate
  • Status changed from new to closed