.. -*- coding: utf-8-with-signature -*-

===================
URI Extension Block
===================

This block is a serialized dictionary with string keys and string values
(some of which represent numbers, some of which are SHA-256 hashes). All
buckets hold an identical copy. The hash of the serialized data is kept in
the URI.

The download process must obtain a valid copy of this data before any
decoding can take place. The download process must also obtain other data
before incremental validation can be performed. Full-file validation (for
clients who do not wish to do incremental validation) can be performed solely
with the data from this block.

At the moment, this data block contains the following keys (and an estimate
of their sizes)::

 size                5
 segment_size        7
 num_segments        2
 needed_shares       2
 total_shares        3

 codec_name          3
 codec_params        5+1+2+1+3=12
 tail_codec_params   12

 share_root_hash     32 (binary) or 52 (base32-encoded) each
 plaintext_hash
 plaintext_root_hash
 crypttext_hash
 crypttext_root_hash

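For concreteness, a hypothetical instance of this dictionary might look like
the following. All values are invented for illustration (including the codec
parameter strings, whose exact format is an assumption here); the hash
entries would hold raw 32-byte SHA-256 digests, abbreviated below::

 {
  "size": "200000",
  "segment_size": "131072",
  "num_segments": "2",
  "needed_shares": "3",
  "total_shares": "10",
  "codec_name": "crs",
  "codec_params": "131072-3-10",
  "tail_codec_params": "68928-3-10",
  "share_root_hash": <32-byte digest>,
  "plaintext_hash": <32-byte digest>,
  "plaintext_root_hash": <32-byte digest>,
  "crypttext_hash": <32-byte digest>,
  "crypttext_root_hash": <32-byte digest>,
 }
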
Some pieces are needed elsewhere (size should be visible without pulling the
block, the Tahoe3 algorithm needs total_shares to find the right peers, all
peer selection algorithms need needed_shares to ask a minimal set of peers).
Some pieces are arguably redundant but are convenient to have present
(test_encode.py makes use of num_segments).

The rule for this data block is that it should be a constant size for all
files, regardless of file size. Therefore hash trees (which have a size that
depends linearly upon the number of segments) are stored elsewhere in the
bucket, with only the hash tree root stored in this data block.

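To illustrate why only the root needs to live in this block, here is a
simplified sketch of reducing per-segment hashes to a single root with a
binary hash tree. It uses plain SHA-256 over concatenated children; the real
Tahoe hash tree differs in detail (tagged hashes, handling of partial
levels), so treat this as a model of the idea rather than the actual
algorithm::

 import hashlib

 def merkle_root(leaf_hashes):
     # Reduce a non-empty list of 32-byte leaf hashes to one root hash.
     # The tree (and therefore the per-bucket storage for it) grows with
     # num_segments, but the root kept in this block stays 32 bytes.
     level = list(leaf_hashes)
     while len(level) > 1:
         next_level = []
         for i in range(0, len(level) - 1, 2):
             next_level.append(hashlib.sha256(level[i] + level[i+1]).digest())
         if len(level) % 2:
             next_level.append(level[-1])   # odd node: carry it up unchanged
         level = next_level
     return level[0]
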
This block will be serialized as follows::

 assert that all keys match ^[a-zA-Z_\-]+$
 sort all the keys lexicographically
 for k in keys:
  write("%s:" % k)
  write(netstring(data[k]))

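A minimal Python sketch of this serializer, assuming string keys mapped to
byte-string values and the standard netstring encoding
(``<decimal length>:<data>,``), might look like the following; the function
names are illustrative, not Tahoe's actual API::

 import re

 def netstring(s):
     """Encode a byte string as a netstring: b'<decimal length>:<data>,'."""
     return b"%d:%s," % (len(s), s)

 def serialize_uri_extension(data):
     """Emit 'key:' + netstring(value) for each entry, keys in
     lexicographic order."""
     pieces = []
     for k in sorted(data):
         # keys are restricted to letters, underscore, and hyphen
         assert re.match(r"^[a-zA-Z_\-]+$", k), k
         pieces.append(k.encode("ascii") + b":" + netstring(data[k]))
     return b"".join(pieces)

The digest of the serialized block is then the sort of value that ends up in
the URI, e.g. ``hashlib.sha256(serialize_uri_extension(data)).digest()``,
although the real implementation may compute that hash differently (for
instance with a tagged hash).
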
Serialized size::

 dense binary (but decimal) packing: 160+46=206
 including 'key:' (185) and netstring (6*3+7*4=46) on values: 231
 including 'key:%d\n' (185+13=198) and printable values (46+5*52=306)=504

We'll go with the 231-byte block, and provide a tool to dump it as text if
we really want one.

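A minimal sketch of such a dump tool, assuming the serialization format above
(and again using illustrative rather than actual function names), could parse
the block back into a dictionary and print it::

 def parse_uri_extension(block):
     """Parse a 'key:' + netstring(value) byte sequence back into a dict."""
     data = {}
     i = 0
     while i < len(block):
         colon = block.index(b":", i)               # end of the key
         key = block[i:colon].decode("ascii")
         length_end = block.index(b":", colon + 1)  # end of the netstring length
         length = int(block[colon + 1:length_end])
         value = block[length_end + 1:length_end + 1 + length]
         assert block[length_end + 1 + length:length_end + 2 + length] == b","
         data[key] = value
         i = length_end + 2 + length
     return data

 def dump_uri_extension(block):
     """Print a human-readable view of the block, one key per line."""
     for k, v in sorted(parse_uri_extension(block).items()):
         print("%s: %r" % (k, v))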