.. -*- coding: utf-8-with-signature; fill-column: 77 -*-

======================================================
Using Tahoe-LAFS with an anonymizing network: Tor, I2P
======================================================

#. `Overview`_
#. `Use cases`_

#. `Software Dependencies`_

   #. `Tor`_
   #. `I2P`_

#. `Connection configuration`_

#. `Anonymity configuration`_

   #. `Client anonymity`_
   #. `Server anonymity, manual configuration`_
   #. `Server anonymity, automatic configuration`_

#. `Performance and security issues`_


Overview
========

Tor is an anonymizing network used to help hide the identity of internet
clients and servers. Please see the Tor Project's website for more information:
https://www.torproject.org/

I2P is a decentralized anonymizing network that focuses on end-to-end anonymity
between clients and servers. Please see the I2P website for more information:
https://geti2p.net/


Use cases
=========

There are three potential use-cases for Tahoe-LAFS on the client side:

1. User wishes to always use an anonymizing network (Tor, I2P) to protect
   their anonymity when connecting to Tahoe-LAFS storage grids (whether or
   not the storage servers are anonymous).

2. User does not care to protect their anonymity but they wish to connect to
   Tahoe-LAFS storage servers which are accessible only via Tor Hidden
   Services or I2P.

   * Tor is only used if a server connection hint uses ``tor:``. These hints
     generally have a ``.onion`` address.
   * I2P is only used if a server connection hint uses ``i2p:``. These hints
     generally have a ``.i2p`` address.

3. User does not care to protect their anonymity or to connect to anonymous
   storage servers. This document is not useful to you... so stop reading.

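The hint prefixes described in the second client use-case can be illustrated
with a small Python sketch. It is not Tahoe-LAFS code: the function names
``hint_transport`` and ``transports_for_location`` are hypothetical, and the
sketch only assumes that a hint starts with a ``type:`` prefix and that
several hints are separated by commas, as in the ``tub.location`` examples
later in this document.

```python
def hint_transport(hint):
    """Return the transport named by a connection hint's prefix."""
    prefix, sep, _rest = hint.partition(":")
    if not sep:
        raise ValueError(f"hint has no 'type:' prefix: {hint!r}")
    prefix = prefix.strip().lower()
    # Only the three hint types mentioned in this document are recognized.
    if prefix in ("tor", "i2p", "tcp"):
        return prefix
    return "unknown"


def transports_for_location(location):
    """Split a comma-separated list of hints and classify each one."""
    hints = [h.strip() for h in location.split(",") if h.strip()]
    return [hint_transport(h) for h in hints]


if __name__ == "__main__":
    print(transports_for_location("tcp:HOST:3457,tor:u33m4y7klhz3b.onion:3000"))
```

A client in use-case 2 would only need Tor for the ``tor`` entries and I2P
for the ``i2p`` entries; ``tcp`` entries are reached directly.
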

For Tahoe-LAFS storage servers there are three use-cases:

1. The operator wishes to protect their anonymity by making their Tahoe
   server accessible only over I2P, via Tor Hidden Services, or both.

2. The operator does not *require* anonymity for the storage server, but they
   want it to be available over both publicly routed TCP/IP and through an
   anonymizing network (I2P, Tor Hidden Services). One possible reason to do
   this is that being reachable through an anonymizing network is a
   convenient way to bypass a NAT or firewall that prevents publicly routed
   TCP/IP connections to your server (for clients capable of connecting to
   such servers). Another is that making your storage server reachable
   through an anonymizing network can provide better protection for your
   clients who themselves use that anonymizing network to protect their
   anonymity.

3. Storage server operator does not care to protect their own anonymity nor
   to help the clients protect theirs. Stop reading this document and run
   your Tahoe-LAFS storage server using publicly routed TCP/IP.


See this Tor Project page for more information about Tor Hidden Services:
https://www.torproject.org/docs/hidden-services.html.en

See this I2P Project page for more information about I2P:
https://geti2p.net/en/about/intro


Software Dependencies
=====================

Tor
---

Clients who wish to connect to Tor-based servers must install the following.

* Tor (tor) must be installed. See here:
  https://www.torproject.org/docs/installguide.html.en . On Debian/Ubuntu,
  use ``apt-get install tor``. You can also install and run the Tor Browser
  Bundle.

* Tahoe-LAFS must be installed with the ``[tor]`` "extra" enabled. This will
  install ``txtorcon`` ::

    pip install tahoe-lafs[tor]

Manually-configured Tor-based servers must install Tor, but do not need
``txtorcon`` or the ``[tor]`` extra. Automatic configuration (see `Server
anonymity, automatic configuration`_) needs these, just like clients.

I2P
---

Clients who wish to connect to I2P-based servers must install the following.
As with Tor, manually-configured I2P-based servers need the I2P daemon, but
no special Tahoe-side supporting libraries.

* I2P must be installed. See here:
  https://geti2p.net/en/download

* The SAM API must be enabled.

  * Start I2P.
  * Visit http://127.0.0.1:7657/configclients in your browser.
  * Under "Client Configuration", check the "Run at Startup?" box for "SAM
    application bridge".
  * Click "Save Client Configuration".
  * Click the "Start" control for "SAM application bridge", or restart I2P.

* Tahoe-LAFS must be installed with the ``[i2p]`` extra enabled, to get
  ``txi2p`` ::

    pip install tahoe-lafs[i2p]

Both Tor and I2P
----------------

Clients who wish to connect to both Tor- and I2P-based servers must install
all of the above. In particular, Tahoe-LAFS must be installed with both
extras enabled::

  pip install tahoe-lafs[tor,i2p]



Connection configuration
========================

See :ref:`Connection Management` for a description of the ``[tor]`` and
``[i2p]`` sections of ``tahoe.cfg``. These control how the Tahoe client will
connect to a Tor/I2P daemon, and thus make connections to Tor/I2P-based
servers.

The ``[tor]`` and ``[i2p]`` sections only need to be modified to use unusual
configurations, or to enable automatic server setup.

The default configuration will attempt to contact a local Tor/I2P daemon
listening on the usual ports (9050/9150 for Tor, 7656 for I2P). As long as
there is a daemon running on the local host, and the necessary support
libraries were installed, clients will be able to use Tor-based servers
without any special configuration.

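Before relying on the default configuration, it can help to confirm that
something is actually listening on those ports. The following is a minimal
sketch using only the Python standard library; the names ``DEFAULT_PORTS``,
``is_listening`` and ``find_local_daemons`` are made up for this example, and
a successful TCP connection only shows that *some* process accepts
connections on the port, not that it is really Tor or I2P.

```python
import socket

# Default local ports named in the text above (assumed unchanged on this host).
DEFAULT_PORTS = {
    "tor": (9050, 9150),
    "i2p": (7656,),
}


def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_local_daemons(ports=DEFAULT_PORTS):
    """Map each network name to the candidate ports that accept connections."""
    return {
        name: [p for p in candidates if is_listening(p)]
        for name, candidates in ports.items()
    }


if __name__ == "__main__":
    for name, open_ports in find_local_daemons().items():
        status = f"listening on {open_ports}" if open_ports else "not found"
        print(f"{name}: {status}")
```
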

However, note that this default configuration does not improve the client's
anonymity: normal TCP connections will still be made to any server that
offers a regular address (it fulfills the second client use case above, not
the first). To protect their anonymity, users must configure the
``[connections]`` section as follows::

  [connections]
  tcp = tor

With this in place, the client will use Tor (instead of an
IP-address-revealing direct connection) to reach TCP-based servers.

Anonymity configuration
=======================

Tahoe-LAFS provides a configuration "safety flag" for explicitly stating
whether or not IP-address privacy is required for a node::

  [node]
  reveal-IP-address = (boolean, optional)

When ``reveal-IP-address = False``, Tahoe-LAFS will refuse to start if any of
the configuration options in ``tahoe.cfg`` would reveal the node's network
location:

* ``[connections] tcp = tor`` is required: otherwise the client would make
  direct connections to the Introducer, or any TCP-based servers it learns
  from the Introducer, revealing its IP address to those servers and a
  network eavesdropper. With this in place, Tahoe-LAFS will only make
  outgoing connections through a supported anonymizing network.

* ``tub.location`` must either be disabled, or contain safe values. This
  value is advertised to other nodes via the Introducer: it is how a server
  advertises its location so clients can connect to it. In private mode, it
  is an error to include a ``tcp:`` hint in ``tub.location``. Private mode
  also rejects the default value of ``tub.location`` (used when the key is
  missing entirely), which is ``AUTO``: this uses ``ifconfig`` to guess the
  node's external IP address, which would reveal it to the server and other
  clients.

This option is **critical** to preserving the client's anonymity (client
use-case 1 from `Use cases`_, above). It is also necessary to preserve a
server's anonymity (server use-case 1).

This flag can be set (to False) by providing the ``--hide-ip`` argument to
the ``create-node``, ``create-client``, or ``create-introducer`` commands.

Note that the default value of ``reveal-IP-address`` is True, because
unfortunately hiding the node's IP address requires additional software to be
installed (as described above), and reduces performance.

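The two rules above can be restated as a small standalone check. This is only
a sketch of the rules as this document describes them, written with the
standard-library ``configparser``; it is not the check that Tahoe-LAFS itself
performs, and the function name ``privacy_problems`` and its messages are
invented for the example.

```python
import configparser


def privacy_problems(cfg_text):
    """List the settings that conflict with reveal-IP-address = False."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(cfg_text)

    # The flag defaults to True, in which case nothing is enforced.
    if cfg.getboolean("node", "reveal-IP-address", fallback=True):
        return []

    problems = []
    # Rule 1: outgoing TCP connections must be routed through Tor.
    if cfg.get("connections", "tcp", fallback="tcp").strip().lower() != "tor":
        problems.append("[connections] tcp must be 'tor'")

    # Rule 2: a missing tub.location means AUTO, which guesses the external IP.
    location = cfg.get("node", "tub.location", fallback="AUTO").strip()
    if location != "disabled":
        for hint in (h.strip() for h in location.split(",")):
            if hint == "AUTO":
                problems.append("tub.location must not be AUTO (or missing)")
            elif hint.lower().startswith("tcp:"):
                problems.append(f"tub.location contains a tcp: hint: {hint}")
    return problems


if __name__ == "__main__":
    example = "[node]\nreveal-IP-address = False\n"
    for problem in privacy_problems(example):
        print(problem)
```

An empty list means the sketch found nothing that contradicts the flag; it
says nothing about other ways a node might leak its address.
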
Client anonymity
----------------

To configure a client node for anonymity, ``tahoe.cfg`` **must** contain the
following configuration flags (the ``[connections]`` entry is the one
required by `Anonymity configuration`_, above)::

  [node]
  reveal-IP-address = False
  tub.port = disabled
  tub.location = disabled

  [connections]
  tcp = tor

Once the Tahoe-LAFS node has been restarted, it can be used anonymously (client
use-case 1).

Server anonymity, manual configuration
--------------------------------------

To configure a server node to listen on an anonymizing network, we must first
configure Tor to run an "Onion Service", and route inbound connections to the
local Tahoe port. Then we configure Tahoe to advertise the ``.onion`` address
to clients. We also configure Tahoe to not make direct TCP connections.

* Decide on a local listening port number, named PORT. This can be any unused
  port from about 1024 up to 65535 (depending upon the host's kernel/network
  config). We will tell Tahoe to listen on this port, and we'll tell Tor to
  route inbound connections to it.
* Decide on an external port number, named VIRTPORT. This will be used in the
  advertised location, and revealed to clients. It can be any number from 1
  to 65535. It can be the same as PORT, if you like.
* Decide on a "hidden service directory", usually in ``/var/lib/tor/NAME``.
  We'll be asking Tor to save the onion-service state here, and Tor will
  write the ``.onion`` address here after it is generated.

Then, do the following:

* Create the Tahoe server node (with ``tahoe create-node``), but do **not**
  launch it yet.

* Edit the Tor config file (typically in ``/etc/tor/torrc``). We need to add
  a section to define the hidden service. If our PORT is 2000, VIRTPORT is
  3000, and we're using ``/var/lib/tor/tahoe`` as the hidden service
  directory, the section should look like::

    HiddenServiceDir /var/lib/tor/tahoe
    HiddenServicePort 3000 127.0.0.1:2000

* Restart Tor, with ``systemctl restart tor``. Wait a few seconds.

* Read the ``hostname`` file in the hidden service directory (e.g.
  ``/var/lib/tor/tahoe/hostname``). This will be a ``.onion`` address, like
  ``u33m4y7klhz3b.onion``. Call the part before ``.onion`` ONION.

* Edit ``tahoe.cfg`` to set ``tub.port`` to use
  ``tcp:PORT:interface=127.0.0.1``, and ``tub.location`` to use
  ``tor:ONION.onion:VIRTPORT``. Using the examples above, this would be::

    [node]
    reveal-IP-address = false
    tub.port = tcp:2000:interface=127.0.0.1
    tub.location = tor:u33m4y7klhz3b.onion:3000
    [connections]
    tcp = tor

* Launch the Tahoe server with ``tahoe run $NODEDIR``

The ``tub.port`` setting will cause the Tahoe server to listen on PORT, but
bind the listening socket to the loopback interface, which is not reachable
from the outside world (but *is* reachable by the local Tor daemon). Then the
``tcp = tor`` setting causes Tahoe to use Tor when connecting to the
Introducer, hiding its IP address. The node will then announce itself to all
clients using ``tub.location``, so clients will know that they must use Tor
to reach this server (and the announcement does not reveal its IP address).
When clients connect to the onion address, their packets will flow through
the anonymizing network and eventually land on the local Tor daemon, which
will then make a connection to PORT on localhost, which is where Tahoe is
listening for connections.

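Because PORT, VIRTPORT and the onion address each appear in more than one
place, it is easy to mistype one of them. The sketch below generates both
fragments from the same values. The helper names ``torrc_section`` and
``tahoe_cfg_section`` are hypothetical, and the output simply mirrors the
examples shown above; nothing here edits ``torrc`` or ``tahoe.cfg`` for you.

```python
def check_port(name, value, lowest):
    """Reject port numbers outside the range described in the steps above."""
    if not lowest <= int(value) <= 65535:
        raise ValueError(f"{name} must be between {lowest} and 65535")
    return int(value)


def torrc_section(hs_dir, virtport, port):
    """Lines to add to torrc for one onion service."""
    virtport = check_port("VIRTPORT", virtport, 1)
    port = check_port("PORT", port, 1024)
    return (
        f"HiddenServiceDir {hs_dir}\n"
        f"HiddenServicePort {virtport} 127.0.0.1:{port}\n"
    )


def tahoe_cfg_section(onion_host, virtport, port):
    """The [node] and [connections] settings for a Tor-only server."""
    # onion_host is the full contents of the hostname file, e.g. "xyz.onion".
    if not onion_host.endswith(".onion"):
        raise ValueError("expected a hostname ending in .onion")
    virtport = check_port("VIRTPORT", virtport, 1)
    port = check_port("PORT", port, 1024)
    return (
        "[node]\n"
        "reveal-IP-address = false\n"
        f"tub.port = tcp:{port}:interface=127.0.0.1\n"
        f"tub.location = tor:{onion_host}:{virtport}\n"
        "[connections]\n"
        "tcp = tor\n"
    )


if __name__ == "__main__":
    print(torrc_section("/var/lib/tor/tahoe", 3000, 2000))
    print(tahoe_cfg_section("u33m4y7klhz3b.onion", 3000, 2000))
```
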
Follow a similar process to build a Tahoe server that listens on I2P. The
same process can be used to listen on both Tor and I2P (``tub.location =
tor:ONION.onion:VIRTPORT,i2p:ADDR.i2p``). It can also listen on both Tor and
plain TCP (use-case 2), with ``tub.port = tcp:PORT``, ``tub.location =
tcp:HOST:PORT,tor:ONION.onion:VIRTPORT``, and ``reveal-IP-address = true``
(the default). In that case, omit the ``tcp = tor`` setting, as the address
is already being broadcast through the location announcement.


Server anonymity, automatic configuration
-----------------------------------------

To configure a server node to listen on an anonymizing network, create the
node with the ``--listen=tor`` option. This requires a Tor configuration that
either launches a new Tor daemon, or has access to the Tor control port (and
enough authority to create a new onion service). On Debian/Ubuntu systems, do
``apt install tor``, add yourself to the control group with ``adduser
YOURUSERNAME debian-tor``, and then log out and log back in: if the
``groups`` command includes ``debian-tor`` in the output, you should have
permission to use the unix-domain control port at ``/var/run/tor/control``.

This option will set ``reveal-IP-address = False`` and ``[connections] tcp =
tor``. It will allocate the necessary ports, instruct Tor to create the onion
service (saving the private key somewhere inside NODEDIR/private/), obtain
the ``.onion`` address, and populate ``tub.port`` and ``tub.location``
correctly.


Performance and security issues
===============================

If you are running a server which does not itself need to be
anonymous, should you make it reachable via an anonymizing network or
not? Or should you make it reachable *both* via an anonymizing network
and as a publicly traceable TCP/IP server?

There are several trade-offs affected by this decision.

NAT/Firewall penetration
------------------------

Making a server reachable via Tor or I2P makes it reachable (by
Tor/I2P-capable clients) even if there are NATs or firewalls preventing
direct TCP/IP connections to the server.

Anonymity
---------

Making a Tahoe-LAFS server accessible *only* via Tor or I2P can be used to
guarantee that the Tahoe-LAFS clients use Tor or I2P to connect
(specifically, the server should only advertise Tor/I2P addresses in the
``tub.location`` config key). This prevents misconfigured clients from
accidentally de-anonymizing themselves by connecting to your server through
the traceable Internet.

Clearly, a server which is available as both a Tor/I2P service *and* a
regular TCP address is not itself anonymous: the .onion address and the real
IP address of the server are easily linkable.

Also, interaction, through Tor, with a Tor Hidden Service may be more
protected from network traffic analysis than interaction, through Tor,
with a publicly traceable TCP/IP server.

**XXX is there a document maintained by Tor developers which substantiates or refutes this belief?
If so we need to link to it. If not, then maybe we should explain more here why we think this?**

Linkability
-----------

As of 1.12.0, the node uses a single persistent Tub key for outbound
connections to the Introducer, and inbound connections to the Storage Server
(and Helper). For clients, a new Tub key is created for each storage server
we learn about, and these keys are *not* persisted (so they will change each
time the client reboots).

Clients traversing directories (from rootcap to subdirectory to filecap) are
likely to request the same storage-indices (SIs) in the same order each time.
A client connected to multiple servers will ask them all for the same SI at
about the same time. And two clients which are sharing files or directories
will visit the same SIs (at various times).

As a result, the following things are linkable, even with ``reveal-IP-address
= false``:

* Storage servers can recognize multiple connections from the same
  not-yet-rebooted client. (Note that the upcoming Accounting feature may
  cause clients to present a persistent client-side public key when
  connecting, which will be a much stronger linkage).
* Storage servers can probably deduce which client is accessing data, by
  looking at the SIs being requested. Multiple servers can collude to
  determine that the same client is talking to all of them, even though the
  TubIDs are different for each connection.
* Storage servers can deduce when two different clients are sharing data.
* The Introducer could deliver different server information to each
  subscribed client, to partition clients into distinct sets according to
  which server connections they eventually make. For client+server nodes, it
  can also correlate the server announcement with the deduced client
  identity.

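To make the SI-based linkage concrete, here is a toy illustration of how
colluding servers might compare the sets of storage indices requested on
different connections. Everything in it is hypothetical: the connection
labels, the SI strings, the Jaccard-overlap measure and the 0.5 threshold are
assumptions chosen for the example, not a description of any real attack tool
or of Tahoe-LAFS internals.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two collections, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def linkable_pairs(requests_by_connection, threshold=0.5):
    """Return pairs of connection labels whose requested SIs overlap heavily."""
    pairs = []
    items = sorted(requests_by_connection.items())
    for (conn1, sis1), (conn2, sis2) in combinations(items, 2):
        if jaccard(sis1, sis2) >= threshold:
            pairs.append((conn1, conn2))
    return pairs


if __name__ == "__main__":
    # Made-up request logs pooled from two servers.
    observed = {
        "server-A/conn-1": ["si-root", "si-docs", "si-report"],
        "server-B/conn-7": ["si-root", "si-docs", "si-report"],
        "server-A/conn-2": ["si-other"],
    }
    print(linkable_pairs(observed))
```

Different TubIDs on ``conn-1`` and ``conn-7`` do not help here, because the
comparison never looks at them; only the requested SIs matter.
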
Performance
-----------

A client connecting to a publicly traceable Tahoe-LAFS server through Tor
incurs substantially higher latency and sometimes worse throughput than the
same client connecting to the same server over a normal traceable TCP/IP
connection. When the server is on a Tor Hidden Service, it incurs even more
latency, and possibly even worse throughput.

Connecting to Tahoe-LAFS servers which are I2P servers incurs higher latency
and worse throughput too.

Positive and negative effects on other Tor users
------------------------------------------------

Sending your Tahoe-LAFS traffic over Tor adds cover traffic for other
Tor users who are also transmitting bulk data. So that is good for
them -- increasing their anonymity.

However, it makes the performance of other Tor users' interactive
sessions -- e.g. ssh sessions -- much worse. This is because Tor
doesn't currently have any prioritization or quality-of-service
features, so someone else's ssh keystrokes may have to wait in line
while your bulk file contents get transmitted. The added delay might
make other people's interactive sessions unusable.

Both of these effects are doubled if you upload or download files to a
Tor Hidden Service, as compared to if you upload or download files
over Tor to a publicly traceable TCP/IP server.

Positive and negative effects on other I2P users
------------------------------------------------

Sending your Tahoe-LAFS traffic over I2P adds cover traffic for other I2P users
who are also transmitting data. So that is good for them -- increasing their
anonymity. It will not directly impair the performance of other I2P users'
interactive sessions, because the I2P network has several congestion control and
quality-of-service features, such as prioritizing smaller packets.

However, if many users are sending Tahoe-LAFS traffic over I2P, and do not have
their I2P routers configured to participate in much traffic, then the I2P
network as a whole will suffer degradation. Each Tahoe-LAFS node using I2P has
its own anonymizing tunnels through which its data is sent. On average, one
Tahoe-LAFS node requires 12 other I2P routers to participate in its tunnels.

It is therefore important that your I2P router is sharing bandwidth with other
routers, so that you can give back as you use I2P. This will never impair the
performance of your Tahoe-LAFS node, because your I2P router will always
prioritize your own traffic.