Opened at 2009-02-10T08:24:29Z
Last modified at 2015-04-27T00:16:26Z
#615 assigned defect
Can JavaScript loaded from Tahoe access all your content which is loaded from Tahoe?
Reported by: | zooko | Owned by: | davidsarah |
---|---|---|---|
Priority: | critical | Milestone: | soon |
Component: | code-frontend-web | Version: | 1.3.0 |
Keywords: | newcaps confidentiality integrity preservation capleak gsoc websec | Cc: | nejucomo |
Launchpad Bug: |
Description (last modified by zooko)
Several web security experts (who will remain unnamed in this ticket since they have yet to show me a working exploit) have said that if you have a page containing JavaScript in one window or tab of a web browser, and another page in a different window or tab of that browser, then the web browser will inspect the "origin" of the JavaScript and the "origin" of the other page to decide whether the JavaScript will be allowed to read or change parts of the other page (including its URL).
By "origin", these web security experts tell me, web browsers mean "host and port number" (or possibly they look at only the top two elements of the host domain name). Since all pages that are stored on tahoe and that you are viewing in a web browser are coming from the same host (sometimes localhost or 127.0.0.1) and port number, this means any JavaScript that you view through your tahoe node can access all the URLs of all the other pages you have loaded (or possibly have ever loaded since you launched your browser) from Tahoe. (Furthermore, just to make things worse, these web security experts allege that it might be possible for the JavaScript program to stay running in your browser even after you close that tab or window and continue to access your other tabs or windows which were loaded from the same "origin".)
If true, this is bad, because those other pages, although they are loaded from the same host and port number, could actually be from very different origins. One might be a cute game that you want to play that was passed along from a friend of a friend. Another might be your personal finance database with all of your bank account numbers and billing information. We would like it if the web browser would allow you to play the fun game in one window, and edit your personal finance document in another window, without giving the game the ability to read (and therefore to upload) or change your personal document, even though both pages were loaded from http://127.0.0.1:4567 or from http://testgrid.allmydata.org:3567 or whatever.
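To make the concern concrete, here is a minimal sketch of the comparison a browser performs. It assumes the usual same-origin rule, in which the compared value is the (scheme, host, port) triple (the ticket mentions only host and port). The cap strings and file names in the URLs are made up for illustration.

```python
from urllib.parse import urlsplit

# Default ports, used when a URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple used for same-origin checks."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a, b):
    """True if a browser would treat the two URLs as the same origin."""
    return origin(a) == origin(b)


# Hypothetical URLs: two unrelated files served by the same Tahoe gateway.
game = "http://127.0.0.1:4567/uri/URI:DIR2-RO:aaaa/game.html"
finance = "http://127.0.0.1:4567/uri/URI:DIR2:bbbb/finance.html"

print(origin(game))
print(same_origin(game, finance))
```

The path, which is the only part that carries the cap, plays no role in the comparison, so the game and the finance document end up in the same origin.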
In the long run it might be possible for us to arrange to do this, such as by embedding a unique string, possibly the verifycap or possibly an incrementing string, into the domain name, or by taking advantage of some not-yet-created mechanism to tell web browsers "No, no, these two things are of different origins even though they are loaded from the same host and port.".
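A rough sketch of the "unique string in the domain name" idea follows. Everything here is an assumption made for illustration: the label is a hash of the verifycap rather than the verifycap itself, `tahoe.example` stands in for a domain with wildcard DNS pointing at the gateway, and the port and function names are invented.

```python
import base64
import hashlib


def origin_label(verifycap):
    """Derive a DNS-safe label from a verifycap (hypothetical scheme).

    SHA-256 gives 32 bytes; base32 without padding is 52 characters,
    which fits within the 63-character limit for a DNS label.
    """
    digest = hashlib.sha256(verifycap.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def isolated_url(verifycap, base_domain="tahoe.example", port=3456, path="/"):
    """Build a URL whose host part is unique to the given verifycap."""
    return "http://%s.%s:%d%s" % (origin_label(verifycap), base_domain, port, path)


# Made-up verifycap strings; real ones look different.
print(isolated_url("URI:DIR2-Verifier:aaaa"))
print(isolated_url("URI:DIR2-Verifier:bbbb"))
```

Because the host differs per object, the browser would treat the two pages as different origins. This only helps when the gateway is reachable under such names, which is not the case for a plain 127.0.0.1 setup.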
In the short run, it might be wise to avoid looking at pages in tahoe if they might have malicious content on them, unless you first turn off JavaScript in your web browser. Hopefully someone will help us understand exactly how dangerous this situation is, by posting a working exploit or some sort of proof that it is safe.
Change History (30)
comment:1 in reply to: ↑ description ; follow-up: ↓ 6 Changed at 2009-02-10T15:13:29Z by swillden
comment:2 Changed at 2009-03-08T22:01:29Z by warner
- Component changed from unknown to code-frontend-web
- Owner nobody deleted
comment:3 Changed at 2009-10-28T05:03:51Z by zooko
comment:4 Changed at 2009-10-28T05:05:45Z by zooko
#127 was also an old ticket that, if I understand it correctly, mentioned both this issue and the different issue of "Referer Header cap leakage" (which is an issue deserving of a ticket of its own, but apparently not currently having one).
comment:5 Changed at 2009-10-28T06:32:32Z by davidsarah
- Keywords newcaps security added
- Priority changed from major to critical
#821 (now reopened) describes a less serious security problem that would still be present even if every page had a distinct origin. Note that the fix suggested for that bug will only work if this one is also fixed, i.e. #821 is dependent on this bug.
#127 seems to be almost exclusively about Referer header cap leakage, and I've changed its summary to reflect that.
comment:6 in reply to: ↑ 1 Changed at 2009-10-29T23:09:51Z by davidsarah
Replying to swillden:
Another option is to use cookies. A cookie can be made specific not only to a host/domain but also to a path. As I understand it (haven't tested), Javascript loaded from path A should not have access to cookies set specific to path B. If Tahoe were to set per-path cookies on first access to a path, then refuse later requests that don't include the right cookie, then Javascript from path B would not be able to successfully load URLs on path A, because it wouldn't have the cookie.
There are numerous downsides to the cookie approach ...
Yes. The following paper (which is essential reading for this ticket) explains why this can't work from a security point of view:
- Beware of Finer-Grained Origins
- Collin Jackson and Adam Barth
- In Web 2.0 Security and Privacy. (W2SP 2008)
- http://crypto.stanford.edu/websec/origins/fgo.pdf
- "Cookie Paths. One classic example of a sub-origin privilege is the ability to read cookies with "path" attributes. In order to read such a cookie, the path of the document's URL must extend the path of the cookie. However, the ability to read these cookies leaks to all documents in the origin because a same-origin document can inject script into a document with the appropriate path (even a 404 "not found" document) and read the cookies. This "vulnerability" has been known for a number of years ... This vulnerability was "fixed" by declaring the path attribute to be a convenience feature rather than a security feature."
comment:7 Changed at 2009-11-01T06:41:25Z by davidsarah
If you like this bug, you may also like #827 (Support forcing download using "Content-Disposition: attachment" in WUI).
comment:8 Changed at 2009-11-07T07:56:45Z by davidsarah
I believe I have a solution for this:
- For file types that are not viewable in typical browsers, clicking the file link would download it as per #827. This limits the problem to the small number of types where not being able to view them directly in the browser is a significant usability problem ([X]HTML, images, and text).
- Images and text are easy, since they don't contain scripts (provided that we can defeat browser sniffing that might cause it to treat files served as these types as something more dangerous).
- The difficult problem is [X]HTML. For that case, we can serve a page containing a "parent script", and a full-page iframe with src="javascript:child_script". javascript: URLs are (or should be) treated as having a special origin that does not compare equal to any other origin, even one for an identical URL. So now we have two scripts running in different origins that are able to obtain references to each other, which implies that they can communicate using a cross-origin comms technique such as Subspace ( http://www.collinjackson.com/research/papers/fp801-jackson.pdf ). The parent script then loads the actual [X]HTML of the page using an XMLHttpRequest, and passes it to the child script, which rewrites its own frame with that content. The parent script shuts down the comms channel immediately after passing the content, so that scripts in the loaded page can't use it.
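The first two bullets could be sketched as a choice of response headers, as below. This is only an illustration and not WUI code: the set of inline types is an assumption, and `X-Content-Type-Options: nosniff` is one way to ask browsers not to sniff content that this ticket does not itself propose. [X]HTML is not in the inline set here, so it is downloaded. The iframe technique in the third bullet is not covered.

```python
# Types assumed safe to show inline because they cannot carry scripts.
SAFE_INLINE_TYPES = {"text/plain", "image/png", "image/jpeg", "image/gif"}


def response_headers(content_type, filename):
    """Pick headers for a stored file: show inline only if the type is safe."""
    headers = {"X-Content-Type-Options": "nosniff"}
    base_type = content_type.split(";", 1)[0].strip().lower()
    if base_type in SAFE_INLINE_TYPES:
        headers["Content-Type"] = content_type
    else:
        # Drop characters that would break the quoted filename parameter.
        safe_name = "".join(
            c for c in filename if c.isprintable() and c not in '"\\/'
        )
        headers["Content-Type"] = "application/octet-stream"
        headers["Content-Disposition"] = 'attachment; filename="%s"' % safe_name
    return headers


print(response_headers("image/png", "cat.png"))
print(response_headers("text/html", "game.html"))
```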
(I originally thought that it would be possible to create a blank iframe using src="about:blank", and have the parent script inject HTML into it directly using part of the technique described in http://softwareas.com/injecting-html-into-an-iframe . However, if that were possible then it would be a browser security bug, because you shouldn't be able to inject content into a frame with a different origin even if you have a direct reference to it. And we don't really want to rely on exploiting browser security bugs ;-)
Anyway, I think this adequately isolates the injected page. Obviously it needs extensive testing in different browsers; we're relying on the fact that, although the injected page can obtain a reference to its parent (which has an origin shared by other WUI pages) using window.top, the same-origin policy shouldn't allow it to arbitrarily interfere with that parent (even though it can communicate with it). So this is not an example of the "sub-origin" approaches that are criticised in the Jackson/Barth paper.
img tags in the injected page should still work because those aren't subject to the same-origin policy. (It would be a bug if web content could read the pixels of an image, but that wouldn't be a Tahoe-specific bug.) Similarly for nested frames or iframes in the injected page (the contents of these shouldn't be accessible to the injected page because their origins won't compare equal to the unique origin generated for the javascript: URL).
comment:9 Changed at 2009-11-07T07:59:23Z by davidsarah
Correction: the last line of the previous comment should end "... to the javascript: origin).".
comment:10 follow-up: ↓ 19 Changed at 2009-11-07T08:09:56Z by davidsarah
Ooh, this is interesting:
http://www.whatwg.org/specs/web-apps/current-work/multipage/origin-0.html
If url identifies a resource that is its own trust domain (e.g. it identifies an e-mail on an IMAP server or a post on an NNTP server) then return a globally unique identifier specific to the resource identified by url, so that if this algorithm is invoked again for URLs that identify the same resource, the same identifier will be returned.
If url does not use a server-based naming authority, or if parsing url failed, or if url is not an absolute URL, then return a new globally unique identifier.
I don't know whether this is new proposed HTML5 behaviour, or what browsers currently implement. If the latter, then we could try using an IMAP or NNTP server for the WUI -- bizarre, but possibly simpler than my iframe suggestion above, if it works.
comment:11 Changed at 2009-12-04T04:24:40Z by davidsarah
- Keywords confidentiality integrity dataloss added; security removed
comment:12 Changed at 2009-12-13T03:28:14Z by davidsarah
- Keywords preservation added; dataloss removed
comment:13 Changed at 2010-01-17T14:52:12Z by davidsarah
- Keywords capleak added
comment:14 Changed at 2010-02-23T03:09:02Z by zooko
- Milestone changed from undecided to 2.0.0
comment:15 Changed at 2010-03-12T17:47:01Z by davidsarah
- Keywords gsoc added
comment:16 Changed at 2010-04-12T19:16:36Z by davidsarah
- Milestone changed from 2.0.0 to 1.8.0
- Owner set to davidsarah
- Status changed from new to assigned
comment:17 Changed at 2010-07-26T04:52:23Z by zooko
Wade Simmons tried to figure out how to exploit this and couldn't do it: http://tahoe-lafs.org/pipermail/tahoe-dev/2010-July/004787.html
comment:18 Changed at 2010-08-06T01:37:43Z by zooko
- Milestone changed from 1.8.0 to soon
comment:19 in reply to: ↑ 10 Changed at 2011-07-30T22:55:07Z by davidsarah
Replying to davidsarah:
Ooh, this is interesting:
http://www.whatwg.org/specs/web-apps/current-work/multipage/origin-0.html
If url identifies a resource that is its own trust domain (e.g. it identifies an e-mail on an IMAP server or a post on an NNTP server) then return a globally unique identifier specific to the resource identified by url, so that if this algorithm is invoked again for URLs that identify the same resource, the same identifier will be returned.
If url does not use a server-based naming authority, or if parsing url failed, or if url is not an absolute URL, then return a new globally unique identifier.
I don't know whether this is new proposed HTML5 behaviour, or what browsers currently implement. If the latter, then we could try using an IMAP or NNTP server for the WUI -- bizarre, but possibly simpler than my iframe suggestion above, if it works.
Doesn't work, because Firefox 5 doesn't support news: or nntp: or imap: internally.
comment:20 Changed at 2011-07-30T23:01:31Z by davidsarah
Essential reading on how different browsers handle unique origins (as needed for comment:8 and similar fixes to work): http://code.google.com/p/browsersec/wiki/Part2#Origin_inheritance_rules
comment:21 Changed at 2011-11-29T19:34:34Z by warner
FYI, here's a description of how the browser's window.history JS interface works: http://www.adequatelygood.com/2010/7/Saner-HTML5-History-Management , which relates to the "back-jacking" attack.
comment:22 Changed at 2012-03-29T16:09:30Z by zooko
All right, what does it take to make progress on this ticket? I have seen a demo exploit that relies on the user following a link from protected content to malicious content -- the "back-jacking" attack. A good way to make progress on this ticket would be to make a system test that exercises the system through a live browser and demonstrates the attack! That would be cool. Anybody game to do that?
If not, another good way to make progress on this ticket would be to start implementing David-Sarah's technique from comment:8. Maybe the first step on that would be to write a design document specifying exactly what the comment:8 technique accomplishes? Maybe we should create a new ticket just for the comment:8 technique and retire this ticket?
comment:23 follow-up: ↓ 24 Changed at 2012-03-29T18:49:14Z by davidsarah
Mozilla and other browsers have been making good progress recently on implementing the HTML5 sandbox spec. That's a better approach than what I suggested in comment:8, since it's making use of a fully specified browser feature rather than the behaviour of an implementation-dependent corner case. So, as long as we only relied on the specified behaviour, any security holes in it would be browser bugs and would be the vendors' responsibility to fix.
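A minimal sketch of how a wrapper page might use the sandbox attribute, assuming the gateway serves stored HTML inside an iframe. The function name and the URL are hypothetical. The relevant part of the spec is that a sandboxed frame without the `allow-same-origin` token is treated as having a unique origin, even if `allow-scripts` is granted.

```python
from html import escape


def sandbox_wrapper(content_url, allow_scripts=True):
    """Return a wrapper page that frames content_url in a sandboxed iframe.

    'allow-same-origin' is deliberately never emitted, so the framed
    document should get a unique origin in browsers that implement sandbox.
    """
    tokens = "allow-scripts" if allow_scripts else ""
    return (
        "<!DOCTYPE html>\n"
        '<html><body style="margin:0">\n'
        '<iframe sandbox="%s" src="%s" '
        'style="border:0;width:100%%;height:100vh"></iframe>\n'
        "</body></html>\n"
    ) % (tokens, escape(content_url, quote=True))


# Hypothetical gateway URL for a stored HTML file.
print(sandbox_wrapper("http://127.0.0.1:3456/uri/URI:CHK:aaaa/game.html"))
```

One caveat: the attribute only takes effect when the content is loaded through the wrapper. If the user navigates directly to the content URL, nothing is sandboxed, so the content response itself would also need protecting.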
comment:24 in reply to: ↑ 23 Changed at 2012-03-31T02:04:32Z by davidsarah
Replying to davidsarah:
Mozilla and other browsers have been making good progress recently on implementing the HTML5 sandbox spec.
The Mozilla ticket is https://bugzilla.mozilla.org/show_bug.cgi?id=341604 .
comment:25 Changed at 2012-07-05T12:30:08Z by ChosenOne
One could use Content Security Policy (CSP) to disallow any JavaScript except the one that tahoe needs to operate.
This will break WebApps on tahoe, but foil attacks too. Mh.
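As a sketch of what that could look like, the gateway might attach a different policy depending on whether a response is its own UI or stored content. The split into two policies and the function name are assumptions. `script-src` and `object-src` are standard CSP directives.

```python
def csp_for(is_stored_content):
    """Return a Content-Security-Policy header value (illustrative only)."""
    if is_stored_content:
        # Stored files may not run any script or plugin content.
        return "script-src 'none'; object-src 'none'"
    # The gateway's own pages may load scripts from the gateway itself.
    return "script-src 'self'; object-src 'none'"


def with_csp(headers, is_stored_content):
    """Return a copy of headers with the policy added."""
    result = dict(headers)
    result["Content-Security-Policy"] = csp_for(is_stored_content)
    return result


print(with_csp({"Content-Type": "text/html"}, is_stored_content=True))
```

As the comment says, this blocks the attack at the cost of breaking any web app stored on the grid.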
comment:26 Changed at 2012-08-10T23:16:06Z by nejucomo
- Cc nejucomo added
comment:27 Changed at 2012-11-15T02:50:38Z by nejucomo
While this ticket is about "accessing all your content" such as recovering the caps of victims, an attacker has a bootstrapping problem. Attack scripts must either:
- Run in the same origin as the tahoe gateway; or
- Violate security guarantees despite the same origin policy.
I've just posted a proof-of-concept attack in #1859 which can inject js into the tahoe grid and then execute it, starting from any domain. Therefore the latter attack approach can be upgraded to the former.
comment:28 Changed at 2013-09-14T17:39:08Z by zooko
- Description modified (diff)
- Keywords websec added
comment:29 Changed at 2015-04-26T23:52:34Z by TheJH
I made a PoC that shows one possible way to exploit this. Use a Tahoe-LAFS instance that is connected to the testnet, browse to different URLs in the testnet, then navigate the same tab to this URL:
Click anywhere on the page. The following attack will happen:
The evil HTML file opens itself in a second tab using "window.open(location.toString(), 'foo')" (requires a click to bypass popup blockers). Then the evil HTML file in the second tab can access the first tab using "window.opener". The evil second tab does this again and again:
- run window.opener.history.go(-1) to let the first tab go one step back in the browsing history
- grab the current URL of the first tab using window.opener.location.toString()
- send the URL out to the attacker's server
This will work until a page with a different origin is reached.
After the attack has run, you'll see the URLs that you have visited in the same tab before.
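The effect of the loop can be modelled without a browser. The sketch below is not the PoC itself: it only simulates which entries of a tab's history would be readable, assuming the browser refuses access as soon as the first tab shows a page from a different origin. All URLs are made up.

```python
from urllib.parse import urlsplit


def origin(url):
    """(scheme, host, port) triple; an explicit port is assumed in this model."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def leaked_urls(history):
    """Return the URLs an attacker page could read by stepping back.

    history is oldest first, and its last entry is the attacker's page.
    Stepping stops at the first entry whose origin differs.
    """
    attacker = history[-1]
    leaked = []
    for url in reversed(history[:-1]):
        if origin(url) != origin(attacker):
            break
        leaked.append(url)
    return leaked


history = [
    "https://example.org:443/news",
    "http://127.0.0.1:3456/uri/URI:DIR2:aaaa/",
    "http://127.0.0.1:3456/uri/URI:DIR2:bbbb/finance.html",
    "http://127.0.0.1:3456/uri/URI:CHK:cccc/evil.html",
]
for url in leaked_urls(history):
    print(url)
```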
This is a copy of the HTML file: https://var.thejh.net/lafs_historysteal.html.bin
comment:30 Changed at 2015-04-27T00:16:26Z by warner
- Cc tahoe-dev@… removed
Replying to zooko:
One option is to use loopback addresses other than 127.0.0.1. The entire 127/8 class A is technically reserved for loopback, and so any of the 2^24-2 (127.0.0.0 and 127.255.255.255 aren't allowed) addresses in that range should be usable to connect to your Tahoe node. The node could issue 302 redirects to automatically shift you from one "host" to another.
Some possible problems with this:
(1) I don't know if all IP implementations around actually honor the "unusual" loopback addresses. Linux does. Windows appears to (at least, 'ping 127.42.94.19' works).
(2) Javascript implementations may know that 127.x.x.x is all the same host and allow cross-address connections.
(3) It's not clear to me how Tahoe should know when to issue redirects.
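A small sketch of the address arithmetic, assuming the node picks the loopback address by hashing the cap. That mapping and the function names are invented for illustration, and it does not address problems (1) to (3). With only 2^24-2 addresses, two caps can also collide on the same address.

```python
import hashlib
import ipaddress

LOOPBACK = ipaddress.ip_network("127.0.0.0/8")
# Exclude 127.0.0.0 and 127.255.255.255, leaving 2**24 - 2 addresses.
USABLE = LOOPBACK.num_addresses - 2


def loopback_for(cap):
    """Map a cap string to one of the usable 127.x.y.z addresses."""
    digest = hashlib.sha256(cap.encode("utf-8")).digest()
    offset = 1 + int.from_bytes(digest[:8], "big") % USABLE
    return LOOPBACK.network_address + offset


def redirect_location(cap, port, path):
    """Location header value for a redirect to the cap's own 'host'."""
    return "http://%s:%d%s" % (loopback_for(cap), port, path)


print(USABLE)
# Made-up cap string and port.
print(redirect_location("URI:DIR2:aaaa", 3456, "/uri/URI:DIR2:aaaa/"))
```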
Another option is to use cookies. A cookie can be made specific not only to a host/domain but also to a path. As I understand it (haven't tested), Javascript loaded from path A should not have access to cookies set specific to path B. If Tahoe were to set per-path cookies on first access to a path, then refuse later requests that don't include the right cookie, then Javascript from path B would not be able to successfully load URLs on path A, because it wouldn't have the cookie.
There are numerous downsides to the cookie approach, and the only advantages I see are if it perhaps works around (1) or (2) and the fact that it allows arbitrarily-large authentication strings.