Version 0.5 - 2005-04-02
The perfect way to prevent cross-site scripting (XSS) attacks would be for the user agent to read the website designer's mind to determine which scripts embedded in a page were legitimate and which were malicious. In the absence of affordable and reliable mind-reading technology, and in consideration of the mental fatigue this would undoubtedly induce in web page authors, this paper presents a way for a site designer to explain his state of mind to the user agent by specifying restrictions on the capabilities of his content. It is an updated version of an original proposal in a blog post.
As a real-world example, a webmail system might serve an HTML email and specify (in an HTTP header) that the user agent should not execute any script in the body of that page. This means that, even if the webmail system's content-filtering process failed, the user of a conforming user agent would not be at risk from malicious content in the message.
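For concreteness, here is a server-side sketch of that idea. The header name and policy-string syntax are specified later in this document; the helper function and its name are purely illustrative:

```python
# Purely illustrative: build the value of a Content-Restrictions header
# (syntax specified later in this document) from a dict of restriction
# names and values. The helper name is hypothetical.
def restriction_header(version, pairs):
    return "%d;%s" % (version, ",".join(
        "%s=%s" % (name, value) for name, value in pairs.items()))

# A webmail system serving an HTML email body might send:
#   Content-Restrictions: 1;script=none,cookies=none
header_value = restriction_header(1, {"script": "none", "cookies": "none"})
```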
This paper is made up of a specification with interspersed commentary, coloured like this. I am particularly looking for feedback on which restrictions are useful, which are easy or hard to implement in particular browser codebases, and how the restrictions could be made more useful and easier to implement. After the first round of feedback, I hope to turn this document into a draft RFC.
If we were redesigning the web from the ground up, this concept might be better specified in terms of capabilities rather than restrictions. However, for backward-compatibility reasons, as the current default is "full capabilities", it makes sense to implement it and think about it as restrictions.
Having restrictions also removes any conflict with the desires of the user, and therefore any need for UI. In a capabilities model, a page may request a capability, and the user may have to be asked whether to grant it. In a restrictions model, the user agent simply applies the page-defined restrictions on top of any the user has specified in their preferences. A restrictions model also allows user agents to implement the specification incrementally.
The primary use for this specification is intended to be the prevention or mitigation of cross-site scripting (XSS) attacks. Sites would define and serve content restrictions for pages which contained untrusted content they had filtered. If the filtering failed, the content restrictions might still prevent malicious script from executing or doing damage. This specification could also be used to, for example, define a set of restrictions on a Greasemonkey script, potentially allowing the safe(r) running of script from untrusted sources within the user's session with a particular site.
Note that this specification is designed to be a backstop to server-side content filtering, not a replacement for it. There is intentionally no way for a server to easily determine the existence of or level of support for this specification in a given user agent. It's about protecting the user and covering the designer's ass, not about allowing him to be lazy.
The following are the defined names and allowed values for the different available restrictions. This part of the document is written in as content-agnostic a way as possible, but the exact meaning for HTML or XHTML content is specified as a guide. "all" is the default in all cases.
Ideally, each set of values could be ordered in a "restriction hierarchy", to allow user agents not supporting a particular value to fall back to the next least restrictive one. The hierarchy is not necessarily linear - for example, for cookie, it's none -> (read|write) -> all. In most cases the hierarchy is obvious, but in some places constructing it is not so easy, and this has been flagged as an issue.
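As an illustration of that fallback behaviour, a user agent might map an unsupported value to the next least restrictive value it does support. The chains below are an assumption based only on the cookie example just given:

```python
# Sketch of restriction-hierarchy fallback, assuming the cookie
# hierarchy given above: none -> (read|write) -> all, where "all" is
# the least restrictive value and, being the default, always supported.
NEXT_LEAST_RESTRICTIVE = {
    "none": ["read", "write", "all"],  # candidates, most restrictive first
    "read": ["all"],
    "write": ["all"],
}

def fall_back(value, supported):
    """Return `value` if the user agent supports it, else the next
    least restrictive supported value in its chain (ultimately "all")."""
    if value in supported or value == "all":
        return value
    for candidate in NEXT_LEAST_RESTRICTIVE.get(value, []):
        if candidate in supported or candidate == "all":
            return candidate
    return "all"
```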
|none||No script may execute.|
|internal||Only script written directly in the page may execute. This is useful for allowing vetted script, while preventing other arbitrary script from being imported from external sources.|
HTML: only embedded <script> elements are allowed; no <script src="[url]">.
|external||Only scripts external to the page may execute. This is useful for allowing some script, while still mitigating against e.g. script injection into the page.|
HTML: <script src="[url]"> only.
|header||Only script defined in the document header may execute. For document types which do not have such a header, this is equivalent to "all".|
HTML: script in the <head> element only.
ISSUE: this is hard to fit into a restriction hierarchy, as it is not a subset or superset of either internal or external.
Ideally, for "external" and "header", the permitted script could dynamically add event handlers to content where inline script is forbidden, and they would still work.
|none||No access to cookies from script.|
|write||Write access only.|
|read||Read access only.|
HTML: controls access to document.cookie. Many sites do all their cookie handling server-side, and so have no need for JS access to cookies. The value "none" has roughly the same effect as the "HttpOnly" cookie attribute, although this is more fine-grained, because you can allow access to cookies from script on some pages and not on others.
|none||No ability to create new nodes. This allows simple changes but not a complete page transformation.|
HTML: no access to any method whose name starts with "create" on the document interface. No document.write() or write access to innerHTML.
|noblock||No block-level elements; inline only.|
Applies to create() methods only, because we can vet those. Is this a generic and useful way to divide elements?
|nosub||No creation of embedded subdocument containers.|
HTML: No creation of <frame>, <iframe> or <object>. create() methods only, because we can vet those.
|none||No requesting of URLs by script. The network is effectively disabled for script in this page until the user manually clicks an <a> link or submits a form.|
HTML: This includes forms, XMLHttpRequest, setting src and href attributes, meta refresh, window.open, document.location etc. The idea is to prevent information leaks by stopping all communication.
|nopost||GET but not POST. This includes manually-submitted POSTs, including the way the user leaves the page. The idea is that database-changing operations can only use POST, and we can prevent the script taking those actions on behalf of the user. ISSUE: does including manually-submitted POSTs move us outside the arena of scripting? The problem we are trying to mitigate is the user being convinced to press Enter and thereby submitting the attacker's hacked-up form.|
|none||No frame hierarchy traversal allowed.|
|children||The children are accessible, but not the parent. This allows sites to sandbox same-domain content inside an <iframe>.|
HTML: the frames array is accessible, but not parent or top.
|parent||The parent is accessible, but not the children.|
HTML: the opposite of the above.
The same-origin policy still applies, of course.
|none||No read or write access to any attribute of form controls.|
|read||Read access only.|
|write||Write access only.|
|nopassword||No access to any attribute of <input type="password">. ISSUE: how does this fit into the restriction hierarchy? Could we combine it with "read" - i.e. never allow reading of password fields if there is any sort of restriction?|
The value for this name is a domain to which all requests initiated by the page (embedded or scripted) are restricted. This prevents malicious scripts from phoning home or importing unwanted content such as applets. ISSUE: in specifying "all requests", this moves us out of the arena of restricting script actions - but it would probably be much harder to implement if we did not. The domain given does not have to be a suffix of the current domain. Multiple values of domain from different sources make it possible to access any of the named domains or their subdomains. Is that right? Does it open up risks? For IDN domains, the punycode form is specified.
This does not affect the usual same-origin checks. If domain is given a value in the restrictions, then writing to document.domain from script is not permitted. This mechanism supersedes that one.
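A sketch of the subdomain check this implies (the function name is illustrative; the intent is that a request is allowed only to the named domain or its subdomains, compared on whole labels):

```python
# Illustrative check: is `host` the restricted domain itself, or a
# subdomain of it? Matching is on whole labels, so "evilexample.com"
# is NOT treated as a subdomain of "example.com". Domains are assumed
# to already be in punycode form, as the spec requires for IDN.
def request_allowed(host, restricted_domain):
    host = host.lower().rstrip(".")
    restricted_domain = restricted_domain.lower().rstrip(".")
    return host == restricted_domain or host.endswith("." + restricted_domain)
```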
Applying to Content
The restrictions are applied to content served over the web by serving it with an HTTP header, as follows:
Content-Restrictions: <policy-string>
or, in an XHTML or HTML page:
<meta http-equiv="Content-Restrictions" content="<policy-string>">
Do we need to support meta? Is it too risky, or does it widen access? Does the way the <meta> tag is defined in HTML require that all HTTP headers can be reflected in it?
The syntax is approximately of the following form (apologies for my bad BNF):
<policy-string>  ::= <version-number> ";" <pair-string>
<version-number> ::= <digit>+
<pair-string>    ::= <name> "=" <value> ("," <name> "=" <value>)* ","?
<name>           ::= [A-Za-z0-9.-]+
<value>          ::= [A-Za-z0-9.-]+
The <name>s and <value>s are those defined in the earlier part of this specification. E.g.:
1;script=header,cookies=none,frames=none
Do we need a way of shortening the string, using e.g. cookies,frames=none?
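For concreteness, a minimal parser for this grammar might look like the following. This is a sketch only; error handling is the bare minimum, and the name/value alphabet is as given in the BNF:

```python
import re

_TOKEN = re.compile(r"^[A-Za-z0-9.-]+$")  # <name> and <value> alphabet

def parse_policy(policy_string):
    """Parse "<version>;<name>=<value>,..." into (version, dict).
    Raises ValueError on a syntax error, so an implementation can fall
    back to the next policy string in priority order."""
    version_part, _, pair_part = policy_string.partition(";")
    if not version_part.isdigit():
        raise ValueError("bad version number")
    pairs = {}
    # Tolerate the trailing comma permitted by the grammar.
    for pair in filter(None, pair_part.split(",")):
        name, eq, value = pair.partition("=")
        if not eq or not _TOKEN.match(name) or not _TOKEN.match(value):
            raise ValueError("bad name=value pair")
        pairs[name] = value  # unrecognised names are ignored later
    return int(version_part), pairs
```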
There may be multiple headers or meta tags. The implementation should prefer HTTP over meta, because an attacker may have more control over the existence or contents of a meta version. The implementation should then use the policy string with the highest version number it understands. Later versions are not guaranteed to maintain compatible syntax past the semi-colon. If there are multiple such strings, the implementation should use the first one it encounters.
New versions of this policy definition may be given a distinguishing version-number; this is version 1. Compatible sub-versioning is handled by the fact that unrecognised names are ignored, along with their values; unrecognised values are treated as "all". Recognised but unsupported values should be treated as the next least restrictive supported value in the ordering defined above, which may be "all".
If the chosen string has a parsing error, the next one in priority order should be used.
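The selection rule above might be sketched as follows. It assumes a `parse` function that raises ValueError on a syntax error (as a parser for the grammar above would); `MAX_UNDERSTOOD_VERSION` stands for whatever version this hypothetical implementation supports:

```python
MAX_UNDERSTOOD_VERSION = 1  # this implementation understands version 1

def policy_version(policy_string):
    """Version-number prefix of a policy string, or None if malformed."""
    version_part = policy_string.partition(";")[0]
    return int(version_part) if version_part.isdigit() else None

def choose_policy(policy_strings, parse):
    """Pick the first policy string with the highest understood version.
    The caller lists HTTP header versions before <meta> versions, so
    index order is priority order. On a parsing error, fall back to the
    next string in priority order; return None if none can be used."""
    candidates = [(policy_version(s), i, s) for i, s in enumerate(policy_strings)]
    candidates = [c for c in candidates
                  if c[0] is not None and c[0] <= MAX_UNDERSTOOD_VERSION]
    # Highest version first; ties broken by original (priority) order.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    for _version, _i, s in candidates:
        try:
            return parse(s)
        except ValueError:
            continue  # parsing error: use the next one in priority order
    return None
```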
What happens if, say, a page is served with one Content-Restrictions header and it includes a JS file with another one? Do you combine the two and take the toughest restriction using the hierarchy? Does that apply to all the script or just that in the included file?