Posted to commits@jackrabbit.apache.org by Apache Wiki <wi...@apache.org> on 2009/09/30 09:03:03 UTC

[Jackrabbit Wiki] Update of "EncodingAndEscaping" by brookingcharlie

The "EncodingAndEscaping" page has been changed by brookingcharlie:
http://wiki.apache.org/jackrabbit/EncodingAndEscaping

New page:
= Encoding and Escaping =

This page covers escaping and encoding of paths, names, and values in JCR-based web applications.

Utility methods for escaping and encoding are available in the {{{org.apache.jackrabbit.util.ISO9075}}} and {{{org.apache.jackrabbit.util.Text}}} classes. Although developed under Jackrabbit, they are part of the JCR Commons module, which depends only on the JCR API.

If you're building a path from user-supplied names, you need to escape characters that are illegal in JCR names (e.g. "item:1" becomes "item%3A1"):

{{{
String path = "/foo/" + Text.escapeIllegalJcrChars(name);
}}}
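
To make the escaping rule concrete, here is a minimal, self-contained sketch of this kind of percent-escaping. It is not the Jackrabbit implementation: the class name {{{JcrNameEscapeSketch}}} and the exact character set are assumptions chosen for illustration, and the real {{{Text.escapeIllegalJcrChars(...)}}} may handle additional cases.

```java
// Illustrative sketch only; in real code use Text.escapeIllegalJcrChars(...).
final class JcrNameEscapeSketch {

    // Assumed set of characters to escape (hypothetical, not taken from the library).
    private static final String ILLEGAL = "%/:[]*|\t\r\n";

    // Replaces each character from ILLEGAL with '%' followed by two uppercase hex digits.
    public static String escape(String name) {
        StringBuilder sb = new StringBuilder(name.length() * 2);
        for (int i = 0; i < name.length(); i++) {
            char ch = name.charAt(i);
            if (ILLEGAL.indexOf(ch) >= 0) {
                sb.append('%');
                sb.append(Character.toUpperCase(Character.forDigit(ch / 16, 16)));
                sb.append(Character.toUpperCase(Character.forDigit(ch % 16, 16)));
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The colon is escaped, so the name can no longer be mistaken for a namespace prefix.
        System.out.println("/foo/" + escape("item:1")); // prints /foo/item%3A1
    }
}
```

Because the percent sign itself is in the escaped set, an already escaped name is escaped again rather than left ambiguous.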

Such paths can be passed to JCR methods such as {{{Session.getItem(...)}}}.

If you want to use paths in XPath queries, though, you need to encode them according to ISO9075 rules (e.g. "1hr0" becomes "_x0031_hr0"):

{{{
String query = "/jcr:root" + ISO9075.encodePath(node.getPath()) + "/" + ISO9075.encode(name);
}}}
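
The following sketch shows the general shape of the {{{_xHHHH_}}} encoding for a single name. It is a simplified illustration, not the {{{ISO9075}}} class itself: the class name {{{Iso9075Sketch}}} and the approximate definition of a valid name character are assumptions, and it does not handle names that already contain text that looks like an encoded sequence.

```java
// Illustrative sketch only; in real code use ISO9075.encode(...) and ISO9075.encodePath(...).
final class Iso9075Sketch {

    // Rough approximation of the characters allowed at the start of an XML name (assumption).
    static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_';
    }

    // Rough approximation of the characters allowed after the first position (assumption).
    static boolean isNameChar(char c) {
        return isNameStart(c) || Character.isDigit(c) || c == '-' || c == '.';
    }

    // Encodes every character that is not allowed at its position as _xHHHH_.
    public static String encode(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean allowed = (i == 0) ? isNameStart(c) : isNameChar(c);
            if (allowed) {
                sb.append(c);
            } else {
                sb.append(String.format("_x%04x_", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("1hr0"));    // prints _x0031_hr0
        System.out.println(encode("my node")); // prints my_x0020_node
    }
}
```

Note that a digit is only encoded in the first position, which is why the trailing "0" in "1hr0" is left alone.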

For a user-supplied name used in such a query, the two steps are combined, JCR escaping first and ISO9075 encoding second: {{{ISO9075.encode(Text.escapeIllegalJcrChars(name))}}}.

Values inserted into queries should be escaped to prevent malformed queries and query injection. Generally, if you enclose values in single quotes, you just need to replace every literal single quote character with '' (two consecutive single quote characters). There is also a {{{Text.escapeIllegalXpathSearchChars(...)}}} method that you should use for the search expression passed to {{{jcr:contains(...)}}}.

{{{
// searchTerm and itemID are user-supplied strings
String query =
  "/jcr:root/foo/element(*, foo)" +
  "[jcr:contains(@title, '" + Text.escapeIllegalXpathSearchChars(searchTerm).replaceAll("'", "''") + "')]" +
  "[@itemID = '" + itemID.replaceAll("'", "''") + "']";
}}}
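
The quote-doubling rule is easy to get wrong when repeated inline, so it can help to wrap it in a small helper. The sketch below uses only the JDK; the class and method names ({{{XPathLiteralSketch}}}, {{{quote}}}) are hypothetical and not part of Jackrabbit.

```java
// Illustrative helper for building single-quoted XPath string literals.
final class XPathLiteralSketch {

    // Doubles every single quote and wraps the result in single quotes.
    public static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        // A value that tries to break out of the string literal.
        String itemID = "x' or @secret = 'y";
        String query = "/jcr:root/foo/element(*, foo)[@itemID = " + quote(itemID) + "]";
        // The embedded quotes are doubled, so the whole value stays inside one literal.
        System.out.println(query);
    }
}
```

Using {{{String.replace}}} instead of {{{replaceAll}}} avoids regular-expression processing, which is not needed for a fixed one-character substitution.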

There are further encoding/decoding methods in the {{{Text}}} class for dealing with URIs in a webapp. The set of characters allowed in JCR names contains the URI character set plus a few others (e.g. spaces), so the URI set is actually the more constrained of the two. Therefore, if you have a valid URI, you can map it directly onto a JCR path without having to worry about escaping (this is by design). If you go the other way, i.e. you have a JCR path and want to create a URI for it, you simply apply plain URI escaping to it. To make everything simpler in the context of URIs, one suggestion is to only create JCR nodes with names that are valid URIs.
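
As one way to apply plain URI escaping to a JCR path, the sketch below relies on the multi-argument {{{java.net.URI}}} constructor, which quotes characters that are illegal in a URI path. It uses the JDK rather than the {{{Text}}} class, the class and method names are hypothetical, and it assumes an absolute path (one that starts with "/").

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative sketch: turning an absolute JCR path into an escaped URI path.
final class JcrPathToUriSketch {

    // Lets java.net.URI quote illegal characters, then returns the escaped path.
    public static String toUriPath(String absoluteJcrPath) {
        try {
            return new URI(null, null, absoluteJcrPath, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not convertible to a URI path: " + absoluteJcrPath, e);
        }
    }

    public static void main(String[] args) {
        // The space is legal in a JCR name but must be escaped in a URI.
        System.out.println(toUriPath("/foo/my page")); // prints /foo/my%20page
    }
}
```

A path that contains only URI-safe characters comes back unchanged, which matches the suggestion to keep node names valid as URIs.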

== See also ==

 * [[Examples]]