Posted to dev@jackrabbit.apache.org by Brian Moseley <bc...@osafoundation.org> on 2005/05/17 22:04:59 UTC

encoding jcr names

section 6.2.5.2 of the jcr spec (0.16.3) disallows several characters in 
jcr names. my webdav server needs to be able to support nodes with names 
containing at least one of these characters ("'"). what's the best 
strategy for handling this requirement?

my initial thought is to use URL encoding, but i don't want to conflict 
with the encoding that occurs when transforming to the xml views. are 
there any good, simple alternatives?

also, i'm not sure where within the simple webdav server to handle the 
encoding and decoding. probably in LocatorFactoryImpl and 
DavResourceImpl .. anywhere else?

Re: encoding jcr names

Posted by Tobias Strasser <to...@gmail.com>.
> Tobias Strasser wrote:
> > well, the idea is to disentangle the jcr-server/jcr-webdav/jcr-client
> > stuff and to create a commons-jcr.jar and commons-jackrabbit.jar in
> > the long run (see 'WebDAV exploration suites' thread).
> 
> right, but who knows how long that will take? :)

you got it :-)

> > but of course you could help us here in fixing the escaping issue. so
> > we can put the respective methods in commons-jcr or wherever.
> 
> ok. if y'all haven't gotten to it by the time it's a pressing issue for
> me, i'll send a patch against the simple server, commons-jcr, or whatever
> else seems most appropriate at that time.

cool. thanks.
-- 
------------------------------------------< tobias.strasser@day.com >---
Tobias Strasser, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
T +41 61 226 98 98, F +41 61 226 98 97 
-----------------------------------------------< http://www.day.com >---

Re: encoding jcr names

Posted by Brian Moseley <bc...@osafoundation.org>.
Tobias Strasser wrote:
> well, the idea is to disentangle the jcr-server/jcr-webdav/jcr-client
> stuff and to create a commons-jcr.jar and commons-jackrabbit.jar in
> the long run (see 'WebDAV exploration suites' thread).

right, but who knows how long that will take? :)

> but of course you could help us here in fixing the escaping issue. so
> we can put the respective methods in commons-jcr or wherever.

ok. if y'all haven't gotten to it by the time it's a pressing issue for 
me, i'll send a patch against the simple server, commons-jcr, or whatever
else seems most appropriate at that time.

Re: encoding jcr names

Posted by Tobias Strasser <to...@gmail.com>.
well, the idea is to disentangle the jcr-server/jcr-webdav/jcr-client
stuff and to create a commons-jcr.jar and commons-jackrabbit.jar in
the long run (see 'WebDAV exploration suites' thread).

but of course you could help us here in fixing the escaping issue. so
we can put the respective methods in commons-jcr or wherever.

cheers, tobi

On 5/19/05, Brian Moseley <bc...@osafoundation.org> wrote:
> Tobias Strasser wrote:
> > imo, the single quote could be allowed in names by the spec. but the
> > general question remains: how to escape illegal jcr name characters?
> >
> > the _x0000_ method mentioned above only applies to escaping non-valid
> > xml characters in names when exporting to xml. using the same
> > mechanism here would probably confuse the situation.
> >
> > i suggest url-encoding the characters that are not valid in names, i.e. everything excluded by:
> >
> > nonspace ::= (* Any Unicode character except:
> >                 '/', ':', '[', ']', '*',
> >                 ''', '"', '|' or any whitespace
> >                 character *)
> >
> > this escaping would also be a good candidate to go into the commons-jcr library.
> > comments?
> 
> seems reasonable to me.
> 
> i can take a stab at this in the simple webdav server if none of you
> folks are motivated to work on it. let me know if i should.
> 
> thanks!
> 



Re: encoding jcr names

Posted by Brian Moseley <bc...@osafoundation.org>.
Tobias Strasser wrote:
> imo, the single quote could be allowed in names by the spec. but the
> general question remains: how to escape illegal jcr name characters?
> 
> the _x0000_ method mentioned above only applies to escaping non-valid
> xml characters in names when exporting to xml. using the same
> mechanism here would probably confuse the situation.
>
> i suggest url-encoding the characters that are not valid in names, i.e. everything excluded by:
> 
> nonspace ::= (* Any Unicode character except: 
>                 '/', ':', '[', ']', '*', 
>                 ''', '"', '|' or any whitespace
>                 character *)
> 
> this escaping would also be a good candidate to go into the commons-jcr library.
> comments?

seems reasonable to me.

i can take a stab at this in the simple webdav server if none of you 
folks are motivated to work on it. let me know if i should.

thanks!

Re: encoding jcr names

Posted by Tobias Strasser <to...@gmail.com>.
imo, the single quote could be allowed in names by the spec. but the
general question remains: how to escape illegal jcr name characters?

the _x0000_ method mentioned above only applies to escaping non-valid
xml characters in names when exporting to xml. using the same
mechanism here would probably confuse the situation.

i suggest url-encoding the characters that are not valid in names, i.e. everything excluded by:

nonspace ::= (* Any Unicode character except: 
                '/', ':', '[', ']', '*', 
                ''', '"', '|' or any whitespace
                character *)

this escaping would also be a good candidate to go into the commons-jcr library.
comments?
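
A minimal sketch of the url-encoding idea above, in Java. The class and method names (JcrNameEscaper, escape, unescape) are hypothetical and not part of jackrabbit or any commons-jcr library. It percent-escapes the punctuation excluded by the grammar, plus '%' itself so the mapping stays reversible. Whitespace is left untouched here, which is an assumption of the sketch rather than something settled in this thread.

```java
// Hypothetical helper, not an existing jackrabbit class.
final class JcrNameEscaper {
    // Punctuation excluded by the nonspace rule, plus '%' so escaping is reversible.
    private static final String ESCAPED = "%/:[]*'\"|";

    // Replaces each character in ESCAPED with '%' followed by two hex digits.
    static String escape(String name) {
        StringBuilder out = new StringBuilder(name.length() + 8);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (ESCAPED.indexOf(c) >= 0) {
                out.append('%').append(String.format("%02X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Reverses escape(): every "%HH" sequence becomes the character with that code.
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '%' && i + 2 < escaped.length()) {
                int hi = Character.digit(escaped.charAt(i + 1), 16);
                int lo = Character.digit(escaped.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.append((char) (hi * 16 + lo));
                    i += 2; // skip the two hex digits just consumed
                    continue;
                }
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

With this, escape("brian's calendar") gives "brian%27s calendar", and unescape turns it back into the original name. Escaping '%' as well is a design choice: without it, a resource whose name already contains something like "%27" could not be told apart from an escaped apostrophe.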

On 5/18/05, Brian Moseley <bc...@osafoundation.org> wrote:
> Tobias Strasser wrote:
> 
> > i doubt that this is the desired result. the path on the webdav
> > request should be unescaped, and the correct node be created. if this
> > is not the case, then it's a bug.
> 
> except the 0.16.3 spec clearly disallows the "'" character (and several
> others) in jcr names.
> 
> > what servlet engine are you running the server in? (we encountered
> > cases where the servlet engine did not unescape the path as specified
> > in the servlet spec.)
> 
> tomcat 5.0.28.
> 



Re: encoding jcr names

Posted by Brian Moseley <bc...@osafoundation.org>.
Tobias Strasser wrote:

> i doubt that this is the desired result. the path on the webdav
> request should be unescaped, and the correct node be created. if this
> is not the case, then it's a bug.

except the 0.16.3 spec clearly disallows the "'" character (and several 
others) in jcr names.

> what servlet engine are you running the server in? (we encountered
> cases where the servlet engine did not unescape the path as specified
> in the servlet spec.)

tomcat 5.0.28.

Re: encoding jcr names

Posted by Tobias Strasser <to...@gmail.com>.
On 5/18/05, Brian Moseley <bc...@osafoundation.org> wrote:
> Peeter Piegaze wrote:
> 
> > Brian, in your case you are going from some webdavish resource name
> > and trying to produce a valid JCR name. The ugliness of the above
> > escaping mechanism stems from the limitations of XML. Since JCR names
> > do not suffer from these limitations your options for converting your
> > resource name to JCR are more open. Off the top of my head I don't
> > think URL encoding will conflict with the export to XML...or maybe you
> > have an example in mind?
> 
> nothing specific. i just wanted to make sure that i don't break anything :)
> 
> let's say i PUT or PROPPATCH a webdav resource "brian's calendar". this
> would translate into a node named "brian%27s calendar".

i doubt that this is the desired result. the path on the webdav
request should be unescaped, and the correct node be created. if this
is not the case, then it's a bug.
what servlet engine are you running the server in? (we encountered
cases where the servlet engine did not unescape the path as specified
in the servlet spec.)

cheers, tobi

Re: encoding jcr names

Posted by Brian Moseley <bc...@osafoundation.org>.
Peeter Piegaze wrote:

> Brian, in your case you are going from some webdavish resource name
> and trying to produce a valid JCR name. The ugliness of the above
> escaping mechanism stems from the limitations of XML. Since JCR names
> do not suffer from these limitations your options for converting your
> resource name to JCR are more open. Off the top of my head I don't
> think URL encoding will conflict with the export to XML...or maybe you
> have an example in mind?

nothing specific. i just wanted to make sure that i don't break anything :)

let's say i PUT or PROPPATCH a webdav resource "brian's calendar". this 
would translate into a node named "brian%27s calendar".

it seems easy enough to unescape the name when pulling the node out of 
the repository (during a PROPFIND or GET).

but what about when i'm querying? in a really dumb case, what if a
user wants to find all of the calendars in my server with "'" in their 
name? is it problematic to url escape that parameter when constructing 
the jcr query? would the operation be more involved than that?

are there any other jcr operations i'd need to worry about escaping and 
unescaping?
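
A rough illustration of the query concern, assuming names are stored in url-escaped form: the user's search term has to go through the same escaping before it is compared with stored names. This Java sketch uses a plain in-memory list rather than a real jcr query, and the class EscapedNameSearch and its methods are made up for the example. For brevity its escape() only handles the apostrophe and '%'.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; no jcr api is used here.
final class EscapedNameSearch {
    // Escapes '%' first so the '%' introduced for the apostrophe is not escaped again.
    static String escape(String s) {
        return s.replace("%", "%25").replace("'", "%27");
    }

    // Returns the stored (escaped) names that contain the escaped form of rawTerm.
    static List<String> findContaining(List<String> storedNames, String rawTerm) {
        String term = escape(rawTerm);
        List<String> hits = new ArrayList<>();
        for (String stored : storedNames) {
            if (stored.contains(term)) {
                hits.add(stored);
            }
        }
        return hits;
    }
}
```

Searching for "'" over the stored names "brian%27s calendar" and "work calendar" returns only the first. One caveat: a substring match on the escaped form can over-match (a search for "27" would also hit "brian%27s calendar"), so a real implementation would probably want to re-check candidates after unescaping them.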

Re: encoding jcr names

Posted by Peeter Piegaze <pe...@gmail.com>.
On 5/18/05, Roy T. Fielding <fi...@gbiv.com> wrote:
> On May 17, 2005, at 1:04 PM, Brian Moseley wrote:
> 
> > section 6.2.5.2 of the jcr spec (0.16.3) disallows several characters
> > in jcr names. my webdav server needs to be able to support nodes with
> > names containing at least one of these characters ("'"). what's the
> > best strategy for handling this requirement?
> 
> That changed in the final versions of the spec.  IIRC, replace the
> character "'" with "_x0027_" (i.e., underscore x UTF-16-hex underscore).
> That's from memory, but I'm sure Peeter will correct me if I forgot
> something.  The convention is from some obscure XML NAMES standard.

It's not so much that anything changed in this department since 0.16.3.
The syntax of names and paths is the same.

The escaping mechanism Roy mentioned describes how JCR names which are
not valid XML names are mangled upon export to XML.
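
To make the export-side escaping concrete, here is a small Java sketch of the _xHHHH_ form Roy describes. The class name XmlNameEscapeSketch and the allowed() check are hypothetical. allowed() is a deliberately crude stand-in for the real XML name rules, and the sketch does not handle names that already contain a literal _xHHHH_ sequence.

```java
// Hypothetical sketch of _xHHHH_ escaping; not the jackrabbit implementation.
final class XmlNameEscapeSketch {
    // Simplified assumption: letters, digits, '_', '-' and '.' pass through unchanged.
    static boolean allowed(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.';
    }

    // Replaces every other character with underscore, x, four hex digits, underscore.
    static String escape(String name) {
        StringBuilder out = new StringBuilder(name.length() + 16);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (allowed(c)) {
                out.append(c);
            } else {
                out.append(String.format("_x%04X_", (int) c));
            }
        }
        return out.toString();
    }
}
```

Under these assumptions escape("brian's") gives "brian_x0027_s", which matches the replacement Roy quotes for the "'" character.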

Brian, in your case you are going from some webdavish resource name
and trying to produce a valid JCR name. The ugliness of the above
escaping mechanism stems from the limitations of XML. Since JCR names
do not suffer from these limitations your options for converting your
resource name to JCR are more open. Off the top of my head I don't
think URL encoding will conflict with the export to XML...or maybe you
have an example in mind?

Cheers,
Peeter

Re: encoding jcr names

Posted by "Roy T. Fielding" <fi...@gbiv.com>.
On May 17, 2005, at 1:04 PM, Brian Moseley wrote:

> section 6.2.5.2 of the jcr spec (0.16.3) disallows several characters 
> in jcr names. my webdav server needs to be able to support nodes with 
> names containing at least one of these characters ("'"). what's the 
> best strategy for handling this requirement?

That changed in the final versions of the spec.  IIRC, replace the
character "'" with "_x0027_" (i.e., underscore x UTF-16-hex underscore).
That's from memory, but I'm sure Peeter will correct me if I forgot
something.  The convention is from some obscure XML NAMES standard.

....Roy