Posted to dev@apr.apache.org by Ben Collins-Sussman <su...@collab.net> on 2005/03/08 18:41:51 UTC
opposite of apr_xml_quote_string() ?
I'm working on the svn 1.2 "locking" feature, and I've come to a point
where mod_dav_svn needs to xml-unescape the incoming comment attached
to a DAV lock, so that it ends up being stored in the svn repository as
something human-readable. (For complex reasons, neither httpd nor
mod_dav is xml-unescaping the data.)
I've searched high and low through apr, apr-util, httpd, and svn APIs.
I've found apr_xml_quote_string() as a nice way of xml-escaping data,
but I've not found anything to xml-unescape.
Any ideas? Do I need to go write this function myself?
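For illustration only, here is a minimal sketch of what a hand-written unescape routine might look like. It is plain C rather than APR code: xml_unescape is a hypothetical name, it allocates with malloc instead of a pool, and it only handles the five predefined XML entities. Numeric character references and anything else it does not recognize are copied through unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of APR or apr-util.
 * Decodes the five predefined XML entities in a single pass.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *xml_unescape(const char *in)
{
    static const struct { const char *name; size_t len; char ch; } ents[] = {
        { "&lt;",   4, '<'  },
        { "&gt;",   4, '>'  },
        { "&amp;",  5, '&'  },
        { "&quot;", 6, '"'  },
        { "&apos;", 6, '\'' }
    };
    char *out, *dst;
    size_t i;

    assert(in != NULL);

    /* Decoding never makes the string longer, so strlen(in) + 1 suffices. */
    out = malloc(strlen(in) + 1);
    if (out == NULL)
        return NULL;

    dst = out;
    while (*in != '\0') {
        int matched = 0;
        if (*in == '&') {
            for (i = 0; i < sizeof ents / sizeof ents[0]; i++) {
                if (strncmp(in, ents[i].name, ents[i].len) == 0) {
                    *dst++ = ents[i].ch;
                    in += ents[i].len;
                    matched = 1;
                    break;
                }
            }
        }
        if (!matched)
            *dst++ = *in++;     /* ordinary byte or unrecognized entity */
    }
    *dst = '\0';
    return out;
}
```

Because the input is consumed in one pass, a double-escaped string such as "&amp;lt;" comes back as "&lt;" rather than "<", so each call undoes exactly one level of escaping.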
Re: opposite of apr_xml_quote_string() ?
Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 09, 2005 at 05:16:42PM +0000, Joe Orton wrote:
> On Wed, Mar 09, 2005 at 10:56:46AM -0600, Ben Collins-Sussman wrote:
> > 3. Meanwhile, back in httpd, here's the incoming lock that mod_dav is
> > handing to mod_dav_svn, expecting it to be stored in the repository:
> >
> > (gdb) p *dlock
> > $1 = {
> > rectype = DAV_LOCKREC_DIRECT,
> > is_locknull = 1,
> > scope = DAV_LOCKSCOPE_EXCLUSIVE,
> > type = DAV_LOCKTYPE_WRITE,
> > depth = 0,
> > timeout = 0,
> > locktoken = 0x18d7e90,
> > owner = 0x18d7f38 "<ns0:owner xmlns:ns0=\"DAV:\">this &lt;is&gt; a
> > comment.</ns0:owner>",
> > auth_user = 0x18d7f90 "sussman",
> > info = 0x0,
> > next = 0x0
> > }
> >
> > ... sure looks like the comment is still xml-escaped!
> >
> > So... how is this possible?
>
> Ah; gotcha, sorry, my mistake, it's because mod_dav is re-escaping it; I
> missed this when grepping earlier - dav_lock_parse_lockinfo has:
>
> /* quote all the values in the <DAV:owner> element */
> apr_xml_quote_elem(p, child);
>
> which is necessary by design I expect.
Could you just pass this lock->owner string back through the XML parser?
apr_xml_to_text should guarantee it's well-formed, I think. I'm not
sure adding an "XML unquoting" function to APR just for this is a great
idea; it's such a contrived situation that requires it. Another
alternative would be to enhance the mod_dav API such that you can get to
the stuff unquoted as well.
joe
Re: opposite of apr_xml_quote_string() ?
Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 09, 2005 at 10:56:46AM -0600, Ben Collins-Sussman wrote:
> 3. Meanwhile, back in httpd, here's the incoming lock that mod_dav is
> handing to mod_dav_svn, expecting it to be stored in the repository:
>
> (gdb) p *dlock
> $1 = {
> rectype = DAV_LOCKREC_DIRECT,
> is_locknull = 1,
> scope = DAV_LOCKSCOPE_EXCLUSIVE,
> type = DAV_LOCKTYPE_WRITE,
> depth = 0,
> timeout = 0,
> locktoken = 0x18d7e90,
> owner = 0x18d7f38 "<ns0:owner xmlns:ns0=\"DAV:\">this &lt;is&gt; a
> comment.</ns0:owner>",
> auth_user = 0x18d7f90 "sussman",
> info = 0x0,
> next = 0x0
> }
>
> ... sure looks like the comment is still xml-escaped!
>
> So... how is this possible?
Ah; gotcha, sorry, my mistake, it's because mod_dav is re-escaping it; I
missed this when grepping earlier - dav_lock_parse_lockinfo has:
/* quote all the values in the <DAV:owner> element */
apr_xml_quote_elem(p, child);
which is necessary by design I expect.
joe
Re: opposite of apr_xml_quote_string() ?
Posted by Ben Collins-Sussman <su...@collab.net>.
On Mar 9, 2005, at 2:54 AM, Joe Orton wrote:
>
> mod_dav never chooses nor refuses to "XML unescape" anything: the XML
> parser *always* does it unconditionally, so I still don't understand
> what's going on. If the fields passed down to mod_dav_svn are
> XML-escaped, either mod_dav has *chosen* to re-XML-escape it (I can't
> see where that would be happening), or it was double-escaped to begin
> with.
>
> Can you check protocol traces?
>
Here's all my evidence. Maybe you can explain what's going on?
Status Quo:
1. libsvn_ra_dav calls ne_lock() to create a lock. It first
initializes a ne_lock structure, which includes:
nlock = ne_lock_create();
nlock->owner = ne_strdup(comment);
2. from the commandline, I run:
$ svn lock Foo.java -m "this <is> a comment."
subversion/libsvn_ra_dav/util.c:292: (apr_err=175002)
svn: Lock request failed: 400 Bad Request (http://localhost)
Ethereal shows me:
LOCK /svn/testrepos/Foo.java HTTP/1.1
Host: localhost
User-Agent: SVN/1.2.0 (dev build) neon/0.24.7
Connection: TE
TE: trailers
Content-Length: 182
Content-Type: application/xml
Depth: 0
Authorization: Basic c3Vzc21hbjpibG9ydA==
X-SVN-Options: svn-client-lock
X-SVN-Version-Name: 31
<?xml version="1.0" encoding="utf-8"?>
<lockinfo xmlns='DAV:'>
<lockscope><exclusive/></lockscope>
<locktype><write/></locktype><owner>this <is> a comment.</owner>
</lockinfo>
HTTP/1.1 400 Bad Request
Date: Wed, 09 Mar 2005 15:57:15 GMT
Server: Apache/2.0.51 (Unix) SVN/1.2.0-dev DAV/2
Content-Length: 321
Connection: close
Content-Type: text/html; charset=iso-8859-1
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not
understand.<br />
[...]
So for starters, the ne_lock() call isn't xml-escaping the owner
field. But we already discussed this on neon's dev list, and
you added it to your to-do list. No big deal.
Tweaked Status Quo:
1. Try xml-escaping the outbound comment:
nlock->owner = ne_strdup(apr_xml_quote_string(pool, comment, 1));
And from the commandline, the lock works:
$ svn lock Foo.java -m "this <is> a comment."
'Foo.java' locked by user 'sussman'.
2. Ethereal shows me that the outbound comment is indeed xml-escaped:
LOCK /svn/testrepos/Foo.java HTTP/1.1
Host: localhost
User-Agent: SVN/1.2.0 (dev build) neon/0.24.7
Connection: TE
TE: trailers
Content-Length: 188
Content-Type: application/xml
Depth: 0
Authorization: Basic c3Vzc21hbjpibG9ydA==
X-SVN-Options: svn-client-lock
X-SVN-Version-Name: 31
<?xml version="1.0" encoding="utf-8"?>
<lockinfo xmlns='DAV:'>
<lockscope><exclusive/></lockscope>
<locktype><write/></locktype><owner>this &lt;is&gt; a comment.</owner>
</lockinfo>
3. Meanwhile, back in httpd, here's the incoming lock that mod_dav is
handing to mod_dav_svn, expecting it to be stored in the repository:
(gdb) p *dlock
$1 = {
rectype = DAV_LOCKREC_DIRECT,
is_locknull = 1,
scope = DAV_LOCKSCOPE_EXCLUSIVE,
type = DAV_LOCKTYPE_WRITE,
depth = 0,
timeout = 0,
locktoken = 0x18d7e90,
owner = 0x18d7f38 "<ns0:owner xmlns:ns0=\"DAV:\">this &lt;is&gt; a
comment.</ns0:owner>",
auth_user = 0x18d7f90 "sussman",
info = 0x0,
next = 0x0
}
... sure looks like the comment is still xml-escaped!
So... how is this possible?
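As a side note on the two traces: the Content-Length goes from 182 to 188 once the comment is escaped, which is consistent with '<' and '>' each growing from one byte to four. The sketch below is a plain C stand-in for that escaping step, not the APR implementation. xml_escape is a hypothetical name, it uses malloc rather than a pool, and its quotes flag is only assumed to mirror the third argument of apr_xml_quote_string().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the escaping step; it does not call APR.
 * Escapes '<', '>' and '&', plus '"' when quotes is non-zero.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *xml_escape(const char *in, int quotes)
{
    const char *p;
    size_t extra = 0;
    char *out, *dst;

    assert(in != NULL);

    /* First pass: count how many extra bytes the entities need. */
    for (p = in; *p != '\0'; p++) {
        if (*p == '<' || *p == '>')
            extra += 3;             /* "&lt;" or "&gt;": 1 byte becomes 4 */
        else if (*p == '&')
            extra += 4;             /* "&amp;": 1 byte becomes 5 */
        else if (quotes && *p == '"')
            extra += 5;             /* "&quot;": 1 byte becomes 6 */
    }

    out = malloc(strlen(in) + extra + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the input, substituting entities. */
    dst = out;
    for (p = in; *p != '\0'; p++) {
        const char *rep = NULL;
        if (*p == '<')
            rep = "&lt;";
        else if (*p == '>')
            rep = "&gt;";
        else if (*p == '&')
            rep = "&amp;";
        else if (quotes && *p == '"')
            rep = "&quot;";

        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(dst, rep, n);
            dst += n;
        } else {
            *dst++ = *p;
        }
    }
    *dst = '\0';
    return out;
}
```

Calling xml_escape("this <is> a comment.", 1) yields a string six bytes longer than the input, matching the difference between the two request bodies.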
Re: opposite of apr_xml_quote_string() ?
Posted by Joe Orton <jo...@redhat.com>.
On Tue, Mar 08, 2005 at 05:10:46PM -0600, Ben Collins-Sussman wrote:
>
> On Mar 8, 2005, at 2:20 PM, Joe Orton wrote:
>
> >On Tue, Mar 08, 2005 at 11:41:51AM -0600, Ben Collins-Sussman wrote:
> >>I'm working on the svn 1.2 "locking" feature, and I've come to a point
> >>where mod_dav_svn needs to xml-unescape the incoming comment attached
> >>to a DAV lock, so that it ends up being stored in the svn repository
> >>as
> >>something human-readable. (For complex reasons, neither httpd nor
> >>mod_dav is xml-unescaping the data.)
> >>
> >>I've searched high and low through apr, apr-util, httpd, and svn APIs.
> >>I've found apr_xml_quote_string() as a nice way of xml-escaping data,
> >>but I've not found anything to xml-unescape.
> >
> >It's because the XML parser does it automatically; you never normally
> >get to see the escaped form when parsing the XML. How complex are
> >these
> >complex reasons - is the data getting XML-escaped twice, or something?
> >
>
> An svn client uses neon to send a DAV lock to mod_dav. mod_dav treats
> the <D:owner> field as sacred, and refuses to xml-unescape it or touch
> it at all. Then it hands all the fields down to mod_dav_svn.
mod_dav never chooses nor refuses to "XML unescape" anything: the XML
parser *always* does it unconditionally, so I still don't understand
what's going on. If the fields passed down to mod_dav_svn are
XML-escaped, either mod_dav has *chosen* to re-XML-escape it (I can't
see where that would be happening), or it was double-escaped to begin
with.
Can you check protocol traces?
joe
Re: opposite of apr_xml_quote_string() ?
Posted by Ben Collins-Sussman <su...@collab.net>.
On Mar 8, 2005, at 2:20 PM, Joe Orton wrote:
> On Tue, Mar 08, 2005 at 11:41:51AM -0600, Ben Collins-Sussman wrote:
>> I'm working on the svn 1.2 "locking" feature, and I've come to a point
>> where mod_dav_svn needs to xml-unescape the incoming comment attached
>> to a DAV lock, so that it ends up being stored in the svn repository
>> as
>> something human-readable. (For complex reasons, neither httpd nor
>> mod_dav is xml-unescaping the data.)
>>
>> I've searched high and low through apr, apr-util, httpd, and svn APIs.
>> I've found apr_xml_quote_string() as a nice way of xml-escaping data,
>> but I've not found anything to xml-unescape.
>
> It's because the XML parser does it automatically; you never normally
> get to see the escaped form when parsing the XML. How complex are
> these
> complex reasons - is the data getting XML-escaped twice, or something?
>
An svn client uses neon to send a DAV lock to mod_dav. mod_dav treats
the <D:owner> field as sacred, and refuses to xml-unescape it or touch
it at all. Then it hands all the fields down to mod_dav_svn.
If the lock came from a generic DAV client, then mod_dav_svn stores the
owner field verbatim, and retrieves it verbatim later on when asked.
If the lock came from a subversion client, then mod_dav_svn needs to
make the value palatable to the rest of subversion. Before storing in
the repository, it strips away the <D:owner> tags, and xml-unescapes
the data. (ra_dav xml-escapes this 'comment' field before sending it
over neon, since neon isn't doing it.) When retrieving the lock later
on, it remembers to xml-escape the data and re-add the <D:owner> tags,
so that the http response is valid.
On the client side of subversion, the lock coming back from
ne_lock_discover() needs to have the owner field unescaped as well.
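To make the "strip away the <D:owner> tags" step concrete, here is a rough sketch in plain C. It is not mod_dav_svn code: strip_outer_element is a hypothetical name, it uses malloc rather than a pool, and it assumes the value is a single element wrapping escaped text with no child elements and no '>' inside an attribute value. A generic DAV client may send richer owner content, which this sketch does not attempt to handle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of mod_dav, mod_dav_svn or APR.
 * Given "<tag ...>text</tag>", returns a malloc'd copy of "text".
 * Returns NULL if the input does not have that shape or on allocation
 * failure. Assumes the text itself contains no literal '<'. */
char *strip_outer_element(const char *in)
{
    const char *start, *end;
    size_t len;
    char *out;

    assert(in != NULL);

    start = strchr(in, '>');    /* end of the opening tag */
    end = strrchr(in, '<');     /* start of the closing tag */
    if (in[0] != '<' || start == NULL || end == NULL
        || end <= start || end[1] != '/')
        return NULL;

    start++;                    /* first byte after the opening tag */
    len = (size_t)(end - start);

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, start, len);
    out[len] = '\0';
    return out;
}
```

The result is still XML-escaped, so it would then need one pass of unescaping before being stored as a human-readable comment.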
Re: opposite of apr_xml_quote_string() ?
Posted by Joe Orton <jo...@redhat.com>.
On Tue, Mar 08, 2005 at 11:41:51AM -0600, Ben Collins-Sussman wrote:
> I'm working on the svn 1.2 "locking" feature, and I've come to a point
> where mod_dav_svn needs to xml-unescape the incoming comment attached
> to a DAV lock, so that it ends up being stored in the svn repository as
> something human-readable. (For complex reasons, neither httpd nor
> mod_dav is xml-unescaping the data.)
>
> I've searched high and low through apr, apr-util, httpd, and svn APIs.
> I've found apr_xml_quote_string() as a nice way of xml-escaping data,
> but I've not found anything to xml-unescape.
It's because the XML parser does it automatically; you never normally
get to see the escaped form when parsing the XML. How complex are these
complex reasons - is the data getting XML-escaped twice, or something?
joe