Posted to dev@apr.apache.org by Ben Collins-Sussman <su...@collab.net> on 2005/03/08 18:41:51 UTC

opposite of apr_xml_quote_string() ?

I'm working on the svn 1.2 "locking" feature, and I've come to a point 
where mod_dav_svn needs to xml-unescape the incoming comment attached 
to a DAV lock, so that it ends up being stored in the svn repository as 
something human-readable.  (For complex reasons, neither httpd nor 
mod_dav is xml-unescaping the data.)

I've searched high and low through apr, apr-util, httpd, and svn APIs.  
I've found apr_xml_quote_string() as a nice way of xml-escaping data, 
but I've not found anything to xml-unescape.

Any ideas?  Do I need to go write this function myself?
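
For reference, a hand-written version could be as small as the sketch below. It is plain C with no APR dependency, and xml_unquote_string is a hypothetical name, not an existing APR function. It assumes only the five predefined XML entities need decoding; numeric character references and unknown entities are passed through untouched.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of APR: returns a malloc'd copy of s with
 * the five predefined XML entities decoded.  Anything else that starts
 * with '&' (numeric references, unknown entities) is copied unchanged.
 * The caller frees the result.  Returns NULL if allocation fails. */
char *xml_unquote_string(const char *s)
{
    static const struct { const char *name; size_t len; char ch; } ents[] = {
        { "&lt;", 4, '<' }, { "&gt;", 4, '>' }, { "&amp;", 5, '&' },
        { "&quot;", 6, '"' }, { "&apos;", 6, '\'' }
    };
    char *out = malloc(strlen(s) + 1);  /* decoding never grows the text */
    char *o = out;
    size_t i;

    if (out == NULL)
        return NULL;

    while (*s != '\0') {
        int matched = 0;

        if (*s == '&') {
            for (i = 0; i < sizeof(ents) / sizeof(ents[0]); i++) {
                if (strncmp(s, ents[i].name, ents[i].len) == 0) {
                    *o++ = ents[i].ch;
                    s += ents[i].len;
                    matched = 1;
                    break;
                }
            }
        }
        if (!matched)
            *o++ = *s++;
    }
    *o = '\0';
    return out;
}
```

The input is scanned once, left to right, so "&amp;lt;" decodes to "&lt;" and is not decoded a second time.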


Re: opposite of apr_xml_quote_string() ?

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 09, 2005 at 05:16:42PM +0000, Joe Orton wrote:
> On Wed, Mar 09, 2005 at 10:56:46AM -0600, Ben Collins-Sussman wrote:
> > 3. Meanwhile, back in httpd, here's the incoming lock that mod_dav is
> >    handing to mod_dav_svn, expecting it to be stored in the repository:
> > 
> > (gdb) p *dlock
> > $1 = {
> >   rectype = DAV_LOCKREC_DIRECT,
> >   is_locknull = 1,
> >   scope = DAV_LOCKSCOPE_EXCLUSIVE,
> >   type = DAV_LOCKTYPE_WRITE,
> >   depth = 0,
> >   timeout = 0,
> >   locktoken = 0x18d7e90,
> >   owner = 0x18d7f38 "<ns0:owner xmlns:ns0=\"DAV:\">this &lt;is&gt; a comment.</ns0:owner>",
> >   auth_user = 0x18d7f90 "sussman",
> >   info = 0x0,
> >   next = 0x0
> > }
> > 
> >        ... sure looks like the comment is still xml-escaped!
> > 
> > So... how is this possible?
> 
> Ah; gotcha, sorry, my mistake, it's because mod_dav is re-escaping it; I
> missed this when grepping earlier - dav_lock_parse_lockinfo has:
> 
>             /* quote all the values in the <DAV:owner> element */
>             apr_xml_quote_elem(p, child);
> 
> which is necessary by design I expect.

Could you just pass this lock->owner string back through the XML parser? 
apr_xml_to_text should guarantee it's well-formed, I think.  I'm not
sure adding an "XML unquoting" function to APR just for this is a great
idea; it's such a contrived situation that requires it.  Another
alternative would be to enhance the mod_dav API such that you can get to
the stuff unquoted as well.

joe


Re: opposite of apr_xml_quote_string() ?

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 09, 2005 at 10:56:46AM -0600, Ben Collins-Sussman wrote:
> 3. Meanwhile, back in httpd, here's the incoming lock that mod_dav is
>    handing to mod_dav_svn, expecting it to be stored in the repository:
> 
> (gdb) p *dlock
> $1 = {
>   rectype = DAV_LOCKREC_DIRECT,
>   is_locknull = 1,
>   scope = DAV_LOCKSCOPE_EXCLUSIVE,
>   type = DAV_LOCKTYPE_WRITE,
>   depth = 0,
>   timeout = 0,
>   locktoken = 0x18d7e90,
>   owner = 0x18d7f38 "<ns0:owner xmlns:ns0=\"DAV:\">this &lt;is&gt; a comment.</ns0:owner>",
>   auth_user = 0x18d7f90 "sussman",
>   info = 0x0,
>   next = 0x0
> }
> 
>        ... sure looks like the comment is still xml-escaped!
> 
> So... how is this possible?

Ah; gotcha, sorry, my mistake, it's because mod_dav is re-escaping it; I
missed this when grepping earlier - dav_lock_parse_lockinfo has:

            /* quote all the values in the <DAV:owner> element */
            apr_xml_quote_elem(p, child);

which is necessary by design I expect.

joe

Re: opposite of apr_xml_quote_string() ?

Posted by Ben Collins-Sussman <su...@collab.net>.
On Mar 9, 2005, at 2:54 AM, Joe Orton wrote:
>
> mod_dav never chooses nor refuses to "XML unescape" anything: the XML
> parser *always* does it unconditionally, so I still don't understand
> what's going on.  If the fields passed down to mod_dav_svn are
> XML-escaped, either mod_dav has *chosen* to re-XML-escape it (I can't
> see where that would be happening), or it was double-escaped to begin
> with.
>
> Can you check protocol traces?
>

Here's all my evidence.  Maybe you can explain what's going on?


Status Quo:

1. libsvn_ra_dav calls ne_lock() to create a lock.  It first
    initializes a ne_lock structure, which includes:

       nlock = ne_lock_create();
       nlock->owner = ne_strdup(comment);

2. from the commandline, I run:

       $ svn lock Foo.java -m "this <is> a comment."
       subversion/libsvn_ra_dav/util.c:292: (apr_err=175002)
       svn: Lock request failed: 400 Bad Request (http://localhost)

    Ethereal shows me:

LOCK /svn/testrepos/Foo.java HTTP/1.1
Host: localhost
User-Agent: SVN/1.2.0 (dev build) neon/0.24.7
Connection: TE
TE: trailers
Content-Length: 182
Content-Type: application/xml
Depth: 0
Authorization: Basic c3Vzc21hbjpibG9ydA==
X-SVN-Options: svn-client-lock
X-SVN-Version-Name: 31

<?xml version="1.0" encoding="utf-8"?>
<lockinfo xmlns='DAV:'>
  <lockscope><exclusive/></lockscope>
<locktype><write/></locktype><owner>this <is> a comment.</owner>
</lockinfo>

HTTP/1.1 400 Bad Request
Date: Wed, 09 Mar 2005 15:57:15 GMT
Server: Apache/2.0.51 (Unix) SVN/1.2.0-dev DAV/2
Content-Length: 321
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not 
understand.<br />
[...]


       So for starters, the ne_lock() call isn't xml-escaping the owner
       field.  But we already discussed this on neon's dev list, and
       you added it to your to-do list.  No big deal.


Tweaked Status Quo:


1. Try xml-escaping the outbound comment:

       nlock->owner = ne_strdup(apr_xml_quote_string(pool, comment, 1));

    And from the commandline, the lock works:

       $ svn lock Foo.java -m "this <is> a comment."
       'Foo.java' locked by user 'sussman'.

2. Ethereal shows me that the outbound comment is indeed xml-escaped:

LOCK /svn/testrepos/Foo.java HTTP/1.1
Host: localhost
User-Agent: SVN/1.2.0 (dev build) neon/0.24.7
Connection: TE
TE: trailers
Content-Length: 188
Content-Type: application/xml
Depth: 0
Authorization: Basic c3Vzc21hbjpibG9ydA==
X-SVN-Options: svn-client-lock
X-SVN-Version-Name: 31

<?xml version="1.0" encoding="utf-8"?>
<lockinfo xmlns='DAV:'>
  <lockscope><exclusive/></lockscope>
<locktype><write/></locktype><owner>this &lt;is&gt; a comment.</owner>
</lockinfo>


3. Meanwhile, back in httpd, here's the incoming lock that mod_dav is
    handing to mod_dav_svn, expecting it to be stored in the repository:

(gdb) p *dlock
$1 = {
   rectype = DAV_LOCKREC_DIRECT,
   is_locknull = 1,
   scope = DAV_LOCKSCOPE_EXCLUSIVE,
   type = DAV_LOCKTYPE_WRITE,
   depth = 0,
   timeout = 0,
   locktoken = 0x18d7e90,
   owner = 0x18d7f38 "<ns0:owner xmlns:ns0=\"DAV:\">this &lt;is&gt; a comment.</ns0:owner>",
   auth_user = 0x18d7f90 "sussman",
   info = 0x0,
   next = 0x0
}

        ... sure looks like the comment is still xml-escaped!

So... how is this possible?



Re: opposite of apr_xml_quote_string() ?

Posted by Joe Orton <jo...@redhat.com>.
On Tue, Mar 08, 2005 at 05:10:46PM -0600, Ben Collins-Sussman wrote:
> 
> On Mar 8, 2005, at 2:20 PM, Joe Orton wrote:
> 
> >On Tue, Mar 08, 2005 at 11:41:51AM -0600, Ben Collins-Sussman wrote:
> >>I'm working on the svn 1.2 "locking" feature, and I've come to a point
> >>where mod_dav_svn needs to xml-unescape the incoming comment attached
> >>to a DAV lock, so that it ends up being stored in the svn repository as
> >>something human-readable.  (For complex reasons, neither httpd nor
> >>mod_dav is xml-unescaping the data.)
> >>
> >>I've searched high and low through apr, apr-util, httpd, and svn APIs.
> >>I've found apr_xml_quote_string() as a nice way of xml-escaping data,
> >>but I've not found anything to xml-unescape.
> >
> >It's because the XML parser does it automatically; you never normally
> >get to see the escaped form when parsing the XML.  How complex are these
> >complex reasons - is the data getting XML-escaped twice, or something?
> >
> 
> An svn client uses neon to send a DAV lock to mod_dav.  mod_dav treats 
> the <D:owner> field as sacred, and refuses to xml-unescape it or touch 
> it at all.  Then it hands all the fields down to mod_dav_svn.

mod_dav never chooses nor refuses to "XML unescape" anything: the XML
parser *always* does it unconditionally, so I still don't understand
what's going on.  If the fields passed down to mod_dav_svn are
XML-escaped, either mod_dav has *chosen* to re-XML-escape it (I can't
see where that would be happening), or it was double-escaped to begin
with.
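
To make the double-escaping case concrete, here is a minimal sketch in plain C. xml_quote is a hypothetical stand-in for an escaping routine such as apr_xml_quote_string(); it is an assumption for illustration, not the APR implementation, and it handles only '<', '>' and '&'.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an XML-escaping routine (not the APR code):
 * returns a malloc'd copy of s with '<', '>' and '&' replaced by their
 * entities.  The caller frees the result.  Returns NULL on allocation
 * failure. */
char *xml_quote(const char *s)
{
    size_t len = 0;
    const char *p;
    char *out, *o;

    /* First pass: work out how long the escaped string will be. */
    for (p = s; *p != '\0'; p++) {
        if (*p == '<' || *p == '>')
            len += 4;               /* "&lt;" or "&gt;" */
        else if (*p == '&')
            len += 5;               /* "&amp;" */
        else
            len += 1;
    }

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy, substituting entities as we go. */
    o = out;
    for (p = s; *p != '\0'; p++) {
        if (*p == '<') {
            memcpy(o, "&lt;", 4);
            o += 4;
        } else if (*p == '>') {
            memcpy(o, "&gt;", 4);
            o += 4;
        } else if (*p == '&') {
            memcpy(o, "&amp;", 5);
            o += 5;
        } else {
            *o++ = *p;
        }
    }
    *o = '\0';
    return out;
}
```

Quoting "this <is> a comment." once gives "this &lt;is&gt; a comment."; quoting that result again gives "this &amp;lt;is&amp;gt; a comment.".  A single pass through the XML parser undoes only one layer, so text that was escaped twice still looks escaped after parsing.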

Can you check protocol traces?

joe

Re: opposite of apr_xml_quote_string() ?

Posted by Ben Collins-Sussman <su...@collab.net>.
On Mar 8, 2005, at 2:20 PM, Joe Orton wrote:

> On Tue, Mar 08, 2005 at 11:41:51AM -0600, Ben Collins-Sussman wrote:
>> I'm working on the svn 1.2 "locking" feature, and I've come to a point
>> where mod_dav_svn needs to xml-unescape the incoming comment attached
>> to a DAV lock, so that it ends up being stored in the svn repository as
>> something human-readable.  (For complex reasons, neither httpd nor
>> mod_dav is xml-unescaping the data.)
>>
>> I've searched high and low through apr, apr-util, httpd, and svn APIs.
>> I've found apr_xml_quote_string() as a nice way of xml-escaping data,
>> but I've not found anything to xml-unescape.
>
> It's because the XML parser does it automatically; you never normally
> get to see the escaped form when parsing the XML.  How complex are these
> complex reasons - is the data getting XML-escaped twice, or something?
>

An svn client uses neon to send a DAV lock to mod_dav.  mod_dav treats 
the <D:owner> field as sacred, and refuses to xml-unescape it or touch 
it at all.  Then it hands all the fields down to mod_dav_svn.

If the lock came from a generic DAV client, then mod_dav_svn stores the 
owner field verbatim, and retrieves it verbatim later on when asked.

If the lock came from a subversion client, then mod_dav_svn needs to 
make the value palatable to the rest of subversion.  Before storing in 
the repository, it strips away the <D:owner> tags, and xml-unescapes 
the data.  (ra_dav xml-escapes this 'comment' field before sending it 
over neon, since neon isn't doing it.)  When retrieving the lock later 
on, it remembers to xml-escape the data and re-add the <D:owner> tags, 
so that the http response is valid.
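
The "strip away the <D:owner> tags" step could be sketched roughly as below. This is plain C, and owner_inner_text is a hypothetical helper, not the actual mod_dav_svn code. It assumes the owner element holds only text (no nested elements) and that no attribute value contains a '>' character; the returned text is still xml-escaped.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the mod_dav_svn implementation: given a
 * serialized element such as
 *     <ns0:owner xmlns:ns0="DAV:">some text</ns0:owner>
 * return a malloc'd copy of the text between the first '>' and the last
 * "</".  Returns NULL if the input does not have that shape or if
 * allocation fails.  The text is returned as-is, still XML-escaped. */
char *owner_inner_text(const char *elem)
{
    const char *start = strchr(elem, '>');
    const char *end = NULL;
    const char *p;
    size_t len;
    char *out;

    if (start == NULL)
        return NULL;
    start++;                        /* step past the end of the open tag */

    /* Find the last "</", which should begin the closing tag. */
    for (p = start; (p = strstr(p, "</")) != NULL; p += 2)
        end = p;
    if (end == NULL)
        return NULL;

    len = (size_t)(end - start);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, start, len);
    out[len] = '\0';
    return out;
}
```

String slicing like this is fragile; running the element through a real XML parser would cope with nested elements and unusual attribute values, and would unescape the text at the same time.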

On the client side of subversion, the lock coming back from 
ne_lock_discover() needs to have the owner field unescaped as well.


Re: opposite of apr_xml_quote_string() ?

Posted by Joe Orton <jo...@redhat.com>.
On Tue, Mar 08, 2005 at 11:41:51AM -0600, Ben Collins-Sussman wrote:
> I'm working on the svn 1.2 "locking" feature, and I've come to a point 
> where mod_dav_svn needs to xml-unescape the incoming comment attached 
> to a DAV lock, so that it ends up being stored in the svn repository as 
> something human-readable.  (For complex reasons, neither httpd nor 
> mod_dav is xml-unescaping the data.)
> 
> I've searched high and low through apr, apr-util, httpd, and svn APIs.  
> I've found apr_xml_quote_string() as a nice way of xml-escaping data, 
> but I've not found anything to xml-unescape.

It's because the XML parser does it automatically; you never normally
get to see the escaped form when parsing the XML.  How complex are these
complex reasons - is the data getting XML-escaped twice, or something?

joe