You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Ben Reser <be...@reser.org> on 2014/03/01 04:07:37 UTC

Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

On 2/25/14, 4:21 PM, Leo Davis wrote:
> I recently discovered that the old 'check-mime-type.pl' contrib script is I 
> had installed on my server was broken when using subversion 1.8.5.  I looked 
> for an updated script in the 1.8.5 branch and in trunk and discovered the 
> script was also broken in those places as well.  After some digging, I found 
> out when this change happened and patched the script.  I suspect (but don't 
> know for sure) that this only affects subversion 1.8.x.
> 
> This patch is from trunk.
> 
> [[[
>    Fix check-mime-type.pl for output changes to 'svnlook proplist'.
> 
>    The output format of 'svnlook proplist' was changed in revision 1416637.
> 
>    See also http://svn.haxx.se/dev/archive-2012-11/0510.shtml
> 
>    * contrib/hook-scripts/check-mime-type.pl:  Fix reading process output from 
> 'svnlook proplist'.
> ]]]

First of all thanks for your contribution.  You're quite right that this is
broken for 1.8.x and trunk.

Unfortunately, your patch isn't quite right either.

If you set a property with contains "svn:mime-type" or "svn:eol-style" at the
start of a line or with only leading whitespace your parser will read that as
the name of a property and will read the next line as the property value.

I'd also avoid the assumption that svn:mime-type and svn:eol-style properties
are always only a single line or that they have no leading spaces in the
values.  While the svn client makes efforts to avoid you setting invalid
things, the repository backend makes no effort to block this.

To that end I'd suggest that you implement a finite state machine to parse
this.  With the following states:

Starting state is fileheader.

fileheader: Sanity check state by checking that line has zero whitespace at
start (tempting to use /^Properties/ but that breaks with non-English locales).
 Next state is propname.

propname: Sanity check state by checking that line has at exactly 2 leading
spaces (not sure if we allow leading whitespace in property names, the client
disallows it and I think it's hard to marshal it over the network proptocols
but might still be possible with repos/fs layer).  Next state is propval.

propval: Has at least 4 leading spaces.  Only 4 leading spaces are removed from
the start of each line before adding to the property value.  Next state is
either propval, propname, or eof.  Determined by looking at next line (number
of leading spaces or eof result).

eof: End parsing.

Can you take on implementing the above?

Note that the above states does not account for --show-inherited-props, but I
don't think that's a problem since the hook script doesn't use that.  To
support it you'd need to allow multiple fileheader lines (Inherited line is
followed by from line) and empty line ahead of fileheader lines.  So you'd have
to look ahead up to two lines for that.

Side note if it is possible to get property names with leading spaces in them
it makes the output of svnlook proplist --verbose impossible to parse reliably.
 Sadly our old format had a clearer problem with this because setting another
property with multiple lines could end up being interpreted as a property.
E.G. say I sent a property named "zzz-bypass-mime-check" with the value:
line1
  svn:mime-type : expected/type

The name zzz is to make it be the last property the script sees.  At least with
fsfs we seem to always return the properties in alphabetical order.  I'm not
seeing any sorting but I suspect we're putting property values in the
representation in a stable order and so we end up with them coming out in the
same order.

RE: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

Posted by Leo Davis <ld...@speechfxinc.com>.
Hello,

With more explanation of how properties work now, I agree that my patch is broken.

I can implement a FSM to get the properties the right way.

I had considered using XML output, but didn't want to add needless complexity, especially to old Perl code.  I may have to rethink that if the network layer or repos/fs hands me property names with leading spaces.

It should be possible to try and inject leading spaces to the property name without the client over the network.   I'm not sure how to attempt something similar with the repos/fs layer.

Regards,

Leo

________________________________________
From: Ben Reser <be...@reser.org>
Sent: Friday, February 28, 2014 8:16 PM
To: Leo Davis; dev@subversion.apache.org
Subject: Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

On 2/28/14, 7:07 PM, Ben Reser wrote:
> To that end I'd suggest that you implement a finite state machine to parse
> this.  With the following states:
>
> Starting state is fileheader.
>
> fileheader: Sanity check state by checking that line has zero whitespace at
> start (tempting to use /^Properties/ but that breaks with non-English locales).
>  Next state is propname.
>
> propname: Sanity check state by checking that line has at exactly 2 leading
> spaces (not sure if we allow leading whitespace in property names, the client
> disallows it and I think it's hard to marshal it over the network proptocols
> but might still be possible with repos/fs layer).  Next state is propval.
>
> propval: Has at least 4 leading spaces.  Only 4 leading spaces are removed from
> the start of each line before adding to the property value.  Next state is
> either propval, propname, or eof.  Determined by looking at next line (number
> of leading spaces or eof result).
>
> eof: End parsing.
>
> Can you take on implementing the above?
>
> Note that the above states does not account for --show-inherited-props, but I
> don't think that's a problem since the hook script doesn't use that.  To
> support it you'd need to allow multiple fileheader lines (Inherited line is
> followed by from line) and empty line ahead of fileheader lines.  So you'd have
> to look ahead up to two lines for that.
>
> Side note if it is possible to get property names with leading spaces in them
> it makes the output of svnlook proplist --verbose impossible to parse reliably.
>  Sadly our old format had a clearer problem with this because setting another
> property with multiple lines could end up being interpreted as a property.
> E.G. say I sent a property named "zzz-bypass-mime-check" with the value:
> line1
>   svn:mime-type : expected/type
>
> The name zzz is to make it be the last property the script sees.  At least with
> fsfs we seem to always return the properties in alphabetical order.  I'm not
> seeing any sorting but I suspect we're putting property values in the
> representation in a stable order and so we end up with them coming out in the
> same order.

Forgot to mention.  You can always use --xml.  But then you need a full fledge
XML parser.  Which brings its own problems but wouldn't have these issues.


Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

Posted by Ben Reser <be...@reser.org>.
On 2/28/14, 7:07 PM, Ben Reser wrote:
> To that end I'd suggest that you implement a finite state machine to parse
> this.  With the following states:
> 
> Starting state is fileheader.
> 
> fileheader: Sanity check state by checking that line has zero whitespace at
> start (tempting to use /^Properties/ but that breaks with non-English locales).
>  Next state is propname.
> 
> propname: Sanity check state by checking that line has at exactly 2 leading
> spaces (not sure if we allow leading whitespace in property names, the client
> disallows it and I think it's hard to marshal it over the network proptocols
> but might still be possible with repos/fs layer).  Next state is propval.
> 
> propval: Has at least 4 leading spaces.  Only 4 leading spaces are removed from
> the start of each line before adding to the property value.  Next state is
> either propval, propname, or eof.  Determined by looking at next line (number
> of leading spaces or eof result).
> 
> eof: End parsing.
> 
> Can you take on implementing the above?
> 
> Note that the above states does not account for --show-inherited-props, but I
> don't think that's a problem since the hook script doesn't use that.  To
> support it you'd need to allow multiple fileheader lines (Inherited line is
> followed by from line) and empty line ahead of fileheader lines.  So you'd have
> to look ahead up to two lines for that.
> 
> Side note if it is possible to get property names with leading spaces in them
> it makes the output of svnlook proplist --verbose impossible to parse reliably.
>  Sadly our old format had a clearer problem with this because setting another
> property with multiple lines could end up being interpreted as a property.
> E.G. say I sent a property named "zzz-bypass-mime-check" with the value:
> line1
>   svn:mime-type : expected/type
> 
> The name zzz is to make it be the last property the script sees.  At least with
> fsfs we seem to always return the properties in alphabetical order.  I'm not
> seeing any sorting but I suspect we're putting property values in the
> representation in a stable order and so we end up with them coming out in the
> same order.

Forgot to mention.  You can always use --xml.  But then you need a full fledge
XML parser.  Which brings its own problems but wouldn't have these issues.


Re: Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

Posted by Matthias Buecher / Germany <ma...@maddes.net>.
On 17.03.2014 20:25, Leo Davis wrote:
> Hello,
>
> I decided to go with the last approach.  I've tried it on subversion 1.1.4 and 
> 1.8.5, and the simple testcases seem to indicate it works on both of them.
>
> I've attached the patch (from trunk) and a small collection of testcases in a 
> tarball.
>
> [[[
>    Fix check-mime-type.pl for output changes to 'svnlook proplist' in a 
> maximally backward compatible way.  Tested against subversion 1.1.4 and 1.8.5.
>
>    The output format of 'svnlook proplist' was changed in revision 1416637.
>
>    See also http://svn.haxx.se/dev/archive-2012-11/0510.shtml
>
>    * contrib/hook-scripts/check-mime-type.pl:  
>      -- Also test files when properties have been modified, not just added, to 
> ensure that properties don't vanish.
>      -- Replace "svnlook proplist --verbose" with a two step process of 
> "svnlook proplist" and "svnlook propget".  This was done because   "svnlook 
> proplist --verbose" cannot unambiguously be parsed without --xml with 
> multiline properties.
>       --  Remove unused code.
> ]]]
>
> Leo
>
> On Monday, March 03, 2014 09:12:32 AM Leo Davis wrote:
> > Hello,
> > 
> > Another approach is to dump 'svnlook proplist' altogether and use 'svnlook
> > propget svn:mime-type' and 'svnlook propget svn:eol-style' instead.  That
> > could be maximally backward compatible without introducing XML.
> > 
> > Regards,
> > 
> > Leo

Tested Leo's patch successfully on Windows 7(SVN 1.6 - 1.8) and Debian
am64 (SVN 1.6).

Only point with the patch is, that I would keep the chdir to the temp dir.
Just posted a patch for sane temp dir usage and Windows support minutes ago.

Maddes


> > ________________________________________
> > From: Ben Reser <be...@reser.org>
> > Sent: Sunday, March 02, 2014 11:40 PM
> > To: Leo Davis; Daniel Shahaf
> > Cc: dev@subversion.apache.org
> > Subject: Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook
> > proplist' output
> > On 3/2/14, 5:34 PM, Leo Davis wrote:
> > > As Ben pointed out, the current parser in the script for svnlook <= 1.7.x
> > > is broken and unfixable for multiline properties.  The closest we can get
> > > to DTRT in this case is to have svnlook output XML.  Hopefully no one
> > > still cares about svnlook <= 1.3 (?) that cannot output XML.> 
> > > On Mar 2, 2014, at 8:11 AM, "Daniel Shahaf" <d....@daniel.shahaf.name> 
> wrote:
> > >> One more issue: however you change the parser, it will break if
> > >> svnlook1.7 or older is used.  It would be nice to DTRT in that case
> > >> (either error out or retain the old parser).
> > 
> > It would be nice to have, but I think the effort to provide it is just too
> > great unless we go down the XML path.  Which is a pretty large change in the
> > requirements of the script.
> > 
> > Anyone that wants it can go get a copy of the script off the 1.7.x branch
> > (assuming we merge the fix to the 1.8.x branch).  This is contrib we have no
> > compatibility guarantees to worry about either.
> > 
> > Just put a prominent notice at the top of the script saying that it is only
> > intended for use with 1.8.x or newer servers.


Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

Posted by Leo Davis <ld...@speechfxinc.com>.
Hello,

I decided to go with the last approach.  I've tried it on subversion 1.1.4 and 
1.8.5, and the simple testcases seem to indicate it works on both of them.

I've attached the patch (from trunk) and a small collection of testcases in a 
tarball.

[[[
   Fix check-mime-type.pl for output changes to 'svnlook proplist' in a 
maximally backward compatible way.  Tested against subversion 1.1.4 and 1.8.5.

   The output format of 'svnlook proplist' was changed in revision 1416637.

   See also http://svn.haxx.se/dev/archive-2012-11/0510.shtml

   * contrib/hook-scripts/check-mime-type.pl:  
     -- Also test files when properties have been modified, not just added, to 
ensure that properties don't vanish.
     -- Replace "svnlook proplist --verbose" with a two step process of 
"svnlook proplist" and "svnlook propget".  This was done because   "svnlook 
proplist --verbose" cannot unambiguously be parsed without --xml with 
multiline properties.
      --  Remove unused code.
]]]

Leo

On Monday, March 03, 2014 09:12:32 AM Leo Davis wrote:
> Hello,
> 
> Another approach is to dump 'svnlook proplist' altogether and use 'svnlook
> propget svn:mime-type' and 'svnlook propget svn:eol-style' instead.  That
> could be maximally backward compatible without introducing XML.
> 
> Regards,
> 
> Leo
> ________________________________________
> From: Ben Reser <be...@reser.org>
> Sent: Sunday, March 02, 2014 11:40 PM
> To: Leo Davis; Daniel Shahaf
> Cc: dev@subversion.apache.org
> Subject: Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook
> proplist' output
> On 3/2/14, 5:34 PM, Leo Davis wrote:
> > As Ben pointed out, the current parser in the script for svnlook <= 1.7.x
> > is broken and unfixable for multiline properties.  The closest we can get
> > to DTRT in this case is to have svnlook output XML.  Hopefully no one
> > still cares about svnlook <= 1.3 (?) that cannot output XML.> 
> > On Mar 2, 2014, at 8:11 AM, "Daniel Shahaf" <d....@daniel.shahaf.name> 
wrote:
> >> One more issue: however you change the parser, it will break if
> >> svnlook1.7 or older is used.  It would be nice to DTRT in that case
> >> (either error out or retain the old parser).
> 
> It would be nice to have, but I think the effort to provide it is just too
> great unless we go down the XML path.  Which is a pretty large change in the
> requirements of the script.
> 
> Anyone that wants it can go get a copy of the script off the 1.7.x branch
> (assuming we merge the fix to the 1.8.x branch).  This is contrib we have no
> compatibility guarantees to worry about either.
> 
> Just put a prominent notice at the top of the script saying that it is only
> intended for use with 1.8.x or newer servers.

RE: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

Posted by Leo Davis <ld...@speechfxinc.com>.
Hello,

Another approach is to dump 'svnlook proplist' altogether and use 'svnlook propget svn:mime-type' and 'svnlook propget svn:eol-style' instead.  That could be maximally backward compatible without introducing XML.

Regards,

Leo
________________________________________
From: Ben Reser <be...@reser.org>
Sent: Sunday, March 02, 2014 11:40 PM
To: Leo Davis; Daniel Shahaf
Cc: dev@subversion.apache.org
Subject: Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

On 3/2/14, 5:34 PM, Leo Davis wrote:
> As Ben pointed out, the current parser in the script for svnlook <= 1.7.x is broken and unfixable for multiline properties.  The closest we can get to DTRT in this case is to have svnlook output XML.  Hopefully no one still cares about svnlook <= 1.3 (?) that cannot output XML.
>
> On Mar 2, 2014, at 8:11 AM, "Daniel Shahaf" <d....@daniel.shahaf.name> wrote:
>> One more issue: however you change the parser, it will break if
>> svnlook1.7 or older is used.  It would be nice to DTRT in that case
>> (either error out or retain the old parser).

It would be nice to have, but I think the effort to provide it is just too
great unless we go down the XML path.  Which is a pretty large change in the
requirements of the script.

Anyone that wants it can go get a copy of the script off the 1.7.x branch
(assuming we merge the fix to the 1.8.x branch).  This is contrib we have no
compatibility guarantees to worry about either.

Just put a prominent notice at the top of the script saying that it is only
intended for use with 1.8.x or newer servers.



Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

Posted by Ben Reser <be...@reser.org>.
On 3/2/14, 5:34 PM, Leo Davis wrote:
> As Ben pointed out, the current parser in the script for svnlook <= 1.7.x is broken and unfixable for multiline properties.  The closest we can get to DTRT in this case is to have svnlook output XML.  Hopefully no one still cares about svnlook <= 1.3 (?) that cannot output XML.
> 
> On Mar 2, 2014, at 8:11 AM, "Daniel Shahaf" <d....@daniel.shahaf.name> wrote:
>> One more issue: however you change the parser, it will break if
>> svnlook1.7 or older is used.  It would be nice to DTRT in that case
>> (either error out or retain the old parser).

It would be nice to have, but I think the effort to provide it is just too
great unless we go down the XML path.  Which is a pretty large change in the
requirements of the script.

Anyone that wants it can go get a copy of the script off the 1.7.x branch
(assuming we merge the fix to the 1.8.x branch).  This is contrib we have no
compatibility guarantees to worry about either.

Just put a prominent notice at the top of the script saying that it is only
intended for use with 1.8.x or newer servers.



Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

Posted by Leo Davis <ld...@speechfxinc.com>.
Hello,

As Ben pointed out, the current parser in the script for svnlook <= 1.7.x is broken and unfixable for multiline properties.  The closest we can get to DTRT in this case is to have svnlook output XML.  Hopefully no one still cares about svnlook <= 1.3 (?) that cannot output XML.

Regards,

Leo

On Mar 2, 2014, at 8:11 AM, "Daniel Shahaf" <d....@daniel.shahaf.name> wrote:

> Ben Reser wrote on Fri, Feb 28, 2014 at 19:07:37 -0800:
>> On 2/25/14, 4:21 PM, Leo Davis wrote:
>>> I recently discovered that the old 'check-mime-type.pl' contrib script is I 
>>> had installed on my server was broken when using subversion 1.8.5.  I looked 
>>> for an updated script in the 1.8.5 branch and in trunk and discovered the 
>>> script was also broken in those places as well.  After some digging, I found 
>>> out when this change happened and patched the script.  I suspect (but don't 
>>> know for sure) that this only affects subversion 1.8.x.
>>> 
>>> This patch is from trunk.
>>> 
>>> [[[
>>>   Fix check-mime-type.pl for output changes to 'svnlook proplist'.
>>> 
>>>   The output format of 'svnlook proplist' was changed in revision 1416637.
>>> 
>>>   See also http://svn.haxx.se/dev/archive-2012-11/0510.shtml
>>> 
>>>   * contrib/hook-scripts/check-mime-type.pl:  Fix reading process output from 
>>> 'svnlook proplist'.
>>> ]]]
>> 
>> First of all thanks for your contribution.  You're quite right that this is
>> broken for 1.8.x and trunk.
>> 
>> Unfortunately, your patch isn't quite right either.
>> 
>> [description of parsing problems]
> 
> One more issue: however you change the parser, it will break if
> svnlook1.7 or older is used.  It would be nice to DTRT in that case
> (either error out or retain the old parser).

Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ben Reser wrote on Fri, Feb 28, 2014 at 19:07:37 -0800:
> On 2/25/14, 4:21 PM, Leo Davis wrote:
> > I recently discovered that the old 'check-mime-type.pl' contrib script is I 
> > had installed on my server was broken when using subversion 1.8.5.  I looked 
> > for an updated script in the 1.8.5 branch and in trunk and discovered the 
> > script was also broken in those places as well.  After some digging, I found 
> > out when this change happened and patched the script.  I suspect (but don't 
> > know for sure) that this only affects subversion 1.8.x.
> > 
> > This patch is from trunk.
> > 
> > [[[
> >    Fix check-mime-type.pl for output changes to 'svnlook proplist'.
> > 
> >    The output format of 'svnlook proplist' was changed in revision 1416637.
> > 
> >    See also http://svn.haxx.se/dev/archive-2012-11/0510.shtml
> > 
> >    * contrib/hook-scripts/check-mime-type.pl:  Fix reading process output from 
> > 'svnlook proplist'.
> > ]]]
> 
> First of all thanks for your contribution.  You're quite right that this is
> broken for 1.8.x and trunk.
> 
> Unfortunately, your patch isn't quite right either.
> 
> [description of parsing problems]

One more issue: however you change the parser, it will break if
svnlook1.7 or older is used.  It would be nice to DTRT in that case
(either error out or retain the old parser).