You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Johan Corveleyn <jc...@gmail.com> on 2017/12/18 14:20:47 UTC

list --search matching and Windows *-expansion

On Tue, Dec 5, 2017 at 10:12 PM, Evgeny Kotkov
<ev...@visualsvn.com> wrote:
> Stefan Fuhrmann <st...@apache.org> writes:
>
>> There seems to be little that could be done here (suggestions welcome).
>> The problem is that the asterisk is being expanded by the shell itself.
>> I made SVN print its command line parameters and this is the result:
>>
>>         $ ./subversion/svn/svn ls svn://localhost/kde --search M*
>>         0: ./subversion/svn/svn
>>         1: ls
>>         2: svn://localhost/kde
>>         3: --search
>>         4: Makefile
>>         5: Makefile.in
>>
>> That can be prevented by adding quotation marks:
>>
>>         $ ./subversion/svn/svn ls svn://localhost/kde --search "M*"
>>         0: ./subversion/svn/svn
>>         1: ls
>>         2: svn://localhost/kde
>>         3: --search
>>         4: M*
>
> Unfortunately, on Windows both `--search M*` and (quoted) `--search "M*"`
> would expand the asterisk.  While this is not the default shell behavior,
> currently it's enabled for svn and a couple of other binaries by linking
> to setargv.obj.  In turn, this probably means the command-line client
> users on Windows could get quite unexpected results when using the
> `--search ARG` syntax.
>
> A potential cheap solution for this issue, I think, would be to make the
> --search argument work as a substring to search for in filenames, instead
> of using it as a pattern that the (whole) filename should match.  While
> there are some cases where the latter approach gives more flexibility,
> my guess would be that a substring search would work well in the majority
> of scenarios.
>
> (Also, as far as I recall, `log --search` currently searches for a substring,
>  so that would be consistent with it, and would probably avoid surprising
>  the users by having a switch with the same name, but behaving differently.)

Spinning off this discussion from the "Subversion 1.10 RC1?" thread ...

Do we still want to fix this somehow for 1.10?

Doug Robinson pointed out that there is no way to escape * from the
shell on Windows [1]. And Branko hinted at doing the glob expansion
ourselves, instead of depending on setargv.obj, but that didn't seem
very realistic (as in: is there someone to do the work?).

Another option is the cheap solution that Evgeny proposes: always
doing substring matching. We discussed that before [2], and most were
not in favor of it, but maybe we have no choice ...

[1] https://stackoverflow.com/questions/47512829/is-it-possible-to-quote-escape-command-line-arguments-on-windows-while-linking-s

[2] https://svn.haxx.se/dev/archive-2017-09/0002.shtml

-- 
Johan

Re: list --search matching and Windows *-expansion

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
On 21.12.2017 14:50, Stefan Fuhrmann wrote:
> On 19.12.2017 11:06, Stefan Sperling wrote:
>> On Mon, Dec 18, 2017 at 08:59:08PM +0000, Julian Foad wrote:
>>> Stefan Sperling wrote:
>>>> On Mon, Dec 18, 2017 at 06:01:52PM +0000, Daniel Shahaf wrote:
>>>>> Branko Čibej wrote on Mon, 18 Dec 2017 16:24 +0100:
>>>>>> Or use a different set of match chars on Windows instead of * and 
>>>>>> ?, as
>>>>>> a hacky but usable workaround.  [...]
>>> [...]
>>>>> a magic prefix, so «--search "nostar:foo#bar%"» [...]
>>> [...]
>>>> define some mapping between our fnmatch special chars
>>>> [...] Windows users would [...] use the set of chars which works 
>>>> for them.
>>> Reality check, please!
>>>
>>> This is meant to be a simple feature to help people find files, is 
>>> it not?
>>>
>>> Magic prefixes and non-standard wildcard chars are a non-starter for
>>> usability.
>>>
>>> It would be better to disable the whole '--search' option, or the 
>>> wildcard
>>> support in it, on the Windows CLI.
>> I think we should keep coming up with more ideas until somebody gets 
>> enough
>> of it all and sends a patch to remove our dependency on setargv.obj :)
>>
> Well, that escalated quickly ;)
>
> While I liked Brane's idea for being easy to implement,
> I think Julian's point concerning usability is stronger.
> Therefore, I will go with my initial proposal to make
> 'svn ls --search' in the Windows CLI work the same as
> 'svn log --search'.

Committed in r1819036.

-- Stefan^2.

Re: list --search matching and Windows *-expansion

Posted by Branko Čibej <br...@apache.org>.
On 21.12.2017 14:50, Stefan Fuhrmann wrote:
> On 19.12.2017 11:06, Stefan Sperling wrote:
>> On Mon, Dec 18, 2017 at 08:59:08PM +0000, Julian Foad wrote:
>>> Stefan Sperling wrote:
>>>> On Mon, Dec 18, 2017 at 06:01:52PM +0000, Daniel Shahaf wrote:
>>>>> Branko Čibej wrote on Mon, 18 Dec 2017 16:24 +0100:
>>>>>> Or use a different set of match chars on Windows instead of * and
>>>>>> ?, as
>>>>>> a hacky but usable workaround.  [...]
>>> [...]
>>>>> a magic prefix, so «--search "nostar:foo#bar%"» [...]
>>> [...]
>>>> define some mapping between our fnmatch special chars
>>>> [...] Windows users would [...] use the set of chars which works
>>>> for them.
>>> Reality check, please!
>>>
>>> This is meant to be a simple feature to help people find files, is
>>> it not?
>>>
>>> Magic prefixes and non-standard wildcard chars are a non-starter for
>>> usability.
>>>
>>> It would be better to disable the whole '--search' option, or the
>>> wildcard
>>> support in it, on the Windows CLI.
>> I think we should keep coming up with more ideas until somebody gets
>> enough
>> of it all and sends a patch to remove our dependency on setargv.obj :)
>>
> Well, that escalated quickly ;)
>
> While I liked Brane's idea for being easy to implement,
> I think Julian's point concerning usability is stronger.
> Therefore, I will go with my initial proposal to make
> 'svn ls --search' in the Windows CLI work the same as
> 'svn log --search'.
>
> Given that we have fnmatch available, replacing setargv.obj
> in 1.11 should not be too much of an effort. I don't have
> access to a Windows environment ATM, though.

It's not as simple as all that. We should not just replace setargv.obj
but also implement some kind of quoting (preferably compatible with the
Windows command line in general). Otherwise the wildcard expansion logic
will bleed into the argument interpretation, and we definitely do not
want to fill the command-line code with a mountain of #ifdefs.

-- Brane

Re: list --search matching and Windows *-expansion

Posted by Stefan Fuhrmann <st...@apache.org>.
On 19.12.2017 11:06, Stefan Sperling wrote:
> On Mon, Dec 18, 2017 at 08:59:08PM +0000, Julian Foad wrote:
>> Stefan Sperling wrote:
>>> On Mon, Dec 18, 2017 at 06:01:52PM +0000, Daniel Shahaf wrote:
>>>> Branko Čibej wrote on Mon, 18 Dec 2017 16:24 +0100:
>>>>> Or use a different set of match chars on Windows instead of * and ?, as
>>>>> a hacky but usable workaround.  [...]
>> [...]
>>>> a magic prefix, so «--search "nostar:foo#bar%"» [...]
>> [...]
>>> define some mapping between our fnmatch special chars
>>> [...] Windows users would [...] use the set of chars which works for them.
>> Reality check, please!
>>
>> This is meant to be a simple feature to help people find files, is it not?
>>
>> Magic prefixes and non-standard wildcard chars are a non-starter for
>> usability.
>>
>> It would be better to disable the whole '--search' option, or the wildcard
>> support in it, on the Windows CLI.
> I think we should keep coming up with more ideas until somebody gets enough
> of it all and sends a patch to remove our dependency on setargv.obj :)
>
Well, that escalated quickly ;)

While I liked Brane's idea for being easy to implement,
I think Julian's point concerning usability is stronger.
Therefore, I will go with my initial proposal to make
'svn ls --search' in the Windows CLI work the same as
'svn log --search'.

Given that we have fnmatch available, replacing setargv.obj
in 1.11 should not be too much of an effort. I don't have
access to a Windows environment ATM, though.

-- Stefan^2.


Re: list --search matching and Windows *-expansion

Posted by Stefan Sperling <st...@elego.de>.
On Mon, Dec 18, 2017 at 08:59:08PM +0000, Julian Foad wrote:
> Stefan Sperling wrote:
> > On Mon, Dec 18, 2017 at 06:01:52PM +0000, Daniel Shahaf wrote:
> > > Branko Čibej wrote on Mon, 18 Dec 2017 16:24 +0100:
> > > > Or use a different set of match chars on Windows instead of * and ?, as
> > > > a hacky but usable workaround.  [...]
> [...]
> > > a magic prefix, so «--search "nostar:foo#bar%"» [...]
> [...]
> > define some mapping between our fnmatch special chars
> > [...] Windows users would [...] use the set of chars which works for them.
> 
> Reality check, please!
> 
> This is meant to be a simple feature to help people find files, is it not?
> 
> Magic prefixes and non-standard wildcard chars are a non-starter for
> usability.
> 
> It would be better to disable the whole '--search' option, or the wildcard
> support in it, on the Windows CLI.

I think we should keep coming up with more ideas until somebody gets enough
of it all and sends a patch to remove our dependency on setargv.obj :)

Re: list --search matching and Windows *-expansion

Posted by Julian Foad <ju...@apache.org>.
Stefan Sperling wrote:
> On Mon, Dec 18, 2017 at 06:01:52PM +0000, Daniel Shahaf wrote:
>> Branko Čibej wrote on Mon, 18 Dec 2017 16:24 +0100:
>>> Or use a different set of match chars on Windows instead of * and ?, as
>>> a hacky but usable workaround.  [...]
[...]
>> a magic prefix, so «--search "nostar:foo#bar%"» [...]
[...]
> define some mapping between our fnmatch special chars
> [...] Windows users would [...] use the set of chars which works for them.

Reality check, please!

This is meant to be a simple feature to help people find files, is it not?

Magic prefixes and non-standard wildcard chars are a non-starter for 
usability.

It would be better to disable the whole '--search' option, or the 
wildcard support in it, on the Windows CLI.

- Julian


(Darn, I was trying so hard to keep out of this thread...)

Re: list --search matching and Windows *-expansion

Posted by Stefan Sperling <st...@elego.de>.
On Mon, Dec 18, 2017 at 06:01:52PM +0000, Daniel Shahaf wrote:
> Branko Čibej wrote on Mon, 18 Dec 2017 16:24 +0100:
> > Or use a different set of match chars on Windows instead of * and ?, as
> > a hacky but usable workaround. That would mean that, only on windows,
> > and in the argument of --search, we would replace, say, % and # with *
> > and ? or some such. I'd even go as far as to add a warning that this
> > substitution will be removed once (if?) we get rid of setargv.obj.
> 
> I'm concerned about API divergence between Windows and Unix.  Could we
> try to minimise the possibilities for Windows admins to write code (say,
> Python code using subprocess.check_call()) that won't work on Unix, and
> vice-versa?
> 
> For example, how about if we only perform the substitution you describe
> if the pattern had a magic prefix, so «--search "nostar:foo#bar%"» would
> pass "foo*bar?" to the API's on *all* platforms?  This way, Windows code
> would work on Unix, and Unix code that actually searches for percent signs
> would work on Windows.

How about we use Brane's idea, but with a twist:

Essentially, we'd define some mapping between our fnmatch special chars
and another set of chars that won't cause problems on a windows command
line, chosen as needed to work around this windows-specific issue in
our command line client.

Then we could keep supporting these aliases on all platforms and it
wouldn't matter which set of chars people use. Windows users would
of course be advised to use the set of chars which works for them.

Re: list --search matching and Windows *-expansion

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Branko Čibej wrote on Mon, 18 Dec 2017 16:24 +0100:
> On 18.12.2017 15:33, Stefan Fuhrmann wrote:
> > On 18.12.2017 15:20, Johan Corveleyn wrote:
> >> Spinning off this discussion from the "Subversion 1.10 RC1?" thread ...
> >>
> >> Do we still want to fix this somehow for 1.10?
> >>
> >> Doug Robinson pointed out that there is no way to escape * from the
> >> shell on Windows [1]. And Branko hinted at doing the glob expansion
> >> ourselves, instead of depending on setargv.obj, but that didn't seem
> >> very realistic (as in: is there someone to do the work?).
> >>
> >> Another option is the cheap solution that Evgeny proposes: always
> >> doing substring matching. We discussed that before [2], and most were
> >> not in favor of it, but maybe we have no choice ...
> >
> > Elsethread, it was already mentioned that only the command
> > line client under Windows is effected; other Windows clients
> > like TSVN and language bindings are fine.
> >
> > So, lets add a workaround in the Windows CL client only such
> > that it will use sub-string search.  This will make the new
> > option still quite useful for Windows admins (i.e. the people
> > that might use the CL client).  Technically, the client will
> > pass the "*ParameterValue*" pattern to the underlying libs.
> > #ifdefs will guard that behavior.
> >
> > If nobody objects, I'd like to commit the patch tomorrow.
> 

+1 to the general concept of keeping the Unix API unchanged.

However, this specific solution doesn't allow Windows users to use
"foo*bar" patterns (= embedded asterisks).

> Or use a different set of match chars on Windows instead of * and ?, as
> a hacky but usable workaround. That would mean that, only on windows,
> and in the argument of --search, we would replace, say, % and # with *
> and ? or some such. I'd even go as far as to add a warning that this
> substitution will be removed once (if?) we get rid of setargv.obj.

I'm concerned about API divergence between Windows and Unix.  Could we
try to minimise the possibilities for Windows admins to write code (say,
Python code using subprocess.check_call()) that won't work on Unix, and
vice-versa?

For example, how about if we only perform the substitution you describe
if the pattern had a magic prefix, so «--search "nostar:foo#bar%"» would
pass "foo*bar?" to the API's on *all* platforms?  This way, Windows code
would work on Unix, and Unix code that actually searches for percent signs
would work on Windows.

Re: list --search matching and Windows *-expansion

Posted by Branko Čibej <br...@apache.org>.
On 18.12.2017 15:33, Stefan Fuhrmann wrote:
> On 18.12.2017 15:20, Johan Corveleyn wrote:
>> On Tue, Dec 5, 2017 at 10:12 PM, Evgeny Kotkov
>> <ev...@visualsvn.com> wrote:
>>> Stefan Fuhrmann <st...@apache.org> writes:
>>>
>>>> There seems to be little that could be done here (suggestions
>>>> welcome).
>>>> The problem is that the asterisk is being expanded by the shell
>>>> itself.
>>>> I made SVN print its command line parameters and this is the result:
>>>>
>>>>          $ ./subversion/svn/svn ls svn://localhost/kde --search M*
>>>>          0: ./subversion/svn/svn
>>>>          1: ls
>>>>          2: svn://localhost/kde
>>>>          3: --search
>>>>          4: Makefile
>>>>          5: Makefile.in
>>>>
>>>> That can be prevented by adding quotation marks:
>>>>
>>>>          $ ./subversion/svn/svn ls svn://localhost/kde --search "M*"
>>>>          0: ./subversion/svn/svn
>>>>          1: ls
>>>>          2: svn://localhost/kde
>>>>          3: --search
>>>>          4: M*
>>>
>>> Unfortunately, on Windows both `--search M*` and (quoted) `--search
>>> "M*"`
>>> would expand the asterisk.  While this is not the default shell
>>> behavior,
>>> currently it's enabled for svn and a couple of other binaries by
>>> linking
>>> to setargv.obj.  In turn, this probably means the command-line client
>>> users on Windows could get quite unexpected results when using the
>>> `--search ARG` syntax.
>>>
>>> A potential cheap solution for this issue, I think, would be to make
>>> the
>>> --search argument work as a substring to search for in filenames,
>>> instead
>>> of using it as a pattern that the (whole) filename should match.  While
>>> there are some cases where the latter approach gives more flexibility,
>>> my guess would be that a substring search would work well in the
>>> majority
>>> of scenarios.
>>>
>>> (Also, as far as I recall, `log --search` currently searches for a
>>> substring,
>>>   so that would be consistent with it, and would probably avoid
>>> surprising
>>>   the users by having a switch with the same name, but behaving
>>> differently.)
>>
>> Spinning off this discussion from the "Subversion 1.10 RC1?" thread ...
>>
>> Do we still want to fix this somehow for 1.10?
>>
>> Doug Robinson pointed out that there is no way to escape * from the
>> shell on Windows [1]. And Branko hinted at doing the glob expansion
>> ourselves, instead of depending on setargv.obj, but that didn't seem
>> very realistic (as in: is there someone to do the work?).
>>
>> Another option is the cheap solution that Evgeny proposes: always
>> doing substring matching. We discussed that before [2], and most were
>> not in favor of it, but maybe we have no choice ...
>
> Elsethread, it was already mentioned that only the command
> line client under Windows is effected; other Windows clients
> like TSVN and language bindings are fine.
>
> So, lets add a workaround in the Windows CL client only such
> that it will use sub-string search.  This will make the new
> option still quite useful for Windows admins (i.e. the people
> that might use the CL client).  Technically, the client will
> pass the "*ParameterValue*" pattern to the underlying libs.
> #ifdefs will guard that behavior.
>
> If nobody objects, I'd like to commit the patch tomorrow.

Or use a different set of match chars on Windows instead of * and ?, as
a hacky but usable workaround. That would mean that, only on windows,
and in the argument of --search, we would replace, say, % and # with *
and ? or some such. I'd even go as far as to add a warning that this
substitution will be removed once (if?) we get rid of setargv.obj.

-- Brane

Re: list --search matching and Windows *-expansion

Posted by Stefan Fuhrmann <st...@apache.org>.
On 18.12.2017 15:20, Johan Corveleyn wrote:
> On Tue, Dec 5, 2017 at 10:12 PM, Evgeny Kotkov
> <ev...@visualsvn.com> wrote:
>> Stefan Fuhrmann <st...@apache.org> writes:
>>
>>> There seems to be little that could be done here (suggestions welcome).
>>> The problem is that the asterisk is being expanded by the shell itself.
>>> I made SVN print its command line parameters and this is the result:
>>>
>>>          $ ./subversion/svn/svn ls svn://localhost/kde --search M*
>>>          0: ./subversion/svn/svn
>>>          1: ls
>>>          2: svn://localhost/kde
>>>          3: --search
>>>          4: Makefile
>>>          5: Makefile.in
>>>
>>> That can be prevented by adding quotation marks:
>>>
>>>          $ ./subversion/svn/svn ls svn://localhost/kde --search "M*"
>>>          0: ./subversion/svn/svn
>>>          1: ls
>>>          2: svn://localhost/kde
>>>          3: --search
>>>          4: M*
>>
>> Unfortunately, on Windows both `--search M*` and (quoted) `--search "M*"`
>> would expand the asterisk.  While this is not the default shell behavior,
>> currently it's enabled for svn and a couple of other binaries by linking
>> to setargv.obj.  In turn, this probably means the command-line client
>> users on Windows could get quite unexpected results when using the
>> `--search ARG` syntax.
>>
>> A potential cheap solution for this issue, I think, would be to make the
>> --search argument work as a substring to search for in filenames, instead
>> of using it as a pattern that the (whole) filename should match.  While
>> there are some cases where the latter approach gives more flexibility,
>> my guess would be that a substring search would work well in the majority
>> of scenarios.
>>
>> (Also, as far as I recall, `log --search` currently searches for a substring,
>>   so that would be consistent with it, and would probably avoid surprising
>>   the users by having a switch with the same name, but behaving differently.)
> 
> Spinning off this discussion from the "Subversion 1.10 RC1?" thread ...
> 
> Do we still want to fix this somehow for 1.10?
> 
> Doug Robinson pointed out that there is no way to escape * from the
> shell on Windows [1]. And Branko hinted at doing the glob expansion
> ourselves, instead of depending on setargv.obj, but that didn't seem
> very realistic (as in: is there someone to do the work?).
> 
> Another option is the cheap solution that Evgeny proposes: always
> doing substring matching. We discussed that before [2], and most were
> not in favor of it, but maybe we have no choice ...

Elsethread, it was already mentioned that only the command
line client under Windows is effected; other Windows clients
like TSVN and language bindings are fine.

So, lets add a workaround in the Windows CL client only such
that it will use sub-string search.  This will make the new
option still quite useful for Windows admins (i.e. the people
that might use the CL client).  Technically, the client will
pass the "*ParameterValue*" pattern to the underlying libs.
#ifdefs will guard that behavior.

If nobody objects, I'd like to commit the patch tomorrow.

-- Stefan^2.