Posted to dev@subversion.apache.org by Johan Corveleyn <jc...@gmail.com> on 2012/01/27 16:30:44 UTC

regexp matching in svn?

Hi,

Is there any existing regexp matching functionality somewhere in the
svn codebase, in a way that is reusable for new functionality? If not,
does anyone know of a reusable third-party library that could be
easily integrated (I know, it would be Yet Another Dependency, but
I'm just asking for options ...)?

Use-case: I like the '-x-p' (show C-function) option for 'svn diff',
but this is limited to C-like syntax. GNU diff also has a '-F RE'
option, where RE is a regular expression that is used for matching the
"function-line". I know I could make use of this functionality by
invoking GNU diff as external diff command, but for several reasons it
would be interesting if svn could do this internally (for one thing:
GNU diff isn't always available / installed).
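
To make the "function-line" idea concrete, here is a minimal sketch of the lookup that an option like '-F RE' implies: scan backwards from the start of a hunk and report the nearest line that matches the expression. It is an illustration only, not code from Subversion or GNU diff. The function name, its parameters and the sample data are made up, and Python's re syntax is not the same as the regex flavor GNU diff accepts.

```python
import re


def find_function_line(lines, hunk_start, pattern, max_width=40):
    """Return the closest line above hunk_start that matches pattern.

    lines      -- file contents, one string per line, without newlines
    hunk_start -- 0-based index of the first line shown in the hunk
    pattern    -- regular expression that identifies a "function line"
    max_width  -- keep at most this many characters of the matched line
    """
    regex = re.compile(pattern)
    # Walk backwards from the line just above the hunk.
    for index in range(hunk_start - 1, -1, -1):
        if regex.search(lines[index]):
            return lines[index][:max_width]
    return ""  # no matching line above the hunk


perl_source = [
    "sub first {",
    "    my $x = 1;",
    "}",
    "sub second {",
    "    my $y = 2;",
    "    return $y;",
    "}",
]

# A hunk starting at "    return $y;" is labelled with the enclosing sub.
print(find_function_line(perl_source, 5, r"^sub "))  # prints "sub second {"
```

The only part that needs a regex engine is the per-line match; the backward scan and the truncation are plain string handling.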

And I'm sure there would be other useful things people could do with
regexp functionality ...

Thoughts?

-- 
Johan

Re: regexp matching in svn?

Posted by Peter Samuelson <pe...@p12n.org>.
[Branko Cibej]
> It'd certainly be better to somehow get regular expressions included
> in APR.

The only problem there (aside from the apr 1.0 feature freeze you
mentioned) is that there are so many flavors of RE out there (Emacs,
Perl/PCRE/C#, POSIX Basic, POSIX Extended, and one or two others) that
it sounds like a giant bikeshed to be painted.  That, or they'd try to
support _all_ variants, with switch flags, and end up with a huge mess.

> We did fairly well without requiring regular expressions before
> ... and regexes are a bit of a temptation since they look like an
> all-purpose hammer, and many problems that could be solved without
> them suddenly depend on regular expressions. :)

OTOH, we _did_ stop using XML for wc metadata around 1.3 or 1.4.
Surely XML is an "all-purpose hammer" too.  (:

You're right about the temptation, of course.  In Perl, most
programmers use far more RE matches than strictly required to solve
their problems, since RE matching has such simple, convenient syntax.

Re: regexp matching in svn?

Posted by Branko Čibej <br...@apache.org>.
On 28.01.2012 11:53, Johan Corveleyn wrote:
> On Sat, Jan 28, 2012 at 11:12 AM, Branko Čibej <br...@apache.org> wrote:
>> On 28.01.2012 10:44, Daniel Shahaf wrote:
>>> Johan Corveleyn wrote on Sat, Jan 28, 2012 at 10:10:05 +0100:
>>>> Yes, it's an option. But it's a hassle. I'm investigating other options.
>>> Installing GNU diff requires deploying a precompiled .exe file and two
>>> .dll files into %PATH%.
>>>
>>> Implementing -F support in Subversion requires patching configure,
>>> patching the windows build, revving the relevant libsvn_diff APIs,
>>> waiting for the next minor release, and deploying it.
>>>
>>> Seems to me that the former is the path of least resistance.
>> Especially for the 500.000 users of Subversion on Windows, compared to a
>> few hours for one developer. Not to mention the wonders of DLL hell and
>> the oh-so-standard Windows installer.
> That was my thought as well. Though you're exaggerating just a bit :-)
> ... only a fraction of those 500.000 users will be interested in 'diff
> -F', and it'll be more than a few hours to get this into subversion,
> with the extra work on configure, build, ... (and testing those).
>
> It's a pity we don't have any regex functionality already in svn, or
> this would be a pretty quick win. I guess now it's not that clear cut:
> is it worth it to include an extra dependency just for the sake of
> this diff option? I'm not sure myself actually (I have little
> experience with all this). Still it would be a nice addition to the
> feature set, I think.

It'd certainly be better to somehow get regular expressions included in
APR. However, since APR-1.x is now in maintenance mode, that would imply
switching all of Subversion to APR-2.x, and I can't see that happening.

We did fairly well without requiring regular expressions before ... and
regexes are a bit of a temptation since they look like an all-purpose
hammer, and many problems that could be solved without them suddenly
depend on regular expressions. :)

-- Brane

Re: regexp matching in svn?

Posted by Johan Corveleyn <jc...@gmail.com>.
On Sat, Jan 28, 2012 at 11:12 AM, Branko Čibej <br...@apache.org> wrote:
> On 28.01.2012 10:44, Daniel Shahaf wrote:
>> Johan Corveleyn wrote on Sat, Jan 28, 2012 at 10:10:05 +0100:
>>> Yes, it's an option. But it's a hassle. I'm investigating other options.
>> Installing GNU diff requires deploying a precompiled .exe file and two
>> .dll files into %PATH%.
>>
>> Implementing -F support in Subversion requires patching configure,
>> patching the windows build, revving the relevant libsvn_diff APIs,
>> waiting for the next minor release, and deploying it.
>>
>> Seems to me that the former is the path of least resistance.
>
> Especially for the 500.000 users of Subversion on Windows, compared to a
> few hours for one developer. Not to mention the wonders of DLL hell and
> the oh-so-standard Windows installer.

That was my thought as well. Though you're exaggerating just a bit :-)
... only a fraction of those 500.000 users will be interested in 'diff
-F', and it'll be more than a few hours to get this into subversion,
with the extra work on configure, build, ... (and testing those).

It's a pity we don't have any regex functionality already in svn, or
this would be a pretty quick win. I guess now it's not that clear cut:
is it worth it to include an extra dependency just for the sake of
this diff option? I'm not sure myself actually (I have little
experience with all this). Still it would be a nice addition to the
feature set, I think.

-- 
Johan

Re: regexp matching in svn?

Posted by Branko Čibej <br...@apache.org>.
On 28.01.2012 10:44, Daniel Shahaf wrote:
> Johan Corveleyn wrote on Sat, Jan 28, 2012 at 10:10:05 +0100:
>> Yes, it's an option. But it's a hassle. I'm investigating other options.
> Installing GNU diff requires deploying a precompiled .exe file and two
> .dll files into %PATH%.
>
> Implementing -F support in Subversion requires patching configure,
> patching the windows build, revving the relevant libsvn_diff APIs,
> waiting for the next minor release, and deploying it.
>
> Seems to me that the former is the path of least resistance.

Especially for the 500.000 users of Subversion on Windows, compared to a
few hours for one developer. Not to mention the wonders of DLL hell and
the oh-so-standard Windows installer.

Nice reasoning. :) Pity we didn't think of it when we wasted time
implementing "svn diff" internally.

-- Brane


Re: regexp matching in svn?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Johan Corveleyn wrote on Sat, Jan 28, 2012 at 10:10:05 +0100:
> On Sat, Jan 28, 2012 at 9:54 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> > On Fri, Jan 27, 2012, at 21:30, Johan Corveleyn wrote:
> >> On Fri, Jan 27, 2012 at 4:44 PM, Bert Huijben <be...@qqmail.nl> wrote:
> >> >> -----Original Message-----
> >> >> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
> >> >> Sent: vrijdag 27 januari 2012 16:31
> >> >> To: Subversion Development
> >> >> Subject: regexp matching in svn?
> >> >>
> >> >> Hi,
> >> >>
> >> >> Is there any existing regexp matching functionality somewhere in the
> >> >> svn codebase, in a way that is reusable for new functionality? If not,
> >> >> does anyone know of a reusable third-party library that could be
> >> >> easily integrated (I know, it would be Yet Another Dependency, but
> >> >> I'm just asking for options ...)?
> >> >>
> >> >> Use-case: I like the '-x-p' (show C-function) option for 'svn diff',
> >> >> but this is limited to C-like syntax. GNU diff also has a '-F RE'
> >> >> option, where RE is a regular expression that is used for matching the
> >> >> "function-line". I know I could make use of this functionality by
> >> >> invoking GNU diff as external diff command, but for several reasons it
> >> >> would be interesting if svn could do this internally (for one thing:
> >> >> GNU diff isn't always available / installed).
> >> >>
> >> >> And I'm sure there would be other useful things people could do with
> >> >> regexp functionality ...
> >> >
> >> > Apr (or Apr-Util) has regex support, not sure which but we require both.
> >>
> >> Thanks, but I can't find it. I found apr_strmatch in apr-util, but
> >> that doesn't do regex, it just matches a fixed string (which would
> >> also be useful (it actually covers my concrete use-case), but it's not
> >> -F). Maybe I'm overlooking something.
> >>
> >> If it's not in apr(-util) or some other dependency we already have,
> >> how about PCRE (www.pcre.org) ?
> >
> > I must have missed the point where you explained why decreeing "GNU diff
> > must be installed" company policy is not an option.
> 
> Yes, it's an option. But it's a hassle. I'm investigating other options.

Installing GNU diff requires deploying a precompiled .exe file and two
.dll files into %PATH%.

Implementing -F support in Subversion requires patching configure,
patching the windows build, revving the relevant libsvn_diff APIs,
waiting for the next minor release, and deploying it.

Seems to me that the former is the path of least resistance.

> 
> Other than that, there is something about GNU diff's -F option that I
> don't like: it doesn't trim leading whitespace from the matched
> "function line", unlike the -p option.
> 
> If we implement this ourselves, we can decide to do this (depending on
> community consensus of course). We don't have to do exactly the same
> as GNU diff's -p / -F (just like we also show 50 characters of -p,
> unlike GNU diff which shows 40 characters).
> 
> -- 
> Johan

Re: regexp matching in svn?

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Jan 31, 2012 at 08:17:44AM -0600, Hyrum K Wright wrote:
> We've batted around the idea of allowing regexes in svn:ignore
> properties and authz files, but without proper regex support in
> Subversion we can't.


For authz (see issue #2662) we'd need an efficient way of matching
a regex against _all_ paths within a given revision. Without some
sort of cache (luckily revisions are immutable!) performance would
probably go down the drain.
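
A rough sketch of the caching idea: because a revision never changes, the set of paths matching a given pattern in that revision can be computed once and reused. Everything here is hypothetical (the class, the list_paths callback and the sample tree); it says nothing about how authz is actually implemented, and Python's re merely stands in for whatever regex engine would be used.

```python
import re


class RevisionPathMatcher:
    """Cache regex matches per (revision, pattern) pair."""

    def __init__(self, list_paths):
        # list_paths is a hypothetical callable: revision -> iterable of paths.
        self._list_paths = list_paths
        self._cache = {}

    def matching_paths(self, revision, pattern):
        key = (revision, pattern)
        if key not in self._cache:
            regex = re.compile(pattern)
            # The expensive full scan happens once per key; a revision's
            # path set never changes, so the result stays valid.
            self._cache[key] = frozenset(
                path for path in self._list_paths(revision) if regex.search(path)
            )
        return self._cache[key]


scans = []
tree = {7: ["/trunk/src/a.c", "/trunk/secret/key.pem", "/tags/1.0/a.c"]}


def list_paths(revision):
    scans.append(revision)  # record every full scan
    return tree[revision]


matcher = RevisionPathMatcher(list_paths)
first = matcher.matching_paths(7, r"^/trunk/secret/")
second = matcher.matching_paths(7, r"^/trunk/secret/")
print(sorted(first), len(scans))
```

The second lookup is served from the cache, so the path list for revision 7 is scanned only once. A real cache would also need a size bound, which this sketch leaves out.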

The svn:ignore case is simpler.

Re: regexp matching in svn?

Posted by Hyrum K Wright <hy...@wandisco.com>.
On Tue, Jan 31, 2012 at 7:52 AM, Bert Huijben <be...@qqmail.nl> wrote:
>
>
>> -----Original Message-----
>> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
>> Sent: dinsdag 31 januari 2012 5:08
>> To: Daniel Shahaf
>> Cc: Bert Huijben; dev@subversion.apache.org
>> Subject: Re: regexp matching in svn?
>
>
>> My most immediate problem is solved, because that's about generating
>> nice diffs on the server for post-commit emails. My server runs on
>> Solaris, and we have diffutils 3.2 installed now. Problem solved
>> (except that svnlook doesn't support --diff-cmd, so I have to fall
>> back to 'svn' for this, but anyway ...).
>
> Adding --diff-cmd to svnlook should be easier than adding regex support :)

We've batted around the idea of allowing regexes in svn:ignore
properties and authz files, but without proper regex support in
Subversion we can't.

While I'm not saying we should add such support now, I am pointing out
that there are more use cases than simply those covered by 'svnlook
--diff-cmd'.

-Hyrum




RE: regexp matching in svn?

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
> Sent: dinsdag 31 januari 2012 5:08
> To: Daniel Shahaf
> Cc: Bert Huijben; dev@subversion.apache.org
> Subject: Re: regexp matching in svn?


> My most immediate problem is solved, because that's about generating
> nice diffs on the server for post-commit emails. My server runs on
> Solaris, and we have diffutils 3.2 installed now. Problem solved
> (except that svnlook doesn't support --diff-cmd, so I have to fall
> back to 'svn' for this, but anyway ...).

Adding --diff-cmd to svnlook should be easier than adding regex support :)

	Bert

> 
> --
> Johan
> 
> [1] http://savannah.gnu.org/forum/forum.php?forum_id=6319
> [2] https://sourceforge.net/tracker/?func=detail&aid=3482130&group_id=23617&atid=379176


Re: regexp matching in svn?

Posted by Johan Corveleyn <jc...@gmail.com>.
On Sat, Jan 28, 2012 at 2:45 PM, Johan Corveleyn <jc...@gmail.com> wrote:
> On Sat, Jan 28, 2012 at 10:10 AM, Johan Corveleyn <jc...@gmail.com> wrote:
>
> [ ... ]
>
>> Other than that, there is something about GNU diff's -F option that I
>> don't like: it doesn't trim leading whitespace from the matched
>> "function line", unlike the -p option.
>
> Sorry, I made a mistake here, I misinterpreted some output. There is
> no such stripping of leading whitespace from the "function-line"
> anywhere. It just so happens that -p only matches lines which start
> with non-whitespace.
>
> (stripping leading whitespace in case of -F would be useful though, if
> it were up to me ... gives you more significant context in those 40/50
> characters)

Apparently, stripping leading whitespace from the function-line with
the -F option is implemented in GNU diff as of version 3.0 [1]. So
that's interesting, it makes that particular argument moot.

Only problem is: the most recent version of diffutils which I can find
for Windows is 2.9 (Cygwin has diffutils-2.9, gnuwin32 has
diffutils-2.8.7). I added a request to the forum of gnuwin32 [2] to
produce a newer release of diffutils, but I don't really expect
anything soon.

Anyway, I'm not going to pursue this any further in svn myself, unless
one day some regex capability magically appears.

My most immediate problem is solved, because that's about generating
nice diffs on the server for post-commit emails. My server runs on
Solaris, and we have diffutils 3.2 installed now. Problem solved
(except that svnlook doesn't support --diff-cmd, so I have to fall
back to 'svn' for this, but anyway ...).
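
For reference, a sketch of how such a fallback command line could be assembled in a hook script. The helper name and the repository URL are made up. It relies on the documented 'svn diff' options --diff-cmd and -x, and it assumes that svn splits the -x value on whitespace, which is why the helper rejects expressions containing spaces. It only builds the argument list and does not run anything.

```python
def external_diff_command(repo_url, revision, function_regex, diff_cmd="diff"):
    """Build an argv list for 'svn diff' that delegates to an external diff."""
    # Assumption: the -x value is split on whitespace before being handed
    # to the external diff, so a regex with spaces would be broken apart.
    if any(ch.isspace() for ch in function_regex):
        raise ValueError("regex must not contain whitespace")
    return [
        "svn", "diff",
        "-c", str(revision),
        "--diff-cmd", diff_cmd,
        "-x", "-u -F " + function_regex,
        repo_url,
    ]


# Hypothetical repository location and revision number.
argv = external_diff_command("file:///srv/svn/repo", 1234, "^sub")
print(" ".join(argv))
```

The resulting list can be passed to subprocess.run() as-is, which avoids any shell quoting of the expression.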

-- 
Johan

[1] http://savannah.gnu.org/forum/forum.php?forum_id=6319
[2] https://sourceforge.net/tracker/?func=detail&aid=3482130&group_id=23617&atid=379176

Re: regexp matching in svn?

Posted by Johan Corveleyn <jc...@gmail.com>.
On Sat, Jan 28, 2012 at 10:10 AM, Johan Corveleyn <jc...@gmail.com> wrote:

[ ... ]

> Other than that, there is something about GNU diff's -F option that I
> don't like: it doesn't trim leading whitespace from the matched
> "function line", unlike the -p option.

Sorry, I made a mistake here, I misinterpreted some output. There is
no such stripping of leading whitespace from the "function-line"
anywhere. It just so happens that -p only matches lines which start
with non-whitespace.

(stripping leading whitespace in case of -F would be useful though, if
it were up to me ... gives you more significant context in those 40/50
characters)
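
A small sketch of both points. The GNU diff manual describes -p as equivalent to -F '^[[:alpha:]$_]', so a -p match can never begin with whitespace; the pattern below is an ASCII-only approximation of that in Python's re syntax. The function_label helper and its strip_leading switch are hypothetical and only show what stripping would buy within a fixed width.

```python
import re

# ASCII-only approximation of the pattern GNU diff documents for -p.
C_FUNCTION_LINE = re.compile(r"^[A-Za-z$_]")


def function_label(line, strip_leading=False, max_width=40):
    """Return the part of a matched line that fits in the hunk header."""
    text = line.lstrip() if strip_leading else line
    return text[:max_width]


indented = "        def handle_request(self, request, response):"

print(C_FUNCTION_LINE.search(indented) is None)  # indented lines never match -p
print(repr(function_label(indented)))
print(repr(function_label(indented, strip_leading=True)))
```

Without stripping, the indentation uses up part of the label width; with stripping, all of it goes to the declaration itself.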

-- 
Johan

Re: regexp matching in svn?

Posted by Johan Corveleyn <jc...@gmail.com>.
On Sat, Jan 28, 2012 at 9:54 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> On Fri, Jan 27, 2012, at 21:30, Johan Corveleyn wrote:
>> On Fri, Jan 27, 2012 at 4:44 PM, Bert Huijben <be...@qqmail.nl> wrote:
>> >> -----Original Message-----
>> >> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
>> >> Sent: vrijdag 27 januari 2012 16:31
>> >> To: Subversion Development
>> >> Subject: regexp matching in svn?
>> >>
>> >> Hi,
>> >>
>> >> Is there any existing regexp matching functionality somewhere in the
>> >> svn codebase, in a way that is reusable for new functionality? If not,
>> >> does anyone know of a reusable third-party library that could be
>> >> easily integrated (I know, it would be Yet Another Dependency, but
>> >> I'm just asking for options ...)?
>> >>
>> >> Use-case: I like the '-x-p' (show C-function) option for 'svn diff',
>> >> but this is limited to C-like syntax. GNU diff also has a '-F RE'
>> >> option, where RE is a regular expression that is used for matching the
>> >> "function-line". I know I could make use of this functionality by
>> >> invoking GNU diff as external diff command, but for several reasons it
>> >> would be interesting if svn could do this internally (for one thing:
>> >> GNU diff isn't always available / installed).
>> >>
>> >> And I'm sure there would be other useful things people could do with
>> >> regexp functionality ...
>> >
>> > Apr (or Apr-Util) has regex support, not sure which but we require both.
>>
>> Thanks, but I can't find it. I found apr_strmatch in apr-util, but
>> that doesn't do regex, it just matches a fixed string (which would
>> also be useful (it actually covers my concrete use-case), but it's not
>> -F). Maybe I'm overlooking something.
>>
>> If it's not in apr(-util) or some other dependency we already have,
>> how about PCRE (www.pcre.org) ?
>
> I must have missed the point where you explained why decreeing "GNU diff
> must be installed" company policy is not an option.

Yes, it's an option. But it's a hassle. I'm investigating other options.

Other than that, there is something about GNU diff's -F option that I
don't like: it doesn't trim leading whitespace from the matched
"function line", unlike the -p option.

If we implement this ourselves, we can decide to do this (depending on
community consensus of course). We don't have to do exactly the same
as GNU diff's -p / -F (just like we also show 50 characters of -p,
unlike GNU diff which shows 40 characters).

-- 
Johan

Re: regexp matching in svn?

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
On Fri, Jan 27, 2012, at 21:30, Johan Corveleyn wrote:
> On Fri, Jan 27, 2012 at 4:44 PM, Bert Huijben <be...@qqmail.nl> wrote:
> >> -----Original Message-----
> >> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
> >> Sent: vrijdag 27 januari 2012 16:31
> >> To: Subversion Development
> >> Subject: regexp matching in svn?
> >>
> >> Hi,
> >>
> >> Is there any existing regexp matching functionality somewhere in the
> >> svn codebase, in a way that is reusable for new functionality? If not,
> >> does anyone know of a reusable third-party library that could be
> >> easily integrated (I know, it would be Yet Another Dependency, but
> >> I'm just asking for options ...)?
> >>
> >> Use-case: I like the '-x-p' (show C-function) option for 'svn diff',
> >> but this is limited to C-like syntax. GNU diff also has a '-F RE'
> >> option, where RE is a regular expression that is used for matching the
> >> "function-line". I know I could make use of this functionality by
> >> invoking GNU diff as external diff command, but for several reasons it
> >> would be interesting if svn could do this internally (for one thing:
> >> GNU diff isn't always available / installed).
> >>
> >> And I'm sure there would be other useful things people could do with
> >> regexp functionality ...
> >
> > Apr (or Apr-Util) has regex support, not sure which but we require both.
> 
> Thanks, but I can't find it. I found apr_strmatch in apr-util, but
> that doesn't do regex, it just matches a fixed string (which would
> also be useful (it actually covers my concrete use-case), but it's not
> -F). Maybe I'm overlooking something.
> 
> If it's not in apr(-util) or some other dependency we already have,
> how about PCRE (www.pcre.org) ?

I must have missed the point where you explained why decreeing "GNU diff
must be installed" company policy is not an option.

> 
> -- 
> Johan
> 

Re: regexp matching in svn?

Posted by Johan Corveleyn <jc...@gmail.com>.
On Fri, Jan 27, 2012 at 4:44 PM, Bert Huijben <be...@qqmail.nl> wrote:
>> -----Original Message-----
>> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
>> Sent: vrijdag 27 januari 2012 16:31
>> To: Subversion Development
>> Subject: regexp matching in svn?
>>
>> Hi,
>>
>> Is there any existing regexp matching functionality somewhere in the
>> svn codebase, in a way that is reusable for new functionality? If not,
>> does anyone know of a reusable third-party library that could be
>> easily integrated (I know, it would be Yet Another Dependency, but
>> I'm just asking for options ...)?
>>
>> Use-case: I like the '-x-p' (show C-function) option for 'svn diff',
>> but this is limited to C-like syntax. GNU diff also has a '-F RE'
>> option, where RE is a regular expression that is used for matching the
>> "function-line". I know I could make use of this functionality by
>> invoking GNU diff as external diff command, but for several reasons it
>> would be interesting if svn could do this internally (for one thing:
>> GNU diff isn't always available / installed).
>>
>> And I'm sure there would be other useful things people could do with
>> regexp functionality ...
>
> Apr (or Apr-Util) has regex support, not sure which but we require both.

Thanks, but I can't find it. I found apr_strmatch in apr-util, but
that doesn't do regex, it just matches a fixed string (which would
also be useful (it actually covers my concrete use-case), but it's not
-F). Maybe I'm overlooking something.
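
To illustrate the difference, a minimal sketch contrasting a fixed-string test with a regex test. It is written in Python, so it shows the concept only and does not use the apr_strmatch API; the helper names and sample lines are made up.

```python
import re


def matches_fixed(line, needle):
    """Fixed-string test: true if needle occurs anywhere in line."""
    return needle in line


def matches_regex(line, pattern):
    """Regex test: true if pattern matches somewhere in line."""
    return re.search(pattern, line) is not None


lines = ["sub parse_args {", "    # call sub parse_args later", "package Foo;"]

print([matches_fixed(l, "sub ") for l in lines])
print([matches_regex(l, r"^(sub|package) ") for l in lines])
```

The fixed-string test also accepts the comment line and cannot cover "package" at the same time, whereas the anchored alternation picks out exactly the two declaration lines.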

If it's not in apr(-util) or some other dependency we already have,
how about PCRE (www.pcre.org) ?

-- 
Johan

RE: regexp matching in svn?

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
> Sent: vrijdag 27 januari 2012 16:31
> To: Subversion Development
> Subject: regexp matching in svn?
> 
> Hi,
> 
> Is there any existing regexp matching functionality somewhere in the
> svn codebase, in a way that is reusable for new functionality? If not,
> does anyone know of a reusable third-party library that could be
> easily integrated (I know, it would be Yet Another Dependency, but
> I'm just asking for options ...)?
> 
> Use-case: I like the '-x-p' (show C-function) option for 'svn diff',
> but this is limited to C-like syntax. GNU diff also has a '-F RE'
> option, where RE is a regular expression that is used for matching the
> "function-line". I know I could make use of this functionality by
> invoking GNU diff as external diff command, but for several reasons it
> would be interesting if svn could do this internally (for one thing:
> GNU diff isn't always available / installed).
> 
> And I'm sure there would be other useful things people could do with
> regexp functionality ...

Apr (or Apr-Util) has regex support, not sure which but we require both.

	Bert