Posted to dev@subversion.apache.org by kf...@collab.net on 2005/10/10 21:03:17 UTC

Re: CONTRIB: svnlogfilter, alternative to search-svnlog.pl

David Marshall <dm...@gmail.com> writes:
> On 02 May 2005 22:32:59 -0500, kfogel@collab.net <kf...@collab.net> wrote:
> > David Marshall <dm...@gmail.com> writes:
> > > search-svnlog.pl is fine, but I needed just a little bit more.   I
> > > grant permission for changing the copyright notice from me to whatever
> > > is customary, CollabNet or whatever.
> > >
> > > Thank you for Subversion!  I'm glad to be able to contribute, albeit
> > > in an extremely marginal way.
> > 
> > Thanks!  This looks better than search-svnlog.pl.
> > 
> > I noticed that you're just looking for the standard line of hyphens,
> > and not using the "| NNN lines" portion of the log message header.  In
> > practice this is probably fine (who would put a line of 72 hyphens in
> > their log message?) but to be absolutely secure you might want to
> > check the line count.
> > 
> > In Subversion 2.0, I'm +1 on getting rid of the line count and just
> > disallowing a line of 72 hyphens in the log message, by the way :-).
> > 
> > Instead of having separate flags to indicate negation, why not just
> > long forms '--not-regexp' and '--not-user'?  It seems counterintuitive
> > that the main functionality is available only via longopts, yet the
> > "special cases" are achieved via shortopts.
> > 
> > If this completely subsumes the functionality of search-svnlog.pl,
> > which it looks like it does, I'd be happy to have it just replace
> > search-svnlog.pl in the long run.  Would you be okay with that name,
> > or do you prefer "svnlogfilter"?
> > 
> 
> Below is the latest revision of svnlogfilter, which incorporates
> Karl's suggestion above to use --not-user and --not-regex.  Rather
> than count lines, however, I figured that the vast majority of log
> comments won't have any lines of exactly 72 hyphens, a circumstance
> that would confuse my script.  At the same time, I have included a
> --paranoid switch that will do some extra work making sure that the
> log entries have been split properly.
> 
> It's fine with me if calling it search-svnlog.pl is preferred.  Feel
> free to make any changes in name, copyright, or anything else that
> will make it fit in better.  It's very much my pleasure to give a
> little something back.

David, I was going through my mailbox looking for undone stuff, and
found this.  I'm still happy to replace search-svnlog.pl, but there is
one problem: it turns out that lines of 72 hyphens in log messages do
exist in the wild.  I know this from experience now, because I
encountered *multiple* such log messages in Subversion's own
repository during a log message sweep.

So I think the parsing of the "N lines" portion of the header should
be considered mandatory after all.  Would you be willing to implement
that?
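
For illustration, here is a minimal Perl sketch of what honoring the "N lines" count could look like.  It is not code from svnlogfilter or search-svnlog.pl: parse_svn_log is a hypothetical helper, and it assumes plain "svn log" output (no -v, so no "Changed paths" block) where each header is followed by exactly one blank line.  The sample data is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split "svn log" text into entries by trusting the
# "| N line(s)" count in each header rather than the separator lines.
sub parse_svn_log {
    my ($text) = @_;
    my @lines = split /\n/, $text;
    my @entries;
    my $i = 0;
    while ($i < @lines) {
        # In list context a failed match returns an empty list.
        my ($rev, $author, $count) =
            $lines[$i] =~ /\Ar(\d+) \| (.*?) \| .* \| (\d+) lines?\z/;
        unless (defined $rev) {
            $i++;               # separator or stray line: skip it
            next;
        }
        $i += 2;                # skip the header and the blank line after it
        # Take exactly $count message lines, whatever they contain.
        my @body = grep { defined } @lines[$i .. $i + $count - 1];
        push @entries, {
            rev     => $rev,
            author  => $author,
            lines   => $count,
            message => join("\n", @body),
        };
        $i += $count;
    }
    return @entries;
}

my $sep = '-' x 72;

# Made-up sample: the r2 message contains a hyphen line AND a fake header.
my $sample = join "\n",
    $sep,
    'r2 | alice | 2005-10-10 16:03:17 -0500 (Mon, 10 Oct 2005) | 3 lines',
    '',
    'Tidy the docs.',
    $sep,
    'r99 | mallory | 2005-01-01 00:00:00 -0500 (Sat, 01 Jan 2005) | 1 line',
    $sep,
    'r1 | bob | 2005-10-09 12:00:00 -0500 (Sun, 09 Oct 2005) | 1 line',
    '',
    'Initial import.',
    $sep,
    '';

my @entries = parse_svn_log($sample);
printf "r%s by %s (%d message lines)\n", @{$_}{qw(rev author lines)}
    for @entries;
```

Because the message body is consumed by count, neither the embedded hyphen line nor the header-like line inside the r2 message starts a new entry.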

-Karl


> Index: contrib/client-side/svnlogfilter
> ===================================================================
> --- contrib/client-side/svnlogfilter    (revision 0)
> +++ contrib/client-side/svnlogfilter    (revision 0)
> @@ -0,0 +1,167 @@
> +#!/usr/bin/perl
> +use warnings;
> +use strict;
> +
> +# ====================================================================
> +# Show log messages matching certain patterns.  Usage:
> +#
> +#    svnlogfilter [--user USER] [--not-user USER] [--regex REGEX]
> +#        [--not-regex REGEX] [--paranoid]
> +#
> +# See pod for details.
> +#
> +# ====================================================================
> +# Copyright (c) 2005 David Marshall <ma...@chezmarshall.com>.
> +# All rights reserved.
> +#
> +# This software is licensed as described in the file COPYING, which
> +# you should have received as part of this distribution.  The terms
> +# are also available at http://subversion.tigris.org/license-1.html.
> +# If newer versions of this license are posted there, you may use a
> +# newer version instead, at your option.
> +#
> +# This software consists of voluntary contributions made by many
> +# individuals.  For exact contribution history, see the revision
> +# history and logs, available at http://subversion.tigris.org/.
> +# ====================================================================
> +
> +use Getopt::Long;
> +use Pod::Usage;
> +
> +my ($user, $not_user, $regex, $not_regex, $help, $man, $paranoid);
> +
> +GetOptions(
> +       'user=s' => \$user,
> +       'not-user=s' => \$not_user,
> +       'regex=s' => \$regex,
> +       'not-regex=s' => \$not_regex,
> +        'paranoid' => \$paranoid,
> +       'help|?' => \$help,
> +       'man' => \$man,
> +) or pod2usage(2);
> +
> +pod2usage(1) if $help;
> +pod2usage(-exitstatus => 0, -verbose => 2) if $man;
> +
> +pod2usage("$0: either user, not-user, regex, or not-regex required\n")
> +    unless $user || $not_user || $regex || $not_regex;
> +
> +if ($regex) {
> +       eval {
> +               $regex = qr/$regex/;
> +       };
> +
> +       pod2usage("$0: regex error: $@\n") if $@;
> +}
> +
> +if ($not_regex) {
> +       eval {
> +               $not_regex = qr/$not_regex/;
> +       };
> +
> +       pod2usage("$0: regex error: $@\n") if $@;
> +}
> +
> +pod2usage("$0: bogus user name: $user\n") if $user && $user =~ /\W/;
> +pod2usage("$0: bogus user name: $not_user\n") if $not_user && $not_user =~ /\W/;
> +
> +local $/ = '-' x 72 . "\n";
> +local $\ = $/;
> +local $, = $/;
> +local $" = $/;
> +
> +chomp (my @log = <>);
> +
> +# PARANOIA!!!  If we are feeling paranoid, make sure that each entry
> +# looks right.  That is, make a basic check to see whether a line of
> +# exactly 72 hyphens *really* is a separator between two log entries
> +# and not just pretending to be one.  We do not want to be sucked into
> +# its vortex of lies.
> +if ($paranoid) {
> +        for (my $i = $#log; $i > 0; $i--) {
> +                next if $log[$i] =~ /\Ar\d+.*\d lines?/; # probably a real entry
> +                $log[$i - 1] = "@log[$i-1,$i]"; # joined by $"
> +                splice @log, $i, 1;
> +        }
> +}
> +
> +if ($user) {
> +       my $user_regex = qr/^r\d+\s+\|\s+$user/;
> +       @log = grep m/$user_regex/, @log;
> +}
> +
> +if ($not_user) {
> +        my $not_user_regex = qr/^r\d+\s+\|\s+$not_user/;
> +        @log = grep !/$not_user_regex/, @log;
> +}
> +
> +if ($regex) {
> +       @log = grep m/$regex/, @log;
> +}
> +
> +if ($not_regex) {
> +        @log = grep !/$not_regex/, @log;
> +}
> +
> +print ('', @log);
> +__END__
> +
> +=head1 NAME
> +
> +svnlogfilter - filter Subversion log output
> +
> +=head1 SYNOPSIS
> +
> +svnlogfilter [options] [file...]
> +svn log ... | svnlogfilter [options]
> +
> + Options:
> +  --user       include only changes made by the named user
> +  --not-user    exclude changes made by the named user
> +  --regex      include only changes that match this regex
> +  --not-regex   exclude changes that match this regex
> +  --paranoid    look out for 72-hyphen lines in log entries
> +
> +  --help       brief help message
> +  --man                full documentation
> +
> +=head1 OPTIONS
> +
> +=over 8
> +
> +=item B<--user>
> +
> +Includes only log entries describing changes by this user.
> +
> +=item B<--not-user>
> +
> +Excludes log entries describing changes by this user.
> +
> +=item B<--regex>
> +
> +Includes only log entries that match this regex.
> +
> +=item B<--not-regex>
> +
> +Excludes log entries that match this regex.
> +
> +=item B<--paranoid>
> +
> +Tries to verify that a log entry is really a log entry and not just an
> +artifact of splitting on lines that are 72 hyphens.
> +
> +=item B<--help>
> +
> +Prints a brief help message and exits.
> +
> +=item B<--man>
> +
> +Prints the manual page and exits.
> +
> +=back
> +
> +=head1 DESCRIPTION
> +
> +This script filters a list of Subversion log messages to find those committed by a particular user, those that match a particular regular expression, or both.  It is also possible to filter out users and regexes.
> +
> +=cut
> 
> Property changes on: contrib/client-side/svnlogfilter
> ___________________________________________________________________
> Name: svn:executable
>    + *
> Name: svn:eol-style
>    + native
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
> 


Re: CONTRIB: svnlogfilter, alternative to search-svnlog.pl

Posted by David Marshall <dm...@gmail.com>.
This was already implemented as the --paranoid option, which I don't
have a problem with making mandatory.  Here's a fresh copy with the
paranoid check turned on permanently.  If the first line of an entry
doesn't start with r\d+ and end with \d lines, then the record
separator was really part of an entry.  In that event, the script
appends to the previous log entry and proceeds.
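
The merge-back step described above can be tried on in-memory data.  The following is only a sketch in the spirit of the patch, not the patch itself: the sample log is made up, the records are read from an in-memory filehandle instead of <>, and the header check also accepts the singular "1 line" that svn log prints for one-line messages.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sep = ('-' x 72) . "\n";

# Made-up "svn log" output; the r2 message itself contains a hyphen line.
my $text = join '',
    $sep,
    "r2 | alice | 2005-10-10 16:03:17 -0500 (Mon, 10 Oct 2005) | 3 lines\n",
    "\n",
    "Before the rule.\n",
    $sep,
    "After the rule.\n",
    $sep,
    "r1 | bob | 2005-10-09 12:00:00 -0500 (Sun, 09 Oct 2005) | 1 line\n",
    "\n",
    "Initial import.\n",
    $sep;

# Read records terminated by the hyphen line, then strip that terminator.
my @log = do {
    local $/ = $sep;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my @records = <$fh>;
    chomp @records;             # chomp removes the current $/, i.e. $sep
    @records;
};

# Walk backwards: a fragment that does not open with a header line is
# glued back onto the fragment before it, restoring the hyphen line.
for (my $i = $#log; $i > 0; $i--) {
    next if $log[$i] =~ /\Ar\d+.*\d lines?/;    # probably a real entry
    $log[$i - 1] = join $sep, @log[$i - 1, $i];
    splice @log, $i, 1;
}

# The text before the very first separator is an empty record; drop it.
shift @log if @log && $log[0] eq '';

print scalar(@log), " entries\n";
print $sep, join($sep, @log), $sep;
```

Walking from the end of the array means a splice never shifts an index that has yet to be visited.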

I really don't like the notion of creating a state machine to do such
trivial parsing.  I think it's reasonable to do a simple check to make
sure that a line of 72 hyphens is a real separator.  If someone
constructs a comment in an effort to confuse the filter beyond that,
he's going to succeed.
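
To make that limit concrete, here is a small made-up example (not taken from any real repository) of fragments that would pass a header-shape check even though one of them is really the tail of a log message.  The cross-check against the declared line count is shown only for comparison, and it assumes plain "svn log" output with a single blank line after each header.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fragments as they would look after splitting on the hyphen line.  The
# "r99" fragment is really part of r2's message but imitates a header.
my @log = (
    "r2 | alice | 2005-10-10 16:03:17 -0500 (Mon, 10 Oct 2005) | 3 lines\n\nSee below.\n",
    "r99 | mallory | 2005-01-01 00:00:00 -0500 (Sat, 01 Jan 2005) | 1 line\n",
    "r1 | bob | 2005-10-09 12:00:00 -0500 (Sun, 09 Oct 2005) | 1 line\n\nInitial import.\n",
);

# Header-shape check only: every fragment passes, including the fake one.
my @looks_real = grep { /\Ar\d+.*\d lines?$/m } @log;
print scalar(@looks_real), " fragments look like real entries\n";

# Comparing the declared count with the lines present exposes the problem.
my @suspect;
for my $entry (@looks_real) {
    my ($rev, $declared) = $entry =~ /\Ar(\d+).*\| (\d+) lines?$/m;
    my @lines  = split /\n/, $entry;    # header, blank line, message lines
    my $actual = @lines > 2 ? @lines - 2 : 0;
    push @suspect, $rev if $actual != $declared;
    printf "r%d: header says %d, fragment has %d%s\n",
        $rev, $declared, $actual,
        ($actual == $declared ? '' : '  <-- mismatch');
}
```

Here r2 and the fake r99 are both flagged, because the split cut r2's three-line message short and left the impostor with no body at all.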

I don't think there are any sets of comments that would fool this
script that would not cause search-svnlog.pl to die outright.


Index: contrib/client-side/svnlogfilter
===================================================================
--- contrib/client-side/svnlogfilter    (revision 0)
+++ contrib/client-side/svnlogfilter    (revision 0)
@@ -0,0 +1,157 @@
+#!/usr/bin/perl
+use warnings;
+use strict;
+
+# ====================================================================
+# Show log messages matching certain patterns.  Usage:
+#
+#    svnlogfilter [--user USER] [--not-user USER] [--regex REGEX]
+#        [--not-regex REGEX]
+#
+# See pod for details.
+#
+# ====================================================================
+# Copyright (c) 2005 David Marshall <ma...@chezmarshall.com>.
+# All rights reserved.
+#
+# This software is licensed as described in the file COPYING, which
+# you should have received as part of this distribution.  The terms
+# are also available at http://subversion.tigris.org/license-1.html.
+# If newer versions of this license are posted there, you may use a
+# newer version instead, at your option.
+#
+# This software consists of voluntary contributions made by many
+# individuals.  For exact contribution history, see the revision
+# history and logs, available at http://subversion.tigris.org/.
+# ====================================================================
+
+use Getopt::Long;
+use Pod::Usage;
+
+my ($user, $not_user, $regex, $not_regex, $help, $man);
+
+GetOptions(
+       'user=s' => \$user,
+       'not-user=s' => \$not_user,
+       'regex=s' => \$regex,
+       'not-regex=s' => \$not_regex,
+       'help|?' => \$help,
+       'man' => \$man,
+) or pod2usage(2);
+
+pod2usage(1) if $help;
+pod2usage(-exitstatus => 0, -verbose => 2) if $man;
+
+pod2usage("$0: either user, not-user, regex, or not-regex required\n")
+    unless $user || $not_user || $regex || $not_regex;
+
+if ($regex) {
+       eval {
+               $regex = qr/$regex/;
+       };
+
+       pod2usage("$0: regex error: $@\n") if $@;
+}
+
+if ($not_regex) {
+       eval {
+               $not_regex = qr/$not_regex/;
+       };
+
+       pod2usage("$0: regex error: $@\n") if $@;
+}
+
+pod2usage("$0: bogus user name: $user\n") if $user && $user =~ /\W/;
+pod2usage("$0: bogus user name: $not_user\n") if $not_user && $not_user =~ /\W/;
+
+local $/ = '-' x 72 . "\n";
+local $\ = $/;
+local $, = $/;
+local $" = $/;
+
+chomp (my @log = <>);
+
+# Make sure that each entry
+# looks right.  That is, make a basic check to see whether a line of
+# exactly 72 hyphens *really* is a separator between two log entries
+# and not just pretending to be one.  We do not want to be sucked into
+# its vortex of lies.
+for (my $i = $#log; $i > 0; $i--) {
+    next if $log[$i] =~ /\Ar\d+.*\d lines?/; # probably a real entry
+    $log[$i - 1] = "@log[$i-1,$i]"; # joined by $"
+    splice @log, $i, 1;
+}
+
+if ($user) {
+       my $user_regex = qr/^r\d+\s+\|\s+$user/;
+       @log = grep m/$user_regex/, @log;
+}
+
+if ($not_user) {
+        my $not_user_regex = qr/^r\d+\s+\|\s+$not_user/;
+        @log = grep !/$not_user_regex/, @log;
+}
+
+if ($regex) {
+       @log = grep m/$regex/, @log;
+}
+
+if ($not_regex) {
+        @log = grep !/$not_regex/, @log;
+}
+
+print ('', @log);
+__END__
+
+=head1 NAME
+
+svnlogfilter - filter Subversion log output
+
+=head1 SYNOPSIS
+
+svnlogfilter [options] [file...]
+svn log ... | svnlogfilter [options]
+
+ Options:
+  --user       include only changes made by the named user
+  --not-user    exclude changes made by the named user
+  --regex      include only changes that match this regex
+  --not-regex   exclude changes that match this regex
+  --help       brief help message
+  --man                full documentation
+
+=head1 OPTIONS
+
+=over 8
+
+=item B<--user>
+
+Includes only log entries describing changes by this user.
+
+=item B<--not-user>
+
+Excludes log entries describing changes by this user.
+
+=item B<--regex>
+
+Includes only log entries that match this regex.
+
+=item B<--not-regex>
+
+Excludes log entries that match this regex.
+
+=item B<--help>
+
+Prints a brief help message and exits.
+
+=item B<--man>
+
+Prints the manual page and exits.
+
+=back
+
+=head1 DESCRIPTION
+
+This script filters a list of Subversion log messages to find those committed by a particular user, those that match a particular regular expression, or both.  It is also possible to filter out users and regexes.
+
+=cut

