You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Nathan Hartman <ha...@gmail.com> on 2020/09/11 14:38:24 UTC

[PATCH] check-mime-type.pl

The attached two patches for contrib/hook-scripts/check-mime-type.pl
came via a GitHub PR [1] a few days ago...

They can be applied in sequence:
svn-patch-01-c73cb4415d.txt
svn-patch-02-68cc555740.txt

In principle I think the changes are useful, though might change
behavior in a surprising way for those who are using the hook as it
stands now.

Would it make sense to copy check-mime-type.pl, calling it, say,
check-mime-type-2.pl, and apply the changes to that file, keeping the
original as it is now?

[1] https://github.com/apache/subversion/pull/22

Cheers,
Nathan

Re: [PATCH] check-mime-type.pl

Posted by Nathan Hartman <ha...@gmail.com>.
On Sun, Jan 17, 2021 at 10:10 AM M. Buecher
<ma...@maddes.net> wrote:
>
> Dear all,
>
> is anyone able to re-review and commit.
>
> Regards
> Maddes

For convenience I'm attaching the patch from Maddes's Pull Request
(it's PR #22) at GitHub. It applies cleanly to trunk (as of r1885600),
however I haven't had a chance to review it yet. I hope someone more
comfortable than me in Perl can review it soon.

Cheers,
Nathan

Re: [PATCH] check-mime-type.pl

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
M. Buecher wrote on Sun, 17 Jan 2021 15:10 +00:00:
> is anyone able to re-review and commit.

Not me, sorry.

Re: [PATCH] check-mime-type.pl

Posted by "M. Buecher" <ma...@maddes.net>.
Dear all,

is anyone able to re-review and commit.

Regards
Maddes


On 2020-12-21 16:22, M. Buecher wrote:
> Hello Daniel,
> Dear all,
> 
> thanks for your detailed review, Daniel. Finally got some more leisure
> time to care about it.
> I deceided to concentrate on the first part to handle
> property-modified files plus your overall comments.
> Left the Windows support aside as this has to be fine-tuned a lot, as
> I learned a lot from your comments, thanks.
> 
> 
> On 2020-09-12 05:14, Daniel Shahaf wrote:
>> Nathan Hartman wrote on Fri, Sep 11, 2020 at 10:38:24 -0400:
>>> Subject: [PATCH] Check also property-modified files in hook script
>>>  check-mime-type.pl
>>> 
>>> * contrib/hook-scripts/check-mime-type.pl:
>>>   (proplist) add property-modified files for checks
>> 
>> There are several changes not mentioned in the log message:
>> 
>> - Script name fix in comments
>> 
>> - Move 'use' before 'BEGIN'
>>   (incidentally, I'm not actually sure that's a correct change, given
>>   the BEGIN block implies perls as old as 5.4 are supported.)
>> 
>> - Add docstrings
>> 
>>> +++ b/contrib/hook-scripts/check-mime-type.pl
>>> @@ -1,18 +1,19 @@
>>>  #!/usr/bin/env perl
>>> 
>>>  # 
>>> ====================================================================
>>> -# commit-mime-type-check.pl: check that every added file has the
>>> -# svn:mime-type property set and every added file with a mime-type
>>> -# matching text/* also has svn:eol-style set.
>>> +# check-mime-type.pl: check that every added or property-modified 
>>> file
>>> +# has the svn:mime-type property set and every added or 
>>> property-modified
>>> +# file with a mime-type matching text/* also has svn:eol-style set.
>> 
>> I don't think it's acceptable for a script called "check-mime-type.pl"
>> to start enforcing svn:eol-style as well.  This violates the 
>> principles
>> of least surprise and of "Do one thing and do it well".  Also, the
>> choice of text/* is a bit arbitrary, and would seem to have both false
>> positives and false negatives: e.g., application/xml and text/x-diff.
>> 
>> I'm not even sure it's acceptable for this script to start enforcing
>> svn:mime-type's presence upon propmods, either.  It's not exactly
>> a natural extension that existing users would expect.
> 
> First I am not the original author of that hook script, just using it
> with those modifications with Subversion 1.6 up to 1.14.0 on
> Debian/Ubuntu and Windows.
> Some people (like me) want to enforce "svn:mime-type" for some
> repositories and using that script is optional to admins.
> I agree with the original author to check mime-type plus EoL in a
> single script, as EoL setting can be important to text files.
> Enforcing a policy on creation but not on changes is inconsistent,
> that's the reason for this fix.
> 
> But I do understand your worries and made the functionalities
> optional, setting the default exactly as it acts right now.
> Completely removing the EoL check would change its fucntionality after 
> 16 years.
> New version of the patch, log and script itself attached. Will change
> the Pull Request according to upcoming comments.
> 
> Additionally it could be renamed, e.g. to 
> "check-mime-type-and-text-eol.pl"?
> 
> 
>>>  # 
>>> ====================================================================
>>>  # Copyright (c) 2000-2004 CollabNet.  All rights reserved.
>>> +# Copyright (c) 2014-2020 Apache Software Foundation (ASF).
>> 
>> How did you come up with these numbers?  The script received commits 
>> in
>> 2013 and 2015, so the end point should be 2013.  (Also, the next line
>> talks of a "COPYING" file which doesn't exist, but that's a 
>> preëxisting
>> problem.)
> 
> Those lines are inherited from commit-access-control.pl, when it was
> copied as a base for check-mime-type.pl.
> Copyright lines and license were not updated by CollabNet or ASF
> despite new commits since 2004.
> I used 2014 as a start as this was the time this patch was posted on
> the ASF mailing list.
> 
> Maybe coyprigght and license should be adapted to the current version
> of commit-access-control.pl either with or without copyright lines.
> According to the overall Subversion history it may should be changed 
> to:
> # Copyright (c) 2000-2009 CollabNet.  All rights reserved.
> # Copyright (c) 2010-2020 Apache Software Foundation (ASF).
> 
> 
> 
>>> @@ -100,19 +101,20 @@ my $tmp_dir = '/tmp';
>>> -# Figure out what files have added using svnlook.
>>> -my @files_added;
>>> +# Figure out what files have been added/property-modified using 
>>> svnlook.
>>> +my @paths_to_check;
>>> +my $props_changed_re = qr/^(?:A |[U_ ]U)  (.*[^\/])$/;
>>>  foreach my $line (&read_from_process($svnlook, 'changed', $repos, 
>>> '-t', $txn))
>>>    {
>>> -		# Add only files that were added to @files_added
>>> -    if ($line =~ /^A.  (.*[^\/])$/)
>>> +    # Add only files that were added/property-modified to 
>>> @paths_to_check
>>> +    if ($line =~ /$props_changed_re/)
>> 
>> Why did you change /^A./ to /^A /?
> 
> The RegEx was taken from Leo Davis' mail, but I agree it should be
> /^A./ for added files and also /^.U/ for property-modified files.
> 
> 
>>> Subject: [PATCH] Enhanced platform support of hook script 
>>> check-mime-type.pl
>>> 
>>> * contrib/hook-scripts/check-mime-type.pl:
>>>   (svnlook) Cross Platform: sane executable determination
>> 
>>> +++ b/contrib/hook-scripts/check-mime-type.pl
>>> @@ -28,6 +28,15 @@
>>> 
>>>  use strict;
>>>  use Carp;
>>> +use Config;
>>> +use Cwd;
>> 
>> «use Cwd qw(getcwd);» would be better.
>> 
>>> +use Env;
>> 
>> Why?
>> 
>> This syntax (according to «perldoc Env») "ties all existing 
>> environment
>> variables ("keys %ENV") to scalars".  This seems like a dangerous 
>> thing
>> to do; in fact, it negates the «use strict» a few lines above.
>> 
>> I suspect this line should be deleted.
>> 
>>> +use File::Which  qw(which);
>>> +use File::Temp  qw(tempdir tempfile);
>>> +
>>> +my $old_dir;
>>> +my $os_windows = $^O eq 'MSWin32';
>>> +my $os_dos = $^O eq 'dos';
>>> 
>>>  # Turn on warnings the best way depending on the Perl version.
>>>  BEGIN {
>>> @@ -35,6 +44,12 @@ BEGIN {
>>>      { require warnings; import warnings; }
>>>    else
>>>      { $^W = 1; }
>>> +
>>> +  $old_dir = getcwd;
>>> +}
>>> +
>>> +END {
>>> +  chdir($old_dir);
>>>  }
>> 
>> Why is this necessary?
> 
> On Windows a sub script called by the precommit hook can change the
> current directory whcih also affects the precommit hook, which had led
> to weird results for the following sub scripts by the precommit hook.
> This should have been explained in a comment.
> 
> 
>>> @@ -42,31 +57,58 @@ BEGIN {
>>>  # Configuration section.
>>> 
>>>  # Svnlook path.
>>> -my $svnlook = "/usr/bin/svnlook";
>>> +my $svnlook;
>>> 
>>>  # Since the path to svnlook depends upon the local installation
>>>  # preferences, check that the required program exists to insure that
>>>  # the administrator has set up the script properly.
>>>  {
>>> -  my $ok = 1;
>>> -  foreach my $program ($svnlook)
>>> +  my $svnlook_exe = 'svnlook' . $Config{_exe};
>>> +  my @svnlook_paths;
>>> +
>>> +  if ($os_windows || $os_dos)
>>> +    {
>>> +      # On Windows/DOS the environment is available
>>> +      my $svn_bindir = $ENV{'SVN_BINDIR'};
>> 
>> Shouldn't the condition be «if (defined $ENV{SVN_BINDIR})»?
>> 
>>> +      if (defined $svn_bindir and length $svn_bindir)
>>> +        {
>>> +          $svnlook_paths[++$#svnlook_paths] = 
>>> "$svn_bindir/$svnlook_exe";
>> 
>> Why does this line use «$arrayvar[++$#arrayvar] = $foo» rather than
>> «push @arrayvar, $foo», which is equivalent and idiomatic?
> 
> Just because I'm not a Perl expert.
> That's why I postponed a patch for Windows support.
> 
> 
>>> +        }
>>> +
>>> +      $svnlook = which($svnlook_exe);
>>> +      if (defined $svnlook and length $svnlook)
>>> +        {
>>> +          $svnlook_paths[++$#svnlook_paths] = $svnlook;
>>> +        }
>> 
>> Why do we bother to have $svnlook tested for existence and
>> executability?  Don't we trust File::Which::which() to work as
>> advertised?
>> 
>>> +      undef $svnlook;
>>> +    }
>>> +
>>> +  # On Unix the environment is empty, define fallback with absolute 
>>> path
>>> +  $svnlook_paths[++$#svnlook_paths] = "/usr/bin/$svnlook_exe";
>>> +  foreach my $program (@svnlook_paths)
>>>      {
>>>        if (-e $program)
>>>          {
>>>            unless (-x $program)
>>>              {
>>> -              warn "$0: required program `$program' is not 
>>> executable, ",
>>> -                   "edit $0.\n";
>>> -              $ok = 0;
>>> +              warn "$0: program `$program' is not executable.\n";
>>> +              next;
>>>              }
>>> +          $svnlook = $program;
>>> +          last;
>>>          }
>>>        else
>>>          {
>>> -          warn "$0: required program `$program' does not exist, edit 
>>> $0.\n";
>>> -          $ok = 0;
>>> +          warn "$0: program `$program' does not exist.\n";
>>> +          next;
>>>          }
>>>      }
>> 
>> What's the point of issuing a warning when there are some more
>> iterations yet to go?
>> 
>> If I'm reading this correctly, this entire block should be reduced to:
>> 
>>     # Use «||» rather than «//» since the BEGIN block implies versions
>> of Perl that lack «//» are supported.
>>     my $svnlook = +(defined($ENV{SVN_BINDIR}) ?
>> "$ENV{SVN_BINDIR}/$svnlook_exe" : undef) || which("svnlook") ||
>> "/usr/bin/svnlook";
>>     die … unless -x $svnlook and (-f _ or -l _);
>> 
>> I changed -e to (-f or -l) since -x implies -e.
>> 
>> I also disabled the fallbacks when the envvar is set but doesn't point
>> to a usable executable.  I'm not married to these semantics, but I 
>> think
>> these semantics may be expected _given the particular variable name_:
>> SVN_BINDIR is the name of the configure AC_SUBST() variable that's
>> cooked into various scripts installed by install-tools.
>> 
>>> -  exit 1 unless $ok;
>>> +  unless (defined $svnlook and length $svnlook)
>>> +    {
>>> +      warn "$0: required program `$svnlook_exe' does not exist or is 
>>> not executable, maybe have to edit $0.\n";
>>> +      exit 1;
>>> +    }
>>>  }
>>> 
>>>  
>>> ######################################################################
>>> @@ -95,9 +137,9 @@ sub ACCESS_READ_WRITE () { 'read-write' }
>>>  
>>> ######################################################################
>>>  # Harvest data using svnlook.
>>> 
>>> -# Change into /tmp so that svnlook diff can create its .svnlook
>>> +# Change into temp dir so that svnlook diff can create its .svnlook
>> 
>> When this script was first added, svnlook would create
>> a $TMPDIR/svnlook.$unique_integer directory (without a leading dot, 
>> and
>> in $TMPDIR rather than in cwd).  svnlook doesn't do that nowadays, so
>> I see no reason to chdir at all — …
>> 
>>>  # directory.
>>> -my $tmp_dir = '/tmp';
>>> +my $tmp_dir = tempdir( TMPDIR => 1, CLEANUP => 1 );
>> 
>> … in which case, this line — though otherwise correct — should simply 
>> be
>> deleted altogether, and the File::Temp and Cwd uses with it.
>> 
>>> @@ -220,19 +262,66 @@ sub safe_read_from_pipe
>>> +      while ($command = shift)
>>> +        {
>>> +          if ("-m" eq $command)
>> 
>> - This is a general-purpose "run a child process" function, so it 
>> can't
>>   special-case "-m".
>> 
>> - If it could, it would have to check for all four variants: «-m foo»,
>>   «-mfoo», «--message=foo», and «--message foo».
>> 
>> - It would also have to look for the «--» end-of-options guard, since
>>   «-m» may occur after one.  (This can actually happen in the second
>>   callsite of read_from_process().)
>> 
>> - None of the callers pass -m in the first place, so this is dead 
>> code.
>> 
>>> +            {
>>> +              $comment = shift;
>>> +              my ($handle, $tmpfile) = tempfile( DIR => $tmp_dir);
>>> +              print $handle $comment;
>>> +              close($handle);
>>> +
>>> +              push(@commandline, "--file");
>>> +              push(@commandline, $tmpfile);
>> 
>> There may have been a "--" end-of-options guard earlier on the command
>> line.
>> 
>>> +            }
>>> +          else
>>> +            {
>>> +              # Munge the command to protect it from the command 
>>> line
>>> +              $command =~ s/\"/\\\"/g;
>>> +              if ($command =~ m"\s") { $command = "\"$command\""; }
>>> +              if ($command eq "") { $command = "\"\""; }
>> 
>> I'm pretty sure this escaping is incomplete.  Use a library function
>> rather than roll your own escaping.
>> 
>>> +              if ($command =~ m"\n")
>>> +                {
>>> +                  warn "$0: carriage return detected in command - 
>>> may not work\n";
>> 
>> If it "may not work", then die() rather than warn().
>> 
>>> +                }
>>> +              push(@commandline, $command);
>> ⋮
>>> +      open(SAFE_READ, "@commandline |")
>> 
>>> They can be applied in sequence:
>>> svn-patch-01-c73cb4415d.txt
>>> svn-patch-02-68cc555740.txt
>>> 
>>> In principle I think the changes are useful, though might change
>>> behavior in a surprising way for those who are using the hook as it
>>> stands now.
>>> 
>> 
>> Usefulness aside, I think the first patch is unacceptable because of
>> backwards compatibility considerations and the second patch would not 
>> be
>> acceptable unless revised.
>> 
>>> Would it make sense to copy check-mime-type.pl, calling it, say,
>>> check-mime-type-2.pl, and apply the changes to that file, keeping the
>>> original as it is now?
>> 
>> For the first patch, moving the new behaviours to a new script would
>> address the backwards compatibility point, as would making the new
>> behaviours optional and opt-in.  However, neither would address the
>> other review points.
>> 
>> The second patch doesn't appear to raise any backwards compatibility
>> concerns.
>> 
>> Cheers,
>> Daniel
> 
> Kind regards
> Maddes

Re: [PATCH] check-mime-type.pl

Posted by "M. Buecher" <ma...@maddes.net>.
Hello Daniel,
Dear all,

thanks for your detailed review, Daniel. Finally got some more leisure 
time to care about it.
I deceided to concentrate on the first part to handle property-modified 
files plus your overall comments.
Left the Windows support aside as this has to be fine-tuned a lot, as I 
learned a lot from your comments, thanks.


On 2020-09-12 05:14, Daniel Shahaf wrote:
> Nathan Hartman wrote on Fri, Sep 11, 2020 at 10:38:24 -0400:
>> Subject: [PATCH] Check also property-modified files in hook script
>>  check-mime-type.pl
>> 
>> * contrib/hook-scripts/check-mime-type.pl:
>>   (proplist) add property-modified files for checks
> 
> There are several changes not mentioned in the log message:
> 
> - Script name fix in comments
> 
> - Move 'use' before 'BEGIN'
>   (incidentally, I'm not actually sure that's a correct change, given
>   the BEGIN block implies perls as old as 5.4 are supported.)
> 
> - Add docstrings
> 
>> +++ b/contrib/hook-scripts/check-mime-type.pl
>> @@ -1,18 +1,19 @@
>>  #!/usr/bin/env perl
>> 
>>  # 
>> ====================================================================
>> -# commit-mime-type-check.pl: check that every added file has the
>> -# svn:mime-type property set and every added file with a mime-type
>> -# matching text/* also has svn:eol-style set.
>> +# check-mime-type.pl: check that every added or property-modified 
>> file
>> +# has the svn:mime-type property set and every added or 
>> property-modified
>> +# file with a mime-type matching text/* also has svn:eol-style set.
> 
> I don't think it's acceptable for a script called "check-mime-type.pl"
> to start enforcing svn:eol-style as well.  This violates the principles
> of least surprise and of "Do one thing and do it well".  Also, the
> choice of text/* is a bit arbitrary, and would seem to have both false
> positives and false negatives: e.g., application/xml and text/x-diff.
> 
> I'm not even sure it's acceptable for this script to start enforcing
> svn:mime-type's presence upon propmods, either.  It's not exactly
> a natural extension that existing users would expect.

First I am not the original author of that hook script, just using it 
with those modifications with Subversion 1.6 up to 1.14.0 on 
Debian/Ubuntu and Windows.
Some people (like me) want to enforce "svn:mime-type" for some 
repositories and using that script is optional to admins.
I agree with the original author to check mime-type plus EoL in a single 
script, as EoL setting can be important to text files.
Enforcing a policy on creation but not on changes is inconsistent, 
that's the reason for this fix.

But I do understand your worries and made the functionalities optional, 
setting the default exactly as it acts right now.
Completely removing the EoL check would change its fucntionality after 
16 years.
New version of the patch, log and script itself attached. Will change 
the Pull Request according to upcoming comments.

Additionally it could be renamed, e.g. to 
"check-mime-type-and-text-eol.pl"?


>>  # 
>> ====================================================================
>>  # Copyright (c) 2000-2004 CollabNet.  All rights reserved.
>> +# Copyright (c) 2014-2020 Apache Software Foundation (ASF).
> 
> How did you come up with these numbers?  The script received commits in
> 2013 and 2015, so the end point should be 2013.  (Also, the next line
> talks of a "COPYING" file which doesn't exist, but that's a preëxisting
> problem.)

Those lines are inherited from commit-access-control.pl, when it was 
copied as a base for check-mime-type.pl.
Copyright lines and license were not updated by CollabNet or ASF despite 
new commits since 2004.
I used 2014 as a start as this was the time this patch was posted on the 
ASF mailing list.

Maybe coyprigght and license should be adapted to the current version of 
commit-access-control.pl either with or without copyright lines.
According to the overall Subversion history it may should be changed to:
# Copyright (c) 2000-2009 CollabNet.  All rights reserved.
# Copyright (c) 2010-2020 Apache Software Foundation (ASF).



>> @@ -100,19 +101,20 @@ my $tmp_dir = '/tmp';
>> -# Figure out what files have added using svnlook.
>> -my @files_added;
>> +# Figure out what files have been added/property-modified using 
>> svnlook.
>> +my @paths_to_check;
>> +my $props_changed_re = qr/^(?:A |[U_ ]U)  (.*[^\/])$/;
>>  foreach my $line (&read_from_process($svnlook, 'changed', $repos, 
>> '-t', $txn))
>>    {
>> -		# Add only files that were added to @files_added
>> -    if ($line =~ /^A.  (.*[^\/])$/)
>> +    # Add only files that were added/property-modified to 
>> @paths_to_check
>> +    if ($line =~ /$props_changed_re/)
> 
> Why did you change /^A./ to /^A /?

The RegEx was taken from Leo Davis' mail, but I agree it should be /^A./ 
for added files and also /^.U/ for property-modified files.


>> Subject: [PATCH] Enhanced platform support of hook script 
>> check-mime-type.pl
>> 
>> * contrib/hook-scripts/check-mime-type.pl:
>>   (svnlook) Cross Platform: sane executable determination
> 
>> +++ b/contrib/hook-scripts/check-mime-type.pl
>> @@ -28,6 +28,15 @@
>> 
>>  use strict;
>>  use Carp;
>> +use Config;
>> +use Cwd;
> 
> «use Cwd qw(getcwd);» would be better.
> 
>> +use Env;
> 
> Why?
> 
> This syntax (according to «perldoc Env») "ties all existing environment
> variables ("keys %ENV") to scalars".  This seems like a dangerous thing
> to do; in fact, it negates the «use strict» a few lines above.
> 
> I suspect this line should be deleted.
> 
>> +use File::Which  qw(which);
>> +use File::Temp  qw(tempdir tempfile);
>> +
>> +my $old_dir;
>> +my $os_windows = $^O eq 'MSWin32';
>> +my $os_dos = $^O eq 'dos';
>> 
>>  # Turn on warnings the best way depending on the Perl version.
>>  BEGIN {
>> @@ -35,6 +44,12 @@ BEGIN {
>>      { require warnings; import warnings; }
>>    else
>>      { $^W = 1; }
>> +
>> +  $old_dir = getcwd;
>> +}
>> +
>> +END {
>> +  chdir($old_dir);
>>  }
> 
> Why is this necessary?

On Windows a sub script called by the precommit hook can change the 
current directory whcih also affects the precommit hook, which had led 
to weird results for the following sub scripts by the precommit hook.
This should have been explained in a comment.


>> @@ -42,31 +57,58 @@ BEGIN {
>>  # Configuration section.
>> 
>>  # Svnlook path.
>> -my $svnlook = "/usr/bin/svnlook";
>> +my $svnlook;
>> 
>>  # Since the path to svnlook depends upon the local installation
>>  # preferences, check that the required program exists to insure that
>>  # the administrator has set up the script properly.
>>  {
>> -  my $ok = 1;
>> -  foreach my $program ($svnlook)
>> +  my $svnlook_exe = 'svnlook' . $Config{_exe};
>> +  my @svnlook_paths;
>> +
>> +  if ($os_windows || $os_dos)
>> +    {
>> +      # On Windows/DOS the environment is available
>> +      my $svn_bindir = $ENV{'SVN_BINDIR'};
> 
> Shouldn't the condition be «if (defined $ENV{SVN_BINDIR})»?
> 
>> +      if (defined $svn_bindir and length $svn_bindir)
>> +        {
>> +          $svnlook_paths[++$#svnlook_paths] = 
>> "$svn_bindir/$svnlook_exe";
> 
> Why does this line use «$arrayvar[++$#arrayvar] = $foo» rather than
> «push @arrayvar, $foo», which is equivalent and idiomatic?

Just because I'm not a Perl expert.
That's why I postponed a patch for Windows support.


>> +        }
>> +
>> +      $svnlook = which($svnlook_exe);
>> +      if (defined $svnlook and length $svnlook)
>> +        {
>> +          $svnlook_paths[++$#svnlook_paths] = $svnlook;
>> +        }
> 
> Why do we bother to have $svnlook tested for existence and
> executability?  Don't we trust File::Which::which() to work as
> advertised?
> 
>> +      undef $svnlook;
>> +    }
>> +
>> +  # On Unix the environment is empty, define fallback with absolute 
>> path
>> +  $svnlook_paths[++$#svnlook_paths] = "/usr/bin/$svnlook_exe";
>> +  foreach my $program (@svnlook_paths)
>>      {
>>        if (-e $program)
>>          {
>>            unless (-x $program)
>>              {
>> -              warn "$0: required program `$program' is not 
>> executable, ",
>> -                   "edit $0.\n";
>> -              $ok = 0;
>> +              warn "$0: program `$program' is not executable.\n";
>> +              next;
>>              }
>> +          $svnlook = $program;
>> +          last;
>>          }
>>        else
>>          {
>> -          warn "$0: required program `$program' does not exist, edit 
>> $0.\n";
>> -          $ok = 0;
>> +          warn "$0: program `$program' does not exist.\n";
>> +          next;
>>          }
>>      }
> 
> What's the point of issuing a warning when there are some more
> iterations yet to go?
> 
> If I'm reading this correctly, this entire block should be reduced to:
> 
>     # Use «||» rather than «//» since the BEGIN block implies versions
> of Perl that lack «//» are supported.
>     my $svnlook = +(defined($ENV{SVN_BINDIR}) ?
> "$ENV{SVN_BINDIR}/$svnlook_exe" : undef) || which("svnlook") ||
> "/usr/bin/svnlook";
>     die … unless -x $svnlook and (-f _ or -l _);
> 
> I changed -e to (-f or -l) since -x implies -e.
> 
> I also disabled the fallbacks when the envvar is set but doesn't point
> to a usable executable.  I'm not married to these semantics, but I 
> think
> these semantics may be expected _given the particular variable name_:
> SVN_BINDIR is the name of the configure AC_SUBST() variable that's
> cooked into various scripts installed by install-tools.
> 
>> -  exit 1 unless $ok;
>> +  unless (defined $svnlook and length $svnlook)
>> +    {
>> +      warn "$0: required program `$svnlook_exe' does not exist or is 
>> not executable, maybe have to edit $0.\n";
>> +      exit 1;
>> +    }
>>  }
>> 
>>  
>> ######################################################################
>> @@ -95,9 +137,9 @@ sub ACCESS_READ_WRITE () { 'read-write' }
>>  
>> ######################################################################
>>  # Harvest data using svnlook.
>> 
>> -# Change into /tmp so that svnlook diff can create its .svnlook
>> +# Change into temp dir so that svnlook diff can create its .svnlook
> 
> When this script was first added, svnlook would create
> a $TMPDIR/svnlook.$unique_integer directory (without a leading dot, and
> in $TMPDIR rather than in cwd).  svnlook doesn't do that nowadays, so
> I see no reason to chdir at all — …
> 
>>  # directory.
>> -my $tmp_dir = '/tmp';
>> +my $tmp_dir = tempdir( TMPDIR => 1, CLEANUP => 1 );
> 
> … in which case, this line — though otherwise correct — should simply 
> be
> deleted altogether, and the File::Temp and Cwd uses with it.
> 
>> @@ -220,19 +262,66 @@ sub safe_read_from_pipe
>> +      while ($command = shift)
>> +        {
>> +          if ("-m" eq $command)
> 
> - This is a general-purpose "run a child process" function, so it can't
>   special-case "-m".
> 
> - If it could, it would have to check for all four variants: «-m foo»,
>   «-mfoo», «--message=foo», and «--message foo».
> 
> - It would also have to look for the «--» end-of-options guard, since
>   «-m» may occur after one.  (This can actually happen in the second
>   callsite of read_from_process().)
> 
> - None of the callers pass -m in the first place, so this is dead code.
> 
>> +            {
>> +              $comment = shift;
>> +              my ($handle, $tmpfile) = tempfile( DIR => $tmp_dir);
>> +              print $handle $comment;
>> +              close($handle);
>> +
>> +              push(@commandline, "--file");
>> +              push(@commandline, $tmpfile);
> 
> There may have been a "--" end-of-options guard earlier on the command
> line.
> 
>> +            }
>> +          else
>> +            {
>> +              # Munge the command to protect it from the command line
>> +              $command =~ s/\"/\\\"/g;
>> +              if ($command =~ m"\s") { $command = "\"$command\""; }
>> +              if ($command eq "") { $command = "\"\""; }
> 
> I'm pretty sure this escaping is incomplete.  Use a library function
> rather than roll your own escaping.
> 
>> +              if ($command =~ m"\n")
>> +                {
>> +                  warn "$0: carriage return detected in command - may 
>> not work\n";
> 
> If it "may not work", then die() rather than warn().
> 
>> +                }
>> +              push(@commandline, $command);
> ⋮
>> +      open(SAFE_READ, "@commandline |")
> 
>> They can be applied in sequence:
>> svn-patch-01-c73cb4415d.txt
>> svn-patch-02-68cc555740.txt
>> 
>> In principle I think the changes are useful, though might change
>> behavior in a surprising way for those who are using the hook as it
>> stands now.
>> 
> 
> Usefulness aside, I think the first patch is unacceptable because of
> backwards compatibility considerations and the second patch would not 
> be
> acceptable unless revised.
> 
>> Would it make sense to copy check-mime-type.pl, calling it, say,
>> check-mime-type-2.pl, and apply the changes to that file, keeping the
>> original as it is now?
> 
> For the first patch, moving the new behaviours to a new script would
> address the backwards compatibility point, as would making the new
> behaviours optional and opt-in.  However, neither would address the
> other review points.
> 
> The second patch doesn't appear to raise any backwards compatibility
> concerns.
> 
> Cheers,
> Daniel

Kind regards
Maddes

Re: [PATCH] check-mime-type.pl

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Nathan Hartman wrote on Fri, Sep 11, 2020 at 10:38:24 -0400:
> Subject: [PATCH] Check also property-modified files in hook script
>  check-mime-type.pl
> 
> * contrib/hook-scripts/check-mime-type.pl:
>   (proplist) add property-modified files for checks

There are several changes not mentioned in the log message:

- Script name fix in comments

- Move 'use' before 'BEGIN'
  (incidentally, I'm not actually sure that's a correct change, given
  the BEGIN block implies perls as old as 5.4 are supported.)

- Add docstrings

> +++ b/contrib/hook-scripts/check-mime-type.pl
> @@ -1,18 +1,19 @@
>  #!/usr/bin/env perl
>  
>  # ====================================================================
> -# commit-mime-type-check.pl: check that every added file has the
> -# svn:mime-type property set and every added file with a mime-type
> -# matching text/* also has svn:eol-style set.
> +# check-mime-type.pl: check that every added or property-modified file
> +# has the svn:mime-type property set and every added or property-modified
> +# file with a mime-type matching text/* also has svn:eol-style set.

I don't think it's acceptable for a script called "check-mime-type.pl"
to start enforcing svn:eol-style as well.  This violates the principles
of least surprise and of "Do one thing and do it well".  Also, the
choice of text/* is a bit arbitrary, and would seem to have both false
positives and false negatives: e.g., application/xml and text/x-diff.

I'm not even sure it's acceptable for this script to start enforcing
svn:mime-type's presence upon propmods, either.  It's not exactly
a natural extension that existing users would expect.

>  # ====================================================================
>  # Copyright (c) 2000-2004 CollabNet.  All rights reserved.
> +# Copyright (c) 2014-2020 Apache Software Foundation (ASF).

How did you come up with these numbers?  The script received commits in
2013 and 2015, so the end point should be 2013.  (Also, the next line
talks of a "COPYING" file which doesn't exist, but that's a preëxisting
problem.)

> @@ -100,19 +101,20 @@ my $tmp_dir = '/tmp';
> -# Figure out what files have added using svnlook.
> -my @files_added;
> +# Figure out what files have been added/property-modified using svnlook.
> +my @paths_to_check;
> +my $props_changed_re = qr/^(?:A |[U_ ]U)  (.*[^\/])$/;
>  foreach my $line (&read_from_process($svnlook, 'changed', $repos, '-t', $txn))
>    {
> -		# Add only files that were added to @files_added
> -    if ($line =~ /^A.  (.*[^\/])$/)
> +    # Add only files that were added/property-modified to @paths_to_check
> +    if ($line =~ /$props_changed_re/)

Why did you change /^A./ to /^A /?

> Subject: [PATCH] Enhanced platform support of hook script check-mime-type.pl
> 
> * contrib/hook-scripts/check-mime-type.pl:
>   (svnlook) Cross Platform: sane executable determination

> +++ b/contrib/hook-scripts/check-mime-type.pl
> @@ -28,6 +28,15 @@
>  
>  use strict;
>  use Carp;
> +use Config;
> +use Cwd;

«use Cwd qw(getcwd);» would be better.

> +use Env;

Why?

This syntax (according to «perldoc Env») "ties all existing environment
variables ("keys %ENV") to scalars".  This seems like a dangerous thing
to do; in fact, it negates the «use strict» a few lines above.

I suspect this line should be deleted.

> +use File::Which  qw(which);
> +use File::Temp  qw(tempdir tempfile);
> +
> +my $old_dir;
> +my $os_windows = $^O eq 'MSWin32';
> +my $os_dos = $^O eq 'dos';
>  
>  # Turn on warnings the best way depending on the Perl version.
>  BEGIN {
> @@ -35,6 +44,12 @@ BEGIN {
>      { require warnings; import warnings; }
>    else
>      { $^W = 1; }
> +
> +  $old_dir = getcwd;
> +}
> +
> +END {
> +  chdir($old_dir);
>  }

Why is this necessary?

>  
>  
> @@ -42,31 +57,58 @@ BEGIN {
>  # Configuration section.
>  
>  # Svnlook path.
> -my $svnlook = "/usr/bin/svnlook";
> +my $svnlook;
>  
>  # Since the path to svnlook depends upon the local installation
>  # preferences, check that the required program exists to insure that
>  # the administrator has set up the script properly.
>  {
> -  my $ok = 1;
> -  foreach my $program ($svnlook)
> +  my $svnlook_exe = 'svnlook' . $Config{_exe};
> +  my @svnlook_paths;
> +
> +  if ($os_windows || $os_dos)
> +    {
> +      # On Windows/DOS the environment is available
> +      my $svn_bindir = $ENV{'SVN_BINDIR'};

Shouldn't the condition be «if (defined $ENV{SVN_BINDIR})»?

> +      if (defined $svn_bindir and length $svn_bindir)
> +        {
> +          $svnlook_paths[++$#svnlook_paths] = "$svn_bindir/$svnlook_exe";

Why does this line use «$arrayvar[++$#arrayvar] = $foo» rather than
«push @arrayvar, $foo», which is equivalent and idiomatic?

> +        }
> +
> +      $svnlook = which($svnlook_exe);
> +      if (defined $svnlook and length $svnlook)
> +        {
> +          $svnlook_paths[++$#svnlook_paths] = $svnlook;
> +        }

Why do we bother to have $svnlook tested for existence and
executability?  Don't we trust File::Which::which() to work as
advertised?

> +      undef $svnlook;
> +    }
> +
> +  # On Unix the environment is empty, define fallback with absolute path
> +  $svnlook_paths[++$#svnlook_paths] = "/usr/bin/$svnlook_exe";
> +  foreach my $program (@svnlook_paths)
>      {
>        if (-e $program)
>          {
>            unless (-x $program)
>              {
> -              warn "$0: required program `$program' is not executable, ",
> -                   "edit $0.\n";
> -              $ok = 0;
> +              warn "$0: program `$program' is not executable.\n";
> +              next;
>              }
> +          $svnlook = $program;
> +          last;
>          }
>        else
>          {
> -          warn "$0: required program `$program' does not exist, edit $0.\n";
> -          $ok = 0;
> +          warn "$0: program `$program' does not exist.\n";
> +          next;
>          }
>      }

What's the point of issuing a warning when there are some more
iterations yet to go?

If I'm reading this correctly, this entire block should be reduced to:

    # Use «||» rather than «//» since the BEGIN block implies versions of Perl that lack «//» are supported.
    my $svnlook = +(defined($ENV{SVN_BINDIR}) ? "$ENV{SVN_BINDIR}/$svnlook_exe" : undef) || which("svnlook") || "/usr/bin/svnlook";
    die … unless -x $svnlook and (-f _ or -l _);

I changed -e to (-f or -l) since -x implies -e.

I also disabled the fallbacks when the envvar is set but doesn't point
to a usable executable.  I'm not married to these semantics, but I think
these semantics may be expected _given the particular variable name_:
SVN_BINDIR is the name of the configure AC_SUBST() variable that's
cooked into various scripts installed by install-tools.

> -  exit 1 unless $ok;
> +  unless (defined $svnlook and length $svnlook)
> +    {
> +      warn "$0: required program `$svnlook_exe' does not exist or is not executable, maybe have to edit $0.\n";
> +      exit 1;
> +    }
>  }
>  
>  ######################################################################
> @@ -95,9 +137,9 @@ sub ACCESS_READ_WRITE () { 'read-write' }
>  ######################################################################
>  # Harvest data using svnlook.
>  
> -# Change into /tmp so that svnlook diff can create its .svnlook
> +# Change into temp dir so that svnlook diff can create its .svnlook

When this script was first added, svnlook would create
a $TMPDIR/svnlook.$unique_integer directory (without a leading dot, and
in $TMPDIR rather than in cwd).  svnlook doesn't do that nowadays, so
I see no reason to chdir at all — …

>  # directory.
> -my $tmp_dir = '/tmp';
> +my $tmp_dir = tempdir( TMPDIR => 1, CLEANUP => 1 );

… in which case, this line — though otherwise correct — should simply be
deleted altogether, and the File::Temp and Cwd uses with it.

> @@ -220,19 +262,66 @@ sub safe_read_from_pipe
> +      while ($command = shift)
> +        {
> +          if ("-m" eq $command)

- This is a general-purpose "run a child process" function, so it can't
  special-case "-m".

- If it could, it would have to check for all four variants: «-m foo»,
  «-mfoo», «--message=foo», and «--message foo».

- It would also have to look for the «--» end-of-options guard, since
  «-m» may occur after one.  (This can actually happen in the second
  callsite of read_from_process().)

- None of the callers pass -m in the first place, so this is dead code.

> +            {
> +              $comment = shift;
> +              my ($handle, $tmpfile) = tempfile( DIR => $tmp_dir);
> +              print $handle $comment;
> +              close($handle);
> +
> +              push(@commandline, "--file");
> +              push(@commandline, $tmpfile);

There may have been a "--" end-of-options guard earlier on the command
line.

> +            }
> +          else
> +            {
> +              # Munge the command to protect it from the command line
> +              $command =~ s/\"/\\\"/g;
> +              if ($command =~ m"\s") { $command = "\"$command\""; }
> +              if ($command eq "") { $command = "\"\""; }

I'm pretty sure this escaping is incomplete.  Use a library function
rather than roll your own escaping.

> +              if ($command =~ m"\n")
> +                {
> +                  warn "$0: carriage return detected in command - may not work\n";

If it "may not work", then die() rather than warn().

> +                }
> +              push(@commandline, $command);
⋮
> +      open(SAFE_READ, "@commandline |")

> They can be applied in sequence:
> svn-patch-01-c73cb4415d.txt
> svn-patch-02-68cc555740.txt
> 
> In principle I think the changes are useful, though might change
> behavior in a surprising way for those who are using the hook as it
> stands now.
> 

Usefulness aside, I think the first patch is unacceptable because of
backwards compatibility considerations and the second patch would not be
acceptable unless revised.

> Would it make sense to copy check-mime-type.pl, calling it, say,
> check-mime-type-2.pl, and apply the changes to that file, keeping the
> original as it is now?

For the first patch, moving the new behaviours to a new script would
address the backwards compatibility point, as would making the new
behaviours optional and opt-in.  However, neither would address the
other review points.

The second patch doesn't appear to raise any backwards compatibility
concerns.

Cheers,

Daniel